English Pages 169 [180] Year 1985
Generalized Quantifiers in Natural Language
Groningen-Amsterdam Studies in Semantics (GRASS) This series of books on the semantics of natural language contains collections of original research on selected topics as well as monographs in this area. Contributions from linguists, philosophers, logicians, computer-scientists and cognitive psychologists are brought together to promote interdisciplinary and international research. Editors Alice ter Meulen Martin Stokhof
Editorial Board Renate Bartsch University of Amsterdam Johan van Benthem University of Groningen Henk Verkuyl University of Utrecht
Other books in this series: 1.
Alice G.B. ter Meulen (ed.) Studies in Modeltheoretic Semantics
2.
Jeroen Groenendijk, Theo M.V. Janssen and Martin Stokhof (eds.) Truth, Interpretation and Information
3.
Fred Landman and Frank Veltman (eds.) Varieties of Formal Semantics
All communications to the editors can be sent to: Department of Philosophy, University of Amsterdam, Grimburgwal 10, 1012 GA Amsterdam, The Netherlands, or Department of Linguistics, GN 40, University of Washington, Seattle, Washington 98185, U.S.A.
Johan van Benthem Alice ter Meulen (eds.)
Generalized Quantifiers in Natural Language
1985 FORIS PUBLICATIONS Dordrecht - Holland/Cinnaminson - U.S.A.
Published by: Foris Publications Holland, P.O. Box 509, 3300 AM Dordrecht, The Netherlands. Sole distributor for the U.S.A. and Canada: Foris Publications U.S.A., P.O. Box C-50, Cinnaminson N.J. 08077, U.S.A. CIP-DATA: Generalized quantifiers in natural language / Johan van Benthem, Alice ter Meulen (eds.). - Dordrecht [etc.] : Foris. - (Groningen-Amsterdam Studies in Semantics ; 4) With bibliogr. ISBN 90-6765-081-1 bound, ISBN 90-6765-082-X paper. SISO 805.5 UDC 801.5 Subject heading: semantics.
ISBN 90 6765 081 1 (Bound) ISBN 90 6765 082 X (Paper) © 1984 Foris Publications - Dordrecht. No part of this publication may be reproduced or transmitted in any form or by any means, electronic or mechanical, including photocopy, recording, or any information storage and retrieval system, without permission from the copyright owner. Printed in the Netherlands by ICG Printing, Dordrecht.
Table of Contents
Preface
Jan van Eijck: Generalized Quantifiers and Traditional Logic
Franciska de Jong and Henk Verkuyl: Generalized Quantifiers: the Properness of their Strength
Dag Westerståhl: Determiners and Context Sets
Edward L. Keenan and Lawrence S. Moss: Generalized Quantifiers and the Expressive Power of Natural Language
Elias Thijsse: Counting Quantifiers
Kees van Deemter: Generalized Quantifiers: Finite versus Infinite
Johan van Benthem: Themes from a Workshop
Preface
This collection of papers on generalized quantifiers in natural language results from an international workshop on the topic conducted by the Taakgroep Formele Linguïstiek at the Rijksuniversiteit Groningen in July 1983, funded by a ZWO grant (Netherlands Organization for the Advancement of Pure Research). The current volume represents a selection from the research reported at the meeting, and we regret not having been able to include work by Jack Hoeksema on partitives, Frans Zwarts on the relational theory of determiners and Jan Lønning on mass terms, which will be published on a later occasion.
Two main new developments can be recognized in current research on generalized quantifiers in natural language, which originates in Richard Montague's work on quantification (R. Montague, Formal Philosophy, R. Thomason (ed.), Yale University Press, 1974) and was significantly developed by J. Barwise and R. Cooper ('Generalized Quantifiers and Natural Language', Linguistics and Philosophy 4, 1981). On the one hand, there has been considerable development in the theoretical foundations of the subject. Starting from the most general notion of a possible quantifier, the class of quantifiers that are empirically realized or realizable in natural language is approached in a principled way by studying their characteristic semantic properties. The latter turn out to exhibit enough logical or mathematical structure to inspire various types of further theoretical questions, especially concerning the interplay of abstract semantic characterization and actual linguistic enumeration. On the other hand, empirical applications of the theory to natural language have been extended considerably beyond simple quantifiers to more complex forms of quantification and determiners in general, in English as well as in other languages.
The papers in this volume reflect the work of a small international community of linguists, philosophers and logicians working in both of these directions. New questions are addressed concerning diverse topics such as determiners with more arguments, context-dependence of quantifiers, the need for partial interpretation to account for presuppositions, the consequences of admitting infinite domains and the connection with traditional syllogistic theory. This book is intended to demonstrate the fruits of a closer cooperation between mathematical logic and theoretical linguistics in the semantics of natural language.
The Editors
Johan van Benthem
Alice ter Meulen
Chapter 1
Generalized Quantifiers and Traditional Logic Jan van Eijck
0. INTRODUCTION
The aim of this paper is to provide an analysis of traditional syllogistic logic in the spirit of the Generalized Quantifier perspective (cf. Mostowski (1957), Barwise & Cooper (1981), Van Benthem (1984)). In section 1 we introduce some fundamental operations on generalized quantifiers. These operations are used in the subsequent discussion of three viewpoints on syllogistic logic. In section 2 the inferential pattern of the traditional square of opposition is studied. In section 3 we turn to syllogistic inference-patterns and the relations between these. Finally, in section 4 the 'rules of thumb' of traditional logic are reconsidered, and a modern reconstruction is given for the Medieval theory of distribution. Hopefully, this application of modern semantic concepts to syllogistic logic will reveal some interesting new aspects of this most traditional of all logical theories.
1. FUNDAMENTAL OPERATIONS ON GENERALIZED QUANTIFIERS
A Generalized Quantifier $Q_E$ is a relation between subsets of domains of objects $E$ that has the following properties. (As the quantifier properties that we are concerned with generally are universal ones, we will omit the universal quantifier prefixes $\forall A$, $\forall B$, etc.)
CONSERVATIVITY: $Q_E AB \Leftrightarrow Q_E A(A \cap B)$
Conservativity (called the 'live-on' property in Barwise & Cooper (1981) ) says that the first argument in the quantified statement "sets the stage".
* I am indebted to Johan van Benthem, Jack Hoeksema, Elias Thijsse, Dag Westerstahl and Frans Zwarts for comments on previous versions of this paper. The research for this paper was sponsored by the Netherlands Organization for the Advancement of Pure Research (ZWO), grant no. 22-65.
QUANTITY: $Q_E AB \Leftrightarrow Q_E F[A]F[B]$, for any permutation $F$ of $E$.
Quantity, proposed already in Mostowski (1957), says that $Q_E$ only depends on the number of individuals in $A$, $A \cap B$, $B$ and $E$.
EXTENSION: $A, B \subseteq E \subseteq E' \Rightarrow (Q_E AB \Leftrightarrow Q_{E'} AB)$
Extension says that quantifiers are stable under growth of the universe. Extension plus conservativity is equivalent to the following property (cf. Westerståhl (1982b)):
STRONG CONSERVATIVITY: $Q_E AB \Leftrightarrow Q_A A(A \cap B)$
Thus, conservativity plus extension permit us to suppress the parameter $E$. They ensure that the truth of $QAB$ depends only on $A$ and $A \cap B$. Conservativity, extension and quantity ensure that the truth of $QAB$ depends only on the cardinal numbers $|A|$ and $|A \cap B|$ (or, equivalently, on $|A-B|$ and $|A \cap B|$). The three afore-mentioned properties enable us to represent every generalized quantifier $Q$ by a corresponding numerical relation $R_Q$. In Mostowski (1957), Van Benthem (1984) and Westerståhl (1984) these numerical relations are used to characterize quantifiers. If only finite universes are considered (call this requirement FIN), there is, for any quantifier $Q$, a relation $R_Q$ on $\omega \times \omega$ such that:
$QAB \Leftrightarrow R_Q(|A-B|, |A \cap B|)$
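The numerical representation can be illustrated with a small sketch. The helper names `subsets`, `from_relation` and `is_conservative` are hypothetical and not taken from the paper; the code only assumes finite universes (FIN) and builds a set-level quantifier from a relation on pairs $(|A-B|, |A \cap B|)$, then checks conservativity by brute force.

```python
from itertools import combinations

def subsets(universe):
    """All subsets of a finite universe, as frozensets."""
    items = list(universe)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

def from_relation(rel):
    """Turn a numerical relation R(m, n) into a set-level quantifier QAB,
    with m = |A - B| and n = |A & B|."""
    return lambda a, b: rel(len(a - b), len(a & b))

def is_conservative(q, universe):
    """Check QAB <=> QA(A & B) for all subsets A, B of the universe."""
    subs = subsets(universe)
    return all(q(a, b) == q(a, a & b) for a in subs for b in subs)

every = from_relation(lambda m, n: m == 0)  # 'every': |A - B| = 0
some = from_relation(lambda m, n: n != 0)   # 'some': |A & B| != 0

E = {1, 2, 3}
print(is_conservative(every, E), is_conservative(some, E))
```

Any quantifier built this way is conservative by construction, since replacing $B$ by $A \cap B$ changes neither $|A-B|$ nor $|A \cap B|$.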
Let me introduce some fundamental operations on generalized quantifiers. The first one is mentioned already in Mostowski (1957). Mostowski defines an operation on quantifiers as follows: if $Q$ is the quantifier determined by $R$, then the quantifier $Q^{*}$ determined by $R^{*}$ such that $R^{*}(m,n) \Leftrightarrow \neg R(n,m)$, is called the dual of $Q$. The universal quantifier is determined by the relation $R$ such that $R(m,n) \Leftrightarrow m = 0$. The existential quantifier is its dual, for it is determined by the relation $R'$ such that $R'(m,n) \Leftrightarrow n \neq 0$, so $R' = R^{*}$. Likewise 'no ... are ...' is the dual of 'some ... are not ...'.
Two other operations on quantifiers: if generalized quantifier $Q$ is determined by $R$, then I will call the quantifier $Q^{\smile}$ that is determined by $R^{\smile}$ such that $R^{\smile}(m,n) \Leftrightarrow R(n,m)$, the co-quantifier of $Q$. (The term is coined in Westerståhl (1984).) Further, I will call the quantifier $Q^{\#}$ determined by $R^{\#}$ such that $R^{\#}(m,n) \Leftrightarrow \neg R(m,n)$ the opposite quantifier of $Q$. Observe that $R^{\#}$ and $R^{\smile}$ determine dual quantifiers, $R^{*}$ and $R^{\#}$ co-quantifiers, and $R^{\smile}$ and $R^{*}$ opposites. The following picture now emerges:
\[
\begin{array}{ccc}
R & \text{co-quantifiers} & R^{\smile} \\
\text{duals} & \text{opposites (diagonals)} & \text{duals} \\
R^{*} & \text{co-quantifiers} & R^{\#}
\end{array}
\]
I will denote a quartet of quantifiers that is determined by the relations $R$, $R^{\smile}$, $R^{*}$ and $R^{\#}$ as $Q$, $Q^{\smile}$, $Q^{*}$ and $Q^{\#}$, respectively. It must be noted, however, that some caution is needed in the switching back and forth between the set-theoretic notation ("$QAB$") and the numerical notation ("$R_Q(m,n)$"). One must beware e.g. of the temptation to view $Q^{\smile}$ as the converse relation of $Q$. It is not the case that $Q^{\smile}AB \Leftrightarrow QBA$, although we have $R^{\smile}(m,n) \Leftrightarrow R(n,m)$. Suppose $R$ determines the quantifier $Q$. Then we have $Q^{\smile}AB \Leftrightarrow R^{\smile}(|A-B|, |A \cap B|) \Leftrightarrow R(|A \cap B|, |A-B|) \Leftrightarrow R(|A-(A-B)|, |A \cap (A-B)|) \Leftrightarrow QA(A-B)$. This points the way to the definition of $Q^{\smile}$, $Q^{*}$, and $Q^{\#}$ in terms of $Q$:
\[
\begin{array}{l}
Q^{\smile}AB \Leftrightarrow QA(A-B) \\
Q^{\#}AB \Leftrightarrow \neg QAB \\
Q^{*}AB \Leftrightarrow \neg Q^{\smile}AB \Leftrightarrow \neg QA(A-B)
\end{array}
\]
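The three operations, and the warning that the co-quantifier is not the converse, can be checked mechanically. The following is a minimal sketch with hypothetical function names (`co`, `opposite`, `dual`, `same`, `holds`); relations are compared only on a finite grid, which is an assumption of the sketch and not a proof.

```python
from itertools import combinations

def co(rel):
    """Co-quantifier: swap the two arguments of R."""
    return lambda m, n: rel(n, m)

def opposite(rel):
    """Opposite: negate R."""
    return lambda m, n: not rel(m, n)

def dual(rel):
    """Dual: swap the arguments and negate."""
    return lambda m, n: not rel(n, m)

def same(r1, r2, bound=6):
    """Compare two relations on all pairs (m, n) with m, n < bound."""
    return all(r1(m, n) == r2(m, n)
               for m in range(bound) for n in range(bound))

def holds(rel, a, b):
    """Set-level truth of QAB for the quantifier determined by rel."""
    return rel(len(a - b), len(a & b))

every = lambda m, n: m == 0
no = lambda m, n: n == 0
some = lambda m, n: n != 0
not_every = lambda m, n: m != 0

universe = [1, 2, 3]
subs = [frozenset(c) for r in range(4) for c in combinations(universe, r)]

# The co-quantifier satisfies Q~AB <=> QA(A - B) on every pair of subsets ...
co_matches_definition = all(holds(co(every), a, b) == holds(every, a, a - b)
                            for a in subs for b in subs)
# ... but it is not the converse: Q~AB and QBA can differ.
co_is_converse = all(holds(co(every), a, b) == holds(every, b, a)
                     for a in subs for b in subs)

print(same(co(every), no), same(dual(every), some),
      same(opposite(every), not_every))
print(co_matches_definition, co_is_converse)
```

For instance, with $A = \{1\}$ and $B = \{2\}$, 'no A are B' holds while 'every B is A' fails, which is why `co_is_converse` comes out false.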
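It is interesting to note that the three operations on generalized quantifiers $^{\smile}$, $^{*}$, $^{\#}$, together with the identity operation id, form the so-called Klein 4-group with respect to composition of these operations.¹ We have:
\[
\begin{array}{c|cccc}
\circ & \text{id} & {}^{\smile} & {}^{*} & {}^{\#} \\
\hline
\text{id} & \text{id} & {}^{\smile} & {}^{*} & {}^{\#} \\
{}^{\smile} & {}^{\smile} & \text{id} & {}^{\#} & {}^{*} \\
{}^{*} & {}^{*} & {}^{\#} & \text{id} & {}^{\smile} \\
{}^{\#} & {}^{\#} & {}^{*} & {}^{\smile} & \text{id}
\end{array}
\]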
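The composition table of the four operations can be recomputed rather than taken on trust. This is an illustrative sketch only: the names `OPS`, `SAMPLES`, `fingerprint` and `CAYLEY` are hypothetical, and operations are identified by their effect on two sample relations over a finite grid, which is enough to tell the four apart.

```python
OPS = {
    "id": lambda r: r,
    "co": lambda r: (lambda m, n: r(n, m)),
    "dual": lambda r: (lambda m, n: not r(n, m)),
    "opp": lambda r: (lambda m, n: not r(m, n)),
}

# Two sample relations; together they distinguish the four operations.
SAMPLES = [lambda m, n: m == 0, lambda m, n: m < n]

def fingerprint(op, bound=4):
    """Truth table of op applied to each sample relation on a finite grid."""
    return tuple(bool(op(r)(m, n))
                 for r in SAMPLES
                 for m in range(bound) for n in range(bound))

NAMES = {fingerprint(op): name for name, op in OPS.items()}

def compose(f, g):
    """The operation 'first g, then f'."""
    return lambda r: f(g(r))

CAYLEY = {(f, g): NAMES[fingerprint(compose(OPS[f], OPS[g]))]
          for f in OPS for g in OPS}

for f in OPS:
    print(f.ljust(5), [CAYLEY[(f, g)] for g in OPS])
```

Every operation is its own inverse and the composition of any two distinct non-identity operations is the third, which is the defining pattern of the Klein 4-group.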
To visualize the relation between four quantifiers related by the three operations that we have distinguished, we can use the tree-format that was proposed in Van Benthem (1984). Then, we can chart the behaviour of a quantifier with the help of pictures like the following, where each node is a pair $(|A-B|, |A \cap B|)$:
\[
\begin{array}{lc}
|A| = 0 & 0,0 \\
|A| = 1 & 1,0 \quad 0,1 \\
|A| = 2 & 2,0 \quad 1,1 \quad 0,2 \\
|A| = 3 & 3,0 \quad 2,1 \quad 1,2 \quad 0,3 \\
 & \vdots
\end{array}
\]
Every generalized quantifier $Q$ is characterized in this tree by a pattern of '+' and '-' on the nodes, where the signs indicate whether $R_Q$ holds for the number-pair at the node or not. Now it follows from the numerical definition of the operators that the pattern of $Q^{\#}$ is the complement of the pattern of $Q$. Take the quantifier 'every' as an example. 'Every' is determined by the numerical condition $R(m,n) \Leftrightarrow m = 0$. The pattern in the numerical tree is:
\[
\begin{array}{c}
+ \\
- \;\; + \\
- \;\; - \;\; + \\
- \;\; - \;\; - \;\; +
\end{array}
\]
The pattern of 'not every', the opposite of 'every', is the complement of the pattern of 'every':
\[
\begin{array}{c}
- \\
+ \;\; - \\
+ \;\; + \;\; - \\
+ \;\; + \;\; + \;\; -
\end{array}
\]
The pattern of $Q^{\smile}$ is the mirror-image of the pattern of $Q$. If $Q$ is 'every', then in the pattern of 'no', the co-quantifier of $Q$, the right '+'-diagonal for 'every' changes into a left '+'-diagonal:
\[
\begin{array}{c}
+ \\
+ \;\; - \\
+ \;\; - \;\; - \\
+ \;\; - \;\; - \;\; -
\end{array}
\]
The pattern for $Q^{*}$ is the complement of the pattern of $Q^{\smile}$, i.e. it is the complement of the mirror-image of the pattern of $Q$. The dual of 'every' is 'some', with the following pattern:
\[
\begin{array}{c}
- \\
- \;\; + \\
- \;\; + \;\; + \\
- \;\; + \;\; + \;\; +
\end{array}
\]
As these examples indicate, the four quantifiers in the traditional Square of Opposition are related by the operations that we have introduced here. We now turn to the study of the properties of the Square from the generalized quantifier perspective.
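The tree patterns are easy to regenerate from the numerical conditions. The sketch below uses hypothetical helpers `pattern` and `show`; it lists row $k$ of the tree from left to right as the nodes $(k,0), (k-1,1), \ldots, (0,k)$, following the node layout assumed in the tree above.

```python
def pattern(rel, rows=4):
    """Return the +/- pattern of R on the number tree, one string per row."""
    rows_out = []
    for size in range(rows):
        # Row `size` holds the nodes (size, 0), (size - 1, 1), ..., (0, size).
        signs = ["+" if rel(size - n, n) else "-" for n in range(size + 1)]
        rows_out.append(" ".join(signs))
    return rows_out

def show(rel, rows=4):
    """Print the pattern centred, so that it looks like the tree."""
    width = 2 * rows - 1
    for line in pattern(rel, rows):
        print(line.center(width))

every = lambda m, n: m == 0
no = lambda m, n: n == 0          # co-quantifier of 'every'
some = lambda m, n: n != 0        # dual of 'every'
not_every = lambda m, n: m != 0   # opposite of 'every'

for name, rel in [("every", every), ("no", no),
                  ("some", some), ("not every", not_every)]:
    print(name)
    show(rel)
```

Mirroring each row of the 'every' pattern gives the 'no' pattern, and swapping '+' and '-' gives the 'not every' pattern, in line with the description of the co-quantifier and the opposite.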
2. THE INFERENTIAL PATTERN OF THE SQUARE OF OPPOSITION
The familiar picture of the Square of Opposition is a "special case" of the square pictured in section 1:
\[
\begin{array}{ccc}
\text{every } A\ B & \text{contraries} & \text{no } A\ B \\
\text{subaltern} & \text{contradictories (diagonals)} & \text{subaltern} \\
\text{some } A\ B & \text{subcontraries} & \text{not every } A\ B
\end{array}
\]
The quantifiers of contradictory statements in the Square of Opposition are opposites; the quantifiers of statements in the so-called subaltern relation are duals, and the quantifiers of contrary or subcontrary statements are co-quantifiers. The rules of classical syllogistics must be read with some care, because of the presupposition of "existential import" for its quantifiers. We introduce this presupposition as follows:
SUBSTANCE: $(QAB \text{ or } \neg QAB)$
For all nodes $(m,n)$ where $m$ and $n$ are finite the same argument as in theorem 3 holds. Only the values on the final row, the row that presupposes an infinite domain, remain to be determined. From $(0,1) \in R$ we have, by COSYM, $(0,\aleph_0) \in R$. We prove that $R$ contains no other node of the final row. Suppose $(m,\aleph_0) \in R$ for some finite $m > 0$. Then, by COSYM, $(m,m) \in R$, which contradicts SUBALT. Also, $(\aleph_0,\aleph_0) \notin R$ because of SUBALT. Finally, suppose that for some finite $k$, $(\aleph_0,k) \in R$. Then, by COSYM, $(\aleph_0,0) \in R$, and, by SUBALT, $(0,\aleph_0) \notin R$, a contradiction with what we found above.
The property that accounts for inference (9) can be rewritten as:
ACCIDENS: $QAB \rightarrow \neg QB(B-A)$.
The numerical version of ACCIDENS (obtained by switching to cardinal numbers and appropriate substitutions) is:
$\forall m, n, k\ (R_Q(m,n) \rightarrow \neg R_Q(n,k))$
This is equivalent to:
$\forall n, k\ (\exists m\, R_Q(m,n) \rightarrow \neg R_Q(n,k))$
By contraposition and appropriate substitutions we can derive the parallel principle:
$\forall m, k\ (\exists n\, R_Q(m,n) \rightarrow \neg R_Q(k,m))$
These formulas tell us that in the numerical tree the relation $R_Q$ of a quantifier for which ACCIDENS holds obeys the following rule: If $(m,n) \in R_Q$, then both the left and the right diagonal through $(n,m)$ (the mirror node of $(m,n)$) lie outside $R_Q$. (10) can be derived from (9) by contraposition and the equivalences $Q^{\smile}AB \Leftrightarrow \neg Q^{*}AB$ and $Q^{\#}AB \Leftrightarrow \neg QAB$. Further, ACCIDENS implies SUBALTERNACY, as can be seen from the numerical versions of these principles. Also, we note that ACCIDENS is implied by COSYMMETRY plus SUBALTERNACY:
\[
\begin{array}{lll}
QAB & \Rightarrow Q^{*}AB & \text{(SUBALT)} \\
 & \Leftrightarrow Q^{*}BA & \text{(COSYM)}
\end{array}
\]
and $Q^{*}BA$ is $\neg QB(B-A)$ by the definition of $Q^{*}$.
In order to prove a further result about the property ACCIDENS we need an extra condition on quantifiers:
VARIETY: $\forall A\ (A \neq \emptyset \rightarrow (\exists B\, QAB\ \&\ \exists C\, \neg QAC))$
The numerical version of this condition can be expressed thus: on every row of the tree, some node is in the numerical relation $R_Q$, and some node is not in $R_Q$. (The above version of VARIETY is the strongest of a number of proposals that are compared in Westerståhl (1984).)
THEOREM 5: If SUBSTANCE holds, then $Q$ observes ACCIDENS and VARIETY iff $Q$ is 'no' or 'all'.
PROOF 'if': 'no' and 'all' observe ACCIDENS and VARIETY. 'only if': We reason in the numerical tree. By VARIETY, either $(1,0)$ or $(0,1) \in R_Q$. Suppose the former. Then, by ACCIDENS, the right and left diagonals through $(0,1)$ lie outside $R_Q$. By VARIETY, one of $(2,0)$, $(1,1)$, $(0,2) \in R_Q$; $(1,1)$ and $(0,2)$ are out, for they are on the right and left diagonals through $(0,1)$, respectively. So $(2,0) \in R_Q$, and the diagonals through $(0,2)$ are out. Similarly, $(3,0) \in R_Q$, etc. (Draw a picture.) Thus $R_Q$ consists of the left diagonal through $(1,0)$, and $Q$ = 'no'. Alternatively, suppose $(0,1) \in R_Q$. Now by similar reasoning $R_Q$ consists of the right diagonal through $(0,1)$ and $Q$ = 'all'.
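The 'only if' direction of Theorem 5 can be illustrated by exhaustive search on a finite part of the tree. This is a sketch under stated assumptions, not the paper's proof: the tree is truncated to rows 1 to `max_row` (row 0 is left out because of SUBSTANCE), ACCIDENS is only enforced between nodes inside the truncated tree, and the function names are hypothetical.

```python
from itertools import product

def nodes(max_row):
    """Nodes (m, n) of rows 1..max_row; row 0 is excluded by SUBSTANCE."""
    return [(size - n, n)
            for size in range(1, max_row + 1) for n in range(size + 1)]

def accidens(rel, max_row):
    """R(m, n) -> not R(n, k), for every node (n, k) inside the finite tree."""
    return all((n, k) not in rel
               for (m, n) in rel
               for k in range(max_row + 1)
               if 1 <= n + k <= max_row)

def variety(rel, max_row):
    """Every row contains a node in R and a node outside R."""
    for size in range(1, max_row + 1):
        inside = [(size - n, n) in rel for n in range(size + 1)]
        if all(inside) or not any(inside):
            return False
    return True

def solutions(max_row=3):
    """All relations on the truncated tree satisfying ACCIDENS and VARIETY."""
    all_nodes = nodes(max_row)
    found = []
    for bits in product([False, True], repeat=len(all_nodes)):
        rel = frozenset(node for node, bit in zip(all_nodes, bits) if bit)
        if accidens(rel, max_row) and variety(rel, max_row):
            found.append(rel)
    return found

for rel in solutions():
    print(sorted(rel))
```

On three rows the search leaves exactly the left diagonal through $(1,0)$ ('no') and the right diagonal through $(0,1)$ ('all'), matching the argument in the proof.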
3. ONCE AGAIN: ARISTOTLE REVISITED
The aim of Aristotelian syllogistics is to determine which inferential patterns of the following forms are valid ($Q_1$, $Q_2$ and $Q_3$ each range over $Q$, $Q^{\smile}$, $Q^{*}$ and $Q^{\#}$ in the Square of Opposition):
\[
\begin{array}{cccc}
Q_1BC & Q_1CB & Q_1BC & Q_1CB \\
Q_2AB & Q_2AB & Q_2BA & Q_2BA \\
\hline
Q_3AC & Q_3AC & Q_3AC & Q_3AC
\end{array}
\]
I have listed the 'four figures' left to right. The presentation is the traditional one, with the major premiss first. The figures are defined according to whether the major term - the predicate of the conclusion (C in the
above forms) - and the minor term - the subject of the conclusion (A in the above forms) - occur as predicate or subject in the major and the minor premiss, respectively. The major premiss is the premiss that contains the major term. It has been shown (cf. Van Benthem (1984), Westerståhl (1984), and Zwarts (1983)) that no quantifiers satisfy the patterns:
\[
\begin{array}{ccc}
QCB & QBC & QCB \\
QAB & QBA & QBA \\
\hline
QAC & QAC & QAC \\
\text{(anti-Euclidicity)} & \text{(Euclidicity)} & \text{(circularity)}
\end{array}
\]
There are no Euclidean quantifiers, there are no circular quantifiers and, modulo a weak version of variety, there are no anti-Euclidean quantifiers. These results preclude the possibility that in the second, third and fourth syllogistic figure above $Q_1 = Q_2 = Q_3$. It would seem that none of the properties of the Square of Opposition that were listed at the beginning of section 2 could guarantee the syllogistic inference for the first figure with $Q_1 = Q_2 = Q_3$. But, of course, the quantifier 'all' has the property of
TRANSITIVITY: $(QBC\ \&\ QAB) \rightarrow QAC$.
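TRANSITIVITY can be tested by brute force for the four quantifiers of the Square. The sketch below is illustrative only: the set-theoretic readings in `QUANTIFIERS` are the standard ones, all three terms are restricted to non-empty sets as a simple stand-in for the existential-import presupposition, and the helper names are hypothetical.

```python
from itertools import combinations, product

def nonempty_subsets(universe):
    """All non-empty subsets of a finite universe, as frozensets."""
    items = list(universe)
    return [frozenset(c) for r in range(1, len(items) + 1)
            for c in combinations(items, r)]

QUANTIFIERS = {
    "all": lambda a, b: a <= b,
    "no": lambda a, b: not (a & b),
    "some": lambda a, b: bool(a & b),
    "not all": lambda a, b: not (a <= b),
}

def is_transitive(q, universe):
    """Check (QBC & QAB) -> QAC for all non-empty A, B, C."""
    subs = nonempty_subsets(universe)
    return all(q(a, c)
               for a, b, c in product(subs, repeat=3)
               if q(b, c) and q(a, b))

results = {name: is_transitive(q, {1, 2}) for name, q in QUANTIFIERS.items()}
print(results)
```

A two-element universe already separates the cases: for 'some', $A = \{1\}$, $B = \{1,2\}$, $C = \{2\}$ satisfies both premisses but not the conclusion.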
Some important results about this property: If $Q$ is transitive then: $QAB \rightarrow (A \subseteq B \text{ or } QA\emptyset)$ (cf. Van Benthem (1984); in the proof the assumption of finite models is used). If $Q$ is transitive then: $QA\emptyset \rightarrow QAB$ (cf. Westerståhl (1984); again finite models are assumed). Under VARIETY, 'all' and 'all$_e$' are the only transitive quantifiers (Van Benthem (1984)). Thus, under VARIETY and SUBSTANCE, 'all' is the only transitive quantifier. The fact that the quantifier 'all' has the property TRANSITIVITY means the following, in view of theorem 4: under the assumptions of SUBSTANCE and DENUM, NONTRIV + COSYM + SUBALT imply TRANS. (NONTRIV is the property of a quantifier to be neither the empty nor the universal relation.) Note that the co-quantifier, the dual and the opposite of 'all' do not observe TRANS. Thus there can only be one instance of the first figure where $Q_1 = Q_2 = Q_3$, and the pattern of 'Barbara' emerges:
\[
\begin{array}{c}
\text{all } B \text{ are } C \\
\text{all } A \text{ are } B \\
\hline
\text{all } A \text{ are } C
\end{array}
\]
In the theory of the syllogism a distinction is made among valid syllogisms between pure and subaltern syllogisms. A pure syllogism is a syllogism that has $Q$ or $Q^{\smile}$ in the conclusion, or, if it has $Q^{*}$ or $Q^{\#}$, then the argument breaks down if $Q^{*}$ or $Q^{\#}$ is replaced by $Q$ or $Q^{\smile}$ respectively. A subaltern syllogism is a syllogism that has $Q^{*}$ or $Q^{\#}$ as quantifier of the conclusion, and moreover the argument still holds if $Q^{*}$ or $Q^{\#}$ are replaced by $Q$ or $Q^{\smile}$, respectively. Thus in the pure syllogisms the conclusion is as strong as the argument warrants; in the subaltern syllogisms this is not the case.
THEOREM 6: All valid syllogisms can be derived from 'Barbara' by means of the properties COSYMMETRY, SUBALTERNACY, and CONSERVATIVITY of the quantifier 'all'.
PROOF: It is well-known that the valid pure syllogisms of the second, third and fourth figure can be derived from ('reduced to') valid pure syllogisms of the first figure by means of simple conversion, propositional reasoning, and conversion per accidens on the conclusion. (Cf. Prior (1967) for an account of this 'reductio'.) We have seen in section 2 that COSYM accounts for simple conversion and COSYM + SUBALT for conversion per accidens. Any valid subaltern syllogism can be derived from a valid pure syllogism by means of an application of SUBALT to the conclusion of the pure syllogism. We are going to add a reductio of all valid first-figure modi to 'Barbara'. The reduction of the syllogism 'Celarent' to 'Barbara' uses only substitution and CONSERVATIVITY:
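\[
\begin{array}{ll}
Q^{\smile}BC,\ QAB \vdash Q^{\smile}AC & \text{(Celarent)} \\
QB(B-C),\ QAB \vdash QA(A-C) & \text{DEF. } Q^{\smile} \\
QB((A \cup B)-C),\ QAB \vdash QA((A \cup B)-C) & \text{CONSERV} \\
QBC,\ QAB \vdash QAC & \text{substitution (Barbara)}
\end{array}
\]
Here the first three patterns are equivalent, and the third is a substitution instance of 'Barbara'.
The reduction of 'Darii' to 'Barbara' also uses COSYMMETRY and propositional contraposition:
\[
\begin{array}{ll}
QBC,\ Q^{*}AB \vdash Q^{*}AC & \text{(Darii)} \\
QBC,\ Q^{*}BA \vdash Q^{*}CA & \text{COSYM} \\
QBC,\ \neg Q^{\smile}BA \vdash \neg Q^{\smile}CA & \text{DEF. } Q^{*} \\
QBC,\ Q^{\smile}CA \vdash Q^{\smile}BA & \text{prop. logic} \\
QBC,\ QC(C-A) \vdash QB(B-A) & \text{DEF. } Q^{\smile} \\
QBC,\ QC((B \cup C)-A) \vdash QB((B \cup C)-A) & \text{CONSERV} \\
QBC,\ QAB \vdash QAC & \text{substitution, reversal of premisses (Barbara)}
\end{array}
\]
The reduction of 'Ferio' to 'Barbara':
\[
\begin{array}{ll}
Q^{\smile}BC,\ Q^{*}AB \vdash Q^{\#}AC & \text{(Ferio)} \\
Q^{\smile}CB,\ Q^{*}AB \vdash Q^{\#}AC & \text{COSYM} \\
Q^{\smile}CB,\ \neg Q^{\smile}AB \vdash \neg QAC & \text{DEFs} \\
Q^{\smile}CB,\ QAC \vdash Q^{\smile}AB & \text{prop. logic} \\
QC(C-B),\ QAC \vdash QA(A-B) & \text{DEFs} \\
QC((C \cup A)-B),\ QAC \vdash QA((C \cup A)-B) & \text{CONSERV} \\
QBC,\ QAB \vdash QAC & \text{substitution (Barbara)}
\end{array}
\]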
Note that we have used nothing but CONSERVATIVITY, a property that all quantifiers have, and COSYMMETRY. This proves the theorem.
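The four first-figure modi and the set identity behind the CONSERV steps can be checked semantically on a small universe. This is only a sanity check under assumptions, not a replacement for the reductions: $Q$ is read as 'all' (inclusion), the other three quantifiers are obtained from it by the definitions of section 1, terms are restricted to non-empty sets, and the names `FIRST_FIGURE` and `valid` are hypothetical.

```python
from itertools import combinations, product

def nonempty_subsets(universe):
    """All non-empty subsets of a finite universe, as frozensets."""
    items = list(universe)
    return [frozenset(c) for r in range(1, len(items) + 1)
            for c in combinations(items, r)]

Q = lambda a, b: a <= b                  # 'all'
Q_co = lambda a, b: Q(a, a - b)          # co-quantifier: 'no'
Q_dual = lambda a, b: not Q(a, a - b)    # dual: 'some'
Q_opp = lambda a, b: not Q(a, b)         # opposite: 'not all'

# (major, minor, conclusion) for the pattern Q1 BC, Q2 AB |- Q3 AC.
FIRST_FIGURE = {
    "Barbara": (Q, Q, Q),
    "Celarent": (Q_co, Q, Q_co),
    "Darii": (Q, Q_dual, Q_dual),
    "Ferio": (Q_co, Q_dual, Q_opp),
}

def valid(major, minor, conclusion, universe):
    """True if the conclusion holds whenever both premisses hold."""
    subs = nonempty_subsets(universe)
    return all(conclusion(a, c)
               for a, b, c in product(subs, repeat=3)
               if major(b, c) and minor(a, b))

U = {1, 2, 3}
verdicts = {name: valid(*mood, U) for name, mood in FIRST_FIGURE.items()}

# Identity used in the CONSERV steps: B & ((A | B) - C) equals B - C.
subs = nonempty_subsets(U)
conserv_identity = all(b & ((a | b) - c) == b - c
                       for a, b, c in product(subs, repeat=3))

print(verdicts, conserv_identity)
```

By contrast, a pattern such as `valid(Q, Q, Q_co, U)` comes out false, so the check does discriminate between moods.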
4. RULES OF THUMB OF SYLLOGISTIC LOGIC
The result of section 3 sheds light on some important "rules of thumb" of syllogistic logic. In the first place, it explains the so-called Rules of Quality:
- A valid syllogism does not have two negative premises (i.e. premises with $Q^{\smile}$ or $Q^{\#}$ as their quantifier).
- A valid syllogism has a negative conclusion iff it has one negative premiss.
We have seen that a syllogism is valid iff it can be derived from BARBARA, a form with only positive statements, by propositional contraposition, applications of the properties COSYMMETRY and CONSERVATIVITY to a premiss or a conclusion, applications of the definition $QAB \Leftrightarrow Q^{\smile}A(A-B)$ to a premiss and the conclusion, and applications of SUBALTERNACY to the conclusion. COSYMMETRY, CONSERVATIVITY and SUBALTERNACY do not permit the transition from a positive to a negative statement. Applications of the definition $QAB \Leftrightarrow Q^{\smile}A(A-B)$ to a premiss and the conclusion always introduce a negative conclusion and one negative premiss. Propositional contraposition, when applied to positive statements, likewise introduces a negative conclusion and one negative premiss; when applied to a positive and a negative statement, the qualities of premises and conclusion remain as they were, and when applied to two negative statements the quality of conclusion and negative premiss is changed back to positive. As 'Barbara' observes the rules of quality, and as all reduction rules preserve them, every valid syllogism observes these rules.
Next, there are the Dictum de Omni, often considered as the cornerstone of traditional logic, and the theory of Distribution. The authorities on traditional logic that I have consulted disagree about what the Dictum de Omni says and even about whether it makes sense. For a reformulation of the version in Geach (1980) in terms of properties of generalized quantifiers, cf. Van Benthem (1983b). A simpler version of the Dictum that also makes sense says that whatever is distributively predicated of any class must be predicated of anything belonging to that class (cf. Prior (1967)). This version of the Dictum is closely connected to the theory of distribution.
Letting $Q$ be the lefthand top quantifier in the Square of Opposition, and using the notation introduced above, medieval terminology has it that $Q$ 'distributes its subject' and $Q^{\smile}$ 'distributes its predicate'. The theory of distribution can be stated as follows: '$Q_1\,{-}\,{-},\ Q_2\,{-}\,{-} \Rightarrow Q_3AC$' is a valid syllogism only if:
(i) the middle term (the term not occurring in the conclusion) is distributed in at least one of the premises;
(iia) if A is distributed in the conclusion $Q_3AC$, then A is distributed in the minor premiss (the premiss in which A occurs);
(iib) if C is distributed in the conclusion $Q_3AC$, then C is distributed in the major premiss.
A clue to what 'distribution' means is given in Prior (1967:39):
"It is often said [. . .] that a distributed term refers to all, and an undistributed term to only a part, of its extension. But in what way does "Some men are mortal", for example, refer to only a part of the class of men? Any man whatever will do to verify it; if any man whatever turns out to be mortal, "Some men are mortal" is true. What the traditional writers were trying to express seems to be something of the following sort: a term t is distributed in a proposition f(t) if and only if it is replaceable in f(t), without loss of truth, by any term "falling under it" in the way that a species falls under a genus."
This quotation suggests an 'explication' (in Carnap's sense) of the concept 'distribution' in terms of a well-known property that holds for some generalized quantifiers (Van Benthem 1983b offers such an explication; my proposal is only slightly different). A quantifier has the property $\downarrow$MON iff
$\downarrow$MON: $(QAB\ \&\ A' \subseteq A) \rightarrow QA'B$.
A quantifier has the property $\uparrow$MON iff
$\uparrow$MON: $(QAB\ \&\ A \subseteq A') \rightarrow QA'B$.
Note that under the presupposition of SUBSTANCE no nontrivial quantifier is $\downarrow$MON. The reason is that NONTRIV + $\downarrow$MON for a quantifier $Q$ imply that there is a $B$ such that $Q\emptyset B$, which contradicts SUBSTANCE. SUBSTANCE necessitates a relaxation of $\downarrow$MON:
$\tilde{\downarrow}$MON: $(QAB\ \&\ A' \subseteq A\ \&\ A' \neq \emptyset) \rightarrow QA'B$.
If a quantifier $Q$ is $\downarrow$MON without SUBSTANCE, then $Q$ with SUBSTANCE is $\tilde{\downarrow}$MON. $\tilde{\downarrow}$MON will be called 'almost left monotonic downward'. This notion has an obvious counterpart. A quantifier $Q$ is $\tilde{\uparrow}$MON ('almost left monotonic upward') iff
$\tilde{\uparrow}$MON: $(QAB\ \&\ A \subseteq A'\ \&\ A \neq \emptyset) \rightarrow QA'B$.
Note that going from a quantifier to its co-quantifier preserves left (almost) monotonicity in the downward, but not in the upward direction, and that going from a quantifier to its opposite changes the direction of left (almost) monotonicity. Similar definitions for the right hand argument of a quantifier can be given:
MON$\downarrow$: $(QAB\ \&\ B' \subseteq B) \rightarrow QAB'$
MON$\uparrow$, MON$\tilde{\downarrow}$, and MON$\tilde{\uparrow}$ are defined analogously. Going from a quantifier to its co-quantifier reverses the direction of right (almost) monotonicity, and going from a quantifier to its opposite does the same. Because the theory of the syllogism is narrowly tied to SUBSTANCE, I use almost monotonicity instead of monotonicity proper in the explication of the doctrine of distribution:
- A is distributed in $QAB$ iff $Q$ is left almost downward monotonic ($\tilde{\downarrow}$MON);
- B is distributed in $QAB$ iff $Q$ is right almost downward monotonic (MON$\tilde{\downarrow}$).
A modern version of the theory of distribution can now be formulated in terms of almost monotonicity. Note that the theory of distribution is nothing more than a rule of thumb that comes in handy for a quick refutation of validity; if a syllogism does not fulfil the conditions, then it cannot be valid. Notably, the theory does not say that only valid syllogisms fulfil the conditions (i), (iia), (iib). Indeed, here is a counterexample:
No philosophers are fools
No politicians are philosophers
No politicians are fools.
Although (i), (iia), and (iib) are fulfilled, the argument is not valid. (Of course this example sins against the rules of quality; indeed, fulfilling the conditions of the theory of distribution plus the rules of quality seems to be a guarantee for validity.) In order to show that the modern version of the theory of distribution holds we have only to prove that all valid syllogisms observe (i), (iia) and (iib).
THEOREM 7: The modern version of the theory of distribution holds.
PROOF: Observe that the quantifiers that fit the Square of Opposition have the following monotonicity properties (an arrow to the left of a quantifier gives the direction for its left argument, an arrow to the right the direction for its right argument):
\[
\begin{array}{cc}
\downarrow Q \uparrow & \downarrow Q^{\smile} \downarrow \\
\uparrow Q^{*} \uparrow & \uparrow Q^{\#} \downarrow
\end{array}
\]
(Incidentally, this means that, given SUBSTANCE and DENUM, $\tilde{\downarrow}$MON$\uparrow$ is implied by NONTRIV + COSYM + SUBALT.)
It is easy to verify that the modern version of the theory of distribution holds for the syllogism 'Barbara'. Theorem 6 has told us that a syllogism is valid iff it is derivable from 'Barbara' by a sequence of steps that may involve uniform substitution, propositional reasoning, applications of the quantifier definition $QA(A-B) \Leftrightarrow Q^{\smile}AB$, COSYMMETRY, CONSERVATIVITY, and applications of SUBALTERNACY to the conclusion. Thus, if we can prove that these steps do not affect properties (i), (iia) and (iib) we have proved the theorem. And, for this purpose, a series of routine checks suffices.
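The monotonicity profile of the four quantifiers in the Square can be recomputed by brute force. The sketch below rests on assumptions that are mine rather than the paper's: the first argument ranges over non-empty sets only (a stand-in for SUBSTANCE, which makes left monotonicity 'almost' monotonicity), the right argument ranges over all subsets, and the helper names are hypothetical.

```python
from itertools import combinations

def subsets(universe, minimum=0):
    """Subsets of a finite universe with at least `minimum` elements."""
    items = list(universe)
    return [frozenset(c) for r in range(minimum, len(items) + 1)
            for c in combinations(items, r)]

QUANTIFIERS = {
    "all": lambda a, b: a <= b,
    "no": lambda a, b: not (a & b),
    "some": lambda a, b: bool(a & b),
    "not all": lambda a, b: not (a <= b),
}

def profile(q, universe):
    """Return (left, right) monotonicity directions of q: 'down', 'up',
    'both' or 'none'. Left arguments are non-empty sets only."""
    firsts = subsets(universe, minimum=1)
    seconds = subsets(universe)
    true_pairs = [(a, b) for a in firsts for b in seconds if q(a, b)]
    left_down = all(q(a2, b) for a, b in true_pairs for a2 in firsts if a2 <= a)
    left_up = all(q(a2, b) for a, b in true_pairs for a2 in firsts if a <= a2)
    right_down = all(q(a, b2) for a, b in true_pairs for b2 in seconds if b2 <= b)
    right_up = all(q(a, b2) for a, b in true_pairs for b2 in seconds if b <= b2)

    def direction(down, up):
        if down and up:
            return "both"
        return "down" if down else "up" if up else "none"

    return direction(left_down, left_up), direction(right_down, right_up)

table = {name: profile(q, {1, 2}) for name, q in QUANTIFIERS.items()}
for name, (left, right) in table.items():
    print(f"{name:8s} left: {left:5s} right: {right}")
```

On this reading, a term is distributed exactly where the corresponding direction is 'down': the subject for 'all', both terms for 'no', neither for 'some', and the predicate for 'not all'.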
We conclude with an open problem. Can our generalized quantifier perspective be used to provide an illuminating motivation for the success of the combined Distribution/Quality test? Classical arguments in this area consist in mere combinational checking of all possible cases: one would like to replace that by a more semantic analysis.
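The combined Distribution/Quality test is itself simple enough to write down as a mechanical check. The following sketch is an illustration under assumptions: it uses the traditional letters A, E, I, O for $Q$, $Q^{\smile}$, $Q^{*}$, $Q^{\#}$ (a labelling not used in the text above), takes distribution from the monotonicity table of Theorem 7, and all function names are hypothetical.

```python
# (subject distributed, predicate distributed) for each quantifier.
DISTRIBUTES = {
    "A": (True, False),   # Q:  'all'
    "E": (True, True),    # co-quantifier: 'no'
    "I": (False, False),  # dual: 'some'
    "O": (False, True),   # opposite: 'not all'
}
NEGATIVE = {"E", "O"}

# (subject, predicate) of the major and the minor premiss in each figure.
# The conclusion is always (A, C); B is the middle term.
FIGURES = {
    1: (("B", "C"), ("A", "B")),
    2: (("C", "B"), ("A", "B")),
    3: (("B", "C"), ("B", "A")),
    4: (("C", "B"), ("B", "A")),
}

def distributed(quant, terms, term):
    """Is `term` distributed in the statement `quant terms[0] terms[1]`?"""
    subject, predicate = terms
    subj_dist, pred_dist = DISTRIBUTES[quant]
    return (term == subject and subj_dist) or (term == predicate and pred_dist)

def passes_distribution(major, minor, conclusion, figure):
    """Conditions (i), (iia) and (iib) of the theory of distribution."""
    major_terms, minor_terms = FIGURES[figure]
    middle_ok = (distributed(major, major_terms, "B")
                 or distributed(minor, minor_terms, "B"))
    minor_ok = (not distributed(conclusion, ("A", "C"), "A")
                or distributed(minor, minor_terms, "A"))
    major_ok = (not distributed(conclusion, ("A", "C"), "C")
                or distributed(major, major_terms, "C"))
    return middle_ok and minor_ok and major_ok

def passes_quality(major, minor, conclusion):
    """At most one negative premiss; negative conclusion iff exactly one."""
    negatives = (major in NEGATIVE) + (minor in NEGATIVE)
    return negatives < 2 and ((conclusion in NEGATIVE) == (negatives == 1))

survivors = [(major, minor, conclusion, figure)
             for figure in FIGURES
             for major in DISTRIBUTES
             for minor in DISTRIBUTES
             for conclusion in DISTRIBUTES
             if passes_distribution(major, minor, conclusion, figure)
             and passes_quality(major, minor, conclusion)]

print(len(survivors), "moods pass both tests")
```

The counterexample from the text (E, E, E in the first figure) passes the distribution test but fails the quality test, while 'Barbara' (A, A, A in the first figure) passes both.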
NOTE
1. After the completion of this paper my attention was drawn to Gottschalk (1953), where the operations that I have called '$\#$', '$\smile$' and '$*$' are discussed. Gottschalk calls the 4-group formed by those operations and the identity operation the "group of quaternality". He does not elaborate on the quantifier case, but he discusses the operations in a more general setting. Briefly, if $A$ and $E$ are sets, and $F$ is an automorphism on $A$, then any function $f$ with domain $A^E$ and range $A$ has a co-function, an opposite function, and a dual function. Let $\bar{F}$ be the extension of $F$ to $A^E$, i.e. for any $g \in A^E$, $\bar{F}(g)$ is defined as $\bar{F}(g)(e) = F(g(e))$ (for all $e \in E$). Now the opposite function of $f$ is $F \circ f$; the co-function of $f$ is $f \circ \bar{F}$; and the dual function of $f$ is $F \circ f \circ \bar{F}$. The case where $A = \{0,1\}$, $F = \{\langle 1,0 \rangle, \langle 0,1 \rangle\}$ and $E$ is a domain of individuals gives the "theory of quaternality" for unrestricted quantifiers.
REFERENCES
Barwise, Jon, & Robin Cooper: (1981), 'Generalized Quantifiers and Natural Language', Linguistics and Philosophy 4, 159-219.
van Benthem, Johan: (1983a), 'Determiners and Logic', Linguistics and Philosophy 6, 447-478.
van Benthem, Johan: (1983b), 'A Linguistic Turn: New Directions in Logic', to appear in Logic, Methodology and Philosophy of Science VII, North Holland Publ. Co., Amsterdam.
van Benthem, Johan: (1984), 'Questions about Quantifiers', Journal of Symbolic Logic 49:2, 443-466.
Geach, Peter Thomas: (1980), Reference and Generality, Cornell University Press, Ithaca and London; first edition: 1962.
Gottschalk, W.H.: (1953), 'The Theory of Quaternality', Journal of Symbolic Logic 18, 193-196.
Mostowski, A.: (1957), 'On a Generalization of Quantifiers', Fundamenta Mathematicae 44, 12-36.
Prior, A.N.: (1967), 'Traditional Logic', in P. Edwards (ed.), The Encyclopedia of Philosophy, Macmillan, New York & London, Vol. V, 34-45.
Westerståhl, Dag: (1982), 'Logical Constants in Quantifier Languages', to appear in Linguistics and Philosophy.
Westerståhl, Dag: (1984), 'Some Results on Quantifiers', Notre Dame Journal of Formal Logic 25:2, 152-170.
Zwarts, Frans: (1983), 'Determiners: A Relational Perspective', in A. ter Meulen (ed.), Studies in Modeltheoretic Semantics, GRASS 1, Foris, Dordrecht, 37-62.
Chapter 2
Generalized Quantifiers: the Properness of their Strength
Franciska de Jong and Henk Verkuyl
0. INTRODUCTION
In the theory of generalized quantification determiners are interpreted as relations between sets of individuals. That is, a determiner Det in the structure $[[_{NP}\ \text{Det N}]\ \text{VP}]$ is taken as a functor $D_E$ operating on a universe $E$ and relating a set $A$ to a set $B$, where $A = [\![\text{N}]\!]$ and $B = [\![\text{VP}]\!]$ and $A, B \subseteq E$. From a linguistic point of view an important question is whether the semantic properties of NPs can be related to their syntactic properties. Our purpose is to provide a positive answer to that question. However, this answer requires that a decision be made with respect to the question whether partial interpretation of NPs is allowed. We argue in favour of a system that allows it. The fact that one can shift by mere stipulation from partial (Barwise & Cooper (1981) and Zwarts (1981)) to total (van Benthem (1984) and Zwarts (1983)) interpretation indicates some weakness in the theory of generalized quantification as applied to natural language. It is important to decide the matter on empirical grounds. In section 1 we discuss the B&C-framework as modified by Zwarts (1981) in view of the Dutch data. Our object language is also Dutch. Zwarts (1981) distinguishes between proper and improper quantifiers and between strong and weak quantifiers, slightly deviating from the B&C-definitions. We argue in section 2 that the notion of properness (sieve) is redundant: it can be reduced to the (revised) notion of strength. In Zwarts (1981) improperness and strength are restricted to NPs expressing universal quantification. We think that such NPs are strong and proper, the latter property being an epiphenomenon of universal quantification. Our revision of Zwarts's classification, given in section 3, leads, in section 4, to a classification reflecting the systematical relation of the semantic NP-internal and NP-external properties of determiners to some of their syntactic properties.
* We would like to thank Frans Zwarts whose helpful comment led to the discussion about total and partial interpretation. We are grateful to Cecile Wijne for her enthusiastic participation in the initial phase preceding the various stages this paper went through and to Anke Le Loux and Leonoor Oversteegen for their comments on earlier drafts. The editors provided helpful suggestions to give this paper a linguistic flavour rather than the bouquet of a modeltheoretic exercise. The research was partially supported by the Foundation for Linguistic Research, which is funded by the Netherlands Organization for the Advancement of Pure Research, ZWO.
1. SOME DEFINITIONS
To interpret expressions of natural language Zwarts (1981) assumes a set-theoretical model $M = \langle E, \llbracket\ \rrbracket \rangle$, where $E$ is the domain of discourse and $\llbracket\ \rrbracket$ is an interpretation function assigning quantifiers $Q$ on $M$ as values to NPs, where $Q$ is a collection of subsets of $E$. The quantifier $Q$ that is the denotation of an expression $\alpha$ of the category NP is short for $\llbracket \alpha \rrbracket$. In the definitions we use the expression NP as abbreviating 'an expression $\alpha$ of the category NP'.

An NP is monotone increasing iff on every model $M$, for all $X, Y \subseteq E$, if $X \in Q$ and $X \subseteq Y$, then $Y \in Q$. Examples of monotone increasing quantifiers are $\llbracket \text{de N} \rrbracket$ (the), $\llbracket \text{alle N} \rrbracket$ (all), $\llbracket \text{sommige N} \rrbracket$ (some of the). An NP is monotone decreasing iff on every model $M$, for all $X, Y \subseteq E$, if $X \in Q$ and $Y \subseteq X$, then $Y \in Q$. Examples: $\llbracket \text{niet alle N} \rrbracket$ (not all), $\llbracket \text{weinig N} \rrbracket$ (few), $\llbracket \text{hoogstens } n \text{ N} \rrbracket$ (at most $n$). The notion of monotonicity as applied to quantifiers in natural language is quite familiar nowadays and also unproblematic, though we shall modify it later on. Numerals such as twee (two), drie (three), etc. and weinige (few), enkele (some, sm), are taken to have an "at least" meaning. Their corresponding quantifiers will therefore be monotone. Non-monotone quantifiers such as $\llbracket \text{precies } n \text{ N} \rrbracket$ (exactly $n$) fall outside the scope of this article.

The second partition separates proper quantifiers from improper ones:

(1) An NP is proper iff on every model $M$ on which $\llbracket NP \rrbracket$ is defined: (i) $Q \neq \emptyset$ and (ii) $Q \neq \mathrm{Pow}(E)$.

An NP is improper only if it is not proper. Some relevant examples of proper NPs under (1) are de N (the), de meeste N (most), sommige N (some of the) and Bertrand. Expressions such as geen N (no) and alle N (all) are improper given their appropriate definitions. Consider, for example, the definition of $\llbracket \text{alle N} \rrbracket$ (all) in (2) and the definition of $\llbracket \text{een N} \rrbracket$ (a) in (3).

(2) $\llbracket \text{alle N} \rrbracket = \{ X \subseteq E \mid X \cap \llbracket N \rrbracket = \llbracket N \rrbracket \}$

(3) $\llbracket \text{een N} \rrbracket = \{ X \subseteq E \mid X \cap \llbracket N \rrbracket \neq \emptyset \}$
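Definitions (1)-(3) and the two monotonicity conditions can be checked mechanically on a small finite model. The following Python sketch is only an illustration: the three-element universe and all function names (`powerset`, `alle`, `een`, `is_proper_on`, and so on) are assumptions introduced here, and properness is tested on one model at a time rather than "on every model" as (1) requires.

```python
from itertools import combinations

def powerset(universe):
    """Return Pow(universe) as a set of frozensets."""
    items = list(universe)
    return {frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)}

def alle(E, N):
    """(2): all X ⊆ E with X ∩ N = N."""
    return {X for X in powerset(E) if X & N == N}

def een(E, N):
    """(3): all X ⊆ E with X ∩ N ≠ ∅."""
    return {X for X in powerset(E) if X & N}

def is_mon_increasing(E, Q):
    """X ∈ Q and X ⊆ Y ⊆ E imply Y ∈ Q."""
    return all(Y in Q for X in Q for Y in powerset(E) if X <= Y)

def is_mon_decreasing(E, Q):
    """X ∈ Q and Y ⊆ X imply Y ∈ Q."""
    return all(Y in Q for X in Q for Y in powerset(E) if Y <= X)

def is_proper_on(E, Q):
    """(1) on a single model: Q ≠ ∅ and Q ≠ Pow(E)."""
    return len(Q) > 0 and Q != powerset(E)

# Hypothetical toy model: three individuals, two of them in the noun set.
E = frozenset({"a", "b", "c"})
N = frozenset({"a", "b"})
empty = frozenset()

print(is_proper_on(E, alle(E, N)))      # True
print(alle(E, empty) == powerset(E))    # True: condition (1ii) fails
print(een(E, empty) == set())           # True: condition (1i) fails
```

With a non-empty noun set both quantifiers behave properly on this model; it is only the choice of an empty noun denotation that produces the two kinds of improperness.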
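Suppose that there are no individuals in $E$ belonging to the set $\llbracket N \rrbracket$, i.e. $\llbracket N \rrbracket = \emptyset$. Then every member of $\mathrm{Pow}(E)$ can be a value of $X$ in $X \cap \llbracket N \rrbracket = \llbracket N \rrbracket$, because $X \cap \emptyset = \emptyset$ is always true. So if $\llbracket N \rrbracket = \emptyset$, then $\llbracket \text{alle N} \rrbracket = \mathrm{Pow}(E)$, which makes the NP improper according to (1ii). Furthermore, given (3) and given the absence of believers in $E$, $\llbracket \text{een N} \rrbracket$ (a) is also improper, because for all $X \subseteq E$, $X \cap \emptyset = \emptyset$. Hence $\llbracket \text{een N} \rrbracket$ has no members, thus violating (i) in (1).

Zwarts' third distinction concerns strength:

(4) a. An NP is positive strong iff for every model $M$ on which $\llbracket NP \rrbracket$ is defined, $E \in Q$.
    b. An NP is negative strong iff for every model $M$ on which $\llbracket NP \rrbracket$ is defined, $E \notin Q$.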
An NP is weak only if it is not strong. According to Barwise & Cooper (1981:182) the distinction between weak and strong applies to determiners rather than to quantifiers. Zwarts raises this distinction to the NP level because it explains the distribution of determiners in Dutch existential sentences.¹ The Dutch determiner sommige, which has no clear equivalent in English (except possibly for the stressed use of some), behaves like a strong determiner: it cannot occur in the subject position of existential sentences. The B&C definition, however, would mark it as weak, which is corrected by (4) as far as Dutch is concerned.

Zwarts' definition of filters is given in (5):

(5) An NP is a filter iff on every model $M$ on which $\llbracket NP \rrbracket$ is defined: for all $X, Y \subseteq E$, $(X \in Q \ \&\ Y \in Q) \leftrightarrow X \cap Y \in Q$. Condition: $Q \neq \emptyset$.
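Strength (4) and the filter condition (5) lend themselves to the same kind of brute-force check. The sketch below is again an illustration under assumptions: the toy universe and the function names are hypothetical, and each test inspects a single model, so a `True` for one noun set does not by itself establish the property "on every model".

```python
from itertools import combinations

def powerset(universe):
    """Return Pow(universe) as a set of frozensets."""
    items = list(universe)
    return {frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)}

def alle(E, N):
    """(2): all X ⊆ E with X ∩ N = N."""
    return {X for X in powerset(E) if X & N == N}

def een(E, N):
    """(3): all X ⊆ E with X ∩ N ≠ ∅."""
    return {X for X in powerset(E) if X & N}

def positive_strong_on(E, Q):
    """(4a) on one model: E ∈ Q."""
    return frozenset(E) in Q

def negative_strong_on(E, Q):
    """(4b) on one model: E ∉ Q."""
    return frozenset(E) not in Q

def is_filter_on(E, Q):
    """(5) on one model: Q ≠ ∅ and (X ∈ Q & Y ∈ Q) ↔ X ∩ Y ∈ Q."""
    if not Q:
        return False
    subsets = powerset(E)
    return all(((X in Q) and (Y in Q)) == ((X & Y) in Q)
               for X in subsets for Y in subsets)

E = frozenset({"a", "b", "c"})
N = frozenset({"a", "b"})
empty = frozenset()

print(is_filter_on(E, alle(E, N)))            # True
print(is_filter_on(E, een(E, N)))             # False: {a} and {b} are in Q, their intersection is not
print(positive_strong_on(E, alle(E, empty)))  # True: Pow(E) contains E
print(positive_strong_on(E, een(E, empty)))   # False: the quantifier is empty
```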
Examples of filters are $\llbracket \text{de N} \rrbracket$ (the), $\llbracket \text{beide N} \rrbracket$ (both), and $\llbracket \text{Bertrand} \rrbracket$. An NP is a non-filter only if it is not a filter. Zwarts needs a properness condition in (5): $Q$ cannot be the empty set. It follows that filters can only become improper if they denote the power set of the domain. Filters constitute a subset of the set of monotone increasing quantifiers, but they have a counterpart in the set of monotone decreasing quantifiers, namely ideals:

(6) An NP is an ideal iff on every model $M$ on which $\llbracket NP \rrbracket$ is defined: for all $X, Y \subseteq E$, $(X \in Q \ \&\ Y \in Q) \leftrightarrow X \cup Y \in Q$. Condition: $Q \neq \emptyset$.
Note that again one of the conditions of properness is called upon. An NP is non-ideal only if it is not an ideal. Both the class of filters and the class of ideals contain proper subsets that are characterized by the second condition on properness in (1), viz. $Q \neq \mathrm{Pow}(E)$. We will leave out the definition of proper filters, assuming them to be filters that are proper on every model given (1). Examples of proper filters are $\llbracket \text{beide N} \rrbracket$ (both) and $\llbracket \text{de N} \rrbracket$ (the), whereas $\llbracket \text{alle N} \rrbracket$ (all) is an example of an improper filter. Ideals like $\llbracket \text{geen van beide N} \rrbracket$ (none of both N) and $\llbracket \text{not Bertrand} \rrbracket$ are proper, whereas $\llbracket \text{niemand} \rrbracket$ (nobody) and $\llbracket \text{alleen N} \rrbracket$ (only N) are improper ideals.

Zwarts' classification is construed by the following opposition pairs:

(7) (a) monotone increasing   vs   monotone decreasing
    (b) proper                vs   improper
    (c) strong                vs   weak
    (d) filter/ideal          vs   non-filter/non-ideal
Note that this mixes up notions that are hierarchically ordered with notions that are not inherently related to each other. Consequently, a classification of NPs based on (7) and its definitions is a bit of a jumble, as shown by diagram I. The notion of strength seems to interfere with the notion of properness in a quite unsatisfactory way. As the notion of properness is related to virtually all oppositions in diagram I, it is worth focusing on this interference. One of the peculiarities of (I) is that the distinction between weak and strong shows up twice, because it applies to both proper and improper NPs. On the other hand, the set of proper weak NPs is empty. This fact calls for an explanation. The need becomes even more urgent if we take into account its reverse counterpart, the category of improper strong NPs, which is poorly filled. If this category were also empty, an obvious reduction to a bipartition could be effected. In the next section we argue to that effect.
DIAGRAM I

MONOTONE INCREASING
  Filter
    proper, strong:    de N (the), beide N (both), de beide (the both), de n N (the n), proper names
    proper, weak:      (none)
    improper, strong:  alle N (all), elke N (every), alles (everything)
    improper, weak:    (none)
  Non-filter
    proper, strong:    sommige N (some of the, certain), de meeste N (most), enkele van de N (some of the)
    proper, weak:      (none)
    improper, strong:  (none)
    improper, weak:    enkele (N) (some), niet alleen N (not only), n N, minstens n N (at least n)

MONOTONE DECREASING
  Ideal
    proper, strong:    geen van de N (none of the), geen van beide (none of the both), niet de N (not the)
    proper, weak:      (none)
    improper, strong:  (none)
    improper, weak:    geen N (no), alleen N (only), niets (nothing)
  Non-ideal
    proper, strong:    (none)
    proper, weak:      (none)
    improper, strong:  niet alle N (not all), niet alles (not everything)
    improper, weak:    hoogstens n N (at most n), weinig N (few)

2. THE REDUCTION OF PROPERNESS TO STRENGTH
In section 2.1 we show that the reduction of properness to strength bears on the treatment of universal quantification. In section 2.2 we will argue for a partial interpretation of universal quantifiers.

2.1. Properness, strength and universal quantification
That there are no proper weak NPs is accounted for by definition. A proper NP never denotes the empty set, given (1i). It follows that for a proper NP there is always a subset $X$ of $E$ such that $X$ is a member of the corresponding quantifier. As a consequence proper monotone increasing quantifiers always have $E$ as a member. So given the properness of $\llbracket \text{de N} \rrbracket$ (the) and $\llbracket \text{sommige N} \rrbracket$ (some of the), they are (positive) strong by necessity. An analogous argument holds for the monotone decreasing cases. Given the properness of $\llbracket \text{geen van beide N} \rrbracket$ (none of both) and $\llbracket \text{geen van de N} \rrbracket$ (none of the), these quantifiers are negative strong.

Evidently properness implies strength, but the reverse does not hold. Some strong NPs can be improper, as noted earlier. It seems preferable to have a system in which strength implies properness as well, so that the subclass of improper strong NPs can be eliminated. There appear to be more substantial reasons, however, to aim at this goal. Firstly, in Zwarts (1981) all improper positive strong NPs denote the same quantifier. That is, alle N (all), alles (everything), and elke N (every) all have the same definition. The same applies to the improper negative strong NPs. However, the two types of quantifier do not exhibit the same sort of improperness. Positive strong NPs are improper if they denote $\mathrm{Pow}(E)$. Negative strong NPs, however, are improper if they denote $\emptyset$. Both categories share the property of being improper in case their N denotes the empty set. We shall illustrate this point shortly.

The second reason is that there is considerable redundancy in the conditions under which NPs are strong and improper. The fact that properness is incompatible with weakness is clearly indicative of this. If a monotone increasing quantifier always contains $E$, being strong, it cannot denote the empty set, so it cannot be but proper. On the other hand, if a monotone decreasing quantifier never includes $E$, being strong, it cannot denote $\mathrm{Pow}(E)$, so it cannot be but proper either.

It is worthwhile to go further into both issues raised here. Monotone increasing quantification involves the increase of the cardinality, by natural inclusion, of the subsets constituting $\mathrm{Pow}(E)$ from a given lower bound towards $E$. For strong NPs the upper bound is $E$, as provided for by (4a), and for all quantifiers except the universals the lower bound lies above $\emptyset$, i.e. at least $\emptyset$ is not included. For the strong cases this is due to a restriction on the cardinality of $\llbracket N \rrbracket$ in the definition of the quantifier. The weak NPs do not include $\emptyset$ either, due to a cardinality condition on the intersection $X \cap \llbracket N \rrbracket$ that is involved in their interpretation, as we will show in (8).

Likewise, monotone decreasing quantification can be taken as counting down from a given upper bound towards $\emptyset$. For strong NPs this upper bound is below $E$, as provided for by (4b), i.e. at least $E$ is not included. In this respect the (negative) universal ones are once again exceptional: of the monotone decreasing quantifiers only the universal ones do not contain $\emptyset$. That all other strong cases contain $\emptyset$ is due to a condition on the cardinality of $\llbracket N \rrbracket$. Again, the weak NPs contain the empty set due to a cardinality condition on $X \cap \llbracket N \rrbracket$ in their definition. Note that this effect is the reverse of the effect for the monotone increasing weak NPs.
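The step from properness to strength made at the start of this section can be written out in symbols. The lines below only restate that argument, using conditions (1i) and (1ii) and the monotonicity definitions of section 1; the layout as a two-part derivation is ours.

```latex
\begin{align*}
\textbf{Increasing:}\quad
  Q \neq \emptyset &\;\Rightarrow\; \exists X \subseteq E:\ X \in Q
      && \text{by (1i)}\\
  X \in Q,\ X \subseteq E &\;\Rightarrow\; E \in Q
      && \text{upward closure, so (4a) holds}\\[6pt]
\textbf{Decreasing:}\quad
  Q \neq \mathrm{Pow}(E) &\;\Rightarrow\; \exists X \subseteq E:\ X \notin Q
      && \text{by (1ii)}\\
  E \in Q,\ X \subseteq E &\;\Rightarrow\; X \in Q
      && \text{downward closure, a contradiction}\\
  \text{hence}\quad E &\notin Q
      && \text{so (4b) holds}
\end{align*}
```

The converse fails for exactly the cases discussed above: a quantifier equal to $\mathrm{Pow}(E)$ contains $E$ without being proper, and an empty quantifier lacks $E$ without being proper.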
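We will now elaborate on the role of $\emptyset$ as a possible member of $Q$, because the empty set plays a key role in the improper denotation of monotone increasing NPs. First, we give the Zwarts definitions of some relevant NP denotations (OU abbreviates 'otherwise undefined'):

(8) a. $\llbracket \text{alle N} \rrbracket$ (all) $= \{ X \subseteq E \mid X \cap \llbracket N \rrbracket = \llbracket N \rrbracket \}$
    b. $\llbracket \text{niet alle N} \rrbracket$ (not all) $= \{ X \subseteq E \mid X \cap \llbracket N \rrbracket \neq \llbracket N \rrbracket \}$
    c. $\llbracket \text{de N}_{pl} \rrbracket$ (the) $= \llbracket \text{alle N} \rrbracket$; $\mathrm{card}(\llbracket N \rrbracket) \geq 2$, OU
    d. $\llbracket \text{sommige N} \rrbracket$ (some of the) $= \llbracket \text{enkele N} \rrbracket$; $\mathrm{card}(\llbracket N \rrbracket) \geq 2$, OU
    e. $\llbracket n \text{ N} \rrbracket$ $= \{ X \subseteq E \mid \mathrm{card}(X \cap \llbracket N \rrbracket) \geq n \}$
    f. $\llbracket \text{enkele N} \rrbracket$ (some, sm) $= \{ X \subseteq E \mid \mathrm{card}(X \cap \llbracket N \rrbracket) \geq 2 \}$
    g. $\llbracket \text{hoogstens } n \text{ N} \rrbracket$ (at most $n$) $= \{ X \subseteq E \mid \mathrm{card}(X \cap \llbracket N \rrbracket) \leq n \}$
    h. $\llbracket \text{geen N} \rrbracket$ (no) $= \{ X \subseteq E \mid X \cap \llbracket N \rrbracket = \emptyset \}$
    i. $\llbracket \text{geen van de N} \rrbracket$ (none of the) $= \llbracket \text{geen N} \rrbracket$; $\mathrm{card}(\llbracket N \rrbracket) \geq 2$, OU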
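The definitions in (8) translate almost directly into executable form. The following Python sketch is an illustration under assumptions: the function names mirror the Dutch determiners but are introduced here, the helper `restricted` is hypothetical, and the value `None` is our stand-in for 'otherwise undefined'.

```python
from itertools import combinations

def powerset(universe):
    """Return Pow(universe) as a set of frozensets."""
    items = list(universe)
    return {frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)}

UNDEFINED = None  # assumption: None models 'otherwise undefined' (OU)

def alle(E, N):            # (8a)
    return {X for X in powerset(E) if X & N == N}

def niet_alle(E, N):       # (8b)
    return {X for X in powerset(E) if X & N != N}

def n_N(E, N, n):          # (8e): 'at least n' reading
    return {X for X in powerset(E) if len(X & N) >= n}

def enkele(E, N):          # (8f)
    return n_N(E, N, 2)

def hoogstens(E, N, n):    # (8g)
    return {X for X in powerset(E) if len(X & N) <= n}

def geen(E, N):            # (8h)
    return {X for X in powerset(E) if not (X & N)}

def restricted(total, E, N):
    """Partial interpretation: defined only if card(N) >= 2."""
    return total(E, N) if len(N) >= 2 else UNDEFINED

def de_pl(E, N):           # (8c)
    return restricted(alle, E, N)

def sommige(E, N):         # (8d)
    return restricted(enkele, E, N)

def geen_van_de(E, N):     # (8i)
    return restricted(geen, E, N)

E = frozenset({"a", "b", "c"})
N = frozenset({"a", "b"})
single = frozenset({"a"})

print(de_pl(E, single))                         # None: undefined on this model
print(len(sommige(E, N)))                       # 2: {a,b} and {a,b,c}
print(enkele(E, single) == set())               # True: total, but empty here
print(hoogstens(E, single, 1) == powerset(E))   # True: total, but Pow(E) here
```

The last two lines show the contrast that matters for what follows: the totally interpreted quantifiers are defined on every model but can collapse to $\emptyset$ or $\mathrm{Pow}(E)$, whereas the partially interpreted ones simply have no value on the offending models.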
Some quantifiers in (8) have a partial interpretation: they have an additional condition on the size of $\llbracket N \rrbracket$. If this condition is not met in a given model, the quantifier is undefined. Our interpretation of monotonicity and strength implies the need to supply the definitions (8a) and (8b) with such a condition. We want to enrich the natural language notion of monotone increasing with the condition that $\emptyset \notin Q$ and the definition of monotone decreasing with the condition that $\emptyset \in Q$.

To illustrate this point, let us consider the two possibilities for $\llbracket N \rrbracket$ with respect to $\llbracket \text{alle N} \rrbracket$, viz. $\llbracket N \rrbracket = \emptyset$ or $\llbracket N \rrbracket \neq \emptyset$. According to (8a), $\emptyset$ can only be a value of $X$ if $\llbracket N \rrbracket = \emptyset$, because $\emptyset \cap \emptyset = \emptyset$, whereas $\llbracket N \rrbracket \neq \emptyset$ would exclude $\emptyset$ as a value of $X$ because $\emptyset \cap \llbracket N \rrbracket \neq \llbracket N \rrbracket$. A similar point holds for $\llbracket \text{niet alle N} \rrbracket$. According to (8b), $\emptyset$ is always a member of the quantifier if $\llbracket N \rrbracket \neq \emptyset$, because $\emptyset \cap \llbracket N \rrbracket = \emptyset$ is always true. So $\emptyset$ can be $X$ in the condition $X \cap \llbracket N \rrbracket \neq \llbracket N \rrbracket$, if $\llbracket N \rrbracket \neq \emptyset$. But if $\llbracket N \rrbracket = \emptyset$ no $X$ can satisfy $X \cap \emptyset \neq \emptyset$.

Suppose now that we eliminate those models in which $\llbracket N \rrbracket = \emptyset$ for $\llbracket \text{(niet) alle N} \rrbracket$. Then the NPs in (8a) and (8b), and in fact all strong NPs, are proper. As to positive strong NPs, $E \in Q$ by definition (4a), so the quantifier cannot be empty, satisfying (1i). $Q$ cannot be $\mathrm{Pow}(E)$ either, because $\emptyset \notin Q$, thus satisfying (1ii). Hence a positive strong NP can only be proper. As to negative strong NPs the reverse appears to hold: $E \notin Q$ by (4b), but $\emptyset \in Q$. Hence $Q$ satisfies both (1i) and (1ii) in this case. The subsets $\emptyset$ and $E$ of $\mathrm{Pow}(E)$ turn out to be the key elements in the classification of NP types within the theory of generalized quantifiers. The restriction imposed on quantifiers by monotonicity concerns the absence or presence of $\emptyset$. Strength imposes a restriction on the absence or presence of $E$ in the denotation of an NP.
Turning now to the improperness of weak NPs, we observe that this property is predictable irrespective of the stipulation we had to make for the strong cases in (8). For example, $\llbracket \text{enkele N} \rrbracket$ (some) never contains $\emptyset$, due to the requirement in (8f) that the intersection $X \cap \llbracket N \rrbracket$ contains at least two members. Nevertheless, $\llbracket \text{enkele N} \rrbracket$ is an improper quantifier because it can denote the empty set in cases where $\mathrm{card}(\llbracket N \rrbracket) \leq 1$, which makes $Q$ improper by (1i). Analogously, $\llbracket \text{hoogstens } n \text{ N} \rrbracket$ as defined in (8g) always contains $\emptyset$, because the cardinality condition only fixes the upper bound. As $E \in Q$ is not excluded, it is possible for $Q$ to be $\mathrm{Pow}(E)$, which makes $Q$ improper by (1ii).

Looking at the set of improper strong NPs just from the observational point of view, we note that among the NPs that are not improper and strong, i.e. the NPs in (8c-i), the following generalization appears to hold. The interpretation of all weak NPs is defined without restrictions, whereas the (proper) strong NPs are defined only for those models which meet certain conditions. In other words, weak NPs have a total interpretation, whereas strong NPs, except for (niet) alle N, have an interpretation that is only partially defined. But why do alle N and niet alle N not get a partial interpretation?

2.2. A partial definition of universal quantifiers
In Zwarts (1981) alle N (all) and niet alle N (not all) have a total interpretation, just like the corresponding NPs in Barwise & Cooper (1981). Unlike the interpretations in (8c), (8d) and (8i), those in (8a) and (8b) are not restricted, as the determiners of natural language expressing universality are assigned the standard interpretation of the universal quantifier in formal logic, i.e. an interpretation involving material implication. As a consequence, a statement like (9) is true if there are no individuals in the domain of interpretation.

(9) $\forall x (Px \rightarrow Qx)$
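The vacuous truth of (9) is easy to reproduce. In the minimal Python sketch below, the function name `forall_implies`, the toy domain and the two predicates are hypothetical; the only point is that material implication makes the universal statement true as soon as nothing satisfies P.

```python
def forall_implies(domain, P, Q):
    """Truth value of (9) over a finite domain, with material implication."""
    return all((not P(x)) or Q(x) for x in domain)

# Hypothetical domain that happens to contain no ravens.
domain = {"chair", "table", "lamp"}

def is_raven(x):
    return False

def is_black(x):
    return x == "lamp"

print(forall_implies(domain, is_raven, is_black))   # True: no raven is a counterexample
print(forall_implies(set(), is_raven, is_black))    # True: empty domain
print(forall_implies(domain, is_black, is_raven))   # False: the lamp is black but no raven
```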
This standard interpretation follows the Frege/Russell theory of quantification: universal quantification is determined by properties of material implication, assigning a total interpretation to all N. Strawson provoked a well-known debate by arguing that quantification in natural language allows for truth value gaps. In Strawson (1952) it is argued that all N should be assigned a partial interpretation. Some of his arguments are comparable with the arguments we give in favour of a partial interpretation of all N. The systematic account of the model-theoretic properties of determiners in the framework of generalized quantification requires that a stand with respect to this controversy be motivated and that a decision be based on the properties of non-universal determiners as well.

Though their interpretation of all N is not partial, Barwise & Cooper (1981) and Zwarts (1981) allow a partial interpretation of several other NPs. In fact they not only allow it, they also need it so as to be able to distinguish between strong NPs and weak NPs along the lines of their definition of 'strong' and 'weak', and to be able to deal with the so-called existential sentences. A total interpretation of e.g. de N$_{pl}$ (plural the) as proposed in Zwarts (1983), who seems to have shifted to the Russellian position, would make de N (the) weak rather than strong. Furthermore, the differences between sommige (some of the) and enkele (some) can only be expressed if one allows both partial and total interpretation. So partial interpretation is a useful instrument in explaining certain properties of NPs in natural language. The empirical observations in section 4 emphasize the importance of the weak/strong dichotomy.

We continue now with some arguments in favour of a partial interpretation of all N. First, we claim that the standard interpretation of universal quantification is not based upon the most regular use of all in natural language, but rather upon the marked use of this expression: its conditional use. Due to its conditional structure (9) expresses a specific relation between P and Q. An interpretation of (10) on the basis of (9) is favoured by the fact that the set of all ravens is a subset of the set of black entities, which is not based on observation, but on induction or hypothesis. Blackness is taken as a property which is inherent to ravens, as long as no counterexample shows up.

(10) All ravens are black

(11) All seats are taken

(12) All men are ill
Examples such as (10) must be treated as marked cases in comparison with contingent sentences such as (11) and (12). We use the term 'marked' here in the linguistic sense. Sentence (10) is a clear example of a statement having the status of a law (or a hypothesis, an opinion or a belief) that is firmly settled in science, in biological theory and also in our everyday naive physics. Laws are statements which, apart from their explanatory function, are of interest to a scientific theory. The material implication warrants the correct use of Modus Ponens and Modus Tollens. The fact that material implication is connected with unrestricted quantification is justified by the scientific need to establish general principles. Unlike contingent sentences, statements like (10) do not presuppose specific contexts. In general, sentences like (11) and (12) are completely nonsensical if there are no men or seats in the context of use. This seems to be due to the fact that there is no inherent relation between seats and the property 'to be taken', or between men and the property 'being ill'.² As a consequence (9) cannot serve as a basis for this interpretation of (11) and (12). However, suppose that (ultrafeminist) science discovers that (12) is a law of nature. In that case, the interpretation of (12) is on a par with (10). So whether the conditional use gets the upper hand depends on whether a certain sentence functions as a lawlike statement in a theory (or in a more or less consistent set of everyday assumptions).

The difference between marked and unmarked is supported by a linguistic fact: the marked, i.e. conditional, interpretation of (10)-(12) is equivalent to that of the sentences without all, whereas the unmarked interpretation requires the presence of all. The same applies to the Dutch counterparts. We regard the lawlike use of sentences as marked because we do not think it is a property of natural language that there are theories, whether scientific or embedded in our everyday opinions. Familiarity with properties of ravens makes it hard to assign the unmarked interpretation to (10), due to the difficulty one may have in returning to a more innocent state of knowledge and belief, ignorant of the blackness of ravens. Linguistically, however, the unmarked interpretation of (10) is not impossible. Indeed, from a structural semantic point of view the unmarked interpretation is to be preferred, because the marked interpretation has its roots in the lexically stored knowledge of speakers of a particular language. In its marked interpretation, (10) has a meta-linguistic dimension, being about one of the properties we assign to the noun raven in our lexicon.
Hence our treatment of universal quantifiers in natural language will be based upon their use in contingent statements. Notice that there is no a priori ground for deciding whether this requires the elimination of the material implication in the representation of sentences with universal quantifying elements (cf. Hausser 1976), or whether the representation should include a presupposition, which is the point of view taken here, or whether a truth-functional constraint should be added (cf. Verkuyl 1981). Summarizing, all can be used in lawlike sentences as well as in contingent statements. The two contexts impose different interpretations on all. Only in hypothetical contexts can all be interpreted without presuppositions on the size of $\llbracket N \rrbracket$. In contingent statements the use of all requires a non-empty noun denotation.

The existence of a marked interpretation of all seems to be confirmed by the observation of Barwise & Cooper (1981:180) that it is very hard to interpret all N as a non-sieve, i.e. that it is difficult to find sentences in which the noun of an NP with all as determiner has the empty set as its denotation. So if the definition of $\llbracket \text{all N} \rrbracket$ is restricted to those contexts in which the interpretation does not involve material implication, as in (11) and (12), then $\llbracket N \rrbracket \neq \emptyset$ should be stipulated. Taking this particular kind of context as basic for the interpretation is not confined to universal quantifiers. Note that the same choice is made for other quantifiers as well. For example, consider definition (8c), which is restricted to those models in which $\mathrm{card}(\llbracket N \rrbracket) \geq 2$. But in many contexts there is no need for such a restriction. Just as in the case of all, there are contexts for the which allow a conditional interpretation comparable with the interpretation of (10). We give some examples in (13).

(13) a. If you leave garbage about, the rats have free play
     b. For the good children Santa Claus will have a surprise

In spite of the claim in (8c) with respect to the size of $\llbracket N \rrbracket$, these sentences are not without interpretation if there are no rats and no good children, respectively, in $E$. On the other hand, the sentences in (11a) and (12a), just like those in (11) and (12), are indeed completely useless if there are no seats or men in $E$.

(11) a. The seats are taken

(12) a. The men are ill
This suggests that it is not only for all that the constraints on the domain of interpretation vary with the kind of context in which it occurs. But if one agrees to this, one should require that all quantifier definitions be based on the same kind of context. And given the (implicit) choice that is made for the in this respect, the addition of a restriction on the size of $\llbracket N \rrbracket$ in the case of $\llbracket \text{all N} \rrbracket$ is not ad hoc. Note that our argument with respect to (11a) and (12a) relates to the position taken by Strawson in the discussion on Russell's famous 'The king of France is bald'. Cf. Verkuyl (1981:586ff) for an analysis supporting the claim that all should be treated on a par with the determiners these and the.

Our second argument in support of partial interpretation concerns the evaluation of statements of the form 'All N VP', where VP denotes some $X \subseteq E$. If it is interpreted as $X \cap \llbracket N \rrbracket = \llbracket N \rrbracket$, and if $\llbracket N \rrbracket = \emptyset$, then every $X$ makes $X \cap \emptyset = \emptyset$ true. This implies that for the family of models for which $\llbracket N \rrbracket = \emptyset$ the VP predicate has no effect on the truth value of the sentence. On the other hand, a statement such as 'Not all N VP' turns out to be false in every model in which $\llbracket N \rrbracket = \emptyset$, because no $X$ can make $X \cap \emptyset \neq \emptyset$ true. Hence it is contradictory in such models.
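The second argument can be made concrete by evaluating 'All N VP' and 'Not all N VP' for every possible VP denotation over a model with an empty noun set. The sketch below is illustrative: the function names are ours, and the use of `None` for a truth value gap is an assumption made for the example, not a claim about how the gap should be formalized.

```python
from itertools import combinations

def powerset(universe):
    """Return Pow(universe) as a set of frozensets."""
    items = list(universe)
    return {frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)}

def all_total(N, VP):
    """'All N VP' under (8a): true iff VP ∩ N = N."""
    return (VP & N) == N

def not_all_total(N, VP):
    """'Not all N VP' under (8b): true iff VP ∩ N ≠ N."""
    return (VP & N) != N

def all_partial(N, VP):
    """As all_total, but without a truth value (None) when N is empty."""
    return all_total(N, VP) if N else None

def not_all_partial(N, VP):
    """As not_all_total, but without a truth value (None) when N is empty."""
    return not_all_total(N, VP) if N else None

E = frozenset({"a", "b"})
empty = frozenset()

# Collect the truth values obtained as VP ranges over all subsets of E.
print({all_total(empty, VP) for VP in powerset(E)})        # {True}
print({not_all_total(empty, VP) for VP in powerset(E)})    # {False}
print({all_partial(empty, VP) for VP in powerset(E)})      # {None}
print({not_all_partial(empty, VP) for VP in powerset(E)})  # {None}
```

Under the total definitions the choice of VP never changes the outcome when the noun set is empty; under the partial variants such models simply yield no truth value.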
The point can be illustrated by the following example. Let us assume a model with a domain that does not contain unicorns as individuals. If we evaluate a sentence like All unicorns are waiting for the traffic lights with respect to this model, it is true if we interpret all in accordance with (8a). This strange and unwanted consequence of (8a) does not follow if we impose a cardinality restriction on $\llbracket \text{all N} \rrbracket$.

Finally, an interesting consequence of our proposal to assign alle N a partial interpretation must be discussed. Barwise & Cooper (1981) explain the restrictions on the selection of determiners in the lower NP of partitive NPs in terms of definiteness: lower NPs must be definite. According to them de (the) is definite, whereas alle (all) is not, due to the non-partial definition of $\llbracket \text{all N} \rrbracket$. However, a partial interpretation of alle renders it definite, just like de. At first sight this seems to be a complication for the explanation of the restrictions on determiners in the lower NPs in partitives: alle and de are predicted to behave identically, but this seems to be countered by (14b) and (14c). However, the ill-formedness of (14c) differs from that of (14d), at least in Dutch, given (14e-h):

(14) b. Sommige van de kinderen zijn ziek (some of the children are ill)
     c. *Sommige van alle kinderen zijn ziek (some of all children are ill)
     d. *Sommige van tien kinderen zijn ziek (some of ten children are ill)
     e. De helft van de kinderen is ziek (half of the children is ill)
     f. De helft van alle kinderen is ziek (half of all children is ill)
     g. De helft van de tien kinderen is ziek (half of the ten children is ill)
     h. *De helft van tien kinderen is ziek (half of ten children is ill)
Whereas de (the) in (14b) is in opposition to alle (all) and tien (ten) in (14c) and (14d), the observations in (14e-h) suggest that de and alle, and also de tien, are on a par against tien. Perhaps the acceptability of (14f) is less convincing than that of (14e), but with some additional information (14f) is perfectly acceptable, whereas (14h) remains unacceptable in the same contexts:

(14) i. De helft van alle kinderen op school is ziek (half of all children at school is ill)
     j. *De helft van tien kinderen op school is ziek (half of ten children at school is ill)

We conclude that the differences between alle (all) and de (the) are quite complicated, but they certainly should not be related to a difference in properness.

2.3. Conclusion
Summarizing our argument in section 2, we come to the following conclusions, which call for a reduction of diagram I:

(a) monotonicity and strength are systematically related by extending the definition of monotone increasing with the requirement that $\emptyset \notin Q$ and by extending the definition of monotone decreasing with the requirement that $\emptyset \in Q$;
(b) properness is an epiphenomenon rather than a classificatory device;
(c) $\llbracket \text{alle N} \rrbracket$, $\llbracket \text{niet alle N} \rrbracket$ and their variants are defined only if $\mathrm{card}(\llbracket N \rrbracket) \neq 0$. Models in which $\llbracket N \rrbracket = \emptyset$ lead to nonsensical or at least irrelevant interpretations;
(d) $\llbracket \text{alle N} \rrbracket$, $\llbracket \text{niet alle N} \rrbracket$ and their variants are strong and proper, given the correctness of (a).

3. TOWARDS AN EXPLANATORY CLASSIFICATION
In this section we shall first perform a reduction of diagram I and reconsider the definitions involved. Though the revised classification is an improvement, it still has only little explanatory force. The classification is still based upon the problematic mixture of properties that must, and properties that cannot, be ordered hierarchically. As a consequence there remains an intriguing asymmetry between monotone increasing NPs and monotone decreasing NPs. However, by transforming the reduced scheme into a matrix that is based on notions that are independently motivated in De Jong (1983a; 1983b), we gain a more explanatory and empirically motivated account of some aspects of generalized quantification in natural language than is offered by Barwise & Cooper (1981) and Zwarts (1981).
DIAGRAM II

MONOTONE INCREASING
  Filter
    pos. strong ($E \in Q$):  de (n) N (the n), proper names, alle N (all), alles (everything), elke N (every), beide (both), de beide (the both)
    pos. weak:                (none)
  Non-filter
    pos. strong ($E \in Q$):  sommige N (some of the, certain), de meeste N (most), enkele van de N (some of the)
    pos. weak:                enkele N (some), niet alleen N (not only), n N, minstens n N (at least n)

MONOTONE DECREASING
  Ideal
    neg. strong ($E \notin Q$):  geen van de N (none of the), geen van de beide (none of the both), niet de N (not the)
    neg. weak:                   geen N (no), alleen N (only), niets (nothing)
  Non-ideal
    neg. strong ($E \notin Q$):  niet alle N (not all), niet alles (not everything)
    neg. weak:                   hoogstens n N (at most n), weinig N (few)
Diagram II represents the reduction of Zwarts' scheme (I) on the basis of our conclusions in section 2.3. We shall briefly discuss the subcategories in the reduced scheme (II). The monotone increasing quantifiers do not contain the empty set as an element, according to our conclusion (a) in section 2.3. This property, plus an equivalent of monotonicity, is incorporated in the new definition in (15). The (positive) strong NPs are a proper subset of the monotone increasing NPs, having the properties in (16). Filters are a proper subset of the positive strong NPs, having the property of being closed under finite intersection, as indicated in (17).
(15) If NP is monotone increasing the following holds on every model $M$ for which it is defined: (i) for all $X, Y \subseteq E$, if $X \cap Y \in Q$, then $X \in Q$ and $Y \in Q$; (ii) $\emptyset \notin Q$.

(16) If NP is positive strong the following holds on every model $M$ for which it is defined: (i) NP is monotone increasing; (ii) $E \in Q$.

(17) If NP is a filter the following holds on every model $M$ for which it is defined: (i) NP is positive strong; (ii) for every $X, Y \subseteq E$, $(X \in Q \ \&\ Y \in Q) \leftrightarrow X \cap Y \in Q$.

An NP is positive weak if it is monotone increasing and not positive strong. An NP is a non-filter if it is positive strong and not a filter. The proposed reduction does not affect the empty upper rightmost box of (I), which also shows up in (II).

Various hierarchies may be defined with these conditions. For instance, Chellas (1980) suggests the following order: monotone increasing < quasi-filter < filter, corresponding with the properties (15i), (15i) + (17ii), and (15i) + (17ii) + (16ii). This hierarchy does not seem to fit the empirical data: diagram (II) shows that, at least in Dutch, there are no quasi-filters that are not filters. On the other hand, the Dutch determiner sommige and the English determiner most, both strong non-filters, should be fitted properly into some hierarchy. Hence we suggest transforming (II) into a three-step matrix that can express what otherwise would be concealed.

Monotone decreasing NPs are still defined as before but with the additional requirement that the empty set is an element of their denotation on every model for which they are defined, as in (18). The negative strong NPs are a proper subset of the monotone decreasing NPs, as is clear from (19). In contrast to the filters, the ideals are not a subset of the set of strong NPs. Their properties are given in (20).
(18) If NP is monotone decreasing the following holds on every model M for which it is defined:
(i) for all $X, Y \subseteq E$, if $X \cup Y \in Q$, then $X \in Q$ and $Y \in Q$;
(ii) $\emptyset \in Q$.

(19) If NP is negative strong the following holds on every model M for which it is defined:
(i) NP is monotone decreasing;
(ii) $E \notin Q$.

(20) If NP is an ideal the following holds on every model M for which it is defined:
(i) NP is monotone decreasing;
(ii) for all $X, Y \subseteq E$, if $X \in Q \;\&\; Y \in Q$, then $X \cup Y \in Q$.

An NP is negative weak if it is monotone decreasing and not negative strong. An NP is a non-ideal if it is negative strong and not an ideal.

How can we relate (II) to generalizations that are theoretically important? To answer this question satisfactorily we must go back to our definitions in (8). We have argued in section 2 that (8a) and (8b) are to be supplied with '$[\![N]\!] \neq \emptyset$, otherwise undefined', so we change them into (8a') and (8b'), respectively.

(8a') $[\![\textit{alle } N]\!]$ (all) $= \{X \subseteq E \mid X \cap [\![N]\!] = [\![N]\!]\}$; $[\![N]\!] \neq \emptyset$, otherwise undefined

(8b') $[\![\textit{niet alle } N]\!]$ (not all) $= \{X \subseteq E \mid X \cap [\![N]\!] \neq [\![N]\!]\}$; $[\![N]\!] \neq \emptyset$, otherwise undefined

From the definitions in (8) it appears that there are two types of opposition involved. The first opposition distinguishes between partially and totally interpreted NPs. We discussed this distinction already with respect to all, but as will be clear by now, it is related to the presence or absence of a general condition in (8) on the size of $[\![N]\!]$. The opposition between partially and totally interpreted NPs opposes (8a'), (8b'), (8c), (8d) and (8i) on the one hand, to (8e-h) on the other hand.

The second opposition concerns the absence or presence of the cardinality-condition on the intersection $X \cap [\![N]\!]$. Note that for (8a) the condition $X \cap [\![N]\!] = [\![N]\!]$ is equal to $[\![N]\!] - X = \emptyset$, whereas for $\mathrm{card}([\![N]\!] \cap X) = n$ there is no equivalent in terms of $[\![N]\!] - X$. This second distinction invokes a bipartition between (8a'), (8b'), (8c), (8h) and (8i) on the one hand, and (8d-g) on the other hand.

These two bipartitions can easily be fitted into a matrix marking the quantifiers involving a condition on the size of $[\![N]\!]$ with + and the ones lacking it with -. Likewise we mark quantifiers that involve a condition on the cardinality of $X \cap [\![N]\!]$ with +, and the quantifiers occurring in the complement of the opposition with -.
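The definitions (8a'), (8b') and the conditions in (15)-(20) can be tried out on a small finite model. The following is only a sketch under stated assumptions: the universe `E`, the noun denotation `N` and all function names are hypothetical, undefinedness is modelled by returning `None`, and the checks run on one model only, whereas the definitions quantify over every model on which the NP is defined.

```python
from itertools import combinations

def powerset(E):
    """All subsets of E, as frozensets."""
    items = list(E)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

def alle(N, E):
    """(8a'): {X ⊆ E | X ∩ N = N}; undefined (None) if N is empty."""
    if not N:
        return None
    return {X for X in powerset(E) if X & N == N}

def niet_alle(N, E):
    """(8b'): {X ⊆ E | X ∩ N ≠ N}; undefined (None) if N is empty."""
    if not N:
        return None
    return {X for X in powerset(E) if X & N != N}

def is_monotone_increasing(Q, E):
    """(15): X ∩ Y ∈ Q implies X ∈ Q and Y ∈ Q; and ∅ ∉ Q."""
    S = powerset(E)
    up = all((X & Y) not in Q or (X in Q and Y in Q) for X in S for Y in S)
    return up and frozenset() not in Q

def is_filter(Q, E):
    """(16) + (17): increasing, E ∈ Q, and closed under intersection."""
    S = powerset(E)
    closed = all((X in Q and Y in Q) == ((X & Y) in Q) for X in S for Y in S)
    return is_monotone_increasing(Q, E) and frozenset(E) in Q and closed

def is_monotone_decreasing(Q, E):
    """(18): X ∪ Y ∈ Q implies X ∈ Q and Y ∈ Q; and ∅ ∈ Q."""
    S = powerset(E)
    down = all((X | Y) not in Q or (X in Q and Y in Q) for X in S for Y in S)
    return down and frozenset() in Q

def is_negative_strong(Q, E):
    """(19): monotone decreasing, and E ∉ Q."""
    return is_monotone_decreasing(Q, E) and frozenset(E) not in Q

def is_ideal(Q, E):
    """(20): monotone decreasing, and closed under union."""
    S = powerset(E)
    closed = all(not (X in Q and Y in Q) or (X | Y) in Q for X in S for Y in S)
    return is_monotone_decreasing(Q, E) and closed

# Hypothetical toy model: three individuals, two of which are in [[N]].
E = frozenset({"a", "b", "c"})
N = frozenset({"a", "b"})

print(alle(frozenset(), E))                    # undefined on an empty [[N]]
print(is_filter(alle(N, E), E))                # 'alle N' on this model
print(is_negative_strong(niet_alle(N, E), E))  # 'niet alle N' on this model
print(is_ideal(niet_alle(N, E), E))
```

On this model the sketch treats alle N as a filter and niet alle N as negative strong but not an ideal, which is how the text classifies them; a single model can of course only refute, not establish, a claim about every model.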
DIAGRAM III

NP                                                            restriction on the          restriction on the
                                                              cardinality of $[\![N]\!]$  cardinality of $X \cap [\![N]\!]$

Bertrand, de (n), alle, elke, alles, (de) beide, elk van de            +                           -
sommige, de meeste, enkele van de n                                    +                           +
enkele, minstens n, veel                                               -                           +

niet Bertrand, niet de (n), niet alle, niet elke, niet alles,
geen van (de) beide                                                    +                           -
geen, alleen, niets                                                    -                           -
weinig, hoogstens n                                                    -                           +
Note that in (III), the weak NPs of (II) correspond to those NPs whose interpretation is not restricted, while the strong NPs correspond to the expressions in (III) whose interpretation is only partially defined, due to a constraint on the size of $[\![N]\!]$. The correspondence indicates that it is possible to give a uniform treatment of positive and negative strong NPs. This seems to favour the analysis that underlies diagram III, because the distributional properties of the positive strong NPs are identical to those that can be observed within the class of negative strong NPs. (Cf. section 4.) In this regard the opposition between NPs with a partial and NPs with a total interpretation is more adequate than the tripartition (in disguise) in (4).³
4. EXPLANATION NEARBY?
Diagram II is purely semantic in the sense that this classification rests upon modeltheoretic concepts only. Diagram III, on the other hand, employing the notions 'restriction on $[\![N]\!]$' and 'restriction on $X \cap [\![N]\!]$', can be related to notions of a syntactic nature. The latter notion bears on the relation between the N and the VP established by the determiner.
In this regard the determiner is an operator having the VP in its domain. This aspect of the determiner can be labelled as 'NP-external', whereas the first notion pertains to the NP-internal aspect of the determiner. In this section we will discuss the relation of the semantic classification of determiners to the distributional properties of determiners to be observed in certain syntactic configurations.

In De Jong (1983a) it is argued that a tripartition along the rows of (III) plays a crucial role in the explanation of the syntactic and semantic behaviour of NPs in Dutch. Given some reasonable assumptions about their prenominal structure which are not relevant here, the following syntactic properties of determiners are of interest.

(21) a. never followed by a numeral
     b. possibly followed by a numeral
     c. never in adjectival position
     d. possibly in adjectival position
Determiners can be classified by assigning them one property out of (21a) or (21b), and one out of (21c) or (21d). As shown in diagram IV these syntactic properties correspond one-to-one with the semantic properties of the classification of the monotone increasing quantifiers in (III). That is, the difference between the syntactic properties (21a) and (21b) corresponds to a difference in the semantic restriction on the intersection $X \cap [\![N]\!]$ as in diagram III: (21a) corresponds to a plus-value and (21b) to a minus-value in the right columns in (III). The difference between the syntactic properties (21c) and (21d) corresponds to the presence or absence of a condition on $[\![N]\!]$.

DIAGRAM IV

examples                      syntactic properties     semantic conditions

A. sommige                    a. (21a)                 i.  $\mathrm{card}([\![N]\!] \cap [\![VP]\!]) > 2$
                              c. (21c)                 ii. $\mathrm{card}([\![N]\!]) > 2$

B. enkele, vele, numerals     a. (21a)                 i.  $\mathrm{card}([\![N]\!] \cap [\![VP]\!]) > n$
                              d. (21d)                 ii. no condition on $[\![N]\!]$

C. alle, de                   b. (21b)                 i.  $[\![N]\!] \cap [\![VP]\!] = [\![N]\!]$
                              c. (21c)                 ii. $\mathrm{card}([\![N]\!]) > 2$
The following might serve as an illustration of the relevant observations. Consider for example the distributional properties of de (the) as compared to those of sommige (some of the) and those of numerals. First, in leftmost position de and alle (all) can be followed by a numeral, whereas sommige and numerals cannot:

(22) a. de vele boeken (the many books)
     b. de drie boeken (the three books)
     c. alle honderd deelnemers (all hundred participants)

(23) a. *sommige vele boeken (some of the many books)
     b. *sommige drie boeken (some of the three books)
     c. *honderd vele deelnemers (hundred many participants)

Secondly, numerals can be preceded by another determiner whereas de and sommige cannot.⁴

(24) a. de vele boeken (the many books)
     b. de drie boeken (the three books)
     c. alle honderd deelnemers (all hundred participants)

(25) a. *de sommige boeken (the some of the books)
     b. *drie de boeken (three the books)
     c. *honderd sommige boeken (hundred some of the books)
Another set of data shows that recasting (II) into (III) sheds some light on the nature of the restrictions on $[\![N]\!]$. As we noticed already in section 3, the minus-values in the left columns of the two boxes in (III) express weakness, and the plus-values strength. In Barwise & Cooper (1981: 183, 204-206) sentences like (26) are analyzed such that their interpretation is $E \in [\![\text{a cookie}]\!]$ and $E \in [\![\text{five cookies}]\!]$, respectively. The interpretation is based on the assumption that the denotation of there is $\{E\}$. The rules generating and interpreting (26) explain the ill-formedness of cases like (27) as well.

(26) a. There is a cookie.
     b. There are five cookies.

(27) a. *There is the cookie.
     b. *There are not the cookies.

In (27a) the cookie is positive strong, which implies that on every model $E \in [\![\text{the cookie}]\!]$. But it also implies that the information that (27a) is supposed to convey is already present. So its interpretation is uninformative or tautological. In case (27b) the expression not the cookies is negative strong. Hence $E \notin [\![\text{not the cookies}]\!]$ on every model, which means that a contradiction arises. Ingenious though this solution seems to be, there are at least two problems. The first problem, also noted by Barwise & Cooper, is that sentences like (28)

(28) There are five cookies left (There be NP XP)
fall outside the range of the rules that account for (26) and (27) and it is not clear at all how these could be extended so as to cover (28). The second problem is based on the following examples suggesting that existential sentences are just symptoms of a widely spread phenomenon. The a-examples contain weak NPs, the b-examples strong NPs.⁵

(29) a. Er zijn (There are) {n / enkele / hoogstens n / vele} koekjes over (cookies left)
     b. Er zijn {*alle / *de / *beide / *sommige} koekjes over

(30) a. {n / enkele (a few) / hoogstens n (at most n) / vele (many)} jaren geleden (years ago)
     b. {*alle / *de / *beide (both) / *sommige (some of the)} jaren geleden

(31) a. Ze heeft (She has) {n / enkele / hoogstens n / veel} broers (brothers)
     b. Ze heeft {*de / *beide / *sommige / *alle} broers

(32) a. Het duurde (It took) {n / enkele / hoogstens n / vele} minuten (minutes)
     b. Het duurde {*de / *beide / *sommige / *alle} minuten

(33) a. Het pad is (The path is) {n / enkele / hoogstens n / vele} meters lang
     b. Het pad is {*de / *beide / *sommige / *alle} meters lang
The contexts (34) have properties in common that are usually ascribed to existential sentences only:

(34) a. There is/are [__ N] (XP)
     b. [__ N] ago
     c. She has [__ N] (where N is a relational noun)
     d. It took [__ N]
     e. The path is [__ N] long

Each of the contexts can be characterized as non-presuppositional with respect to the presence (in some way or other) of entities belonging to $[\![N]\!]$. In (34b), (34d) and (34e) this is due to the occurrence of measure constituents (cf. Klooster 1972). The noun indicating the measuring unit is not a common referring expression, so it cannot denote a set at all. With regard to (34a) and (34c) one might say that the contribution of the sentence is to bring about a change in the domain of discourse. Both (34a) and (34c) introduce a (sub)set $[\![N]\!]$ concurrently with some additional information about its elements, e.g. that they are left or that they have a certain person as their sister.

As a consequence of this non-presuppositional nature the contexts in (34) require a determiner that does not impose presuppositions on $[\![N]\!]$. Hence, they do not allow determiners with + in the column 'restriction on the cardinality of $[\![N]\!]$'. The determiners with - in this column do not necessarily presuppose $[\![N]\!]$ to contain at least one element, so they fit perfectly well in the contexts in (34).

Our analysis in terms of cardinality-conditions on $[\![N]\!]$ is in accordance with the analysis of existential sentences in Milsark (1974). Moreover, the extended concept of non-presuppositional contexts can be seen as a confirmation of its relevance: one can now corroborate his analysis with a solid modeltheoretic foundation.
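The prediction made in this section can be restated as a simple lookup. The sketch below is an illustration only: the dictionary name and function name are hypothetical, the +/- values are those of the column 'restriction on the cardinality of [[N]]' in diagram III, and the value for numerals (written `n`) is taken from row B of diagram IV.

```python
# True stands for '+' (a restriction on the cardinality of [[N]]),
# False for '-' (no such restriction), as marked in diagrams III and IV.
RESTRICTS_N = {
    "n": False,
    "enkele": False,
    "hoogstens n": False,
    "vele": False,
    "alle": True,
    "de": True,
    "beide": True,
    "sommige": True,
}

def fits_nonpresuppositional_context(det):
    """A determiner fits the contexts in (34) only if it imposes no
    restriction on the cardinality of [[N]]."""
    return not RESTRICTS_N[det]

accepted = [d for d in RESTRICTS_N if fits_nonpresuppositional_context(d)]
rejected = [d for d in RESTRICTS_N if not fits_nonpresuppositional_context(d)]
print(accepted)
print(rejected)
```

The two lists coincide with the a-examples and the starred b-examples of (29)-(33), which is the correspondence the analysis claims.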
5. CONCLUSION

Our re-analysis of Zwarts' definitions and indirectly of the assumptions in Barwise & Cooper (1981) results in a better motivated treatment of strength and properness, leading not only to a descriptively more adequate account of the data but also to a better understanding of the relation between the syntactic and the semantic properties of determiners. Furthermore the explanatory force of our analysis strengthens our argument in favour of a semantic theory that allows partial interpretation.
NOTES

1. The terms 'weak' and 'strong' are introduced in Milsark (1977) though defined differently from the corresponding definitions in Barwise & Cooper (1981) or Zwarts (1981). Cf. De Jong (1983b).

2. One might come to think that what is central here is a distinction between predicates that do require a strong NP for a subject, and predicates that do not. Cf. Milsark (1977). But the question is more complex. The class of predicates that varies with respect to the requirements it imposes on the subject NP is the class of predicates that allow for any kind of subject. For example, the predicate to be ill selects only a strong NP for a subject, if the denotation of the occurring noun is not the empty set. But this is not a general characteristic of this predicate. Weak NPs can be selected as the subject of to be ill as well. If an NP happens to be selected whose noun has the empty set as its denotation, the resulting sentence is not senseless, but just false.

3. The quantifiers corresponding to proper names do not fit in diagram II because they cannot be analysed as [Det N], nor is there any intersection $X \cap [\![N]\!]$ involved. But if we follow the suggestion of Barwise & Cooper to consider proper names as Russellian definite descriptions, the corresponding quantifier is equivalent to $[\![\text{the } N]\!]$. This assumption motivates the values for (not) Bertrand in (II). The interpretation of alles (everything), niet alles (not everything) and niets (nothing) is equivalent to $[\![\textit{alle } N]\!]$, $[\![\textit{niet alle } N]\!]$ and $[\![\textit{geen } N]\!]$.

4. These differences can be partly explained by adopting an output-condition on NPs, stating that no more than one expression imposing restrictions on the cardinality of $X \cap [\![N]\!]$ can occur within NPs. As these matters are discussed in detail elsewhere (cf. De Jong, in preparation) we won't discuss them here any further.

5. We owe (31) to Barbara Partee. Some other examples are given in De Jong (1983b) and in De Jong (in prep.).
BIBLIOGRAPHY

Barwise, J. & R. Cooper (1981): 'Generalized Quantifiers and Natural Language'. In: Linguistics and Philosophy 4, 159-219.
Benthem, J. van (1984): 'Questions about quantifiers'. In: Journal of Symbolic Logic 49:2, 443-466.
Chellas, B. (1980): Modal Logic: an Introduction. Cambridge University Press. Cambridge.
Hausser, R. (1976): 'Scope ambiguity and scope restriction in Montague grammar'. In: J. Groenendijk & M. Stokhof (eds.), Proceedings of the Amsterdam colloquium on Montague grammar and related topics, Amsterdam Papers in Formal Grammar, Vol. 1. Amsterdam, 95-131.
Jong, F.M.G. de (1983a): 'Numerals as determiners'. In: H. Bennis & W.U.S. van Lessen Kloeke (eds.), Linguistics in the Netherlands. Dordrecht, 105-14.
Jong, F.M.G. de (1983b): 'Sommige niet, andere wel; de verklaring van een raadselachtig verschil'. In: Glot 6, 229-246.
Jong, F.M.G. de (in preparation): 'Features of determiners determined'.
Keenan, E. & Y. Stavi (1982): 'A semantic characterization of natural language determiners'. In: Linguistics and Philosophy (to appear).
Klooster, W.G. (1972): The Structure Underlying Measure Phrase Sentences. Reidel Publ. Dordrecht.
Ladusaw, W.A. (1982): 'Semantic constraints on the English Partitive Construction'. In: D. Flickinger e.a. (eds.), Proceedings of the First West Coast Conference on Formal Linguistics. Stanford, 231-242.
Milsark, G. (1977): 'Toward an Explanation of Certain Peculiarities of the Existential Construction in English'. In: Linguistic Analysis 3, 1-29.
Strawson, P.F. (1950): 'On Referring'. In: Mind LIX, 320-344.
Strawson, P.F. (1952): Introduction to Logical Theory. London.
Verkuyl, H.J. (1981): 'Numerals and Quantifiers in X-syntax and their semantic interpretation'. In: J.A.G. Groenendijk, T.M.V. Janssen & M.B.J. Stokhof (eds.), Formal Methods in the study of Language. MCT 136. Amsterdam, 567-599.
Zwarts, F. (1981): 'Negatief polaire uitdrukkingen I'. In: Glot 4, 35-132.
Zwarts, F. (1983): 'Determiners: a relational perspective'. In: A.G.B. ter Meulen (ed.), Studies in Modeltheoretic Semantics. GRASS 1, Foris. Dordrecht, 37-62.
Chapter 3

Determiners and Context Sets*

Dag Westerståhl

1. UNIVERSES
There are several roles for universes in modeltheoretic semantics. To begin with, we have universes of models - the set M in a model $\mathcal{M} = \langle M, [\![\cdot]\!] \rangle$ - or discourse universes.¹ But suppose that, during a piece of discourse about the participants at some political meeting (which are thus elements of M) I point to the supporters of Jones and say

(1) All cheered.

(1) is then equipped with a contextually selected sub-universe of M: the set of supporters of Jones at that meeting. Such sets will be called context sets in what follows. This use of sub-universes is very common; pointing is of course just one of the many ways in which, at various points in a discourse, context sets can be selected. For example, if I say (describing the meeting afterwards to someone)

(2) The mayor entered the podium and gave a short speech. All cheered,

a different context set has been selected (this time the set of all participants at the meeting, except the mayor, presumably), by a different mechanism.

* The participants of the workshop, in particular Johan van Benthem, Jack Hoeksema, Ed Keenan and Barbara Partee, made several useful comments when I presented a (very) preliminary version of this paper. For the first written version Hoeksema's manuscript was also very useful to me. Then, the editors Johan van Benthem and Alice ter Meulen provided me with a long list of thoughtful criticisms and suggestions, and the present paper is the result of trying to take account of these as far as I could.

Getting the universe right is important primarily for quantification; a universe is often described as the universe of quantification. If we agree with Montague's PTQ or Barwise & Cooper (1981) that quantification occurs in noun phrases in natural language, a third sort of universe can be identified: the NP universe. When the NP consists of a determiner and a noun, this universe is simply the denotation of the noun in the model. For example, if I had said, instead of (1),

(3) All supporters of Jones cheered,
the NP universe is the previously mentioned context set, this time selected explicitly, not contextually. Clearly, the universe of quantification is restricted in (3) to the NP universe. These distinctions are usually taken lightly.2 One assumes that suitable universes are somehow selected, so that one's examples get 'normally intended' meanings, but that the mechanisms belong to pragmatics rather than semantics. In practice this means identifying context sets with (temporarily chosen) model universes. Let us call this the flexible universe (FU) strategy. Although seldom discussed, it could perhaps be motivated by the fact that even though the choice of universe may affect truth values of quantified sentences, semantics proper deals not with actual truth values but truth conditions, and these are uniform over all universes. Then, it would not really matter whether or not one grasps the intended meaning of a given example, as long as one understands what a possible model for it looks like. Nevertheless, in this paper I wish to argue that the three kinds of universe must be distinguished also by semantics proper if we want to get the linguistic facts right. The FU strategy can be made to work in some cases (such as the examples above), but even then it is methodologically unsound; in other cases it is simply wrong (section 3). Moreover, the occurrence of context sets is often indicated in the sentences themselves in ways which are worth noticing. One way is when a determiner occurs without a noun, as in (1). Another concerns a particular group of determiners, among them the definite article, which will be called the definites. I will suggest a semantic treatment of definites different from the usual treatment of determiners, based on their role as context set indicators (sections 8 and 10). The definites also have a particular relation to partitive NPs (section 9). 
As formal semantic background I will use the treatment of quantifiers in natural language given in Barwise & Cooper (1981); B&C for short (section 4). This framework can easily be accommodated to the use of context sets (sections 5-6); the effects for the logical theory of such quantifiers (van Benthem, 1984) can also be assessed (section 7). Although the chosen framework is of the extensional, 'static' type, context sets are obviously relevant also for a more 'dynamic' semantic perspective, such as the one in Barwise & Perry (1983). Before showing why the FU strategy fails, I need to get some facts about determiners straight; this is the subject of the next section.
2. CLASSIFYING DETERMINERS
Expressions of the categories NP, DET, N, are standardly divided into simple and complex ones. It is convenient to regard DETs (also) as syntactic functions which give NPs when applied to Ns. (This may be taken as the basic criterion (necessary condition) for DET-hood.) A DET is n-place if, as a function, it takes n arguments (of category N). In B&C only 1-place DETs are discussed. Typical examples of simple 1-place DETs are

(a) all, each, some, both, most, many, few, a few, several, this, these, one, two, three, ...

(b) every, a, no, the

A typical example of a 2-place DET occurs in

(4) More men than women voted for Henry.

The most natural analysis here is to consider the NP more men than women as formed by applying the 2-place DET more ... than to the arguments men and women. This is not the only analysis, however: one could see the NP as the result of applying the 1-place DET more ... than women to the argument men. The example reveals that the analysis of DET-N structure, according to the basic criterion above, is not unique. To make a choice we need further considerations. In the present case, there are good reasons to prefer the first analysis (cf. Keenan & Stavi (1981) and Keenan & Moss (this volume)); the strongest, perhaps, being that more ... than women is not a quantitative DET (cf. section 7 below), in contrast with more ... than. Other similar 2-place DETs are less ... than and as many ... as. For further examples of 2-place DETs, and of n-place DETs for n > 2, cf. Keenan & Moss (this volume).

Many 1-place DETs have the property that they can occur pronominally, i.e. without argument. Such DETs will be called pronominal here; all those in (a) are pronominal, whereas those in (b) must be followed by an N. Pronominal use of DETs, which is not discussed in B&C, is a frequent phenomenon: cf. examples like (1) or

(5) Some like it hot

(6) Few were there to meet him.

The distinction applies to simple as well as complex DETs; a brief glance at the extensive list of English DETs in Keenan & Stavi (1981) reveals that most of these are pronominal. For n-place DETs with n > 1, however, pronominal use does not seem to occur.

The syntactic classification of DETs given here has an immediate counterpart for their semantic interpretations - from now on I will use 'DET' for the syntactic expressions and 'determiner' for their interpretations (but it is hard to avoid all confusion, since DETs are both used and mentioned). An n-place DET is thus interpreted as an n-ary determiner, i.e. as an n-ary function from N denotations (subsets of M) to NP denotations (sets of subsets of M; cf. section 4).
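The view of an n-ary determiner as a function from N denotations to NP denotations can be made concrete on a finite universe. The sketch below is illustrative only: the universe `M`, the sets `men`, `women` and `voted_for_henry`, and the function names are hypothetical, and the truth condition chosen for more ... than (comparing the sizes of the two intersections with the VP denotation) is an assumption that the text itself does not spell out.

```python
from itertools import combinations

def powerset(M):
    """All subsets of M, as frozensets."""
    items = list(M)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

def every(M):
    """Unary determiner on M: maps an N denotation A to an NP denotation,
    i.e. to a set of subsets of M."""
    return lambda A: {B for B in powerset(M) if A <= B}

def more_than(M):
    """Binary determiner on M for 'more ... than' (assumed reading):
    B is in the NP denotation iff |A1 ∩ B| > |A2 ∩ B|."""
    return lambda A1, A2: {B for B in powerset(M)
                           if len(A1 & B) > len(A2 & B)}

# Hypothetical model for example (4).
M = frozenset({"m1", "m2", "w1", "w2"})
men = frozenset({"m1", "m2"})
women = frozenset({"w1", "w2"})
voted_for_henry = frozenset({"m1", "m2", "w1"})

# A sentence is true iff the VP denotation is an element of the NP denotation.
np_denotation = more_than(M)(men, women)
print(voted_for_henry in np_denotation)
print(voted_for_henry in every(M)(men))
```

Treating more ... than women as a 1-place DET instead would amount to fixing the second argument, for instance `lambda A: more_than(M)(A, women)`, which is the alternative analysis mentioned in the text.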
3. DISCOURSE UNIVERSES AND CONTEXT SETS
It is clear that NP universes cannot in general be identified with 'sentence universes', whether the latter are discourse universes or context sets; to see this we need only consider sentences with at least two NPs containing different Ns. I want to show that neither can discourse universes and context sets be identified.

The first argument is methodological. One point of a discourse universe, it seems, is that it can be kept constant during a piece of discourse containing several sentences. This indicates that discourse universes should be large in the sense that they contain all objects 'relevant' to the sentences in question. We can thus formulate two methodological postulates:

(MP1) Discourse universes are constant over pieces of discourse.
(MP2) Discourse universes are large.

(MP2) has the effect that the exact choice of discourse universe is not important, as long as it is large (this may be one sound insight at the bottom of the putative argument, given in section 1, for the FU strategy). The reason is that (most) natural language DETs are 'universe-independent', in a sense which will be made precise in the next section (cf. the condition EXT). This is in contrast with, for example, the universal quantifier of standard predicate logic, for which the choice of universe is always important. Furthermore, each of (MP1) and (MP2) implies that context sets should not be identified with discourse universes. For, different context sets can occur within one piece of discourse, and context sets are in general not large (this is one point of using them).

Sentences with pronominal use of DETs afford clear examples of the occurrence of context sets. In sentences (1), (5) and (6), the lack of an argument is a visible context set indicator, which signals the implicit occurrence of a context set (as argument for the corresponding determiner), although the sentence itself does not tell us which. However, it should be noted that restriction to context sets can occur with all kinds of DETs, whether pronominal or not, and without any explicit indication at all. This brings us to the second argument against the FU strategy. Consider the following pieces of discourse.

(7) Swedes are funny. All tennis players look like Björn Borg, and more men than women watch tennis on TV. But most non-Swedish tennis players are disliked by many.

(8) The English love to write letters. Most children have several pen pals in many countries.

In the natural interpretation of (7), all in the second sentence is restricted to the set of Swedes, in spite of the fact that it occurs in an ordinary NP with no indication of this restriction, and similarly for the 2-place DET more ... than. But most is not thus restricted in the last sentence, although the pronominal many is. The discourse universe must contain both Swedes and non-Swedes. Likewise, in (8) the universe of discourse must contain English as well as other children (and also countries!), but most in the second sentence is restricted to Englishmen whereas several is not.

Examples such as these are conclusive against the FU strategy. For, looking only at the last sentences of (7) and (8), we see that there is no way to make sense of these sentences if the discourse universe is identified with the context set. (Note that this argument is independent of the postulates MP1 and MP2.) I conclude that although the choice of discourse universe can be ignored in semantics (given MP1 and MP2), the occurrence of context sets must somehow be accounted for. In what follows we shall only consider the formal framework for context sets, leaving the (more difficult) question of how context sets are chosen to more ambitious semantic theories.
4. BACKGROUND ON DETERMINER SEMANTICS
To fix ideas we use the formal apparatus developed in B&C: the fragment of English, the syntax and semantics of the logic L(GQ), and the translation rules from the fragment into L(GQ). All of this is admirably presented in the first three sections of B&C; below, only some extensions of the B&C framework, and some minor notational differences, will be explained.
In the fragment, Ns and VPs are translated as set terms in L(GQ), which in turn are interpreted, in a model $\mathcal{M} = \langle M, [\![\cdot]\!] \rangle$, as subsets of M. Similarly, NPs get interpreted as quantifiers on M, i.e. sets of subsets of M, and DETs are interpreted as functions from subsets of M to quantifiers on M; such a function is a determiner on M (only unary determiners are considered in B&C, but there is no problem in extending the framework to arbitrary n-ary determiners). The distinction in B&C between logical and non-logical determiners will not be important here (for a discussion of this distinction cf. Westerståhl (1982)), but I will assume that all determiner symbols are constants in the sense that they denote, on each universe M, a fixed determiner on M. Thus, an n-ary determiner is a functor D which with each non-empty set M associates an n-ary determiner $D_M$ on M.

With a familiar abuse of language I will often use the same letter ('D', '$D_1$', etc.) for determiner symbols and determiners. Similarly for 'A', 'B', '$A_1$', ..., 'X', 'Y', which in general stand for sets, but sometimes for expressions (Ns and VPs) denoting sets. A sentence in the fragment of the form $[[[D]_{DET}[A]_{N}]_{NP}[B]_{VP}]_{S}$ will be written simply (DA)B, and the corresponding truth condition in a model $\mathcal{M}$, i.e. the condition that $B \in D_M(A)$, will often be written $(D_M A)B$, or even

$D_M AB$,

emphasizing the fact that a determiner on M can be thought of as a relation between subsets of M. (This is for unary determiners; in the n-ary case a quantified sentence can be written $(DA_1 \ldots A_n)B$.)

A crucial condition on determiners in B&C is conservativity (this term, though, is from Keenan & Stavi (1981)):

(CONSERV) For all M and all $A, B \subseteq M$, $D_M A\,B \Leftrightarrow D_M A\,(A \cap B)$.

Thus, the universe is restricted to A in the sense that only that part of B which is common to A is important for the truth value of $D_M AB$. But note that M is still essential since it determines the interpretation of D. A formal condition expressing the idea from the previous section that the universe is unimportant if large enough is the following:

(EXT) If $A, B \subseteq M \subseteq M'$ then $D_M AB \Leftrightarrow D_{M'} AB$.

It is easy to see that the conjunction of CONSERV and EXT is equivalent to

(UNIV) For all M and all $A, B \subseteq M$, $D_M A\,B \Leftrightarrow D_A A\,(A \cap B)$.

This condition, finally, expresses the idea, mentioned in section 1, that a DET restricts (within the NP where it occurs) the universe of quantification to the NP universe. The three conditions above also have natural versions when n > 1, but these will not be needed here. Although most natural language DETs satisfy UNIV, it can be argued that EXT fails in some cases. This is hinted at in B&C, and further discussed in Westerståhl (1982), for DETs such as many and few. For example, EXT fails if many is given the following interpretation:

$\textit{many}_M AB \Leftrightarrow |A \cap B| > 1/3 \cdot |M|$

(where $|X|$ is the number of elements in the set X).
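The conditions CONSERV, EXT and UNIV can be checked by brute force on small universes. The following is a sketch under stated assumptions: determiners are modelled as Python functions `D(M, A, B)`, the universes `M` and `M_big` are hypothetical, the reading given to most (more As in B than outside B) is an assumption, EXT is tested for one pair $M \subseteq M'$ only, and UNIV is tested for non-empty A only, since determiners are defined on non-empty universes.

```python
from itertools import combinations

def powerset(M):
    """All subsets of M, as frozensets."""
    items = list(M)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

def most(M, A, B):
    """Assumed reading of 'most': |A ∩ B| > |A - B|."""
    return len(A & B) > len(A - B)

def many_third(M, A, B):
    """The interpretation of 'many' given in the text: |A ∩ B| > 1/3 * |M|."""
    return 3 * len(A & B) > len(M)

def conserv(D, M):
    """CONSERV on M: D_M A B  iff  D_M A (A ∩ B)."""
    S = powerset(M)
    return all(D(M, A, B) == D(M, A, A & B) for A in S for B in S)

def ext(D, M, M_big):
    """EXT for the single pair M ⊆ M_big: D_M A B  iff  D_M' A B."""
    S = powerset(M)
    return all(D(M, A, B) == D(M_big, A, B) for A in S for B in S)

def univ(D, M):
    """UNIV on M, for non-empty A: D_M A B  iff  D_A A (A ∩ B)."""
    S = powerset(M)
    return all(D(M, A, B) == D(A, A, A & B) for A in S if A for B in S)

M = frozenset({1, 2, 3})
M_big = frozenset({1, 2, 3, 4, 5, 6})

print(conserv(most, M), ext(most, M, M_big), univ(most, M))
print(conserv(many_third, M), ext(many_third, M, M_big), univ(many_third, M))
```

With A = B = {1, 2}, the one-third reading of many holds on `M` (2 is more than a third of 3) but fails on `M_big` (2 is not more than a third of 6), which is the kind of universe-dependence that EXT rules out.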
5. ADDING CONTEXT SETS
In order to represent context sets in the semantic framework we first add a list of set variables $X_0, X_1, X_2, \ldots$ to the symbols of L(GQ). In the formation rules these are treated just as unary predicate symbols. So sentences of L(GQ) may now contain free set variables, and such a sentence has a truth value in a model $\mathcal{M}$ only relative to a value assignment (of subsets of M to the set variables) in M. The idea is simply that the context provides this assignment; indeed, for present purposes the context can be identified with the value assignment. We also make the following (terminological) change in L(GQ). Add the formation rule

(R) If D is a determiner symbol and $\eta$ a set term then $D^{\eta}$ is a determiner symbol.

In particular, $D^X$ is a determiner symbol if X is a set variable. Semantically, we introduce an operation on determiners which could be called restriction: if D is a unary determiner and X a fixed set, define a new unary determiner $D^X$ by

(RES) $D^X_M A\,B \Leftrightarrow D_M (X \cap A)\,B$,

for all M and all $A, B \subseteq M$. Thus, the semantic rule of L(GQ) corresponding to (R) is

(S) $[\![D^{\eta}]\!] = [\![D]\!]^{[\![\eta]\!]}$.
Since $(D^{\eta}\alpha)\beta$ is equivalent to $(D\hat{x}[\eta(x) \wedge \alpha(x)])\beta$, no logical strength has been added to L(GQ).

As to the fragment, the only change we need is to allow for pronominal use of DETs. In order to preserve the B&C phrase structure we do this by including the set variables among the lexical items under N. Then the rule NP → DET N must be modified a little, to avoid generating a set variable with a non-pronominal DET. In the fragment, relative clauses can be generated under N by the two rules N → N R and R → that VP. But in every N there is a uniquely determined principal lexical noun, namely, the leftmost lexical noun in it. Thus, the revised rule is this:

NP → DET N, provided that, if the principal lexical noun in the N is a set variable, the DET is pronominal.

In the B&C fragment, the phrase structure trees generated by the syntactic rules give sentences by means of (unstated) morphological rules. To preserve this feature in the revised fragment, simply add a rule which deletes all set variables. Now sentences such as the following can be generated:

(9) Many love Susan

(10) Every girl that loves many wants all
Finally, we review the translation rules, which define the relation "α′ is a translation of α", where α is a phrase structure tree or a lexical item, and α′ is an expression in L(GQ). Clearly a set variable serves as its own translation. But we must also account for the introduction of set variables which do not appear in the phrase structure trees, as in the examples (7) and (8). Only the translation of NPs is affected. Here we may stipulate that, optionally, an NP $[[\alpha]_{DET}[\beta]_{N}]_{NP}$, where β is not a set variable, is translated as $\alpha'^{X}(\beta')$,
where X is a new set variable. This extension of the translation rules is designed to give maximal freedom in the assignment of context sets. Several variants are possible. For example, we could make the above rule obligatory, and leave to the context to assign the whole universe M to X when no restriction is intended. Further, we could require that, for each sentence of the fragment, at most one context-set variable occurs in its translation (though possibly in several places). This latter condition, which may be called the uniqueness condition on translation, appears to be satisfied in most cases. For instance, it holds for the sentences in (7) and (8). As an example we give a translation of (10):
(10') $(\textit{every}^{X}(\hat{x}[\textit{girl}(x) \wedge (\textit{many}(Y))\hat{y}[\textit{love}(x,y)]]))\,\hat{x}[(\textit{all}(Z))\hat{y}[\textit{want}(x,y)]]$
Here we have introduced as many set variables as possible, but X is optional and can be dropped.
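The restriction operation (RES) can be illustrated computationally. The following is only a minimal sketch, assuming that a determiner is modelled as a Python function of a universe M and two subsets A, B; the names `restrict`, `girl`, `happy` and the particular context set X are hypothetical and chosen for the example.

```python
def every(M, A, B):
    """every_M A B iff A is a subset of B."""
    return A <= B

def most(M, A, B):
    """most_M A B iff |A ∩ B| > |A − B|."""
    return len(A & B) > len(A - B)

def restrict(D, X):
    """Return the restricted determiner D^X of (RES): D^X_M A B iff D_M (X ∩ A) B."""
    def D_X(M, A, B):
        return D(M, X & A, B)
    return D_X

# Illustrative model; all sets below are made up for the example.
M = frozenset(range(6))
girl = frozenset({0, 1, 2, 3})
happy = frozenset({0, 1, 4})
X = frozenset({0, 1, 4, 5})        # hypothetical context set

every_X = restrict(every, X)
print(every(M, girl, happy))       # False: girls 2 and 3 are not happy
print(every_X(M, girl, happy))     # True: the girls in X are 0 and 1
```

Taking X to be the whole universe M gives back the unrestricted determiner on subsets of M, which corresponds to the variant where the context assigns M to X when no restriction is intended.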
6. JUSTIFICATION
Restricting the universe to a smaller set is common practice in logic, where it is usually called relativization. In logic with generalized quantifiers there is a standard procedure for relativizing sentences (and formulas) to a fixed formula with one free variable. This procedure can be transferred to L(GQ), where sentences can be relativized to a fixed set term. The important step is passing from each unary determiner D to the binary relativized determiner $D^r$, defined as follows: For all M and all $A, B, C \subseteq M$,
(REL) $D^r_M ABC \Leftrightarrow D_C (A \cap C)(B \cap C)$
(in general, if D is n-ary, an (n+1)-ary $D^r$ is defined similarly). Thus, relativizing in L(GQ) means adding a new binary determiner symbol, also denoted $D^r$, for each unary D, to be interpreted according to (REL). Let us assume, for simplicity, that the set term we relativize to is a set variable X, and that no individual constants occur in the formulas we consider. Then, for each set term η and each formula φ in L(GQ) where X does not occur, $\eta^{(X)}$ and $\varphi^{(X)}$, the relativizations of these expressions to X, are defined inductively as follows:
(11)
$P^{(X)} = P$ (P a unary predicate symbol)
$(\hat{x}[\varphi])^{(X)} = \hat{x}[\varphi^{(X)}]$
$R(x_1, \ldots, x_n)^{(X)} = R(x_1, \ldots, x_n)$ (R an n-ary relation symbol)
$\eta(x)^{(X)} = \eta^{(X)}(x)$
$((D\eta)\delta)^{(X)} = (D^r X\,\eta^{(X)})\,\delta^{(X)}$
$(\varphi \wedge \psi)^{(X)} = \varphi^{(X)} \wedge \psi^{(X)}$
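The point of all this is that the sentence $\varphi^{(X)}$ expresses just what φ says in a universe restricted to X. More precisely, if $\mathcal{M} = \langle M, [\![\;]\!] \rangle$ is a model and X is assigned the subset C of M, let $\mathcal{M}\!\upharpoonright\!X$ be the model $\langle C, [\![\;]\!]' \rangle$, where $[\![R]\!]' = [\![R]\!] \cap C^n$ for each n-ary relation symbol R. Then it is a fact of logic that
(12) $\varphi^{(X)}$ is true in $\mathcal{M}$ iff φ is true in $\mathcal{M}\!\upharpoonright\!X$.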
This logical technique thus provably accomplishes restriction to a subuniverse of M. On the other hand, it is not adapted to a natural language context. Indeed, the determiner $D^r$ defined by (REL) does not in general correspond to a 2-place natural language DET (such as those mentioned in section 2), even when D corresponds to a 1-place natural language DET. For this reason, we did not use the above technique when defining restriction to context sets in the previous section, as the reader will have noticed. We did not increase the number of arguments of the determiners, but used the given determiners with the restriction operation instead. To justify this simpler procedure we must show that it achieves the same results as ordinary relativization. This, in fact, is guaranteed by the conditions CONSERV and EXT. The following shows why:
(13)
$D^r_M ABC \Leftrightarrow D_C (A \cap C)(B \cap C)$ (def. of $D^r$)
$\Leftrightarrow D_M (A \cap C)(B \cap C)$ (EXT)
$\Leftrightarrow D_M (A \cap C)(A \cap B \cap C)$ (CONSERV)
$\Leftrightarrow D_M (A \cap C)\, B$ (CONSERV)
$\Leftrightarrow D^C_M AB$ (def. of $D^C$).
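The chain (13) can be checked by brute force on a small universe. The sketch below is an illustration under stated assumptions rather than part of the original argument: determiners are modelled as Python functions of (M, A, B), `most` is taken with the reading |A ∩ B| > |A − B|, and `many` is the universe-dependent reading |A ∩ B| > 1/3·|M|, for which EXT fails. The helper names `relativize`, `restrict` and `agree` are hypothetical.

```python
from itertools import combinations

def subsets(universe):
    """All subsets of a finite set, as frozensets."""
    items = sorted(universe)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

def most(M, A, B):             # satisfies CONSERV and EXT
    return len(A & B) > len(A - B)

def many(M, A, B):             # depends on |M|, so EXT fails
    return 3 * len(A & B) > len(M)

def relativize(D):
    """D^r_M A B C iff D_C (A ∩ C) (B ∩ C), as in (REL)."""
    return lambda M, A, B, C: D(C, A & C, B & C)

def restrict(D, C):
    """D^C_M A B iff D_M (C ∩ A) B, as in (RES)."""
    return lambda M, A, B: D(M, C & A, B)

def agree(D, M):
    """True if D^r_M A B C and D^C_M A B coincide for all A, B, C ⊆ M."""
    D_r = relativize(D)
    S = subsets(M)
    return all(D_r(M, A, B, C) == restrict(D, C)(M, A, B)
               for A in S for B in S for C in S)

M = frozenset(range(4))
print(agree(most, M))   # True
print(agree(many, M))   # False
```

For `many` the two operations already come apart with A = B = C = {0}: the relativized determiner compares against |C| = 1, the restricted one against |M| = 4.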
Thus, the restriction operation actually performs relativization, under EXT and CONSERV, i.e. under UNIV. We also have a converse: if $D^r_M ABC$ is equivalent to $D^C_M AB$ for all $A, B, C \subseteq M$, then D satisfies UNIV (as is seen by letting C = A in this equivalence). If the uniqueness condition holds, a bit more can be said, for then each sentence φ obtained by translation is logically equivalent to a relativized sentence. In fact, this relativized sentence is simply obtained as follows: Delete, in φ, all superscript set variables, and replace the remaining ones with a predicate denoting the universe (e.g. the logical predicate thing in B&C). Call the result φ*. Then, using (12) and (13) it is easy to verify that φ is logically equivalent to $\varphi^{*(X)}$. The justification of our procedure may be summarized in the following
PROPOSITION 1: Assume the uniqueness condition for translation. Let φ be a sentence of L(GQ) obtained by translation from the fragment, containing the set variable X. Then, if UNIV holds, φ is logically equivalent to $\varphi^{*(X)}$. Conversely, if this equivalence holds for all translated sentences, UNIV will be satisfied.
It follows that the procedure will not work when EXT fails. Consider, for example, the interpretation of many from section 4: $\textit{many}_M AB \Leftrightarrow |A \cap B| > 1/3 \cdot |M|$. Translating Many boys run we get $(\textit{many}^{X}(\textit{boy}))\,\textit{run}$ with the truth condition $|X \cap [\![\textit{boy}]\!] \cap [\![\textit{run}]\!]| > 1/3 \cdot |M|$, instead of the more natural reading $|X \cap [\![\textit{boy}]\!] \cap [\![\textit{run}]\!]| > 1/3 \cdot |X|$. To get this we must use, instead of $\textit{many}^{X}$, the binary relativized determiner $\textit{many}^{r}$.

7. RESTRICTED DETERMINERS
There is a fairly well developed logical theory of unary determiners; the main source here is van Benthem (1984). Can the results of this theory be transferred to restricted determiners of the form $D^X$? In general this is not the case. The present section will give some illustrations of what is lost, and what can be preserved, when attention is confined to determiners restricted to a fixed context set. Recall the definition of $D^X$: for all M and all $A, B \subseteq M$,
(RES) $D^X_M AB \Leftrightarrow D_M (X \cap A)\, B$
(we do not have to assume that $X \subseteq M$ here, but we will assume that $X \neq \emptyset$). A property of determiners is preserved under restriction, if, whenever D has the property, so does every $D^X$.
PROPOSITION 2: CONSERV and EXT are preserved under restriction.
PROOF For EXT this is immediate; for CONSERV we have $D^X_M AB \Leftrightarrow D_M (X \cap A)\, B \Leftrightarrow D_M (X \cap A)(X \cap A \cap B)$ (since D is conservative) $\Leftrightarrow D_M (X \cap A)(A \cap B) \Leftrightarrow D^X_M A (A \cap B)$. ∎
[...] DAA for all A, B), as well as monotonicity properties (upward or downward monotonicity in the right or left argument). Many of these are preserved under restrictions; in fact, all of the above-mentioned ones are absolute in the sense that D has the property if and only if every $D^X$ has the property. Johan van Benthem pointed out to me that (part of) this observation is an instance of a general phenomenon: Call a property of determiners simply universal, if it can be expressed in the form
(*) $\forall A_1 \ldots \forall A_n\, \varphi(A_1, \ldots, A_n; D)$,
where $\varphi(A_1, \ldots, A_n; D)$ is a truth functional combination of expressions of the form $D A_i A_j$. Then we have
PROPOSITION 3: All simply universal properties of determiners are absolute.
PROOF This is a consequence of CONSERV. Suppose first that D satisfies (*); we must show that so does $D^X$. Take any $A_1, \ldots, A_n$. As in the proof of Proposition 2, $D^X A_i A_j$ is equivalent to $D (X \cap A_i)(X \cap A_j)$, by CONSERV. It follows that $\varphi(A_1, \ldots, A_n; D^X)$ is equivalent to $\varphi(X \cap A_1, \ldots, X \cap A_n; D)$. But the last statement is a consequence of (*) for D. Conversely, suppose that every $D^X$ satisfies (*), and take any $A_1, \ldots, A_n$. Let $X = A_1 \cup \ldots \cup A_n$. We have $\varphi(A_1, \ldots, A_n; D^X)$, and thus, as before, $\varphi(X \cap A_1, \ldots, X \cap A_n; D)$. Hence, since $X \cap A_i = A_i$, $\varphi(A_1, \ldots, A_n; D)$. ∎
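Propositions 2 and 3 lend themselves to a finite sanity check. The following sketch is only an illustration, assuming determiners are modelled as Python functions of (M, A, B) and taking symmetry as the sample simply universal property; only non-empty context sets inside a three-element universe are tried, and all helper names are hypothetical.

```python
from itertools import combinations

def subsets(universe):
    """All subsets of a finite set, as frozensets."""
    items = sorted(universe)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

def some(M, A, B):  return bool(A & B)
def every(M, A, B): return A <= B
def most(M, A, B):  return len(A & B) > len(A - B)

def restrict(D, X):
    """D^X_M A B iff D_M (X ∩ A) B."""
    return lambda M, A, B: D(M, X & A, B)

def conservative(D, M):
    """CONSERV on M: D A B iff D A (A ∩ B)."""
    S = subsets(M)
    return all(D(M, A, B) == D(M, A, A & B) for A in S for B in S)

def symmetric(D, M):
    """Symmetry on M: D A B iff D B A."""
    S = subsets(M)
    return all(D(M, A, B) == D(M, B, A) for A in S for B in S)

M = frozenset(range(3))
contexts = [X for X in subsets(M) if X]      # non-empty context sets within M

results = {}
for name, D in [("some", some), ("every", every), ("most", most)]:
    keeps_conserv = all(conservative(restrict(D, X), M) for X in contexts)
    absolute = symmetric(D, M) == all(symmetric(restrict(D, X), M) for X in contexts)
    results[name] = (keeps_conserv, absolute)

print(results)
```

Each determiner comes out as (True, True): every restriction stays conservative, and D is symmetric on M exactly when all its restrictions are. Since M itself is among the context sets tried, the informative direction of the second check is from D to its restrictions.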
Of the properties mentioned above, all except those involving monotonicity are simply universal (e.g. the following version of monotonicity is not simply universal: $DAB \;\&\; B \subseteq C \Rightarrow DAC$). Actually, Proposition 3 can be generalized, if one wishes, to wider notions of universality, which include monotonicity. An example of a property not preserved under restriction is anti-symmetry: $DAB \;\&\; DBA \Rightarrow A = B$ (from $D^X AB \;\&\; D^X BA$ we only get $X \cap A = X \cap B$, not A = B; i.e. the identity relation must be restricted too). Other examples are various non-triviality properties (which involve existential quantification over sets): Call D trivial on M, if either DAB for all $A, B \subseteq M$, or DAB for no $A, B \subseteq M$. A weak non-triviality requirement is that, on some M, D is not trivial. This is not preserved when passing from D to $D^X$, and the same holds for the stronger requirement that D is non-trivial on all M. In fact, $D^X$ never has this latter property, since it is trivial on all M such that $M \cap X = \emptyset$. For restricted determiners, the corresponding requirement is instead that $D^X$ is non-trivial on all M which have nonempty intersection with X. A result from B&C is that symmetry is equivalent to the property
(SYMM) $DAB \Leftrightarrow D (A \cap B)(A \cap B)$.
From this it follows that if D is (ir)reflexive and symmetric, then D is trivial on all M. Since QUANT is not used, this holds for restricted determiners too. Other results, which do not use QUANT but involve the stronger non-triviality requirement, may be reformulated for the restricted case. Here is an example from van Benthem (1984): If D is symmetric, quasi-reflexive, and non-trivial on all M, then D = some. A restricted version goes as follows:
THEOREM 4: If $D^X$ is symmetric, quasi-reflexive, and non-trivial on all M such that $M \cap X \neq \emptyset$, then $D^X = \textit{some}^X$ on all such M.
The proof is just a variation of van Benthem's proof, so we omit it. But when QUANT is needed, there may be no 'restricted versions'. For example, van Benthem shows that if D is asymmetric, then D is trivial on all M. Here QUANT is used, and there are in fact non-trivial asymmetric $D^X$, e.g. the following. Let $DAB \Leftrightarrow |A - B| > n/2$ and take X such that $|X| = n$. Then $D^X AB \Leftrightarrow |(A \cap X) - B| > |X|/2$. Clearly $D^X$ is non-trivial (on some M). But $|(A \cap X) - B| > |X|/2$ and $|(B \cap X) - A| > |X|/2$ cannot both be true, so $D^X$ is asymmetric. I conclude that the logical theory of restricted determiners is of somewhat doubtful interest, in view of the importance of QUANT. But then, restricted determiners arise from unrestricted ones, and it can be argued that all unrestricted determiners that are interpretations of natural language DETs actually satisfy QUANT. Such an argument will be produced in section 10. It proceeds, however, via a detour involving a linguistic application of the idea of context sets, and that is the subject of the next section.
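The asymmetric restricted determiner can be examined on a concrete universe. This is a sketch under an explicit assumption: the threshold in the definition of D is taken to be n/2, which makes the two differences (A ∩ X) − B and (B ∩ X) − A too large to fit disjointly inside X. The value n = 4, the six-element universe and the helper names are chosen only for illustration.

```python
from itertools import combinations

def subsets(universe):
    """All subsets of a finite set, as frozensets."""
    items = sorted(universe)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

N = 4                                   # the fixed number n

def D(M, A, B):
    """D A B iff |A − B| > n/2 (the fraction 1/2 is an assumption)."""
    return 2 * len(A - B) > N

def restrict(D, X):
    """D^X_M A B iff D_M (X ∩ A) B."""
    return lambda M, A, B: D(M, X & A, B)

def asymmetric(D, M):
    """No pair A, B ⊆ M with both D A B and D B A."""
    S = subsets(M)
    return not any(D(M, A, B) and D(M, B, A) for A in S for B in S)

def trivial(D, M):
    """D holds of all pairs or of no pairs of subsets of M."""
    S = subsets(M)
    return len({D(M, A, B) for A in S for B in S}) == 1

M = frozenset(range(6))
X = frozenset(range(N))                 # context set with |X| = n
D_X = restrict(D, X)

print(asymmetric(D, M))     # False
print(asymmetric(D_X, M))   # True
print(trivial(D_X, M))      # False
```

The unrestricted D fails asymmetry here because two disjoint three-element sets satisfy it in both directions; once the first argument is cut down to X no such pair remains.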
8. THE DEFINITE ARTICLE
In B&C, as in other places, the definite article is treated as a DET with the following interpretation:
(14) $\textit{the}\,AB \Leftrightarrow \textit{all}\,AB$, if $|A| = 1$; undefined, otherwise.
A slight (Russellian) variant is to make the AB false when $|A| \neq 1$. There are several reasons to be suspicious of this analysis. The first, and most obvious, is that it ignores the contextual reference of the definite article. In general, it is certainly not a condition for the truth of
(15) The dog bit John
that there is exactly one dog in the whole universe. The fact that (14) is still proposed reveals, I think, an instance of tacit use of the FU strategy. But this strategy has already been discredited. Thus, introduction of a context set is essential here. Indeed, the method of section 6 lets us do just that, translating (15) as $\textit{the}^{X} AB$. I am going to suggest, however, another, slightly more radical, solution. A second unhappy feature of (14) is that it treats only the singular use of the. Thus it will not take care of
(16) The dogs bit John.
It is then sometimes suggested that we define another definite article, say $\textit{the}_{pl}$, by
(17) $\textit{the}_{pl}\,AB \Leftrightarrow \textit{all}\,AB$, if $|A| > 1$; undefined, otherwise.
But this may be another case of poor methodology. Prima facie, at least, there seems to be no good reason why the should be ambiguous between the singular and the plural case. More principled arguments can be given. The singular-plural distinction is essential to the definite article. This property distinguishes the from most other DETs. In fact, I am going to suggest (section 10) that the singular-plural distinction is never essential for determiners, in the present framework. Then the is not a DET; an alternative analysis will be given below. The idea that the should not be interpreted as a determiner is not new. For example, Heim (1982) gives a special treatment of definites (and indefinites), although with a rather different motivation.3 Likewise, Barwise & Perry (1983) do not interpret the as an ordinary determiner (ch. 7). A third and final drawback of (14) and (17) is that they do not tell us how to analyze more complex NPs containing the. Consider the sentences:
(18) The boys saw the film
(19) Susan ate the cake
(20) Most of the men love Linda
(21) John kissed each of the girls
(22) The captain knew most of the few survivors
(23) Several of the seven men felt sick
(24) The three boys saw Harry
A uniform treatment of the use of the in all these cases is clearly desirable. The idea behind the analysis of the that I shall propose can be formulated as follows: (THE) the is not a DET but a context indicator which signals the presence of a context set X, in such a way that the A denotes $X \cap A$, a subset of A. It would be possible to let the actually denote X, but I shall not do this. The question of the syntactic category of the will be resumed in section 10.
It is immediately clear that an analysis according to (THE) avoids the first two problems with the standard analysis. It accounts for the presence of contextual reference, and it does not distinguish a singular from a plural the. Instead, the singular-plural distinction comes in a natural way from the syntactic form of the N, as an extra condition on $X \cap A$. But what about real definite descriptions, where the N succeeds in uniquely determining an object (or at least is intended to do that) independently of the context? Since nothing is said here about how context sets are chosen, we can assume for the time being that this is a special case of (THE), for example, one in which the context set is the whole universe M. It remains to analyze (18) - (24). The NPs in these sentences have the following forms:
(i) the A
(ii) D of the A
(iii) $D_1$ of the $D_2$ A
(iv) the $D_2$ A
(iii) can be seen as the most general form, of which the others are special cases. The interpretations that follow conform to this intuition. Consider first (ii). Here the idea of (THE) fits directly: we can let the A give the argument for the determiner. Formally, this can be expressed with a restricted determiner:
(ii') $(D \textit{ of the } A)B \Leftrightarrow D^{X} AB$,
where X is the context set indicated by the. Now (iii) is interpreted by the following extension of this idea:
(iii') $(D_1 \textit{ of the } D_2\, A)B \Leftrightarrow D_1^{X} AB$, if $D_2^{X} AM$; undefined, otherwise.
We see that (ii') is the special case of (iii') when $D_2$ = all. Finally, (i') is obtained by letting D = all in (ii'), and (iv') by letting $D_1$ = all in (iii'). These uniform interpretations are easily seen to give the 'right' meaning to (i) - (iv) (in the case of (i) and (iv) this depends on the fact that only distributive quantification is allowed in the B&C framework; cf. section 10). The only thing one might wish to add is a 'plural condition', corresponding to the fact that the N denoting A in (ii) - (iv) is always plural (except for mass nouns, which are not treated here). This syntactic property appears to have semantic effect, which in our case can be taken care of by adding, on the right hand side of (ii') - (iv'), the condition that $|X \cap A| > 1$ (undefined otherwise). Similarly, we can make (i') sensitive to the syntactic number of the N by requiring that $|X \cap A| = 1$ in the singular case and $|X \cap A| > 1$ in the plural case. The condition $D_2^{X} AM$ for the sentence to have a truth value in (iii') is adapted from the B&C analysis of "there are"-sentences. B&C interpret sentences of the form
(25) There are D A
as DAM. Similarly, our use of the condition $D_2^{X} AM$ becomes clear if it is read as
(26) There are $D_2$ As in X.
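The interpretation (iii') can be sketched with partial determiners, using None for "undefined". This is a minimal illustration under stated assumptions: determiners are Python functions of (M, A, B), numerals are read as "exactly n", the plural condition is left out, and the model (`man`, `sick`, the context set X) is invented for the example. The function name `partitive` is hypothetical.

```python
def all_(M, A, B):
    return A <= B

def most(M, A, B):
    return len(A & B) > len(A - B)

def exactly(n):
    """The determiner 'exactly n': |A ∩ B| = n."""
    return lambda M, A, B: len(A & B) == n

def partitive(D1, D2, X):
    """(D1 of the D2 A)B as in (iii'): D1^X A B if D2^X A M holds, else undefined."""
    def np(M, A, B):
        if not D2(M, X & A, M):      # presupposition: "there are D2 As in X"
            return None              # undefined
        return D1(M, X & A, B)
    return np

# Invented model for illustration.
M = frozenset(range(8))
man = frozenset({0, 1, 2, 3, 4, 5})
sick = frozenset({0, 1, 7})
X = frozenset({0, 1, 2, 6})          # context set: the men in X are 0, 1, 2

most_of_the_three = partitive(most, exactly(3), X)   # form (iii)
the_two = partitive(all_, exactly(2), X)             # form (iv): D1 = all
most_of_the = partitive(most, all_, X)               # form (ii): D2 = all

print(most_of_the_three(M, man, sick))   # True
print(the_two(M, man, sick))             # None
print(most_of_the(M, man, sick))         # True
```

With $D_2$ = all the presupposition is vacuous, so the last case reduces to the restricted determiner of (ii'), in line with the remark that (ii') is a special case of (iii').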
To incorporate the above treatment of the definite article into the B&C framework, we do the following. In L(GQ), delete the determiners the 1, the 2, the 3, ... (these are all primitive in B&C, which hardly seems natural). In particular, the (= the 1) is deleted. Further, add a new determiner-forming operation: from two determiner symbols $D_1$ and $D_2$, and a set variable X (or a set term), form the determiner symbol $D_1 \textit{ of } D_2^{X}$, which is interpreted according to (iii'). Now the n can be expressed: $(\textit{the n}\; A)B \Leftrightarrow (\textit{all of } n^{M} A)B$; so nothing is lost by its omission. As for the fragment and the translation rules, there are various ways to extend these to the present analysis. Perhaps the simplest is the following. Delete the from the list of DETs, and add the syntactic rules:
(DF)
NP → the N
NP → DET of the N
NP → DET of the DET N
NP → the DET N
Then, the corresponding translation rules, according to (i') - (iv') above, are immediate. An obvious adjustment (of (DF) and the translation rules) will take care of the 'plural conditions' just mentioned on the NPs involved. (DF) and (i') - (iv') have been formulated for arbitrary DETs. In natural language, however, there are certain clear restrictions on which DETs may occur here. We return to these restrictions in the next section. Note that we have not introduced any DET-forming rules corresponding to the two determiner-forming operations added to L(GQ). The reason is that the partitive constructions of (ii) and (iii) are difficult to iterate in natural language. For example, a rule like
DET → DET of the DET
would yield ungrammatical NPs like some of the two of the five boys (on the other hand, iteration within a relative clause is allowed by (DF), so e.g. two of the boys that love all of the girls can be generated). Under certain circumstances, however, it appears that the partitive construction can be iterated; this is not included here. 4
Partitive NPs are also discussed in B&C (Appendix A). They add, essentially, a rule
N → of NP,
where the NP must be formed with the or the n or both, and a syntactic partitive marking prevents iteration of the construction. Disregarding both (which is equivalent to the 2), the phrase structures generated in this way can, essentially, be obtained also by (DF) (actually (DF) is a little more general). So on the syntactic side, their treatment and the one given here are rather similar. The semantic treatment, on the other hand, is different in the two cases, mainly because of the stress laid here on the occurrence of context sets (cf. also section 10).
9. RESTRICTIONS ON PARTITIVE NOUN PHRASES
The rules (DF) are subject to certain rather interesting restrictions. To state these, we consider the most general form of NPs in (DF); this form can be written, even more generally, as
(27) $\mathrm{DET}_1$ of $\mathrm{DET}_2$ $\mathrm{DET}_3$ N.
The restrictions, stated below, on the expressions from the standard list of DETs that can take the positions $\mathrm{DET}_i$ here apply, mutatis mutandis, to the other NPs in (DF) as well. As to the position $\mathrm{DET}_1$, we find that only pronominal DETs are allowed. An explanation of this is provided in Hoeksema (ms.): he analyzes all partitives as having the form NP of NP, which in the case of (27) means that a 'dummy' N is present as an argument to $\mathrm{DET}_1$, and only pronominal DETs can occur without the N. Can all pronominal DETs occur here? No, consider
(a) possessive DETs: John's, Susan's, his, their, ...
(b) demonstrative DETs: this, that, these, those.
These are all pronominal, but cannot take the position $\mathrm{DET}_1$ in (27). Concerning the position $\mathrm{DET}_2$, we find that
(a) the possessives
(b) the demonstratives (but only the plural ones, as should be expected from the 'plural condition' on partitives mentioned in section 8)
(c) the definite article
will do. In fact, with the possible addition of both (but cf. section 10), it would seem that precisely these DETs fit here. This is a significant restriction, which may be taken as characteristic of the partitive construction; it will be exploited further in the next section. As for $\mathrm{DET}_3$, finally, we may take our lead from the B&C analysis of "there are"-sentences mentioned before. A determiner is called weak, if, as a binary relation, it is neither reflexive nor irreflexive (otherwise it is strong). It is noted in B&C that weak determiners are characteristic of "there are"-sentences, and this is explained by the fact that such sentences, when analyzed as in B&C, become trivially true or trivially false if constructed with strong determiners. If our interpretation of NPs of the form (iii) in section 8 is correct, we should expect that only weak DETs can occur in position $\mathrm{DET}_3$, and this is indeed the case. However, not all weak DETs fit here; notable exceptions are some, one, a few, no. But, given the 'plural condition' on partitives, these can be excluded for exactly the same kind of reason as the strong ones: they make (26) trivially true or trivially false (assuming that a few means something like at least two). The B&C explanation of the restrictions on "there are"-sentences can thus be successfully extended to the restrictions on the position $\mathrm{DET}_3$ in (27).
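The triviality argument for the position $\mathrm{DET}_3$ can be made concrete on a small universe. The sketch below is an illustration only: weak and strong are checked on one finite universe rather than on all models, a few is read as "at least two" as suggested above, and the helper names `strength` and `existential_values` are hypothetical.

```python
from itertools import combinations

def subsets(universe):
    """All subsets of a finite set, as frozensets."""
    items = sorted(universe)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

def every(M, A, B): return A <= B
def some(M, A, B):  return bool(A & B)
def no(M, A, B):    return not (A & B)
def at_least(n):    return lambda M, A, B: len(A & B) >= n

def strength(D, M):
    """'strong' if D is reflexive or irreflexive on M, otherwise 'weak'."""
    values = {D(M, A, A) for A in subsets(M)}
    return "weak" if values == {True, False} else "strong"

def existential_values(D, M, X, plural=False):
    """Truth values taken by D^X A M, i.e. by 'there are D As in X'.
    With plural=True only A with |X ∩ A| >= 2 are considered."""
    return {D(M, X & A, M) for A in subsets(M)
            if not plural or len(X & A) >= 2}

M = frozenset(range(4))
X = frozenset({0, 1, 2})     # invented context set

print(strength(every, M), strength(some, M), strength(no, M))
print(existential_values(every, M, X))                    # {True}
print(existential_values(some, M, X, plural=True))        # {True}
print(existential_values(no, M, X, plural=True))          # {False}
print(existential_values(at_least(2), M, X, plural=True)) # {True}
print(existential_values(at_least(3), M, X, plural=True)) # contains True and False
```

On this universe every is strong while some, no and at least two are weak, yet under the plural condition the latter three still make the existence condition constant; at least three does not, which matches the pattern of DETs that fit in the third position.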
10. DEFINITES
Let us go back to the question, touched upon in section 8, of how DETs relate to the singular-plural distinction. Here are some simple facts. For most DETs, the syntactic number of the succeeding N is fixed. For example, all, many, few, most, both must take a plural N, whereas every, each, neither take only singular Ns. In a few cases, such as some and no, both singular and plural Ns can follow. But, in the examples mentioned, these syntactic features of DETs have no obvious semantic counterparts for determiners. Indeed, all and every are interpreted as the same determiner. Likewise, although both neither and both presuppose that the succeeding N denotes a set with exactly two elements, one takes a singular and the other a plural N. Also, the semantic difference (if any) between some man and some men, or between no man and no men, has nothing to do with the number of men in the model. These DETs are semantically indifferent to the singular-plural distinction in a sense which can be made precise as follows. We want to call a DET number-sensitive, if, in all well-formed NPs constructed with it, the syntactic number of the N determines a corresponding semantic condition on the interpretation of the N (in each model). The condition is that the set denoted by the N has exactly one element if the N is singular, and at least two elements if the N is plural. This condition, however, should be a presupposition rather than a truth condition - we do not want e.g. the DET two to be number-sensitive, even though $\textit{two}\,AB \Rightarrow |A| \geq 2$ and the N is always plural here.6 This can be expressed, as in B&C, by using partial determiners as interpretations of DETs. Thus, the requirement for number-sensitivity is that the determiner (which interprets the DET) is defined for the argument (set) which interprets the N, just in case this argument satisfies the condition corresponding to the syntactic number of the N. 7 It is easy to check that the usual total DETs (i.e. those interpreted as total determiners) are not number-sensitive, as expected.8 In section 8 we noted, on the other hand, that syntactic number does make a difference to the definite article. Similarly, it matters for the possessives and the demonstratives. However, in these three cases we must also take account of the occurrence of context sets. Consider the NPs
the toy
Susan's toy
this toy.
In each case it is presupposed that a certain set has exactly one element. But this set is not the set of toys in the universe, i.e. the denotation A of the N. Instead, it is that denotation intersected with a context set X. This is clear for the and the demonstratives; for Susan's, X is (usually) the set of things in the universe that belong to Susan (we agree to call this a context set; actually, X is sometimes a (context-given) subset of this set). Thus, these DETs are contextually number-sensitive in the sense that the condition in the above definition of number-sensitivity holds for $X \cap A$ and not for A (the denotation of the N).
They are not number-sensitive in the 'pure' sense. In fact, it seems that there are no number-sensitive DETs in English. This fact is by no means conceptually obvious or necessary; it is easy to imagine a language with number-sensitive DETs. Their absence from natural languages is something which needs to be explained. Here we shall only note that the property of being sensitive to syntactic number and the property of being a context set indicator seem to be linked together in natural language. Before going on, we shall make the following methodological move, in order to avoid certain complications: both and neither are excluded from the list of DETs. More about the reasons, and the justification, for this will be said presently.
We have found, in this and the two preceding sections, that the group of DETs consisting of the definite article, the possessives, and the demonstratives is distinguished from other DETs in several ways:
They are context set indicators
They are (contextually) number-sensitive
They have a special role in partitives (specified in section 9) 9.
Further, it is easy to see that the special syntactic and semantic treatment of the definite article that was suggested in section 8 can be extended to demonstratives and possessives. For example, if X is the set of things belonging to Susan, then Most of Susan's ten cars are new is true just in case the intersection of X with the set of cars has more new elements than old ones, given that it has ten elements (otherwise the sentence lacks truth value), just as the interpretation (iii') of section 8 predicts when adapted for possessives. The following methodological proposal thus seems to be rather well motivated: Remove the above-mentioned expressions from the list of DETs and put them in a special group, the definites (DEF), say. The DEFs can then be treated in the present framework by syntactic and semantic rules modelled on the ones given for the definite article in section 8 (with certain obvious modifications). The arguments given there for the advantages of this analysis over the usual one apply, in fact, to all the DEFs: it accounts for their contextual reference, it avoids treating the definite article and the possessives as ambiguous between a singular and a plural setting (which is in line with the usual assumption that DETs (and DEFs) are constants), and it covers uniformly certain complex NPs with DEFs, in particular certain partitive constructions. Further motivation for the present proposal can be obtained by stating a few consequences of it. We formulate two of these as semantic universals. The first one is
(U1) All natural language determiners are total.
Since the only reason for introducing partial determiners was number-sensitivity, and since we have lifted out the (contextually) number-sensitive DETs, (U1) is reasonable. It results in a notable simplification of the logical theory of determiners (indeed, in existing work on determiner theory such as van Benthem (1984) and Keenan & Stavi (1981), only total determiners are considered).
Another consequence of the proposal is that no DETs are context set indicators. This is not a semantic universal, though, since it does not say anything directly about determiners. In particular, it does not say that no restricted determiners will ever be needed. For we have seen in sections 2 and 5 that restricted determiners may be called for without any explicit indication at all. The following universal, however, is purely semantic. It depends on the proposed separation of the DEFs from the DETs.
(U2) All natural language determiners are quantitative.10
(U2) is based on the assumption that the only serious candidates for nonquantitative DETs are the possessives: permutations of the universe will not in general preserve the ownership relations that pertain. Other possible counterexamples that have been proposed are 1) DETs of the type all blue, as in All blue grapes are tasty, and 2) DETs of type every... but John, as in Every professor but John attended the meeting. But it is not necessary to treat any of these as non-quantitative DETs. In 1), we can either let the N be complex (blue grapes) and use the ordinary DET all, or introduce a binary DET (quantitative), which is the same as using the ordinary all restricted to the set of blue things. In 2), the second option is open, i.e. we can consider every A but a as an operation with a set and an individual as arguments; this operation is quantitative. Or, we can use the ordinary DET every, and just stipulate in translation that special conditions (that John is a professor and that he didn't attend the meeting) must be added. Since many results in determiner theory depend on the assumption of quantity (section 7), (U2) has the effect of making this theory directly relevant for DETs in natural language. We shall end the discussion of the DEFs by commenting on the alternative semantic treatment of these proposed in B&C. There the DEFs are interpreted as determiners, but they are characterized by a special semantic property, called definiteness: a determiner D is definite, if, for all universes M and all $A \subseteq M$ for which D is defined, there is a non-empty set, say $B_A$, such that for all $B \subseteq M$, $D_M AB \Leftrightarrow B_A \subseteq B$. The first thing to note is that this characterization works only if partial determiners are allowed:
68
Dag Wesíerstáhl
PROPOSITION 5: There are no total definite determiners. PROOF Suppose D is total and definite. Take a universe M and let A = 0. D is defined for this argument, so, by definiteness, D^(0B0. Thus, by CONSERV, D M 00, and so, again by definiteness, B0 ç 0, contradicting the requirement that B^ is non-empty •
Let us, however, forget (U1) for the time being, and assume that we have to use partial determiners. One drawback of the B&C analysis is that it neglects the function of DEFs to indicate context sets. About this more than enough has been said already. The semantic property of definiteness, though, is interesting. On closer scrutiny, it seems to tell us two things about these determiners, which, for clarity, could perhaps be kept apart. The first is that definite determiners are all special cases, as it were, of the determiner every. The second is that they make an existence assumption. It is the first property which explains their usefulness in partitives: they create quantifiers that can be reduced to a single set (the generator, $B_A$ above), which can serve as argument for the main determiner in the partitive. We can express this property by weakening the assumptions in the definition of definiteness slightly. Call a determiner D universal, if, for all A for which D is defined, there is a set $B_A$ such that $B_A$ is nonempty if A is, and, for all B, $DAB \Leftrightarrow B_A \subseteq B$. We assume that determiners are logical (section 7), so there is no need to mention the universe M here. Then, clearly, all definite determiners are universal. Now we shall see that universal determiners really are special cases of every:

THEOREM 6: If D is universal then D = every on all arguments for which it is defined.

PROOF Suppose that D is universal and defined for the set A. We must show that $DAB \Leftrightarrow A \subseteq B$, for all B. By universality, we have, for all B, $DAB \Leftrightarrow B_A \subseteq B$. We distinguish two cases.

Case 1: $A = \emptyset$. Then, by the proof of Proposition 5 above, $B_A = \emptyset$, which means that the desired conclusion holds.

Case 2: $A \neq \emptyset$. By universality, it then follows that $B_A \neq \emptyset$. Furthermore, we have $B_A \subseteq A$. To see this, note that, by CONSERV and universality, $DAB \Leftrightarrow B_A \subseteq A \cap B$, for all B. Let $B = B_A$. Since $DAB_A$ holds, by universality, we get $B_A \subseteq A \cap B_A$, i.e. $B_A \subseteq A$. Now we claim that, in fact, $B_A = A$. From this, the theorem follows immediately. To prove the claim, suppose that it is false. Then, by the above, there is an element a in $B_A$, and an element a' in $A - B_A$. Now let f be a function which permutes a and a' but leaves everything else as it is. Since $DAB_A$ holds and D is quantitative, $Df[A]f[B_A]$. Here $f[A] = A$ and $f[B_A] = (B_A - \{a\}) \cup \{a'\} = B_0$. Thus, $DAB_0$, and so, by universality, $B_A \subseteq B_0$. But this is a contradiction, since $a \in B_A - B_0$. •
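The content of Theorem 6 can be checked by brute force on a small universe. The following Python sketch is an illustration only, under stated assumptions: the universe has four elements, a candidate generator `G` plays the role of $B_A$, and "quantitative" is tested only for permutations that map A onto itself (which is all the proof uses). The names `admissible_generators`, `img` and `holds` are our own and do not come from the text.

```python
from itertools import chain, combinations, permutations

def powerset(xs):
    """All subsets of xs, as frozensets."""
    xs = list(xs)
    subsets = chain.from_iterable(combinations(xs, r) for r in range(len(xs) + 1))
    return [frozenset(s) for s in subsets]

M = frozenset(range(4))            # a small universe (assumption)
SUBSETS = powerset(M)
PERMS = [dict(zip(sorted(M), target)) for target in permutations(sorted(M))]

def img(f, X):
    """Image f[X] of the set X under the permutation f."""
    return frozenset(f[x] for x in X)

def admissible_generators(A):
    """Generators G for which the relation D(A, B) <=> G <= B is
    conservative and permutation invariant at the argument A."""
    stabilizers = [f for f in PERMS if img(f, A) == A]
    found = []
    for G in SUBSETS:
        if A and not G:
            continue  # universality: the generator is nonempty when A is
        holds = lambda B, G=G: G <= B
        conservative = all(holds(B) == holds(A & B) for B in SUBSETS)
        quantitative = all(holds(B) == holds(img(f, B))
                           for f in stabilizers for B in SUBSETS)
        if conservative and quantitative:
            found.append(G)
    return found

# For every argument A, the only admissible generator is A itself.
only_A = all(admissible_generators(A) == [A] for A in SUBSETS)
```

In this small model `only_A` comes out true, which is the finite counterpart of the claim $B_A = A$.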
Thus every itself is the only total and universal determiner. But every
cannot be allowed to be definite, since it cannot occur in the desired positions in partitives. This is where the existence assumption comes in. Requiring that the generator $B_A$ is always non-empty is, as we have seen, the same as requiring that D is undefined for $A = \emptyset$, i.e. that it presupposes that the argument is non-empty. In conclusion, it seems to me that the B&C analysis uncovers an interesting property of partitives (i.e. universality), in addition to the ones that have already been mentioned. But Theorem 6 shows that the generator $B_A$, i.e. the set which 'replaces' the quantifier DA, is actually identical to A. Thus it is quite feasible to use A directly in the interpretation of partitives, instead of recovering it from a quantifier, and this is precisely what our alternative proposal does (modulo context sets).

It remains to say something about both and neither. Actually, they fit rather badly in the patterns we have discerned. To begin with, they seem number-sensitive in some way, but the condition of (contextual) number-sensitivity formulated earlier fails, as is easily seen, for them. They also seem to be context set indicators (e.g. both boys is synonymous with both (of) the boys). But in partitives, they can appear before of, in contrast with the DEFs. Furthermore, neither can occur in none of the DEF positions in partitives, and both only in very few of them (some instances of the NP form (ii) in section 8 are possible with both, but none of the forms (iii) and (iv)). So we don't want them as DEFs. On the other hand, if they are DETs, universal (U1) fails. Thus, one would prefer to give them a separate treatment, and not assimilate them to either DETs or DEFs. There are in fact independent reasons for doing this. One is that in many languages it is impossible to treat them as DETs: they do not form NPs out of Ns. Here I am not thinking only about languages like French, which seems to lack these constructions altogether.
But in Swedish, for example, the words båda and ingendera are quite accurate translations of both and neither, respectively, and occur in similar positions. Yet they cannot combine with Ns: båda män (both men) is impossible; one has to say båda männen (both the men). Whatever the right analysis of these words, I conclude that one shouldn't be too concerned if both and neither form exceptions to linguistic patterns that are valid for DETs (or DEFs).

One final word, to avoid a possible misunderstanding. The semantic framework used in all of this paper is the modeltheory of B&C. It is in this framework that we can say, for example, that DETs are not number-sensitive, i.e. that syntactic number is irrelevant for determiners. With a more sophisticated modeltheory, such statements have to be revised. For example, if collective quantification is allowed, the above statement is no longer true; indeed we can explain the difference between all and every regarding syntactic number by their different behaviour in this
respect: every can never be used collectively, whereas all can. Also, the treatment of partitives would have to be modified (though I think the basic ideas can be preserved). The advantage of the simpler framework is, of course, that it is more familiar: it is easier to prove things in it. It seems to me that 'classical' determiner theory, and its use in the semantics of natural language, is yet far from exhausted.
NOTES
1. The reason for using the term "discourse universe" here will become clear in what follows.
2. There are exceptions; for example, Hausser (1974) uses variables for what I have called context sets (for certain quantifiers), and Smaby (1979) discusses 'variable domains' for the universal and existential quantifiers.
3. She treats only the singular the. Also, her framework is not the simple model theory of B&C but something more like the discourse representation semantics of Kamp (1981). The contextual reference of the is basic in her treatment, however, and she accounts for it by means of free individual variables, rather similarly to our set variables. Finally, she not only gives a formal framework, but also attempts to explain how values are given to the variables.
4. Alice ter Meulen suggested several examples, e.g. Two of the five who flunked of the boys will take the test again.
5. A further exception could be NPs like (a) two of all the boys. I prefer to regard all the boys as a partitive in itself (all of the boys); (a) is then an iterated partitive.
6. In other words, two lions roar is simply false if there are less than two lions in the model - there is no presupposition about the number of lions.
7. For a related, but different, type of 'semantic number' condition, cf. van Eijck (1983), who studies how syntactic and semantic number of Ns is related to mechanisms of anaphora.
8. Under reasonable assumptions about the language it should follow that no total DETs are number-sensitive.
9. This special role is clearly tied to the fact that the partitive construction itself acts as a kind of context set indicator: in an NP of the form DET of ... N (or NP of ... N) we are in general not talking about the whole denotation of N but only a contextually given subset of it.
10. This is a strengthening of a semantic universal proposed in Keenan & Stavi (1981), which says (roughly) that all simple DETs are quantitative.
REFERENCES
Barwise, J. & Cooper, R.: (1981), "Generalized quantifiers and natural language", Linguistics and Philosophy 4, 159-219.
Barwise, J. & Perry, J.: (1983), Situations and Attitudes, MIT Press/Bradford, Cambridge.
van Benthem, J.: (1984), "Questions about quantifiers", Journal of Symbolic Logic 49:2, 443-466.
van Eijck, J.: (1983), "Discourse representation theory and plurality", in Studies in Modeltheoretic Semantics, ed. A. ter Meulen, GRASS series vol. 1, Foris, Dordrecht.
Hausser, R.: (1974), Quantification in an Extended Montague Grammar, unpubl. dissertation, Univ. of Texas, Austin.
Heim, I.: (1982), The Semantics of Definite and Indefinite Noun Phrases, dissertation, Univ. of Massachusetts, Amherst.
Hoeksema, J.: (1984), "Partitives", manuscript, RUG.
Kamp, H.: (1981), "A theory of truth and semantic representation", in Formal Methods in the Study of Language, eds. J. Groenendijk et al., Mathematical Centre, Amsterdam. Reprinted in Truth, Interpretation and Information, GRASS 2, Foris Publ. (1984).
Keenan, E. & Moss, L.: (1984), "Generalized quantifiers and the expressive power of natural language", this volume.
Keenan, E. & Stavi, J.: (1981), "A semantic characterization of natural language determiners", to appear in Linguistics and Philosophy.
Smaby, R.: (1978), "Ambiguous coreference with quantifiers", in Formal Semantics and Pragmatics for Natural Languages, eds. F. Guenthner & S.J. Schmidt, Reidel, Dordrecht.
Westerståhl, D.: (1982), "Logical constants in quantifier languages", to appear in Linguistics and Philosophy.
Chapter 4
Generalized Quantifiers and the Expressive Power of Natural Language Edward L. Keenan and Lawrence S. Moss
This paper pursues the model theoretic semantics of natural language determiners (quantifiers) exemplified in Barwise & Cooper (1981), van Benthem (1982, 1983a, 1983b), Keenan (1981), Keenan & Moss (1984), Keenan & Stavi (1981), Thijsse (1982, 1983), Westerståhl (1982), and Zwarts (1983). Like many of these works, ours owes a certain debt to the earlier work of Mostowski (1957) and Lindström (1966). Following most closely the notation of Keenan & Stavi (henceforth K&S) we treat one place determiners (det$_1$'s) as expressions, such as every, John's, which combine with one common noun phrase (CNP), such as house, white house, to form a full noun phrase (NP): every house, John's white house. The purpose of this paper is twofold: First, we explore the semantic properties of k > 1 place determiners (det$_k$'s) - expressions which combine with k CNP's to form an NP. Second, we investigate the contribution to the expressive power of English of the various classes of det$_k$'s we distinguish. The paper is organized in four sections: The first provides the linguistic motivation for studying k-place dets. The second presents several subclasses of such dets, concentrating on those we call "cardinal" and "logical". It is these classes, together with the full class of "non-logical" dets, whose expressive power is investigated in sections 3 and 4. The latter notes several unsolved problems. Before initiating our discussion, we review below several basic concepts and some notation that will be used throughout the rest of the paper.
0. BACKGROUND NOTIONS: CONSERVATIVITY
We shall think of CNP's as denoting properties, and full NP's as denoting sets of properties. Denotations of proper nouns (e.g. John) are rather special property sets we call individuals. Our way of representing the truth conditions of e.g. John is a linguist will be to say that the linguist property is an element of the individual John. Similarly every student is a linguist is true iff the linguist property is an element of the every student set of properties. We limit ourselves to extensional representations and thus regard properties p and q as the same iff they are members of the same individuals. Thus up to isomorphism the set P of properties of a model is the power set of the set of individuals.

* The authors would like to thank the Max-Planck-Institut für Psycholinguistik for having supported this research.

P then possesses a rich boolean structure. Specifically, for properties p and q we write $(p \wedge q)$ for the property an individual I has iff he has both p and q, that is, $p \in I$ and $q \in I$. Similarly $(p \vee q) \in I$ iff $p \in I$ or $q \in I$. So $(p \vee q)$ is the property of being either a p or a q. More generally, if K is a set of properties then $\bigwedge K$ is the property an individual has iff he has each of the properties in K. And $\bigvee K \in I$ iff for some $k \in K$, $k \in I$. The symbols $\wedge$ and $\vee$ are read as meet and join respectively. We use p' for the property an individual has iff he does not have p. And the boolean $\le$ relation is understood as follows: $p \le q$ means every individual with p also has q. We write $F_p$, called the filter generated by p, for $\{q : p \le q\}$. An atomic property, for which we use variables α, β, γ, δ, is one which exactly one individual has. We use 0 for the trivial property that no individual has, and 1 for the trivial property which every individual has. Our notation here is completely compatible with views which treat (extensional) properties as sets of entities. On those views, for example, the property we write as $(p \wedge q)$ would be written $(p \cap q)$, and p 1 }. We note here that the distinction between an English expression and the item that interprets it is an important one. Nonetheless, where clear from context we often blur the distinction. So sometimes we write every for the English expression and sometimes for the function from P into P* defined above.
Several of the papers cited earlier treat det$_1$'s semantically as binary relations between properties rather than, as on our approach, functions from properties to sets of properties. So those approaches think of every in every student is a vegetarian as expressing a relation between the student and the vegetarian properties. As per (1) below these approaches are obviously mathematically equivalent.

(1) $R(p, q)$ iff $q \in f_R(p)$
That is, given a binary relation R on P, (1) defines a function $f_R$ from P into P*. The map sending R to $f_R$ is a bijection from the set of binary relations on P to the set of functions from P into P*. Whence anything that can be said on one approach can be said on the other. Nonetheless, the two approaches have led to somewhat different questions. See e.g. the references given earlier to Zwarts, van Benthem, and Thijsse for investigations arising naturally from the relational point of view. Conversely, the functional point of view leads to questions of the effability sort raised in section 3 of this paper which do not arise so naturally on the relational view. Moreover the functional approach provides a more natural means of directly interpreting English sentences with more than one NP. On that approach we directly assign sets of properties to each of the NP's in every student read at least one book and interpret read as a binary relation on property sets (See Keenan & Faltz (1978) and (to appear) for details here). The relational approach is awkward here as the sentence in question does not have enough property denoting expressions to interpret each of the dets as a binary relation between properties.

Now K&S consider in detail the question whether, for arbitrary P, just any function from P into P* is a possible det$_1$ denotation. We include in the Appendix a list of det$_1$'s they considered. Clearly the list is diverse, including several sorts of expressions that many would not want to assign
the category Det to in English. Nonetheless, it was found that the natural denotations for all the expressions in the list satisfied a condition they called conservativity. A function f from P into P* is called conservative iff for all $p, q \in P$, $p \in f(q)$ iff $(p \wedge q) \in f(q)$. To verify, for example, that every is conservative it is sufficient to see that each of the sentences in (2) below entails the other.

(2) a. Every student is a vegetarian
    b. Every student is both a student and a vegetarian

That is, the vegetarian property is an element of every student iff the property of being both a student and a vegetarian is an element of that set. Similarly, the reader may verify that John's is conservative by checking that (3a,b) below entail each other.

(3) a. John's house is a meeting place for revolutionaries
    b. John's house is both a house and a meeting place for revolutionaries
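The conservativity condition lends itself to a mechanical check on a small model. The Python sketch below is an illustration under our own assumptions: there are two individuals, properties are modelled as sets of individuals (so meet is intersection), and `only_like` is a hypothetical non-example introduced purely for contrast. The brute-force count at the end gives a feel for how restrictive the condition is in this tiny model.

```python
from itertools import chain, combinations

def powerset(xs):
    """All subsets of xs, as frozensets."""
    xs = list(xs)
    subsets = chain.from_iterable(combinations(xs, r) for r in range(len(xs) + 1))
    return [frozenset(s) for s in subsets]

ATOMS = ("a", "b")          # two individuals (assumption)
P = powerset(ATOMS)         # properties modelled as sets of individuals

def every(q):
    """every(q) = the properties that all q's have."""
    return frozenset(p for p in P if q <= p)

def some(q):
    """some(q) = the properties that at least one q has."""
    return frozenset(p for p in P if p & q)

def only_like(q):
    """Hypothetical non-example: p is in f(q) iff every p is a q."""
    return frozenset(p for p in P if p <= q)

def is_conservative(f):
    """p in f(q) iff (p meet q) in f(q), for all properties p and q."""
    return all((p in f(q)) == ((p & q) in f(q)) for p in P for q in P)

def count_conservative():
    """Count the conservative functions from P into P* by brute force.
    A function is coded as a bitmask over pairs (p, q); the bit for
    (p, q) is set when p is in f(q)."""
    pairs = [(p, q) for p in P for q in P]
    index = {pq: i for i, pq in enumerate(pairs)}
    partner = [(index[(p, q)], index[(p & q, q)]) for p, q in pairs]
    return sum(
        all(((mask >> i) & 1) == ((mask >> j) & 1) for i, j in partner)
        for mask in range(2 ** len(pairs))
    )

n_conservative = count_conservative()
n_all = 2 ** (len(P) * len(P))
```

With two individuals there are 16 pairs (p, q), hence `n_all` functions in total, of which `n_conservative` satisfy the condition.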
We note that the conservativity constraint is a very strong semantic universal. Most functions from P into P* are not conservative (see section 2). The intuition underlying conservativity is the following: the argument property of the one place function f determines the universe of individuals which must be checked to decide whether an arbitrary property is in the set that f associates with its argument property. In this checking we need not concern ourselves with individuals which do not have the argument property. Note here that this intuition is more naturally arrived at from a functional rather than a relational point of view, where a priori there is no reason to expect any asymmetry in the relations that the two properties bear to the determiner relation. Moreover, this intuition extends naturally to k > 1 place dets. Consider for example

(4) More students than teachers attended the rally
We argue in section 1 that the subject NP of (4), more students than teachers, should be interpreted as the value of a two place function more than at the pair of properties (student, teacher). Now to decide whether the property of attending the rally is an element of that set, we must investigate more than the individuals with the student property, we must also know something about the individuals with the teacher property. So here the universe of individuals we need to consider must include, and can be limited to those which are either students or teachers. No others need be considered. Thus we may call a two place function on P conservative iff, for all $p, q, r \in P$, $p \in f(q, r)$ iff $(p \wedge (q \vee r)) \in f(q, r)$. Generalizing,

DEFINITION 1. A function f from $P^k$ into P* is conservative iff for all $p \in P$ and all $\bar{q} \in P^k$, $p \in f(\bar{q})$ iff $(p \wedge \bigvee_{i=1}^{k} q_i) \in f(\bar{q})$.
Note the notation $\bar{q}$ for k-tuples of properties $(q_1, q_2, \ldots, q_k)$. In the sections that follow we also often use an equivalent definition of conservativity given by Thm 1 below:

THEOREM 1. A function f from $P^k$ into P* is conservative iff for all $p_1, p_2 \in P$ and all $\bar{q} \in P^k$, if $p_1 \wedge q_i = p_2 \wedge q_i$, all i between 1 and k, then $p_1 \in f(\bar{q})$ iff $p_2 \in f(\bar{q})$.

From the perspective of Thm 1, the conservative functions are just those which cannot distinguish between two properties which have the same meet (intersection) with each of their arguments: They either put both in or neither. For example, in the case k = 1 Thm 1 says that f is conservative iff if the individual q's with $p_1$ are just those with $p_2$ then either both $p_1$ and $p_2$ are in f(q) or neither is. In what follows we note the set of k-place conservative functions by CONS$_k$. We use CONS for $\bigcup_k$ CONS$_k$. Note of course that just which functions are in CONS$_k$ depends on what underlying P we have. In general, the theorems we prove universally quantify over the choices for P, and when P must meet some particular condition we note that fact.

A final point of notation: the set $F_{P^*/P^k}$ of functions from $P^k$ into P* possesses a natural boolean structure, defined pointwise, as follows: If K is a set of such functions then $\bigwedge K$ is that function which sends each $\bar{p} \in P^k$ to $\bigcap_{k \in K} k(\bar{p})$. $\bigvee K$ sends each $\bar{p}$ to $\bigcup_{k \in K} k(\bar{p})$. Of course, if K is a small set, say K = {f,g}, then we write $(f \wedge g)$ rather than $\bigwedge\{f,g\}$. Similarly we write $(f \vee g)$ rather than $\bigvee\{f,g\}$. If $f \in F_{P^*/P^k}$ then f' is that function which maps each $\bar{p}$ to $(f(\bar{p}))'$, the set of properties which are not in $f(\bar{p})$. Further we set $0_k$ as that function which sends each $\bar{p}$ to $\emptyset$, and $1_k$ as that function which sends each $\bar{p}$ to P, the set of all properties. Finally we note that for any two such functions f,g f 1 place functions in these cases. The objection fails however, for several independent and enlightening reasons.
Quite generally the NP derived by such a rule is not a paraphrase of its source, whence such a rule cannot correctly represent the semantic relation between the NP's. For example, (12a) below is not a paraphrase of (12b), but rather means the same as (12c).

(12) a. no student or teacher
     b. no student or no teacher
     c. no student and no teacher

Similarly (13a) is obviously not a paraphrase of (13b), nor is (14a) a paraphrase of (14b).

(13) a. some student's hat and coat
     b. some student's hat and some student's coat

(14) a. every student's hat or coat
     b. every student's hat or every student's coat
Note that paraphrase fails in (13) and (14) for essentially the same reason that sentences derived by the rule fail to be paraphrases, e.g. Some student came early and left late is not a paraphrase of Some student came early and some student left late.
How shall we interpret these NP's? We consider first the easy case in (12). As a det$_1$, no is interpreted as the complement (') of some. So for each property p, no(p) = (some(p))'. The latter equality follows from the pointwise definition of ' in CONS$_1$. And (some(p))' is the set of properties which fail to be elements of at least one individual with p. Now, since some student or teacher obviously denotes the same set of properties as some student or some teacher we may interpret the former NP as the value of $some_{or,2}$ at the pair (s,t), where s interprets student and t teacher. (Where clear from context we often omit the numerical subscript in expressions like $some_{or,2}$.) And we may naturally interpret no...or... in (12a) as the complement of $some_{or}$. This analysis entails the correct meaning relations in (15):

(15) no student or teacher
     = $(some_{or})'(s,t)$                 def of interpretation
     = $(some_{or}(s,t))'$                 pointwise def of '
     = $(some(s) \cup some(t))'$           def of $some_{or}$
     = $(some(s))' \cap (some(t))'$        de Morgan laws
     = $some'(s) \cap some'(t)$            pointwise def of '
     = $no(s) \cap no(t)$                  def of int. of no
     = no student and no teacher           def of int.
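The chain of equalities in (15) can be confirmed on a small model. The sketch below is illustrative only: it assumes three individuals, treats properties as sets of individuals, and uses our own function names (`some_or`, `no_or`) for the two place functions discussed in the text.

```python
from itertools import chain, combinations

def powerset(xs):
    """All subsets of xs, as frozensets."""
    xs = list(xs)
    subsets = chain.from_iterable(combinations(xs, r) for r in range(len(xs) + 1))
    return [frozenset(s) for s in subsets]

ATOMS = ("a", "b", "c")     # three individuals (assumption)
P = powerset(ATOMS)
ALL = frozenset(P)          # the set of all properties

def some(p):
    return frozenset(q for q in P if p & q)

def no(p):
    """no = the pointwise complement of some."""
    return ALL - some(p)

def some_or(s, t):
    """some s or t, read as: some s or some t."""
    return some(s) | some(t)

def no_or(s, t):
    """no s or t, read as the complement of some_or."""
    return ALL - some_or(s, t)

# (12a) agrees with (12c) everywhere ...
same_as_and = all(no_or(s, t) == no(s) & no(t) for s in P for t in P)
# ... but differs from (12b) for at least one pair of properties.
differs_from_or = any(no_or(s, t) != no(s) | no(t) for s in P for t in P)
```

For instance, with s = {a} and t = {b}, the property {b} belongs to no(s) but not to `no_or(s, t)`, which is why (12a) and (12b) come apart.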
While we have already remarked that for each k, CONS$_k$ (the set of k-place conservative functions) is closed under the pointwise defined boolean operations, the existence of the interpretations needed for expressions such as no student or teacher provides empirical motivation for wanting the set of k-place functions needed for English to be closed under complements defined pointwise. Many other naturally expressible examples also come to mind: E.g. not more ... than ... will be interpreted as the complement of more ... than ..., a point we shall not further elaborate.

Consider now (5e), some student's hat and coat. Referring the reader to K&S for details, we note that simple one place possessive dets such as John's are interpreted as per:

(16) John's(p) = the(p which John has)

More generally (the example assumes that John's is singular) we define for each natural number n,

(17) (John's n)(p) = (the n)(p which John has)

where

(18) $(the\ n)(q) =_{def} \begin{cases} every(q) & \text{if } |q| = n \\ \emptyset & \text{otherwise} \end{cases}$
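Definitions (16)-(18) translate directly into a few lines of code. The sketch below is an illustration under our own assumptions: the individuals are three items of clothing, "p which John has" is modelled as the meet of p with a property `owned` (the things the possessor has), and the names `the_n` and `possessive` are hypothetical labels rather than notation from K&S.

```python
from itertools import chain, combinations

def powerset(xs):
    """All subsets of xs, as frozensets."""
    xs = list(xs)
    subsets = chain.from_iterable(combinations(xs, r) for r in range(len(xs) + 1))
    return [frozenset(s) for s in subsets]

ATOMS = ("hat1", "hat2", "coat1")   # a toy universe (assumption)
P = powerset(ATOMS)

def every(q):
    return frozenset(p for p in P if q <= p)

def the_n(n):
    """(the n)(q) = every(q) if |q| = n, and the empty set otherwise."""
    return lambda q: every(q) if len(q) == n else frozenset()

def possessive(owned, n=1):
    """(X's n)(p) = (the n)(p which X has), with 'p which X has'
    modelled as p & owned (an assumption of this sketch)."""
    return lambda p: the_n(n)(p & owned)

hat = frozenset({"hat1", "hat2"})
johns = possessive(frozenset({"hat1", "coat1"}))   # John has one hat
marys = possessive(frozenset({"hat1", "hat2"}))    # Mary has two hats

johns_hat = johns(hat)   # every({hat1}): the properties of John's one hat
marys_hat = marys(hat)   # empty: singular "Mary's hat" fails with two hats
```

Both possessive functions are conservative in the sense defined earlier, since their value at p depends only on p & owned.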
The definition of the det$_1$'s the n or more and John's n or more proceeds in the obvious way. And in general an NP of the form the student is interpreted by the one at the student property, and the students by the two or more at the student property. Analogously for John's hat and John's hats. Assuming now, for simplicity of presentation, that everything is in the singular, we have that for each individual I, I's is a one place (conservative) function on P. It sends each property q to whatever the one sends q which I has to. Consider now possessive dets such as some student's, where the possessor NP does not denote an individual. (We give this much detail here, as possessive dets (see section 3) account for an important part of the "non-logical" expressive power in English. We note further that possessive dets are the most complex that we shall need to consider). Such NP's always denote boolean combinations of individuals (see Keenan & Faltz (1978) and (to appear) for details here), and we obtain correct interpretations for NP's with such possessors if we interpret the 's pointwise on the individuals. E.g., where I ranges over individuals,

(19) some student's hat
     = $((\bigcup_{I \ni st} I)\text{'s})(hat)$        def of int. of some
     = $[\bigvee_{I \ni st} (I\text{'s})](hat)$        pointwise def of 's
     = $\bigcup_{I \ni st} (I\text{'s}(hat))$          pointwise def of $\bigvee$
     = the set of properties q such that for some individual I with the student property, q is a property of I's hat.

Let us return now to some student's hat and coat. Since John's hat and coat denotes the same property set as John's hat and John's coat we obtain a correct interpretation of the former NP by interpreting it as the value of $John\text{'s}_{and}$ at the (hat, coat) pair, where $John\text{'s}_{and}$ (ignoring subscripts) is the k-place det formed from the one place det John's in part (i) of Thm 2. And in general for any individual I, we have a k-place det $(I\text{'s})_{and}$. The natural extension of the det$_1$ some student's then is given by:
(20) $(some\ student\text{'s})_{and} = ((\bigcup_{I \ni st} I)\text{'s})_{and} = \bigvee_{I \ni st} (I\text{'s})_{and}$
Interpreting then some student's hat and coat as the value of $(some\ student\text{'s})_{and}$ at the pair (hat, coat), we obtain the correct reading:

(21) some student's hat and coat
     = $(some\ student\text{'s})_{and}(hat, coat)$                     def of int
     = $[\bigvee_{I \ni st} ((I\text{'s})_{and})](hat, coat)$          by (20)
     = $\bigcup_{I \ni st} ((I\text{'s})_{and}(hat, coat))$            pointwise def of $\bigvee$
     = $\bigcup_{I \ni st} (I\text{'s}\ hat \cap I\text{'s}\ coat)$    def of $(I\text{'s})_{and}$
     = the set of properties q such that for some individual I with the student property, q is a property of I's hat and of I's coat.
Note crucially here that $some\ student\text{'s}_{and}$ is not directly interpreted distributively. Rather 's behaves as a homomorphism and $some\ student\text{'s}_{and}$ is interpreted as a join of functions of the form $(I\text{'s})_{and}$ which are interpreted distributively. This analysis then, and the comparable one for every student's, motivate the claim that the set of k-place functions needed to interpret English NP's should be closed under arbitrary joins and meets. There are other NP's in English of the form det cnp$_1$, ... and cnp$_k$ which are not interpreted distributively. An important class are those like (5f), the ninety-two students and teachers, which are interpreted disjunctively. We define,

(22) $(the\ 92)_{and}(s, t) = \begin{cases} every(s \vee t) & \text{if } |s \vee t| = 92 \\ \emptyset & \text{otherwise} \end{cases}$
Looking through the list of det$_1$'s in the Appendix it seems to us that most prefer the sort of disjunctive reading when they occur in constructions like (5f). We shall make no attempt to characterize just which dets prefer or require the distributive readings and which the disjunctive readings. We are only interested in which k-place functions we need to interpret NP's, not how particular NP's get associated with those functions.
To this end, then, we must assure ourselves that the type of definition given in (22) stays within the class of conservative functions. Specifically, may we define conservative functions in terms of others by giving just any conditions, e.g. $|s \vee t| = 92$, and can we define conservative functions by taking joins over their arguments as in (22)? Thm 3 below assures us that the answer to the latter question is affirmative, and Thm 4 that the answer to the former one is.

THEOREM 3 For any $f \in$ CONS$_1$ define $f^+ : P^k \to P^*$ by $f^+(\bar{q}) = f(\bigvee_i q_i)$. Then $f^+$ is conservative, and the map $f \mapsto f^+$ is an embedding of CONS$_1$ into CONS$_k$.

PROOF For the conservativity of $f^+$, note that
     $p \in f^+(\bar{q})$ iff $p \in f(\bigvee q_i)$
     iff $p \wedge \bigvee q_i \in f(\bigvee q_i)$          since f is 1-conservative
     iff $p \wedge \bigvee q_i \in f^+(\bar{q})$.
To see that the map $f \mapsto f^+$ is one-to-one, suppose that $f \neq g$. Then for some property p, $f(p) \neq g(p)$. Let $\bar{p}$ be the tuple that is constantly p (i.e., $p_i = p$ for all i). Then $f^+(\bar{p}) \neq g^+(\bar{p})$. Finally, to see that our map is an embedding, one need only check that the boolean operations are preserved, and we omit this routine calculation. •

Theorem 4 below is a generalization of "definition by cases".

THEOREM 4 (Scrambling Theorem): Let $S : P^k \to$ CONS$_k$ be arbitrary. Define a map $f_S : P^k \to P^*$ by $f_S(\bar{q}) = (S(\bar{q}))(\bar{q})$. Then $f_S$ is k-conservative.

PROOF
     $p \in f_S(\bar{q})$ iff $p \in (S(\bar{q}))(\bar{q})$
     iff $p \wedge \bigvee q_i \in (S(\bar{q}))(\bar{q})$          (since $S(\bar{q})$ is k-cons.)
     iff $p \wedge \bigvee q_i \in f_S(\bar{q})$ •
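Theorems 3 and 4 can be illustrated together on a small model. The Python sketch below is an assumption-laden illustration, not the authors' construction: properties are sets drawn from three individuals, `plus` implements the map sending f to $f^+$, `scramble` implements $f_S$, and `the_n_disjunctive` is a hypothetical small-scale analogue of a disjunctively read "the n ... and ..." determiner, with a cardinality condition on the join of the arguments.

```python
from itertools import chain, combinations, product

def powerset(xs):
    """All subsets of xs, as frozensets."""
    xs = list(xs)
    subsets = chain.from_iterable(combinations(xs, r) for r in range(len(xs) + 1))
    return [frozenset(s) for s in subsets]

ATOMS = ("a", "b", "c")     # three individuals (assumption)
P = powerset(ATOMS)

def join(qs):
    """Join of a tuple of properties (union, on the set view)."""
    return frozenset().union(*qs)

def every(q):
    return frozenset(p for p in P if q <= p)

def zero(*qs):
    """The zero function: sends every tuple to the empty property set."""
    return frozenset()

def plus(f):
    """Theorem 3: f+(q1, ..., qk) = f(q1 v ... v qk)."""
    return lambda *qs: f(join(qs))

def scramble(S):
    """Theorem 4: f_S(q) = (S(q))(q), where S picks a function per tuple."""
    return lambda *qs: S(*qs)(*qs)

def the_n_disjunctive(n):
    """Definition by cases: every+ when the join has exactly n members,
    the zero function otherwise."""
    every_plus = plus(every)
    return scramble(lambda *qs: every_plus if len(join(qs)) == n else zero)

def is_conservative_k(f, k):
    """Definition 1: p in f(q) iff (p meet the join of q) in f(q)."""
    return all((p in f(*qs)) == ((p & join(qs)) in f(*qs))
               for qs in product(P, repeat=k) for p in P)

the_two = the_n_disjunctive(2)
a, b = frozenset({"a"}), frozenset({"b"})
```

Here `the_two(a, b)` coincides with `every(a | b)`, while `the_two(a, a)` is empty because the join has only one member; both `plus(every)` and `the_two` pass the conservativity check for k = 2 and k = 3.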
That $(the\ 92)_{and}$ defined in (22) is conservative follows basically from Thm 3 and Scrambling. (Note in this regard that $\emptyset$, the empty set, as it occurs in (22), may always be regarded as $0_k(\bar{p})$ for any k-tuple $\bar{p}$. $0_k$ recall is the zero element of the algebra CONS$_k$. It sends every k-tuple to the empty set of properties.)

"Inherently" k > 1 Place Dets
We have so far motivated a surprisingly large number of k-place dets, since by and large for each one place det d we have two k-place dets $d_{and,k}$ and $d_{or,k}$ for each k. These dets, however, are built up from one place ones, and it is natural to wonder whether there are any "inherently" k > 1 place dets, that is, ones which are not built up in some way from one place ones. It seems that the numerical comparatives in (6), e.g. more ... than ..., the same percentage of ... as ..., are such a case. A direct definition of the two place function $more_{than}$ is given in (23):

(23) $more_{than}(p, q) = \{s : |s \wedge p| > |s \wedge q|\}$
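Definition (23) is easy to experiment with. The following Python sketch is illustrative only and rests on our own assumptions: a universe of three individuals, properties as sets, and two hypothetical checker functions, one for the join-based definition of two place conservativity and one for the "same meets" formulation of Thm 1.

```python
from itertools import chain, combinations, product

def powerset(xs):
    """All subsets of xs, as frozensets."""
    xs = list(xs)
    subsets = chain.from_iterable(combinations(xs, r) for r in range(len(xs) + 1))
    return [frozenset(s) for s in subsets]

ATOMS = ("a", "b", "c")     # three individuals (assumption)
P = powerset(ATOMS)

def more_than(p, q):
    """more_than(p, q) = {s : |s meet p| > |s meet q|}, as in (23)."""
    return frozenset(s for s in P if len(s & p) > len(s & q))

def conservative_by_join(f):
    """s in f(p, q) iff (s meet (p join q)) in f(p, q)."""
    return all((s in f(p, q)) == ((s & (p | q)) in f(p, q))
               for p, q, s in product(P, repeat=3))

def conservative_by_meets(f):
    """Thm 1 version: properties with the same meets with p and with q
    are either both in f(p, q) or both out."""
    for p, q in product(P, repeat=2):
        value = f(p, q)
        for s1, s2 in product(P, repeat=2):
            same_meets = (s1 & p == s2 & p) and (s1 & q == s2 & q)
            if same_meets and (s1 in value) != (s2 in value):
                return False
    return True

students = frozenset({"a", "b"})
teachers = frozenset({"c"})
everyone = frozenset(ATOMS)
more_students = everyone in more_than(students, teachers)
```

In this model `more_students` is true (two students against one teacher), and `more_than` passes both conservativity checks.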
That $more_{than}$ is conservative follows directly from the Thm 1 version of conservativity. Interpreting (6a), more students than teachers, as the value of $more_{than}$ at the pair (s,t) clearly yields the correct result. That is, more students than teachers are vegetarians is true on our interpretation just in case the number of students with the vegetarian property is greater than the number of teachers with that property, which is correct. There is quite a variety of English expressions comparable to more ... than ... which form NP's interpreted by functions defined in the same way as $more_{than}$ is. They include: fewer ... than ..., exactly as many ... as ..., between five and ten times as many ... as ..., three more ... than ..., the same number of ... as ..., etc. A slightly more complex group includes proportionately more ... than ..., proportionately fewer ... than ..., the same proportion of ... as ..., a greater percentage of the ... than of the ..., less than half as many ... as .... To see that e.g. proportionately $more_{than}$ is in CONS$_2$ we define:

(24) $s \in prop.\ more_{than}(p, q)$ iff $\frac{|s \wedge p|}{|p|} > \frac{|s \wedge q|}{|q|}$ and $|p| > 0$ & $|q| > 0$ and both are finite
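The proportional comparative in (24) can be sketched in the same style. The code below is an illustration under our own assumptions (a finite universe of three individuals, properties as sets); the inequality between fractions is checked by cross-multiplication, which is legitimate because both denominators are positive. The function name `prop_more_than` is our own.

```python
from itertools import chain, combinations, product

def powerset(xs):
    """All subsets of xs, as frozensets."""
    xs = list(xs)
    subsets = chain.from_iterable(combinations(xs, r) for r in range(len(xs) + 1))
    return [frozenset(s) for s in subsets]

ATOMS = ("a", "b", "c")     # a finite universe (assumption)
P = powerset(ATOMS)

def more_than(p, q):
    return frozenset(s for s in P if len(s & p) > len(s & q))

def prop_more_than(p, q):
    """s is in the value iff |s & p| / |p| > |s & q| / |q|,
    with p and q both nonempty."""
    if not p or not q:
        return frozenset()
    return frozenset(s for s in P
                     if len(s & p) * len(q) > len(s & q) * len(p))

def conservative_by_join(f):
    """s in f(p, q) iff (s meet (p join q)) in f(p, q)."""
    return all((s in f(p, q)) == ((s & (p | q)) in f(p, q))
               for p, q, s in product(P, repeat=3))

p = frozenset({"a"})        # one p
q = frozenset({"b", "c"})   # two q's
s = frozenset({"a", "b"})   # the only p has s; half of the q's have s
```

With these values s is in `prop_more_than(p, q)` (1/1 against 1/2) but not in `more_than(p, q)` (1 against 1), so the two comparatives are distinct functions; `prop_more_than` nevertheless passes the conservativity check.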
Obviously if properties s and t have the same meet with both p and q, then they both satisfy the condition on the right above or neither does, proving conservativity. Now is there any sense to claim that functions such as $more_{than}$ are "inherently" two place? Might we not, for example, perhaps with a clever use of definition by scrambling, be able to define some one place function which mimicked the behavior of $more_{than}$? Thm 5 below shows that this cannot be done, even if we assume that the universe is finite.

THEOREM 5 If P is finite, say $|P| = 2^n$, and n > 3, then the range of more ... than ... has size > $2^n$, so it cannot be the range of any 1-place function on P.

PROOF For each atom α, define the set $M_\alpha = \{more\ p\ than\ \alpha : 0 < p \le \alpha'\}$. We shall show that the sets $M_\alpha$ are disjoint for different α's, and that they each have size $2^{n-1} - 1$, where n is the number of atoms in P.

First, suppose that $\alpha \neq \beta$. Fix also some p with $0 < p \le \alpha'$. We shall show that more p than α is not in $M_\beta$; i.e., that for all q with $0 < q \le \beta'$, more p than α $\neq$ more q than β. Let δ be any atom $\le p$, so $\delta \neq \alpha$. Now by an easy calculation, $\delta \vee \beta \in$ more p than α, while $\delta \vee \beta \notin$ more q than β.

Second, we show that each set has size $2^{n-1} - 1$, by showing that if α is an atom and $0 < p, q \le \alpha'$ ($p \neq q$), then more p than α $\neq$ more q than α. The result follows, since there are $2^{n-1} - 1$ such elements, one for each nontrivial join of the atoms other than α. To establish the inequality, suppose $p \not\le q$, and fix some $\delta \le p - q$. Since $p \le \alpha'$, $\delta \neq \alpha$. It is easily seen that $\delta \in$ more p than α but $\delta \notin$ more q than α. Thus the range of more ... than ... has at least $n(2^{n-1} - 1)$ elements. For n > 3, this figure is > $2^n$, so we're done, since the range of any one place function on a set with $2^n$ elements is of size $\le 2^n$. •
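The size claim of Thm 5 can be observed directly by enumerating the range of the two place function on small universes. The sketch below is illustrative only: it assumes universes of four and five individuals, models properties as sets, and uses our own helper name `range_of_more_than`.

```python
from itertools import chain, combinations

def powerset(xs):
    """All subsets of xs, as frozensets."""
    xs = list(xs)
    subsets = chain.from_iterable(combinations(xs, r) for r in range(len(xs) + 1))
    return [frozenset(s) for s in subsets]

def range_of_more_than(n):
    """The set of distinct values more_than(p, q) over a universe with
    n individuals, where more_than(p, q) = {s : |s & p| > |s & q|}."""
    P = powerset(range(n))
    return {frozenset(s for s in P if len(s & p) > len(s & q))
            for p in P for q in P}

# Compare the size of the range with the number of properties, 2 ** n.
sizes = {n: len(range_of_more_than(n)) for n in (4, 5)}
exceeds = {n: sizes[n] > 2 ** n for n in sizes}
```

In both cases the range is larger than the set of properties itself, so no one place function on P can take all of these values.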
Semantically then, $more_{than}$ cannot be reduced to a one place function on properties, and our claim that English semantics requires inherently two place functions is supported. Moreover, in this case our analysis coincides with recent syntactic work (Napoli, 1983) on more ... than .... We note here her conclusion that more ... than ... should be treated as an expression which combines directly with two expressions of the same category, including CNP's, to form a complex expression. We note two additional pieces of support for our analysis which we feel have some independent semantic interest. Consider the analyses of more students than teachers proposed in (25):

(25) a. [$_{Det_1}$ more ... than teacher] student
     b. [$_{Det_1}$ more student than] teacher

The natural interpretation of these "det$_1$'s" is given by:

(26) (more than t)(p) = $\{s : |s \wedge p| > |s \wedge t|\}$, and
     (more s than)(p) = $\{t : |t \wedge s| > |t \wedge p|\}$.
The analysis in (26) does not violate Thm 5, as it claims in effect that Tnoret¡ian corresponds to a set of one place functions, e.g. {more . . . than t: t e Ρ } But note the following result: THEOREM 6 If Ρ has at least two atoms, then the functions in (26) are not conservative. PROOF Assume first that α and β are distinct atoms but that (more than t) were conservative. Now α Λ α = (α V β) Α α, so α e (more than β) (α)
Generalized Quantifiers and Expressive Power of Natural Language
87
iff. α V β e (more than β) (α). This, however, is not the case because
α e (more than β) (α) since I α Λ α I = lai = 1 > 0 = |0| = |α Λ 01 ), while a V β i (more than β) (a) since I (a V β) Λ a I = 1 » 1 = 101 = l(aV /?) Λ ¿31 ). For the case of (more s than), we obtain a similar contradiction. If this function were conservative, then aV/j
e
(more a V β
than) (a) iff
(αν β) Λ a = a e (more α Μ β than) (a). However, the first holds since I (a V β) Λ (a V β) I = 2 > l = | a | = |(a V β) Λ a I, and the second fails as | a A(aV/J)l = l > l = | a A a | .
•
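The argument of Theorem 6 can be replayed by brute force on a two-atom algebra. The following is only a sketch under stated assumptions: properties are modeled as Python frozensets of atoms, the atom names "a" and "b" are illustrative, and conservativity is read off the way the proof uses it (if s ∧ p = r ∧ p, then s and r are either both in f(p) or both outside it). The helper names are not from the paper.

```python
from itertools import combinations

ATOMS = ("a", "b")  # illustrative two-atom algebra

def powerset(atoms):
    """All properties of the power-set algebra over the given atoms."""
    return [frozenset(c) for r in range(len(atoms) + 1)
            for c in combinations(atoms, r)]

P = powerset(ATOMS)

def more_than_t(t):
    """One place function p -> {s : |s & p| > |s & t|}, as in (26)."""
    return lambda p: {s for s in P if len(s & p) > len(s & t)}

def more_s_than(s):
    """One place function p -> {t : |t & s| > |t & p|}, as in (26)."""
    return lambda p: {t for t in P if len(t & s) > len(t & p)}

def every(p):
    """Control case: every(p) is the set of properties that include p."""
    return {s for s in P if p <= s}

def is_conservative(f):
    """Assumed reading: s & p == r & p implies (s in f(p)) == (r in f(p))."""
    for p in P:
        value = f(p)
        for s in P:
            for r in P:
                if s & p == r & p and (s in value) != (r in value):
                    return False
    return True

alpha, beta = frozenset({"a"}), frozenset({"b"})
print(is_conservative(more_than_t(beta)))          # False
print(is_conservative(more_s_than(alpha | beta)))  # False
print(is_conservative(every))                      # True
```

The failing witnesses found by the search are the ones used in the proof: α and α ∨ β agree on their meet with α, yet only one of them belongs to the value of the function at α.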
Thus one reason that we do not want to adopt this analysis is that it would introduce one place dets whose interpretations are not conservative. Yet as the thorough study of other dets in English presented in K&S shows, even very complex one place dets are interpretable by conservative functions. If we analyze the comparatives as two place dets, we need not introduce exceptions to a well-motivated hypothesis concerning one place dets.

A second reason of a quite different sort for rejecting each of the analyses in (25) is given by the natural interpretations of (7) which involve comparative nominals with modifiers such as relative clauses (e.g. who came early in more students than teachers who came early), prepositional phrases (e.g. at the party in fewer students than teachers at the party), and adjectives (e.g. French, in a greater proportion of French students than workers). Such NP's are ambiguous according to the scope of the modifier. Thus (27a) below on preference means the same as (27b). But it also has a reading, helped with an intonation break, like that given in (27c).

(27) a. fewer students than teachers at the party
     b. fewer students at the party than teachers at the party
     c. fewer students than teachers-at-the-party
In (27c), we compare students (in general) with teachers at the party. In (27b), we compare students at the party with teachers at the party. We can easily represent the less plausible reading (27c) by interpreting the NP as the value of fewer_than at the pair (student, teacher at the party). But how shall we represent the reading in (27b)? The natural solution in our framework is to interpret CNP modifiers (relative clauses, PP's, AP's) not merely as functions from properties to properties but in fact from k-tuples of properties to k-tuples of properties. Thus we extend the domain of modifiers in the same way as we have extended that of determiners. Thus:

DEFINITION 3 A function f from P^k into P^k is a modifier function iff for all p_1, p_2, ..., p_k in P, f(p_1, p_2, ..., p_k) = (f p_1, f p_2, ..., f p_k). Such a function is called restricting iff for each i, f(p_i) ≤ p_i. (That is, all f p_i's are p_i's.)

We may now analyze (27a) above as:

(28) fewer_than(at the party (student, teacher))
     = fewer_than((at the party)(student), (at the party)(teacher))
     = fewer students at the party than teachers at the party
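The two scope readings can be made concrete with a small sketch. It assumes properties are finite sets of individuals, a restricting modifier is intersection with a fixed property, and fewer_than(p, q) is given as a membership test on properties s. The individuals and all function names below are hypothetical, chosen only for illustration.

```python
def lift(g):
    """Lift a one place modifier g to k-tuples: f(p1,...,pk) = (g p1,...,g pk)."""
    return lambda *props: tuple(g(p) for p in props)

def fewer_than(p, q):
    """Membership test for fewer p than q: s belongs iff |s & p| < |s & q|."""
    return lambda s: len(s & p) < len(s & q)

# Hypothetical individuals, for illustration only.
student = frozenset({"s1", "s2", "s3"})
teacher = frozenset({"t1", "t2"})
party = frozenset({"s1", "t1", "t2"})

def at_the_party(p):
    """A restricting modifier: the result is always below its argument."""
    return p & party

everyone = student | teacher

# Modifier applied to the pair (student, teacher), as in (28): reading (27b).
wide = fewer_than(*lift(at_the_party)(student, teacher))
# Modifier applied to teacher only: reading (27c).
narrow = fewer_than(student, at_the_party(teacher))

print(wide(everyone))    # True: 1 student at the party, 2 teachers at the party
print(narrow(everyone))  # False: 3 students, 2 teachers at the party
```

In this toy model the two readings come apart on the same property, which is the ambiguity the pair-based modifier analysis is meant to capture.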
This analysis yields a correct result. Moreover no such analysis is available on the (more . . . than teacher) or the (more students than) analyses of such NP's. The reason is that student and teacher lie in different constituents and hence there is no straightforward way of making the restricting force of the modifier carry on both. One might, of course, try a kind of Reduction or Right Node Raising type analysis, deriving e.g. (29a) below from (29b). (29)
a. fewer students than teachers at the party
b. fewer students at the party than teachers at the party
But, as before, the Reduction analysis cannot represent the correct meaning relation between its input and output structures, since sometimes paraphrase holds and sometimes it does not. For example (30a) below has a reading on which it is a paraphrase of (30b). But (30c) has no such reading. (30)
a. fewer students than teachers at some party
b. for some party, fewer students there than teachers there
c. fewer students at some party than teachers at some party
Thus our approach provides a natural and simple way to represent modifier scope ambiguities in comparative NP's and the obvious alternative approach is more complex and semantically less satisfactory. We accept, then, that the comparatives considered so far are interpreted by inherently two place functions. Moreover, such comparatives are not limited to numerical or "logical" constructions. We may, for example, interpret (31a) below by the two place function indicated in (31b) with the obvious interpretation. (31)
a. more of John's dogs than of Mary's cats (were inoculated)
b. (more of John's ... than of Mary's ...)
Another possibility of extending comparative dets would be along the lines that "inherently" one place dets were extended. Namely, by using and and or. (8a), repeated below, indicates one possibility here. (32)
a. more students than either teachers or deans
b. more students than teachers and more students than deans
(32a) seems to be synonymous with (32b). So we might represent it as the value of the three place function given below:

(33) (more_than)_or,2,3 (s,t,d) = (more_than)(s,t) ∩ (more_than)(s,d)
That an "or" function would be defined in terms of intersections rather than unions seems somewhat curious (as Alice ter Meulen pointed out to us). Perhaps it relates to the fact that more_than is decreasing on its second argument (see section 2). As curious is (34a) below, formed with and. It seems to have a disjunctive reading as expressed in (34b).

(34) a. more students than teachers and deans
     b. (more_than)(s, t ∨ d)
Thus we might define the det_3 (more_than)_and,2,3 as that function sending a triple (s,t,d) of properties to the value of the two place function (more_than) at the pair (s, (t ∨ d)) as given in (34b). Similarly, we may interpret more students and teachers than deans by a det_3 (more_than)_and,1,2 understood disjunctively on the first two arguments. And finally, more students or teachers than deans seems best interpreted analogously to (33), replacing intersection by union. We note without argument that the three place functions defined here are all conservative. Doubtless it would be possible to develop a systematic notation for arbitrary more_than's with arbitrary conjunctions and disjunctions of arguments.

More important additions to the k-place functions we need to interpret English NP's are represented by those in (9), repeated below (with variations). (We are indebted to Ewan Klein (p.c.) for drawing our attention to the problematic nature of these examples).

(35) a. exactly two dogs and three cats
     b. John's two dogs and three cats
     c. the two dogs and three cats
     d. more than two dogs and three cats
Obviously enough these expressions are synonymous with the corresponding ones in (36):

(36) a. exactly two dogs and exactly three cats
     b. John's two dogs and John's three cats
     etc.

We may correctly represent their meanings as the values of two place functions at the pair (dog, cat) as follows (using (35a) as illustrative for all examples):

(37) (exactly 2 ... and 3)(p,q) =df (exactly 2)(p) ∩ (exactly 3)(q)
That the function(s) defined in (37) are conservative follows as a special case of Thm 7 below (whose proof is straightforward and omitted).

THEOREM 7 For f = (f_1, ..., f_k) a k-tuple of functions from P into P*, define the functions f_and and f_or from P^k into P* by:

f_and(p) = ∩_i f_i(p_i)        f_or(p) = ∪_i f_i(p_i)

Then f_and and f_or are conservative if each f_i is.

Note that Thm 7 generalizes Thm 2. For f a one place function, the k-place function f_and,k is just the special case of f_and where each f_i = f. Once again a Reduction approach may fail to preserve meaning. For example, replacing John's in the above examples by someone's, we obtain (38a) and (38b) below, which are obviously not paraphrases.

(38) a. someone's two dogs and three cats
     b. someone's two dogs and someone's three cats
Further, any analysis in which NP's such as exactly two dogs and three cats are analyzed as the combination of exactly and a coordination of two dogs with three cats faces several difficult issues. The most problematic is how to make the semantic effect of exactly (John's, more than) carry both on two and on three. The function which would be needed to interpret exactly doesn't, on such compositional views, have available in any obvious way the information contained in two and three. It takes as argument the denotation of two dogs and three cats (whatever that might be). Treating exactly two ... and three as a single complex det precisely has the advantage that exactly does have available the information in two and three. On our approach we may for example treat exactly as follows: In simple cases exactly, more than combine with numerals to form one place dets, e.g. exactly two, more than two. Generalizing, let us think of exactly_c for c ∈ {and, or} as an expression which combines with k-tuples of numerals (each k ≥ 1) to form a k-place det. When k = 1 exactly_c takes the form exactly. We may then define the exactly_and function for example by:

(39) exactly_and(n_1, ..., n_k) = f_and, where f_i = exactly n_i
In this definition we are using the fact that by Thm 7 an NP such as exactly two dogs, three cats, and four pigeons is interpreted as the value of a three place function at the triple (dog, cat, pigeon). In fact the theorem allows us to interpret a perfectly innocent looking NP such as (40a) below as per (40b). Interpreting it directly in this case as a conjunction of two NP's, we obtain the interpretation represented in (40c). The two interpretations are of course identical. (40)
a. every student and no teacher
b. (every, no)_and(s,t)
c. every(s) ∩ no(t)
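The identity of (40b) and (40c), and the conservativity promised by Thm 7, can be checked exhaustively on a small algebra. This is a sketch under assumptions: the algebra is the power set of three illustrative atoms, the k-place notion of conservativity is taken to be "if s ∧ p_i = r ∧ p_i for every i, then s and r agree on membership in f(p)", and the helper names are not the paper's.

```python
from itertools import combinations, product

ATOMS = ("a", "b", "c")  # illustrative three-atom algebra

def powerset(atoms):
    return [frozenset(c) for r in range(len(atoms) + 1)
            for c in combinations(atoms, r)]

P = powerset(ATOMS)

def every(p):
    return {s for s in P if p <= s}

def no(p):
    return {s for s in P if not (s & p)}

def f_and(*fs):
    """k-place function sending (p1,...,pk) to the intersection of the fi(pi)."""
    return lambda *ps: set.intersection(*(f(p) for f, p in zip(fs, ps)))

def f_or(*fs):
    """k-place function sending (p1,...,pk) to the union of the fi(pi)."""
    return lambda *ps: set.union(*(f(p) for f, p in zip(fs, ps)))

def is_conservative(f, k):
    """Assumed k-place reading of conservativity (see the lead-in)."""
    for ps in product(P, repeat=k):
        value = f(*ps)
        for s, r in product(P, repeat=2):
            if all(s & p == r & p for p in ps) and (s in value) != (r in value):
                return False
    return True

every_no = f_and(every, no)

# (40b) and (40c) denote the same set of properties at every pair (s, t).
same = all(every_no(s, t) == every(s) & no(t) for s, t in product(P, repeat=2))
print(same)                                # True
print(is_conservative(every_no, 2))        # True
print(is_conservative(f_or(every, no), 2)) # True
```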
There are two further types of expressions in English whose analysis seems fruitful in terms of k-place functions: those given in (41) and (44). We treat them only briefly as they will play no role in the later sections of this paper, but the phenomena they represent do have some inherent interest. Consider (41): (41)
a. at least ten students and more than twice that many teachers
b. between five and ten dogs and twice as many cats
The (apparent) second conjunct contains an expression in the determiner position which is essentially anaphoric to the first det. The NP in (41a) perhaps has a reading in which more than twice that many means more than twice ten, that is, more than twenty. Perhaps some analysis could be concocted in which we have a numerical variable and (somehow) we represent more than twice x as bound by an operator which mentions ten in some way. The precise statement here seems to us very messy.
The NP in (41b) is considerably more problematic however and does not seem susceptible of an analysis along the above mentioned lines. The reason is that as many in (41b) does not refer specifically to five or ten, but rather to whatever number of dogs have the property in question, and as the property isn't mentioned, we can't know precisely how many dogs have it. All we know is that the number lies between five and ten (inclusively). An analysis of the interpretation of (41b) in terms of two place functions on the other hand is relatively unproblematic. (42)
Let f be a one place function on P. We define the two place function (f, more than twice as many)_& by:
s ∈ (f, more than twice as many)_&(p,q) iff s ∈ f(p) and |s ∧ q| > 2·|s ∧ p|
This definition guarantees us that the conjunction of (43a) and (43b) below entails (43c) (43)
a. Between five and ten dogs and more than twice as many cats were inoculated.
b. Exactly seven dogs were inoculated.
c. More than fourteen cats were inoculated.
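The entailment in (43) follows from (42) by arithmetic, and a short sketch can spot-check it. Assumptions: properties are finite sets, each determiner denotation is given as a membership test on properties s, and the individuals, the sampling rate, and every function name are hypothetical. Random sampling only illustrates the entailment; it does not prove it.

```python
import random

def between(lo, hi):
    return lambda p: (lambda s: lo <= len(s & p) <= hi)

def exactly(n):
    return lambda p: (lambda s: len(s & p) == n)

def more_than(n):
    return lambda p: (lambda s: len(s & p) > n)

def and_more_than_twice(f):
    """(f, more than twice as many)_& from (42), as a membership test."""
    return lambda p, q: (lambda s: f(p)(s) and len(s & q) > 2 * len(s & p))

# Hypothetical individuals, for illustration only.
dogs = frozenset(f"dog{i}" for i in range(10))
cats = frozenset(f"cat{i}" for i in range(25))
universe = sorted(dogs | cats)

a = and_more_than_twice(between(5, 10))(dogs, cats)  # (43a)
b = exactly(7)(dogs)                                  # (43b)
c = more_than(14)(cats)                               # (43c)

# A fixed witness: 7 dogs and 15 cats were inoculated.
witness = frozenset(sorted(dogs)[:7]) | frozenset(sorted(cats)[:15])
print(a(witness), b(witness), c(witness))  # True True True

# Spot check: whenever (43a) and (43b) hold of a sampled property, so does (43c).
rng = random.Random(0)
samples = [frozenset(x for x in universe if rng.random() < 0.7) for _ in range(2000)]
entailment_holds = all(c(s) for s in samples if a(s) and b(s))
print(entailment_holds)  # True
```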
Obviously enough, we are only touching the tip of an iceberg here. Our last example of a two place det parasitizes directly from a certain type of complex one place one. Compare: (44)
a. John's biggest cows and pigs (were taken to market)
b. John's biggest cow (was taken to market)
We feel that it is necessary to analyze John's biggest in (44b) as a det_1. The naively more plausible analysis represented in (45) cannot be made to work, at least in any strictly compositional way it seems to us.

(45) a. John's (biggest cow)
     b. the [(biggest cow) which John has]

(45a) is naturally interpreted as the value of the det_1 John's at the "biggest cow" property. But just which property is that? Presumably the one possessed by that cow (or those cows) which are bigger than all the others. But John may have none of those cows. In which case John's biggest cow would denote the empty set of properties, as per (45b). But that is clearly incorrect. What John's biggest cow refers to is:
(46) the (biggest [cow which John has])

In other words the restricting modifier (which John has) must semantically apply to the cow property before the superlative biggest is evaluated. We only take biggest relative to cows which John has, not relative to cows in general. On the other hand, a correct semantic representation is achievable by taking John's biggest as a complex det_1. We can then interpret the relevant NP's as follows: Let -est be a function from adjective denotations to adjective denotations which for each adjectival function g yields the superlative g-est of g (which we do not define explicitly). Then for each individual I we define the one place function (I's, g-est)(p) to be the [(g-est)(p which I has)], yielding a correct interpretation for e.g. John's biggest cow. The analysis of (44a) is now straightforward. It is the value of the two place function (John's biggest)_and at the pair (cow, pig). This completes our discussion of the primary motivation for wanting k-place functions on P in a semantic analysis of English.
2. SUBCLASSES OF DETERMINERS AND RELATIONS BETWEEN THEM
We are concerned here to classify English dets according as their denotations satisfy one or another condition over and above that of conservativity. Many of the works cited at the beginning of this paper discuss several such conditions on one place functions. Here we extend those conditions to k > 1 place functions and note as well certain conditions which are specifically properties of k > 1 place functions. We begin with the latter.

Commutativity

A two place function f is commutative iff f(x,y) = f(y,x), all (x,y) in its domain. Some two place commutative dets (i.e. always interpreted by commutative functions) we have considered are every ... and ..., some ... or ..., and exactly as many ... as .... In general the functions f_and,2 and f_or,2 defined in Thm 2 are commutative. On the other hand, more ... than ... and fewer ... than ... are not commutative, and for f a pair of one place functions f_and and f_or are not in general commutative. We may generalize commutativity to functions f of any degree as follows: For h a one to one function from K = {1, ..., k} onto itself (that is, h is a permutation of K), define f_h from P^k into P* by f_h(p) = f(h(p)), where h(p) is the permutation of p induced by h, that is, h(p) =df (p_h(1), ..., p_h(k)). Then we say that f is (generalized) commutative iff f = f_h, all permutations h of K. We may then observe that the functions f_and,k and f_or,k are commutative but that in general, for f a k-tuple of one place functions, f_and and f_or are not. The concept of commutativity generalizes further to that of converses.

Converses

For f and g two place functions we say that f is the converse of g iff for all p,q ∈ P, f(p,q) = g(q,p). more ... than ... is the converse of fewer ... than ..., at most as many ... as ... is the converse of at least as many ... as ..., and exactly as many ... as ... is its own converse. A two place function f is its own converse iff f is commutative. Generalizing to k-place functions f,g we say that f is a converse of g iff for some permutation h, f = g_h. We note:

THEOREM 8 CONS_k is closed under converses. That is, if f ∈ CONS_k then for every permutation h, f_h ∈ CONS_k.

Finally, we observe that the set of commutative functions in CONS_k forms a complete subalgebra of CONS_k. Note that there are commutative functions which are not conservative. Any constant function for example is trivially commutative, but most are not conservative. Still the set of commutative functions from P^k into P* forms a complete subalgebra of F_(P*/P^k), and thus the intersection of that set with CONS_k forms a complete subalgebra of F_(P*/P^k) and thus of CONS_k.

Dets which imply existence

K&S define a one place function f to imply existence iff f(0) = ∅. The dets in (47a) imply existence, whereas those in (47b) do not. This accounts for the fact that the sentences in (47a) entail (47c) whereas those in (47b) do not.
(47) a. {at least one, all but two, the two, more than two, John's} unicorn(s) can fly
     b. {no, at most two, at most as many male as female, fewer than three, all} unicorn(s) can fly
     c. There exists a unicorn
We can generalize this notion to CONS_k in two ways:

DEFINITION 4 For all f ∈ CONS_k:
(i) f implies existence iff f(0̄) = ∅. (Recall that 0̄ is the k-tuple which is constantly 0.)
(ii) f implies existence on the jth argument iff whenever q_j = 0, f(q) = ∅.

It is easy to check that more ... than ... implies existence on its first argument but not on its second. (If s ∈ more p than q then p ≠ 0 but q may be 0.) Similarly, fewer ... than ... implies existence on its second argument but not on its first, and exactly as many ... as ..., at least as many ... as ..., and at most as many ... as ... imply existence on neither argument. Obviously, if f implies existence on all its coordinates, then it implies existence. There are naturally expressible dets which imply existence but do not imply existence on any particular argument. Perhaps the simplest of these is some ... or ... as in John saw some centaur or unicorn. In fact, if d is a one place det which implies existence and d ... or ... is interpreted either distributively or even disjunctively, then d ... or ... has this property. If d implies existence and d ... and ... is interpreted disjunctively (that is, d p and q = d(p ∨ q)), then it also implies existence but does not imply existence on any particular argument. If d ... and ... is interpreted distributively, then it implies existence on each of its arguments. If d does not imply existence, then neither the distributive nor the disjunctive interpretation of either d ... and ... or d ... or ... imply existence on any argument.

Monotone and polarity monotone dets

Previous work has distinguished subclasses of monotone dets, namely increasing and decreasing ones. The general definition in our context is:

DEFINITION 5
(i) A subset Q of a boolean algebra is increasing iff whenever x ∈ Q and x ≤ y, then y ∈ Q. Q is decreasing iff whenever x ∈ Q and y ≤ x, then y ∈ Q.
(ii) A function f: P^k → P* is increasing (decreasing) iff for each q ∈ P^k, f(q) is an increasing (decreasing) subset of P.
(iii) f is monotone iff f is either increasing or decreasing.
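Definitions 4 and 5 lend themselves to brute-force checks on a small algebra. The sketch below assumes the power-set algebra over two illustrative atoms, writes argument positions 0-based, and takes the empty set of properties as the zero of P*. The function names and the choice of test functions (more ... than ..., some ... or ..., every ... and ...) are illustrative assumptions.

```python
from itertools import combinations, product

ATOMS = ("a", "b")  # illustrative two-atom algebra

def powerset(atoms):
    return [frozenset(c) for r in range(len(atoms) + 1)
            for c in combinations(atoms, r)]

P = powerset(ATOMS)
ZERO = frozenset()

def more_than(p, q):
    return {s for s in P if len(s & p) > len(s & q)}

def some_or(p, q):
    return {s for s in P if s & p or s & q}

def every_and(p, q):
    return {s for s in P if p <= s and q <= s}

def implies_existence(f, k):
    """Definition 4(i): the value at the constantly-0 tuple is empty."""
    return f(*([ZERO] * k)) == set()

def implies_existence_on(f, k, j):
    """Definition 4(ii), with j counted from 0."""
    return all(f(*q) == set() for q in product(P, repeat=k) if q[j] == ZERO)

def is_increasing_set(Q):
    """Definition 5(i): closed upward under the ordering of P."""
    return all(y in Q for x in Q for y in P if x <= y)

def is_increasing(f, k):
    """Definition 5(ii): every value of f is an increasing subset of P."""
    return all(is_increasing_set(f(*q)) for q in product(P, repeat=k))

print(implies_existence_on(more_than, 2, 0), implies_existence_on(more_than, 2, 1))  # True False
print(implies_existence(some_or, 2),
      implies_existence_on(some_or, 2, 0),
      implies_existence_on(some_or, 2, 1))                                           # True False False
print(is_increasing(every_and, 2), is_increasing(more_than, 2))                      # True False
```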
In K&S, the increasing elements of CONS_1 enjoy a role in at least two respects. First, they provide the denotations for the syntactically simple 1-place dets. (There are some exceptions though. I.e., no and neither are exceptions if analyzed as syntactically simple, and numerals such as ten must be regarded as paraphrases of syntactically complex elements such as at least ten.) Is it true that the syntactically simplest k ≥ 2-place dets are also interpreted by increasing elements of CONS_k? Dets such as every ... and ..., John's ... and ..., etc. are all increasing. This is a consequence of the more general result below, whose proof consists in checking that the intersection of a family of increasing (decreasing) sets is also increasing (decreasing).

THEOREM 9 If f is a k-tuple of increasing (decreasing) one place functions then f_and and f_or are increasing (decreasing) k-place functions. In consequence, f_and,k and f_or,k are increasing (decreasing) if f is.

In contrast to this, the simple comparatives such as more ... than ... are not increasing on any algebra with at least two atoms. To see this, suppose α and β are distinct atoms. Then (more α than β) = F_α − F_β. This set is not increasing since α is an element while α ∨ β is not. On the other hand, more ... than ... is increasing on the (trivial) two-element algebra. It is not, however, decreasing on any algebra, because more 1 than 0 = P − {0}. Analogous results hold for the det less ... than .... Similar results also show that at least as many ... as ..., at most as many ... as ..., exactly as many ... as ..., etc. are not monotone except on the smallest algebras.

The second feature of the increasing elements of CONS_1 isolated in K&S is that they generate CONS_1 as a boolean algebra. That is, all 1-conservative functions are boolean combinations of increasing functions, in fact of a very small subset of increasing functions. It was shown that there is a generating set of size 2n, where n is the number of individuals (atoms). Psycholinguistically, this result is interpretable as meaning that we can in some sense know what the elements of CONS_1 are if we know the 2n generators, and if we know how to interpret boolean combinations of dets. An appropriate generalization of the proof for CONS_1 in K&S exists, proving

THEOREM 10 CONS_k has a set of generators of size 2nk.

By way of explanation, let us recall the generators of CONS_1 from K&S. They were the functions some_α and every_α, defined for α an atom of P by
some_α(p) = F_α if α ≤ p, ∅ otherwise

and

every_α(p) = F_α if α [...] → P* preserves polarity iff for all tuples p and q, if p ≤ q (i.e. p_i ≤ q_i for all i), then f(p) ⊆ f(q).
(ii) f preserves polarity on its jth argument iff for all q ∈ P^k and all t ∈ P, if q_j ≤ t, then f(q) ⊆ f(q^j/t), where q^j/t is the k-tuple obtained from q by replacing q_j with t.

Similar definitions will be assumed for reversing polarity and reversing polarity on the jth argument. The relationship between the two types of preservation is as follows:

THEOREM 12 A k-place function f preserves polarity iff for all j ∈ K, f preserves polarity on its jth argument.
PROOF For the easy direction, if f preserves polarity, q ∈ P^k and q_j ≤ t, then q ≤ q^j/t, so f(q) ⊆ f(q^j/t). Going the other way, suppose that f preserves polarity on each of its arguments, and that p ≤ q. We successively replace the p's by the q's, using the hypothesis on f to write

f(p) = f(p_1, p_2, ..., p_k) ⊆ f(q_1, p_2, ..., p_k) ⊆ ... ⊆ f(q_1, q_2, ..., q_(k-1), p_k) ⊆ f(q_1, q_2, ..., q_k) = f(q).
•
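Theorem 12 and the argumentwise notions of polarity can be tested exhaustively on a small algebra. This is a sketch, assuming the power-set algebra over two illustrative atoms and 0-based argument positions. The test functions (more ... than ..., and some ... and ... read distributively) and all helper names are illustrative choices, not the paper's.

```python
from itertools import combinations, product

ATOMS = ("a", "b")  # illustrative two-atom algebra

def powerset(atoms):
    return [frozenset(c) for r in range(len(atoms) + 1)
            for c in combinations(atoms, r)]

P = powerset(ATOMS)

def more_than(p, q):
    return {s for s in P if len(s & p) > len(s & q)}

def some_and(p, q):
    return {s for s in P if s & p and s & q}

def replace(q, j, t):
    """The tuple q^j/t: q with its jth coordinate (0-based here) replaced by t."""
    return q[:j] + (t,) + q[j + 1:]

def preserves_on(f, k, j):
    return all(f(*q) <= f(*replace(q, j, t))
               for q in product(P, repeat=k) for t in P if q[j] <= t)

def reverses_on(f, k, j):
    return all(f(*replace(q, j, t)) <= f(*q)
               for q in product(P, repeat=k) for t in P if q[j] <= t)

def preserves(f, k):
    """p <= q coordinatewise implies f(p) is a subset of f(q)."""
    return all(f(*p) <= f(*q)
               for p in product(P, repeat=k) for q in product(P, repeat=k)
               if all(pi <= qi for pi, qi in zip(p, q)))

# Theorem 12 on the two test functions: global preservation coincides with
# preservation on every argument.
theorem_12_ok = all(
    preserves(f, 2) == all(preserves_on(f, 2, j) for j in range(2))
    for f in (more_than, some_and)
)
print(theorem_12_ok)                                                 # True
print(preserves_on(more_than, 2, 0), reverses_on(more_than, 2, 1))   # True True
print(preserves(more_than, 2), preserves(some_and, 2))               # False True
```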
The same argument shows that f reverses polarity iff it reverses polarity on each of its arguments. As an application of this result, it can be shown that if f is a conservative function which preserves (reverses) polarity, then both f_and,k and f_or,k have the same property. Also if f is a tuple with the same preservation property, then f_and and f_or have the same property. For example every_and,k reverses polarity since every does, and some_and,k preserves polarity since some does. It is easy to show that more ... than ... preserves polarity on its first argument but reverses it on its second; fewer ... than ... has exactly the opposite properties. Neither of these dets either preserves or reverses polarity because they both satisfy f(0,0) = ∅ = f(1,1) - so if they had either property, they would be 0_2 (the zero element of CONS_2), but 1 ∈ more 1 than 0 = fewer 0 than 1. Exactly as many ... as ... is not polarity monotone (i.e., neither preserves nor reverses polarity) on either argument when P is not the trivial algebra. This is because the interpretation satisfies f(0,0) = P, f(1,0) = f(0,1) = {0} and f(1,1) = P. There is also a large class of expressible dets which are polarity monotone on exactly one argument. An example is (more than ten, all but two)_and as in more than ten cats and all but two dogs.

We turn now to the subclasses of dets we call "Cardinal" and "Logical". It is these classes which will be the focus of our interest in expressive power discussed in section 3.

Cardinal Determiners

Following K&S (with a change in terminology), a cardinal function from P into P* is one which decides whether to put p in f(q) solely on the basis of how many individuals with q also have p. So f is a one place cardinal function iff for all p,q,r,s ∈ P, if |p ∧ q| = |r ∧ s| then p ∈ f(q) iff r ∈ f(s). Generalizing,

DEFINITION 7 A function f from P^k into P* is cardinal iff for all s,t ∈ P and all p,q ∈ P^k, if |s ∧ p_i| = |t ∧ q_i|, all i ≤ k, then s ∈ f(p) iff t ∈ f(q).
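Definition 7 can be tested directly by enumeration. The sketch below assumes the power-set algebra over two illustrative atoms and compares, for every pair of argument tuples and every pair of properties, the cardinalities of the meets with the membership facts. The helper names and the choice of test functions are assumptions made for illustration.

```python
from itertools import combinations, product

ATOMS = ("a", "b")  # illustrative two-atom algebra

def powerset(atoms):
    return [frozenset(c) for r in range(len(atoms) + 1)
            for c in combinations(atoms, r)]

P = powerset(ATOMS)

def more_than(p, q):
    return {s for s in P if len(s & p) > len(s & q)}

def every_and(p, q):
    return {s for s in P if p <= s and q <= s}

def at_least_one(p):
    return {s for s in P if len(s & p) >= 1}

def is_cardinal(f, k):
    """Definition 7: matching cardinalities |s & p_i| = |t & q_i| force
    matching membership of s in f(p) and t in f(q)."""
    for p in product(P, repeat=k):
        for q in product(P, repeat=k):
            fp, fq = f(*p), f(*q)
            for s, t in product(P, repeat=2):
                same_sizes = all(len(s & pi) == len(t & qi) for pi, qi in zip(p, q))
                if same_sizes and (s in fp) != (t in fq):
                    return False
    return True

print(is_cardinal(more_than, 2))     # True
print(is_cardinal(every_and, 2))     # False
print(is_cardinal(at_least_one, 1))  # True
```

On this algebra the failure for every ... and ... is witnessed by comparing the pair (α ∨ β, α ∨ β) with the pair (α, α) at the property α.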
We denote the set of k-place cardinal functions by CARD_k and use CARD for ∪_k CARD_k. A simple consequence of Def 7 is that for all k, CARD_k ⊆ CONS_k and hence CARD ⊆ CONS. Some examples of English dets always interpreted by CARD_1 elements of CONS_1 are: at least ten, more than ten, exactly ten, fewer than ten, between five and ten, infinitely many, just finitely many. We note that many other "logical" (to be defined shortly) dets such as every, the five, all but three, two of the five, 20% of the are not CARD_1. For k > 1, a class of English expressions interpreted by CARD_k dets arises from the following result.

THEOREM 13 If f is a k-tuple of CARD_1 functions then f_and and f_or are CARD_k.

Another large class interpreted by CARD_2 dets is the comparatives such as more ... than ... and less ... than .... To see, e.g., that the former is CARD_2, suppose |s ∧ q_1| = |t ∧ p_1| and |s ∧ q_2| = |t ∧ p_2|. Then s ∈ more q_1 than q_2 iff |s ∧ q_1| > |s ∧ q_2| iff |t ∧ p_1| > |t ∧ p_2| iff t ∈ more p_1 than p_2. On the other hand, dets like every ... and ..., the five ... and ..., and all but two ... and ... are not CARD_2. This observation is a parallel to the fact that the corresponding one place dets are not CARD_1. For example in the case of every ... and ..., let α and β be distinct atoms, and consider every α ∨ β and α ∨ β = F_(α∨β) and every α and α = F_α. Now |α ∧ (α ∨ β)| = 1 = |α ∧ α|, but α ∉ F_(α∨β) while α ∈ F_α, contradicting the definition of CARD_2. Further properties of CARD_k are discussed after we define the "Logical" dets.

Logical Determiners
We want to capture the intuition that "logical" dets are insensitive to "real world" information. They cannot, so to speak, tell which individuals have which properties. Compare in this regard the sentences in (48) below:

(48) a. Exactly one house is white
     b. Every house is white
     c. John's house is white

All these sentences make non-trivial "real world" claims, but the contribution to those claims of the dets is different. The truth value of (48a) is determined once we know which objects have the house property and which objects have the property of being white. That information is also
sufficient to determine the truth value of (48b). But to determine that of (48c) we need additional information: We must know which individual John is and which objects he has. In this sense then functions which can interpret John's can distinguish among individuals and can tell to some extent which individuals have which properties. John's then will be a nonlogical det, in contrast to exactly one and every which will be logical.

To approach a formalization of this intuition, we shall want a logical one place function f to be one which treats "structurally identical" properties in "structurally identical" ways. So the value of f at a property p is determined by the "structure" of p, not by which particular atoms it dominates (equivalently, not by which particular individuals it is a member of). But what shall we mean by structurally identical? Standardly, two (boolean) algebras are structurally identical iff they are isomorphic, that is, iff there is a one to one function i from one onto the other which preserves the boolean operations (i(p ∧ q) = i(p) ∧ i(q) and i(p') = (i(p))'). It is natural then to consider elements p,q of the two algebras to be structurally identical if there is an isomorphism i between the algebras which identifies them, i.e. such that i(p) = q. In the case at hand, we are looking at elements p,q of the same algebra P, and noting that an isomorphism from an algebra to itself is called an automorphism, we may say that p and q are structurally identical iff for some automorphism i of P, i(p) = q. Notice that an automorphism i of P extends naturally to an automorphism i* of P*: i*Q = {i(q): q ∈ Q}, all Q ∈ P*. Observe (trivially) then that q ∈ Q iff i(q) ∈ i*Q, any automorphism i of P. (Since properties and sets of properties are noted differently here we shall normally write iQ rather than i*Q).
Now, to say that a "logical" function f treats structurally identical properties in structurally identical ways we shall require that whenever i(p) = q then i(f(p)) = f(q). So if q is the automorphic image of p under i then f(q) is the automorphic image of f(p) under i. This guarantees that the elements of f(q) = {i(s): s ∈ f(p)} differ from those of f(p) in just the ways that q differs from p. Technically, let us call a function f meeting the above condition automorphism invariant (AI). This notion generalizes immediately to k place functions noting only that an automorphism i of P extends naturally to an automorphism of P^k by the map: (p_1, ..., p_k) ↦ (i(p_1), ..., i(p_k)). In general we write i(p) for (i(p_1), ..., i(p_k)). We define then,

DEFINITION 8 A function f from P^k into P* is automorphism invariant (AI) iff for all p,q ∈ P^k and all automorphisms i of P, if i(p) = q then i(f(p)) = f(q).
There are AI functions from P^k into P* which are not conservative. For example, consider the constant function with value {1}. Since i({1}) = {1} this function is AI. But it is not conservative since 1 ∧ 0 = 0 ∧ 0, yet 1 ∈ f(0) but 0 ∉ f(0). As we are principally interested in AI functions which are conservative, we define:

DEFINITION 9 LOG_k = {f ∈ CONS_k: f is AI}. LOG = ∪_k LOG_k.

To illustrate these definitions let us show that the function every is AI. Note that immediately from Def 8 we have that in general f is AI iff for all automorphisms i and all p in P^k, f(i(p)) = i(f(p)). We must show then that every(i(p)) = i(every(p)) for i an arbitrary automorphism and p an arbitrary element of P. And we have, for any property s, that

s ∈ every(i(p))
  iff s ∈ F_i(p)               (int. of every)
  iff i(p) ≤ s                 (def. principal filter)
  iff i⁻¹(i(p)) ≤ i⁻¹(s)       (i⁻¹ is an auto. & auto's preserve ≤)
  iff p ≤ i⁻¹(s)
  iff i⁻¹(s) ∈ F_p
  iff i(i⁻¹(s)) ∈ i(F_p)
  iff s ∈ iF_p
  iff s ∈ i(every(p))
Some other det_1's which can similarly be shown to be AI are all but three, the ten, at least three of the ten, 20% of the, etc. Finally, let us observe that the functions which may interpret John's need not be AI. Note first that in an algebra P with at least two atoms, α and β, the function f_α defined below is not AI:

(49) f_α(p) = F_α if α ≤ p, ∅ otherwise
For consider an automorphism i which interchanges α and β and is the identity on the other atoms. (This defines i since P is complete and atomic so any element p is identical to the join of the atoms it dominates, so i(p) is the join of the i(α), for α an atom ≤ p.) Then i(f_α(α)) = i(F_α) ≠ ∅. But f_α(i(α)) = f_α(β) = ∅. Now the function f_α is a plausible interpretation for John's. Suppose that have is interpreted so that the only individual John has is F_α. Then for any property p, "p which John has" is p ∧ α, which is α if α ≤ p and 0 otherwise. So John's(p) = the(p which John has) = the(α) = F_α if α ≤ p, and the(p which John has) = the(0) = ∅ otherwise, that is, if α ≰ p. Thus John's is interpreted by f_α.
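The contrast between every and f_α can be checked mechanically on a finite algebra. The sketch assumes the power-set algebra over three illustrative atoms, where each automorphism is induced by a permutation of the atoms, and tests the condition f(i(p)) = i(f(p)) for one place functions. The atom "a" plays the role of α; all helper names are illustrative.

```python
from itertools import combinations, permutations

ATOMS = ("a", "b", "c")  # illustrative three-atom algebra

def powerset(atoms):
    return [frozenset(c) for r in range(len(atoms) + 1)
            for c in combinations(atoms, r)]

P = powerset(ATOMS)

def automorphisms():
    """Automorphisms of the power-set algebra, one per permutation of the atoms."""
    result = []
    for perm in permutations(ATOMS):
        mapping = dict(zip(ATOMS, perm))
        result.append(lambda p, m=mapping: frozenset(m[x] for x in p))
    return result

def every(p):
    return {s for s in P if p <= s}

ALPHA = frozenset({"a"})

def f_alpha(p):
    """The function in (49): the filter generated by alpha if alpha <= p, else empty."""
    return {s for s in P if ALPHA <= s} if ALPHA <= p else set()

def is_ai(f):
    """One place automorphism invariance: i(f(p)) == f(i(p)) for all i and p."""
    return all({i(s) for s in f(p)} == f(i(p))
               for i in automorphisms() for p in P)

print(is_ai(every))    # True
print(is_ai(f_alpha))  # False
```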
Some relations between CARD, LOG, and CONS

We shall exhibit first several structural similarities between the classes of functions we have defined - CARD, LOG, and CONS, as well as CARD_k, LOG_k, and CONS_k for each k. These similarities partly justify the naturalness of the classes distinguished. Perhaps the most important structural similarity is given in Thm 14 below. We use D_k as a variable ranging over {CARD_k, LOG_k, CONS_k}.

THEOREM 14 D_k is a complete subalgebra of F_(P*/P^k).

THEOREM 15 D_k is closed under converses. That is, for each permutation h of {1, 2, ..., k} the function f_h ∈ D_k if f is. (f_h, recall, sends each p to f(h(p)).)

In the next several theorems we use D_k as above and we use D without a subscript as a variable ranging over {CARD, LOG, CONS}. We remind the reader that just which functions are in D or D_k depends on what algebra P is chosen. Unless noted otherwise, however, our results do not depend on the choice of P.

THEOREM 16 D is closed under boolean sequences. That is, whenever f is a k-tuple of elements of D_1 then f_and and f_or are elements of D_k.

THEOREM 17 D is closed under identification of arguments. Specifically, if f ∈ D_(k+1) then f″ ∈ D_k, where f″(p_1, ..., p_k) = f(p_1, ..., p_k, p_k).

Of course Thm 17 above formally only guarantees that we can identify the last two arguments of a function. The more general results follow from:

COROLLARY 18 If g results from f ∈ D_(k+1) by identifying any two arguments then g ∈ D_k.

COROLLARY 19 If g results from f ∈ D_(k+1) by any m ≤ k identifications of arguments then g ∈ D_(k+1−m).

The proof is by induction on m. Thus identifying all arguments of f ∈ D_k gives rise to an element of D_1. Moreover all elements of D_1 can be represented in this way:

THEOREM 20 Define the function id from D_k into D_1 by id(f)(p) = f(p, ..., p). Then id is onto (and in fact a complete homomorphism).

For the onto part note that for f ∈ D_1, f = id(f_and,k).
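Identification of arguments is easy to illustrate on a small algebra. In this sketch, over two illustrative atoms, identify_last implements the operation of Theorem 17, every ... and ... is taken distributively, and the three place function is the one given for more students than either teachers or deans in (33). The helper names are assumptions, not the paper's notation.

```python
from itertools import combinations, product

ATOMS = ("a", "b")  # illustrative two-atom algebra

def powerset(atoms):
    return [frozenset(c) for r in range(len(atoms) + 1)
            for c in combinations(atoms, r)]

P = powerset(ATOMS)

def every(p):
    return {s for s in P if p <= s}

def every_and(p, q):
    return every(p) & every(q)

def more_than(p, q):
    return {s for s in P if len(s & p) > len(s & q)}

def more_than_or(s, t, d):
    """The three place function of (33): more s than t, and more s than d."""
    return more_than(s, t) & more_than(s, d)

def identify_last(f):
    """Theorem 17: f''(p1,...,pk) = f(p1,...,pk,pk)."""
    return lambda *ps: f(*ps, ps[-1])

# Identifying the two arguments of every ... and ... gives back one place every.
print(all(identify_last(every_and)(p) == every(p) for p in P))  # True

# Identifying the last two arguments of (33) gives back two place more ... than ....
print(all(identify_last(more_than_or)(s, t) == more_than(s, t)
          for s, t in product(P, repeat=2)))                    # True
```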
For example, we may think of one place every as arising from every . . . and . . . by identification of arguments. More interestingly, perhaps, we may think of the $det_1$ in (50a) below (italicized) as arising from the $det_2$ in (50b) by identification of arguments.

(50) a. More of John's than of Mary's articles were accepted
     b. More of John's plays than of Mary's reviews were accepted

A similar pair is given by the dets in (51a,b) below. The $det_2$ in (51b) is defined in (51c).

(51) a. More male than female teachers were arrested
     b. More male teachers than female students were arrested
     c. (more male . . . than female . . .)$(p, q)$ = (more . . . than . . .)$(male(p), female(q))$

[...] $(p) =_{df} f(p_j)$. We have already used this theorem for D = CONS in exhibiting generators for $CONS_k$ (Thm 10). The proofs for LOG and CARD are again one liners. A last closure property, a corollary of which we use in section 3, is given by:

THEOREM 22 $D$ is closed under "padding". If $f \in D_k$ then $f_0 \in D_{k+1}$, where $f_0(p_1, \ldots, p_{k+1}) = f(p_1, \ldots, p_k)$ if $p_{k+1} = 0$, and $f_0(p_1, \ldots, p_{k+1}) = 0$ otherwise.

Again the proofs are straightforward and will be omitted.

COROLLARY 23 For $Q$ a set of properties, if $Q = f(p)$ for some $f \in D_k$ and some $p$ in $P^k$, then $Q = g(q)$ for some $g \in D_{k+1}$ and some $q \in P^{k+1}$. (Choose $g = f_0$.)
In this sense we shall later say that any set $Q$ of properties which can in principle be denoted by an NP of the form ($det_k\ cnp_1, \ldots, cnp_k$), where $det_k$ denotes in $D_k$, can also in principle be denoted by an NP of the form ($det_{k+1}\ cnp_1, \ldots, cnp_{k+1}$), where the $det_{k+1}$ denotes in $D_{k+1}$. Finally, Thm 25 below gives a "non-closure" property for $D_k$ which has some linguistic interest. It will be convenient to state first the largely obvious:

THEOREM 24 $CARD_k \subseteq LOG_k \subseteq CONS_k$

Now let us show,

THEOREM 25 No $D$ is closed under existential quantification over arguments: e.g. it is not the case that for every $f$, if $f \in D_{k+1}$ then $Ef \in D_k$, where $Ef(p) =_{def} \bigcup_t f(p, t)$.

PROOF Consider the function $some_{or,2}$ which sends $(p, q)$ to $some(p) \cup some(q) = \{s : |s \wedge p| \geq 1 \text{ or } |s \wedge q| \geq 1\}$. Now $E some_{or,2}(p) = \bigcup_t \{s : |s \wedge p| \geq 1 \text{ or } |s \wedge t| \geq 1\}$, and for $r \neq 0$, $r \in E some_{or,2}(0) = \bigcup_t \{s : |s \wedge 0| \geq 1 \text{ or } |s \wedge t| \geq 1\}$. But $r \wedge 0 = 0 \notin E some_{or,2}(p)$ for any $p$. Whence $E some_{or,2}$ is not conservative. The argument generalizes easily to show that $E some_{or,k}$ is not conservative for any $k$. But $some_{or,k} \in CARD_k$ and thus from Thm 24 it is in $LOG_k$ and $CONS_k$. Hence no $D_k$ is closed under existential quantification over last arguments. To see that this holds for any argument use Thm 15 (Converses) to find a function in $D_k$ which puts the argument generalized over in last position. •
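The counterexample in the proof of Thm 25 is small enough to check mechanically. The following is a minimal sketch, assuming a toy universe of two individuals with properties coded as bitmasks, and assuming the two-place form of conservativity ($s \in f(p,t)$ iff $s \wedge (p \vee t) \in f(p,t)$); the helper names are illustrative and not taken from the text.

```python
N = 2                                   # two individuals (assumed toy universe)
PROPS = range(1 << N)                   # properties as bitmasks; 0 is the zero property

def some_or_2(p, t):
    """some(p) ∪ some(t): every property s that meets p or meets t."""
    return {s for s in PROPS if (s & p) or (s & t)}

def exists_last(f2):
    """Ef(p) = union over all t of f(p, t)."""
    def Ef(p):
        out = set()
        for t in PROPS:
            out |= f2(p, t)
        return out
    return Ef

def conservative_2(f2):
    # assumed two-place conservativity: s ∈ f(p, t) iff s ∧ (p ∨ t) ∈ f(p, t)
    return all((s in f2(p, t)) == ((s & (p | t)) in f2(p, t))
               for p in PROPS for t in PROPS for s in PROPS)

def conservative_1(f1):
    # one-place conservativity: s ∈ f(p) iff s ∧ p ∈ f(p)
    return all((s in f1(p)) == ((s & p) in f1(p)) for p in PROPS for s in PROPS)

E_some_or_2 = exists_last(some_or_2)
print(conservative_2(some_or_2))     # True
print(conservative_1(E_some_or_2))   # False
print(sorted(E_some_or_2(0)))        # [1, 2, 3]
```

The last line shows the failure directly: every non-zero property is in $E some_{or,2}(0)$, but its meet with 0 is not.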
A possible linguistic interest of Thm 25 concerns the systematic difference it provides between the Determiner Hierarchy $Det_1, Det_2, \ldots$ and the (first order) Predicate Hierarchy $Pred_1, Pred_2, \ldots$ Existentially generalizing over $k+1$ place predicates does yield a $k$ place predicate. E.g. existentially generalizing over the "subject" argument of a two place predicate yields its passive (Mary was kissed iff someone kissed Mary). In this respect, dets and the subclasses we have been discussing differ markedly from predicates. We turn now to some differences between the classes CARD, LOG, and CONS. Note first,

THEOREM 26 $CARD_k \subset LOG_k \subset CONS_k$
PROOF It has already been noted that the proper inclusion on the right holds for $P$ with at least two atoms. For the inclusion on the left, we show that the function $every_{and,k} \in LOG_k - CARD_k$. That $every_{and,k} \in LOG_k$ follows from Thm 16 and the observation that $every \in LOG_1$ given at the end of section 1. That $every_{and,k}$ is not in $CARD_k$ is proved by noting that if it were, then, since $|1 \wedge 0| = |0 \wedge 1|$, we would have that $1 \in every_{and,k}(\bar{0}) = P$ iff $0 \in every_{and,k}(\bar{1}) = \{1\}$, a contradiction. •
In light of the proper inclusions above, it is to be expected that $CONS_k$ will be closed under some operations which neither $LOG_k$ nor $CARD_k$ is closed under. Similarly we might expect to find operations under which both $CONS_k$ and $LOG_k$ are closed but which $CARD_k$ is not closed under. Both claims are correct. We consider the latter first. We define, using $D$ and $D_k$ as before:

DEFINITION 10 (a) $D$ is closed under disjunctive definitions iff for all $f \in D_1$ the function $f^+$ from $P^k$ into $P^*$ defined by $f^+(p) = f(\bigvee_i p_i)$ is in $D_k$. (b) $D$ is closed under conjunctive definitions iff for all $f \in D_1$, $f^\circ$ from $P^k$ into $P^*$ is in $D_k$, where $f^\circ(p) =_{df} f(\bigwedge_i p_i)$.

THEOREM 27 Both $LOG_k$ and $CONS_k$ are closed under both disjunctive and conjunctive definitions. $CARD_k$ is closed under neither.

On the other hand, a closure property which distinguishes $CONS_k$ from both $LOG_k$ and $CARD_k$ is scrambling. Thm 4 already notes that $CONS_k$ is closed under scrambling. The function below is defined by scrambling two $CARD_1$ functions and is obviously not even $LOG_1$ since it can discriminate between two (structurally identical) elements, the atoms $\alpha$ and $\beta$. (We assume that $P$ here has at least two atoms.)

$f(p) = some(p)$ if $\alpha \leq p$, and $f(p) = no(p)$ otherwise.

The example shows then that neither $CARD_1$ nor $LOG_1$ is closed under scrambling (= definition by cases). It is possible to define an appropriately restricted form of scrambling under which $CARD_k$ and $LOG_k$ are closed. Lastly, a linguistically interesting differentiating factor among our classes of interest concerns their relative size. We might contrast our results here with comparable work in syntax, where linguists have been generally concerned to constrain the class of grammars the child has to
choose from in order to better explain how he learns quickly, with limited exposure, etc. But it is rarely shown that the class of languages with grammars satisfying some proposed constraint is a proper subclass of the comparable grammars without the constraint. Here however we can assess by just how much the conservativity requirement reduces the set of logically imaginable ways we might associate properties with sets of properties. For example, in a world of only two individuals there are (surprisingly) 65,536 functions from $P$ into $P^*$. But only 512 of these are conservative. So most ways of associating properties with sets of properties are ruled out by the conservativity constraint. Further, of those 512, only 64 are logical, and only 8 are cardinal. The general figures are given in Thms 28 - 30 below. We spare the reader the tedious calculations, noting only that they follow the same pattern: each of our classes is a complete atomic boolean algebra and thus has cardinality $2^a$, where $a$ is the cardinality of the set of atoms.
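The four figures for a world of two individuals can be reproduced by brute force. The sketch below is only an illustration under stated assumptions: properties are bitmasks over the individuals, a set of properties is a bitmask over the four properties, "logical" is taken to mean conservative and invariant under permutations of the individuals, and "cardinal" is taken to mean that membership of $q$ in $f(p)$ depends only on $|p \wedge q|$. All function names are hypothetical.

```python
from itertools import product, permutations

N = 2                                    # two individuals
PROPS = range(1 << N)                    # the 4 properties, as bitmasks over individuals
PERMS = list(permutations(range(N)))     # automorphisms induced by permuting individuals

def member(S, q):
    """S codes a set of properties as a bitmask over PROPS."""
    return (S >> q) & 1

def image(perm, p):
    """Image of property p under a permutation of the individuals."""
    return sum(1 << perm[i] for i in range(N) if (p >> i) & 1)

def conservative(f):
    return all(member(f[p], q) == member(f[p], q & p) for p in PROPS for q in PROPS)

def invariant(f):
    return all(member(f[p], q) == member(f[image(pi, p)], image(pi, q))
               for pi in PERMS for p in PROPS for q in PROPS)

def cardinal(f):
    seen = {}                            # |p ∧ q|  ->  required membership value
    for p in PROPS:
        for q in PROPS:
            k = bin(p & q).count("1")
            if seen.setdefault(k, member(f[p], q)) != member(f[p], q):
                return False
    return True

def count_classes():
    total = cons = log = card = 0
    # a function P -> P* is a 4-tuple of 4-bit masks: 16**4 candidates
    for f in product(range(1 << len(PROPS)), repeat=len(PROPS)):
        total += 1
        if conservative(f):
            cons += 1
            if invariant(f):
                log += 1
                if cardinal(f):
                    card += 1
    return total, cons, log, card

print(count_classes())   # (65536, 512, 64, 8)
```

The counts are powers of two ($2^{16}$, $2^9$, $2^6$, $2^3$), in line with the remark that each class is a complete atomic boolean algebra.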
THEOREM 28 For $P$ with $n$ individuals, [...]

We note that $CARD_k$ is not isomorphic to $(CARD_1)^k$ as these two sets have different sizes. However, the atoms of $CARD_k$ do correspond to $k$-tuples of atoms of $CARD_1$ by the map $(f_1, f_2, \ldots, f_k) \mapsto f_{and,k}$. Thus they are expressible in English by expressions such as exactly two . . . and six . . . and zero, an atom of $CARD_3$, as in exactly two Danes and six Swedes and zero Albanians.
THEOREM 30 (van Benthem) $|LOG_k| = 2^{[\ldots]}$

We are indebted to Johan van Benthem for bringing this result to our attention. In addition, Elias Thijsse pointed out a simple, natural generalization of the arguments proving theorems 3 and 4 in his paper in this volume, which yields the above formula as well as related results.
3. EFFABILITY RESULTS
We introduce here a family of questions relating to the expressive power of natural languages. We may consider that a major function of natural language is to enable us to make assertions and to raise questions about the world. It is natural then to consider the precise semantic contribution to these assertions made by the various sorts of expressions in the language. Here we are concerned with the expressive contribution of dets and with comparing the expressive power of the various subclasses of dets introduced in the last section. The questions we are concerned with here may be expressed informally as follows:

Q1 Suppose that English dets were not subject to the conservativity constraint. Is there any significant sense in which we could say more than we currently can?
Q2 Suppose we had no non-logical dets. Would we then suffer a loss of expressive power? E.g., could we dispense with possessive dets like John's in favor of circumlocutions using logical dets?
Q3 Could we dispense with the logical dets in favor of the cardinal ones?
Q4 Does allowing k > 1 place dets increase expressive power? E.g., can we say more with $LOG_2$ dets than with $LOG_1$ dets?

To formulate these questions rigorously we must make explicit just what is to count as a "significant" difference in expressive power. Obviously, for example, if we added blik as a $det_1$ to English and allowed it to be interpreted by non-conservative functions, then we would have trivially increased the expressive power of English in that we could now denote non-conservative functions. But such a change seems hardly significant. Rather we think of the expressive role of dets as inherently ancillary: their role is to allow us to form full NP's, and it is full NP denotations which we make assertions about and raise questions about. In consequence we shall consider a change in the determiner system to be significant if it results in a change in the sets of properties we can refer to.
We formulate this notion of change in expressive power in two distinct but related ways: as changes in the in principle expressive power, and as changes in the in practice expressive power. We illustrate the distinction by considering question (Q1). Suppose first that we could find a set Q of properties which was not the value of any conservative function at any k-tuple of properties. Then we would say that in principle not all property sets were expressible via conservative functions, and in consequence, not all property sets would be in practice expressible by NP's consisting of a k-place det and k CNP's. That
is, given that $det_k$'s are always interpreted by conservative functions, we would know that English had no NP of the form $det_k\ cnp_1 \ldots cnp_k$ which was interpreted as the in principle inexpressible set Q. So negative results concerning in principle expressive power imply the same result concerning in practice expressive power. But the converse fails. That is, suppose we find that every set Q of properties is in fact the value of some conservative function at some k-tuple of properties. We have no guarantee that English provides a k-place determiner and k CNP's such that the resulting NP is interpreted as Q. So it could be that in principle every set was expressible but that in practice this was not the case. We consider first our original questions from the perspective of in principle expressive power. The corresponding in practice questions are taken up formally in the last section.

In Principle Expressive Power

DEFINITION 11 For $D \subseteq CONS_k$ and $Q \subseteq P$, we say Q is D-expressible iff $Q \in \bigcup \{\mathrm{range}\ f : f \in D\}$. That is, if there is some $f \in D$ and some $q \in P^k$ such that $Q = f(q)$. We also set $E_D = \{Q \subseteq P : Q \text{ is D-expressible}\}$.

The classes D which we shall consider are $CONS_k$, $LOG_k$, and $CARD_k$. We will also consider the classes $CONS = \bigcup_k CONS_k$, $LOG = \bigcup_k LOG_k$, and $CARD = \bigcup_k CARD_k$, and here we make the obvious generalizations of the definition above. Although we are interested in comparing these classes in single algebras, some of our results require that P be sufficiently large. Accordingly, we will write

$E_D \subset E_{D'}$

to mean that on all sufficiently large P, the D-expressible sets are a proper subset of the D'-expressible ones. For a few of our results, the distinction between finite and infinite P is crucial. Our original questions may now be formulated as follows:

Q1' Is $E_{CONS_1} = P^*$?
Q2' Is $E_{LOG_k} = E_{CONS_k}$? Is $E_{LOG} = E_{CONS}$?
Q3' Is $E_{CARD_k} = E_{LOG_k}$? Is $E_{CARD} = E_{LOG}$?
Q4' For each k, is $E_{LOG_k} = E_{LOG_{k+1}}$? Is $E_{CARD_k} = E_{CARD_{k+1}}$?
We will answer the questions in this section, and we will also state some other facts about the various classes.
It is tempting to make an analogy here to definability theory and to the general sort of negative results on expressibility found in mathematical logic. However, our techniques borrow more from combinatorics than from logic. Here is a simple observation that follows from the inclusion relations among the classes of dets and from Cor 23 to Thm 22.

PROPOSITION 31 (a) For all k, $E_{CARD_k} \subseteq E_{LOG_k} \subseteq E_{CONS_k}$. (b) For all k, $E_{CARD_k} \subseteq E_{CARD_{k+1}}$ and $E_{LOG_k} \subseteq E_{LOG_{k+1}}$.
Moreover, both of these hold for expressibility relative to any universe P. Our first result is an affirmative answer to the first question.

THEOREM 32 For all $Q \subseteq P$, Q is $CONS_1$-expressible.

PROOF Define $f: P \to P^*$ by $f(p) = Q$ if $p = 1$, and $f(p) = 0$ otherwise. Then f is conservative, and $f(1) = Q$. •
So, in fact, the conservativity constraint does not in principle limit what sets of properties we can express with determined NP's. The result is reasonable, even reassuring. The contrary result would say that the major semantic constraint on English dets was self-defeating: so strong that it in principle prevented us from being able to refer to certain sets of properties. Next, we consider question (Q2) and answer it in the negative for the case k = 1.

THEOREM 33 Let P be any algebra with more than two elements. For all p with $0 < p < 1$, $\{p\}$ is not $LOG_1$ expressible. So $E_{LOG_1} \subset E_{CONS_1}$.

PROOF Suppose that $f(q) = \{p\}$ where f is $LOG_1$ and $q \in P$. Now by conservativity, $q \wedge p \in f(q)$, so $p = p \wedge q$. Thus $p \leq q$. If equality held, then $f(p) = \{p\}$, but then by conservativity $1 \in f(p) = \{p\}$ since $(1 \wedge p) = p \in f(p)$, and this is contrary to our assumption that $p < 1$. So $p < q$, and thus we can find an atom $\alpha \leq q - p$. Also $0 < p$, so let $\beta$ be an atom $\leq p$. Let i be an automorphism of P induced by a transposition of $\alpha$ and $\beta$. Then $i(q) = q$ because both $\alpha$ and $\beta$ are $\leq q$. Since f is automorphism invariant, $\{p\} = f(q) = f(i(q)) = i(f(q)) = i(\{p\}) = \{(p - \beta) \vee \alpha\}$. But this is a contradiction, since $\alpha \neq \beta$ and $\alpha$ is not below p. •
It is interesting to note that singletons $\{p\}$ for $0 < p < 1$ are all in practice expressible using one place possessive dets (see section 4). We turn next, for a while, to question Q3'. Again we consider the case k = 1, and this time the answer depends on the size of P.

THEOREM 34 For all finite P, $E_{CARD_1} = E_{LOG_1}$.

PROOF By Proposition 31, it is sufficient to show that if P is finite, $f \in LOG_1$, and $p \in P$, then there is some $g \in CARD_1$ such that $f(p) = g(p)$. In fact it is sufficient to show this for the atoms f, because we can take the join of the corresponding g's. Since f is an atom, there are numbers N and M less than the number of atoms in P such that

$f(p) = \{q : |q \wedge p| = M\}$ if $|p| = N$, and $f(p) = 0$ if $|p| \neq N$.

Define $g : P \to P^*$ by $g(q) = \{r : \text{for some } s \in f(p),\ |r \wedge q| = |s \wedge q|\}$. It is immediate that g is CARD, and that $f(p) \subseteq g(p)$. And the reverse inclusion follows from the formula for f above. •
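For the four-element algebra (two individuals) the in principle results obtained so far can be confirmed by exhaustive search. This is a sketch under the same assumptions as the definitions above are read here: bitmask-coded properties, "logical" as conservative plus permutation-invariant, and "cardinal" as dependence on $|p \wedge q|$ alone. It collects the value sets of all functions in each class and compares them; the names are illustrative only.

```python
from itertools import product, permutations

N = 2
PROPS = range(1 << N)                    # properties as bitmasks; 0b01 and 0b10 are the atoms
PERMS = list(permutations(range(N)))

def member(S, q):
    return (S >> q) & 1                  # S: set of properties as a bitmask over PROPS

def image(perm, p):
    return sum(1 << perm[i] for i in range(N) if (p >> i) & 1)

def conservative(f):
    return all(member(f[p], q) == member(f[p], q & p) for p in PROPS for q in PROPS)

def invariant(f):
    return all(member(f[p], q) == member(f[image(pi, p)], image(pi, q))
               for pi in PERMS for p in PROPS for q in PROPS)

def cardinal(f):
    seen = {}
    for p in PROPS:
        for q in PROPS:
            k = bin(p & q).count("1")
            if seen.setdefault(k, member(f[p], q)) != member(f[p], q):
                return False
    return True

def expressible_sets():
    """Collect every value f(p) taken by some function in each class."""
    e_cons, e_log, e_card = set(), set(), set()
    for f in product(range(1 << len(PROPS)), repeat=len(PROPS)):
        if not conservative(f):
            continue
        e_cons.update(f)
        if invariant(f):
            e_log.update(f)
            if cardinal(f):
                e_card.update(f)
    return e_cons, e_log, e_card

E_CONS, E_LOG, E_CARD = expressible_sets()
atom_singletons = [1 << a for a in (0b01, 0b10)]     # the sets {a} for each atom a

print(len(E_CONS) == 1 << len(PROPS))                # True: every Q is CONS_1 expressible
print(any(s in E_LOG for s in atom_singletons))      # False: no {p} with 0 < p < 1
print(E_CARD == E_LOG, len(E_CARD))                  # True 12
```

The three printed lines correspond to Thm 32, Thm 33 and Thm 34 respectively, restricted to this one small algebra; the search says nothing about larger or infinite P.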
We might also mention in this connection an interesting property of the $CARD_1$ expressible subsets of finite P: they determine their arguments.

LEMMA 35* If P is finite and Q is a $CARD_1$ expressible set different from 0 or P, then there is a unique $p \in P$ such that for some $f \in CARD_1$, $Q = f(p)$.

We note that this lemma fails for P infinite, since if (exactly $\aleph_0$)$(s) \neq 0$, then (exactly $\aleph_0$)$(s)$ = (exactly $\aleph_0$)$(p)$ for all p such that $(s - p) \vee (p - s)$ is finite. As a corollary to Lemma 35 we can count the $CARD_1$ expressible subsets of P when P is finite.

THEOREM 36 If P has n atoms then $|E_{CARD_1}| = 2(3^n - 2^n + 1)$.

The assumption that P is finite is necessary in Thm 36 in view of the following fact.

THEOREM 37 (a) If P is infinite and $p \in P$ dominates infinitely many atoms, then $F_p = \{q \in P : p \leq q\}$ is not $CARD_1$ expressible. Thus, (b) If P is infinite, then $E_{CARD_1} \subset E_{LOG_1}$.

*Lars Johnson points out to us that Lemma 35 generalizes to $LOG_1$ expressible sets.
PROOF (a) Suppose that p were an infinite property but that $f(r) = F_p$ for some $CARD_1$ f and some property r. By conservativity, $p \wedge r \in f(r) = F_p$, so $p \leq p \wedge r$ and hence $p \wedge r = p$. Let $\alpha$ be an atom $\leq p$. Since $p \wedge r$ dominates infinitely many atoms, $|(p - \alpha) \wedge r| = |p \wedge r|$. Thus, by $CARD_1$-ness, $p - \alpha \in f(r) = F_p$, and this means that $p \leq p - \alpha$. But since $\alpha \leq p$, this is a contradiction.

(b) This follows from (a) once we recall that the function every given by $every(p) = F_p$ is $LOG_1$. •
Thus none of the sets obtained by applying the logical det every to infinite properties can be obtained using cardinal dets. This compares with the observation that although the effect of every can be achieved with merely cardinal dets on finite universes (by Thm 34), it cannot be obtained by using a fixed finite set of cardinal dets. Indeed, all of our results which make essential use of infinite P can be recast as nonuniformity results concerning expressibility over finite universes. The arguments of the last proof can be generalized to prove that if p is infinite, then $F_p$ is not $CARD_k$ expressible for any k. (If $f(q) = F_p$, let $\alpha$ be an atom $\leq p - \bigvee \{q_i : \text{for some } i \leq k,\ |p \wedge q_i| \text{ is finite}\}$.) Thus we have also answered the second part of question (Q3') by showing that for infinite P, $E_{CARD} \subset E_{LOG}$. For finite P, equality holds because every finite set of finite properties is $CARD_k$ expressible for some k, as the next result shows.

LEMMA 38 (a) Let Q be a finite set, say with cardinality M, and suppose that every element of Q is a finite property. Then Q is $CARD_{2M}$-expressible. (b) Let Q be any finite set of properties, say with cardinality M. Then Q is $LOG_{2M}$-expressible.

PROOF (a) Note first that for every finite property p, (exactly $|p|$)$(p)$ and no$(p') = \{p\}$. (This fails for infinite p.) Thus if $Q = \{q_1, \ldots, q_M\}$, then Q = either (exactly $|q_1|$)$(q_1)$ and no$(q_1')$ or . . . or (exactly $|q_M|$)$(q_M)$ and no$(q_M')$. The det in this expression is $CARD_{2M}$, since it is a join of M functions of the form (exactly $|q_i|$ . . . and no . . .). This proves (a).

(b) The argument is the same except that now our first observation is that for every p, every$(p)$ and no$(p') = \{p\}$, so we can repeat the argument above with these functions. •
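The two observations driving Lemma 38 can be tried out on a small finite algebra. The sketch below assumes three individuals, properties coded as bitmasks, and the usual readings exactly $k$ $(p) = \{q : |p \wedge q| = k\}$, no$(p) = \{q : p \wedge q = 0\}$ and every$(p) = \{q : p \leq q\}$; these readings and all names are assumptions made for the illustration.

```python
N = 3                                    # three individuals (assumed small universe)
PROPS = range(1 << N)                    # properties as bitmasks
TOP = (1 << N) - 1                       # the unit property 1

def size(p):
    return bin(p).count("1")

def complement(p):
    return TOP & ~p

def exactly(k, p):
    return {q for q in PROPS if size(p & q) == k}

def no(p):
    return {q for q in PROPS if p & q == 0}

def every(p):
    return {q for q in PROPS if p & q == p}

def card_expression(Q):
    """Join over q in Q of (exactly |q|)(q) and no(q')."""
    out = set()
    for q in Q:
        out |= exactly(size(q), q) & no(complement(q))
    return out

def log_expression(Q):
    """Join over q in Q of every(q) and no(q')."""
    out = set()
    for q in Q:
        out |= every(q) & no(complement(q))
    return out

def all_property_sets():
    for mask in range(1 << len(PROPS)):
        yield {p for p in PROPS if (mask >> p) & 1}

print(all(card_expression(Q) == Q for Q in all_property_sets()))   # True
print(all(log_expression(Q) == Q for Q in all_property_sets()))    # True
```

Each disjunct picks out exactly one property, so the join over the members of Q returns Q itself; the number of argument places used grows with the size of Q, which is the nonuniformity the text points to.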
We might also mention that the assumption in part (a) of this theorem is necessary; cf. the final theorem of this section. Lemma 38(b) gives a partial answer to the second in principle question as it shows that every subset of a finite P is LOG expressible. In later theorems, we will show that this fails for infinite P, and also that for every k, the $LOG_k$-expressible sets are a proper subset of the class of all CONS expressible sets, even on (large enough) finite P. These results require a more searching analysis of $LOG_k$-expressibility than was needed for the classes we have so far considered. We also define a subset Q of P to be disjoint if for all $p, q \in Q$, $p \neq q$ implies $p \wedge q = 0$. We will use the following technical lemma on the expressibility of certain disjoint sets.

LEMMA 39 Let Q be a nonempty disjoint subset of P each of whose elements dominates at least two atoms, and suppose that $Q = f(p)$ for some $LOG_k$ f and some k-tuple p. Then
(a) $\bigvee_i p_i = 1$
(b) if $h \leq 2^k - 1$, $r \in Q$, and $I_h(p) \wedge r \neq 0$, then $I_h(p) \leq r$.

The I functions are understood as follows: for each non-empty subset H of $\{1, 2, \ldots, k\}$, $I_H(p) = \bigwedge_{j \in H} p_j \wedge \bigwedge_{i \notin H} p_i'$. We think of the $2^k - 1$ sets H as indexed by the numbers h in $\{1, 2, \ldots, 2^k - 1\}$. By a suitable application of the lemma, the following useful assertion may be proved.

THEOREM 40 Fix an algebra P and an integer k. For every disjoint subset Q each of whose elements dominates exactly two atoms:
(a) if $|Q| \leq 2^k - 2$, then Q is $LOG_k$ expressible
(b) if $|Q| = 2^k - 1$ and $\bigvee Q = 1$, then Q is $LOG_k$ expressible
(c) if $|Q| > 2^k - 1$, then Q is not $LOG_k$ expressible and
(d) if $|Q| = 2^k - 1$ and $\bigvee Q < 1$, then Q is not $LOG_k$ expressible.

Note that both (c) and (d) provide complete answers to the first part of question (Q2'): namely, for each k, the $LOG_k$ expressible sets are properly included in the $CONS_k$ expressible sets (= the $CONS_1$ expressible sets = $P^*$). Moreover this can be shown for (sufficiently large) finite P.

COROLLARY 41 (a) For all sufficiently large (finite) P, $E_{LOG_k} \subset E_{LOG_{k+1}}$. (b) For all infinite P, $E_{LOG} \subset E_{CONS}$.
Cor. 41(a) directly answers the first part of question (Q4'). Cor. 41(b) answers the second part of question (Q2'). In conjunction with Lemma 38, it tells us that P is infinite iff there are sets which are in $E_{CONS} - E_{LOG}$.
We note in the last section that, once again, the LOG inexpressible set constructed in Cor 41(b) is in practice expressible using one place possessive dets. Now we turn to a result which despite its weakness will imply the answer to our remaining questions concerning $CARD_k$ expressibility.

LEMMA 42 Let k > 3, and let P have more than $2(2^k - 2)$ atoms. Let Q be a disjoint subset of P consisting of the joins of $2^k - 2$ pairs of atoms. Then Q is not $CARD_k$ expressible.

THEOREM 43 For all k > 2 and all sufficiently large (finite) P, $E_{CARD_k} \subset E_{LOG_k}$.

We do not have as sharp a result as Thm 40 for $CARD_k$ expressibility, and so we leave as an open problem the determination of the exact extent of $CARD_k$ expressibility of disjoint sets. In spite of this, we can nevertheless prove that for sufficiently large P, $CARD_k$ expressibility is strictly weaker than $CARD_{k+1}$ expressibility.

THEOREM 44 For all sufficiently large finite P, $E_{CARD_k} \subset E_{CARD_{k+1}}$.

The proof is based on the following auxiliary result:

LEMMA 45 Let R be a nonempty, disjoint, $CARD_k$ expressible set, $0 \notin R$, and let t be a finite property such that $t \wedge \bigvee R = 0$. Then $R \cup \{t\}$ is $CARD_{k+1}$ expressible.

While we answered the questions we set ourselves in the last section, there remain some open problems concerning in principle expressibility.

OP 1 (a) For our various classes D of interest, can we give a "structural characterization" of the D-expressible sets, e.g. one using just the notions of boolean algebra? (b) Is there a natural operation by which the $D_{k+1}$-expressible sets can be obtained from the $D_k$-expressible sets?

Concerning (b), we asked if there was any relation between the finite boolean closure of the $LOG_1$-expressible sets and any of the other classes. For example, can we obtain every LOG-expressible set by taking finite intersections and unions of $LOG_1$-expressible sets? We know from Thm 40
that for sufficiently large (finite) P there are sets which are not $LOG_2$ expressible. And from the argument in Lemma 38, we know that for finite P, every subset of P is a finite boolean combination of $LOG_1$-expressible sets (in fact of $CARD_1$ expressible ones, in virtue of Thm 34). But this situation does not obtain when P is infinite. Below we show that if P is infinite, there is a subset Q of P which is $LOG_2$-expressible (and in fact $CARD_2$ expressible) which is not a finite boolean combination of $LOG_1$ (or $CARD_1$) expressible sets. Our example is stated for the case P = the set of subsets of the natural numbers. We are concerned with the set EO = $\{p \in P : p$ is a finite set with as many even as odd elements$\}$. This set EO is $LOG_2$-, even $CARD_2$-expressible, since it is

$\bigvee_m \big(\text{exactly } m(\{0, 2, 4, \ldots\}) \wedge \text{exactly } m(\{1, 3, 5, \ldots\})\big)$.
THEOREM 46 EO is not a finite boolean combination of $LOG_1$-expressible sets.

PROOF Suppose toward a contradiction that it were. Since the $LOG_1$-expressible sets are closed under complements, we have a representation

(53) a. $EO = X_1 \cup X_2 \cup \ldots \cup X_N$, where each $X_i$ is an intersection
     b. $X_i = f^i_1(p^i_1) \cap \ldots \cap f^i_m(p^i_m)$

for some m, some $LOG_1$ functions $f^i_j$ and some properties $p^i_j$ which depend on i. When looking at a single $X_i$ we will always omit the superscripts. There are only a finite number of properties $p_j$ which figure into the above representation, so we can find an infinite set B of odd numbers so that for all $b_1, b_2 \in B$ and all the relevant properties p, $b_1 \in p$ iff $b_2 \in p$. We can also find an infinite set C of even numbers with this same property. The point is that if x and y each contain k elements of B and k elements of C, then for all the relevant p and all $f \in LOG$, $x \in f(p)$ iff $y \in f(p)$. This is because there is an automorphism taking x to y and fixing p as a set. As a result, for each $i \leq N$, $x \in X_i$ iff $y \in X_i$. We use this observation and the representation of EO in (53a) to define a map F: [...]

[...] 1 place dets (for each of our subclasses) increase in practice expressive power? Here we have some positive results and some open problems.

OP 4: For any k, is the class of sets which are in practice $CONS_k$ expressible a proper subset of those which are in practice $CONS_{k+1}$ expressible?

We know of course from Thm 32 and Proposition 31 that every set Q of properties is in principle $CONS_k$ expressible for every k. For the classes CARD and LOG we have more definitive results:

CLAIM 3 The sets which are in practice $CARD_1$ ($LOG_1$) expressible are properly included in those which are $CARD_2$ ($LOG_2$) expressible.

ARGUMENT We showed earlier that for all p, $0 < p < 1$, the unit set $\{p\}$ was not in principle $LOG_1$ expressible, and thus not in principle $CARD_1$ expressible. Now interpret the CNP student as an atom, say a, and observe that more students than non-students is the unit set $\{a\}$. Since more . . . than is in $CARD_2 \subseteq LOG_2$, we have that the in practice $LOG_2$ ($CARD_2$) expressible sets properly include the in practice $LOG_1$ ($CARD_1$) expressible sets.
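The step "more students than non-students is the unit set $\{a\}$" can be spelled out as follows. This is a sketch that assumes the reading (more . . . than . . .)$(p, q) = \{s : |s \wedge p| > |s \wedge q|\}$, which is not restated in this part of the text, with student interpreted as the atom $a$ and non-student as its complement $a'$.

```latex
\begin{align*}
(\textit{more}\ldots\textit{than}\ldots)(a, a')
  &= \{\, s : |s \wedge a| > |s \wedge a'| \,\} \\
  &= \{\, s : |s \wedge a| = 1 \text{ and } |s \wedge a'| = 0 \,\}
     && \text{since } |s \wedge a| \le 1 \text{ for an atom } a \\
  &= \{\, s : a \le s \text{ and } s \le a \,\} \\
  &= \{a\}.
\end{align*}
```

So under that reading the two place det picks out a singleton $\{a\}$ with $0 < a < 1$, which is exactly the kind of set excluded for one place logical dets.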
We shall not pursue further here the general question whether the in practice $CARD_k$ ($LOG_k$) expressible sets are properly included among the in practice $CARD_{k+1}$ ($LOG_{k+1}$) expressible sets.
APPENDIX: SOME ONE PLACE ENGLISH DETERMINERS
(The groupings given below are intended for mnemonic reasons; several expressions arguably belong in more than one group.)

superficially simplex dets
every, each, all, both, a, some, two, three, . . ., most, several, a few, the, no, neither, zero, (this)$_a$, (these)$_p$

strict cardinal dets
at least ten, at most ten, more than ten, exactly ten, fewer than ten

adverb + many
infinitely many, uncountably many, finitely many

a + AP + number of
a prime number of, an even number of

possessive dets
John's, no student's

bounding dets
exactly ten, between five and ten, and dets of the form only + Det: only five, only John's, only finitely many, only SOME (indicates contrastive stress), only the LIBERAL

exception dets
all but three, all but finitely many, all but the three smartest, every but John (as it occurs in every student but John left)

boolean combinations
not one, not every, not only John's, neither the dumbest nor the smartest, neither John's nor Mary's, some but not all, at least two and at most ten, either exactly two or exactly ten

comparative dets
more than ten, etc.; more male than female (as in more male than female students failed the exam), exactly as many male as female, twice as many male as female

det + AP
the tallest, the five tallest, all graduate (as in all graduate and a few undergraduate students attended), John's biggest

determined dets
the five, John's five, the five or more, John's five biggest

partitive dets
most of John's, at least two of the five, the three tallest of the twenty, more than half of John's, more of John's than of Mary's, fewer than ten of the hundred or more

proportional dets
most, half (of) the, every other, every third, five percent of the, two thirds of the, less than half (of) the, most of the, less than ten percent of the
REFERENCES

Barwise, J. and R. Cooper (1981): 'Generalized Quantifiers and Natural Language', Linguistics and Philosophy 4-2, 159-219.
Benthem, J. van (1982): 'Five Easy Pieces', in Studies in Modeltheoretic Semantics, GRASS series vol. 1, A. ter Meulen (ed.), Foris, Dordrecht.
Benthem, J. van (1983a): 'Questions about Quantifiers', Journal of Symbolic Logic 49-2, 443-466.
Benthem, J. van (1983b): 'A Linguistic Turn: New Directions in Logic', to appear in Logic, Methodology and Philosophy of Science VII, North-Holland (1984).
Bergman, M. (1982): 'Cross Categorial Semantics for Conjoined Common Nouns', Linguistics and Philosophy 5:3, 399-403.
Gil, D. (1982): Distributive Numerals, Ph.D. dissertation, Dept. of Linguistics, UCLA.
Keenan, E.L. (1981): 'A Boolean Approach to Semantics', in Truth, Interpretation and Information, J.A.G. Groenendijk et al. (eds.), GRASS 2, Foris Publ., Dordrecht.
Keenan, E.L. and L. Faltz (1978): Logical Types for Natural Language, UCLA Occasional Papers in Linguistics 3.
Keenan, E.L. and L. Faltz (to appear): Boolean Semantics for Natural Language, to appear in D. Reidel, Synthese Language Library Series.
Keenan, E.L. and J. Stavi (1981): 'A Semantic Characterization of Natural Language Determiners', to appear in Linguistics and Philosophy.
Keenan, E.L. and L.S. Moss (1984): 'Determiners and the Logical Expressive Power of Natural Language', Proceedings of the West Coast Conference on Formal Linguistics, vol. 3, M. Wescoat et al. (eds.), Stanford Linguistics Association.
Lindström, P. (1966): 'First Order Predicate Logic with Generalized Quantifiers', Theoria 32, 186-195.
Mostowski, A. (1957): 'On a Generalization of Quantifiers', Fundamenta Mathematicae 44, 12-36.
Napoli, D.J. (1983): 'Comparative Ellipsis: A Phrase Structure Analysis', Linguistic Inquiry 14:4, 675-695.
Thijsse, E. (1982): 'On Some Proposed Universals of Natural Language', in ter Meulen (ed.), Studies in Modeltheoretic Semantics, GRASS 1, Foris Publ., Dordrecht.
Thijsse, E. (1983): Laws of Language: Universal Properties of Determiners as Generalized Quantifiers, doctoraalscriptie, Rijksuniversiteit Groningen.
Westerståhl, D. (1982): 'Some Results on Quantifiers', Notre Dame Journal of Formal Logic 25-2, 152-170.
Zwarts, F. (1983): 'Determiners: A Relational Perspective', in ter Meulen (ed.), Studies in Modeltheoretic Semantics, GRASS 1, Foris Publ., Dordrecht.
Chapter 5
Counting Quantifiers
Elias Thijsse

1. INTRODUCTION
In the last decades various conditions have been proposed which characterize important classes of interpretations for categories of natural language. This paper deals with the question how many quantifiers there are that satisfy a certain combination of conditions, given the size of the domain of interpretation. Though both NP-interpretations and DET-interpretations can be considered as generalized quantifiers (in the sense of Lindström (1966) ), I will reserve the term quantifier (Q) for NP-denotations (cf. Barwise & Cooper (1981) ) and call DET-denotations dets, following Keenan & Stavi (1981). Every Q is then a set of subsets of the domain and every det a function from subsets of the domain to Qs. Moreover, Qs and dets are unconditioned: neither conservativity nor isomorphism (to be explained below) is presupposed in the notions Q and det. This makes the effect of the conditions on the number of Qs and dets allowed more transparent. Before stating the actual calculations, I will give some motivation for the present research. It provides a mathematical evaluation of the proposed conditions. Compared to the methodological perspective used in Thijsse (1983a), our present means of evaluation is refined in some respects. We can still answer questions concerning the dependence, falsifiability and consistency of conditions, but now we can also (i) compare the strength of conditions. For example, isomorphism is generally more restrictive than conservativity; (ii) keep track of the way we have succeeded in constraining the set of permissible denotations for determiners and NPs as much as possible. The ultimate goal is to characterize the notion of possible natural language from the semantic side. Although we stepwise approximate this goal, the results show that there is still a long way to go. At first sight, we seem to have lost control over the syntax in shifting from methodology to counting, since the latter treatment is purely semantical. 
However, (iii) the numerical results can be used to obtain knowledge of the
expressive power of the syntax (see Keenan & Moss, this volume); (iv) related to (ii) is the possibility of using counting results to get a better idea of the way information is processed by the language user. Until a compatible psychological theory has been found, this application leans heavily on suitable interpretations of the numerical results. Besides, complexity measures other than those induced by conditions may be more relevant here (cf. van Benthem (1984b)). The rest of this paper is organized as follows. In the next section finite models will be considered. We first calculate the effect of conditions in isolation, and later treat their combinations. In the third section we deal with infinite domains. The effects of many combinations of conditions appear to coincide. Apart from this collapse we will find that a definability result for finite domains carries over to the infinite case, but that extrapolating from the finite case can be very dangerous as well.
2.
FINITE MODELS
In most papers devoted to the linguistic application of generalized quantifier techniques, only finite models are considered (but notice the contributions by van Deemter and van Eijck to this volume). A number of nontrivial problems have been raised in this area.
2.1. Conservativity
Consider a set $E$ of $n$ individuals. How many NP-quantifiers are there on $E$? Every set of subsets of $E$ is a quantifier. Therefore the set of quantifiers on $E$ corresponds to $P(P(E))$. The desired number is thus $|P(P(E))|$. Now $X \in P(Y)$ (i.e. $X \subseteq Y$) uniquely corresponds to a characteristic mapping $\kappa : Y \to \{0,1\}$ where $\kappa(x) = 1$ for every $x \in X$ and $\kappa(x) = 0$ for every $x \notin X$. So $|P(Y)| = 2^{|Y|}$. Because $|E| = n$, there are $2^{2^n}$ quantifiers on $E$. Apart from some cases where the number of quantifiers is relevant, we will mainly be concerned here with determiners (dets). How many dets are there when no conditions are imposed? Call this number $|\mathrm{DET}|_n$.
THEOREM 1: $|\mathrm{DET}|_n = 2^{(4^n)}$ (or, $2^{4^n}$)
PROOF: Let $E$ be a set such that $|E| = n$. Each possible det is a function from subsets of $E$ to subsets of $P(E)$. I.e. $|\mathrm{DET}|_n$ equals the size of the set of functions $P(P(E))^{P(E)}$. Thus $|\mathrm{DET}|_n = (2^{2^n})^{2^n} = 2^{2^n \cdot 2^n} = 2^{(4^n)}$.
•
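The count in theorem 1 can be checked by explicit enumeration for very small domains. The following is a minimal sketch, assuming subsets are modelled as Python frozensets; the names `powerset` and `count_dets` are illustrative and do not occur in the text.

```python
from itertools import combinations, product

def powerset(xs):
    """Return all subsets of xs as a list of frozensets."""
    xs = list(xs)
    return [frozenset(c) for r in range(len(xs) + 1) for c in combinations(xs, r)]

def count_dets(n):
    """Count all dets on an n-element domain by explicit enumeration."""
    subsets = powerset(range(n))      # P(E), 2**n elements
    quantifiers = powerset(subsets)   # P(P(E)), 2**(2**n) elements
    # A det assigns one quantifier to every argument set A in P(E).
    return sum(1 for _ in product(quantifiers, repeat=len(subsets)))

for n in range(3):
    print(n, count_dets(n), 2 ** (4 ** n))
```

The enumeration is only feasible up to $n = 2$; the point is merely to see the closed formula and the definition agree on the cases where both can be computed.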
At least from a 'brute force' point of view, and with the proviso pointed out in (iv) of the introduction, there are objections to super-exponential growth. E.g. for $n = 2$ we have $2^{16} = 65\,536$ structures. This argument forces us to look for limitations to the class of possible dets. For any set $\Sigma$ of conditions, let $|\bigcap\Sigma|_n$ be the number of dets allowed by $\Sigma$ for a domain of size $n$ (formally $\mathrm{DET} \in \Sigma$ is presupposed throughout this paper). One of the first conditions proposed and defended in the linguistic GQ-literature is conservativity. A det $D$ is called conservative if for every $A, X \subseteq E$: $X \in D(A) \Leftrightarrow (X \cap A) \in D(A)$. The condition holds for all simple dets (= interpretations for lexically simple determiners) and is preserved under Boolean combinations, adjectival restrictions and definition by cases (Keenan & Stavi (1981)). We abbreviate the class of conservative dets by CONS. This condition eliminates most dets (see note 2):
THEOREM 2: $|\mathrm{CONS}|_n = 2^{3^n}$
PROOF: Let $E$ have $n$ individuals and consider CONS over $E$. Any conservative $D$ then maps sets $A \subseteq E$ to sets $Q \subseteq P(E)$. Let $|A| = m$, then $0 \le m \le n$. Because of CONS the behaviour of $D(A)$ is totally determined by its conduct on $A$. So $D(A)$ is characterized by a GQ on $A$, as the reader may easily verify. There are of course $2^{2^m}$ such sets of subsets of $A$. Moreover, on each "level" $m$ there are $\binom{n}{m}$ choices for $A$ out of $E$. Finally, any $D$ can be gotten from a series of choices of $A$, which is construed here as first choosing an $m$, and then an $A \subseteq E$ such that $|A| = m$. A det is the product of all these choices; so, there are
$\prod_{m=0}^{n} \left(2^{2^m}\right)^{\binom{n}{m}} = 2^{\sum_{m=0}^{n} \binom{n}{m} 2^m} = 2^{3^n}$
conservative dets over $E$. The last step is an application of the binomial theorem.
•
An elegant proof was suggested to me by Johan van Benthem. PROOF: For any $D$, to ask whether or not $B \in D(A)$ only depends on the way $A$ and $B$ are situated in $E$. But instead of listing the members of $A$ and $B$, we could specify for any element of $E$ whether it belongs to the intersection $A \cap B$ (value 1), to $A - B$ (value 2), to $B - A$ (value 3), or to $E - (A \cup B)$ (value 4). In this way, any mapping in $4^E$ determines a pair $(A,B)$ (this extends the notion 'characteristic mapping'). Now we identify any det $D$ with the set of $(A,B)$-pairs such that $B \in D(A)$. So, without CONS, there are $2^{4^n}$ dets (the result proved in thm 1). The effect of CONS in this reconstruction is to rule out one value: only $A \cap B$, $A - B$ and $E - A$ matter. Thus, analogously, modulo CONS there are $2^{3^n}$ dets.
•
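Theorem 2 can likewise be confirmed by brute force for $n \le 2$. The sketch below is an illustration under assumed encodings, not part of the original argument: subsets of $E$ are $n$-bit masks, and a det is an integer whose bit number `A * 2**n + X` records whether $X \in D(A)$. The function names are hypothetical.

```python
def is_conservative(det, n):
    """Check X in D(A) <=> (X & A) in D(A) for all subsets A, X (as bitmasks)."""
    size = 1 << n
    for A in range(size):
        for X in range(size):
            in_full = (det >> (A * size + X)) & 1
            in_restricted = (det >> (A * size + (X & A))) & 1
            if in_full != in_restricted:
                return False
    return True

def count_conservative(n):
    """Enumerate every det on an n-element domain and count the conservative ones."""
    size = 1 << n
    pairs = size * size                 # 4**n pairs (A, X)
    return sum(is_conservative(det, n) for det in range(1 << pairs))

for n in range(3):
    print(n, count_conservative(n), 2 ** (3 ** n))
```

For $n = 2$ this walks through all $2^{16}$ dets and keeps 512 of them, in line with the three-valued reconstruction given in the second proof.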
The reduction we have reached is considerable. Instead of the $2^{16}$ dets allowed in the case $n = 2$, there are now $2^9 = 512$ possible dets. The problem is that 'super-exponential' functions (i.e. an exponential function of which the exponent is itself an exponential function, whose exponent is linearly dependent on $n$) increase enormously when $n$ grows. Such an outcome is highly unsatisfactory, and we shall inspect other candidates for delimiting the amount involved. A strengthened form of conservativity which typically holds for simple dets is (see Thijsse (1983b), pp. 76-78 for a motivation) CONS+: $D$ is cons+ iff $A$ is the smallest set over which nontrivial $D(A)$ is cons. So, $D \in \mathrm{CONS}^+$ iff for every $A$ such that $D(A) \ne \emptyset$ and $D(A) \ne P(A)$ and for any $B \subseteq E$ which satisfies $X \cap B \in D(A) \Leftrightarrow X \in D(A)$ for every $X \subseteq E$, it holds that $A \subseteq B$. As the reader may verify, this condition is already implied by CONS and the following condition (e.g. use the proofs of theorems 34 & 36 in K&M).
2.2
Isomorphism
Well-known in classical GQ-theory is the condition of isomorphism closure (cf. Mostowski (1957)): ISOM: A det $D$ on $E$ is closed under isomorphism iff for every permutation $f$ of $E$: $X \in D(A) \Leftrightarrow f[X] \in D(f[A])$
Informally, ISOM ensures that no particular individual or relation is favoured: only quantities matter. For simple, extensional dets ISOM seems to hold. The only exceptions appear to be items like few, many and enough. E.g., we may say that (a) "Many demonstrators were arrested" is true but (b) "Many criminals were arrested" is false even while the relevant numbers are the same. We would like to maintain a qualification of Keenan and Stavi (1981), however. What makes them exceptional is not that they contradict ISOM, but that they are clearly intensional. This means that they cannot be treated in an extensional theory like B&C's. For notice that 'demonstrators' and 'criminals' may be different descriptions of the same persons. Still (a) may be true when (b) is false. Used in this way 'many' means: more than expected. A complex det may violate ISOM when it contains lexical material. E.g. the possessive John's is interpreted by 'the ... of John', and interchanging 'John' with, say, 'Peter' can give different truth-values. In order to calculate the effect of ISOM, let us meditate first. As van Benthem (1984a) explains, ISOM means that only the numbers of elements in $A \cap B$, $A - B$, $B - A$ and $E - (A \cup B)$ determine whether $B \in D(A)$. We shall now repeat the formal counterpart of this idea, which can be found in the appendix of Higginbotham & May (1981).
THEOREM 3: Every $D \in \mathrm{ISOM}$ corresponds to a $\delta \in 2^{\Delta_n^4}$, where $\Delta_n^4 = \{(i,j,k,\ell) : i + j + k + \ell = n \text{ and } i,j,k,\ell \in \omega\}$, and vice versa. Moreover, the correspondence $\mu : D \mapsto \delta$, with $\delta(i,j,k,\ell) = 1 \Leftrightarrow \exists A \subseteq E\ \exists B \subseteq E\ [\,|A - B| = i \wedge |A \cap B| = j \wedge |B - A| = k \wedge |E - (A \cup B)| = \ell \wedge B \in D(A)\,]$ [...] $= 2^{\frac{1}{2}(n+1)(n+2)}$
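The correspondence of theorem 3 reduces counting ISOM dets to counting cardinality profiles. The sketch below collects these profiles by enumeration; it assumes the exponent $\binom{n+3}{3}$ that the discussion of extrapolation in section 3 quotes for $|\mathrm{ISOM}|_n$, and the exponent $\frac{1}{2}(n+1)(n+2)$ for the case where CONS is added. The function name `profiles` and the zone numbering are illustrative choices.

```python
from itertools import product
from math import comb

def profiles(n, conservative=False):
    """Collect the cardinality profiles of all pairs (A, B) over n individuals.

    Each individual lies in exactly one zone:
    0: A - B, 1: A & B, 2: B - A, 3: E - (A | B).
    """
    seen = set()
    for zones in product(range(4), repeat=n):
        i, j, k, l = (zones.count(z) for z in range(4))
        if conservative:
            # Under CONS, B - A and E - (A | B) are merged into E - A.
            seen.add((i, j, k + l))
        else:
            seen.add((i, j, k, l))
    return seen

for n in range(5):
    print(n, len(profiles(n)), comb(n + 3, 3),
          len(profiles(n, conservative=True)), (n + 1) * (n + 2) // 2)
```

Every subset of the profile set then describes one det, so the number of dets is 2 raised to the number of profiles.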
When both CONS and ISOM hold, only $|A \cap B|$, $|A - B|$ and $|E - A|$ matter. Call $|A - B| = i$, $|A \cap B| = j$; then $|E - A| = n - i - j$. A det thus corresponds to a relation between natural numbers which we can diagram in a tree:
i + j = 0:                  ±(0,0)
i + j = 1:              ±(1,0)  ±(0,1)
i + j = 2:          ±(2,0)  ±(1,1)  ±(0,2)
   ...
i + j = n:   ±(n,0)  ±(n-1,1)  ...  ±(1,n-1)  ±(0,n)
$+(i,j)$ means: every subset $B$ (with cardinality $j$) of a set $A$ (with cardinality $i + j$) is a member of the det applied to $A$. This representation was introduced in van Benthem (1984a) (see note 3) and will be used frequently in section 2.4. It can be applied to give an alternative proof of theorem 5.
2.3. Monotonicity constraints: some unsolved problems.
Monotonicity in its various forms has turned out to be relevant to both linguistic and logical applications. 'Monotone decreasing' quantifiers trigger negative polarity items (Ladusaw (1979) and Zwarts (1981)), and simple NPs are always interpreted by 'continuous' quantifiers (B&C (1981), Thijsse (1983a)). Monotonicity also plays a role in definability results (cf. 2.4 below). A quantifier $Q$ is called monotone increasing (+mon) iff $X \in Q \ \&\ X \subseteq Y \Rightarrow Y \in Q$, and monotone decreasing (-mon) iff $X \in Q \ \&\ Y \subseteq X \Rightarrow Y \in Q$. $Q$ is continuous (cont) iff $X \in Q \ \&\ Z \in Q \ \&\ X \subseteq Y \subseteq Z \Rightarrow Y \in Q$. This induces some classes of dets: ±MON, MON and CONT. By analogy, we also have monotonicity with respect to the argument place: persistence (+pers) and anti-persistence (-pers), so:
+MON    D is +mon    if  $\forall A \subseteq E$: $D(A)$ is +mon
-MON    D is -mon    if  $\forall A \subseteq E$: $D(A)$ is -mon
MON     D is mon     if  D is +mon or D is -mon
CONT    D is cont    if  $\forall A \subseteq E$: $D(A)$ is cont
+PERS   D is +pers   if  $\forall A \subseteq B \subseteq E$: $D(A) \subseteq D(B)$
-PERS   D is -pers   if  $\forall A \subseteq B \subseteq E$: $D(B) \subseteq D(A)$
PERS    D is pers    if  D is +pers or D is -pers
In order to compare their strength with other conditions, we would like to calculate the effect of (±)MON and CONT even in the absence of ISOM. However, this enterprise proves to be reducible to solving a longstanding mathematical problem. Therefore, we shall merely give some approximations and calculate some values for small $n$. First, let us call the number of +mon Qs over a domain with size $n$: $q_n$. The following propositions show that ±MON can be determined once $q_n$ is known.
THEOREM 6: $|{+}\mathrm{MON}|_n = |{-}\mathrm{MON}|_n = |{+}\mathrm{PERS}|_n = |{-}\mathrm{PERS}|_n = q_n^{2^n}$
PROOF: (+mon) Every +mon det maps $2^n$ subsets of the domain to $q_n$ sets. (-mon) The complement-operation $'$ defined by $D'(A) = P(E) - D(A)$ gives a one-to-one correspondence with +MON. Finally notice that $^\circ$ defined by $B \in D^\circ(A) \Leftrightarrow A \in D(B)$ is a one-to-one correspondence between +MON and +PERS and between -MON and -PERS. •
THEOREM 7: $|\mathrm{MON}|_n = |\mathrm{PERS}|_n = 2 \cdot q_n^{2^n} - 2^{2^n}$
PROOF: (mon) Every mon $D$ is either +mon or -mon, so counting $|\mathrm{MON}|_n$, we can add $|{+}\mathrm{MON}|_n$ and $|{-}\mathrm{MON}|_n$ and then subtract $|{+}\mathrm{MON} \cap {-}\mathrm{MON}|_n$, which is counted twice. Now if $D$ is both +mon and -mon, $D(A)$ is +mon and -mon for every $A \subseteq E$, but then $D(A) = \emptyset$ or $D(A) = P(E)$, the so-called trivial quantifiers. Consequently, $|\mathrm{MON}|_n = |{+}\mathrm{MON}|_n + |{-}\mathrm{MON}|_n - |{+}\mathrm{MON} \cap {-}\mathrm{MON}|_n = 2 \cdot q_n^{2^n} - 2^{2^n}$. (pers) The map $^\circ$ defined in the proof of theorem 6 again is one-to-one between MON and PERS. •
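Theorems 6 and 7 can be checked directly for $n \le 2$. The following sketch is a brute-force illustration under assumed encodings (subsets as $n$-bit masks, a quantifier as a bitmask over subsets, a det as a tuple of quantifiers indexed by the argument set); none of the function names come from the text.

```python
from itertools import product

def upward_closed(Q, n):
    """+mon: every one-element extension of a member is again a member."""
    return all((Q >> (X | (1 << e))) & 1
               for X in range(1 << n) if (Q >> X) & 1
               for e in range(n))

def downward_closed(Q, n):
    """-mon: removing one element from a member yields a member."""
    return all((Q >> (X & ~(1 << e))) & 1
               for X in range(1 << n) if (Q >> X) & 1
               for e in range(n))

def counts(n):
    """Return (q_n, |+MON|, |-MON|, |MON|, |+PERS|) for an n-element domain."""
    size = 1 << n
    quantifiers = range(1 << size)
    up = {Q for Q in quantifiers if upward_closed(Q, n)}
    down = {Q for Q in quantifiers if downward_closed(Q, n)}
    nested = [(A, B) for A in range(size) for B in range(size) if (A & B) == A]
    plus_mon = minus_mon = mon = plus_pers = 0
    for D in product(quantifiers, repeat=size):   # D[A] encodes D(A)
        is_up = all(Q in up for Q in D)
        is_down = all(Q in down for Q in D)
        plus_mon += is_up
        minus_mon += is_down
        mon += is_up or is_down
        # +pers: A subset of B implies D(A) subset of D(B)
        plus_pers += all((D[A] & ~D[B]) == 0 for A, B in nested)
    return len(up), plus_mon, minus_mon, mon, plus_pers

for n in range(3):
    print(n, counts(n))
```

For $n = 2$ the enumeration visits all 65 536 dets; larger domains are out of reach for this naive approach, which is exactly why the closed expressions in terms of $q_n$ are useful.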
More or less the same reduction is obtained by considering the additional effect of CONS.
THEOREM 8: $|\mathrm{CONS} \cap {+}\mathrm{MON}|_n = |\mathrm{CONS} \cap {-}\mathrm{MON}|_n = \prod_{i=0}^{n} q_i^{\binom{n}{i}}$
PROOF: (cons, +mon) We can proceed as in the first proof of theorem 2. Let $D \in \mathrm{CONS} \cap {+}\mathrm{MON}$. Then for every $A \subseteq E$: $D(A)$ is uniquely determined by the restricted quantifier $\{X \cap A : X \in D(A)\}$ and vice versa, and the restricted counterpart is +mon on $A$ precisely when $D(A)$ is +mon on $E$. Now for every $i$ with $0 \le i \le n$ there are $\binom{n}{i}$ subsets of $E$ with cardinality $i$, each of them serving as the domain for $q_i$ +mon quantifiers. Since the values $D$ takes on different $A$ are totally unrelated, the number of choices for $D$ is $q_0^{\binom{n}{0}} \cdot q_1^{\binom{n}{1}} \cdots q_n^{\binom{n}{n}} = \prod_{i=0}^{n} q_i^{\binom{n}{i}}$.
•
The derived formula has a direct consequence which parallels theorem 7. It suffices to notice that the trivial quantifiers $\emptyset$ and $P(E)$ are conservative over every subset of $E$:
THEOREM 9: $|\mathrm{CONS} \cap \mathrm{MON}|_n = 2 \prod_{i=0}^{n} q_i^{\binom{n}{i}} - 2^{2^n}$
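As a worked instance of theorems 8 and 9, take $n = 2$ and use the values $q_0 = 2$, $q_1 = 3$ and $q_2 = 6$ that are derived later in this section. The computation is only an illustration of the two formulas:

```latex
\begin{align*}
|\mathrm{CONS} \cap {+}\mathrm{MON}|_2
  &= q_0^{\binom{2}{0}} \cdot q_1^{\binom{2}{1}} \cdot q_2^{\binom{2}{2}}
   = 2 \cdot 3^2 \cdot 6 = 108, \\
|\mathrm{CONS} \cap \mathrm{MON}|_2
  &= 2 \cdot 108 - 2^{2^2} = 216 - 16 = 200.
\end{align*}
```

For comparison, theorem 2 gives $|\mathrm{CONS}|_2 = 2^9 = 512$ and theorem 6 gives $|{+}\mathrm{MON}|_2 = 6^4 = 1296$.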
It would be more satisfactory to have an explicit formula for $q_n$. This problem was already raised by Dedekind in 1897 and the solution is still unknown. In order to calculate some values of $q_n$, we shall rephrase the problem. Every +mon $Q$ on $E$ can be represented by its inclusion-minimal elements. Define $K_Q$ by $\{X \subseteq E : Q \cap P(X) = \{X\}\}$. Then $K_Q$ is an anti-chain: any two different members of $K_Q$ are incomparable with respect to $\subseteq$. In other words, for every $X, Y \in K_Q$: ($X \subseteq Y$ or $Y \subseteq X$) $\Rightarrow X = Y$. Therefore we can find $q_n$ by counting the number of anti-chains in $P(\{1, 2, 3, \ldots, n\})$. For $n = 0$ and $n = 1$ we can simply give the +mon quantifiers. For $n = 0$, $E = \emptyset$, so $\emptyset$ and $\{\emptyset\}$ are the only Qs and both are monotone increasing. For $n = 1$, three of the four Qs are +mon: $\emptyset$, $\{E\}$ and $\{\emptyset, E\}$. This shows $q_0 = 2$ and $q_1 = 3$. If $n = 2$, let $E$ be $\{a,b\}$ and call $\{a\} = A$ and $\{b\} = B$. We diagram the subsets of $E$ and their possible inclusion relation and get the anti-chains in $P(E)$ as follows.
Proceeding from right to left through the picture gives: $K_1 = \emptyset$, $K_2 = \{E\}$, $K_3 = \{A\}$, $K_4 = \{B\}$, $K_5 = \{A,B\}$, $K_6 = \{\emptyset\}$; so $q_2 = 6$. The same method also gives $q_3$ without the need for checking $2^{2^3} = 256$ quantifiers to see whether they are +mon or not. Let $E = \{a,b,c\}$, $A = \{a\}$, $B = \{b\}$, $C = \{c\}$, $D = \{a,b\}$, $F = \{a,c\}$, $G = \{b,c\}$. The inclusion-pattern is depicted below:
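The anti-chains in $P(E)$ are now:
$K_1 = \emptyset$      $K_5 = \{G\}$      $K_9 = \{D,F,G\}$    $K_{13} = \{A,B\}$      $K_{17} = \{B,F\}$
$K_2 = \{E\}$    $K_6 = \{D,F\}$    $K_{10} = \{B\}$       $K_{14} = \{B,C\}$      $K_{18} = \{A,G\}$
$K_3 = \{D\}$    $K_7 = \{D,G\}$    $K_{11} = \{A\}$       $K_{15} = \{A,C\}$      $K_{19} = \{C,D\}$
$K_4 = \{F\}$    $K_8 = \{F,G\}$    $K_{12} = \{C\}$       $K_{16} = \{A,B,C\}$    $K_{20} = \{\emptyset\}$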
This shows $q_3 = 20$, and similarly $q_4 = 168$ (Dedekind 1897) and $q_5 = 7581$ (R. Church 1940). Greater values of $n$ seem to require computer programs (use the lower bound expressed by thm 10); but M. Ward found $q_6 = 7\,828\,354$ by hand in 1946! Following R. Church (1965) most programs devised for this problem yield $q_7 = 2\,414\,682\,040\,998$ (see note 4). However, we may ask whether it is really necessary to know explicit numbers. For most applications a good approximation will do just as well: estimating $q_n$ answers questions concerning the relative strength and order of growth of MON. Inspection of the inclusion-diagrams made for $q_2$ and $q_3$ shows that most anti-chains originate from sets of equal size. We specify this idea in
THEOREM 10: $q_n \ge \sum_{m=0}^{n} 2^{\binom{n}{m}} - n \ge$ [...] $\ge 2^{\binom{n}{[\frac{1}{2}n]}}$
($\pi_n$ is the parity of $n$, i.e. $\pi_n = 0$ for even $n$, and 1 for odd $n$)
PROOF: Different subsets of equal size are incomparable. There are $\binom{n}{m}$ subsets of $E$ of size $m$, and each non-empty subset of the set of subsets of size $m$ constitutes an anti-chain for $m > 0$, and all these anti-chains are different. Finally, $\emptyset$ and $\{\emptyset\}$ are always anti-chains too, so
$q_n \ge 2 + \sum_{m=1}^{n} \left(2^{\binom{n}{m}} - 1\right) = 2 - n + \sum_{m=1}^{n} 2^{\binom{n}{m}} = \sum_{m=0}^{n} 2^{\binom{n}{m}} - n$.
Now $\binom{n}{m}$ is maximal for $m = [\frac{1}{2}n]$, the greatest natural number not exceeding $\frac{1}{2}n$, but for odd $n$ this maximum is reached twice (for $[\frac{1}{2}n]$ and $[\frac{1}{2}n] + 1$). •
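The small values of $q_n$ quoted above, and the first inequality of theorem 10, can be reproduced by brute force. The sketch below counts upward-closed families directly rather than anti-chains (the two counts coincide, by the representation via inclusion-minimal elements); the bitmask encoding and the function names are assumptions made for illustration.

```python
from math import comb

def count_monotone_quantifiers(n):
    """Count +mon quantifiers on an n-element domain by brute force."""
    size = 1 << n
    # one_up[X]: bitmask (over subsets) of X together with all its
    # one-element extensions; Q is +mon iff it contains one_up[X]
    # for each of its members X.
    one_up = []
    for X in range(size):
        mask = 0
        for e in range(n):
            mask |= 1 << (X | (1 << e))
        one_up.append(mask)
    count = 0
    for Q in range(1 << size):
        if all((Q & one_up[X]) == one_up[X]
               for X in range(size) if (Q >> X) & 1):
            count += 1
    return count

def theorem10_lower_bound(n):
    """The sum over m of 2**C(n, m), minus n."""
    return sum(2 ** comb(n, m) for m in range(n + 1)) - n

for n in range(5):
    print(n, count_monotone_quantifiers(n), theorem10_lower_bound(n))
```

The enumeration stops being practical beyond $n = 4$ (for $n = 5$ there are $2^{32}$ candidate families), which illustrates why the larger values required dedicated methods.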
Sperner has found a bound for the length of the anti-chains. He proves that an anti-chain has maximal length once it contains only sets of size $[\frac{1}{2}n]$; then this maximum is clearly $\binom{n}{[\frac{1}{2}n]}$. The same expression turns up again in an upper bound found by Hansel in 1966 (see note 5).
THEOREM 11: $q_n \le 3^{\binom{n}{[\frac{1}{2}n]}}$
For a proof, see Greene & Kleitman (1978) or Thijsse (1983b). Theorems 10 and 11 imply that $2^{\binom{n}{[\frac{1}{2}n]}} \le q_n \le 3^{\binom{n}{[\frac{1}{2}n]}}$. We may wonder which bound is more exact. This appears to be the lower bound: although = does not hold for $n > 0$ (cf. theorem 10), it is approached in the following sense: $\lim_{n \to \infty} \frac{\log_2 q_n}{\binom{n}{[\frac{1}{2}n]}} = 1$ (see Kleitman 1969). The asymptotic behaviour of $\log_2 q_n$ is studied intensively, but will not concern us here (see note 6). Finally, just as we proved in 2.2 that ISOM is effectively stronger than CONS, we can use thms 6 & 10 to show that for $n \ge 4$, $|\mathrm{CONS}|_n < |{+}\mathrm{MON}|_n$.
2.4 Monotonicity conditions on logical determiners.
In this section we inspect the effect of the monotonicity constraints relative to the conditions CONS and ISOM. Dets for which both CONS and ISOM hold are sometimes called logical. Logical dets can be represented in the tree-form introduced in 2.2. With this representation many problems concerning logical dets have been solved. To see what +MON and -MON mean in terms of trees, one has to look at the smallest jumps: let $B_1 \in D(A)$ and add one element to $B_1$; this forms $B_2$. Assume $D$ is +mon: then $B_2 \in D(A)$. When the added object is not in $A$, nothing happens; when it is, we have $|A - B_2| = |A - B_1| - 1$ and $|A \cap B_2| = |A \cap B_1| + 1$. Thus $+(i,j)$ implies $+(i-1, j+1)$. So, +mon translates into: a plus stretches to the right in the same row. Analogously, when $D$ is -mon, + stretches out to the left. For the $m$th row, there are $m+1$ elements, and $D$ being +mon gives $m+2$ options for this row. E.g., when $m = 3$ we have the possibilities: - - - -, - - - +, - - + +, - + + + and + + + +. Since the patterns of the rows are independent, the total number of +MON logical dets is the product of the numbers for the separate rows, i.e. $2 \cdot 3 \cdot 4 \cdot 5 \cdots (n+2) = (n+2)!$ This proves
THEOREM 12: $|\mathrm{CONS} \cap \mathrm{ISOM} \cap {+}\mathrm{MON}|_n = |\mathrm{CONS} \cap \mathrm{ISOM} \cap {-}\mathrm{MON}|_n = (n+2)!$
From this we can derive, analogously to theorems 7 & 9:
THEOREM 13: $|\mathrm{CONS} \cap \mathrm{ISOM} \cap \mathrm{MON}|_n = 2(n+2)! - 2^{n+1}$
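The row-by-row argument behind theorems 12 and 13 can be replayed mechanically. In the sketch below a tree of depth $n$ is modelled as a tuple of boolean rows, row $m$ listing the signs of $(m,0), (m-1,1), \ldots, (0,m)$ from left to right; this encoding and all function names are illustrative assumptions.

```python
from itertools import product
from math import factorial

def trees(n):
    """All +/- trees of depth n: one boolean row of length m + 1 for m = 0..n."""
    rows = [list(product([False, True], repeat=m + 1)) for m in range(n + 1)]
    return product(*rows)

def row_plus_mon(row):
    """+mon: a plus stretches to the right, so no '+' is followed by a '-'."""
    return all(b or not a for a, b in zip(row, row[1:]))

def row_minus_mon(row):
    """-mon: a plus stretches to the left, so no '-' is followed by a '+'."""
    return all(a or not b for a, b in zip(row, row[1:]))

def count_monotone_trees(n):
    """Return (|+MON|, |-MON|, |MON|) among logical dets on n individuals."""
    plus = minus = mon = 0
    for tree in trees(n):
        is_plus = all(row_plus_mon(r) for r in tree)
        is_minus = all(row_minus_mon(r) for r in tree)
        plus += is_plus
        minus += is_minus
        mon += is_plus or is_minus
    return plus, minus, mon

for n in range(5):
    print(n, count_monotone_trees(n), factorial(n + 2),
          2 * factorial(n + 2) - 2 ** (n + 1))
```

The subtraction of $2^{n+1}$ corresponds to the trees whose rows are each entirely + or entirely -, which are counted under both +MON and -MON.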
Despite the central position of CONT, witness the earlier-mentioned universal concerning simple NPs, no results were mentioned in the preceding section which would show its numerical effect in isolation or together with CONS. In the case of continuous logical dets we can say more. Recall that $D$ is cont iff for every $A, B_1, B_2, B_3$ such that $B_1 \subseteq B_2 \subseteq B_3$: $B_1 \in D(A) \ \&\ B_3 \in D(A) \Rightarrow B_2 \in D(A)$. In geometric terms: every row of the tree for $D$ contains only +'s in a closed sequence. We can find - - + + + -, but not - + - + + -.
THEOREM 14: $|\mathrm{CONS} \cap \mathrm{ISOM} \cap \mathrm{CONT}|_n = \prod_{m=1}^{n+1} \left(\frac{1}{2}m^2 + \frac{1}{2}m + 1\right)$
PROOF: There are $n + 1$ rows, and for the $(m-1)$th row ($1 \le m \le n + 1$), we have the following possibilities: (i) it contains no + (1 configuration), (ii) it has just one +: $m$ configurations, or (iii) it contains more than one +: $\binom{m}{2} = \frac{1}{2}m(m-1)$ configurations. In total, there are $1 + m + \frac{1}{2}m^2 - \frac{1}{2}m = \frac{1}{2}m^2 + \frac{1}{2}m + 1$ options for row $m - 1$, and the rows are again independently chosen. •
We can simplify this formula by considering the additional effect of the condition VAR: $D$ satisfies variety iff $\forall A \subseteq E$: $A \ne \emptyset \Rightarrow D(A) \ne P(E) \ \&\ D(A) \ne \emptyset$
THEOREM 15: $|\mathrm{CONS} \cap \mathrm{ISOM} \cap \mathrm{CONT} \cap \mathrm{VAR}|_n = \frac{(n+3)!\, n!}{3 \cdot 2^n}$
PROOF: Compare the last proof. The two trivial arrangements excluded for $m \ge 2$ are both continuous. Hence we have
$|\mathrm{CONS} \cap \mathrm{ISOM} \cap \mathrm{CONT} \cap \mathrm{VAR}|_n = 2 \cdot \prod_{m=2}^{n+1} \left(\frac{1}{2}m^2 + \frac{1}{2}m + 1 - 2\right) = 2 \cdot \prod_{m=2}^{n+1} \frac{1}{2}(m-1)(m+2) = \frac{2}{2^n} \prod_{m=2}^{n+1} (m+2) \prod_{m=2}^{n+1} (m-1) = \frac{2}{2^n} \cdot \frac{(n+3)!}{3!} \cdot n! = \frac{n!\,(n+3)!}{3 \cdot 2^n}$
•
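The per-row counts used in theorems 14 and 15 can be obtained by enumeration and multiplied, relying on the independence of the rows noted in the proofs. The sketch below assumes rows are boolean tuples; the helper names are hypothetical.

```python
from itertools import product
from math import factorial, prod

def is_convex(row):
    """True when the '+' entries form one closed block (or there are none)."""
    plus = [idx for idx, sign in enumerate(row) if sign]
    return not plus or plus[-1] - plus[0] + 1 == len(plus)

def cont_rows(length, variety=False):
    """Number of continuous rows of the given length; with variety=True the
    all-'+' and all-'-' rows are excluded."""
    rows = [r for r in product([False, True], repeat=length) if is_convex(r)]
    if variety:
        rows = [r for r in rows if any(r) and not all(r)]
    return len(rows)

def count_cont(n):
    """Rows 0..n have lengths 1..n+1 and are chosen independently."""
    return prod(cont_rows(m) for m in range(1, n + 2))

def count_cont_var(n):
    """Row 0 (empty A) is exempt from VAR; the other rows must be nontrivial."""
    return cont_rows(1) * prod(cont_rows(m, variety=True) for m in range(2, n + 2))

for n in range(5):
    print(n, count_cont(n), count_cont_var(n),
          factorial(n + 3) * factorial(n) // (3 * 2 ** n))
```

The factor `cont_rows(1)` in `count_cont_var` is the leading 2 in the proof of theorem 15: the single position of the top row may be + or -.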
Although VAR eliminates just two possible values for every non-empty argument set, many dets do not satisfy this condition. It holds for dets like some (unstressed, phonetically [səm]), all, no, most (see e.g. Thijsse (1983a) for definitions), but cannot be universal: witness (at least) two, (precisely) two, and both (when given the full Russellian interpretation both$(A) = \{X \subseteq E : A \subseteq X \ \&\ |A| = 2\}$). However, VAR can be an interesting class restriction: together with PERS, CONS & ISOM it defines the four classical dets of the square of opposition. To see what PERS means in terms of trees, recall that $D$ is +pers (-pers) when $A \subseteq C$ implies $D(A) \subseteq D(C)$ ($D(C) \subseteq D(A)$ respectively). First assume $D$ is +pers. Let us have $+(i,j)$ in the tree of $D$; then there are $A$ and $B$ such that $|A - B| = i$, $|A \cap B| = j$, and $B \in D(A)$. Now $A$ can be extended in two ways: with an element $a_1 \in B$ to $A_1$ and with an element $a_2 \notin B$ to $A_2$:
Since $D$ is +pers, $B \in D(A_1)$ and $B \in D(A_2)$. Then $+(i, j+1)$ and $+(i+1, j)$, because $|A_1 - B| = i$, $|A_1 \cap B| = j + 1$, $|A_2 - B| = i + 1$ and $|A_2 \cap B| = j$. This shows that + is transferred in a +pers tree as
   +
  / \
 +   +
and likewise - in a -pers tree as
   -
  / \
 -   -
We are ready for the next result (see note 7):
THEOREM 16: $|\mathrm{CONS} \cap \mathrm{ISOM} \cap \mathrm{VAR} \cap \mathrm{PERS}|_n = 4$ (when $n > 0$)
PROOF: Let $D$ be a det that satisfies the requirements. By CONS and ISOM we can restrict ourselves to tree-representations. PERS forces us to distinguish: (i) $D$ is +pers. Then we cannot have $+(0,0)$, since this would produce
   +
  + +
where $+(1,0)$ and $+(0,1)$ violate VAR. Therefore we can only have $-(0,0)$. For the second row only (+ -) and (- +) satisfy VAR, which yields
   -            -
  + -    and   - +
for $n = 1$. For $n > 1$, +pers fills the downward tree completely, yielding the patterns of not all and some respectively. (ii) $D$ is -pers. By a similar argument, one obtains just the patterns corresponding to all and no. •
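Theorem 16 is small enough to verify exhaustively over all trees of modest depth. The sketch below encodes a tree as a tuple of boolean rows, with `tree[m][j]` the sign at position $(m - j, j)$; this encoding and the function names are assumptions for illustration only.

```python
from itertools import product

def trees(n):
    """All +/- trees of depth n: one boolean row of length m + 1 for m = 0..n."""
    rows = [list(product([False, True], repeat=m + 1)) for m in range(n + 1)]
    return product(*rows)

def is_plus_pers(tree):
    """+(i,j) implies +(i+1,j) and +(i,j+1): a plus spreads to both children."""
    return all(tree[m + 1][j] and tree[m + 1][j + 1]
               for m in range(len(tree) - 1)
               for j in range(m + 1) if tree[m][j])

def is_minus_pers(tree):
    """-(i,j) implies -(i+1,j) and -(i,j+1): a minus spreads to both children."""
    return all(not tree[m + 1][j] and not tree[m + 1][j + 1]
               for m in range(len(tree) - 1)
               for j in range(m + 1) if not tree[m][j])

def has_variety(tree):
    """Every row below the top contains at least one '+' and one '-'."""
    return all(any(row) and not all(row) for row in tree[1:])

def classical_dets(n):
    """Trees satisfying VAR and PERS (CONS and ISOM are built into the trees)."""
    return [t for t in trees(n)
            if has_variety(t) and (is_plus_pers(t) or is_minus_pers(t))]

for n in range(1, 5):
    print(n, len(classical_dets(n)))
```

Printing the surviving trees for $n = 1$ shows the four row patterns discussed in the proof, two with $-(0,0)$ and two with $+(0,0)$.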
Replacing VAR by CONT and restricting attention to + PERS we get the following elegant result:
THEOREM 17: $|\mathrm{CONS} \cap \mathrm{ISOM} \cap \mathrm{CONT} \cap {+}\mathrm{PERS}|_n = F(2n)$
Here, $F(n)$ is the Fibonacci sequence with $F(0) = 2$ and $F(1) = 3$ (see note 8).
PROOF (see note 9): Abbreviate the number $|\mathrm{CONS} \cap \mathrm{ISOM} \cap \mathrm{CONT} \cap {+}\mathrm{PERS}|_n$ by $d_n$. First we derive a recursive expression. Focus on the last ($n$th) row of the tree. When the leftmost position of this row is -, the lefthand side of the tree must also be - (by +pers) and the positions that are yet undetermined can be filled in $d_{n-1}$ ways. When this $(n,0)$ position is +, call the rightmost + in the last row $(n-i, i)$. By CONT, this row is then determined: + up to $(n-i, i)$ and then -. Again +pers assures that everything diagonally above a - is also -, and only a triangle remains undetermined. Since the basis of this triangle contains $i$ positions, we have $d_{i-1}$ ways to fill this triangle. In the case $i = 0$ the whole tree is determined (- except for $+(n,0)$), which contributes one configuration. Now $i$ runs from 0 to $n$, and this yields $d_n = d_{n-1} + 1 + d_0 + d_1 + \ldots + d_{n-1}$ and $d_0 = 2$. Now define $F(n)$ by (i) $F(2n) = d_n$ for every $n \ge 0$, (ii) $F(2n+1) = F(2n) + F(2n-1)$ for every $n \ge 1$, (iii) $F(1) = 3$. $F(k)$ is a Fibonacci-sequence when $F(k+1) = F(k) + F(k-1)$. When $k + 1$ is odd, this holds by (ii), and when $k + 1$ is even, we notice that $F(2n-1) = 1 + d_0 + d_1 + \ldots + d_{n-2} + d_{n-1}$ (proof by induction) and consequently $F(2n) = d_n = d_{n-1} + (1 + d_0 + \ldots + d_{n-1}) = F(2n-2) + F(2n-1)$. •
Some values for $d_n$ are $d_1 = 5$, $d_2 = 13$, $d_3 = 34$, $d_4 = 89$ etc. Unlike most results derived so far, we cannot simply change +PERS into -PERS in thm 17 and get the same numerical expression. One reason is that the complement-operation used in thm 6 is not useful here: the complement of a continuous det need not be continuous. E.g. we have $|\mathrm{CONS} \cap \mathrm{ISOM} \cap \mathrm{CONT} \cap {-}\mathrm{PERS}|_3 = 35$. Since the technical difficulties in this case are not yet overcome, we shall not pursue this matter further here.
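The recurrence in the proof of theorem 17 and its Fibonacci form can be compared with a direct count of continuous, +persistent trees. The following sketch is only a numerical check under an assumed encoding (`tree[m][j]` is the sign at $(m - j, j)$); the function names are illustrative.

```python
from itertools import product

def d_recursive(n_max):
    """d_0 = 2 and d_n = d_{n-1} + 1 + d_0 + ... + d_{n-1}."""
    d = [2]
    for n in range(1, n_max + 1):
        d.append(d[n - 1] + 1 + sum(d))   # sum(d) covers d_0 .. d_{n-1}
    return d

def fibonacci(k_max):
    """F(0) = 2, F(1) = 3, F(k+1) = F(k) + F(k-1)."""
    f = [2, 3]
    while len(f) <= k_max:
        f.append(f[-1] + f[-2])
    return f

def is_convex(row):
    """True when the '+' entries form one closed block (or there are none)."""
    plus = [idx for idx, sign in enumerate(row) if sign]
    return not plus or plus[-1] - plus[0] + 1 == len(plus)

def d_brute_force(n):
    """Count trees of depth n whose rows are continuous and which are +pers."""
    rows = [[r for r in product([False, True], repeat=m + 1) if is_convex(r)]
            for m in range(n + 1)]
    count = 0
    for tree in product(*rows):
        if all(tree[m + 1][j] and tree[m + 1][j + 1]
               for m in range(n) for j in range(m + 1) if tree[m][j]):
            count += 1
    return count

print(d_recursive(4))
print([d_brute_force(n) for n in range(5)])
print(fibonacci(8)[::2])
```

All three lines list the same values for $n = 0, \ldots, 4$, so the recursive expression, the direct count and the even-indexed Fibonacci numbers agree on this range.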
3. Infinite models.
Despite the success of the application of finite models to the interpretation of natural language, there are reasons to consider infinite domains seriously. First of all, infinite models are important for other areas (like mathematics and logic) and the following results may be useful there as well. Apart from this, natural language seems to require infinite models in a number of constructions. There is the word infinite: although it is often used in the meaning of 'very many', one could argue that the semantics is still the mathematical sense of the word. Moreover, mass terms and temporal expressions seem to refer to dense structures (at least in the physical sense of these expressions). However, for this purpose the model has to be enriched with a measure for the length of intervals and this measure often replaces the cardinality of the sets involved (cf. van Benthem (1983b) and note 8 in Thijsse (1983a)). Finally, infinite models allow for a refined interpretation of some categories. E.g. we can now express 'this is the general rule' by 'this is true except for finitely many exceptions'. We might employ this interpretation to trigger the difference between normal and generic use of NPs. Let us turn now from speculations to hard figures. In the sequel, $c$ stands for an infinite cardinality. Two sets have the same cardinality when there is a one-to-one correspondence between them. The cardinality of the set of integers is called $\aleph_0$: the smallest infinite cardinal number. To be fully general, we adopt the axiom of choice (AC) and the generalized continuum hypothesis (GCH) in some cases. GCH guarantees that $2^{\aleph_\alpha} = \aleph_{\alpha+1}$ and AC that $c \cdot c = c$ for infinite $c$ (see note 10). The last fact is used in the following propositions.
THEOREM 18 (AC): $|\mathrm{DET}|_c = 2^{2^c}$
PROOF (cf. theorem 1): $|\mathrm{DET}|_c = 2^{2^c \cdot 2^c} = 2^{2^c}$, by the above fact.
•
Since, roughly speaking, only the exponential order is of importance to infinite cardinalities, the next drastic result should not be a surprise, but at first sight it probably is. Inspection of the conditions studied in sections 2.1 and 2.3 above shows that
THEOREM 19 (AC): $|\mathrm{CONS}|_c = |(\pm)\mathrm{MON}|_c = |(\pm)\mathrm{PERS}|_c = |\mathrm{CONS} \cap (\pm)\mathrm{MON}|_c = |\mathrm{CONS} \cap \mathrm{CONT}|_c = 2^{2^c}$
PROOF: As in theorems 7 & 9 we use the fact that trivial dets are conservative, +monotone and -monotone, and hence continuous. So, let $\Sigma$ be any combination of the conditions CONS, CONT, +MON, -MON; then $|\bigcap\Sigma|_c \ge |\{\emptyset, P(E)\}^{P(E)}| = 2^{2^c}$, and by theorem 18 also $|\bigcap\Sigma|_c \le 2^{2^c}$; the well-known Cantor/Bernstein theorem then implies $|\bigcap\Sigma|_c = 2^{2^c}$. Finally, the reversal $^\circ$ defined in the proof of theorem 6 is one-to-one, so $|(\pm)\mathrm{PERS}|_c = |(\pm)\mathrm{MON}|_c$.
•
This result is easily misinterpreted. It does not show that conditions have no effect in the infinite realm. It can be proven that the set of nonconservative dets, of discontinuous dets (etc.) have again cardinality $2^{2^c}$, so still many dets are ruled out by the conditions stated in theorem 19. On the other hand, the situation is now less informative than with finite models: we cannot simply determine the strength of conditions under inspection by comparing the numbers of dets they allow. However, ISOM is still numerically effective.
THEOREM 20: $|\mathrm{ISOM}|_{\aleph_0} = |\mathrm{CONS} \cap \mathrm{ISOM}|_{\aleph_0} = |\mathrm{CONS} \cap \mathrm{ISOM} \cap \mathrm{CONT}|_{\aleph_0} = |\mathrm{CONS} \cap \mathrm{ISOM} \cap (\pm)\mathrm{MON}|_{\aleph_0} = 2^{\aleph_0}$
PROOF: Since $|\mathrm{CONS} \cap \mathrm{ISOM} \cap {\pm}\mathrm{MON}|_c \le |\mathrm{CONS} \cap \mathrm{ISOM} \cap \mathrm{MON}|_c \le |\mathrm{CONS} \cap \mathrm{ISOM} \cap \mathrm{CONT}|_c \le |\mathrm{CONS} \cap \mathrm{ISOM}|_c \le |\mathrm{ISOM}|_c$, by the Cantor/Bernstein theorem it suffices to show that $|\mathrm{ISOM}|_{\aleph_0} \le 2^{\aleph_0}$ and $|\mathrm{CONS} \cap \mathrm{ISOM} \cap {\pm}\mathrm{MON}|_{\aleph_0} \ge 2^{\aleph_0}$. To prove the first inequality, notice that theorem 3 still holds with $\aleph_0$ substituted for $n$. Each of the variables $i, j, k, \ell$ in the equation $i + j + k + \ell = \aleph_0$ can take $\aleph_0$ values, so the equation has $\aleph_0^4 = \aleph_0$ solutions. According to theorem 3, every subset of such quadruples describes one det in ISOM and vice versa, so $|\mathrm{ISOM}|_{\aleph_0} = 2^{\aleph_0}$. For the second inequality we use the trivial dets once again (cf. theorem 19). The trivial dets in ISOM also belong to CONS and (±)MON. By CONS we can omit $\ell$ in the equation above. Put $h = i + j$. Then $h + k = \aleph_0$ has $\aleph_0$ solutions, and every subset of solutions $(h, k)$ determines a trivial det in ISOM: when $|E| = \aleph_0$, $|A| = h$ and $|E - A| = k$, $+(h, k) \Leftrightarrow D(A) = P(E)$ and $-(h, k) \Leftrightarrow D(A) = \emptyset$. Therefore, the number of trivial logical dets is $2^{\aleph_0}$ and consequently $|\mathrm{CONS} \cap \mathrm{ISOM} \cap (\pm)\mathrm{MON}|_{\aleph_0} \ge 2^{\aleph_0}$.
•
This theorem directly generalizes to the case where there are denumerably many infinite rows:
THEOREM 21 (AC): If $|\alpha| \le \aleph_0$ then $|\mathrm{ISOM}|_{\aleph_\alpha} = |\mathrm{CONS} \cap \mathrm{ISOM} \cap (\pm)\mathrm{MON}|_{\aleph_\alpha} = 2^{\aleph_0}$
The last theorem again shows the immense strength of ISOM: it does even more than reduce super-exponential to exponential growth. What we have found so far are two types of collapse, and this does not reveal much of the relative strength of conditions. However, there are other types of growth functions too. For example, confine attention to countable models. When $|E| = \aleph_0$, GCH leaves only the possible cardinalities $\aleph_2$, $\aleph_1$, $\aleph_0$, and the finite numbers (constants) for admissible sets of determiner denotations. In fact, the latter two possibilities are actually realized by combinations of natural conditions (see note 11). We can only expect a finite number when this number is a constant on the finite domains. This was exemplified in theorem 16 and we may wonder whether this result still holds in the infinite realm. In that case we might truly say that the quadruple CONS, ISOM, PERS and VAR defines the classical determiners all, some, not all and no. For this purpose, we need the following lemma.
LEMMA: Let $D$ be +pers. Then modulo CONS, ISOM: $i' \ge i$, $j' \ge j$, $k \le c$, $i$ and $j$ finite, $i' + j' + k = c$, $+(i, j, c)$ implies $+(i', j', k)$. For -pers, substitute - for +.
THEOREM 22: $|\mathrm{CONS} \cap \mathrm{ISOM} \cap \mathrm{PERS} \cap \mathrm{VAR}|_c = 4$
PROOF: Assume that $D \in \mathrm{CONS} \cap \mathrm{ISOM} \cap {+}\mathrm{PERS} \cap \mathrm{VAR}$. For finite $i + j$ we can argue as in theorem 16: $+(i, j, c)$ either for all finite $i$ with $j \ne 0$ or for all finite $j$ with $i \ne 0$. Consider the first case, i.e. that of some for finite $i + j$. For arbitrary $i', j', k$ with $j' \ne 0$ we take finite $i \le i'$ and $0 < j \le j'$. Then $i + j + c = c$ and $j \ne 0$ implies $+(i, j, c)$, and we apply the lemma to get $+(i', j', k)$. Finally VAR assures that $-(i', 0, k)$, and this patterns as some for infinite $A$ and $E$ too. The proof for the other cases is analogous. •
We now turn to the related question which sets of conditions leave exactly countably many candidates. One plausible strategy for finding 'infinitary counting formulas' is this: look at the finite cases and extrapolate to countable cardinalities. E.g. $|\mathrm{DET}|_n = 2^{4^n}$ so 'if $n \to \aleph_0$ then $|\mathrm{DET}|_n \to 2^{4^{\aleph_0}} = 2^{2^{\aleph_0}}$', and $|\mathrm{ISOM}|_n = 2^{\binom{n+3}{3}}$ so 'if $n \to \aleph_0$ then $|\mathrm{ISOM}|_n \to 2^{\aleph_0}$'. We have to be very careful here, however. For the next degree of infinity, the number of dets in ISOM is still $2^{\aleph_0}$, whereas the scheme suggests $2^{\aleph_1}$. Perhaps more surprising is that even the extrapolation from finite to countable domains may be dangerous. In 2.4 we found that $|\mathrm{CONS} \cap \mathrm{ISOM} \cap \mathrm{CONT} \cap {+}\mathrm{PERS}|_n = F(2n)$, where $F(0) = 2$, $F(1) = 3$ and $F(n+2) = F(n+1) + F(n)$. Since $2^{n+1} \le F(2n) \le 2^{2n+1}$ we would expect cardinality $2^{\aleph_0}$ in denumerable domains for continuous, +persistent dets. Is this true? The answer may be puzzling: yes and no. It depends on whether we assume a rather natural condition or not. Let us introduce (see note 12)
EXT: A det $D$ satisfies extension iff for all $A, B \subseteq E_1 \subseteq E_2$: $B \in D_{E_1}(A) \Leftrightarrow B \in D_{E_2}(A)$
Together with CONS, EXT has the effect that for $B \in D(A)$ only $A \cap B$ and $A - B$ matter, and the picture becomes:
For finite $E$, EXT does not influence the number of (semantical!) dets that satisfy a certain set of conditions. E.g. modulo CONS and ISOM, every choice of $|A| = i + j$ determines $|E - A| = k$, given $|E| = n$, and therefore $E - A$ cannot influence $\pm(i,j)$. Thus $|\mathrm{CONS} \cap \mathrm{ISOM} \cap \mathrm{CONT} \cap {+}\mathrm{PERS} \cap \mathrm{EXT}|_n = F(2n)$. This situation changes drastically when we turn to infinite models. For $c = \aleph_0$, CONS and ISOM entail that only $i, j, k$ in $i + j + k = \aleph_0$ matter. But once $i + j = \aleph_0$, $k$ can still be chosen in various ways, and $k$ operates as a contextual parameter. This sort of dependence on $k$ is typically excluded by EXT. We may then reconsider van Benthem's tree representation; complete the tree by adding one denumerable row to the (infinitely many) finite rows:
                                  (0,0)
                              (1,0)  (0,1)
                          (2,0)  (1,1)  (0,2)
                                   ...
$(\aleph_0,0)$  $(\aleph_0,1)$  ...  $(\aleph_0,\aleph_0)$  ...  $(1,\aleph_0)$  $(0,\aleph_0)$
With the places in row $\aleph_0$ arranged as depicted, CONT still translates as a convexity condition, i.e. + only appears in closed intervals. Considerations similar to those given in 2.4 show that +PERS translates on the infinite row as 'centering' of the +:
$(\aleph_0,0) \to (\aleph_0,1) \to \ldots \to (\aleph_0,\aleph_0) \leftarrow \ldots \leftarrow (1,\aleph_0) \leftarrow (0,\aleph_0)$
THEOREM 23: $|\mathrm{CONS} \cap \mathrm{ISOM} \cap \mathrm{CONT} \cap {+}\mathrm{PERS} \cap \mathrm{EXT}|_{\aleph_0} = \aleph_0$
PROOF: Consider row $\aleph_0$ of the extended tree first. In the case where the bottom is totally filled with -, +PERS causes the whole tree to be negative. When + is present, $(\aleph_0, \aleph_0)$ is + by +PERS, and CONT ensures that the plusses form a closed interval, so this block is determined by at most two different choices, say $(\aleph_0, n)$ and $(m, \aleph_0)$, for finite $m$ and $n$.
[Figure: the extended tree with top $(0,0)$ and a bottom row running from $(\aleph_0,0)$, whose plusses form a closed interval starting at $(\aleph_0,n)$.]
(In case $+(\aleph_0, n)$ and $-(m, \aleph_0)$ for all $m$, or only $+(\aleph_0, \aleph_0)$, or $+(m, \aleph_0)$ for some $m$ and $-(\aleph_0, n)$ for all $n$, the lemma implies that the finite rows only contain - and the rest of the proof holds then.) Outside this interval we have only -, and again by +PERS, these 'pile up' to the top of the tree, as shown in the figure. So everything outside the sub-tree with top $(m,n)$ is determined, while the finite part of the latter is as yet undetermined. But, by a simple geometrical argument (cf. van Benthem (1984a), Westerståhl (1984)), there can only be countably many +/- shapes for this tree, as indicated in the above picture. Thus, in all, only countably many combinations occur of final row/initial triangle. •
Some additional reflection shows that CONT can be left out of theorem 23 and that +PERS can be generalized to PERS without changing the counting result.¹³ How many persistent logical dets are there without invoking EXT? This question may be answered by introducing a suitable 'ternary form' of the tree, not to be displayed here. Without proof, we state our final result:

THEOREM 24: $|\mathrm{CONS} \cap \mathrm{ISOM} \cap \mathrm{CONT} \cap {+}\mathrm{PERS}|_{\aleph_0} = 2^{\aleph_0}$

Thus, giving up context-neutrality has great numerical effects. This completes our combinatorial reconnaissance of the infinite quantifier case.
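The finite count quoted earlier in this section, F(2n) with F(0) = 2, F(1) = 3 and F(n + 2) = F(n + 1) + F(n), together with the bounds $2^{n+1} < F(2n) < 2^{2n+1}$, can be checked numerically. The following is only a minimal sketch: the function names `f` and `bounds_hold` are illustrative and do not come from the text, and the strict bounds are tested for n ≥ 1 (at n = 0 the lower bound holds with equality only).

```python
def f(n):
    """F(0) = 2, F(1) = 3, F(n + 2) = F(n + 1) + F(n), as quoted in the text."""
    a, b = 2, 3
    for _ in range(n):
        a, b = b, a + b
    return a


def bounds_hold(n):
    """Check 2**(n + 1) < F(2n) < 2**(2n + 1) for a single value of n."""
    return 2 ** (n + 1) < f(2 * n) < 2 ** (2 * n + 1)


if __name__ == "__main__":
    # Print n, F(2n) and whether the bounds hold for a few small n.
    for n in range(1, 6):
        print(n, f(2 * n), bounds_hold(n))
```

A finite check of this kind can only corroborate the inequality for the values tried; it says nothing about the infinite case discussed above.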
NOTES
1 One day in the summer of 1982, Johan van Benthem informed me of Keenan & Stavi's result (essentially theorem 2 here), which initiated the search for solutions of other counting problems. The material presented in this paper is a selection of the theorems given in chapter 3 of my thesis, Thijsse (1983b), with some modifications. Furthermore I would like to thank the audience at the ZWO-workshop on Generalized Quantifiers, July 5 1983 in Groningen, for their patience and helpful suggestions. Readers who want to be informed about the proofs which are left out of the text may apply to the author: c/o Katholieke Hogeschool Tilburg, Werkverband Taal & Informatica, Subfaculteit Letteren i.o., Postbus 90153, 5000 LE Tilburg, The Netherlands.
2 In the usual sense of most, viz. more than half, CONS selects less than half of the dets: $|\mathrm{DET} - \mathrm{CONS}|_n > |\mathrm{DET} \cap \mathrm{CONS}|_n$, which holds for $n \geq 2$, since then $4^n > 3^n + 1$, so $2^{4^n} > 2^{3^n + 1}$ and hence $2^{4^n} - 2^{3^n} > 2^{3^n}$.
3 There is a subtle, yet important difference between the representation given in the text and the one van Benthem introduced. He in fact uses the full infinite tree, where I stick to the finite triangle consisting of the rows of the tree until depth n. As will become clear in section 3, the original trees presuppose an additional condition (EXT), but for the finite triangles we may do without EXT.
4 F. Lunnon ('The IU Function: the Size of a Free Distributive Lattice', in Welsh (ed.), Combinatorial Mathematics and its Applications, 1971) reports $q_7$ = 2 208 061 288 138, but Church's value is supported by J. Berman (e.a.)'s analysis in 1975.
5 See Greene & Kleitman (1978) or Thijsse (1983b) for the intricate proof.
6 For big n, $q_n$ is of the order $2^{\binom{n}{\lfloor n/2 \rfloor}}$. Since $\binom{n}{\lfloor n/2 \rfloor}$ is approximated by $2^n / \sqrt{\tfrac{1}{2}\pi n}$, the number of monotone quantifiers can be directly compared with the totality of (NP-)quantifiers: $|\mathrm{NP}|_n / q_n \approx 2^{2^n} / 2^{2^n (\frac{1}{2}\pi n)^{-1/2}} = 2^{2^n (1 - (\frac{1}{2}\pi n)^{-1/2})}$.
7 This type of definability result originates with van Benthem (1984a). Now MON is (in isolation) a much weaker condition than ISOM (cf. sections 2.1 and 2.4): for $n \geq 4$, $|\mathrm{ISOM}|_n < |\mathrm{CONS}|_n < |{+}\mathrm{MON}|_n < |\mathrm{MON}|_n$. Therefore the following theorem improves theorem 16 (van Benthem (1983c)): $\mathrm{CONS} \cap \mathrm{MON} \cap \mathrm{PERS} \cap \mathrm{VAR}$ = {all, no, some, not all}, where all, no, some, not all ought to be understood semantically.
8 We can eliminate F by using Binet's explicit formula for the Fibonacci numbers, using square roots of 5.
9 The more insightful geometrical derivation of (i) - (iii), which replaces an esoteric algebraic argument given in Thijsse (1983b), was suggested by Johan van Benthem.
10 Dag Westerståhl proposed this presentation, which avoids our earlier account in terms of 'Beth cardinals'.
11 With GCH the earlier types of Σ are evidently those for which : (i)lnsL = « 2 and(ii) Ι η Σ I „ = N i WO
12 EXT is discussed in van Benthem (1984a, 1983a,b,c), often absorbed in a strengthened form of CONS, and in Westerståhl (1982a), who calls it CONST.
13 Theorem 23 can even be generalized further by introducing 'left-continuity': if $A \subseteq B \subseteq C \subseteq E$ then $D(A) \cap D(C) \subseteq D(B)$. Westerståhl (1984) proves that the left-continuous dets are essentially the first-order definable ones. Replacing +PERS by LCONT clearly does not change the number for countable domains.
REFERENCES
Barwise, J. & R. Cooper (1981) - 'Generalized Quantifiers and Natural Language', Linguistics and Philosophy 4:2, 159-219.
Van Benthem, J. (1983a) - 'Five Easy Pieces', in Ter Meulen (ed.), 1-17.
Van Benthem, J. (1983b) - 'Determiners and Logic', Linguistics and Philosophy 6:4, 447-478.
Van Benthem, J. (1983c) - 'A Linguistic Turn: New Directions in Logic', to appear in: Logic, Methodology and Philosophy of Science VII, North Holland Publ. Co., Amsterdam 1984.
Van Benthem, J. (1984a) - 'Questions about Quantifiers', Journal of Symbolic Logic 49:2, 443-466.
Van Benthem, J. (1984b) - 'The Logic of Semantics', in: F. Landman & F. Veltman (eds.) - Varieties of Formal Semantics, GRASS 3, Foris, Dordrecht.
Church, R. (1965) - 'Enumeration by rank of the elements of the FD with 7 generators', Notices American Mathematical Society, vol. 12, 724.
Greene, C. & D. Kleitman (1978) - 'Proof Techniques in the Theory of Finite Sets', in: Rota (ed.) - Studies in Combinatorics, 22-79.
Higginbotham, J. & R. May (1981) - 'Questions, Quantifiers and Crossing', The Linguistic Review 1, 41-79.
Keenan, E. & Y. Stavi (1981) - 'A Semantic Characterization of Natural Language Determiners', to appear in Linguistics and Philosophy.
Kleitman, D. (1969) - 'On Dedekind's Problem: the Number of Monotone Boolean Functions', Proceedings of the American Mathematical Society vol. 21, 677-682.
Ladusaw, W. (1979) - Polarity Sensitivity as Inherent Scope Relations, Ph.D. dissertation, University of Texas, Austin.
Lindström, P. (1966) - 'First Order Predicate Logic with Generalized Quantifiers', Theoria 32, 186-195.
Ter Meulen, A. (ed.) (1983) - Studies in Modeltheoretic Semantics, GRASS 1, Foris, Dordrecht.
Mostowski, A. (1957) - 'On a Generalization of Quantifiers', Fundamenta Mathematicae 44, 12-36.
Thijsse, E. (1983a) - 'On Some Proposed Universals of Natural Language', in Ter Meulen (ed.), 19-36.
Thijsse, E. (1983b) - Laws of Language, M.A. thesis, University of Groningen.
Westerståhl, D. (1982a) - 'Logical Constants in Quantifier Languages', to appear in Linguistics and Philosophy.
Westerståhl, D. (1984) - 'Some Results on Quantifiers', Notre Dame Journal of Formal Logic 25:2, 152-170.
Zwarts, F. (1981) - 'Negatief Polaire Uitdrukkingen', GLOT 4:1, 35-132.
Chapter 6
Generalized Quantifiers: Finite versus Infinite
Kees van Deemter
1. INTRODUCTION
Infinite sets are ubiquitous in mathematics, but their occurrence is not confined to mathematics. In sciences as diverse as physics and linguistics the concept of infinity plays a role, and not only in modern times; for instance, the notorious medieval discussion of the number of angels on the point of a needle was really about the possibility of an actually infinite set. In the nineteenth century, Georg Cantor propounded a new approach towards the infinite, and ever since, mathematicians have been the acknowledged experts on the subject.

Generalized Quantifier theory (GQ theory) formulates modeltheoretic properties of determiner denotations in general ('constraints'), and uses these to propose semantical characterizations of those possible determiner denotations which are actually expressible in natural language. We take this to be an attempt to explain which quantifiers are 'simple' enough for natural language purposes.¹ In this research program some idealizations are made. For instance, by and large, phenomena such as vagueness or intensionality have been disregarded. We shall be concerned here with another restriction, viz. the assumption that the domains of discourse, and consequently the relata of the determiners, are finite. We will call this abstraction FIN. Mostly, idealizations like these are supposed to "oil the wheels of a semantic theory" for a while, but eventually they will be abandoned. The most important aim of this paper is to suggest that FIN differs essentially from the other idealizations, because it does not make things easier. Even if we are only interested in finite relata, there turn out to be several reasons for including infinite ones from the outset.

Some people view FIN not as a temporary idealization, but as a quite fundamental principle of natural language. In Van Benthem (1984a) the conjecture was advanced that "no notion for logico-linguistics may crucially involve infinity". Why not? We suspect that its motivation is the following. Infinite models are unnecessary, because, although they are certainly interesting from a scientific point of view, ordinary language use does not fathom their peculiarities. Quantifiers which make essential use of the infinite resources of a domain (e.g. the quantifier Q such that $QAB \Leftrightarrow A - B$ is uncountable & $A \cap B$ is countable) are expressed in no human language. And quantifiers which are defined for infinite relata A and B, but not in an "essential" way (e.g. the quantifier denoted by 'some', which expresses that $A \cap B \neq \emptyset$, no matter whether A and B are finite or infinite) are no more than obvious extensions of "finite quantifiers". So, what do we lose if we acknowledge only finite domains - the more so, since the presence of arbitrary increasing finite universes gives us at least a 'potential infinite' in our model theory?

In the first place we reply that if all this is true, we would like to really understand the difference between quantifiers which are straightforward extensions of finite ones and quantifiers which are more complex. For the concept of complexity is right at the heart of GQ theory; GQ theory tries to characterize the class of those quantifiers which are, among other things, not too complex for natural language. Next, we remark that some natural language quantifiers evidently require infinite domains. The determiner 'infinitely many' is a case in point; for, apart from its sloppy use, this expression cannot be denied a literal meaning. And how about 'countably many'? And is the determiner 'most' really meaningful only in connection with finite relata? It is difficult to draw the line here.

Rather than a priori excluding such quantifiers in relation to infinite relata, we want to plead for a different, more reconstructionist perspective upon semantics. The semanticist should try to solve the problems arising in connection with, for example, essentially infinite quantifiers. He should make decisions where the layman is puzzled, and he should not be afraid to contradict him occasionally. It is widely accepted that rational revision of our conceptual apparatus is one of the goals of science: mathematicians found out what 'computable' should really mean (Church's thesis); psychologists try to do the same with the concept of intelligence. So the only question is whether we are prepared to include these enterprises in the subject of semantics. It will be clear that our own, tentative, answer is yes.

Nevertheless, this conviction is not crucial for the appreciation of this paper. Even a person who wants to exclude all essentially infinite quantifiers from natural language will have to admit that he can only do this in an articulate manner if he has not excluded infinite relata from consideration from the outset. Furthermore, abandoning FIN will yield all kinds of practical advantages. This paper now presents an investigation of the actual role of FIN in the notions and results of GQ-theory. Our final assessment of the principle will be found at the end.

* I wish to express my thanks to quite a few people who have helped me in writing this paper: Dick de Jongh drew my attention to the role of FIN in a certain proof and corrected an ancestor of this paper. Fred Landman was the first to draw the "infinitary tree" in the way it figures here. Johan van Benthem encouraged me to write something on the subject in the first place and, together with Alice ter Meulen, he made me improve it. Both Elias Thijsse and Martin Stokhof convinced me of various expository shortcomings. Liesbeth de Ruiter detected many sins against English grammar. Most of all, I want to thank Frank Veltman for many useful suggestions, as well as continuing support in my attempts to live up to the wishes of the editors.
2. A REVIEW OF CONSTRAINTS
In the literature, natural language quantifiers are found to obey at least the following three constraints. Quantifiers are 'topic-neutral', in the sense of being invariant for permutations π of the domain of individuals E:

    $Q_E AB$ iff $Q_E\, \pi[A]\, \pi[B]$    (Quantity; QUANT)

(Alternatively, they only depend on the numbers of individuals in $A \cap B$, $A - B$, $B - A$ and $E - (A \cup B)$.) Quantifiers are also 'context-neutral', in that $A \cup B$ is the only part of the universe which matters: if $A, B \subseteq E \subseteq E'$, then

    $Q_E AB$ iff $Q_{E'} AB$    (Extension; EXT)

Finally, quantifiers are 'conservative': the left-hand argument dominates:

    $Q_E AB$ iff $Q_E A(B \cap A)$    (Conservativity; CONS)
As pointed out in van Benthem (1984a), these three constraints imply (and are indeed equivalent to) the possibility of representing quantifiers as segments of the following 'tree of numbers':

    $|A| = 0$:        0,0
    $|A| = 1$:      1,0  0,1
    $|A| = 2$:    2,0  1,1  0,2
    etcetera

Here, pairs (a,b) are understood as couples $|A - B|$, $|A \cap B|$. For instance, all denotes $\{(a,b) \mid a = 0\}$, most $\{(a,b) \mid a < b\}$. Further constraints in the literature are sometimes formulated in terms of this tree. As we have no quarrel with the above three assumptions, this practice will be followed here. Free from intrinsic reference to the tree are the common conditions of Continuity (if $B \subseteq B' \subseteq B''$, then QAB & QAB'' imply QAB', and likewise for not-Q; CONT) and Variety (for all non-empty A, there exist B, B' such that QAB, not QAB'; VAR). Of course, when superimposed upon the earlier conditions, these requirements assume a clear geometrical meaning in the tree. More specifically tree-oriented are the following constraints (cf. van Benthem 1984a, 1983b). We now think of the tree as having +/- indications, for presence/absence of the quantifier at a node. Our first condition expresses a certain 'smoothness':

PLUS: $(a,b) \in Q$ only if $(a+1,b) \in Q$ or $(a,b+1) \in Q$, and likewise for not-Q.

A triangle of truth values for (a,b), (a+1,b), (a,b+1) is called a 'jump pattern'. The next condition expresses Homogeneity:

HOMOG: there are only two jump patterns, one for positions +, the other for positions -.
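The tree-oriented constraints can be made concrete on a finite triangle of the tree. The following is a minimal sketch, not part of the original theory: quantifiers are modelled as Python predicates on pairs (a, b) = (|A - B|, |A ∩ B|), and the names `triangle`, `satisfies_plus`, `jump_patterns`, `satisfies_homog` and `QUANTIFIERS` are illustrative assumptions. A check up to a fixed depth can refute PLUS or HOMOG, but it can only corroborate them, never prove them for the whole tree.

```python
def triangle(depth):
    """All nodes (a, b) with a + b <= depth, where a = |A - B| and b = |A ∩ B|."""
    return [(a, n - a) for n in range(depth + 1) for a in range(n, -1, -1)]


def satisfies_plus(q, depth):
    """PLUS: every node shares its truth value with at least one of its two successors."""
    return all(
        q(a, b) in (q(a + 1, b), q(a, b + 1))
        for a, b in triangle(depth)
    )


def jump_patterns(q, depth):
    """For each truth value, collect the successor patterns found below nodes with that value."""
    patterns = {True: set(), False: set()}
    for a, b in triangle(depth):
        patterns[q(a, b)].add((q(a + 1, b), q(a, b + 1)))
    return patterns


def satisfies_homog(q, depth):
    """HOMOG: at most one jump pattern for + nodes and at most one for - nodes."""
    return all(len(found) <= 1 for found in jump_patterns(q, depth).values())


# Illustrative quantifiers, read off the definitions in the text.
QUANTIFIERS = {
    "all": lambda a, b: a == 0,
    "some": lambda a, b: b > 0,
    "no": lambda a, b: b == 0,
    "not all": lambda a, b: a > 0,
    "most": lambda a, b: a < b,
}

if __name__ == "__main__":
    for name, q in QUANTIFIERS.items():
        print(name, satisfies_plus(q, 6), satisfies_homog(q, 6))
```

On this sketch, 'most' passes PLUS but shows two different jump patterns below + nodes, for instance at (0,1) and (0,2), so it fails HOMOG.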
Suitable relaxations of Homogeneity exist, describing lesser degrees of uniformity (cf. the HOMOG* of van Benthem 1984a). Now, these constraints seek to express intuitions that are not intrinsically tied to finite sets. In principle, "infinite" quantifiers can be constrained in exactly the same ways, and in those cases where the standard formulation of a constraint presupposes the finiteness of domains, we should be able to undo this presupposition. To the unproblematic group belong CONS, EXT, QUANT, VAR; to the more difficult group that needs reformulation belong PLUS and HOMOG. Among the unproblematic constraints, only the constraint QUANT deserves explicit discussion. The intuitive meaning of the constraint called Quantity is:

QUANT': $Q_E AB$ depends only on the number of individuals in the four zones $A \cap B$, $A - B$, $B - A$, $E - (A \cup B)$.
But what is the number of elements in the set of natural numbers? Is it equal to the number of primes, or is it equal to the number of reals? In principle it is possible to answer these questions in the affirmative and say that all infinite sets are equinumerous, namely simply infinite. This is the approach in van Benthem 1984b, where an extra level is added "underneath" the tree of numbers:

    $\infty,0 \quad \infty,1 \quad \dots \quad \dots \quad 1,\infty \quad 0,\infty$
This amounts to a version of Quantity in which the number of individuals in any two infinite sets is equal. But this is not the standard version of Quantity, and quite rightly so: one way or another, people tend to deny that all infinite sets are on a par. If we stick to the above formulation of QUANT (in terms of permutation-invariance) now that we include infinite domains, this comes down to Cantor's way of comparing sets as to their number of elements. It is equivalent to ($|A|$ denotes Cantor's cardinality of A):

    $Q_E AB \Leftrightarrow Q_{E'} CD$ if $|A \cap B| = |C \cap D|$ and $|A - B| = |C - D|$ and $|B - A| = |D - C|$ and $|E - (A \cup B)| = |E' - (C \cup D)|$
Thus, the earlier phrase "number of individuals" in a set becomes equated with its cantorian cardinality. In principle, one may regret this, presumably because there are intuitions about numbers which run counter to Cantor's analysis. Foremost among them is the conviction that a real part should be smaller than a whole, so that there are more natural numbers than, say, odd natural numbers. But another intuition, which seems to be more basic, is the conviction that with respect to the number of elements in a set, the identity of these elements is immaterial. This implies that if there is a bijection between two sets (as is the case with the odd numbers and all numbers), they are of the same size. So, clearly, some intuitions concerning the concept of number contradict each other. So we cannot simply describe these intuitions: we must start from them, but not stick to them. Adhering to the most important ones, one must try to reach a consistent theory. Cantor chose to work from the second intuition (immaterial identity), ignoring the part-whole intuition. And indeed, the latter seems to be the weaker of the two. After some reflection, we feel that it is an illegitimate extrapolation from the finite case. Many people are prepared to admit that an infinite chessboard has as many white fields as it has fields at all. And the number of natural numbers to come does not decrease when we move from zero to one, and so on. These far too sketchy remarks are probably not enough to convince a dissident; but they are intended in the first place to make our position clear. Description of intuitions is not enough for semantics. We will follow Cantor's prescription, which seems to be in accordance with the above formulation of Quantity. A consequence of our approach is to add to the old tree of natural numbers not just one level, as van Benthem did, but to add infinitely many levels, for $|A| = \aleph_0$, $|A| = \aleph_1$, ..., $|A| = \aleph_\omega$, etc.

    $0,0$
    $1,0 \quad 0,1$
    $2,0 \quad 1,1 \quad 0,2$
    $3,0 \quad 2,1 \quad 1,2 \quad 0,3$
    $\dots$
    $\aleph_0,0 \quad \aleph_0,1 \quad \dots \quad \aleph_0,\aleph_0 \quad \dots \quad 1,\aleph_0 \quad 0,\aleph_0$
    $\aleph_1,0 \quad \aleph_1,1 \quad \dots \quad \aleph_1,\aleph_0 \quad \dots \quad \aleph_1,\aleph_1 \quad \dots \quad 1,\aleph_1 \quad 0,\aleph_1$
    $\dots$
    $\aleph_\omega,0 \quad \aleph_\omega,1 \quad \dots \quad \aleph_\omega,\aleph_0 \quad \dots \quad \aleph_\omega,\aleph_\omega \quad \dots \quad 1,\aleph_\omega \quad 0,\aleph_\omega$
    $\dots$
We will often refer to this object, which we will freely call a tree, although it isn't, calling the levels for $|A| < \aleph_0$ the finite part of the tree (although it is infinite), and the set of pairs $(\aleph_\alpha, x)$ (for certain α) a diagonal (although its geometrical properties are not those of a straight line). Constraints like PLUS and HOMOG constrain a certain level of the tree, for instance the level $|A| = n$, by requiring that it relates to the previous level in a certain prescribed fashion. This dependence on the previous level leaves limit levels (like, for instance, the row $|A| = \aleph_0$) unconstrained. For obvious reasons, then, we call PLUS and HOMOG 'successor constraints'. Almost any proof in the standard theory using either of these successor constraints will fail, once we include infinite domains. We now turn to a more detailed investigation.
3. TYPES OF DEPENDENCE ON FINITENESS
In this section we shall look at the role of FIN in standard GQ-theory. The latter consists of a variety of results, expressing relations between proposed constraints, classifying all linguistic possibilities left open by certain sets of constraints, or establishing connections with classical logical languages. (A systematic survey is presented in van Benthem 1983b.) Given our special concern, we will not follow this 'internal structure' of the subject, concentrating instead on types of dependence on the finiteness assumption. In 3.1. we will be occupied with theorems in which no successor constraints are used. Even here, we meet various kinds of dependence on FIN. We will try to rescue as much of the old theorems as we can, but in this enterprise we will not make special provisions for the problematic part of the tree. In 3.2. we will turn to theorems using PLUS and HOMOG, discussing two different ways of coming to grips with the problematic part of the tree: generalization of old constraints, versus addition of only one, entirely new constraint (GEN) which will make any change in the rest of the constraints superfluous. Our survey will be very incomplete, we admit, but at least it will give an impression of what the infinite theory will look like.

3.1. Theorems without successor constraints
1) It may be instructive to take a closer look first at a case where FIN does not play a role at all (theorem and proof, van Benthem 1983b):

THEOREM: If Q is transitive and $QA\emptyset$, then QAU for arbitrary U.

PROOF: Consider a set $A^+$, just as big as A, disjoint from both U and A. Now $QA^+\emptyset$ (because $QA\emptyset$ and QUANT), so $QA^+U$ (because $A^+ \cap U$ is empty, and CONS). But $QAA^+$ (for $A \cap A^+$ is empty, and CONS). So QAU (because $QAA^+$ and $QA^+U$, and transitivity).
Although FIN is not presupposed in this proof, some presupposition is made as to the cardinality of models: they can have any finite size. Adding the constraint that only models of size, say, smaller than n are allowed, invalidates the proof, because the intermediate conclusion $QAA^+$ presupposes that the universe is big enough to contain both A and $A^+$, which is impossible if we choose $|A| = n$. Conclusion: Standard GQ theory presupposes the potentially infinite.² Indeed, the very motivation of CONS and EXT only makes sense with such an intuitive picture in the background.

2) A few examples illustrate the other extreme: total reliance on FIN. Locally, a quantifier is a subset of POW(E) × POW(E) (for certain E). Globally, it is a function from universes E to local quantifiers. Now consider the following statement:

    There are no quantifiers forming a transitive, irreflexive relation which is unbounded to the right.
Locally, with FIN, this is a true assertion. But, once infinite models are allowed (or indeed, globally, arbitrary increasing sequences of finite models), counter-examples are easily found. An example of a local theorem from the literature is the Keenan and Stavi definability theorem for CONS. Here the same thing happens: with infinite domains, the theorem is false.³ It is understandable that local theorems tend to suffer more from admitting infinite domains than global ones. For even if we have FIN, global quantifiers are infinite objects, because the domains may have any finite size, whereas local quantifiers can be infinite only when infinite domains are considered. However, there are also global theorems which rely totally on FIN. A clear example is the theorem from van Benthem (1984a) to the effect that all doubly monotone quantifiers are first-order definable. The quantifier 'infinitely many' is a counterexample,⁴ of a kind that we think ought to be taken seriously. Notwithstanding the fact that these theorems are tied to FIN, it is sometimes possible to prove an attractive new theorem, resembling the former. For example:

3) An old theorem in van Benthem (1984a) runs as follows:

THEOREM: If Q is transitive, then $QAB \Rightarrow A \subseteq B$ or $QA\emptyset$.

PROOF: Assume QAB and, say, $e \in A - B$. Construct $A^+$ with the same size as A, and such that $A \cap A^+ = A \cap B$. Since $QA(B \cap A)$ (CONS), we know that $QA^+(B \cap A)$ (QUANT, EXT). But also $QAA^+$ (because $QAA^+ \Leftrightarrow QA(A^+ \cap A) \Leftrightarrow QA(B \cap A)$), which we can exploit in connection with transitivity. We already know that $QA^+(B \cap A)$; now say $e^+ \in A^+ - B$, and let π interchange $e^+$ with some element $x \in A \cap B$ (assuming that $A \cap B$ is not empty yet). Result: $Q\,\pi(A^+)\,\pi(B \cap A)$, so $QA^+\pi(B \cap A)$. Now with transitivity $QA\,\pi(B \cap A)$. So $QA(\pi(B \cap A) \cap A)$ (CONS), and clearly $\pi(B \cap A) \cap A \subsetneq B \cap A$, as x has been removed from $B \cap A$. ∎
Let us see what happens in this proof: $A \cap B$ is emptied one by one by means of elements from $A - B$. If e.g. $|A - B| = 1$ and $|A \cap B| = n$, then n applications of this trick suffice. So $A - B$ may very well be smaller than $A \cap B$. Now consider infinite sets. Suppose $|A \cap B| = \aleph_0$, and $|A - B| = 1$. No matter how often we apply the trick, $A \cap B$ will never be empty. So the proof is not valid anymore. And indeed, the theorem becomes invalid. Consider e.g. the quantifier Q such that $QAB \Leftrightarrow A \cap B$ is infinite and $A - B$ is finite, as van Benthem noticed. (The reason is that for all infinite, as opposed to finite, cardinalities α and β, if α < β then β - α = β.) But evidently, it is possible to empty an infinite set $A \cap B$ by means of an infinite set $A - B$ which is at least as big. Thus there arises a kind of infinite analogue of the old theorem.
THEOREM: If $|A - B| \geq |A \cap B|$, then for transitive Q, if QAB then $QA\emptyset$.

4) Jumping to a quite different situation, here is a theorem which remains true, but must be provided with a new proof.

THEOREM (VAR): Inclusion is the only reflexive and transitive quantifier.

In van Benthem (1984a) this is proved as a corollary of the "emptying" theorem from 3, which, as we just saw, does not survive our change of scene to the infinitary.

NEW PROOF: Assume that Q is reflexive and transitive. Suppose Q is not inclusion. Then either a) not QXY and $X \subseteq Y$ for certain X, Y: which is impossible because if $X \subseteq Y$ then QXY iff $QX(Y \cap X)$ iff QXX (and use reflexivity). Or else b) QXY and $X \not\subseteq Y$: say $e \in X - Y$. But this contradicts VAR, because
    1. $Q\{e\}\{e\}$ (refl),
    2. $Q\{e\}(X \cap \{e\})$ ($e \in X$),
    3. $Q\{e\}X$ (CONS),
    4. QXY (assumption),
    5. $Q\{e\}Y$ (trans),
    6. $Q\{e\}(Y \cap \{e\})$ (CONS),
    7. $Q\{e\}\emptyset$ ($e \notin Y$).
So for arbitrary U:
    8. $Q\{e\}(U \cap \{e\})$ (because this amounts to $Q\{e\}\emptyset$ or $Q\{e\}\{e\}$, both of which are true, due to 7 and 1) and
    9. $Q\{e\}U$ (CONS).
And this contradicts VAR. ∎
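Since the new proof uses only CONS, reflexivity, transitivity and VAR, and neither QUANT nor EXT, its conclusion can be checked by brute force on a very small universe. The sketch below is only an illustration under stated assumptions: it treats a quantifier locally, as a relation on the subsets of one fixed universe, and the function names `powerset`, `reflexive_transitive_quantifiers` and `inclusion` are hypothetical. The search space doubles with every pair (A, C) with C ⊆ A, so the enumeration is feasible only for universes of at most two elements.

```python
from itertools import combinations, product


def powerset(universe):
    """All subsets of `universe`, as frozensets."""
    items = sorted(universe)
    return [frozenset(c) for r in range(len(items) + 1) for c in combinations(items, r)]


def reflexive_transitive_quantifiers(universe):
    """Enumerate local quantifiers on `universe` obeying CONS, reflexivity, transitivity and VAR."""
    subsets = powerset(universe)
    # CONS: Q(A, B) iff Q(A, A ∩ B), so there is one free truth value per pair (A, C) with C ⊆ A.
    keys = [(a, c) for a in subsets for c in subsets if c <= a]
    found = []
    for bits in product((False, True), repeat=len(keys)):
        core = dict(zip(keys, bits))

        def q(a, b):
            return core[(a, a & b)]

        reflexive = all(q(a, a) for a in subsets)
        transitive = all(
            q(a, c)
            for a in subsets for b in subsets for c in subsets
            if q(a, b) and q(b, c)
        )
        # VAR: for every non-empty A there are B, B' with QAB and not QAB'.
        variety = all(
            any(q(a, b) for b in subsets) and not all(q(a, b) for b in subsets)
            for a in subsets if a
        )
        if reflexive and transitive and variety:
            found.append(frozenset((a, b) for a in subsets for b in subsets if q(a, b)))
    return found


def inclusion(universe):
    """The inclusion relation on the subsets of `universe`, as a set of pairs."""
    subsets = powerset(universe)
    return frozenset((a, b) for a in subsets for b in subsets if a <= b)
```

For a two-element universe the enumeration runs over 512 candidate relations; by the argument above, inclusion should be the only survivor.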
Thus, the proof without FIN is both easier and shorter, and moreover, it saves a premise (namely EXT).⁵

5) Sometimes the situation is almost the reverse: we can employ the same proof strategy, but it proves a different, but closely related theorem. For example, in the finite theory there is a straightforward proof for the following result concerning 'symmetrical' quantifiers, allowing the conversion from QAB to QBA:

THEOREM: The symmetrical quantifiers are exactly the quantifiers Q of the form 'exactly $n_1$ or ... or exactly $n_m$' (for certain natural numbers $n_1, \dots, n_m$).

In tree-language this can be expressed very simply. The symmetrical quantifiers consist of an arbitrary number of diagonals of the finite tree. So, in our more general case, we get not only the old diagonals (which are somewhat longer now), but also the "diagonals" $|A \cap B| = \aleph_\alpha$. Generalization of the theorem is then straightforward, provided that we watch out for a few geometrical peculiarities of the infinite part of the tree. For instance, symmetry amounts to the numerical condition that $(a,b) \in Q$ iff $(c,b) \in Q$ (for arbitrary a, b, c, whether finite or infinite). In the finite part of the tree, this gives the familiar diagonal picture but, e.g. on the first infinite row, there is a downward 'jump' from, say, $(3, \aleph_0)$ to […]. Thus in general, our infinite tree should be handled with some caution.

3.2. Theorems using successor constraints
Probably all theorems lost as a result of their presupposing one or more successor constraints (like PLUS and HOMOG) can be replaced by analogous theorems employing modified versions of these constraints. The reason is that the intuitions on which such constraints were based concern the effect of increasing or decreasing certain 'zones' in E, rather than the addition or removal of only one element. Therefore, a generalization to our infinite case ought to be possible. For instance, consider again

PLUS: $(a,b) \in Q$ only if $(a,b+1) \in Q$ or $(a+1,b) \in Q$. And the same for $\notin$.

The following formulation sounds stronger, but is exactly equivalent:

PLUS': Suppose $(a,b) \in Q$. Then for any $k \in \mathbb{N}$, there are k' and k'' such that $k' + k'' = k$ and $(a + k', b + k'') \in Q$. (The same for $\notin$.)

This tells us that any k elements can be divided over $A - B$ and $A \cap B$ in such a manner that the result is in Q if and only if QAB. Now this can easily be generalized to cover both finite and infinite additions:

PLUS$^\infty$: Suppose $(a,b) \in Q$. Then for any cardinality k, there are k' and k'' such that $k' + k'' = k$ and $(a + k', b + k'') \in Q$. (The same for $\notin$.)

Next, consider the constraint of Homogeneity. Here our task is considerably more difficult, because of the crucial reference to finite configurations:

HOMOG: In any quantifier-tree, there is only one jump pattern starting with + and only one jump pattern starting with -.

Moreover, now that we handle infinite additions instead of finite ones, we must distinguish three groups of possibilities for adding elements to A instead of only two: letting only a grow, or letting only b grow, or letting both a and b grow. (Finite additions can be split up into additions of one element; and an addition of one element always concerns only a or only b.) We propose HOMOG$^\infty$ to say that the outcome for each of these three possibilities separately is uniform, regardless of how many elements we add:
HOMOG$^\infty$: Let α, β > 0. Suppose $(a,b) \in Q$. If addition of α elements to A in such a way that both a and b grow results in +, then this will also be the case for any addition of β elements starting from any pair (a', b'). (The same for $\notin$, and also with respect to growth of a or b only.)

The idea of HOMOG$^\infty$ is this. Addition of elements may produce different effects in different places of the tree; but if we take into account whether the elements increase $A \cap B$ or $A - B$ or both, then the effects are the same everywhere in the tree, regardless of how many elements are involved. It seems to me that both PLUS$^\infty$ and HOMOG$^\infty$ are conceptual improvements on their predecessors (even though HOMOG$^\infty$ is stronger in the finite domain than HOMOG).⁶ If we turn to the use of PLUS and HOMOG made in the literature, then our new versions can still do the job. Here is a typical result:

THEOREM: The only quantifiers satisfying CONT, PLUS, HOMOG are the four in the Square of Opposition.

With HOMOG$^\infty$ and PLUS$^\infty$ for HOMOG and PLUS, the theorem may be proved as follows:

PROOF:
a) In the finite domain, HOMOG$^\infty$ implies HOMOG, and PLUS$^\infty$ = PLUS. All four quantifiers from the Square of Opposition satisfy HOMOG$^\infty$ in their finite part, as is easy to see. So in the finite part, just those four are obtained (by the original argument).
b) Assume e.g. that the finite part of the tree is the representation of 'some'. Now HOMOG$^\infty$ fixes the rest of the tree. What are the outcomes in the three sorts of cases we have distinguished when we formulated HOMOG$^\infty$? To see this, take any finite pair which is in Q, say (0,1). Add two elements. No matter how we do this, again + results: (2,1), (1,2), (0,3) $\in Q$. So by HOMOG$^\infty$, adding elements to + results always in +. Consequently, all pairs in the entire tree except the leftmost diagonal (b = 0) are +. Finally, we turn to the pair (0,0), and look what happens if we increase just its leftmost element (a): the result is -. But then - results, no matter how many elements we add (HOMOG$^\infty$), so the leftmost diagonal consists entirely of -. I.e., the pattern of 'some' is found throughout the tree.
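The generalized constraint PLUS$^\infty$ can be tried out on a small sample of cardinals. The sketch below rests on several assumptions that are not in the text: the only infinite cardinal considered is $\aleph_0$, represented by `math.inf` (which happens to obey the relevant addition law, finite + $\aleph_0$ = $\aleph_0$), finite cardinals are sampled up to a bound, and the names `cardinals`, `splits`, `satisfies_plus_infinity` and `SQUARE` are illustrative. Passing this test on a sample does not establish PLUS$^\infty$ in general.

```python
import math

ALEPH0 = math.inf  # stand-in for the first infinite cardinal (an assumption of this sketch)


def cardinals(bound):
    """The sampled cardinals: 0, 1, ..., bound and aleph-zero."""
    return list(range(bound + 1)) + [ALEPH0]


def splits(k, bound):
    """Pairs (k1, k2) of sampled cardinals with k1 + k2 = k in cardinal arithmetic."""
    if k == ALEPH0:
        cs = cardinals(bound)
        # A sum of two cardinals is aleph-zero iff at least one summand is.
        return [(x, y) for x in cs for y in cs if ALEPH0 in (x, y)]
    return [(x, k - x) for x in range(k + 1)]


def satisfies_plus_infinity(q, bound):
    """PLUS-infinity on the sample: every addition of k elements can be split so as to keep the truth value."""
    cs = cardinals(bound)
    return all(
        any(q(a + x, b + y) == q(a, b) for x, y in splits(k, bound))
        for a in cs for b in cs for k in cs
    )


# The four quantifiers of the Square of Opposition, as predicates on (|A - B|, |A ∩ B|).
SQUARE = {
    "all": lambda a, b: a == 0,
    "some": lambda a, b: b > 0,
    "no": lambda a, b: b == 0,
    "not all": lambda a, b: a > 0,
}


def exactly_two_as(a, b):
    """A quantifier depending on |A| = a + b only; it already violates the finite PLUS."""
    return a + b == 2
```

On this sample the four quantifiers of the Square pass, whereas `exactly_two_as` fails at the pair (1,1): adding one element in either zone leaves the quantifier.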
Note that the surplus of PLUS∞ over PLUS was not used in this argument, and hence the former is a consequence of the latter (modulo our other assumptions used here). Thus, the move from the 'stepwise' formulation of the successor constraints to a more global one, suitable for the infinitary case, does not destroy the old theory. One central theme of the original GQ-theory which we lose, however, is the study of the 'fine structure' of complexity, as measured by various weaker formulations of Homogeneity, referring essentially to finite steps. Perhaps the following alternative approach to 'infinitizing' may eventually be more sensitive here.
4. EXTRAPOLATION TO INFINITY
Let us now consider the transition to the unrestricted realm from a different angle. Consider a theorem from the classical theory, using one or more of the original constraints. Typically, this will be a characterization theorem, that is, a theorem which delineates some interesting class of natural language quantifiers. So the theorem delivers a certain set of relatively simple quantifiers. That they are simple can be seen from the fact that their tree shows a great degree of regularity. (A totally irregular quantifier would be unlearnable.) Moreover, the regularity is usually of a special sort: the quantifier tree can be divided up into a number of diagonals, each of which will eventually settle down to a constant +/− behaviour (i.e. only + or only −). Now, this observation makes possible a kind of extrapolation principle, allowing us to generalize earlier notions and results. Intuitively,
Regularity in the finite part is continued in the infinite part.
To make this more precise, distinct clauses will have to be provided for each particular kind of regularity which is to be continued. For instance, one plausible candidate is "stability in the long run":
GEN
(1a) If for all n > m, (a,n) ∈ Q (where a is finite), then the same is true for infinite n.
(1b) Same condition for (n,a).
(1') Same conditions for
1), and Q* the latter's complement again. In particular, then, it follows that no diagonal position n,n itself (n ≥ 1) can belong to Q. (Otherwise, n,n e n,n ¿ Q*: contradicting principle (2).) Finally, (3) expresses symmetry of Q#. By a standard argument, this amounts to "dependence on the right-hand number only": for any b, a,b ∈ Q# iff a',b ∈ Q#, for all a, a'. Geometrically, then, Q itself must consist of entire north-west/south-east diagonals. The eventual argument now simply becomes this. If Q is non-empty, it must contain a diagonal like above, of the form { a,b | b ≥ 0 }. Now, if a ≠ 0, this set will contain a pair a,a with a ≥ 1: quod non. Therefore, the only possibility is the right-most edge of the tree, which is the pattern for the quantifier "all". •
On the more theoretical side, this volume contains several lines of research worth pursuing. To begin with, there are the original questions of GQ theory, having to do with the interplay between the "outside" approach to linguistic meanings, narrowing down the area through successive semantic constraints, and the "inside" approach, using some linguistic mechanism of generating all admissible candidates. This leads to definability results, such as the (local) definition theorem for conservative determiners (Keenan & Stavi 1981) or the definition theorem for doubly monotone determiners in van Benthem (1984b). Conversely, one can also start with a given set of linguistic expressions, asking for an illuminating semantic characterization. This other side of the coin may be found in the above analysis of the Square of Opposition, or other striking special sets in the literature. In this area, the paper by Keenan and Moss provides quite a few additional or refined questions for research. It would be impossible to enumerate all aspects of their "expressability" concerns (both "in principle" and "in practice"), but one example in their spirit may suffice. Usually the above approach of imposing semantic conditions takes place locally, for one category of expression at a time. To be sure, there have been attempts to
Johan van Benthem
proceed more generally, in all categories at once, witness van Benthem (1983). But the present point is this. As categories are intertwined, semantic restrictions do not live in isolation: e.g., constraining determiner denotations may ipso facto constrain noun phrase denotations. More specifically, Keenan and Moss prove that Conservativity is still extremely generous: every possible noun phrase denotation can be obtained as a set of the form { B | D A B } for some choice of a conservative determiner denotation D and argument A. But Conservativity in combination with Quantity (i.e., "logicality") becomes restrictive: not all possible noun phrase denotations can be obtained in the above way with a logical determiner D.
Once this perspective is adopted, any number of questions arise. For instance, fixing one constraint on a functional category (say, Conservativity), one may study its "transmission behaviour". For various semantic conditions on its argument category, what are the resulting restrictions (if any) on the value category? Another example is that of constraints which have a universal definition, applicable to all categories, such as Quantity (cf. van Benthem 1983). Ought not these to be "self-propagating", in the sense that if a function and its argument obey the constraint already, then so do its values? (For Quantity, the answer happens to be positive.)
A more concrete technical example may illustrate the above concerns. We can characterize those noun phrase denotations X which can be obtained in the above "det N" form, for a logical determiner D. First, define
c(X) =def ∩ { U | for all Y (Y ∈ X iff Y ∩ U ∈ X) }.
Thijsse has observed that, if U₁, U₂ have the above "witness" property, then so does their intersection U₁ ∩ U₂. For finite domains then (a restriction adopted in this argument), the above definition yields a smallest witness set for X. Next, define
c*(X) =def { U ∈ X | U is contained in c(X) }.
Now, the desired characterization is this:
X is logically det N-representable if and only if c*(X) is closed under permutations of c(X).
PROOF: First, suppose that X = { B | D A B } for some suitable D, and π is some permutation of c(X). Now, c(X) ⊆ A, by Conservativity of D and the definition of c. Also, π can be extended to a permutation π⁺ on the
Themes from a Workshop
whole universe by adding the identity outside of c(X). Then, if U ∈ c*(X), i.e., U ∈ X, U ⊆ c(X), we have D A U, D π⁺[A] π⁺[U] (by Quantity for D), i.e., D A π[U] (since π⁺[A] = A and π⁺[U] = π[U]), π[U] ∈ X, π[U] ∈ c*(X). Conversely, let c*(X) have the stated closure property. Then set A = c(X), and define a determiner D as
{ (π[A], π[B]) | π is some permutation of the universe, and B ∈ X }.
It may be checked that D is conservative and quantitative. Moreover, the closure property guarantees that { B | D A B } remains just X. •
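On a small finite universe, the criterion can be checked by brute force. The following is a minimal sketch under stated assumptions: noun phrase denotations are modelled as Python sets of `frozenset`s, and the function names (`smallest_witness`, `restricted_part`, `is_logically_representable`) are hypothetical labels for c(X), c*(X) and the closure test; none of them come from the text.

```python
from itertools import combinations, permutations


def powerset(universe):
    """All subsets of a finite universe, as frozensets."""
    items = sorted(universe)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]


def smallest_witness(X, universe):
    """c(X): intersection of all U such that, for all Y, Y in X iff Y ∩ U in X."""
    subsets = powerset(universe)
    c = frozenset(universe)  # the universe itself is always a witness
    for U in subsets:
        if all((Y in X) == ((Y & U) in X) for Y in subsets):
            c &= U
    return c


def restricted_part(X, universe):
    """c*(X): the members of X that are contained in c(X)."""
    c = smallest_witness(X, universe)
    return {U for U in X if U <= c}


def is_logically_representable(X, universe):
    """Check whether c*(X) is closed under all permutations of c(X)."""
    c = sorted(smallest_witness(X, universe))
    c_star = restricted_part(X, universe)
    for image in permutations(c):
        pi = dict(zip(c, image))
        if any(frozenset(pi[x] for x in U) not in c_star for U in c_star):
            return False
    return True


# "all A" with A = {1, 2} on the universe {1, 2, 3}.
all_A = {frozenset({1, 2}), frozenset({1, 2, 3})}
print(is_logically_representable(all_A, {1, 2, 3}))   # True

# The denotation containing only {1}, on the universe {1, 2}.
only_one = {frozenset({1})}
print(is_logically_representable(only_one, {1, 2}))   # False
```

The enumeration is exponential in the size of the universe, so the sketch is meant for checking small examples by hand, not as an efficient procedure.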
This result gives an easy criterion for checking if a given set X is a possible denotation of the above form.
A more mathematical topic in the traditional framework is the counting of denotations left by various constraints. This investigation of their numerical effects, initiated in Keenan & Stavi (1981), is pursued in this volume by Keenan and Moss, and especially by Thijsse, who discovers some surprising connections with open mathematical counting problems. Also noteworthy is his use of the Fibonacci sequence in counting continuous quantifiers, which must express some deep truth about the semantic flora (or fauna). It would be pleasant if these combinatorial results could be connected up with more psychological concerns as well (beyond some tentative remarks towards this end in both papers). But also mathematically, much work remains to be done beyond establishing mere counting formulas for each domain separately. The semantic interest may lie more in "jumps" (how does the formula change when entities are added?) and "trends" (what is the long term behaviour for increasing sizes of the universe?). An interesting application of the latter type of result was presented by Keenan at our workshop (not included in the paper). By comparing the counting formula for conservative determiners and a suitable one for Boolean definition from a fixed parameter set (involving common noun and adjective denotations), it may be shown that, in the long run, the former outrun the latter, and hence no global version of the Keenan & Stavi definability theorem is possible.
Next, this volume addresses questions calling for extensions and modifications in the very framework of GQ theory. Two prominent examples of a still relatively reformist nature are the following.
Westerståhl considers the influence of changing local context on the interpretation of determiner expressions, arriving at a quaternary picture, involving not just two argument sets within a background universe, but also a more changeable context set. From a descriptive point of view, this move seems imperative when trying to capture the more dynamical aspects of interpretation. Theoretically, some interesting connections emerge with the logical operation of restricting quantifiers. For instance, it might be investigated
if the equivalence theorem in van Benthem (1984b) (for first-order definable quantifiers, Conservativity plus Extension are equivalent to restricted definability) can be extended to this more complex setting.
Van Deemter subjects to close scrutiny the finiteness restriction which played a conspicuous role in some earlier work. His conclusion is that large parts (though by no means all) of GQ theory can also be made to work quite generally, without this restriction. Still, there remains a lot of interest to the "finitizing" program, as it gives us several new and unexpected questions of fine structure, while offering a chance to reconsider entrenched "scientific" model-theoretic dogmas. One difficult but interesting issue raised by van Deemter is the question of how to characterize "reasonable" infinitary quantifiers, with a certain natural content. Here is one example, derived from an attempt to generalize the definability theorem for doubly monotone quantifiers on finite universes. Consider the following natural class of quantifiers Q, satisfying the properties of
D-MON: monotonicity (upward or downward) in both arguments,
PLUS: if (a,b) ∈ Q, and k is any cardinality (finite or infinite), then there exist k₁, k₂ with k = k₁ + k₂ such that (a + k₁, b + k₂) ∈ Q. And likewise for −.
The latter principle is van Deemter's generalization of the "smoothness" property PLUS. In order to exclude irrelevant considerations of higher infinite cardinality, we shall demand that Q be non-trivial on countable cardinalities (i.e., not always true or always false), a kind of "Löwenheim condition".
The only quantifiers satisfying the above demands are those in the finite "Squares of Opposition":
at most n not, at least n+1, at most n, at least n+1 not (n = 0, 1, 2, ...)
together with the infinitary square:
at most finitely many not, at least infinitely many, at most finitely many, at least infinitely many not.
This certainly delimits a very reasonable class in the above spirit.
PROOF: The quantifiers mentioned satisfy the above conditions. Conversely, we shall consider one (typical) example. Suppose that Q has type ↑MON↑: upward monotone in both arguments. Well-known arguments give its pattern in the finite tree of numbers: either it is empty, or it has a shape like this:
Consider the latter case. PLUS forbids irregular kinks in the boundary, and hence the shape must be that of at least k, for some k ≥ 0. Actually, k ≥ 1, since otherwise the first infinite row would be all + (by PLUS), and Q would become countably trivial. But then, the infinite rows of the tree are fully determined, through the following observations. At infinite level m,
all positions m,s with s ≥ k get + (apply upward monotonicity to: k,k ∈ Q and k ≤ m, k ≤ s),
the position m,k−1 gets − (since k,k−1 has −, using PLUS, adding m entities; notice that the − cannot be put any further toward the right),
all positions m,s with s < k−1 then get − (by upward monotonicity again).
Thus, this case becomes at least k all the way through.
In addition, the former case with a negative finite tree yields one infinitary quantifier. For, consider the position ℵ₀,ℵ₀. If it is −, then so are all n,ℵ₀ and ℵ₀,n for finite n (by ↑MON↑), making Q countably trivial. Therefore, ℵ₀,ℵ₀ must have +, and hence so do all nodes a,b with a ≥ ℵ₀, b ≥ ℵ₀ (by ↑MON) as well as all nodes towards their right (by MON↑). Finally, a suitable use of PLUS with respect to the − nodes n,n (n finite) gives − for all nodes a,n (a ≥ ℵ₀). Thus, the quantifier becomes at least infinitely many. The other three double monotonicity cases are similar. •
Perhaps most conspicuously, this volume also contains proposals for extending GQ studies to quantifiers and determiners in more complex syntactic settings. Here is where further descriptive work again starts fueling the theoretical development. Keenan and Moss argue for the linguistic interest of more-place determiners, such as "every ... and ...", "more ... than ...", extending large parts of the traditional theory to this area. (Still, many results are open here, especially concerning definability of denotations satisfying various constraints.) Also, they hint at a smooth treatment of noun phrases in object position
(rather than the above subject position), by taking a suitably liberal look at syntax. Indeed, it might be emphasized here that the GQ perspective is not tied exclusively to any particular syntactic analysis (not even to the rule S => NP VP). Within this development (though not represented in this book) is the recent move toward considering determiner denotations as applying to predicates of higher arities. For instance, it has been proposed to analyze donkey sentences such as "every farmer who owns a donkey, beats it", as being of the form "every (λx λy (x is a farmer who owns donkey y), λx λy (x beats y))" (Barwise; Fenstad). This move calls for a suitable generalization of "unary" determiner meanings (in the sense appropriate now) to these higher cases: a matter of some debate. For instance, the common reading with inclusion of couples in the above case has been disputed. And with higher-order quantifiers such as "most", there are even fewer convincing proposals. All the more reason, then, to subject such extensions of the framework to a sensitive formal scrutiny of ranges of possible meanings.
All these possible extensions remain rather close in spirit to the original extensional paradigm. Further roads will lead to the exploration of mass terms and plurals, i.e., continuous and collective quantification (cf. Lønning 1984), and perhaps also to the original Montagovian concern of intensionality from a GQ perspective (a first attempt is in van Benthem 1984a).
Finally, there are some major methodological differences to be observed in this book. Some contributions are within the "flat" relational approach, others have "hierarchical" NP structure. Some authors prefer using algebraic techniques, others set-theoretic ones. Compositional interpretation is a dear tenet to some, less so to others. And of course, there is a widely varying emphasis on description versus theory.
It is gratifying to observe that a semantic research program can work without resolving such issues at gun point, letting the various points of view make their specific contributions.
References
Barwise, J. & R. Cooper, (1981), 'Generalized Quantifiers and Natural Language', Linguistics and Philosophy 4:2, 159-219.
Benthem, J. van, (1983), 'Determiners and Logic', Linguistics and Philosophy 6:4, 447-478.
Benthem, J. van, (1984a), 'A Manual of Intensional Logic', CSLI Lecture Notes, vol. 1, Stanford.
Benthem, J. van, (1984b), 'Questions about Quantifiers', Journal of Symbolic Logic 49:2, 443-466.
Dowty, D. & B. Brodie, (1984), 'The Semantics of 'Floated' Quantifiers in a Transformationless Grammar', to appear in Proceedings WCCFL (1984).
Keenan, E. & Y. Stavi, (1981), 'A Semantic Characterization of Natural Language Determiners', to appear in Linguistics and Philosophy.
Ladusaw, W., (1979), Polarity Sensitivity as Inherent Scope Relations, dissertation, University of Texas, Austin.
Lønning, J.T., (1984), 'Mass Terms and Quantification', in J-E Fenstad et al. (eds.), 'Report of an Oslo Seminar in Logic and Linguistics', Institute of Mathematics, University of Oslo, preprint series 9.
Zwarts, F., (1984), Formal Semantics and Natural Language, dissertation, Rijksuniversiteit, Groningen. (To appear in GRASS.)