English Pages 505 [508] Year 2002
The Second Glot International State-of-the-Article Book
Studies in Generative Grammar 61
Editors
Harry van der Hulst, Jan Koster, Henk van Riemsdijk
Mouton de Gruyter Berlin · New York
The Second Glot International State-of-the-Article Book The Latest in Linguistics
edited by
Lisa Cheng Rint Sybesma
Mouton de Gruyter Berlin · New York
2003
Mouton de Gruyter (formerly Mouton, The Hague) is a Division of Walter de Gruyter GmbH & Co. KG, Berlin.
The series Studies in Generative Grammar was formerly published by Foris Publications Holland.
Printed on acid-free paper which falls within the guidelines of the ANSI to ensure permanence and durability.
Library of Congress Cataloging-in-Publication Data
The second glot international state-of-the-article book : the latest in linguistics / edited by Lisa Cheng, Rint Sybesma. p. cm. - (Studies in generative grammar ; 61) Includes bibliographical references. ISBN 3-11-017139-2 (cloth : alk. paper) - ISBN 3-11-017140-6 (pbk : alk. paper) 1. Linguistics. I. Cheng, Lisa Lai Shen. II. Sybesma, R. P. E. III. Series. P125.S37 2002 410-dc21 2002013694
ISBN 3-11-017139-2 (cloth) ISBN 3-11-017140-6 (paperback) Bibliographic information published by Die Deutsche Bibliothek Die Deutsche Bibliothek lists this publication in the Deutsche Nationalbibliografie; detailed bibliographic data is available in the Internet at .
© Copyright 2003 by Walter de Gruyter GmbH & Co. KG, D-10785 Berlin. All rights reserved, including those of translation into foreign languages. No part of this book may be reproduced in any form or by any means, electronic or mechanical, including photocopy, recording, or any information storage and retrieval system, without permission in writing from the publisher. Typesetting: Selignow Verlagsservice, Berlin. Printing: WB-Druck GmbH & Co., Rieden/Allgäu. Binding: Lüderitz & Bauer-GmbH, Berlin. Printed in Germany.
Table of Contents
Preface
The development of grammars
  David Lightfoot
Semantics and the Generative Enterprise
  J.-Marc Authier
The semantics of Mood
  Paul Portner
Three approaches to discourse and donkey anaphora
  Henriette de Swart
Floating quantifiers: Handle with care
  Jonathan David Bobaljik
No lack of determination
  Greg Carlson
Partitivity
  Helen de Hoop
Islands
  Anna Szabolcsi and Marcel den Dikken
Structure for coordination
  Ljiljana Progovac
Optionality in optimality-theoretic syntax
  Gereon Müller
The syntactic representation of linguistic events
  Sara Thomas Rosen
Syntactic approaches to cliticization
  M. Rita Manzini
Featural markedness in phonology: variation
  Keren Rice
Schwa in phonological theory
  Marc van Oostendorp
Distributed Morphology
  Heidi Harley and Rolf Noyer
Affiliations and addresses
Preface
The articles in this book (State-of-the-Articles) give an overview of recent developments in fifteen different areas of linguistics. For each topic, they address questions like what the main issues were five, ten or fifteen years ago, what hypotheses have been put forth over the years, what solutions have been proposed for what problems, etc. Reviewing a huge body of literature, each contribution summarizes the progress made and ends with an overview of the most pressing outstanding issues. An important part of each article is an extensive up-to-date bibliography.
An earlier version of each paper in this book appeared as a State-of-the-Article in the third and fourth volumes of Glot International (1998-1999). All papers have been updated and revised; depending on the area concerned and the time elapsed since the original publication, more or less radical revisions were made. In all cases the bibliography was updated.
We thank all authors for their cooperation and for being on time. We also thank Henk van Riemsdijk for his help in many respects.
Lisa Cheng
Rint Sybesma
The development of grammars David Lightfoot
Some aspects of how syntactic systems change over time are a function of the way in which they are acquired by children. Here I shall make a few points about the art of explaining change through acquisition.
1. The development of grammars in children
Grammars are mental entities which develop in the mind/brain of individual children. This development is data-driven only in part. Researchers have postulated genotypical principles which are available independently of experience and which do not have to be learned. These principles determine similarities among grammars, recurrent properties which hold of all grammars. Alongside the invariant principles, it is customary to postulate grammatical parameters, which children set on the basis of their linguistic experience and which account for grammar variation. So language acquisition proceeds as children set the parameters defined by Universal Grammar (UG), i.e. those genotypical principles and parameters which are relevant for the emergence of language in an individual. The parameters of UG are structural and abstract and that accounts for the "bumpiness" of language variation; even closely related languages generally differ from each other in clusters of superficial phenomena (Baker 2001).
Here I shall discuss the nature of the experience which triggers the development of grammars, arguing that children scan their environment for certain designated structures or "cues" and that they are not influenced by
the set of sentences generated by their grammars. Indeed, there are no independent "parameters"; rather, some cues are found in all grammars and some are found only in certain grammars; the latter constitute points of variation.
There is a second kind of development, which we shall turn to in the next section. Grammars may also develop from generation to generation. This is diachronic change. The central mystery for historical linguists taking our grammatical perspective is why they have anything to study: why do changes take place and why are languages not generally stable? If people produce utterances corresponding fairly closely to the capacity of their grammars, then children exposed to that production would be expected to converge on the same grammar. This is what one would expect if grammars have structural stability, as they must to some degree; children are not "trigger happy," developing different grammars whenever their trigger experiences differ just a little, but rather they may develop the same structural parameter settings despite some variation in experience. In that case, change would be expected only if there is a major disruption due to population movement (see below).
Not only is stability what one would expect naively and pretheoretically, but it is also what many learnability models would lead one to expect. Chomsky (1965) viewed children as endowed with a metric evaluating grammars which could generate the primary data to which they are exposed, along with appropriate structural descriptions for those data. The evaluation metric picked the grammar which conformed to the invariant principles of UG and was most successful in generating those data and those structural descriptions. The child selected a grammar which matched her input as closely as possible.
Again, if the data and the associated structural descriptions to which the child is exposed correspond fairly closely to the grammatical capacity of some older individual, one would expect the child's evaluation metric to select the same grammar as that older individual's. The same point holds for more recent models. Gibson and Wexler (1994) posit a Triggering Learning Algorithm (TLA), under which the child learner uses grammars to analyze incoming sentences and eventually converges on the correct grammar. Gibson and Wexler distinguish global and local triggers, but both are sentence-types (op. cit., p. 409). If the child learner cannot analyze a given sentence with the current grammar, then she follows a certain procedure to change one of the current parameter settings and tries to reprocess the sentence using the new set of parameter values. If analysis is now possible, then the new parameter value is adopted, at least for a while. So the TLA is error-driven and permits the child to pinpoint which parameter setting is incorrect when the learner's
grammar does not give the right results. There is much to be said about the way that this model works and Dresher (1999) has illuminating discussion, but what is crucial here is that the model has the child seeking grammars which permit analysis of incoming data, where the data consist of more or less unanalyzed sentences. Gibson and Wexler's Table 3 correlates sets of three parameter settings (Specifier-final/initial, Comp-final/initial, +/-verb-second) and sets of data (listed here in terms of primitives like Subject, Verb, First Object, Second Object). When exposed to some data set (righthand column), the child selects the appropriate grammar (lefthand column)... although it would not be easy for the child to know which data set she is exposed to; Gibson and Wexler do not discuss how children's memories store sets of sentences in relevant banks, as seems to be required by their model.
Parameter settings | Data in defined grammar
Spec-final, Comp-final, -V2 (VOS) | V S, V O S, V O1 O2 S, Aux V S, Aux V O S, Aux V O1 O2 S, Adv V S, Adv V O S, Adv V O1 O2 S, Adv Aux V S, Adv Aux V O S, Adv Aux V O1 O2 S
Spec-final, Comp-final, +V2 (VOS + V2) | S V, S V O, O V S, S V O1 O2, O1 V O2 S, O2 V O1 S, S Aux V, S Aux V O, O Aux V S, S Aux V O1 O2, O1 Aux V O2 S, O2 Aux V O1 S, Adv V S, Adv V O S, Adv V O1 O2 S, Adv Aux V S, Adv Aux V O S, Adv Aux V O1 O2 S
Spec-final, Comp-first, -V2 (OVS) | V S, O V S, O2 O1 V S, V Aux S, O V Aux S, O2 O1 V Aux S, Adv V S, Adv O V S, Adv O2 O1 V S, Adv V Aux S, Adv O V Aux S, Adv O2 O1 V Aux S
Spec-final, Comp-first, +V2 (OVS + V2) | S V, O V S, S V O, S V O2 O1, O1 V O2 S, O2 V O1 S, S Aux V, S Aux O V, O Aux V S, S Aux O2 O1 V, O1 Aux O2 V S, O2 Aux O1 V S, Adv V S, Adv V O S, Adv V O2 O1 S, Adv Aux V S, Adv Aux O V S, Adv Aux O2 O1 V S
Spec-first, Comp-final, -V2 (SVO) | S V, S V O, S V O1 O2, S Aux V, S Aux V O, S Aux V O1 O2, Adv S V, Adv S V O, Adv S V O1 O2, Adv S Aux V, Adv S Aux V O, Adv S Aux V O1 O2
Spec-first, Comp-final, +V2 (SVO + V2) | S V, S V O, O V S, S V O1 O2, O1 V S O2, O2 V S O1, S Aux V, S Aux V O, O Aux S V, S Aux V O1 O2, O1 Aux S V O2, O2 Aux S V O1, Adv V S, Adv V S O, Adv V S O1 O2, Adv Aux S V, Adv Aux S V O, Adv Aux S V O1 O2
Spec-first, Comp-first, -V2 (SOV) | S V, S O V, S O2 O1 V, S V Aux, S O V Aux, S O2 O1 V Aux, Adv S V, Adv S O V, Adv S O2 O1 V, Adv S V Aux, Adv S O V Aux, Adv S O2 O1 V Aux
Spec-first, Comp-first, +V2 (SOV + V2) | S V, S V O, O V S, S V O2 O1, O1 V S O2, O2 V S O1, S Aux V, S Aux O V, O Aux S V, S Aux O2 O1 V, O1 Aux S O2 V, O2 Aux S O1 V, Adv V S, Adv V S O, Adv V S O2 O1, Adv Aux S V, Adv Aux S O V, Adv Aux S O2 O1 V
Clark (1992) offers a similar kind of model but one which differs from that of Gibson and Wexler in that the child cannot pinpoint the source of a grammar's failure, revising particular parameter settings. Clark posits a Darwinian competition between grammars needed to parse sets of sentences. All grammars allowed by UG are available to each child and some grammars are used more than others in parsing what the child hears. A "genetic algorithm" picks those grammars whose elements are activated most often. A Fitness Measure compares how well each grammar fares, and the fittest grammars go on to reproduce in the next generation, while the least fit die out. Eventually the candidate grammars are narrowed to the most fit and the child converges on the correct grammar. Clark and Roberts (1993) used this model to give an account of changes affecting the verb-second properties of early French, by allowing an arbitrary degree of misconvergence by children.
There is a serious, technical problem with Clark's Fitness Measure. The Fitness Measure selects grammars which are most successful in parsing incoming data but there is no reason to suppose that a grammar with more parameters set correctly will be more successful in parsing/generating incoming data.
Dresher (1999) illustrates this by considering the settings needed to generate the phonological stress system of Selkup, computing the relative score the Fitness Measure would give them when applied to eight representative words. It isn't obvious what criterion the Fitness Measure should use, so he tried three different criteria: words correct, syllables correct, and main stress correct. Some results were (1).
(1)
Candidate | Parameters correct | Words correct | Syllables correct | Main stress correct
a. | 4/10 40% | 2/8 25% | 7/20 35% | 3/8 37.5%
b. | 6/10 60% | 1/8 12.5% | 7/20 35% | 5/8 62.5%
c. | 7/10 70% | 4/8 50% | 12/20 60% | 4/8 50%
d. | 8/10 80% | 5/8 62.5% | 14/20 70% | 5/8 62.5%
e. | 9/10 90% | 5/8 62.5% | 14/20 70% | 5/8 62.5%
f. | 9/10 90% | 3/8 37.5% | 10/20 50% | 3/8 37.5%
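The relation between correct parameter settings and apparent fitness in (1) can be checked mechanically. The following is a minimal sketch in Python, not part of Dresher's own presentation: the dictionary layout and the function name inversions are assumptions made for illustration, and only the counts themselves come from (1). It lists every pair of candidates in which the one with more correct parameters receives a strictly lower score.

```python
# Raw counts from (1): parameters correct (out of 10), words correct (out of 8),
# syllables correct (out of 20), main stress correct (out of 8).
# The data layout and names are illustrative assumptions, not Dresher's.
CANDIDATES = {
    "a": {"params": 4, "words": 2, "syllables": 7, "main_stress": 3},
    "b": {"params": 6, "words": 1, "syllables": 7, "main_stress": 5},
    "c": {"params": 7, "words": 4, "syllables": 12, "main_stress": 4},
    "d": {"params": 8, "words": 5, "syllables": 14, "main_stress": 5},
    "e": {"params": 9, "words": 5, "syllables": 14, "main_stress": 5},
    "f": {"params": 9, "words": 3, "syllables": 10, "main_stress": 3},
}
CRITERIA = ("words", "syllables", "main_stress")


def inversions(criterion):
    """Return pairs (x, y) where x has more correct parameters than y
    but a strictly lower score on the given criterion."""
    pairs = []
    for x, x_scores in CANDIDATES.items():
        for y, y_scores in CANDIDATES.items():
            more_params = x_scores["params"] > y_scores["params"]
            lower_score = x_scores[criterion] < y_scores[criterion]
            if more_params and lower_score:
                pairs.append((x, y))
    return pairs


if __name__ == "__main__":
    for criterion in CRITERIA:
        print(criterion, inversions(criterion))
```

Any non-empty result means that the criterion in question does not rise monotonically with the number of correctly set parameters; on these figures every one of the three criteria yields at least one such pair.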
Candidates (e) and (f) are each correct in all but one (different) parameter, but they are very different in their apparent fitness. (e) scores high, but no
higher than (d), which has fewer correct settings. Candidate (f), with only one parameter wrong, scores worse in every category than (c), which has three parameters wrong. And (a) does better than (b) in words correct, despite having only four correct parameter settings. Dresher also points out that these results can be influenced in unpredictable ways by the chance occurrence of various types of words. As a result, there is no simple relationship between success in parsing input and the number of parameters set correctly, which is a problem for Clark's Fitness Measure.
Tesar and Smolensky's (1998) Optimality Theory model is also error-driven: children are expected to converge on the same grammar which generates the overt data to which they are exposed. Children reject constraint rankings which fail to generate certain data that they hear, successively re-ranking the constraints in some more successful fashion.
What these models have in common is that learners eventually match their input, in the sense that they select grammars which generate the sentences of the input. Models of this type can characterize instances of language stability straightforwardly. The child converges on a grammar which analyzes the input successfully, where the input consists of sets of sentences, elements of E-language in the terminology of Chomsky (1986). In that case, the grammar will resemble closely the grammar/grammars which generate that input. Such models can also handle cases of mixed input under conditions of population movement. There again the child is presented with a set of data, in this case data yielded by diverse grammars; she converges on a grammar which is most successful in generating that data-set, sometimes a grammar quite different from any of those in the previous generation.
This would be a case of grammar change and the new grammar might yield structural descriptions and some sentences which differ from those of the input; but the new grammar would result from the child's effort to match the input sentences as closely as possible.
Of course, these are not pure input-matching models of the type advocated by MacWhinney and Bates (1989), in which it is mysterious why children should ever produce non-adult forms in any systematic way. Clark, Gibson and Wexler, and Tesar and Smolensky's children are not dependent only on the input; they operate in a space defined by UG. Consequently, each intermediate stage for the developing child is represented by some set of UG-defined parameter settings, and that set may generate non-adult forms.
However, it is a fact that sometimes children do not match their input at any stage, including the final stage. One instance would be abrupt, catastrophic change. It is difficult to see how failure to match input would be handled by models which are inherently input-matching.
Here I want to argue the following: existing models of learnability commit us to insisting that languages are basically stable. This conforms to the
views of many historians that change is inherently piecemeal and gradual. But a better model of learnability enables us to better understand historical change. Under this model we would expect language change sometimes to be abrupt, sudden, and "catastrophic". Our learnability model, in turn, allows us to capture the contingent nature of historical change and to avoid the excessively principled accounts of change offered by many historians.
Ironically, the best worked out model of parameter setting comes from phonology and the work of Dresher and Kaye (1990). Parameters have not played an extensive role in the phonological literature, but Dresher and Kaye identified parameters for stress systems, a well-studied area of phonology. Furthermore, they developed a "cue-based" theory of acquisition, now clarified, elaborated, and generalized by Dresher (1999). Under this view, UG specifies not only a set of parameters, but also for each parameter a cue. I amend this view slightly and say that cues which are realized only in certain grammars constitute the parameters, the points of variation between grammars. A cue is some kind of structure, an element of I-language, which is derived from the input. The cues are to be found in the mental representations which result from hearing, understanding, and "parsing" utterances. As a child understands an utterance, even partially, she has some kind of mental representation of the utterance. These are partial parses, which may differ from the full parses that an adult has. The learner scans those representations, derived from the input, and seeks the designated cues. If a cue is found, it is incorporated into the emerging grammar. Furthermore, the child scans the linguistic environment for cues only in simple domains; this is the "degree-0 learnability" of Lightfoot (1991, 1994).
Learners do not try to match the input; rather, they seek certain abstract structures derived from the input (elements of I-language), looking only at structurally simple domains, and they act on this without regard to the final result. That is, a child seeks cues and may or may not find them, regardless of what the emerging grammar can generate; the output of the grammar is entirely a by-product of the cues that the child finds, and the success of the grammar is in no way based on the set of sentences that it generates, unlike in input-matching models. The child's triggering experience, then, is best viewed as a set of abstract structures manifested in the mental representations which result from parsing utterances; some of those representations constitute partial parses, which lack some of the information found in mature, adult parses.
Dresher (1999) illustrates the cue-based model of acquisition with some phonological parameters. The essential feature is that a cue-based learner does not try to match target input forms, but uses them as sources of cues. The trigger consists not of sets of sentences but rather of partially analyzed syntactic structures, elements of I-language; these are the mental
representations resulting from parsing utterances. So cues are intensional elements, grammar fragments ("treelets" in the terminology of Fodor 1998). A cue-based learner determines whether a DP allows a Specifier on the basis of exposure to data which must be analyzed with a Spec preceding a D head, e.g. [DP [Spec John] [D 's] [NP hat]]. This cue may be identified only when the child has a partial analysis which treats John's and hat as separate words, the latter a head noun, etc. In this way, the order in which parameters appear to be set, the "learning path" (Lightfoot 1989), reflects dependencies among cues and follows from their internal architecture.
Less trivially, a cue-based learner acquires a verb-second grammar not by evaluating grammars against sets of sentences but on exposure to structures commencing with an XP followed immediately by a finite V, where there is no fixed grammatical or thematic relation between the initial phrasal category and the finite verb, effectively where the initial XP is a non-subject (Lightfoot 1997b, 1999b). This requires analyzing the XP as in SpecCP and so [SpecCP XP] is the cue for a verb-second system; the cue must be represented robustly in the mental representations resulting from parsing the primary linguistic data (PLD).
Some version of this cue-based approach to acquisition is implicitly assumed in some earlier work, notably in the work of Nina Hyams (1986, 1996) and in my own work (Lightfoot 1989, 1991). It has been productive for phonologists concerned with the parameters for stress systems (Dresher and Kaye 1990; Dresher 1999; Fikkert 1994, 1995), it has been invoked in syntax by Fodor (1998), and it represents something quite different from the input-matching learning algorithms of Gibson and Wexler, Clark, Tesar and Smolensky, and others. In fact, I see no reason to believe that there is any learning algorithm beyond the information provided specifically by UG.
2. Diachronic development of grammars
Turning now to language change, we note that the speech of no two people is identical, so it follows naturally that if one takes manuscripts from two eras, one will be able to identify differences and so point to language "change". In this sense languages are constantly changing in piecemeal, gradual, chaotic, and relatively minor fashion. However, historians also know that languages sometimes change in a bumpy fashion, several things changing at the same time, and then settle into relative stasis, in a kind of "punctuated equilibrium", to borrow a term from evolutionary biology. From the perspective adopted here, it is natural to try to interpret cascades of changes in terms of unitary changes in grammars, sometimes
having a wide variety of surface effects and perhaps setting off a chain reaction. So grammatical approaches to language change have focused on these large-scale changes, assuming that the clusters of properties tell us about the harmonies which follow from particular parameters, from identifying particular cues. By examining the clusters of simultaneous changes and by taking them to be related by properties of UG, we discover something about the nature of cues and about how they are identified.
Let us consider one case of a grammatical change, which is partially understood, using it as a case-study to show what further work is needed. It will show how the study of a change is intimately connected, under this approach, with work on grammatical theory and on cue-based acquisition.
Operations which associate inflectional features with the appropriate verb appear to be parameterized, and this has been the subject of a vast amount of work covering many languages (see, for example, the collection of papers in Lightfoot and Hornstein 1994). We can learn about the cues by considering how the relevant grammars could be attained, and that in turn is illuminated by how some grammars have changed. For ease of exposition, I follow work by Emonds (1978) and Pollock (1989) and I adopt the familiar basic clause-structure of (2).
(2) [CP Spec [C' C [IP Spec [I' I [VP Spec V]]]]]
Subjects occur in SpecIP and wh elements typically occur in SpecCP. Heads raise from one head position to another, so verbs may raise to I and then further to C. In fact, many grammars raise their verbs overtly to the position containing the inflectional elements (3), but English grammars, unusually, do not. We know this because English finite verbs cannot be separated from their complements by intervening material (4a) and do not occur in some initial C-like position (4b).
(3) a. Jeanne [I lit_i] [VP toujours e_i les journaux]
b. lit_i [IP elle e_i [VP toujours e_i les journaux]]
(4) a. *the women visited not/all/frequently Utrecht last week
b. *visited you Utrecht last week?
What is it that forces French children to have the overt V-to-I operation and what forces English children to lack the operation? It is reasonable to construe the English analysis as the default, as argued in Lightfoot (1993), Lasnik (1999: ch5), and Roberts (1999). There is no evidence available to the English-speaking child which would force her to select a covert movement over an overt, syntactic V-to-I movement. Children would need to know that (4a,b) do not occur, but these are negative data, therefore unavailable as input to children. In that case, the covert movement is the default setting.
Now one can ask what triggers the availability of an overt, syntactic V-to-I raising operation in grammars where it may apply. Some generalizations have emerged over the last several years. One is that languages with rich inflection may have overt V-to-I operations in their grammars, and rich inflection could be part of the trigger (Rohrbacher 1994, 1999). However, the presence of V-to-I raising cannot be linked with rich inflection in a simple, one-to-one fashion (Bobaljik 2001, Lightfoot 2002). It may be the case that if a language has rich inflection, then overt V-to-I raising is available (Lightfoot 1991; Roberts 1999). If there is no rich inflection, a grammar may have the raising operation (Swedish; see Lightfoot 1997a: n5) or may lack it (English). Indeed, English verb morphology was simplified radically and that simplification was complete by 1400; however, overt V-to-I movement disappeared only in the eighteenth century, so there was a long period when English grammars had very little verbal inflection but did have V-to-I movement.
In that case, there needs to be a syntactic trigger for V-to-I movement. So, for example, a finite verb occurring in C, i.e.
to the left of the subject NP (as in a verb-second language or in interrogatives), could only get there by raising first to I, and therefore inversion forms like (3b) in French could be syntactic triggers for V-to-I (see also Faarlund 1990 and Vance 1995 for illuminating discussion bearing on these matters).
Under a cue-based acquisition approach, one would say that the cue for grammars raising V to I is a finite verb in I, i.e. [I V], an element of I-language. Children seek this cue in the representations resulting from their (partial) parses, and they are sensitive only to what must be analyzed as [I V]. One unambiguous instance of [I V] is an I containing the trace of a verb which has moved on to C, as in the structure of (3b). Indeed, I would guess that this would be a very important expression of the cue, and I doubt that structures like (3a) would be robust enough to trigger V-to-I in isolation; this can be tested (see below). Adopting terminology from Clark
(1992), one can ask how robustly the cue is "expressed"; it is expressed robustly if there are many simple utterances which can be analyzed by the child only as [I V]. So, for example, the sentences of (3a,b) can only be analyzed by the French child (given what the child has already established about the emerging grammar) if the V lit raises to I; a simple sentence like Jeanne lit les journaux "Jeanne reads the newspapers", on the other hand, could be analyzed with lit raised to I overtly or covertly in the English style, and therefore it does not express the cue for the V-to-I operation.
Early English grammars manifested the V-to-I operation, but later grammars do not; the operation was lost at some point. From the perspective adopted here, the operation ceased to be cued. The cue for V-to-I raising, [I V], came to be expressed less in the light of three developments in early Modern English.
First, the modal auxiliaries (can, could, may, might, shall, should, will, would, must), while once instances of verbs that could raise to I, were recategorized such that they came to be base-generated as instances of I; they were no longer verbs and so sentences with a modal auxiliary ceased to include [I V] and ceased to express the cue for V-to-I movement. Sentences with a modal auxiliary, Kim must leave, are very common in ordinary speech addressed to young children, and the recategorization meant that they no longer expressed the cue. Sentences of this form existed at all stages, of course, but they came to be analyzed very differently after the change in category membership. The evidence for the recategorization is the obsolescence of (5), which follows if the modal auxiliaries are generated in I and therefore can occur only one per clause (5a), without an aspectual affix (5b,c), and mutually exclusively with the infinitival marker to, which also occurs in I (5d).
(5) a. John shall can do it
b. John has could do it
c. canning do it
d. I want to can do it
This change has been discussed extensively in Lightfoot (1979, 1991), Kroch (1989), Roberts (1985, 1993a), Warner (1983, 1993), and there is consensus that it was complete by the early sixteenth century.
Second, as periphrastic do came to be used in negatives like John did not leave and interrogatives like did John leave?, so there were still fewer instances of [I V]. Before periphrastic do became available, sentences like the women visited all Utrecht last week (4a) and visited you Utrecht? (4b) expressed the [I V] cue. Periphrastic do began to occur in significant numbers at the beginning of the fifteenth century and steadily increased in frequency until it stabilized into its modern usage by the mid-seventeenth
century. Ellegård (1953) shows that the sharpest increase came in the period 1475-1550.
Third, in early grammars with the much-discussed verb-second system all matrix clauses had a finite verb in C. Therefore all matrix clauses expressed the cue for V-to-I, [I V], on the assumption that V could move to C only by moving first to I. As these grammars were lost and as finite verbs ceased to occur regularly in C, so the expression of the cue for overt V-to-I raising was reduced correspondingly.
By quantifying the degree to which a cue is expressed, we can understand why English grammars lost the V-to-I operation and why they lost it after the modal auxiliaries were reanalyzed as non-verbs, as the periphrastic do became increasingly common, and as the verb-second system was lost. We can reconstruct a plausible history for the loss of V-to-I in English. What we are doing here is identifying when grammars changed and how the available triggering experiences, specifically those expressing the cue, seem to have shifted in critical ways prior to the grammatical change (Warner 1995 adopts the same logic).
We know from acquisition studies that children are sensitive to statistical shifts in input data. For example, Newport, Gleitman and Gleitman (1977) showed that the ability of English-speaking children to use auxiliaries appropriately results from exposure to non-contracted, stressed forms in initial positions in yes-no questions: the greater the exposure to these subject-auxiliary inversion forms, the earlier the use of auxiliaries in medial position. Also Richards (1990) demonstrated a good deal of individual variation in the acquisition of English auxiliaries as a result of exposure to slightly different trigger experiences. The issue is when trigger experiences differ critically, i.e. in such a way as to set some parameter differently.
Our conclusion in earlier work was that V-to-I movement was lost in the seventeenth century, later than suggested by Kroch (1989), Roberts (1993a) and others (in fact, Kroch's own figures from Ellegård show several sentence-types, namely positive intransitive questions, negative declaratives, and positive wh-object questions, with do less than 40% of the time at the very end of the sixteenth century, showing that V-to-I grammars were still very much in use). Warner (1997) now argues that the operation may have been lost as late as in the eighteenth century. He offers some statistics from Ellegård (1953) and Tieken-Boon van Ostade (1987). Ellegård shows that interrogative inversion with nonauxiliary in positive clauses (i.e. came he to London? as opposed to did he come to London?) occurred 27% of the time for 1625-50; 26% for 1650-1700. Tieken-Boon van Ostade shows a drop to 13% in the eighteenth century. Negative declaratives with a nonauxiliary (he came not to London as opposed to he did not come to London) occur 68% in 1625-1650, 54% in 1650-1700, dropping
sharply to 20% in the eighteenth century. The drop is actually sharper than these figures suggest; Tieken-Boon van Ostade's figures for the later period include a high proportion of recurrent items (know, doubt, etc.) which Ellegård omitted. A particularly interesting feature of these figures is the discrepancy between the interrogatives and the negatives, which lends some support to the hunch (above) that structures like those underlying (3b, 4b) are a more effective expression of the cue I[V] than structures like those of (3a, 4a). In any case, we see that structures like (4a) were robust and widely attested in the texts of the late seventeenth century and then they disappeared rapidly, the kind of bumpiness that the abstractness of the cues leads us to expect. The historical facts, then, suggest that lack of rich subject-verb agreement cannot be a sufficient condition for absence of overt V-to-I (pace Rohrbacher 1994, 1999), but it may be a necessary condition. Under this view the possibility of V-to-I not being triggered first arose in the history of English with the loss of rich verbal inflection; similarly in Danish and Swedish. That possibility never arose in Dutch, French, German, where verbal inflections remained relatively rich. Despite this possibility, V-to-I continued to be triggered and it occurred in grammars well after verbal inflection had been reduced to its present-day level. However, with the reanalysis of the modal auxiliaries, the increasing frequency of periphrastic do and the loss of the verb-second system, the expression of I[V] in English became less and less robust. That is, there was no longer anything very robust in children's experience which had to be analyzed as I[V], i.e. which required overt V-to-I, given that the covert operation was always available as the default.
In particular, sentences like (4a) with post-verbal adverbs and quantifiers had to be analyzed with the V in I but these cues were not robust enough to set the parameter and they disappeared quickly, a byproduct of the loss of V-to-I. This suggests that the expression of the cue dropped below some threshold, leading to the elimination of V-to-I movement. The next task is to quantify this generally; we should recognize that what is crucial is not the steady reduction in the expression of I[V] but rather the point at which the phase transition took place, when the last straw was piled on to the camel's back. This can be demonstrated by building a population model, tracking the distribution of the I[V] cues in the PLD, and identifying the point at which the parameter was reset and V-to-I ceased to be triggered (differing, of course, from one individual or one dialect area to another). This work remains to be done (see below) but one hopes to find correlations between the changing distribution of the cue and the change in grammars.
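The threshold idea in the preceding paragraph can be illustrated with a toy calculation. The following is a minimal sketch in Python, not a model proposed in the text: it assumes that a child acquires the V-to-I grammar only if at least `threshold_count` of the `n_sentences` she hears express the cue, and that speakers with V-to-I grammars produce cue-expressing sentences at a rate that declines from generation to generation for independent reasons. The function names, the sample size, the threshold and the rates are all hypothetical values chosen for illustration.

```python
from math import comb


def prob_cue_meets_threshold(cue_rate: float, n_sentences: int, threshold_count: int) -> float:
    """Probability that at least threshold_count of n_sentences express the cue.

    Each sentence is assumed (hypothetically) to express the cue independently
    with probability cue_rate, so this is an exact binomial tail.
    """
    return sum(
        comb(n_sentences, k) * cue_rate**k * (1.0 - cue_rate) ** (n_sentences - k)
        for k in range(threshold_count, n_sentences + 1)
    )


def simulate(speaker_cue_rates, n_sentences=200, threshold_count=20, initial_share=1.0):
    """Return the share of V-to-I grammars in each successive generation.

    speaker_cue_rates[t] is the assumed rate at which speakers WITH a V-to-I
    grammar produce cue-expressing sentences in generation t.
    """
    share = initial_share
    history = []
    for rate in speaker_cue_rates:
        # Only speakers with the V-to-I grammar contribute cue-expressing sentences.
        cue_rate_in_pld = share * rate
        share = prob_cue_meets_threshold(cue_rate_in_pld, n_sentences, threshold_count)
        history.append(share)
    return history


# Hypothetical, steadily declining cue rates (illustrative numbers only).
rates = [0.30, 0.25, 0.20, 0.15, 0.12, 0.10, 0.08, 0.06]
for generation, share in enumerate(simulate(rates), start=1):
    print(f"generation {generation}: share of V-to-I grammars = {share:.3f}")
```

Under these assumptions the cue rate falls smoothly while the share of V-to-I grammars stays close to 1 for several generations and then collapses within two or three. The sketch says nothing about why the cue rates decline; that is supplied from outside, as in the historical account above.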
3. Some context and comparisons
This grammatical, cue-based approach to diachrony explains changes at two levels. First, the cues postulated as part of UG which embody the points of parametric variation explain the unity of the changes, why superficially unrelated properties cluster in the way that they do. Second, the cues permit an appropriately contingent account of why the change took place, why children at a certain point converged on a different grammar: the expression of the cues changed in such a way that a threshold was crossed and a new grammar was acquired. That is as far as this model goes and it has nothing to say about why the distribution of the cues should change. That may be explained by claims about language contact or socially defined speech fashions but it is not a function of theories of grammar, acquisition or change, except under one set of circumstances, where the new distribution of cues results from an earlier grammatical shift; in that circumstance, one has a "chain" of grammatical changes in a kind of domino effect. One example would be the recategorization of the modal auxiliaries (above) which contributed to the loss of V-to-I. One can, of course, embed these grammatical accounts in an appropriate model of population change (Niyogi 2002, Niyogi and Berwick 1997, Yang 2002). If Kroch and Taylor (1997) are correct, the loss of verb-second from English grammars (another contributing factor for the loss of V-to-I) was a function of contact between speakers with distinct grammars. Notice that this approach to change is independent of any particular grammatical model. Warner (1995) offers a persuasive analysis of parametric shift using a lexicalist HPSG model, quite different from the one assumed here.
Interesting diachronic analyses have been offered for a wide range of phenomena, invoking different grammatical claims: Fontana (1993), van Kemenade (1987), Pearce (1990), Roberts (1993a,b, 1994, etc.), Sprouse and Vance (1999), Vance (1995) and many others.
Our general approach to abrupt change, where children acquire very different systems from those of their parents, comports with work on creolization under the view of Bickerton (1984, 1999), and the acquisition of signing systems by children exposed largely to unnatural input (Goldin-Meadow and Mylander 1990, Newport 1999, Supalla 1990). For several years Bickerton has worked on plantation creoles, where new languages appear to be formed in the space of a single generation. He argues, surely correctly, that situations in which "the normal transmission of well-formed language data from one generation to the next is most drastically disrupted" will tell us something about the innate component and how it determines acquisition (Bickerton 1999); it certainly shows that children do not always proceed by converging on grammars which match the input.
It is hard to see how input-matching models can succeed when children are exposed to unusual amounts of artificial and degenerate data, which in fact are not matched. In particular, it is hard to see how they could account for the early development of creole languages, as described by Bickerton and others. In these descriptions, early creole speakers are not matching their input, which typically consists to a large degree of pidgin data. Pidgins are primitive communication systems, cobbled together from fragments of two or more languages. They are not themselves natural languages and they tend not to last long, before giving way to a creole with all the hallmarks of a natural grammar. The first speakers of creoles go way beyond their input in some ways and in other ways fail to reproduce what they heard from their models, arriving at grammars which generate sentences and structural descriptions quite different from those of the input. Let us call this the "abrupt" view of creolization (following Thomason and Kaufman 1988). There is a dramatic discrepancy between what early creole speakers hear in childhood and what their mature grammars eventually characterize as well-formed, much greater than in non-creole contexts.
The abrupt view of creolization is more controversial than it should be. It offends a commitment to the proposition that languages generally change only gradually. This commitment is linked to a highly data-driven view of language acquisition, and it is widely and deeply held, including by creolists. Creolists committed to gradualism (e.g. Carden and Stewart 1988) insist that creoles emerge gradually as a result of changes introduced primarily by adults, as they relexify their own languages. Clearly gradual change exists and this is part of the story.
However, if this is generally true, if this is most of the story, and if creolization for the most part mirrors adult second language learning and is not abrupt and instantiated by children, then there is little reason for theoreticians to be interested in the phenomenon. Our data about the early stages of creole languages generally are not very rich, and if one is interested in adult second language learning, one is probably better off refining theories in the light of better data-sources.
Bickerton's enterprise is limited by the sketchiness of the available data for the earliest stages of creole languages, but the view that new languages emerge rapidly and fully formed despite very impoverished input receives striking support from work on signed languages. A critical fact here is that only about 10% of deaf children in the US are born to deaf parents who can provide early exposure to a conventional sign language. This means that the vast majority of deaf children are exposed initially to fragmentary signed systems which have not been internalized well by their primary models. This is often some form of Manually Coded English (MCE), which maps English into a visual/gestural modality. Goldin-Meadow and Mylander (1990) take these to be artificial systems, quite unlike natural languages in some ways, and they show how deaf children go beyond their models in such circumstances and "naturalize" the system, altering the code and inventing new forms which are more consistent with what one finds in natural languages. Supalla (1990) casts more light on this, showing that MCE morphology fails to be attained well by children, who fail to use many of the markers that they are exposed to and use other markers quite differently from their models. He focuses on deaf children who are exposed only to MCE with no access to American Sign Language (ASL), and he found that they restructure MCE morphology into a new system. Clearly this cannot be modeled by error-driven or input-matching learning devices, because the input is not matched. Not even close. Furthermore, it is not enough to say that MCE morphology simply violates UG constraints, because that would not account for the way in which children devise new forms. More is needed from UG. The unlearnability of the MCE morphology suggests that children are cue-based learners, programmed to scan for clitic-like, unstressed, highly assimilable inflectional markers. That is what they find standardly in spoken languages and in natural signed languages like ASL. If the input fails to provide such markers, then appropriate markers are invented; children seize appropriate kinds of elements which can be interpreted as inflectional markers. The acquisition of signed languages under these circumstances offers an opportunity to understand more about abrupt language change, creolization and about cue-based acquisition (Lightfoot 1999a). One particular case of great interest in this regard is the emergence of Nicaraguan Sign Language, as described by Kegl, Senghas and Coppola (1999).
The characterization of abrupt grammatical change sketched here makes sense only if one views grammars as individual mental entities, and not as some kind of social entity codifying the data attested in the texts of some period. Failure to make this simple distinction has entailed confusion in the literature, discussed in Lightfoot (1995). Of course, one can talk about the social distribution of these grammars. There has been interesting work on the replacement of one grammar by another, i.e. the spread of change through a speech community. So, Kroch and his associates (Kroch 1989; Kroch and Taylor 1997; Pintzuk 1999; Santorini 1992, 1993; Taylor 1990) have argued for coexisting grammars. That work postulates that speakers may operate with more than one grammar in a kind of "internalized diglossia" and it enriches grammatical analyses by seeking to describe the variability of individual texts and the spread of a grammatical change through a population.
Niyogi and Berwick (1997) have offered a population genetics computer model for describing the spread of new grammars. It is generally agreed
that certain changes progress in an S-curve but now Niyogi and Berwick provide a model of the emergent, global population behavior, which derives the S-curve. They postulate a learning theory and a population of child learners, a small number of whom fail to converge on preexisting grammars, and they produce a plausible model of population changes for the loss of null subjects in French. The fact that changes can be shown to progress through populations in an S-curve is not surprising to those who have written about chaotic systems and catastrophic changes (Lightfoot 1991: ch. 7), but the success of Niyogi and Berwick is to show that it is not impossibly difficult to compute (or simulate) grammatical dynamical systems; they show explicitly how to transform parameterized theories and memoryless learning algorithms to dynamical systems, producing results along the way.
However, the approach sketched here runs counter to three other pervasive lines of thought. One is the idea that all change is gradual and that abrupt, catastrophic change does not happen (Harris and Campbell 1995; Hopper and Traugott 1993; Carden and Stewart 1988). This is sometimes modeled in "lexicalist" theories of grammar, in which particular grammars differ from each other in terms of features on individual lexical items (see Lightfoot 1991: ch. 6 for discussion). This approach to change implies that language acquisition is data-driven, that children match their input, which may vary without limit. Where children appear not to match their input, it is claimed that access to more complete data would reveal that abrupt transitions do not happen. Of course, in dealing with historical texts, one is dealing with performance data which do not match grammars perfectly, least of all single grammars. This means that grammarians must interpret the data and each interpretation must find the most appropriate level of abstraction. For example,
Fries (1940) offered statistical data showing that Old English alternated between object-verb and verb-object order freely and that "the order of... words... has no bearing whatever upon the grammatical relationships involved" (p. 199). He found that object-verb order occurred 53% of the time around the year 1000 and that it was "gradually" replaced by verb-object order, reducing to 2% by the year 1500. However, his counts ignored the distinction between matrix and embedded clauses and he had no analysis of verb-second effects. If one makes such distinctions, one can show that Old English grammars most typically had object-verb order underlyingly and an operation of verb movement raising finite verbs to C in matrix clauses to yield verb-second order (van Kemenade 1987). Kroch and Taylor (1997) show that there was a dialect difference involving movement of finite verbs to C, and consequently the grammatical change consisted in a change in the order of the verb and its complement and in the loss of
verb-second grammars, each of which was catastrophic (Lightfoot 1997a).
A second incompatible line of thought is that there exists a theory of change with some content. If one has a theory of grammar and a theory of acquisition, it is quite unclear what a theory of change is supposed to be a theory of. A "theory of grammaticalization" (Hopper and Traugott 1993) is a sub-part of such a theory of change, insofar as it involves a claim that there is more grammaticalization over time. However, local causes are needed for each instance of grammatical change and it is not clear how it helps to postulate a general historical tendency (Lightfoot, in press).
A third approach with which I take issue is the tendency to incorporate historicist elements into UG. Keyser and O'Neil (1985) propose a condition that "whenever possible the language acquisition device reduces the level of optionality, either by change of status or rule loss"; their evidence comes from changes which they analyze as the loss of optional rules. Similarly, Bauer (1995) construes Latin as a thorough-going left-branching language which changes into a thorough-going right-branching language (French). She explains this on the grounds that left-branching languages (with non-agglutinating morphology) were hard to acquire: "Latin must have been a difficult language to master, and one understands why this type of language represents a temporary stage in linguistic development" (p. 188). So she explains her change not in a mysterious theory of history, but rather in terms of human biology: our brains work in such a way that complex structures in left-branching languages without agglutinative morphology are hard to acquire. This, of course, immediately raises the question of why early Latin would have been left-branching: "If left-branching structures are... acquired with greater difficulty, it is indeed legitimate to wonder why languages, in an early period, exhibit this kind of structure" (p. 216).
She concludes that this "still remains to be explained" (p. 217). In the same vein, Kiparsky (1997) appeals to "endogenous optimization" and Roberts (1993b) builds a weighting into UG so that UG effectively encourages learners to "grammaticalize" independently of what they experience through their PLD; this is said to promote Diachronic Reanalyses.
Historical linguists often see general directions to change and they explain this either by invoking laws of history (i.e. a "theory of change"; see Lightfoot 1979) or by attributing historical effects to genetic predispositions. So Keyser and O'Neil (1985) build a clause into UG predisposing us against optional rules. But for optional rules to be lost, they must first be introduced; if we are predisposed not to attain optional rules, one wonders how they would be triggered in the first place. The identical point holds of the inbuilt tendencies to branch to the right, to "optimize", and to grammaticalize.
These ideas reflect nineteenth-century efforts to find deterministic laws of history and the view that historical change is principled and law-governed. Historians are attracted to this view when they focus attention on changes which they believe recur in the history of many languages (see Lightfoot 1999b: ch. 2 for discussion). Rather, one needs a more contingent approach: two people attain different grammars only if exposed to triggers which differ in some relevant way, and therefore grammatical shifts are to be explained only by a prior change in the trigger experience. Language acquisition takes place in an individual by an interaction of intrinsic, native properties (UG etc.), the trigger experience that a given child is exposed to, and nothing else. Our model embodies this kind of contingency and characterizes change as chaotic and flukey. Change is chaotic and flukey, but nonetheless explainable to a degree.
4. Conclusion
I submit that work on abrupt creolization, the acquisition of signed languages, and on catastrophic historical change shows that children sometimes do not match their input. This work invites us to think of acquisition as cue-based; children scan the environment for certain elements of I-language in unembedded domains. These elements are not in the input directly, but they are derived from the input, in the mental representations yielded as children understand and "parse" their input. So a cue-based learner acquires a verb-second grammar not by evaluating grammars against sets of sentences but on exposure to structures commencing SpecCP[XP]. Similarly grammars with a V-to-I raising operation are triggered when a child confronts sentences which must contain an I[V] structure, the cue for such a grammar. The name is new but the cue-based approach to acquisition is assumed in earlier work, as we noted, and it comports well with work on the visual system, which develops as organisms are exposed to very specific visual stimuli, horizontal lines for example (Hubel 1978; Hubel and Wiesel 1962; Sperry 1968). Current theories of the immune system are similar; specific antigens amplify pre-existing antibodies. In fact, this is the kind of thing which is typical of selective learning quite generally (Piattelli-Palmarini 1986).
Cue-based acquisition is a radical departure from much current work on learnability, which postulates various forms of input matching. It is striking that so much of this work has children dealing with elements of E-language, often requiring that the system perform elaborate calculations, in effect. For example, one of the best known results of work on learnability, the Subset Principle of Berwick (1985), is usually construed
as calculating subset relations among sets of E-language and choosing among grammars accordingly. Dresher and Kaye (1990) show that the Subset Principle can be defined intensionally with respect to cues. The model advocated here plays down the centrality of E-language, and postulates children seeking elements of I-language in the input and selecting grammars accordingly; the model makes no reference to elements of E-language or to the output of the grammar.
The cue-based approach assumes with Lightfoot (1989) that there is a learning path, an order in which parameters are set. We have seen that a child cannot determine whether Specifiers precede heads until some analytical vocabulary has been developed. Similarly, the child cannot determine whether SpecCP is filled (in a verb-second language) until she has identified phrasal categories, learned that initial categories do not have any fixed grammatical or thematic role and (therefore) are followed directly by a finite verb. All of this represents prior stages of acquisition. Representations are elaborated step-by-step in the course of acquisition, and the cues needed become increasingly abstract and grammar-internal. In this model the learning path is part of linguistic theory, a function of the way in which the parameters and their cues are stated. Consequently, there may be no general learning algorithm distinct from the content of the grammar, along the lines of Gibson and Wexler's TLA or Clark's genetic algorithms, which are learning algorithms quite distinct from the grammars assumed. Dresher (1999) argues that the cue-based acquisition strategy is "deterministic" in the sense of Berwick (1985), in that the learner may not backtrack or undo parameter settings that have already been set. Under this view, one would expect there to be grammatical changes which are abrupt, and one would expect languages to differ from each other in bumpy ways.
We may seek to quantify the degree to which cues are represented in the PLD, showing that abrupt, catastrophic change takes place when those cues are expressed below some threshold of robustness. This enables us to avoid the circularity of historical explanations which, for example, attribute the loss of verb-second grammars in Middle French to the prior introduction of XP subject V ... forms. If we develop productive models for historical change along these lines, relating changes in simple cues to large-scale grammatical shifts, our results will have consequences for the way in which we study language acquisition. In particular, we shall not be surprised that changes sometimes occur abruptly. With the development of computer corpora, Niyogi and Berwick's results, and an explicit cue-based theory of acquisition, we have all the ingredients for success in the historical domain, synthesizing work on language change, acquisition, and variation.
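The population-level S-curve mentioned earlier can be illustrated with a very small simulation. What follows is a sketch under strong simplifying assumptions and is not Niyogi and Berwick's model: it assumes just two competing grammars and a fixed, hypothetical `advantage` that makes the new grammar slightly more likely to be acquired in each generation, so that the odds of the new grammar grow by a constant factor per generation. All names and numbers are illustrative.

```python
def spread(initial_share: float, advantage: float, generations: int) -> list[float]:
    """Share of speakers with the new grammar, generation by generation.

    Assumption (hypothetical): in every generation the new grammar is weighted
    by (1 + advantage) relative to the old one, so the odds new:old are
    multiplied by a constant factor each step.
    """
    shares = [initial_share]
    for _ in range(generations):
        x = shares[-1]
        weighted_new = x * (1.0 + advantage)
        shares.append(weighted_new / (weighted_new + (1.0 - x)))
    return shares


# Illustrative values only: 1% innovators, 50% relative advantage, 30 generations.
shares = spread(initial_share=0.01, advantage=0.5, generations=30)
for generation in range(0, 31, 5):
    bar = "#" * round(shares[generation] * 40)
    print(f"{generation:2d} {shares[generation]:.3f} {bar}")
```

The printed bars rise slowly, then steeply, then level off. Each individual in this sketch still has one grammar or the other; the gradual curve is a property of the population, which is compatible with abrupt change in individual grammars.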
Note
Many of the issues raised here are discussed in more detail in Lightfoot (1999b).
Bibliography
Baker, M. (1996). The polysynthesis parameter. Oxford: Oxford University Press.
Baker, M. (2001). The atoms of language: The mind's hidden rules of grammar. New York: Basic Books.
Battye, A. and I.G. Roberts (eds.) (1995). Clause structure and language change. Oxford: Oxford University Press.
Bauer, B. (1995). The emergence and development of SVO patterning in Latin and French. Oxford: Oxford University Press.
Berwick, R.C. (1985). The acquisition of syntactic knowledge. Cambridge, MA: MIT Press.
Bickerton, D. (1984). The language bioprogram hypothesis. Behavioral and Brain Sciences 7.2, 173-222.
Bickerton, D. (1999). How to acquire language without positive evidence: What acquisitionists can learn from creoles. In: DeGraff (ed.).
Bobaljik, J.D. (2001). The Rich Agreement Hypothesis in review. Ms, McGill University.
Carden, G. and W.A. Stewart (1988). Binding theory, bioprogram and creolization: Evidence from Haitian Creole. Journal of Pidgin and Creole Languages 3, 1-67.
Chomsky, N. (1965). Aspects of the theory of syntax. Cambridge, MA: MIT Press.
Chomsky, N. (1995). The Minimalist Program. Cambridge, MA: MIT Press.
Clark, R. (1992). The selection of syntactic knowledge. Language Acquisition 2.2.
Clark, R. and I.G. Roberts (1993). A computational approach to language learnability and language change. Linguistic Inquiry 24, 299-345.
DeGraff, M. (ed.) (1999). Language creation and language change: Creolization, diachrony, and development. Cambridge, MA: MIT Press.
Dresher, B.E. (1999). Charting the learning path: Cues to parameter setting. Linguistic Inquiry 30, 27-67.
Dresher, B.E. and J. Kaye (1990). A computational learning model for metrical phonology. Cognition 34, 137-195.
Ellegård, A. (1953). The auxiliary DO: The establishment and regulation of its use in English. Stockholm: Almqvist & Wiksell.
Emonds, J. (1978). The verbal complex V'-V in French. Linguistic Inquiry 9.1, 49-77.
Faarlund, J.T. (1990). Syntactic change: Toward a theory of historical syntax. Berlin: Mouton de Gruyter.
Fikkert, P. (1994). On the acquisition of prosodic structure. Ph.D. dissertation, University of Leiden.
Fikkert, P. (1995). Models of acquisition: How to acquire stress. In: J. Beckman (ed.) Proceedings of NELS 25. GLSA, University of Massachusetts.
Fodor, J.D. (1998). Unambiguous triggers. Linguistic Inquiry 29, 1-36.
Fontana, J.M. (1993). Phrase structure and the syntax of clitics in the history of Spanish. Ph.D. dissertation, University of Pennsylvania.
Fries, C. (1940). On the development of the structural use of word-order in Modern English. Language 16, 199-208.
Gibson, E. and K. Wexler (1994). Triggers. Linguistic Inquiry 25, 355-407.
Goldin-Meadow, S. and C. Mylander (1990). Beyond the input given: The child's role in the acquisition of language. Language 66, 323-355.
Harris, A. and L. Campbell (1995). Historical syntax in crosslinguistic perspective. Cambridge: Cambridge University Press.
Holmberg, A. (1986). Word order and syntactic features. Ph.D. dissertation, University of Stockholm.
Hopper, P. and E. Traugott (1993). Grammaticalization. Cambridge: Cambridge University Press.
Hubel, D. (1978). Vision and the brain. Bulletin of the American Academy of Arts & Sciences 31, no. 7, 28.
Hubel, D. and T. Wiesel (1962). Receptive fields, binocular interaction and functional architecture in the cat's visual cortex. Journal of Physiology 160, 106-154.
Hyams, N. (1986). Language acquisition and the theory of parameters. Dordrecht: Foris.
Hyams, N. (1996). The underspecification of functional categories in early grammar. In: H. Clahsen (ed.) Generative perspectives on language acquisition. Amsterdam: Benjamins.
Kegl, J., A. Senghas and M. Coppola (1999). Creation through contact: Sign language emergence and sign language change in Nicaragua. In: DeGraff (ed.).
Kemenade, A. van (1987). Syntactic case and morphological case in the history of English. Dordrecht: Foris.
Kemenade, A. van and N. Vincent (eds.) (1997). Parameters of morphosyntactic change. Cambridge: Cambridge University Press.
Keyser, S.J. and W. O'Neil (1985). Rule generalization and optionality in language change. Dordrecht: Foris.
Kiparsky, P. (1997). The rise of positional licensing. In: van Kemenade and Vincent (eds.).
Kroch, A. (1989). Reflexes of grammar in patterns of language change. Journal of Language Variation and Change 1, 199-244.
Kroch, A. and A. Taylor (1997). Verb movement in Old and Middle English: Dialect variation and language contact. In: van Kemenade and Vincent (eds.).
Lasnik, H. (1999). Minimalist analysis. Oxford: Blackwell.
Lightfoot, D.W. (1979). Principles of diachronic syntax. Cambridge: Cambridge University Press.
Lightfoot, D.W. (1989). The child's trigger experience: Degree-0 learnability. Behavioral and Brain Sciences 12.2, 321-334.
Lightfoot, D.W. (1991). How to set parameters: Arguments from language change. Cambridge, MA: MIT Press.
Lightfoot, D.W. (1993). Why UG needs a learning theory: Triggering verb movement. In: C. Jones (ed.) Historical linguistics: Problems and perspectives, 190-214. London: Longman. [Reprinted in Battye and Roberts (eds.)].
Lightfoot, D.W. (1994). Degree-0 learnability. In: B. Lust, G. Hermon and J. Kornfilt (eds.) Syntactic theory and first language acquisition: Crosslinguistic perspectives. Vol. 2. Hillsdale, NJ: Erlbaum.
Lightfoot, D.W. (1995). Grammars for people. Journal of Linguistics 31, 393-399.
Lightfoot, D.W. (1996). Review of Bauer 1995. Language 72.1, 156-9.
Lightfoot, D.W. (1997a). Catastrophic change and learning theory. Lingua 100, 171-192.
Lightfoot, D.W. (1997b). Shifting triggers and diachronic reanalyses. In: van Kemenade and Vincent (eds.).
Lightfoot, D.W. (1999a). Creoles and cues. In: DeGraff (ed.).
Lightfoot, D.W. (1999b). The development of language: Acquisition, change, and evolution. Oxford: Blackwell.
Lightfoot, D.W. (ed.) (2002). Syntactic effects of morphological change. Oxford: Oxford University Press.
Lightfoot, D.W. In press. Grammaticalization: Cause or effect? In: R. Hickey (ed.) Motives for language change. Cambridge: Cambridge University Press.
Lightfoot, D.W. and N. Hornstein (eds.) (1994). Verb movement. Cambridge: Cambridge University Press.
MacWhinney, B. and E. Bates (eds.) (1989). The crosslinguistic study of sentence processing. Cambridge: Cambridge University Press.
Newport, E.L. (1999). Reduced input in the acquisition of signed languages: Contributions to the study of creolization. In: DeGraff (ed.).
Newport, E., H. Gleitman and L.R. Gleitman (1977). Mother, I'd rather do it myself: Some effects and non-effects of maternal speech style. In: C. Snow and C. Ferguson (eds.) Talking to children: Language input and acquisition. Cambridge: Cambridge University Press.
Niyogi, P. (2002). The computational study of diachronic linguistics. In: Lightfoot (ed.).
Niyogi, P. and R.C. Berwick (1997). A dynamical systems model for language change. Complex Systems 11, 161-204.
Pearce, E. (1990). Parameters in Old French syntax. Dordrecht: Kluwer.
Piattelli-Palmarini, M. (1986). The rise of selective theories: A case study and some lessons from immunology. In: W. Demopoulos and A. Marras (eds.) Language learning and concept acquisition: Foundational issues. Norwood, NJ: Ablex.
Pintzuk, S. (1999). Phrase structures in competition: Variation and change in Old English word order. New York: Garland.
Pintzuk, S., G. Tsoulas and A. Warner (eds.) (2000). Diachronic syntax: Models and mechanisms. Oxford: Oxford University Press.
Platzack, C. (1986). Comp, Infl, and Germanic word order. In: L. Hellan and K. Koch Christensen (eds.) Topics in Scandinavian syntax. Dordrecht: Reidel.
Pollock, J.-Y. (1989). Verb movement, UG and the structure of IP. Linguistic Inquiry 20.3, 365-424.
Richards, B.J. (1990). Language development and individual differences: A study of auxiliary verb learning. Cambridge: Cambridge University Press.
Roberts, I.G. (1985). Agreement patterns and the development of the English modal auxiliaries. Natural Language and Linguistic Theory 3, 21-58.
Roberts, I.G. (1993a). Verbs and diachronic syntax. Dordrecht: Kluwer.
Roberts, I.G. (1993b). A formal account of grammaticalization in the history of Romance futures. Folia Linguistica Historica 13, 219-258.
Roberts, I.G. (1994). Two types of head movement in Romance. In: Lightfoot and Hornstein (eds.).
Roberts, I.G. (1999). Verb movement and markedness. In: DeGraff (ed.).
Rohrbacher, B. (1994). The Germanic VO languages and the full paradigm: A theory of V to I raising. Ph.D. dissertation, University of Massachusetts, Amherst.
Rohrbacher, B. (1999). Morphology-driven syntax: A theory of V to I raising and pro-drop. Amsterdam: Benjamins.
Santorini, B. (1992). Variation and change in Yiddish subordinate clause word order. Natural Language and Linguistic Theory 10, 595-640.
Santorini, B. (1993). The rate of phrase structure change in the history of Yiddish. Journal of Language Variation and Change 5, 257-283.
Sperry, R. (1968). Plasticity of neural maturation. Developmental Biology Supplement 2, 306-27.
Sprouse, R. and B. Vance (1999). An explanation for the loss of null subjects in certain Romance and Germanic languages. In: DeGraff (ed.).
Supalla, S. (1990). Segmentation of Manually Coded English: Problems in the mapping of English in the visual/gestural mode. Ph.D. dissertation, University of Illinois.
Taylor, A. (1990). Clitics and configurationality in Ancient Greek. Ph.D. dissertation, University of Pennsylvania.
Tesar, B. and P. Smolensky (1998). Learnability in Optimality Theory. Linguistic Inquiry 29, 229-268.
Thomason, S. and T. Kaufman (1988). Language contact, creolization, and genetic linguistics. Berkeley: University of California Press.
Tieken-Boon van Ostade, I. (1987). The auxiliary DO in Eighteenth-century English: A sociohistorical-linguistic approach. Dordrecht: Foris.
Vance, B. (1995). On the decline of verb movement to Comp in Old and Middle French. In: Battye and Roberts (eds.).
Vikner, S. (1994). Finite verb movement in Scandinavian embedded clauses. In: Lightfoot and Hornstein (eds.).
Warner, A.R. (1983). Review article on Lightfoot 1979. Journal of Linguistics 19, 187-209.
Warner, A.R. (1993). English auxiliaries: Structure and history. Cambridge: Cambridge University Press.
Warner, A.R. (1995). Predicting the progressive passive: Parametric change within a lexicalist framework. Language 71.3, 533-57.
Warner, A.R. (1997). The structure of parametric change, and V movement in the history of English. In: van Kemenade and Vincent (eds.).
Yang, C.D. (2002). Grammar competition and language change. In: Lightfoot (ed.).
Semantics and the Generative Enterprise J.-Marc Authier
1. Introduction

Everett (1998: 142-143) takes issue with Hornstein's (1995) claim that LF is "formally very similar to the logician's logical form" (which Everett labels lf). This is true, he says,

[...] only in the broadest and nearly meaningless of ways, namely, that both lf and LF have both recursion and syntax: lf has no quantifier raising (QR), no move α, [...] or full interpretation, to name a few obvious differences between lf and LF [...] I doubt if professional logicians would be very sanguine about identifying lf and LF, even in the most general of ways.

These remarks echo the widespread belief among natural language semanticists that if semantics is to be primarily concerned with the mechanisms/algorithms by which the meaning of words can be combined to form the meanings of larger constructs, then Chomsky's generative grammar can only be appreciated in terms of the contribution it makes to syntax (i.e. the study of phrase structure and word order). And yet Montague's model-theoretic treatment of natural language meaning, which has remained popular among semanticists because it is compositional in nature, is in no way incompatible with the syntactic apparatus currently utilized in the generative framework. As demonstrated in Authier and Reed (1999), the derivational structure building process advocated in Chomsky (1995) allows the type of semantics rules used by Montague to follow the lead of syntactic configurations in a step-by-step, compositional
fashion. Frege's principle of compositionality is after all based on the observation that we are capable of interpreting a theoretically infinite number of sentences and our interpretation of them is based on our knowledge of the meaning of a finite number of lexical items and our subconscious knowledge of a finite number of syntactic principles. Such observations are exactly what launched the generative enterprise.

What then has led to the perception that generative grammar and Montague grammar are distinct, incompatible theories of natural language? The principal answer to this question is that Tarski's model theory, which Montague applied to natural language semantics, was originally designed to fit the structure of the notation of logic and that this has led to the belief that the syntax of natural language should be the same as the syntax of logical notation, a.k.a. categorial grammar (cf. Gillon 1998). Unfortunately, categorial grammars, unlike Chomsky's generative grammar, do not recognize the relevance of tree geometry to grammatical theory, and, as a result, generativists have been reluctant to endorse model-theoretic semantics because such an endorsement could be interpreted as an abandonment of structural notions which are at the heart of Chomsky's model. By and large then, Chomskyan linguists have not been overly concerned with issues related to compositional meaning. Could it then be that generative grammar, which claims to be a theory of language, has little to say about meaning-related phenomena? In what follows, we will see that the answer to this question is negative, although the range of semantic phenomena explored in generative grammar only partially overlaps with that which is of interest to model-theoretic semanticists.
2. Mathematically versus empirically motivated criteria of adequacy for possible grammars
To understand the place semantics occupies within the framework of generative grammar, one must first understand the relationship of formal logic to the study of natural language. Carnap (1942) discusses this issue in depth. On the one hand, we have descriptive theories of natural language which investigate the structure and interpretation of natural language expressions; on the other hand, we have correlated formal/logical systems which investigate the mathematical properties descriptive theories can take. In the early days of the development of generative grammar, Chomsky was concerned with issues of mathematical linguistics, for example, where natural language grammars stand in the hierarchy of mathematically-defined languages. After supplementing the classical notion of a rewriting system introduced by Thue (1914) with some mechanism for "squeezing out" a language, Chomsky showed that a rewriting system can be used as a device for defining formal languages. He thus introduced different types of grammars (cf. Chomsky 1956, 1957, 1959) and, by the mid-sixties, the so-called four classes of the Chomsky hierarchy of grammars (i.e., recursively enumerable or type 0, context-sensitive or type 1, context-free or type 2, and regular or type 3) had become standard in the field of mathematical linguistics (see e.g. Mateescu and Salomaa 1997).

Around the same time, however, Chomsky abandoned this line of research to focus instead on finding empirically motivated restrictions on the class of natural language grammars, thereby identifying abstract principles of considerable explanatory power which constitute the core of Universal Grammar. The generative framework which has evolved as a result of this choice has often been criticized for being a framework of description whose mathematical properties are completely unknown (cf. Horrocks 1987: 284). Some have even lamented that the generative enterprise is perverse in this respect: Why not simply assume that the mathematics of the notation of classical quantificational logic, i.e. categorial grammar, underlies all possible descriptive theories of natural language syntax and that the model theory of classical quantificational logic bears a similar relation to possible descriptive theories of natural language semantics? Answering this criticism, Gillon (1998) argues that while nothing precludes the possibility that (at least some of) the structure of natural language will turn out to be the same as that of the notation of classical quantificational logic, the question is one which requires that we carefully study the possible structures of natural languages.
As Gillon (1998: 16) puts it:

The inference that an expression of natural language and a piece of mathematical notation have the same structure by dint of their expressing the same thing is as fallacious as the inference that an expression in English and an expression in Chinese have the same structure by dint of expressing the same thing.

Another reason underlying the argument that the theory of natural language may be, at least to some extent, independent from classical quantificational logic (i.e. the mathematical study of artificial language) is that the syntax of the latter, categorial grammar, makes no reference to those features of tree geometry which have been shown by generativists to be of great importance in the formulation of principles underlying natural language. For example, in Montague's (1974) paper entitled The proper treatment of quantification in ordinary English, the syntax used was originally based on a pure categorial syntax, that is, a syntax which allows only one
type of operation in forming expressions: concatenation. As is well-known, however, pure categorial syntax is ill-equipped to cope with phenomena such as word order, morphological features, and so on. Montague overcame these shortcomings in an ad-hoc manner, by incorporating a large range of syntactic operations (some of them construction specific), thus creating a rather unconstrained syntactic system. The syntax used by Montague (1974) did recognize constituency, however. The rule in (1), for instance, states how a determinerless noun phrase combines with an intransitive verb to form a sentence (note: P_IV = the set of expressions that are "intransitive verbs", P_T = the set of expressions that are "terms" (proper names and pronouns), and P_S = the set of expressions that are "sentences").

(1)
If δ ∈ P_IV and α ∈ P_T, then F_1(δ, α) ∈ P_S and F_1(δ, α) = α δ′, where δ′ is the result of replacing the main verb in δ by its third-person singular present form.
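As a rough illustration, rule (1) can be read as a function from a pair of expressions to a sentence. The sketch below is a toy under stated assumptions only: the sets P_T and P_IV, the lookup table THIRD_SINGULAR, and the function name f1 are hypothetical stand-ins rather than Montague's fragment, and the main verb is assumed to be the first word of the intransitive verb expression.

```python
# Toy sketch of rule (1). All names and lexical entries here are
# illustrative assumptions, not part of Montague's (1974) fragment.
P_T = {"John", "Mary", "he"}      # "terms": proper names and pronouns
P_IV = {"walk", "talk", "run"}    # "intransitive verbs"

# Hypothetical lookup table: base form -> third-person singular present.
THIRD_SINGULAR = {"walk": "walks", "talk": "talks", "run": "runs"}


def f1(delta: str, alpha: str) -> str:
    """Return the sentence alpha + delta', in the manner of rule (1)."""
    if delta not in P_IV or alpha not in P_T:
        raise ValueError("f1 is only defined for delta in P_IV and alpha in P_T")
    words = delta.split()
    # Assumption: the main verb is the first word of the verb expression.
    words[0] = THIRD_SINGULAR[words[0]]
    return " ".join([alpha] + words)


sentence = f1("walk", "John")
print(sentence)  # John walks
```

The only operation involved is concatenation plus a morphological adjustment, which is why a rule of this kind recognizes constituency without appealing to any notion of tree geometry.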
But the syntax used by Montague did not make use of features of tree geometry such as c-command to account for quantifier scope ambiguity, for example. To account for such phenomena, Montague formulated a second method of sentence construction, a "rule of quantification" which forms a new sentence out of a quantifier and a sentence containing a variable by substituting the quantifier for the variable. Since the same sentence can have two derivational histories, it can have two meanings. Hence, in Montague's view, an account of scope ambiguities presupposed a type of syntax which provides several ways of constructing sentences which display such ambiguities. A few years later, May (1977) incorporated this insight into generative grammar by proposing a rule of Quantifier Raising (QR) which adjoins non-wh quantificational phrases to the sentence which contains them at LF and provides several derivations for sentences containing multiple quantificational expressions of this type. Unlike Montague, however, May contended that scope is to be defined in terms of c-command. Prima facie, this divergence may not seem terribly important; after all both Montague and May agreed that scope ambiguities are the result of "multiple derivations". As we shall see shortly, however, it is May's claim that scope ambiguities are ultimately determined by features of tree geometry and similar claims linked to the coreference possibilities between noun phrases which have shaped the semantic research agenda in generative grammar up to this day.
3. Quantifier scope and the birth of generative "syntax-based" semantics

In predicate logic, variables may be bound by a quantifier iff they appear in its scope. The scope of a quantifier is the extent of the parenthesized expression that immediately follows the quantifier. So, for instance, in (2) the universal quantifier binds both x's in W(x) and A(x) but not the x in C(x), the latter being a free variable. In (2), C(y) can be substituted for C(x) without changing the meaning of the expression but one cannot substitute W(y) for W(x) or A(y) for A(x) without altering the meaning of (2).

(2)
∀x (W(x) → A(x)) & C(x)
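The bound/free distinction in (2) can be checked mechanically. The following is a minimal sketch, assuming a hypothetical encoding of formulas as nested tuples (the tags "pred", "and", "implies" and "forall" and the function name free_variables are illustrative choices, not a standard notation).

```python
# Formulas as nested tuples; this encoding is an illustrative assumption.
def free_variables(formula):
    """Return the set of variables not bound by any quantifier."""
    kind = formula[0]
    if kind == "pred":                # ("pred", name, variable)
        return {formula[2]}
    if kind in ("and", "implies"):    # (connective, left, right)
        return free_variables(formula[1]) | free_variables(formula[2])
    if kind == "forall":              # ("forall", variable, body)
        return free_variables(formula[2]) - {formula[1]}
    raise ValueError(f"unknown formula kind: {kind!r}")


quantified = ("forall", "x", ("implies", ("pred", "W", "x"), ("pred", "A", "x")))
formula_2 = ("and", quantified, ("pred", "C", "x"))
renamed = ("and", quantified, ("pred", "C", "y"))

print(free_variables(quantified))  # set()
print(free_variables(formula_2))   # {'x'}
print(free_variables(renamed))     # {'y'}
```

The x of C(x) is reported as free because it lies outside the body of the quantifier, which is also why renaming it to y leaves the quantified part of (2) untouched.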
The natural language sentence in (3) mimics the expression of predicate logic in (2). The pronoun she in the right conjunct can only be construed as a free variable, hence it must fall outside the scope of the universally quantified phrase every woman. If we define scope in terms of c-command, we immediately account for this fact. The parentheses in (2) can then be taken to correspond to the c-command domain of a quantifier phrase (or QP) in natural language syntax.

(3)
Every woman is angry and she's complaining.
But if quantifier scope is defined in terms of c-command, how do we account for the ambiguity of (4)? The wide scope reading for the universal QP immediately follows since the existential QP is in its c-command domain, but what predicts the possibility of a wide scope reading for the existential QP? (4)
Every child saw a clown.
May's (1977, 1985) answer to this question is that non-wh quantificational expressions (i.e. QPs) are subject to a syntactic covert adjunction process called Quantifier Raising (QR) which operates in a way such that two QPs in the same clause will c-command each other at LF. The basic assumption about quantifier scope in (5) then accounts for the ambiguous status of (4).

(5)
A quantifier α may take scope over another quantifier β iff α c-commands β at LF.
In order for this explanation to go through, however, one must ensure that QR is an obligatory process and explain why this should be so. One explanation found in Lasnik and Uriagereka (1988: 7-8) is that QR is forced by Chomsky's (1982) ban on vacuous quantification: If a sentence contains a quantifier, the latter must have scope, and that scope must include a variable, hence the quantifier must undergo QR. Other explanations have
been offered in recent years, and there is no real consensus at this point in time. The motivation for QR and other structural accounts of quantifier scope in the generative framework has variously been argued to be purely syntactic (see Beghelli 1995; Beghelli and Stowell 1997; Ernst 1998; Hornstein 1995; Kitahara 1996), syntactic but allowed only if it "leads to a distinct interpretation" in the sense of Reinhart (1993) and Fox (1994) (see Chomsky 1995: 377), syntactic but forced by the formal requirements imposed by quantificational determiners (see Kennedy 1997), or purely semantic (see Diesing and Jelinek 1995).

Another question concerns the nature of the adjunction site. This is an important question because if we allow the universal QP in (6) to adjoin to PP, no scope ambiguity will be expected, contrary to fact.

(6)
Someone talked to every child in town.
It is generally assumed in the literature on QR that the type of LF adjunction instantiated by QR should be limited to maximal projections that are nonarguments; that is, VP and IP (see May 1985; Aoun and Li 1989), although Ernst (1998) has recently proposed to additionally allow subject QPs to adjoin to a position below I in order to capture the narrow scope reading of subject QPs with respect to negation and modals. Thus, according to what we said so far, QR is an obligatory process adjoining QPs to IP or VP. Quantifier scope ambiguities are then accounted for by the structure-based scope principle in (5). Given these assumptions, the ambiguity of (7a) is captured via its two possible LF representations in (7b-c):

(7)
a. A boy danced with every girl.
b. [IP every girl_j [IP a boy_i [IP t_i danced with t_j]]]
c. [IP a boy_i [IP t_i [VP every girl_j [VP danced with t_j]]]]
These representations yield two readings: in (7b) the universal QP c-commands the existential one and therefore is allowed to scope over it by (5), and in (7c) the reverse is true. However, there is evidence that formulating scope in terms of the c-command relations that obtain between moved QPs without considering the traces (or copies) they leave behind is insufficient. For example, as early as May (1977), it was noted that bi-clausal raising constructions like (8) are just as ambiguous as simple clausal constructions like (7a) yet, as pointed out in May (1977), Aoun and Hornstein (1985), and Williams (1986), there is a body of evidence, exemplified by the unambiguous status of (9), indicating that QR is clause-bound.

(8)
A boy_i seems [IP t_i to have danced with every girl]

(9)
A boy thinks (that) [IP Tom danced with every girl]
Based on the ambiguity of examples like (8), Aoun and Li (1989) conclude that the determination of the relative scope of QPs is sensitive to the chain in which they occur (see also Aoun and Li 1993). Various versions of their Scope Principle in (10) are still in use in the most recent accounts of scope ambiguities (cf. Kitahara 1996; Ernst 1998; among others).

(10)
Aoun and Li's Scope Principle (1989: 151): A quantifier A has scope over a quantifier B in case A c-commands a member of the chain containing B.

To illustrate how (10) works, consider the LF representation (11) of (8):

(11)
[IP a boy_i [IP t_i seems [IP every girl_j [IP t_i to have danced with t_j]]]]
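The check that (10) performs on (11) can be sketched computationally. The code below is a minimal illustration under stated assumptions: the Node class, the sister-based definition of c-command, the VP labels, and the treatment of "to have danced with" as a single leaf are simplifications introduced here, not part of Aoun and Li's formulation.

```python
class Node:
    """A node in a phrase marker; leaves have no children."""

    def __init__(self, label, children=()):
        self.label = label
        self.children = list(children)
        self.parent = None
        for child in self.children:
            child.parent = self

    def dominates(self, other):
        """Proper domination: other is a descendant of this node."""
        return any(c is other or c.dominates(other) for c in self.children)


def c_commands(a, b):
    """Simplified c-command: some sister of a is b or dominates b."""
    if a.parent is None:
        return False
    return any(s is b or s.dominates(b) for s in a.parent.children if s is not a)


def has_scope_over(quantifier, chain):
    """Scope Principle (10): the quantifier c-commands a member of the chain."""
    return any(c_commands(quantifier, member) for member in chain)


# LF representation (11), built bottom-up (VP labels are assumed).
t_j = Node("t_j")
t_i_low = Node("t_i")
every_girl = Node("every girl_j")
t_i_high = Node("t_i")
a_boy = Node("a boy_i")
vp_low = Node("VP", [Node("to have danced with"), t_j])
ip4 = Node("IP", [t_i_low, vp_low])
ip3 = Node("IP", [every_girl, ip4])
vp_high = Node("VP", [Node("seems"), ip3])
ip2 = Node("IP", [t_i_high, vp_high])
ip1 = Node("IP", [a_boy, ip2])

chain_boy = [a_boy, t_i_high, t_i_low]
chain_girl = [every_girl, t_j]

print(has_scope_over(a_boy, chain_girl))      # True
print(has_scope_over(every_girl, chain_boy))  # True
```

Both calls succeed even though every girl does not c-command a boy itself: the second result depends entirely on the lower trace of a boy, which is the sense in which scope is computed over chains rather than over the moved QPs alone.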
In (11) the indefinite noun phrase c-commands the chain headed by the universally quantified NP and the latter c-commands the NP-trace left behind by the indefinite. Thus, by (10), either QP may have scope over the other. Aoun and Li (1989, 1993) also introduce another principle called the Minimal Binding Requirement to account for further complications in English as well as parametric differences between English and Chinese. We will return to this issue in a moment. For now, what is important to note is that scopal ambiguities in generative grammar are accounted for via syntax-based principles which utilize a feature of tree geometry, namely c-command, as well as the traces (or copies) left behind by QPs during the structure building process. In Minimalist Theory, a principle such as the Scope Principle is a Bare Output Condition; that is, a principle which uses information from the computational system but is determined from the "outside" at the syntax-semantics interface. Such "syntactically constrained" interpretive phenomena constitute the class of semantic facts that generativists have been primarily interested in.

Integrating QR into Minimalist Theory is not without problems, however. First, it is quite clear that this version of generative grammar assumes that "core" syntactic processes do not involve adjunction and that if at all possible, adjunction processes should be dispensed with entirely. Second, since movement is assumed to be morphologically driven, one must assume that QR is the result of the attraction of a QP by abstract morphological Q-features. But this, in turn, is problematic on at least two counts. First, as pointed out by Hornstein (1995), QR results in adjunction to at least VP and IP and this lack of a specific target for the movement is what one would expect if QR lacked a morphological trigger (but see Beghelli and Stowell 1997).
Second, Chomsky's (1995) version of Minimalist Theory clearly prohibits movement induced by weak interpretable features, which would seem to render QR an illegitimate operation within the computational system (but see Kennedy 1997). This raises the question of whether it is possible to reanalyze QR in Minimalist terms without invoking any form of adjunction. Both Hornstein (1995) and Kitahara (1996) argue that it is, though in rather different ways.

Hornstein (1995) proposes to treat quantifier scope ambiguities via a Bare Output Condition making reference to A-chains. His account incorporates the following standard Minimalist assumptions: (i) DPs move to the functional domain for Case checking purposes, (ii) arguments enter the derivation in the lexical domain of a sentence (i.e. VP) and (iii) movement is copying and deletion. In addition to this, Hornstein's theory of quantifier scope rests on three crucial assumptions. First, only a single link in an A-chain can be interpreted at the C-I interface. Consequently, in multi-membered A-chains, it is necessary to delete all but one link. Further, any member of an A-chain can be deleted. Second, quantifier scope is determined after the requisite deletion by the following principle:

(12)
A quantified argument Q1 takes scope over a quantified argument Q2 iff Q1 c-commands Q2 (and Q2 does not c-command Q1).
Finally, Hornstein assumes a version of Diesing's (1992) Mapping Hypothesis which requires that elements interpreted as definite or specific be outside the VP shell at LF. A direct consequence of this is that in A-chains headed by a specific or definite element, the links that must delete at LF are those inside the VP shell. To see how this theory works, consider the ambiguous sentence in (13), its phrase marker after movement induced by Case checking in (14), and the LF deletion options for this phrase marker in (15a-d) (parentheses indicate deletion). (13)
Someone attended every rally.
(14)
[AgrS someone [TP [AgrO every rally [VP someone attended every rally]]]]
(15)
a. [AgrS someone [TP [AgrO every rally [VP (someone) attended (every rally)]]]]
b. [AgrS someone [TP [AgrO (every rally) [VP (someone) attended every rally]]]]
c. [AgrS (someone) [TP [AgrO (every rally) [VP someone attended every rally]]]]
d. [AgrS (someone) [TP [AgrO every rally [VP someone attended (every rally)]]]]
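The way the four deletion options in (15) are generated, filtered and interpreted can be sketched as follows. This is a toy model under stated assumptions: the position names, the numeric HEIGHT ranking (used as a stand-in for c-command in the strictly right-branching phrase marker (14)), and all function names are hypothetical, introduced only for illustration.

```python
from itertools import product

# Assumption: in (14) a structurally higher position c-commands every
# lower one, so a numeric ranking can stand in for c-command.
HEIGHT = {"SpecAgrS": 3, "SpecAgrO": 2, "VP-subject": 1, "VP-object": 0}
INSIDE_VP = {"VP-subject", "VP-object"}

# Each A-chain lists the positions of its copies in (14).
CHAINS = {
    "someone": ["SpecAgrS", "VP-subject"],
    "every rally": ["SpecAgrO", "VP-object"],
}
# Following the text, quantifiers like 'every' count as specific.
SPECIFIC = {"every rally"}


def deletion_options():
    """Keep exactly one copy per chain, as in (15a-d)."""
    names = list(CHAINS)
    for kept in product(*(CHAINS[name] for name in names)):
        yield dict(zip(names, kept))


def satisfies_mapping_hypothesis(option):
    """Specific elements must be interpreted outside the VP shell."""
    return all(option[q] not in INSIDE_VP for q in SPECIFIC)


def wide_scope(option):
    """Principle (12), via the height ranking: the higher copy wins."""
    return max(option, key=lambda q: HEIGHT[option[q]])


licit = [o for o in deletion_options() if satisfies_mapping_hypothesis(o)]
readings = sorted({wide_scope(o) for o in licit})
print(readings)  # ['every rally', 'someone']
```

Of the four options, the two that leave every rally inside the VP are filtered out, and the remaining two differ only in which copy of someone survives, which is what yields the two scope readings.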
Assuming Diesing's Mapping Hypothesis, the representations in (15b,c) cannot be interpreted at the C-I interface because quantifiers like every are specific and, as a result, must be outside the VP shell after deletion. This leaves us with (15a) and (15d). Given (12), (15a) yields the wide scope reading for the existential and (15d) yields the wide scope reading for the universal. Hornstein's account has quantifier scope grammatically piggybacking on movement required for Case checking.

Kitahara (1996) also argues that the results obtained via QR can be derived "for free" via the Case checking operations independently needed in Minimalist Theory. His approach, however, is not based on the LF deletion of copies advocated by Hornstein but, rather, uses the Scope Principle in (16) which is very close, if not identical, to the one proposed by Aoun and Li (1989, 1993).

(16)
Kitahara's (1996) Scope Principle: A quantifier X has scope over a quantifier Y iff X c-commands a member of each chain associated with Y at LF.

In Kitahara's theory, the ambiguity of (13) is predicted by his Scope Principle in (16) as applied to (17), the phrase marker corresponding to (13) which has been constructed via Case checking operations.

(17)
[AgrS someone_i [TP [AgrO every rally_j [VP t_i attended t_j]]]]
In (17), someone c-commands both members of the unique Case checking chain formed by every rally and every rally c-commands one member of the unique Case checking chain formed by someone. Hence, by (16), each quantifier is predicted to be able to take scope over the other.

It is often the case that when there are two competing theories of a linguistic phenomenon in existence, new accounts which incorporate features of both soon follow. The linguistic phenomenon of quantifier scope is no exception. The theory of the syntax of quantifier scope developed in Beghelli (1995) and Beghelli and Stowell (1997), for example, is a checking theory of scope assignment which draws distinctions among quantifier types: while certain types take scope in their Case positions (à la Hornstein/Kitahara), other types must move to designated LF scope positions. This hybrid theory of scope recognizes five major classes of natural language quantifiers, incorporating insights of Szabolcsi (1997). Although space considerations prevent me from discussing the details of Beghelli and Stowell's account, it is important to note that such an account addresses the issue of the interpretive differences that exist between various universal quantifiers such as all, each and every, thereby invalidating some of the arguments presented by Hintikka (1997) in favor of the claim
that syntax-based generative approaches to quantifier scope leave a great deal of logical structure unaccounted for.

The quantifier scope ambiguities characteristic of English that we have seen up to this point are not consistently attested in the East Asian languages. Aoun and Li (1989, 1993) (see also Tsai 1994 and Cheng 1991: chapter 5) discuss some interesting parametric variation which any theory of quantifier scope in natural language must capture. The basic differences are the following: (i) in active sentences, object QPs may have scope over subject QPs in English but not in Chinese; (ii) in passive sentences, both English and Chinese display quantifier scope ambiguities; and (iii) in raising constructions like (8), English, but not Chinese, displays scope ambiguities. Both Hornstein (1995) and Ernst (1998) propose an account of these parametric differences, albeit in radically different terms. Space considerations prevent me from comparing their theories in detail. However, one crucial difference between them is that while Hornstein argues, following Aoun and Li (1989, 1993), that the difference in quantifier scope ambiguities between English and Chinese can be traced back to a structural difference between the two languages (i.e., English subjects enter the derivation within the VP and move to SpecAgrS while Chinese subjects enter the derivation directly in SpecAgrS), Ernst defends the view that this cross-linguistic variation is best captured in terms of the differing modes of Case assignment to subjects: Chinese nominative Case is assigned under government from Infl to SpecVP, causing Chinese subjects to behave like objects in that they "strongly" interrupt variable-binding relationships (see Ernst 1998 for details). Hornstein's assumptions concerning the derivational properties of Chinese subjects are at odds with Huang's (1993) argument, based on reconstruction effects with VP-fronting in Chinese, that subjects in that language originate in VP. Ernst's account, on the other hand, is at odds with Minimalist Theory in that it assumes that Case checking does not always take place in the checking domain of a head and that head government plays a significant role in the computational component. Both authors, however, agree, as did all of the other generativists whose work has been discussed, that quantifier scope ambiguities are constrained by c-command relations at LF. Thus, in generative terms, quantifier scope is regulated by a Bare Output Condition.
4. Other Bare Output Conditions
Until recently, core binding phenomena in the generative framework were assumed to be purely syntactic in nature (cf., e.g., Chomsky 1995: chapters 1-2), earlier attempts at capturing intrasentential anaphoric relations in pragmatic terms (see Reinhart 1986) having been shown to make incorrect predictions cross-linguistically (see Lasnik 1991). In the past seven years, however, the status of Binding Theory has been reevaluated, due to novel assumptions motivated on conceptual as well as empirical grounds. Both Chomsky (1995: chapter 4) and Epstein (1994) have proposed that the computational system (i.e., syntax) is strictly derivational. This automatically places Binding Theory at the syntax-semantics interface due to the representational character of the binding conditions (cf. Freidin 1997, among others). The very term 'binding' would seem, in fact, inappropriate under its traditional definition of 'coindexing and c-command' since the use of syntactic indices is questionable under Minimalist assumptions. As Chomsky (1995: 217, fn. 53) puts it, "Indices are basically the expression of a relationship, not entities in their own right. They should be replaceable without loss by a structural account of the relation they annotate." Chomsky (1995: 211) therefore proposes to reformulate the binding conditions as in (18). While still making reference to features of tree geometry (i.e., c-command), the binding conditions are now stated as constraints on interpretation:

(18)
A: If α is an anaphor, interpret it as coreferential with a c-commanding phrase in D.
B: If α is a pronominal, interpret it as disjoint from every c-commanding phrase in D.
C: If α is an r-expression, interpret it as disjoint from every c-commanding phrase.
Thus, the binding conditions are seen in Minimalist Theory as Bare Output Conditions, that is, modes of interpretation which use information from the computational system. This opens up new avenues of investigation because it now becomes possible for binding conditions to be computed relative to various types of semantic information. For example, in Authier (1998), I argue for a more articulated theory of the C-I interface, one which recognizes the ability of Bare Output Conditions such as referential disjointness conditions to make reference to semantic information from two distinct sources, one of which uses lexically encoded material known as conventional implicature or semantic presupposition. This hypothesis accounts for why lexical elements such as even, only, emphatic anaphors, etc. allow the nominal elements which they focus to seemingly violate the well-known disjointness requirements known as Condition B, Condition C and Weak Crossover (cf. (19)). It also leads to the conclusion that Condition A is not a Bare Output Condition and that proposals which attempt to derive its effects from movement constraints internal to the
computational system (cf. Heim, Lasnik and May 1991; Chomsky 1995: 104-105, 211) are probably on the right track.

(19)
a. Tom pities only Tom. (circumvents Condition C)
b. Which lawyer_i did even his_i clients hate t_i? (circumvents Weak Crossover)
Positive-polarity licensing also is arguably a Bare Output Condition, given the parallels in interpretation and distribution between Positive Polarity Items and pronominals pointed out by Progovac (1993: 168-171). Indeed, Progovac's proposal that the fact that Positive Polarity Items must take wide scope with respect to a clausemate negation be accounted for by appealing to interpretive options of the sort determined by Condition B would appear to place positive-polarity licensing at the C-I interface as a Bare Output Condition. There has been, and there still is, a heated debate as to the exact nature of polarity sensitivity at large. The following list of references, which is by no means exhaustive, will give the reader an appreciation of the complexity of the issues involved: Authier (1998); Baker (1970); Borkin (1971); Dowty (1994); Fauconnier (1975); Heim (1984); Hoeksema (1994); Jackson (1995); Kadmon and Landman (1993); Krifka (1992, 1994, 1995); Ladusaw (1983); LeGrand (1974); Linebarger (1987, 1991); Progovac (1993, 1994); Sánchez Valencia, Zwarts and van der Wouden (1994); van der Wouden (1994); Yoshimura (1994); Zwarts (1993). However, it is relatively easy to convince oneself that polarity sensitivity is at least partially statable in terms of Bare Output Conditions since what we are dealing with here is a matter of semantic scope determined by conditions making reference to c-command. For example, contrary to what has been claimed by many syntacticians, it is not the case that a Positive Polarity Item like something in (20) triggers "ungrammaticality" when it appears in the c-command domain of a clausemate negation. Rather, if it does, then it must be interpreted as taking wide scope over negation as (21) makes clear.

(20)
I didn't do something. (*¬∃ / ∃¬)
(21)
I didn't do something I was supposed to do and I got fired.
Progovac's (1993) theory of Positive Polarity Items is designed to account for such facts. In essence, she argues that such elements must take wide scope with respect to a clausemate negation and are free to take either wide scope or narrow scope with respect to a distant negation, because they are pronominal in some sense and pronominals cannot be in the scope of a local binder. Negative Polarity Items appear to have the opposite property; namely, they are marked items which must be interpreted as being in the scope of negation. At least in those cases in which a Negative Polarity Item co-occurs with a clausemate negation, it appears that scope should be defined in terms of c-command, as the contrast in (22) makes clear:
(22) a. I didn't do anything.
     b. *Anything wasn't done.
This contrast can be accounted for on the following assumptions: (i) in English, negation does not c-command the position in which subjects check off their nominative Case feature; (ii) the Unambiguous Scope Principle in (23); and (iii) the licensing condition on NPIs in (24).
(23) An operator A (e.g., negation) has unambiguous scope over a quantifier B in case A c-commands every member of the chain containing B.
(24) A Negative Polarity Item must be in the unambiguous scope of negation at LF.
In (22a), all of the links of the Case chain formed by the Negative Polarity Item are c-commanded by the negation, in compliance with (24). In (22b), on the other hand, the highest member of the Case chain is outside the c-command domain of the negation, in violation of (24). Let me hasten to add that this is speculative on my part and that I do not claim that this account can be extended to all of the data concerning negative polarity discussed in the literature. I do believe, however, that the facts in (22) indicate that a structure-based account is warranted, at least in the so-called "core" cases of Negative Polarity Item licensing.
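The reasoning behind (23) and (24) can be made concrete with a small sketch. The code below is only an illustration under simplifying assumptions that are not part of the text: the phrase marker is reduced to a toy [IP subject [I' n't [VP verb object]]] skeleton, c-command is computed from the first branching node, and the names `Node`, `c_commands`, `npi_licensed` and `clause` are hypothetical helpers introduced here.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(eq=False)
class Node:
    """A node in a toy phrase marker (labels and geometry are illustrative)."""
    label: str
    children: List["Node"] = field(default_factory=list)
    parent: Optional["Node"] = field(default=None, repr=False)

    def __post_init__(self) -> None:
        # Record the mother of each daughter so we can walk upward later.
        for child in self.children:
            child.parent = self


def dominates(a: Node, b: Node) -> bool:
    """True if a properly dominates b."""
    node = b.parent
    while node is not None:
        if node is a:
            return True
        node = node.parent
    return False


def c_commands(a: Node, b: Node) -> bool:
    """a c-commands b iff neither dominates the other and the first
    branching node dominating a also dominates b."""
    if a is b or dominates(a, b) or dominates(b, a):
        return False
    node = a.parent
    while node is not None and len(node.children) < 2:
        node = node.parent
    return node is not None and dominates(node, b)


def npi_licensed(negation: Node, chain: List[Node]) -> bool:
    """(23)/(24): negation must c-command every member of the NPI's chain."""
    return all(c_commands(negation, link) for link in chain)


def clause(subject: str, verb: str, obj: str):
    """Build [IP subject [I' n't [VP verb obj]]]. Negation sits below
    SpecIP, which encodes assumption (i) in a simplified form."""
    subj, v, o, neg = Node(subject), Node(verb), Node(obj), Node("n't")
    Node("IP", [subj, Node("I'", [neg, Node("VP", [v, o])])])
    return neg, subj, o


# (22a) I didn't do anything: the chain is just the object position.
neg_a, _, anything_a = clause("I", "do", "anything")
licensed_a = npi_licensed(neg_a, [anything_a])

# (22b) *Anything wasn't done: the chain is (SpecIP position, VP-internal trace).
neg_b, anything_b, trace_b = clause("anything", "done", "t")
licensed_b = npi_licensed(neg_b, [anything_b, trace_b])
```

On this toy geometry `licensed_a` is true and `licensed_b` is false, mirroring the contrast in (22): the head of the chain in SpecIP falls outside the c-command domain of negation.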
5. Bare Output Conditions and others

Given that in the generative framework, the primary focus of investigation concerning interpretive phenomena has been on conditions which make reference to the computational component, the question arises as to what relation, if any, Bare Output Conditions bear to other conditions on interpretation which do not make reference to tree geometry. As Chomsky (1995: 222-223) puts it:

[...] we do not know enough about the "external" systems at the interface to draw firm conclusions about conditions they impose, so the distinction between bare output conditions and others remains speculative in part. The problems are nevertheless empirical, and we can hope to resolve them by learning more about the language faculty and the systems with which it interacts.
In Authier (1999), I argue that it is sometimes possible for Bare Output Conditions to override other interpretive constraints and I define the general conditions under which the grammar allows this situation. My evidence comes from French, a language in which there exists a pronoun which is subject to both a semantic aspectual constraint and a noncoreference Bare Output Condition. I show that when these two constraints operate in an environment which causes them to put conflicting requirements on the pronoun, the Bare Output Condition has primacy over the aspectual constraint. The pronoun in question, known in the literature as "demonstrative ce", competes with personal pronouns inflected for number and gender such as il / elle 'he/she' in the subject position of predicate nominal sentences. Thus, one finds contrasts of the type in (25), adapted from Coppieters (1975):
(25) a. Si Max était bel et bien un meurtrier, *il / ce serait un homme traqué par la justice.
        if Max were really a murderer, IL/CE would-be a man tracked by the police
        'If Max were really a murderer, *he/CE would be a man hunted by the police.'
     b. Si Max commettait un meurtre, il / *ce serait alors un homme traqué par la justice.
        if Max committed a murder, IL/CE would-be then a man tracked by the police
        'If Max were to commit a murder, he/*CE would then be a man hunted by the police.'
Reed (1997) argues that what determines the distribution of ce in sentences like (25) is a semantic constraint which can be stated informally as in (26).
(26) In predicate nominal sentences, demonstrative ce must be used if the aspectual value of the sentence focuses on a consequent state, while personal pronouns like il must be used elsewhere.
On the other hand, Authier and Reed (1997) note that this same element obeys the noncoreference Bare Output Condition known as Condition C. In support of this contention, we offer paradigms such as (27).
(27) a. *Sylvie_i est convaincue que c_i'est une matheuse.
        Sylvie is convinced that CE.is a math expert
        'Sylvie is convinced that she is a math expert.'
     b. Robert est jaloux de Sylvie_i parce que c_i'est une matheuse.
        Robert is jealous of Sylvie because CE.is a math expert
        'Robert is jealous of Sylvie because she is a math expert.'
Interestingly, this syntax-sensitive noncoreference constraint on the interpretation of ce provides us with the means to test what happens when a semantic constraint which does not crucially make reference to structure and a constraint which does put conflicting requirements on a lexical item. To see how, consider first the sentence in (28):
(28) Léon_i veut que tous sachent que c_j'est le chef.
     Léon wants that all know that CE.is the boss
     'Leon wants everyone to know that he (≠ Leon) is the boss.'
In this example, the aspectual value of the most embedded clause focuses on a consequent state and, as a result, the semantic rule in (26) forces the choice of ce as the subject. However, ce is also subject to Condition C, which in turn forces the disjoint reading found in (28). French must nevertheless find a way to express the idea conveyed by the predicate calculus representation in (29):
(29) WANT (l, ∀x [PERSON (x) → KNOW (x, BOSS (l))])
     (i.e., Leon wants everyone to know that he (= Leon) is the boss.)
The problem is, of course, that expressing this idea creates a tension between the two principles which constrain the interpretation of ce. If il is substituted for ce in (28), coreference between this pronoun and the matrix subject will be allowed, since il is a pronoun subject to Condition B. However, the semantic rule in (26) will disallow the choice of il, since the clause in which it appears focuses on a consequent state. If, on the other hand, ce is selected in compliance with the semantic aspectual rule in (26), then Condition C will prohibit the interpretation in (29). In a nutshell, French cannot express the interpretation in (29) without allowing one of the two conflicting constraints to supersede the other and, as can be seen in (30), it is the noncoreference Bare Output Condition (i.e., Condition C) which prevails in this case.
(30) Léon_i veut que tous sachent qu'il_i est le chef.
     Léon wants that all know that IL is the boss
     Interpretation as in (29)
That (30) illustrates a case of an interpretive Bare Output Condition having primacy over a non-structure-based semantic constraint, and not a mere suspension of (26) in embedded contexts, is evidenced by the fact that il in (30) does not behave like a free pronoun. That is, the interpretation symbolized by the indices in (31) is impossible:
(31) *Léon_i veut que tous sachent qu'il_j est le chef.
     Léon wants that all know that IL is the boss
This follows from the fact that the interpretation conveyed by (31) can also be conveyed by (28), a sentence which does not violate the semantic constraint in (26). Thus, if both Condition C and the semantic constraint in (26) can be obeyed, the grammar requires that they be so. That is, the primacy of Bare Output Conditions over other semantic constraints is established only when no alternative output exists which would resolve the conflict. This in turn suggests that at least some of Chomsky's (1995) economy principles extend beyond LF to the cognitive system of language as a whole. Thus, I argue in Authier (1999) for a principle of representational economy as applied to the primacy of Bare Output Conditions along the lines of (32):
(32) Primacy of structure-based semantic conditions over other types of semantic conditions applies when it is necessary to ensure the availability of a particular interpretation and is prohibited from applying otherwise.
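The interaction between (26), Condition C and (32) can be summarized as a small decision procedure. The sketch below is a deliberately coarse model and not the formulation in Authier (1999): it considers only the two subject forms ce and il, reduces the aspectual rule to a single boolean, treats Condition C as a ban on ce being coreferent with a c-commanding antecedent, and assumes that il satisfies Condition B in all the cases at hand (the antecedent is never local). All function names are hypothetical.

```python
CANDIDATES = ("ce", "il")


def obeys_condition_c(pronoun: str, coreferent_with_c_commander: bool) -> bool:
    """Structure-based condition: ce may not corefer with a c-commanding antecedent."""
    return not (pronoun == "ce" and coreferent_with_c_commander)


def obeys_aspect_rule(pronoun: str, consequent_state: bool) -> bool:
    """Rule (26): ce with a consequent-state focus, il elsewhere."""
    return pronoun == ("ce" if consequent_state else "il")


def acceptable(pronoun: str, consequent_state: bool,
               coreferent_with_c_commander: bool) -> bool:
    """Principle (32): the aspectual rule may be overridden only when no
    candidate satisfies both conditions for the intended interpretation."""
    if not obeys_condition_c(pronoun, coreferent_with_c_commander):
        return False  # the Bare Output Condition is never overridden
    if obeys_aspect_rule(pronoun, consequent_state):
        return True
    some_form_obeys_both = any(
        obeys_condition_c(p, coreferent_with_c_commander)
        and obeys_aspect_rule(p, consequent_state)
        for p in CANDIDATES
    )
    return not some_form_obeys_both


# (28): ce, consequent state, disjoint reference
ex28 = acceptable("ce", consequent_state=True, coreferent_with_c_commander=False)
# (30): il, consequent state, coreferent with the matrix subject
ex30 = acceptable("il", consequent_state=True, coreferent_with_c_commander=True)
# (31): il, consequent state, disjoint reference
ex31 = acceptable("il", consequent_state=True, coreferent_with_c_commander=False)
```

Under these assumptions (28) and (30) come out acceptable and (31) does not, because in (31) the form ce would satisfy both conditions and so the override is not needed.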
6. On the mapping from LF to logical representations
As should be clear by now, virtually all of the semantic phenomena that have been studied within the framework of generative grammar are conditions on interpretation which make reference to tree geometry. Issues of compositional semantics, that is, how exactly one computes the meaning of complex linguistic expressions step by step in a bottom-up fashion, have seldom been addressed in that framework. In this last section, however, I will briefly discuss Diesing's (1992) proposal to link generative constructs to logical representations, because it does take seriously the question of what it means for semantic representations to be "read off" an LF phrase marker. One of Diesing's fundamental assumptions is that LF representations are mapped onto logical representations, and she develops specific proposals in order to do this. The structure of the model of grammar she assumes is in some ways reminiscent of Montague's (1974) model in "The proper treatment of quantification in ordinary English", since in the latter also natural language expressions are interpreted indirectly through the interpretation of the logical expressions they are translated into. The parallel in fact goes a little deeper. The structure of Montague's model is such that if an expression can be analyzed syntactically in more than one way, then it has more than one logical representation and therefore possibly more than one meaning. Diesing's account of the interpretive properties of bare-plural subjects of stage-level predicates proceeds along similar lines. She begins by assuming with Heim (1982) that indefinites in natural language are not represented as existential quantifiers but rather introduce variables into the logical representation of the sentence in which they appear. They receive quantificational force by virtue of being unselectively bound by a quantifier (cf. Lewis 1975) or, if no quantificational element is present in the sentence, they get bound by an implicit existential quantifier, an operation known as "existential closure", whose effect is to prevent the occurrence of free variables. For example, in a sentence like (33), the indefinites a cat and a dog introduce variables in the logical representation of the sentence, given in (34). Since no quantificational element is present in (33), the variables are bound by the implicit existential quantifier in (34), a quantifier which closes off the nuclear scope indicated by the brackets.
(33) A cat chased a dog.
(34) (∃x,y) [x is a cat ∧ y is a dog ∧ x chased y]
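Existential closure as just described can be sketched in a few lines. This is an illustration only: conditions are encoded as tuples of a predicate name followed by variable names, and the function names and the output notation (predicate-argument form rather than the prose-style conditions of (34)) are assumptions of the sketch, not Heim's or Diesing's formalism.

```python
from typing import Iterable, List, Sequence, Tuple

Atom = Tuple[str, ...]  # (predicate, argument, ...), e.g. ("chased", "x", "y")


def free_variables(atoms: Sequence[Atom], bound: Iterable[str] = ()) -> List[str]:
    """Variables occurring in atoms that are not already bound, in order of appearance."""
    bound_set = set(bound)
    seen: List[str] = []
    for _pred, *args in atoms:
        for var in args:
            if var not in bound_set and var not in seen:
                seen.append(var)
    return seen


def render(atoms: Sequence[Atom]) -> str:
    """Conjoin the conditions, e.g. cat(x) ∧ dog(y)."""
    return " ∧ ".join(f"{pred}({', '.join(args)})" for pred, *args in atoms)


def existential_closure(atoms: Sequence[Atom], bound: Iterable[str] = ()) -> str:
    """Bind every remaining free variable with an implicit existential quantifier."""
    body = f"[{render(atoms)}]"
    free = free_variables(atoms, bound)
    return f"∃ {','.join(free)} {body}" if free else body


# (33) A cat chased a dog: both indefinites contribute a variable and a condition.
lf_33 = existential_closure([("cat", "x"), ("dog", "y"), ("chased", "x", "y")])
```

Here `lf_33` corresponds to (34): neither variable is bound by an overt quantifier, so both are closed off existentially. Passing `bound=("x",)` models the case where x is already bound by some other operator and only y undergoes closure.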
Sentences which contain true quantifiers, however, introduce a restriction represented in the logical notation by restrictive clause formation, an operation which divides the sentence into a "semantic partition" which consists of the restrictive clause and the nuclear scope. Quantifiers of the form [every N], for example, introduce such a restriction in that they quantify over contextually given sets and thus are specific in the sense of Enç (1991). As a result, the restriction on a universally quantified NP like every cat in (35) is given an explicit representation in the restrictive clause between brackets in the logical representation in (36).
(35) Every cat chased a dog.
(36) ∀x [x is a cat] (∃y) y is a dog ∧ x chased y
Diesing's Mapping Hypothesis is then that LF material contained in VP is mapped into the nuclear scope while LF material outside the VP shell is mapped into a restrictive clause. Given this hypothesis, she explains the fact that bare-plural subjects of stage-level predicates can receive either a generic or an existential reading via the hypothesis that they can be mapped onto logical representations from two distinct syntactic positions. To illustrate, consider the bare plural subject firemen in (37a), which displays (at least) the two interpretations in (37b,c):
(37) a. Firemen are available.
     b. Gen_x,t [x is a fireman ∧ t is a time] x is available at t
     c. ∃x x is a fireman ∧ x is available
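The Mapping Hypothesis just stated can be sketched as a function from a two-way split of LF material to a tripartite formula. The encoding is hypothetical and simplified: conditions are tuples of a predicate and its variables, the time variable of (37b) is left out, and `map_lf` and its parameters are names introduced here for illustration rather than part of Diesing's proposal.

```python
from typing import List, Sequence, Tuple

Atom = Tuple[str, ...]  # (predicate, argument, ...)


def variables(atoms: Sequence[Atom]) -> List[str]:
    """Variables occurring in atoms, in order of first appearance."""
    seen: List[str] = []
    for _pred, *args in atoms:
        for var in args:
            if var not in seen:
                seen.append(var)
    return seen


def render(atoms: Sequence[Atom]) -> str:
    return " ∧ ".join(f"{pred}({', '.join(args)})" for pred, *args in atoms)


def map_lf(outside_vp: Sequence[Atom], inside_vp: Sequence[Atom],
           operator: str = "Gen") -> str:
    """Material outside VP goes to the restrictive clause, material inside VP
    to the nuclear scope; variables left unbound there are closed existentially."""
    restricted = variables(outside_vp)
    unbound = [v for v in variables(inside_vp) if v not in restricted]
    scope = render(inside_vp)
    if unbound:
        scope = f"∃ {','.join(unbound)} [{scope}]"
    if not outside_vp:
        return scope  # no restrictive clause: only the nuclear scope remains
    return f"{operator} {','.join(restricted)} [{render(outside_vp)}] {scope}"


# Bare plural interpreted in SpecIP: generic reading, cf. (37b) without the time variable.
generic = map_lf([("fireman", "x")], [("available", "x")])
# Bare plural interpreted in its VP-internal position: existential reading, cf. (37c).
existential = map_lf([], [("fireman", "x"), ("available", "x")])
# (35)/(36): every cat outside VP, a dog inside VP.
every_cat = map_lf([("cat", "x")], [("dog", "y"), ("chased", "x", "y")], operator="∀")
```

The only thing that differs between the two readings of (37a) in this sketch is the position from which the condition contributed by firemen is mapped, which is the point of the hypothesis.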
By Diesing's Mapping Hypothesis, if the bare plural firemen is interpreted in SpecIP, it is mapped into the restrictive clause of the logical formula in (37b), hence its (quantificational) generic interpretation. If, on the other hand, the bare plural is mapped onto a logical representation via the VP-internal position of its trace or copy, its logical translation appears in the nuclear scope of the logical expression, where it is subject to existential closure and is ultimately interpreted as having the force of existential quantification (cf. (37c)). Diesing's Mapping Hypothesis embodies the claim that, at least with respect to natural language quantification, there is a straightforward correspondence between the syntactic partitioning of sentences into lexical and functional domains and the semantic partitioning of Heim-style logical formulas into nuclear scope and restrictive clause. Diesing's Mapping Hypothesis, which has since been used in the theory of quantifier scope defended by Hornstein (1995), represents an important step in the generative framework toward providing serious answers to the question of how semantic representations follow the lead of syntactic structure. While many semanticists may feel that this trend is in its infancy, the recognition that scope and binding are Bare Output Conditions and that logical representations link to the computational system of Minimalist Theory are, in my view, exciting developments in generative grammar which I hope will lead to a more articulated theory of the syntax-semantics interface in this framework.
A Semantics and the Generative Enterprise Bibliography
Aoun, Joseph and Norbert Hornstein (1985). Quantifier types. Linguistic Inquiry 16, 623-636.
Aoun, Joseph and Yen-hui Audrey Li (1989). Scope and constituency. Linguistic Inquiry 20, 141-172.
Aoun, Joseph and Yen-hui Audrey Li (1993). Syntax of Scope. Cambridge, MA: MIT Press.
Authier, J.-Marc (1998). On presuppositions and (non)coreference. Studia Linguistica 52, 244-275.
Authier, J.-Marc (1999). On the issue of syntactic primacy: Evidence from French. Probus 11, 165-176.
Authier, J.-Marc and Lisa Reed (1997). On some split binding paradigms. Natural Language and Linguistic Theory 15, 429-463.
Authier, J.-Marc and Lisa Reed (1999). Structure and Interpretation in Natural Language. Munich, Germany: LINCOM EUROPA.
Baker, C.L. (1970). Double negatives. Linguistic Inquiry 1, 169-186.
Beghelli, Filippo (1995). The phrase structure of quantifier scope. Ph.D. dissertation, University of California, Los Angeles.
Beghelli, Filippo and Tim Stowell (1997). Distributivity and negation: The syntax of each and every. In: Anna Szabolcsi (ed.), Ways of Scope Taking, 71-107. Dordrecht: Kluwer.
Borkin, Ann (1971). Polarity items in questions. Proceedings of The Seventh Regional Meeting of The Chicago Linguistic Society, 53-62.
Carnap, Rudolf (1942). Introduction to Semantics. Cambridge, MA: Harvard University Press.
Cheng, Lisa Lai Shen (1991). On the typology of wh-questions. Ph.D. dissertation, MIT.
Chomsky, Noam (1956). Three models for the description of language. I.R.E. Transactions on Information Theory, Vol. IT-2, Proceedings of The Symposium on Information Theory, 113-124.
Chomsky, Noam (1957). Syntactic Structures. 's-Gravenhage: Mouton.
Chomsky, Noam (1959). On certain formal properties of grammars. Information and Control 2, 137-167.
Chomsky, Noam (1982). Some Concepts and Consequences of The Theory of Government and Binding. Cambridge, MA: MIT Press.
Chomsky, Noam (1995). The Minimalist Program. Cambridge, MA: MIT Press.
Coppieters, René (1975). The opposition between il and ce and the place of the adjective in French. In: Susumu Kuno (ed.), Harvard Studies in Syntax and Semantics, 221-280. Cambridge, MA: Harvard University Press.
Diesing, Molly (1992). Indefinites. Cambridge, MA: MIT Press.
Diesing, Molly and Eloise Jelinek (1995). Distributing arguments. Natural Language Semantics 3, 123-176.
Dowty, David (1994). The role of negative polarity and concord marking in natural language reasoning. Proceedings of Semantics and Linguistic Theory IV, 114-144. Ithaca, NY: Cornell University.
Enç, Mürvet (1991). The semantics of specificity. Linguistic Inquiry 22, 1-25.
Epstein, Samuel (1994). Un-principled Syntax and The Derivation of Syntactic Relations. Manuscript, Harvard University.
Ernst, Thomas (1998). Case and the parameterization of scope. Natural Language and Linguistic Theory 16, 101-148.
Everett, Daniel (1998). Review of Hornstein (1995). Language 74, 142-146.
Fauconnier, Gilles (1975). Polarity and the scale principle. Proceedings of The Eleventh Regional Meeting of The Chicago Linguistic Society, 188-199.
Fox, Danny (1994). Economy, Scope and Semantic Interpretation: Evidence from VP Ellipsis. Manuscript, Cambridge: MIT.
Freidin, Robert (1997). Review of Chomsky (1995). Language 73, 571-582.
Gillon, Brendan (1998). Semantics: Two views. Manuscript, Montreal: McGill University.
Heim, Irene (1982). The semantics of definite and indefinite noun phrases. Ph.D. dissertation, University of Massachusetts, Amherst.
Heim, Irene (1984). A note on negative polarity and downward entailingness. In: Charles Jones and Peter Sells (eds.), Proceedings of North East Linguistic Society 14, 98-107. Amherst: GLSA, University of Massachusetts.
Heim, Irene, Howard Lasnik and Robert May (1991). Reciprocity and plurality. Linguistic Inquiry 22, 63-101.
Hintikka, Jaakko (1997). No scope for scope? Linguistics and Philosophy 20, 515-544.
Hoeksema, Jack (1994). On the grammaticalization of negative polarity items. In: Susanne Gahl, Andy Dolbey and Christopher Johnson (eds.), Proceedings of The Twentieth Annual Meeting of The Berkeley Linguistics Society, 273-282. Berkeley: Berkeley Linguistics Society, University of California.
Hornstein, Norbert (1995). Logical Form: From GB to Minimalism. Cambridge, MA: Blackwell.
Horrocks, Geoffrey (1987). Generative Grammar. London and New York: Longman.
Huang, C.-T. James (1993). Reconstruction and the structure of VP: Some theoretical consequences. Linguistic Inquiry 24, 103-138.
Jackson, Eric (1995). Weak and strong negative polarity items: Licensing and intervention. Linguistic Analysis 25, 181-208.
Kadmon, Nirit and Fred Landman (1993). Any. Linguistics and Philosophy 16, 279-298.
Kennedy, Christopher (1997). Antecedent-contained deletion and the syntax of quantification. Linguistic Inquiry 28, 662-688.
Kitahara, Hisatsugu (1996). Raising quantifiers without quantifier raising. In: Werner Abraham, Samuel Epstein, Höskuldur Thráinsson and Jan-Wouter Zwart (eds.), Minimal Ideas: Syntactic Studies in The Minimalist Framework, 189-198. Amsterdam and Philadelphia: John Benjamins.
Krifka, Manfred (1992). Some remarks on polarity items. In: Dietmar Zaefferer (ed.), Semantic Universals and Universal Semantics, 150-189. Berlin: Foris.
Krifka, Manfred (1994). The semantics and pragmatics of weak and strong polarity items in assertions. Proceedings of Semantics and Linguistic Theory IV, 195-220. Ithaca, NY: Cornell University.
Krifka, Manfred (1995). The semantics and pragmatics of polarity items. Linguistic Analysis 25, 209-257.
Ladusaw, William (1983). Logical form and conditions on grammaticality. Linguistics and Philosophy 6, 292-373.
Lasnik, Howard (1991). On the necessity of binding conditions. In: Robert Freidin (ed.), Principles and Parameters in Comparative Grammar, 7-28. Cambridge, MA: MIT Press.
Lasnik, Howard and Juan Uriagereka (1988). A Course in GB Syntax. Cambridge, MA: MIT Press.
LeGrand, Jean (1974). AND and OR: Some SOMEs and all ANYs. Proceedings of The Tenth Regional Meeting of The Chicago Linguistic Society, 390-401.
Lewis, David (1975). Adverbs of quantification. In: Edward Keenan (ed.), Formal Semantics of Natural Language, 3-15. Cambridge: Cambridge University Press.
Linebarger, Marcia (1987). Negative polarity and grammatical representation. Linguistics and Philosophy 10, 325-387.
Linebarger, Marcia (1991). Negative polarity as linguistic evidence. Proceedings of The Twenty-seventh Regional Meeting of The Chicago Linguistic Society [II, Papers from The Parasession on Negation].
Mateescu, Alexandra and Arto Salomaa (1997). Aspects of classical language theory. In: Grzegorz Rozenberg and Arto Salomaa (eds.), Handbook of Formal Languages, Vol. 1, 175-251. Berlin, Heidelberg and New York: Springer.
May, Robert (1977). The Grammar of Quantification. Ph.D. dissertation, MIT.
May, Robert (1985). Logical Form: Its Structure and Derivation. Cambridge, MA: MIT Press.
Montague, Richard (1974). Formal Philosophy: Selected Papers of Richard Montague. Edited by Richmond Thomason. New Haven: Yale University Press.
Progovac, Ljiljana (1993). Negative polarity: Entailment and binding. Linguistics and Philosophy 16, 149-180.
Progovac, Ljiljana (1994). Negative and Positive Polarity: A Binding Approach. Cambridge: Cambridge University Press.
Reed, Lisa (1997). Pronominalized aspect. Studia Linguistica 51, 121-153.
Reinhart, Tanya (1986). Center and periphery in the grammar of anaphora. In: Barbara Lust (ed.), Studies in The Acquisition of Anaphora, 123-150. Dordrecht: D. Reidel.
Reinhart, Tanya (1993). Wh-in-situ in the framework of the Minimalist Program. Manuscript, Tel Aviv University.
Sánchez Valencia, Victor, Frans Zwarts and Ton van der Wouden (1994). Polarity, veridicality, and temporal connectives. In: P. Dekker and Martin Stokhof (eds.), Proceedings of The Ninth Amsterdam Colloquium, 587-606. Amsterdam: University of Amsterdam.
Szabolcsi, Anna (1997). Strategies for scope taking. In: Anna Szabolcsi (ed.), Ways of Scope Taking, 109-154. Dordrecht: Kluwer.
Thue, Axel (1914). Probleme über Veränderungen von Zeichenreihen nach gegebenen Regeln. Skrifter utgit av Videnskapsselskapet i Kristiania I, 10.
Tsai, Wei-Tien (1994). On Economizing The Theory of A'-dependencies. Ph.D. dissertation, MIT.
Williams, Edwin (1986). A reassignment of the functions of LF. Linguistic Inquiry 17, 265-299.
Wouden, Ton van der (1994). Polarity and 'illogical negation'. In: Makoto Kanazawa and Christopher Piñón (eds.), Dynamics, Polarity, and Quantification. Stanford: CSLI.
Yoshimura, Akiko (1994). A cognitive constraint on negative polarity phenomena. In: Susanne Gahl, Andy Dolbey and Christopher Johnson (eds.), Proceedings of The Twentieth Annual Meeting of The Berkeley Linguistics Society, 599-610. Berkeley: Berkeley Linguistics Society, University of California.
Zwarts, Frans (1993). Three Types of Polarity. Manuscript, University of Groningen.
The semantics of Mood
Paul Portner

1. Introduction
The formal study of the semantics of mood has a long history based in the analysis of indicative and subjunctive conditionals (Chisholm 1949; Anderson 1951; Stalnaker 1975; Adams 1979). Another longstanding literature pertaining to mood is that on speech acts and illocutionary force (for example Austin 1962; Strawson 1964; Stenius 1967; Searle 1969; Searle & Vanderveken 1985). For many years the study of mood within generative linguistics tended to focus on developing the interests of this originally philosophical literature in a more empirically motivated environment (e.g. Lakoff 1970; Karttunen & Peters 1975; Brée 1982; McCawley 1996; Cole & Morgan 1975 and the references in Schiffrin 1994). Recent studies of the semantics of mood have originated from a different direction, drawing together ideas from a number of broader trends in the syntax and semantics literatures. First among these may well be the change in perspective brought about by Pollock's (1989) reevaluation of clausal syntax, giving rise as it did to the proposal of projections corresponding to various inflectional categories, including mood (first in Rivero 1994). A second motivating factor has been that mood can be seen as parallel to a number of other cases of non-local semantic dependency, such as negative polarity and the selection of tense in subordinate clauses; theoretical connections between mood and each of these phenomena have been proposed, by Giannakidou (1997) and von Stechow (1995) respectively. Thirdly, since the work of Roberts (1987, 1989) on modal subordination, a notion of mood has been seen as central to an understanding of discourse semantics. And a last factor has been an increasing attempt to gain a better understanding of various subtypes of intensional contexts, an interest at the confluence of work on event semantics (nominalization, perception reports, and infinitivals, e.g. Barwise 1981, Stowell 1982, Zucchi 1993, Asher 1993) and the detailed lexical semantics of sentence-embedding verbs (Farkas 1992, Heim 1992, Portner 1992, Villalta 2000, 2001).

In this overview article I would like to accomplish three things. First, it is important to distinguish mood from various related concepts like illocutionary force and modalization. Second, after discussing such matters briefly, I will outline the basic types of data which theorists of mood have taken as their core paradigms; in this connection, we look at traditional moods (indicative, subjunctive, imperative) as well as other temporal and modal forms (e.g. dependent modals, infinitives, modal particles, and selected complementizers). Third, I will present a number of contemporary analyses of mood, attempting to categorize them in terms of their leading ideas. The various approaches analogize mood to a number of other phenomena, such as definiteness, polarity, or agreement, and differ as well in terms of which data they take to be central. Finally, I will wrap up by discussing the prospect for future developments in the field.

2. Mood and related categories
The core phenomenon of mood as it is seen within traditional grammar is the selection of finite verbal forms in complement clauses due to semantic characteristics of the embedding predicate. Typically one available form is that used in root clauses to make assertions of fact, the indicative, and if there is just one other form it is likely referred to as the subjunctive. For example:
(1) Dico che è felice. (Italian)
    say.1sg that is.indic happy
    'I say that she is happy.'
(2) Credo che sia felice.
    believe.1sg that is.subj happy
    'I believe that she is happy.'
The mood alternation might also be present in non-selected contexts, such as relative clauses, where it can be associated with a semantic distinction of some sort. On top of this prototypical indicative/subjunctive contrast, a language may have other finite forms, such as an optative or conditional, and there may be non-finite forms whose distribution seems roughly to depend on factors similar to the indicative/subjunctive contrast, for example the English for-infinitive. This conception of mood thus requires a distinction between "morphological mood" and "notional mood", where the latter can be defined as that class of phenomena which can be explained in terms of the same theory which explains the central morphological moods of indicative and subjunctive. In other words, notional mood is the semantic category which includes morphological mood. Many scholars use the term "mood" in senses other than this one, referring to phenomena which bear some intuitive semantic similarity to morphological mood. For example, Roberts (1989) utilizes a distinction between "realis" and "irrealis" mood in her analysis of modal subordination, the phenomenon seen in (3):
(3) a. John might have left a note at home.
    b. It might explain where he is.
    c. ? It explains where he is.
Sentence (3b) can naturally be taken as a continuation of the previous sentence, since its irrealis mood, signaled by might, indicates that it is semantically within the scope of ("modally subordinated to") the condition expressed in the first sentence; thus it means 'If John left a note at home, it might explain where he is.' In contrast, the realis (3c) cannot have this function. This sense of mood clearly must be differentiated from the core indicative/subjunctive contrast discussed above, since it at least cuts across that contrast; while the indicative clause in (3c) is realis, that in (4b) is irrealis (Farkas 1992 notes similar data):
(4) a. Mary dreamed that she saw a dog.
    b. It was a Keeshond. (i.e. Mary dreamed that she saw a Keeshond.)
Whether in general the realis/irrealis pattern seen in modal subordination should be treated as notional mood in the sense defined above depends on a number of issues, in particular the proper treatment of modal auxiliaries and our broader theory of discourse semantics.

Another concept which is often connected to the category of mood is that of illocutionary force. For example, Rivero & Terzi (1995) propose that an imperative operator, representing the clause's illocutionary force, is characteristic of imperative mood. Likewise, some scholars write of interrogative mood or declarative mood. Such an inventory shows the close connection between this sense of mood and the notion of "clause type". Sadock & Zwicky (1985) define a clause type as a formally (morphologically or syntactically) distinct category which is both conventionally associated with a certain function, like asking a question or making a statement, and which is unique to each clause, so that every clause is a member of exactly one type. According to this characterization, clause type must be a different system from mood, since, for example, both declaratives and questions are normally in the indicative. Nevertheless, a better understanding of mood's connections with clause type and illocutionary force will be a necessary part of our analysis of all three concepts.

The literature which I wish to review here focuses on the indicative/subjunctive contrast and the associated concept of notional mood. I will therefore leave aside for the most part the other senses of mood mentioned above. It is also to be noted that there are still other linguistic categories which likely have a close association with the semantics of mood, such as evidentiality, certain adverbials with evidential force (e.g. allegedly), and even perhaps aspect (Izvorsky 1997). The intuition that all of these are related to mood clearly has something significant behind it, and hopefully in the near future some formal work will begin to explore the connections.
3. Outline of some central data

I will next outline some of the data which has been of central concern to scholars studying the indicative/subjunctive contrast and notional mood more generally. In many cases, the behavior of particular constructions is subject to a great deal of cross-linguistic variation; thus, for example, the verb for believe selects the subjunctive in Italian but the indicative in Greek. There is also a certain amount of variation or optionality within a given language, with the choice of mood frequently correlating with a semantic or pragmatic contrast. The details are too subtle and language-specific to describe thoroughly here, so my goal will be to summarize the main classes of environments which are relevant to mood selection across languages. Exactly what sort of theoretical framework should be used to analyze all of the variation is something we will come back to in the next section. I would also note that the data presented in this section will come exclusively from western branches of Indo-European. This reflects the bias in the formal literature (Baker & Travis 1997 being a notable exception) and highlights one direction in which much future work is clearly needed.

3.1. Root clauses
First let us look at the distribution of mood in root clauses. The indicative is the prototypical mood of matrix assertions and questions. Imperative clauses may take the form of an indicative, a subjunctive, or a morphologically distinct imperative mood (examples from Zanuttini 1997):
(5) Telefonate! (Italian)
    call.indic.2pl
    'Call!'
(6) Lo dica pure!
    it say.subj.3sg indeed
    'Go ahead and say it!'
(7) Telefona!
    call.imper.2sg
    'Call!'
English brings up a difficult issue of categorization. Imperatives are very similar to an embedded clause type that is sometimes referred to as a subjunctive:
(8) Bring the wine!
(9) I demand that you bring the wine!
It is quite tempting to identify these forms and associate them with a meaning of "ordering". One problem with this is semantic; in embedded contexts, this subjunctive may be used for a more general sense of obligation than mere ordering:
(10) It is necessary that it rain soon.
(11) Rain soon!
While (11) must be taken as an order to a personified sky, (10) simply expresses a human need. One might attribute this difference to the fact that the understood subject in root imperatives is always you, giving rise to a sense that the addressee is being commanded, while the embedded subjunctives may have any subject. Thus, (11) can only be understood as addressing a personified sky, while the complement in (10) has the expletive it subject. Another issue is whether the English imperative should be seen as a mood proper, or rather as a type of notional mood-indicating infinitive. The embedded forms show infinitive word order, with negation preceding all verbal elements and modals being disallowed, while root imperatives also disallow modals but clearly show verb movement to the left of the subject when negated:
(12) I demand that you not do that.
(13) Don't (you) do that!
The special status of the root forms may arise from the fact that they, but not embedded clauses, are bearers of illocutionary force, a category which, as we noted, is related to but distinct from mood (cf. Lewis 1979 for what may be taken as an analysis of the illocutionary force of imperatives; also Huntley 1984). It is thus likely that English has a distinct notional mood for ordering and some subtypes of human necessity, but whether it is a variety of subjunctive or a type of infinitive is an open question. Another issue that arises is the status of permissives.
(14) Have an apple!
(14) may be taken either as an order or as a granting of permission. This distinction may be given a pragmatic analysis, as in Brown & Levinson (1987): they propose that the permission reading of (14) is semantically an order which is weakened by virtue of certain principles of Politeness Theory. The idea is that in certain discourse environments, a speaker may issue an order so that the hearer will undertake an action which he or she is only avoiding so as not to offend the speaker. For instance, if Mary is a visitor to John's house, she might be reluctant to eat one of his apples for fear of being a bad, selfish guest. In such a context, John might "order" her to have an apple, in order to allow her to achieve her desire to eat one, since in that case her obligation to do as John (pretends to) require would outweigh her reluctance to take his fruit. Hence, the command has the pragmatic force of permitting, since it is only issued to override the hearer's reluctance to act in her own interest. Main clause subjunctives may surface with readings other than the imperative, for example expressing supposition or astonishment. (Italian examples from Moretti & Orvieto 1981, cited in Portner 1997; English example suggested by the editors.)
(15) L'avesse anche detto lui. (Italian, de Lampedusa, Il gattopardo)
     it-have.subj also said he
     'Suppose he had said it too.'
(16) a. Che sia nel bagno? (Cassola, Una relazione)
        that be.subj in-the bath
        'She's in the bath?!'
     b. God bless you.
A variety of other moods have received far less attention than the core cases of indicative, subjunctive, and imperative. These include optatives and main clause infinitives. For the most part, the theoretical literature has ignored main clause forms other than indicatives, though they have been mentioned by Giorgi & Pianesi (1998) and Quer (1998), and analyzed to some extent by Portner (1997).
3.2. Embedded subjunctives
Next let us turn to the distribution of subjunctives in embedded clauses. Perhaps the prototypical case is the subjunctive or "counterfactual" conditional:
(17) If Mary were coming to the party, I would come too.
The English third person were form in (17) reflects the fact that the proposition that Mary is coming to the party has a particular semantic status; at first glance, examples like (17) indicate that the if clause proposition is presupposed to be false, but cases like (18) suggest that a weaker requirement is at work (from Anderson 1951):
(18) If Jones had taken arsenic, he would have shown just exactly those symptoms which he does in fact show.
The precise interpretation of subjunctive conditionals is a matter of much debate; see von Fintel (1998) for a recent discussion. The issue that arises in the present context is what they show about the semantics of mood more generally. We need to distinguish the contribution of the subjunctive from that of other elements, like the modal or if. In this connection, there are a couple of things to note. In the first place, the English subjunctive form illustrated here has a very narrow distribution, essentially limited to this case, as though/as if clauses, and following the verbs wish and suppose. Thus we would be likely to gain insights by crosslinguistic comparison, to see whether the details of the subjunctive conditional's meaning vary from language to language. Another relevant comparison would be between subjunctive conditionals and other varieties of subjunctive adjunct clauses. Quer (1998) analyzes subjunctives in concessive clauses of various types, as in the following (data from Quer 1998):
(19) Encara que no sigui major d'edat, el deixaran entrar. (Catalan)
     although that not be.subj major of-age him let.fut.3pl enter.inf
     'Even if he is not an adult, they will let him in.'
(20) Truqui qui truqui, no diguis el teu nom.
     calls.subj who calls.subj not tell.imper the your name
     'Whoever calls, don't tell your name.'
The data illustrated from Catalan above also lets us introduce a minor sub-theme which will persist throughout this section. Note that one way to translate (20) in English is to use the modal verb may:
(21) Whoever may call, don't tell your name.
Here, may is not an independent modal operator, but rather plays the same role as the subjunctive mood in Catalan. As noted by Palmer (1990) and Portner (1992, 1997) English often makes use of modal verbs as indicators of notional mood. We will see even clearer cases of this below. Next we turn to subjunctives which are selected by a higher predicate. Crosslinguistically, the predicates which have the strongest tendency to take the subjunctive are desideratives, directives, and those expressing some varieties of modality (e.g. those glossed as 'possible' and 'necessary'; certain others are less consistent):
(22) Spero che sia felice. (Italian)
     hope.1sg that be.subj happy
     'I hope that you are happy.'
(23) Il a ordonné que je parte. (French, Farkas 1992)
     he has ordered that I leave.subj
     'He ordered me to leave.'
(24) E posibil sa fi venit Ana. (Romanian, Farkas 1992)
     is possible SUBJ PAST come Ana
     'It is possible that Ana came.'
In English all of these predicates may govern infinitives with the complementizer for, that is non-control, non-ECM infinitives:
(25) I hope for you to be happy.
(26) He ordered for me to leave.
(27) It is possible for Ana to come.
The for infinitive differs from the subjunctives illustrated above, however, in that it is only possible when the complement clause is futurate with respect to the main tense (Bresnan 1972; Stowell 1982; Portner 1992, 1997). This form is clearly a notional mood related to the subjunctive, but it has a temporal semantics that the subjunctive mood itself lacks. English also allows the presence of the mood-indicating may with desire and possibility predicates:
(28) I hope that you may be happy.
(29) It is possible that Ana may come.
In neither of these cases is may a true modal; for instance, (28) does not mean 'I hope that it's possible you are happy'. It means essentially 'I hope that you will be happy', with some added sense of adding a blessing or "official good wishes", as one might with the root clause May you be happy! With somewhat more cross-linguistic variation, causatives, emotive factives, and non-factive verbs of mental judgment may select the subjunctive:
(30) Fas que marxi abans d'hora. (Catalan, Quer 1998)
     make.2sg that leave.subj before of-time
     'You make her/him leave earlier.'
(31) Marie regrette que Paul soit parti. (French, Farkas 1992)
     Mary regrets that Paul be.subj left
     'Marie regrets that Paul left.'
(32) Gianni crede che Maria sia partita. (Italian)
     Gianni believes that Maria be.subj left
     'Gianni believes that Maria left.'
The English for infinitive may not occur with any of these verbs. Subjunctives are also triggered by elements other than embedding predicates. The subjunctive often appears when the embedding clause is negated, questioned, or has an impersonal subject:
(33) Gianni non sapeva che Maria fosse incinta. (Italian)
     Gianni neg knew that Maria was.subj pregnant
     'Gianni didn't know that Maria was pregnant.'
(34) Recordes que en Miquel treballi? (Catalan, Quer 1998)
     remember that the Miquel worked.subj
     'Do you remember if Miquel worked?'
(35) Si dice che Maria fosse incinta. (Italian)
     One says that Maria be.subj pregnant
     'It is said that Maria is pregnant.'
These non-selected complement clause subjunctives have a number of properties which differentiate them from the examples in (22)-(32), summarized nicely by Quer (1998): they show greater freedom of tense choice, they more freely alternate with the indicative, they may be licensed from more than one clause away, and they allow their subject to be coreferential with the embedding clause's subject. As far as I know, however, there has been no formal work relating these differences to the details of the semantics of mood. The final subjunctive environment which has received attention is that of relative clauses. Quine (1975) noted that quantifiers containing subjunctive relatives are often semantically different from those containing indicatives:
(36) Gianni voleva un dottore che fosse comprensivo. (Italian, Beghelli 1997)
     Gianni wanted a doctor that was.subj understanding
     'Gianni wanted a doctor who would be understanding.'
(37) Gianni voleva un dottore che era comprensivo.
     Gianni wanted a doctor that was.indic understanding
     'Gianni wanted a doctor who was understanding.'
The contrast between these cases, (36) containing a subjunctive relative and (37) an indicative, has been described in a variety of ways, e.g. as narrow/wide scope, nonspecific/specific, de dicto/de re, and attributive/referential. Thus we may at least approximate the difference between (36) and (37) by saying that the latter entails the existence of an understanding doctor, while the former does not; however, the precise nature of the contrast is controversial (some recent references are Rivero 1975; Farkas 1985; Giannakidou 1997; Beghelli 1997; and Quer 1998). The essential idea of all of these is that the subjunctive in (36) is triggered by voleva, which takes a subjunctive in its complement clause when it has one. A matrix verb which does not take a subjunctive complement, like met or knew, would not license the subjunctive relative. The triggering of the subjunctive in the relative clause is only optional, however, because the NP can be interpreted as dependent on, or independent of, the main verb; it only takes subjunctive form when it is "dependent". Notions like scope or specificity are different ways of working out the precise nature of this potential dependency.
3.3. Embedded indicatives
The central cases of predicates which select the indicative can be divided into three groups: those expressing (factive) mental judgment, assertion, and mental creation:
(38) Gianni sa che Maria è partita. (Italian)
     Gianni knows that Maria is.indic left
     'Gianni knows that Maria left.'
(39) Diu que t'enyora. (Catalan, Quer 1998)
     says that you-misses.indic
     'S/he says that s/he misses you.'
(40) Ion a visat că Petru a primit premiul Nobel. (Romanian, Farkas 1992)
     Ion has dreamed INDIC Petru has received prize Nobel
     'Ion has dreamed that Petru received the Nobel Prize.'
This last class, which also includes verbs like imagine and lie, tends to be a problem for semantic analyses of mood. The traditional intuition that the indicative is the "realis" mood, indicating asserted or claimed truth of the clause, falls on difficult ground here; we'll see below how various theorists try to account for this class. In many languages nonfactive verbs of mental judgment, like think and believe, take the indicative, though as we saw in the last section this is not consistently so.
(41) Je crois qu'il est parti. (French, Giorgi & Pianesi 1998)
     I believe that-he is.indic left
     'I think he has left.'
These cases are also difficult for any simple application of the idea that the indicative is realis.
4. Theoretical analyses of the semantics of mood
Contemporary analyses of the semantics of mood come in three groups:
1. Mood is part of tense semantics, either positively introducing some temporal meaning or reflecting some other temporal property of the context.
2. Mood should be analogized to various features of NP semantics, in particular definiteness, specificity, and negative polarity.
3. Mood is a dependent modal element, reflecting the properties of some intensional operator in the context.
4.1. Mood in temporal semantics
A number of scholars have proposed that mood is present to mark something about the semantic status of tense in the clause. For instance, Picallo (1984, 1985) claims that subjunctive marks the "anaphoric" nature of tense, while Progovac (1993) and von Stechow (1995) suggest that the subjunctive is associated with a semantically null tense morpheme. An anaphoric theory like Picallo's is designed to explain two types of facts: first, that subjunctive clauses display more rigid sequence of tense restrictions than indicatives, and second, that they create larger domains for binding and movement (data from Quer 1998):
(42) Desitja que porti/*portés un llibre. (Catalan)
     desires that brings.subj.pres/brings.subj.pst a book
     'S/he hopes that s/he brings a book.'
(43) *Vull que la convidi.
     want.1sg that her invite.subj.1sg
     'I want to invite her.'
These facts follow because the anaphoric tense must "agree" with its antecedent (example [42]), and because the relationship between them extends the binding domain for Conditions A and B ([43]). This approach suffers from two main problems, however. First, it does not explain why anaphoric tense occurs in precisely the environments it would have to. That is, why does want take anaphoric tense but know not? And second, the properties seen in (42)-(43) are only associated with selected subjunctives, and not with those triggered by negation, questions, etc., a point made by Raposo (1986), Suñer & Padilla-Rivera (1987), and Suñer (1986). Progovac's account is designed to explain similar data to Picallo's, but accomplishes it by deleting the embedded functional projections. It suffers from the same difficulties. Von Stechow's analysis is designed to explain English facts somewhat similar to that in (42), but his point is much more subtle. He is concerned with the availability of a "simultaneous" vs. a "shifted" reading of embedded tense.
(44) Mary thought that Bill liked beets.
(44) has two interpretations, a point noted by Abusch (1988) and Ogihara (1989). The time of Bill's allegedly liking beets may be simultaneous with the time of Mary's thinking, or prior to that. On the simultaneous reading, both Abusch and Ogihara treat the embedded tense as semantically vacuous (either by being literally meaningless (Abusch) or by being deleted (Ogihara)). Von Stechow's proposal is that the tense deletion operation occurs when the embedded clause is subjunctive. This proposal requires that he treat the English example (44) as involving an optional covert subjunctive, but it receives better support in Romance:
(45) Gianni sapeva che Maria era incinta. (Italian)
     Gianni knew that Maria was.indic pregnant
     'Gianni knew that Maria was pregnant.'
(46) Gianni pensava che Maria fosse incinta.
     Gianni thought that Maria was.subj pregnant
     'Gianni thought that Maria was pregnant.'
The subjunctive complement in (46) can only receive a simultaneous reading, as predicted by von Stechow. The indicative complement (45) can have either reading, however, suggesting that he ought to propose that indicatives optionally trigger tense deletion, while subjunctives require it. Von Stechow's proposal suffers from the same difficulties as the other tense-based accounts. The effects seen in (45)-(46) are limited to selected subjunctives, and do not even occur with all such:
(47) Gianni era contento che il tempo fosse bello. (Italian)
     Gianni was content that the weather was.subj pretty
     'Gianni was pleased that the weather was pretty.'
(48) Gianni non sapeva che Maria fosse incinta.
     Gianni neg knew that Maria was.subj pregnant
     'Gianni didn't know that Maria was pregnant.'
In (47) we have a selected subjunctive, while in (48) we have one triggered by negation; both allow shifted readings. In summary, these temporally-based analyses of mood suffer from a variety of serious difficulties. Nevertheless, they are the only accounts which attempt a substantial explanation of the facts seen in (42)-(46) (but cf. Giannakidou 1997: 196-206 on the locality of licensing for negative indefinites in Greek), and as such deserve continued research. There is one other type of account which attempts to link mood to temporal semantics. Portner (1992, 1997) suggests that the English for infinitive indicates a kind of futurity that explains its distribution, while Beghelli (1997) makes a similar proposal for Greek na clauses. It does seem true that English for infinitives are always future oriented, as was noted in section 3; this does not by itself account for their distribution, though, since verbs which do not take the for infinitive, like believe or claim, can certainly represent an attitude towards some future time. However, these scholars' general views on mood fall within the third group we will discuss ("mood as a dependent modal element"), and they give their explanations for the distribution of these forms within that paradigm. We will thus come back to these issues later.
4.2. Analogies to notions from NP semantics
A number of analyses treat the semantics of mood in terms of concepts taken from NP semantics, in particular definiteness or polarity. Only one of these is solely developed from this point of view, that of Baker & Travis (1997), with the others incorporating some ideas from the third family of theories to be discussed below. For this reason, let us begin with Baker & Travis. Working with data from Mohawk, they propose that the mood morphemes wa'- 'factual', v- 'future', and a- 'optative' are respectively a marker of definiteness, indefiniteness, and negative polarity. We should focus on the contrast between the first and the second two, since Baker & Travis think of a- more or less as a version of v- for negative contexts (though it appears in certain non-negative contexts as well, but in those cases the choice between the two is not explained). Between them, v- and a- show properties of traditional subjunctives: they are required by predicates like promise and want and are associated with a non-specific interpretation of relative clauses; a- occurs under the scope of negation. Interestingly, v- also has a set of uses which are not seen with the Indo-European subjunctive: it can indicate future time in matrix clauses and may indicate a past habitual or generic interpretation. I present some of this data below (morphological details suppressed):
(49) Toka v-kenvsko' akaret, v-yukhrewahte' ake-nistvha. (Mohawk, Baker & Travis)
     if v-steal cookie v-punish my-mother
     'If I steal/stole a cookie, my mother will punish/punishes/would punish me.'
(50) Tehoatvhutsoni t-a-hanunyahkwe'.
     wants pre-a-dance
     'He wants to dance.'
Baker & Travis take (49) to represent the core use of v-, and conclude from its generic reading that it is indefinite-like. The analogy can be seen clearly by considering a donkey sentence like the following:
(51) If I steal a cookie, I always eat it.
Within the classic Lewis/Kamp/Heim analysis of indefinites, the indefinite a cookie and the pronoun it are both treated as introducing a free variable. The operator always binds these variables, giving rise to the generic reading. Baker & Travis analyze (49) similarly, saying that the verb marked by v- also has a free variable, representing an event of stealing, which may be bound by a generic operator. The factual prefix wa'-, in contrast, would block binding by the generic operator, parallel to the effect of the in English NPs. Having accounted nicely for the habitual/generic uses of mood, the challenge for Baker & Travis is to extend their account to other contexts. Negation works out the best. They note that in Heim's theory negation introduces an existential quantifier, "existential closure", binding free variables in its scope. An indefinite mood marker will thus allow the event variable to be bound. More difficult is the selection of mood by embedding predicates, as in (50). The basic idea of their account is that sentence-embedding verbs also introduce existential closure, so that (50) means something like 'He wants that there is an event of him dancing', or more precisely: 'in every possible future compatible with what he wants, there is an event of him dancing.' Then they go on to explain the impossibility of factual mood here by suggesting that, since wa'- would prevent the variable representing the dancing event from being bound, this variable would end up representing a past event in the real world. This is incompatible with the fact that want has to do with future possibilities. In contrast to the case with want, the verbs think and know allow factual mood because they may relate to past real-world events. Thus, (52) should mean something like 'The actual past event is such that he thinks it may be one of her dancing', though Baker & Travis do not confirm the details. The lack of factual mood with (53) also supports the theory, since wish represents a counterfactual desire (see also Portner 1992).
(52) Ihrehre' tóka wa'-tyenunyahkwe'. (Mohawk, Baker & Travis)
     thinks maybe wa'-dance
     'He thinks that maybe she danced.'
(53) Taaskaneks a-hoatoratu.
     wish a-hunted
     'He wishes he had hunted.'
Baker & Travis extend their account of the incompatibility of wa'- with want to explain why in main clauses it virtually always gets a past interpretation, while v- has a future meaning. The idea is based on Dowty's (1979) notion of "branching time", where we think of a single past extending into alternative possible futures. In light of this, the future tense (covert in Mohawk) quantifies over possible futures, parallel to want, while the past focuses on the past, of which there's only one. The indefinite mood is then required to describe a future event, while the factual mood is applicable to the past. One problem with Baker & Travis' account is that it is not clear how the binding pattern they propose would give rise to the difference between real-world and other-world events being described. According to them, a bad example like He wishes Mary had danced, with factual mood, should have a logical form like the following (or with e left unbound):
(54) ∃e [∀w [w is a wished-for world] [dance(Mary, e) and e is in w]]
They assume that this logical form entails that e is a real-world event. However, a garden variety semantics would imply that because dance is interpreted within the scope of wish, it describes events in the wished-for worlds. Giving the event's quantifier wide scope wouldn't change this; it would merely entail that a single event is being talked about across those worlds. (One might suggest that this latter consequence is incoherent, and that events only exist in a single world. But that would predict that (52) is bad as well.) We can of course assume a framework where (54) does entail that a real-world event is being described. However, it is not clear that we should wish to do so. Mohawk wa'- clauses are meant to be analogous to definites, but definite NPs under the scope of sentence embedding verbs do not entail real-world existence:
(55) a. Mary believes that a blue unicorn is in the other room.
     b. She wishes that the unicorn were white.
One way to solve this problem might be to propose that the event arguments of wa'- marked clauses are interpreted de re. This would entail real-world existence of a relevant event, with (52) meaning something like 'He thinks of the event in question that it might be an event of her dancing.' As noted above, however, we are not given the details of interpretation which would allow us to evaluate this possibility. There are also some difficulties with the link established between mood and tense. It is not clear what rules out the use of factual mood to describe the future. The representation of Mary will dance would be as follows:
(56) ∃e [∀w [w is a future possibility from now] [dance(Mary, e) and e is in w]]
It cannot simply be that the event quantification is not allowed to have scope outside of the universal ∀w, as they seem to suggest (p. 262), since (52) would present a parallel structure. And one cannot appeal to the fact that wa'- clauses must refer to past events, as in the discussion of (50), since this requirement is precisely what we're trying to explain. Conversely, it's not immediate why the indefinite mood v- couldn't describe past real-world events; the fact that there is no quantification over worlds would not be incompatible with quantifying over events, getting meanings like 'in the real world, there was a past event of Mary dancing.'
Overall, Baker & Travis' theory does a good job in accounting for the habitual and generic interpretations linked to mood in Mohawk. They are less successful in dealing with verbal selection, though they seem at least close to the mark on the think/wish contrast. In trying to explain the temporal effects of mood choice, both in root clauses and in interaction with want, it seems to me there are problems. Next we turn to a family of theories, those of Beghelli (1997), Giannakidou (1994, 1995, 1997, 1999), and Quer (1998), which also suggest that mood may be similar to indefiniteness, negative polarity, and related concepts. Both combine this proposal with a strong dose of ideas from modal accounts of mood, as we'll discuss below, but at this point it's worth examining what each has to say about the link to polarity. Beghelli's account is more straightforward in this regard; he treats the subjunctive as a verbal equivalent of any, but the indicative in modal terms. His evidence for the former point is twofold. On the one hand, the set of licensing contexts for the subjunctive is somewhat similar to that for any: under negation and adversative predicates we get polarity any and subjunctive; additionally, he suggests that the set of contexts which obligatorily select the subjunctive overlaps with the set of contexts where free choice any is licensed. I'm not sure what he has in mind here, since core subjunctive-selecting predicates like want and indicative-selecting ones like know seem equivalent in terms of allowing any in their scope. Stronger evidence comes from the similarity of readings between indefinites containing a subjunctive relative and free choice indefinites:
(57) Ho bisogno di un libro che tratti di linguistica. (Italian, Beghelli)
     Have.1sg need of a book that deals.subj of linguistics
     'I need [any] book that deals with linguistics.'
(58) Ho bisogno di un qualsiasi libro di linguistica.
     Have.1sg need of a any book of linguistics
     'I need any book on linguistics.'
Overall, though, Beghelli's idea needs to be extended to deal with more subjunctive-licensing contexts before it is to be considered further. Giannakidou discusses the subjunctive extensively in the context of an analysis of polarity items in Greek, but it is not clear whether she intends to give an account of the subjunctive itself. She notes that one class of polarity items is licensed in most subjunctive clauses, in particular the core contexts of desire, directive, and modal predicates, though not in all, failing to be licensed by emotive factives and causatives. She then works with Farkas' (1992) modal theory of mood selection, turning it into an account of NPI licensing. However, at this point it is difficult to tell whether it is supposed to explain mood anymore, since she must distinguish the subjunctive contexts which license NPIs from those which do not. The most that can be said with certainty is that Giannakidou correctly emphasizes the similarity between mood and polarity (on this point, see also Nathan and Epro 1984). We'll discuss her ideas in more detail in the next section, treating them as refinements of Farkas' approach.
4.3. Modal accounts
The most successful recent accounts of the semantics of mood treat it as a marker of some contrast within the domain of modal semantics. In particular, they note that sentence-embedding predicates can be analyzed within possible worlds semantics as propositional operators which quantify over possible worlds. In this regard they are like modals. For example, (59) can be paraphrased as 'In every world compatible with the information in the conversation, it is raining' (i.e. the information we have implies it is raining), and similarly (60) can be understood as 'In every world compatible with John's beliefs, it is raining':
(59) It must be raining.
(60) John believes it is raining.
The set of worlds that these operators quantify over is determined by an accessibility relation. The epistemic accessibility relation used by (59) picks out the set of worlds compatible with conversational information, while that introduced by (60) picks out those compatible with John's beliefs. In general, modal theories of mood claim that mood marks the properties of the accessibility relation associated with the governing operator. In these terms, Farkas' theory attempts to develop the traditional idea that the indicative is the realis mood, while the subjunctive is irrealis. As mentioned above, there are many problems for a simple view along these lines, for example indicatives selected by dream, say, and believe and subjunctives selected by enjoy and force. Farkas attempts to resolve these difficulties by proposing that what governs the indicative is not truth in the real world, as suggested by the term "realis", but rather truth in a particular world; thus a main clause assertion purports truth in the real world, while a clause embedded under dream is seen as purporting truth in "the world of the dream". In contrast, subjunctive mood is governed by truth in a set of worlds; thus, Farkas suggests that a clause under want is taken as true in the set of worlds representing desired futures.
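The quantificational paraphrases of (59) and (60) can be made concrete with a small sketch. It is illustrative only: the four toy worlds, the names `raining` and `true_throughout`, and the two accessibility sets are assumptions introduced here, not part of any of the analyses under review.

```python
from typing import Callable, FrozenSet

World = str
Proposition = Callable[[World], bool]

# Hypothetical toy model: it rains in w1 and w2, but not in w3 or w4.
RAINY_WORLDS: FrozenSet[World] = frozenset({"w1", "w2"})


def raining(world: World) -> bool:
    """The proposition 'it is raining' as a function from worlds to truth values."""
    return world in RAINY_WORLDS


def true_throughout(accessible: FrozenSet[World], prop: Proposition) -> bool:
    """Universal quantification: prop holds in every accessible world."""
    return all(prop(world) for world in accessible)


# Assumed accessibility sets (illustrative values only).
CONVERSATIONAL_INFO: FrozenSet[World] = frozenset({"w1", "w2", "w3"})
JOHN_BELIEFS: FrozenSet[World] = frozenset({"w1", "w2"})

# (59) 'It must be raining': false in this model, since w3 is accessible and dry.
must_be_raining = true_throughout(CONVERSATIONAL_INFO, raining)

# (60) 'John believes it is raining': true in this model, since every belief world is rainy.
john_believes_raining = true_throughout(JOHN_BELIEFS, raining)
```

The modal and the attitude verb share a single quantificational core in this sketch and differ only in the set of worlds supplied, which is the property that modal theories take mood to be sensitive to.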
We can rephrase Farkas' ideas by saying that an indicative clause utilizes an accessibility relation which picks out just one world, while a subjunctive clause uses one which picks out a non-singleton set. The problem with this approach is that it is incorrect to say that any verb picks out just a single world; for example, (60) cannot utilize a single belief world. John's beliefs could never be specific enough to pick out just one world—he will have no opinion about many issues, and this means that multiple worlds will be compatible with his beliefs. Thus, Farkas' analysis calls at least for some formal revision. Giannakidou (1994, 1995, 1997), and following her Quer (1998), give such a revision. She allows that all sentence-embedding predicates have accessibility relations picking out (non-singleton) sets of worlds, and proposes that indicative is governed by verbs which require the complement clause proposition to be true throughout that set, while the subjunctive is governed by those which do not. For example: (61)
a. Mary believes that it's raining. (indicative)
b. It is raining is true in every belief world for Mary.
(62)
a. Mary hopes for it to rain. (subjunctive)
b. not: It is raining is true in every future possibility compatible with Mary's beliefs.
There are some problems in her formulation of these conditions. For example, she defines the sets of worlds under consideration as subsets of Stalnaker's (1978) conversational context set, but this would imply that one cannot believe (say, dream) anything which is not taken to be true in the conversation. However, I think all such difficulties could be fixed. A more serious problem arises from the lack of parallelism in the contrast (61)/(62). What's going on with (61) is clear enough; (61b) can be seen as giving the truth conditions for the sentence. But the condition given in (62b) does not have this status; it just states a negative fact. (62)'s truth conditions would be as follows: (63)
It is raining is true in every future desired by Mary.
As seen in (63), the accessibility relation required for hope would give the set of Mary's desired futures. Parallel to (61), one would think that we should be concerned with this set in considering mood choice. However, Giannakidou appeals to a different set in (62b), the future possibilities compatible with her beliefs. It is unclear why this set is relevant to the semantics of the sentence at all; focusing on it appears to be an arbitrary choice designed to let hope meet the conditions for selecting the subjunctive.
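The conditions in (61b), (62b) and (63) can be illustrated with a small sketch. It is only a toy model under stated assumptions: the three worlds, the sets standing in for Mary's accessibility relations, and the helper names are all invented for illustration and come from none of the works discussed.

```python
# Toy model: each world records whether it rains there.
# WORLDS, the world labels, and the three sets below are illustrative assumptions.
WORLDS = {"w1": True, "w2": True, "w3": False}

def it_is_raining(world):
    """The proposition 'it is raining', as a function from worlds to truth values."""
    return WORLDS[world]

def true_throughout(proposition, worlds):
    """Universal quantification over the worlds an accessibility relation picks out."""
    return all(proposition(w) for w in worlds)

# (61): worlds compatible with Mary's beliefs, in a scenario where she believes it rains.
belief_worlds_61 = {"w1", "w2"}

# (62): a different scenario, where Mary hopes for rain without being sure of it.
doxastic_futures_62 = {"w1", "w2", "w3"}  # futures compatible with her beliefs, as in (62b)
desired_futures_62 = {"w1", "w2"}         # futures she desires, as in (63)

cond_61b = true_throughout(it_is_raining, belief_worlds_61)
cond_62b = not true_throughout(it_is_raining, doxastic_futures_62)
cond_63 = true_throughout(it_is_raining, desired_futures_62)
print(cond_61b, cond_62b, cond_63)
```

In this sketch (61b) and (63) have the same form, universal quantification over the set supplied by the verb, whereas (62b) is a negative condition on a different set. That is the lack of parallelism noted in the text.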
One comment in Giannakidou (1997: 112) suggests a solution to this problem. She seems to hint that we should view the semantics of desire verbs as inherently comparative, as in Stalnaker (1987), Heim (1992), and Pesetsky (1992), so that (62) means something like 'If you divide the worlds compatible with Mary's beliefs into those where it rains and those where it doesn't, she prefers the former.' In this regard, the worlds relevant for hope would be, as in (62b), a set compatible with her beliefs. The meaning of hope would then compare two subsets of this set. While Giannakidou's discussion is too brief to allow us to go further in this direction, this intuition about the subjunctive is very close to Giorgi & Pianesi's, which we will return to below, and the analysis outlined in Villalta (2000, 2001).
A final problem for Giannakidou's account is the fact that causatives and emotive factives select the subjunctive. Quer (1998) takes up this issue, proposing that the causative character of all such predicates is crucial. Noting that counterfactuals have subjunctive antecedents, and that causation is given a counterfactual analysis by Lewis (1973), he suggests that the subjunctive is licensed by a lexical causative meaning (see also Pesetsky 1992). We can think of this proposal as analyzing (64a) as follows: (64)
a. Mary is pleased that it is raining.
b. The fact that it is raining causes pleasure in Mary. (or)
c. If it weren't raining, Mary wouldn't have pleasure.
This point may also tie into that of the last paragraph, since Lewis' account of counterfactuals is also "comparative" (see his work for details).
In Portner (1997), I address the issue of crosslinguistic variation in mood systems by providing an analysis of the English and Italian notional mood systems. I make a fairly weak claim about that variation, essentially saying that a notional mood is any form whose meaning interacts with the clause's accessibility relation in one way or another. The analysis of Italian is simpler than that of English, basically following the traditional view of indicative as a realis mood, indicating truth in the actual world, with the subjunctive the default mood used when the indicative is inappropriate. It is immediately apparent what the problems are: indicatives governed by say and dream (believe takes the subjunctive in Italian), and subjunctives which follow emotive factives and causatives. On the latter issue, the possibility was suggested that these verbs are really not propositional operators at all, but rather take events as arguments. Thus, (64a) would mean something like 'the event of it raining caused pleasure in Mary'. In this way, the realis indicative wouldn't be called for, since the complement clause wouldn't denote a set of worlds at all, and so it wouldn't be true in the real world in the right sense.
Turning to the apparently irrealis indicatives, I suggest that say can take the indicative because it has been grammatically categorized as a realis predicate, though it is not strictly speaking one. In many cases, a statement that somebody said something is taken as good evidence that it is so; if this is taken to be the prototypical situation, we can propose that say takes the indicative because it is prototypically realis. The idea here is that the classification of predicates into those that take one mood or another involves a certain amount of arbitrariness, since it can be a graded matter whether a particular predicate does or does not have the relevant property. In the case at hand, Italian considers say "close enough" to being realis to get the indicative. (An analogy can be made to nominal classification (gender) systems, where there will be borderline cases with regard to whether certain things should be in a given class, e.g. whether they are flat enough to count as a flat object or smart enough to count as animate.) In that paper I don't have anything to say about verbs of mental creation like dream, however, at least in Italian, and they do not seem amenable to such an analysis.
I also propose an explanation for the choice of subjunctive under negation rather different from the polarity-based account discussed above. The suggestion here is that the negation combines semantically with the verb, affecting the way in which it quantifies over worlds. Thus, (61a) has the meaning indicated, with universal quantification, but the negative version is not simply the negation of this proposition. Rather it gets the following semantic analysis: (65)
a. Mary doesn't believe that it's raining.
b. It is raining is false in some belief world for Mary.
I propose that mood choice is sensitive to this switch to existential quantification. A variety of pieces of evidence is provided for the semantic combination of negation with the main verb.
The account of English focuses primarily on the contrast between indicatives and for infinitives. The ideas here are quite different. I explain the contrast of believe vs. want as follows: One's beliefs are cumulative, in the sense that ideally they all hang together to form a coherent whole. In addition, since we have beliefs about the past, the future, things far away, and so forth, our beliefs can only be modeled by a set of worlds. In contrast, our desires are not cumulative; we may want things which we consider to be incompatible (see Heim 1992 and Giannakidou 1997 as well): (66)
Mary wants to be a surgeon and she wants to be a lawyer.
Thus, our desires need to be represented one-by-one, and this can be done with sets of situations (event-like entities smaller than worlds, cf. Kratzer
1989). In the above example, Mary stands in the want-relation to a set of situations where she's a surgeon and to a set where she's a lawyer, but not necessarily to any where she's both. The account of the indicative, then, is that it is triggered by an accessibility relation which is cumulative in this sense, and I argue that such verbs as say and dream plausibly have this property as well. Moreover, I argue that the conversational context is cumulative too, and then explain the association of indicative with cumulativity by appealing to the idea that making a simple statement in a conversation is the core function of the indicative. In contrast, the for infinitive is taken to denote "futurate situations", which begin with the time of the main verb and extend into the future from there. (For example, with (66), situations which begin with the present and extend until she's a surgeon or a lawyer.) This represents the future-orientation noted above for these infinitives, and is compatible with non-cumulative accessibility relations like that for want. Since it involves this notion of future-orientation, the account of the infinitives draws on both the temporal and modal accounts of mood.
The final theory which I will discuss is that of Giorgi & Pianesi (1998). They also address the issue of crosslinguistic variation; focusing on Romance and a couple of Germanic languages (German and Icelandic), they propose that there is a scale from "more indicative-like" to "more subjunctive-like" contexts, with individual languages dividing it at different points. (In the end, factors enter into the analysis in addition to those which are part of the scale.) Here we'll only have the opportunity to examine their treatments of French and Romanian on the one hand, and Italian on the other. Giorgi & Pianesi make use of a slightly different theory of modality than the others, that of Kratzer (1991, among others).
(Quer frames his analysis in terms of this theory as well, but does not take advantage of its differences from the simpler one discussed above.) This theory makes use not just of a single accessibility relation, but rather two independent modal parameters of interpretation, the modal base and ordering source. The function of the modal base is to establish a set of relevant worlds; it may be seen as analogous to the accessibility relation of the simpler system. The ordering source determines a ranking of these worlds according to how similar they are to some ideal. For example: (67)
John must register his car (given the laws of DC).
Given the presupposition of his car, the relevant worlds, those determined by the modal base, must all be ones where John has a car. The ordering source, on the other hand, represents the laws of the District of Columbia; it ranks worlds according to how "good" they are from the point of view of
those laws. The sentence is true because in the best relevant worlds, those where the laws are followed, John registers his car. There are some equally good worlds where he doesn't register his car (those where he doesn't have a car), but these don't have to be considered because the modal base renders them irrelevant. The ordering source is simply a set of propositions, and so one realization of it is the empty set; with such a null ordering source, the modal base functions on its own to determine a standard accessibility relation.
Giorgi & Pianesi analyze the French and Romanian subjunctive as marking that the ordering source is playing a role, i.e. that the ordering source for the clause is non-null. Their goal is to capture the fact that these languages use the subjunctive in the core contexts of desideratives and directives, but not with belief verbs. The idea is that example (62) above would get an analysis involving an ordering source representing Mary's desires and a modal base representing her beliefs. The non-null ordering source would trigger the subjunctive. Though a detailed semantics is not provided, the simplest suggestion would be that the sentence is true if in all of the ideal worlds with respect to this ordering source (i.e. where as many of her desires as possible are satisfied), it rains. In contrast, believe in (61) would not use the ordering source at all, and so will take an indicative.
I would note at this point that Villalta (2000, 2001) gives an analysis of mood in Spanish whose central idea is very similar to that of Giorgi & Pianesi. Building on Heim's (1992) semantics for verbs of desire, she proposes that all predicates which select the subjunctive in Spanish have a semantics which is to be expressed in terms of ranked alternative propositions. Thus, (62) would be true if the proposition that it rains is preferred by Mary to all of the relevant alternatives (perhaps in this case, just that it not rain while remaining unbearably hot).
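The interaction of modal base and ordering source described for (67) can be made concrete with a small sketch. This is a simplified rendering under stated assumptions, not Kratzer's or Giorgi & Pianesi's own formulation: the three-world model and all function names are invented for illustration, and the ranking follows the usual idea that one world is at least as good as another if it satisfies every ordering-source proposition the other satisfies.

```python
# Hypothetical three-world model for (67); labels and facts are assumptions.
WORLDS = {
    "a": {"has_car": True, "registers": True},
    "b": {"has_car": True, "registers": False},
    "c": {"has_car": False, "registers": False},
}

def has_car(w):
    return WORLDS[w]["has_car"]

def registers(w):
    return WORLDS[w]["registers"]

def law(w):
    """Stand-in for the DC laws: whoever has a car registers it."""
    return (not has_car(w)) or registers(w)

def at_least_as_good(w1, w2, ordering_source):
    """w1 satisfies every ordering-source proposition that w2 satisfies."""
    return all(p(w1) for p in ordering_source if p(w2))

def best_worlds(modal_base, ordering_source):
    """Worlds in the modal base that no world in the base strictly outranks."""
    def strictly_better(w1, w2):
        return (at_least_as_good(w1, w2, ordering_source)
                and not at_least_as_good(w2, w1, ordering_source))
    return {w for w in modal_base
            if not any(strictly_better(v, w) for v in modal_base)}

def must(proposition, modal_base, ordering_source):
    """Necessity: the proposition holds in all best worlds of the modal base."""
    return all(proposition(w) for w in best_worlds(modal_base, ordering_source))

car_worlds = {w for w in WORLDS if has_car(w)}  # modal base: John has a car

with_laws = must(registers, car_worlds, [law])          # True
null_source = must(registers, car_worlds, [])           # False
no_restriction = must(registers, set(WORLDS), [law])    # False
print(with_laws, null_source, no_restriction)
```

With the modal base restricted to worlds where John has a car, the law-abiding world "a" is the only best world, so (67) comes out true. With an empty ordering source the modal base works as a plain accessibility relation and the sentence fails in "b". Without the restriction, the equally good carless world "c" would also count. On Giorgi & Pianesi's proposal as summarized here, it is the non-empty ordering source in the first call that would correspond to subjunctive marking in French and Romanian.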
It is unclear to me whether this is simply a notational variant of Giorgi & Pianesi's theory. The same ranking of propositions required on Villalta's theory could be formally implemented with the ordering source on Giorgi & Pianesi's. Moreover, given that certain modals in Spanish select the subjunctive, it does seem attractive to account for mood selection in terms of a theory which is applicable to ordinary modality. Thus, before we can fully evaluate these issues, it seems to me that the formal relationship between Heim's comparative semantics for desire predicates and Kratzer's theory of modality ought to be further explored.
Returning to Giorgi & Pianesi's work, there are a number of problems with this account. One is that French and Romanian diverge on emotive factives, with French selecting subjunctive and Romanian indicative. Presumably these have a non-null ordering source, since as mentioned above they involve counterfactuality, which is analyzed with a non-null ordering
source within Kratzer's theory. Thus some other factor must be relevant for Romanian. Another, more central difficulty is that it's not clear why desire predicates would use the ordering source but belief predicates would fail to; when it comes to modals, both desires and beliefs may provide an ordering source (German sollen 'should' and werden 'will', respectively; Kratzer 1991: 650), so we'd need justification for treating the lexical verbs differently.
Turning to Italian, Giorgi & Pianesi need to account for the fact that belief predicates, which are supposed to have a null ordering source, select the subjunctive. The solution is based in the idea that one's beliefs may be false, thus developing the traditional description of the subjunctive as irrealis. However, Italian distinguishes belief predicates from verbs of speaking like say, which take the indicative. Thus Giorgi & Pianesi must find a way to draw a line between these classes. Their proposal is that believe is in a way even less realis than say: believe is non-realistic, in that it's possible that everything one believes is false, while say is only weakly realistic, in that someone who speaks will presuppose at least some things which are true. (The actual formulation is a bit more sophisticated, stated in terms of Stalnaker's Common Ground, but this one will do.) I don't find this distinction plausible. It's awfully difficult to see how one could completely lack true beliefs (at least the belief that one exists!), and to the extent that one can imagine somebody with no true beliefs, one can imagine that person rambling on completely non-realistically.
One other problematic class in Italian is dream and its relatives. Recall that these select indicative complements, despite being about as irrealis as possible. Here Giorgi & Pianesi point to data like (4), which shows that a main clause, e.g. (4b), can be modally subordinated to dream in a prior sentence.
In contrast, believe does not have this property, as noted by Farkas: (68)
Mary believes that she saw a dog. It was a Keeshond.
This example cannot be continued with 'but it was really a Spitz', i.e. It was a Keeshond can't be interpreted under the scope of believe. This persistency of scope which dream has but believe lacks makes the former similar to the conversational context (which persists as long as the conversation does) and allows it to take the indicative. Though the next question of course is why dream behaves this way, it seems to me that this is an appealing proposal for why such verbs take the indicative in Romance.
The main advances of Giorgi & Pianesi's theory are the incorporation of Kratzer's theory of modality into the semantics of mood, with the corresponding suggestion that the ordering source may be relevant, as well as their idea concerning verbs of mental creation. However, a number of empirical problems with other irrealis indicatives remain.
Finally, to conclude this discussion of modal approaches to mood selection, I would note that the fundamental outstanding problem of all such analyses is the lack of a theory telling us what precisely is a possible meaning for an intensional predicate. In terms of Giorgi & Pianesi's framework, for instance, we need a theory, both formal and conceptual, of what combinations of modal base and ordering source make for possible natural language meanings. Only if we know exactly why believe (in French) must be analyzed without reference to the ordering source will we have a substantive explanation for its mood choice. Villalta (2001) notes correctly that we could provide adequate truth conditions for sentences involving believe on the assumption that it does involve the ranking of alternative propositions (the correlate on her theory of reference to the ordering source), and she suggests in fact that languages may differ in terms of whether they handle it with or without a ranking. The other scholars we have discussed, in contrast, seem to assume that the lexical meanings are pretty much the same across languages, with differences in mood selection arising from the meaning of mood. Under either perspective, there is the danger of tailoring lexical meanings to fit one's theory of mood choice, rather than basing the theory of mood on lexical meanings which are independently justified.
5. Future directions
I would like to conclude with a summary of the ideas and issues in the literature which I find most significant:
1. It seems to me that the concepts of comparativity, cumulativity, and ordering source have gotten us very close to an understanding of why desideratives, directives, and predicates whose semantics involve causation so frequently select the subjunctive. A promising possibility would be to combine the idea of a comparative semantics with the formal apparatus of the ordering source theory.
2. New, detailed work on the lexical semantics of intensional predicates will likely play a crucial role in the development of modal theories of mood choice. In particular, we need a theory of the general domain of intensional lexical meanings, one that lets us know such things as what are the possible ways of expressing pre-theoretic notions like desire and belief.
3. The connection between modal subordination and mood should be pursued. Doing so might allow us to explain the connection between the persistent discourse scope of dream and related verbs and the fact that they select the indicative; it may also let us understand the relation between notional mood-indicating modals and modals which function "normally", as real operators.
4. Future work should explicitly address to what extent each language's mood system should be analyzable in logical terms. Should the semantic theory incorporate prototypical features of those contexts associated with each mood, and then allow for borderline cases and a certain degree of arbitrariness, or should it define each language's system in detail?
5. Current theories disagree about the theoretical importance of relating mood to two other empirical domains, tense and polarity. Are the connections which we see the result of parallels across independent systems, or do they show that the semantics of mood markers may involve temporal or polarity notions?
These open issues, among many others, represent the abundance of promising areas of research that have developed within the literature on mood over the last ten or fifteen years.
Acknowledgement
I'd like to thank Raffaella Zanuttini for comments and help with the Italian data.
A Semantics of Mood Bibliography
Abusch, Dorit (1988). Sequence of tense, intensionality, and scope. In The Proceedings of WCCFL 7, 1-14. Stanford: CSLI.
Abusch, Dorit (1997). Sequence of tense and temporal de re. Linguistics and Philosophy 20, 1-50.
Adams, Ernest (1975). The Logic of Conditionals. Dordrecht: Reidel.
Anderson, Alan Ross (1951). A note on subjunctive and counterfactual conditionals. Analysis 11, 35-38.
Asher, Nicholas (1993). Reference to Abstract Objects in Discourse. Dordrecht: Kluwer.
Austin, J. (1962). How to do Things with Words. New York: Oxford University Press.
Baker, Mark and Lisa Travis (1997). Mood as verbal definiteness in a tenseless language. Natural Language Semantics 5, 213-269.
Barwise, Jon (1981). Scenes and other situations. Journal of Philosophy 78, 369-397.
Beghelli, Filippo (1997). Mood and the interpretation of indefinites. Manuscript, University of Pennsylvania.
Bell, A. (1980). Mood in Spanish: A discussion of some recent proposals. Hispania 63, 377-390.
Bolinger, Dwight (1968). Postposed Main Phrases: An English rule for the Romance subjunctive. Canadian Journal of Linguistics 10, 125-197.
Bolinger, Dwight (1974). One Subjunctive or Two. Hispania 61, 218-234.
Brée, David (1982). Counterfactuals and causality. Journal of Semantics 1.
Bresnan, Joan (1972). Theory of Complementation in English Syntax. PhD thesis, MIT.
Brown, Penelope and Stephen Levinson (1987). Politeness: Some Universals in Language Usage. Cambridge: Cambridge University Press.
Chisholm, Roderick M. (1949). The contrary-to-fact conditional. In Herbert Feigl and Wilfrid Sellars (Eds.), Readings in Philosophical Analysis, 482-497. New York: Appleton-Century-Crofts.
Chung, Sandra and Alan Timberlake (1985). Tense, aspect, and mood. In T. Shopen (Ed.), Language Typology and Syntactic Description, 202-258. Cambridge: Cambridge University Press.
Cole, Peter and Jerry Morgan (1975). Syntax and Semantics vol. 3, Speech Acts. New York: Academic Press.
Del Verourdi, Rhea, Irene Tsamadou, and Sophia Vassilaki (1994). Mood and Modality in Modern Greek: The particle NA. In Irene Philippaki Warburton, Caterina Nicolaidi, and Maria Sifianou (Eds.), Themes in Greek Linguistics: Papers from the first international conference on Greek linguistics, 185-192. Amsterdam and Philadelphia: Benjamins.
Dowty, David (1979). Word Meaning and Montague Grammar. Dordrecht: Reidel.
Farkas, Donka (1985). Intensional Descriptions and the Romance Subjunctive Mood. New York: Garland.
Farkas, Donka (1992). On the semantics of subjunctive complements. In P. Hirschbueler and K. Koerner (Eds.), Romance Languages and Modern Linguistic Theory, 69-104. Amsterdam and Philadelphia: Benjamins.
von Fintel, Kai (1998). The presupposition of subjunctive conditionals. In Orin Percus and Uli Sauerland (Eds.), MIT Working Papers in Linguistics 25. Cambridge, MA.: MIT.
Giannakidou, Anastasia (1994). The semantic licensing of NPIs and the Modern Greek subjunctive. In Language and Cognition 4, Yearbook of the Research Group for Theoretical and Experimental Linguistics, 55-68. Groningen: University of Groningen.
Giannakidou, Anastasia (1995). Subjunctive, habituality, and negative polarity items. In M. Simons and T. Galloway (Eds.), The Proceedings of SALT 5, 94-111. Ithaca: Cornell University.
Giannakidou, Anastasia (1997). The Landscape of Polarity Items. PhD thesis, Groningen.
Giannakidou, Anastasia (1999). Affective Dependencies. Linguistics and Philosophy 22, 367-421.
Giorgi, Alessandra and Fabio Pianesi (1998). Tense and Aspect: From Semantics to Morphosyntax. Oxford: Oxford University Press.
Givon, Talmy (1994). Irrealis and the subjunctive. Studies in Language 18, 265-337.
Hausser, Roland R. (1983). The syntax and semantics of English mood. In F. Kiefer (Ed.), Questions and Answers. Dordrecht: Reidel.
Heim, Irene (1982). The Semantics of Definite and Indefinite Noun Phrases. Amherst: GLSA.
Heim, Irene (1992). Presupposition projection and the semantics of attitude verbs. Journal of Semantics 9, 183-221.
Huntley, M. (1984). The semantics of English imperatives. Linguistics and Philosophy 7, 103-134.
Iatridou, Sabine (1998). The grammatical ingredients of counterfactuality. Manuscript, MIT.
Izvorsky, Roumyana (1997). The present perfect as an epistemic modal. The Proceedings of SALT 7.
Kamp, Hans (1981). A theory of truth and semantic representation. In J. Groenendijk, T. Janssen, and M. Stokhof (Eds.), Formal Methods in the Study of Language, 277-322. Amsterdam: Mathematical Centre.
Karttunen, Lauri and Stanley Peters (1975). Conventional implicature. In C.-Y. Oh and D. Dinneen (Eds.), Syntax and Semantics, vol. 11, Presupposition, 1-56. New York: Academic Press.
Kasper, Walter (1992). Presuppositions, composition, and simple subjunctives. Journal of Semantics 9, 307-331.
Klein, P.W. (1974). Observations on the semantics of mood in Spanish. PhD thesis, University of Washington.
Kratzer, Angelika (1989). An investigation of the lumps of thought. Linguistics and Philosophy 12, 607-653.
Kratzer, Angelika (1991). Modality. In Arnim von Stechow and Dieter Wunderlich (Eds.), Semantik / Semantics: An International Handbook of Contemporary Research, 639-650. Berlin: de Gruyter.
Ladusaw, William A. (1980). Polarity Sensitivity as Inherent Scope Relations. New York: Garland.
Lakoff, George (1970). Linguistics and natural logic. Synthese 22, 151-271.
Lewis, David (1973). Counterfactuals. Cambridge, MA.: Harvard University Press.
Lewis, David (1975). Adverbs of quantification. In E. Keenan (Ed.), Formal Semantics of Natural Language, 3-15. Cambridge: Cambridge University Press.
Lewis, David (1979). A problem about permission. In E. Saarinen, R. Hilpinen, I. Niiniluoto, and M. P. Hintikka (Eds.), Essays in Honour of Jaakko Hintikka, 163-175. Dordrecht: Reidel.
Lewis, David (1983). Languages and language. In Philosophical Papers, vol. 1, 163-188. Oxford: Oxford University Press.
Linebarger, Marcia (1987). Negative polarity and grammatical representation. Linguistics and Philosophy 10, 325-387.
McCawley, James (1996). Conversational scorekeeping and the interpretation of conditional sentences. In Masayoshi Shibatani and Sandra Thompson (Eds.), Grammatical Constructions: Their Form and Meaning, 77-101. Oxford: Clarendon Press.
Moretti, G.B. and G.R. Orvieto (1981). Grammatica Italiana. Perugia: Benucci.
Nathan, G.S. and M.W. Epro (1984). Negative Polarity and the Romance Subjunctive. In P. Baldi (Ed.), Papers from the XIIth Linguistic Symposium on Romance Languages, 517-529. Amsterdam and Philadelphia: Benjamins.
Ogihara, Toshiyuki (1989). Temporal Reference in English and Japanese. PhD thesis, University of Texas at Austin.
Ogihara, Toshiyuki (1995). Double access sentences and references to states. Natural Language Semantics 3, 177-210.
Ogihara, Toshiyuki (1996). Tense, Attitudes, and Scope. Dordrecht and Boston: Reidel.
Palmer, F.R. (1986). Mood and Modality. Cambridge: Cambridge University Press.
Palmer, F.R. (1990). Modality and the English Modals. New York: Longman.
Pesetsky, David (1992). Zero syntax, vol. 2. Manuscript, MIT.
Picallo, Carme (1984). The Infl node and the null subject parameter. Linguistic Inquiry 15, 75-101.
Picallo, Carme (1985). Opaque Domains. PhD thesis, CUNY.
Pollock, Jean-Yves (1989). Verb movement, Universal Grammar and the structure of IP. Linguistic Inquiry 20(3):365-424.
Portner, Paul (1992). Situation Theory and the Semantics of Propositional Expressions. PhD thesis, University of Massachusetts, Amherst.
Portner, Paul (1997). The semantics of mood, complementation, and conversational force. Natural Language Semantics 5, 167-212.
Progovac, Ljiljana (1993). The (mis)behavior of anaphora and negative polarity. The Linguistic Review 10, 37-59.
Quer, Josep (1998). Mood at the Interface. The Hague: Holland Academic Graphics.
Quine, Willard Van Orman (1975). Quantifiers and propositional attitudes. In The Ways of Paradox and Other Essays, 185-196. Cambridge, MA.: Harvard University Press.
Raposo, Eduardo (1986). Some asymmetries in the binding theory in Romance. The Linguistic Review 5, 75-110.
Rivero, Maria Luisa (1971). Mood and Presupposition in Spanish. Language 47, 305-336.
Rivero, Maria Luisa (1975). Referential properties of Spanish noun phrases. Language 51, 32-48.
Rivero, Maria Luisa (1994). Clause structure and V-movement in the languages of the Balkans. Natural Language and Linguistic Theory 12(1):63-120.
Rivero, Maria Luisa and Arhonto Terzi (1995). Imperatives, V-movement and logical mood. Journal of Linguistics 31, 301-332.
Roberts, Craige (1987). Modal Subordination, Anaphora, and Distributivity. PhD thesis, University of Massachusetts.
Roberts, Craige (1989). Modal subordination and pronominal anaphora in discourse. Linguistics and Philosophy 12, 683-721.
Rochette, Anne (1988). Semantic and Syntactic Aspects of Romance Sentential Complementation. PhD thesis, MIT.
Sadock, J. and A. Zwicky (1985). Speech act distinctions in syntax. In Timothy Shopen (Ed.), Language Typology and Syntactic Description, 155-196. Cambridge: Cambridge University Press.
Schiffrin, Deborah (1994). Approaches to Discourse. Oxford: Blackwell.
Searle, J. and D. Vanderveken (1985). The Foundations of Illocutionary Logic. Cambridge: Cambridge University Press.
Searle, John (1969). Speech Acts. Cambridge: Cambridge University Press.
Stalnaker, Robert (1968). A theory of conditionals. In N. Rescher (Ed.), Studies in Logical Theory, 315-332. Oxford: Blackwell.
Stalnaker, Robert (1975). Indicative conditionals. Philosophia 5.
Stalnaker, Robert (1978). Assertion. In P. Cole (Ed.), Syntax and Semantics 9, Pragmatics, 315-332. New York: Academic Press.
Stalnaker, Robert (1987). Inquiry. Cambridge, MA.: MIT Press.
von Stechow, Arnim (1995). On the proper treatment of tense. In M. Simons and T. Galloway (Eds.), The Proceedings of SALT 5, 362-386. Ithaca: Cornell University.
Steele, Susan (1975). Past and irrealis: Just what does it all mean. International Journal of American Linguistics 41, 200-217.
Stenius, Erik (1967). Mood and language game. Synthese 17, 254-274.
Stowell, Tim (1982). The tense of infinitives. Linguistic Inquiry 13(3):561-570.
Strawson, Peter (1964). Intention and convention in speech acts. Philosophical Review 73, 439-460.
Suñer, Margarita (1986). On the referential properties of embedded finite clause subjects. In I. Bordelois et al. (Eds.), Generative Studies in Spanish Syntax, 183-203. Dordrecht: Foris.
Suñer, Margarita and J. Padilla-Rivera (1987). Sequence of tenses and the subjunctive, again. Hispania 70, 634-642.
Villalta, Elisabeth (2000). Spanish Subjunctive Clauses Require Ordered Alternatives. In Brendan Jackson and Tanya Matthews (Eds.), Proceedings of SALT X. Ithaca: Cornell University Press.
Villalta, Elisabeth (2001). Propositional Attitudes and Mood Selection in Spanish. Manuscript, University of Massachusetts at Amherst.
Zanuttini, Raffaella (1997). Negation and Clausal Structure. A Comparative Study of Romance Languages. New York: Oxford University Press.
Zucchi, Alessandro (1993). The Language of Propositions and Events. Dordrecht: Kluwer.
Three approaches to discourse and donkey anaphora
Henriëtte de Swart
In recent years, a number of linguistic theories have been developed which take the discourse as their primary syntactic and semantic unit. They are often referred to as "dynamic theories of meaning". The similarities and differences between various approaches to discourse are illustrated by the way they treat problems of reference and anaphora. Questions which arise are: How are discourse and donkey anaphora licensed? How are the indices set, in other words, how do we find the actual antecedent of an anaphor?
1. Discourse as the basic unit of interpretation
Discourse analysis has always played an important role in pragmatics and text linguistics. In the 1980's, discourse became a hot issue in formal semantics. This new interest arose from certain puzzles in the interpretation of anaphoric relations at the discourse level. Examples like (1) constitute the easy cases: (1)
a. Susan_i came in. She_i wanted to schedule a meeting.
b. The head of the department_i came in. She_i wanted to schedule a meeting.
Proper names and definite descriptions are referential expressions which point to and pick out a specific individual in the universe of discourse. In (1), coindexing in the syntax corresponds with coreference in the semantics. For pronouns dependent on quantificational expressions, the situation is more complex. Coindexing between quantificational expressions and anaphora is interpreted in terms of binding within the scope of an operator. Given that the scope of the quantifier ends at the sentence boundary, we do not expect quantificational NPs to license anaphoric pronouns in a later sentence. Compare the well-formed complex sentences in (2) and the infelicitous discourses in (3):
(2) a. Every student_i worried she_i might fail the test.
    b. No student_i thought she_i had passed the exam.
(3) a. Every student_i came in. # She_i wanted to schedule a meeting.
       (∀x (Student(x) → Come(x)) ∧ Schedule(x))
    b. No student_i came to the meeting. # She_i needed to prepare for class.
       (¬∃x (Student(x) ∧ Come(x)) ∧ Prepare(x))
The pronoun she in (2a) and (b) is in the scope of a universal or negative universal quantifier, and the coindexing corresponds with a bound variable interpretation. We see that in (3), coindexing the pronoun with the quantified NP does not lead to a coherent interpretation. We cannot interpret the coindexing relation as coreference, because the quantified NP is not a referential expression. No bound variable interpretation is available, because the pronoun is outside the scope of the quantifier. The variable x in the second conjunct of (3a) and (b) is thus a free variable, the value of which is dependent on the assignment function. This is exactly what first-order predicate logic predicts on the basis of the standard definitions of scope and binding.
However, there are cases which do not quite fit this pattern. These involve discourse anaphora which have indefinite NPs as their antecedent, as in (5), which are to be compared to the bound variable cases in (4):
(4) a. A student_i thought she_i had passed the exam.
    b. Some students_i worried they_i might fail the test.
(5) a. A student_i came in. She_i wanted to schedule a meeting.
       (∃x (Student(x) ∧ Come(x)) ∧ Schedule(x))
    b. Bill owns some sheep. George vaccinates them.
       (∃x (Sheep(x) ∧ Own(bill,x)) ∧ Vaccinate(george,x))
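As a rough illustration of why the static translation in (5a) falls short, the sketch below evaluates that formula in a toy model. The individuals, the predicate extensions and the helper `formula_5a` are assumptions made up for the example. Because the existential quantifier governs only the first conjunct, the truth value of the whole formula depends on whatever the assignment function happens to give to the free occurrence of x.

```python
# Toy model (hypothetical individuals; not taken from the text).
DOMAIN = {"ann", "beth", "carl"}
STUDENT = {"ann"}
COME = {"ann", "carl"}
SCHEDULE = {"carl"}          # only Carl wanted to schedule a meeting


def formula_5a(assignment):
    """Evaluate (Ex (Student(x) & Come(x)) & Schedule(x)) under an assignment.

    The quantifier binds x only inside its own scope; the last
    occurrence of x is free and is looked up in the assignment.
    """
    first_conjunct = any(d in STUDENT and d in COME for d in DOMAIN)
    second_conjunct = assignment["x"] in SCHEDULE
    return first_conjunct and second_conjunct


# The student who came in (Ann) did not want to schedule a meeting,
# yet the formula is true if the context happens to assign Carl to x.
result_carl = formula_5a({"x": "carl"})
result_ann = formula_5a({"x": "ann"})
print(result_carl, result_ann)
```

Under these assumptions the formula is verified by an individual who is not the student introduced in the first sentence, which is not the reading speakers actually assign to the discourse.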
Unlike proper names and definite NPs, indefinite NPs are not referential expressions, so we cannot interpret the coindexing in (5) as coreference. The pronouns are bound by the existential quantifier in (4), but not in (5), because the scope of the quantifier ends at the sentence boundary. The variable x is outside the scope of the existential quantifier ∃, and remains unbound. Its value is dependent on the assignment function. This means that x could refer to any salient individual in the context. But typically, we do not interpret the second sentence of (5a) to claim that some salient female individual wanted to schedule a meeting, or that George vaccinates just anything that happens to be around. We want the pronouns she and them to pick up the referent introduced in the first sentence of each discourse, and refer to the student who just came in and the group of sheep owned by Bill, respectively.
One potential analysis of the anaphoric relations in (5) is a reduction to cases which can be treated at the sentence level. This would avoid the introduction of special discourse machinery, and therefore it would be the most conservative approach. A straightforward proposal along these lines would be to treat sentence sequencing as VP-conjunction. This would make (6a) equivalent to (6b):
(6) a. A student_i called me. She_i asked about the exam.
    b. A student_i called me and (she_i) asked about the exam.
This approach works for the particular case discussed here, but it does not extend to other examples. For instance, (7a) and (b) are not equivalent:
(7) a. Exactly one student_i called me. She_i asked about the exam.
    b. Exactly one student_i called me and asked about the exam.
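The difference between (7a) and (7b) can be made concrete with a small model check. The sketch below is only an illustration: the context with three callers, the names and the helper `exactly_one` are assumptions introduced for the example.

```python
# Hypothetical context: three students called, only one asked about the exam.
STUDENTS = {"ann", "beth", "cleo"}
CALLED = {"ann", "beth", "cleo"}
ASKED = {"beth"}


def exactly_one(individuals):
    """True if the set contains exactly one individual."""
    return len(individuals) == 1


# (7b): exactly one student both called and asked.
true_7b = exactly_one(STUDENTS & CALLED & ASKED)

# (7a): the first sentence alone requires exactly one student caller;
# only then can the pronoun pick up that student.
callers = STUDENTS & CALLED
true_7a = exactly_one(callers) and callers <= ASKED

print(true_7b, true_7a)
```

In this assumed context the VP-conjunction reading is verified while the two-sentence discourse is not, so the two cannot be equivalent.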
(7b) claims that there is exactly one student who satisfies both the property of calling and the property of asking. (7b) comes out true in a context in which several students called me, but only one of them asked about the exam. However, (7a) comes out false in that context, as the first sentence excludes the possibility of more than one student calling.
If the anaphoric relations in (4) and (5) cannot be reduced to the sentence level, we have a true instance of discourse anaphora. The examples at hand involve an incremental interpretation in which the second sentence is interpreted in the context set up by the interpretation of the first sentence. Once we agree that we need to talk about meaning beyond the sentence level, the question arises how we can build a semantic theory that takes discourse as the basic unit of interpretation.
The interest in discourse anaphora has made semantics a lot more 'dynamic' or 'procedural'. This has led to a whole new set of research questions (see Muskens, van Benthem and Visser 1997 for a general overview). For the cases of reference to individuals, the main question is how an incremental, partial theory of discourse meaning handles anaphoric relations as in (5)-(7). This question can be phrased as: how can we design a logic which explains why indefinite NPs license discourse anaphora, and other quantifiers do not (compare (3))? Given that this phenomenon has been characterized as the property to establish anaphoric relations beyond the (traditional) scope of the quantifier, it is not surprising that discourse anaphora should be studied in connection with other contexts in which this property manifests itself. The split between indefinite NPs and other quantifiers also occurs in conditional and quantificational constructions. Compare the following sentences:
(8) a. * If every student_i likes Copenhagen, she_i is happy.
    b. If a student_i likes Copenhagen, she_i is happy.
(9) a. * Every student who read every paper_i on semantics liked it_i.
    b. Every student who read a paper_i on semantics liked it_i.
The representations of (8a) and (9a), given in (10a) and (b) respectively, show that the pronoun translates as a free variable:
(10) a. ((∀x (Student(x) → Like(x,c))) → Happy(x))
     b. ((∀x ∀y ((Student(x) ∧ Paper(y)) → Read(x,y))) → Like(x,y))
The general rules of predicate logic thus correctly rule out (8a) and (9a) as illegitimate binding beyond the scope of the quantifier. However, if we adopt the rules governing scope and binding relations that predicate logic provides, we cannot explain the felicitous anaphoric relations in (8b) and (9b). After all, the representations in (11a) and (11b) show that these sentences also involve anaphoric pronouns outside the scope of the quantifier:
(11) a. ((∃x (Student(x) ∧ Like(x,c))) → Happy(x))
     b. ((∃x ∃y (Student(x) ∧ Paper(y) ∧ Read(x,y))) → Like(x,y))
The quantificational structure of (11a) and (b) is not any different from the one we gave in (10a) and (b). If anaphoric relations in natural language are constrained by the predicate-logical rules of scope and binding, we would expect the coindexing in (8b) and (9b) to be illegitimate as well, but it is not. The problem is not that it is difficult to formulate an appropriate translation for the sentences in (8b) and (9b). The representations in (12) are well-formed formulas in first-order predicate logic, and they do a fine job of capturing the meaning of the natural language sentences:
(12) a. ∀x ((Student(x) ∧ Like(x,c)) → Happy(x))
     b. ∀x ∀y ((Student(x) ∧ Paper(y) ∧ Read(x,y)) → Like(x,y))
The problem with the representations under (12) is that they do not seem to fit our general aim of developing a compositional theory of meaning. Note in particular that the indefinite NP which shows up in the antecedent of a conditional in (8b) and in the relative clause of a universally quantified NP in (9b) is translated in (12) in terms of a wide scope universal quantifier. However, universal quantifiers are not normally taken to be the meaning of an indefinite NP. So we have a dilemma: on the basis of the sentences in (8b) and (9b), we can compositionally derive the formulas in (11), but they do not give us the desired binding relation. The formulas in (12) give us the desired interpretation, but we do not know how to derive them in a compositional way. This raises the question whether standard first-order predicate logic is the appropriate meta-language to use in the interpretation of natural language.
The problem raised by conditional sentences like (8) and (9) is called the problem of 'donkey' anaphora. The phenomenon was named after the original examples discussed by Geach (1962), which involved farmers and donkeys:
(13) a. If a farmer_i owns a donkey_j, he_i beats it_j.
     b. Every farmer_i who owns a donkey_j beats it_j.
The problems raised by discourse and donkey anaphora are related: both depend on a special relation between an indefinite NP and an anaphoric pronoun outside the regular scope domain of the NP. The phenomena at hand are situated at the semantics-pragmatics borderline, but they are equally of interest to researchers concerned with language processing. In a processing view, the meaning of an expression is regarded as an instruction to the hearer to construct or extend a representation. This view fits in well with the intuitive way in which we interpret discourse: we do not wait until the very end of the entire discourse before we start interpreting it. The idea of an incremental interpretation is that we interpret one sentence at a time and look at each sentence as an extension of the information built up by the context so far. A dynamic theory of natural language meaning is thus formulated in terms of 'updating' or 'context change' (Chierchia 1995).
The dynamic view has important consequences for semantic theory. First, updating involves an interpretation of sentence sequencing that corresponds with an asymmetric version of conjunction. In classical first-order logic, p ∧ q is equivalent to q ∧ p. But in natural language, (14a) is a coherent discourse under the intended interpretation indicated by coindexing, but (14b) is not:
(14) a. A student_i came in. She_i wanted to schedule a meeting.
     b. # She_i wanted to schedule a meeting. A student_i came in.
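The order sensitivity illustrated in (14) can be sketched by treating each sentence as an update on a context. The code below is a deliberately minimal illustration and not any of the formal systems discussed later: the context is just a list of available referents, and the helpers `indefinite`, `pronoun` and `interpret` are hypothetical names introduced for the example.

```python
class UnresolvedPronoun(Exception):
    """Raised when a pronoun finds no antecedent in the context so far."""


def indefinite(noun):
    """Update that adds a new discourse referent described by `noun`."""
    def update(context):
        return context + [noun]
    return update


def pronoun():
    """Update that requires an already available discourse referent."""
    def update(context):
        if not context:
            raise UnresolvedPronoun("no antecedent available yet")
        return context          # the pronoun adds no new referent
    return update


def interpret(discourse):
    """Process updates left to right, starting from an empty context."""
    context = []
    for update in discourse:
        context = update(context)
    return context


a_student_came_in = indefinite("student")
she_wanted_a_meeting = pronoun()

# (14a): the indefinite comes first, so the pronoun finds an antecedent.
order_14a = interpret([a_student_came_in, she_wanted_a_meeting])

# (14b): the pronoun comes first and cannot be resolved.
try:
    interpret([she_wanted_a_meeting, a_student_came_in])
    order_14b_ok = True
except UnresolvedPronoun:
    order_14b_ok = False

print(order_14a, order_14b_ok)
```

Because each update receives the output of the previous one, swapping the two sentences changes the result, unlike swapping the conjuncts of a classical conjunction.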
Second, an incremental theory of discourse is necessarily a partial theory of meaning. In a static approach, one cannot really talk about 'old' and 'new' information. As a result, a static approach does not capture the intuition that indefinite NPs introduce a 'new' referent into the discourse, which is available from then on for anaphoric reference. The only way in which we can update the context with the information contributed by the sentence is to adopt a partial theory of meaning which allows our universe of discourse to grow. Last but not least, the dynamic view implies that meaning becomes procedural. If interpretation becomes updating, it is a relation between input and output conditions. The formalization of input and output conditions as information states makes it possible to define the semantic value of a sentence as a function from information states to information states.
Note that the dynamic view does not replace the static view, but extends it to account for certain phenomena at the discourse level. Also, dynamic semantics does not do away with the semantics-pragmatics distinction altogether. Although dynamic semantics deals with certain phenomena which were traditionally taken to be part of pragmatics, there is still an important role for pragmatics to play in an overall theory of discourse (as we will see in section 5 below).
In principle, there are three different lines along which we can try to approach the problem of discourse and donkey anaphora. The first option is to say that there is something special about the anaphoric pronoun in these cases. It cannot be interpreted in terms of regular coreference or binding, but it gets some other interpretation. This solution is explored in the E-type approach. The second line of explanation is to say that there is something special about the indefinite NP. If it does not behave as a regular quantifier, maybe we should not translate it as an existential quantifier. This has given rise to theories like File Change Semantics and Discourse Representation Theory. These theories interpret indefinite NPs as variables and use a mechanism of unselective binding to allow them to be bound by other quantifiers in the sentence or the discourse. The third possibility is to view the licensing problem as a problem with the definition of binding domain. This triggers proposals to somehow extend the scope domain of the indefinite NP. This solution is adopted in Dynamic Predicate Logic, which develops a notion of dynamic binding.
In the next three sections, we will study the three types of analyses developed to deal with discourse and donkey anaphora. Section 5 addresses the issue of anaphora resolution. In the final section, we discuss some current problems in the study of discourse and donkey anaphora.
2. E-type anaphora
In the late 1970s, Evans developed a way to account for anaphoric pronouns that are outside the scope of their binding operator by giving the pronoun a special interpretation. According to Evans, pronouns outside the scope of their binding operator are not bound by the quantifier. In order to account for the observation that discourse and donkey anaphora are dependent on an indefinite NP, he claims that these pronouns reconstruct their descriptive content on the basis of the antecedent NP. Besides coreferentiality and binding, we thus have a third way of interpreting coindexing in the syntax. The so-called E-type anaphor is nothing but a disguised description. Usually, the descriptive content of the E-type anaphor consists in the conjunction of the common noun and the predicate. For instance:
(15) a. A student came in. She had a question about the exam.
        she = the student who came in
     b. Bill owns some sheep and Max vaccinates them.
        them = the sheep Bill owns
The E-type approach extends to the quantificational cases in (16):
(16) a. If a student likes Copenhagen, she is happy.
        she = for every case we examine, the student in question who likes Copenhagen
     b. Every student who read a paper on semantics liked it.
        it = for every student, the paper she read
If the E-type approach is generally available as an interpretation mechanism, the question arises how much freedom the pronoun has in reconstructing its descriptive content. Cooper (1979) suggests that reconstruction is dependent on a contextually salient function. If we interpreted this broadly, we should be able to reconstruct the descriptive content of an E-type pronoun from any salient expression picking out a unique referent in the context. However, examples like (17) and (18) show that the available interpretations are constrained by the syntax:
(17) a. Bill owns a cat. Max takes care of it.
     b. Bill is a cat-owner. # Max takes care of it.
(18) a. Everyone who owns a cat takes good care of it.
     b. * Every cat-owner takes good care of it.
We may assume that the function which maps cat-owners to the cat they own is salient in (17) and (18), because we are talking about cats and owners. Although this function can be used to reconstruct the descriptive content of the E-type pronoun in (17a) and (18a), it is unable to license the pronoun it in (17b) and (18b). The data in (17) and (18) suggest that an E-type pronoun needs to be licensed by a 'real' antecedent, given by an indefinite NP for instance; an incorporated nominal is not good enough. Note that this conclusion is valid for English, but that we do seem to find discourse transparent incorporated nominals in other languages, such as Greenlandic Eskimo (cf. Van Geenhoven 1998). The general conclusion, however, is that the pronoun reconstructs its descriptive content on the basis of a linguistic expression, rather than on the basis of a contextually available function. These observations make it clear that the relation between the pronoun and its antecedent is tighter than what a liberal view of the E-type analysis predicts. This is not to say that we cannot make the relation between the pronoun and its antecedent tighter in an E-type analysis. But such restrictions have to be formulated on top of the interpretation procedure; they do not follow naturally from the E-type analysis itself.
Another question that the E-type approach raises concerns the presuppositions of uniqueness associated with definite descriptions. According to Russell (1905), definite descriptions get the interpretation in (19):
(19) ∃x (P(x) ∧ ∀y (P(y) → y = x))
Under this interpretation, there is an object x which has a certain property P and no other object has this property. Heim (1982) points out that uniqueness presuppositions in the context of donkey sentences mean that in examples like (16b) and (18a), we would have to assume that we quantify over people who read exactly one paper on semantics, or who own exactly one cat. This seems like a counterintuitive requirement. Heim argues that it is generally not appropriate to associate uniqueness presuppositions with donkey anaphora. Examples like (20) support her view that this would be undesirable:
(20) Everyone who buys a sage plant in this store gets eight others along with it.
Given that every buyer gets nine sage plants, it is impossible to interpret it as the unique sage plant for every buyer. Heim (1982) rejects the E-type analysis because the uniqueness condition on definite descriptions is incompatible with the interpretation of donkey anaphora in sentences like (20).
In view of more recent analyses of definites, we can argue that the 'sage plant examples' do not provide a convincing counterargument against the E-type analysis of discourse and donkey anaphora. The problem of (20) is not so much the interpretation of the pronoun as a hidden description, but the interpretation of that description as carrying a presupposition of uniqueness. We can envisage a slightly different interpretation of definite descriptions that does not have these problems. Independently of the E-type analysis, Neale (1990) proposes to interpret definite descriptions as numberless descriptions. If we adopt this proposal, we can have E-type pronouns which do not carry uniqueness presuppositions. A revival of the E-type approach in work by Heim (1990), van der Does (1994), Lappin and Francez (1994), Cheng and Huang (1996), and Egli and Von Heusinger (1995) shows that quite sophisticated analyses of donkey sentences can be developed based on this strategy.
The E-type analysis illustrates an approach to discourse and donkey anaphora which tries to account for the observations made in section 1 by changing the interpretation of the anaphoric pronoun. A second solution which has been developed leaves the interpretation of the pronoun intact, but changes the interpretation of the indefinite NP. This approach to the problem of discourse and donkey anaphora is exemplified by File Change Semantics and Discourse Representation Theory.
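The sage plant argument around (20) can be illustrated with a small sketch. Everything in it is an assumption made for the example: the two buyers, the plant names, and the helpers `russellian_the` and `numberless_the`, the latter being only a crude approximation of the numberless descriptions attributed to Neale (1990) above.

```python
# Hypothetical model for (20): each buyer takes home nine sage plants.
BOUGHT = {
    "buyer1": {f"sage{i}" for i in range(1, 10)},
    "buyer2": {f"sage{i}" for i in range(10, 19)},
}


def russellian_the(candidates):
    """Return the unique member of `candidates`, or None if there is
    no member or more than one (the uniqueness condition of (19))."""
    return next(iter(candidates)) if len(candidates) == 1 else None


def numberless_the(candidates):
    """Simplified numberless reading: the description covers whatever
    satisfies it, whether that is one object or many."""
    return set(candidates)


# E-type pronoun 'it' read as 'the sage plant the buyer bought'.
unique_readings = {b: russellian_the(p) for b, p in BOUGHT.items()}
numberless_readings = {b: numberless_the(p) for b, p in BOUGHT.items()}

print(unique_readings)
print({b: len(p) for b, p in numberless_readings.items()})
```

With the uniqueness condition in place no buyer provides a referent for the pronoun, whereas the numberless variant still yields a non-empty denotation for each buyer.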
3. Unselective binding
As an alternative to the E-type approach, Heim (1982) developed File Change Semantics, and Kamp (1981) and Kamp and Reyle (1993) developed Discourse Representation Theory (DRT). The presentation here follows Kamp's 'box' notation rather than Heim's files. For more details and a complete formulation of the construction rules and embedding conditions, the reader is referred to Kamp and Reyle (1993).
In between linguistic expressions and their interpretation in a model, DRT postulates an intermediate level of semantic representation, viewed as a mental, or cognitive, representation of the (partial) information conveyed by the discourse so far. It is written in a language of 'boxes', called Discourse Representation Structures (DRSs). So-called construction rules take natural language expressions as input and have boxes as output. The boxes contain discourse referents, and DRS conditions which specify what we know about the discourse referents. A discourse is true if the box which is constructed for it can be embedded into the model. The embedding conditions are the counterpart to truth conditions in classical first-order logic.
The introduction of discourse referents and conditions is one of the things that are regulated by construction rules. The typical role of indefinite NPs is to introduce new discourse referents into the DRS. Consider the discourse in (21):
(21) A student_i called Jane. She_i asked about the exam.
The indefinite NP in the first sentence of (21) introduces an individual which has the property of being a student. The verb phrase adds that this individual has the property of calling Jane. The proper name introduces a discourse referent and a condition that the referent bears the name Jane. The transitive verb introduces a two-place relation that holds between discourse referents. The representation of the first sentence of (21) is spelled out in figure 1.
Figure 1: A student called Jane.
The discourse referents of the DRS are listed in the top row of the box. They are followed by conditions describing the properties of the discourse referents. The DRS is true if and only if we can find an embedding function f which assigns individual values to all the discourse referents in such a way that there exists some student who called Jane. The existential force of the indefinite NP is located in the embedding function, for there is no quantifier present in the DRS. There may be many embedding functions which make the sentence come out true. This supports the view that discourse referents are only stand-ins for real individuals.
The incremental interpretation of the discourse means that the second sentence adds more discourse referents and more conditions to the same DRS. The fact that the pronoun she in (21) is coindexed with the indefinite NP a student is interpreted as an instruction to identify the discourse referent u with the discourse referent x introduced for the indefinite NP. Identification is allowed if this is an appropriate and accessible discourse referent. In DRT, accessibility requires the antecedent to be in a box 'as high as' or 'higher than' the discourse referent for the pronoun. In structural terms, accessible antecedents are in the same box, in a box to the left, or in a box that embeds the one that contains the discourse referent for the pronoun. Here, x is accessible as an antecedent for the pronoun, because it occurs in the same box, so identification is allowed. The interpretation of the two-sentence discourse is spelled out in figure 2.
  x  y  u
  ----------
  Student(x)
  Jane(y)
  Call(x,y)
  u = ?
  u = x
  Ask(u)
Figure 2: A student_i called Jane. She_i asked about the exam.
The treatment of quantificational NPs such as every student in DRT differs in important respects from the analysis of indefinite NPs. Universal quantifiers invoke a box-splitting construction rule, which introduces two sub-boxes related by the connective ⇒. The restriction on the quantifier is in the left subordinate box and the scope of the quantifier is in the right subordinate box. Consider the DRS constructed for the first sentence of (22) in figure 3.
(22) Every student called Jane. She asked a question about the exam.
  y
  ----------
  Jane(y)
  [ x | Student(x) ] ⇒ [ | Call(x,y) ]
Figure 3: Every student called Jane.
The verification procedure for ⇒ requires every embedding function which verifies the left embedded box to extend to a function which verifies the right embedded box. In the case of figure 3, this interpretation requires every individual which satisfies the property of being a student (and can thus function as the value for the discourse referent x) to have the property of calling Jane. The similarity with first-order predicate logical formulas makes it possible to grasp this at an intuitive level.
If we extend the DRS in figure 3 with the interpretation of the second sentence of (22), we end up with the DRS in figure 4. The pronoun she introduces a discourse referent u in the top box, just like in figure 2 above. The coindexing of the pronoun with the quantificational NP indicates that the intended antecedent for she is the quantified NP every student. Note however that the discourse referent introduced by this NP is not present in the top box, because the quantificational expression cannot be directly mapped onto some specific individual in the universe of discourse. The discourse referent x introduced by the quantified NP is not accessible as an antecedent for the pronoun, because it is contained in a subordinate box. As a result, u cannot be identified with x. We end up with a DRS for which we cannot give a coherent interpretation with the pronoun taking the quantified NP as its antecedent.
Figure 4: Every student_i called Jane. She_i asked a question about the exam.
The DRSs in figures 2 and 4 show that indefinite NPs and quantified NPs are treated in crucially different ways. The construction rules these expressions trigger make indefinite NPs, but not quantified NPs, suitable antecedents for discourse anaphora. The next step is to consider what happens if we embed an indefinite NP under a universal quantifier, as in (23), where the anaphor is a donkey pronoun. In DRT, (23) gets the representation in figure 5.
(23)
Every student who read a paper_i on semantics liked it_i.
  [ x y | Student(x), Paper(y), Read(x,y) ] ⇒ [ u | Like(x,u), u = ?, u = y ]
Figure 5: Every student who read a paper_i on semantics liked it_i.
The pronoun it in (23) is coindexed with a paper. The intended antecedent y of the pronoun it is in the box to the left of the one which contains u. According to the rules on accessibility, this means that y is an accessible antecedent for u. Thus we can interpret it as being dependent for its interpretation on the indefinite NP a paper. Note that the indefinite NP a paper is embedded under a universal quantifier. The connective ⇒ is defined in such a way that for all embedding functions which make the left box true it should be possible to extend them to one which makes the right box true. Because the embedding function assigns individuals to all the discourse referents in the left box, this formulation amounts to universal
quantification over all the discourse referents in the left box. The identification of u with y brings the pronoun under the scope of the universal quantifier, and the donkey anaphor is interpreted as a regular bound pronoun. Given the universal force of the binder, the meaning of the sentence can be paraphrased as 'for every student x, and every paper y that x read, x liked y'.
The interpretation of ⇒ reflects an important difference between DRT and classical first-order logic. In standard first-order logic, all quantification is selective. That is, the universal or existential quantifier in a formula ∀x φ or ∃x φ binds the variable x, and no other variable around in the formula φ. The connective ⇒, however, effectively binds all the variables standing for discourse referents in the left subordinate box. This amounts to assuming unselective quantification over all the variables in the restriction of the quantifier. It is the combination of the interpretation of indefinite NPs as introducing variables with the concept of unselective binding which permits File Change Semantics and DRT to provide a unified analysis of discourse and donkey anaphora.
The advantages of the Kamp/Heim approach to indefinites are clear. First, we do not need a third way of interpreting coindexing in the syntax: we can make do with the classical mechanisms of coreferentiality and binding. The second main achievement is a principled explanation of the difference in licensing behavior between quantified NPs such as every student and no student on the one hand, and indefinite NPs on the other hand. The explanation is that quantified NPs are real quantifiers, which introduce binding operators. Indefinite NPs do not have any quantificational force of their own; they just translate as free variables. Furthermore, the Kamp/Heim approach to indefinites and the method of unselective binding have been successfully applied to a wide variety of linguistic problems. Examples include a treatment of temporal anaphora (Kamp and Rohrer 1983; Partee 1984; Hinrichs 1986; Kamp and Reyle 1993), an account of generic and partitive interpretations of bare plurals and singular indefinites (Diesing 1992; Kratzer 1995; Krifka et al. 1995), and an analysis of presuppositions (van der Sandt 1992; Beaver 1997).
Notwithstanding the success of the framework, the DRT approach to anaphora also faces some serious problems. Crucially, the interpretation of donkey pronouns involves the assumption that indefinites always take over the quantificational force of their binder. This does not always yield the correct interpretation. Consider an example like (24):
(24)
Most students who own a cat take good care of it.
Most_{x,y} (Student(x) ∧ Cat(y) ∧ Own(x,y)) (Care(x,y))
The interpretation of most as an unselective quantifier that binds both the student and the cat variable results in an interpretation in which the sentence quantifies over pairs of students and cats and claims that for most of these pairs, it is true that the student takes good care of the cat. But suppose we have a situation in which there are 10 students, 9 of whom own one cat each and take good care of it. The 10th student owns 15 cats, and neglects them. According to DRT, the sentence should be false in this situation, because most pairs of a student and a cat (15 of the 24) are such that the student does not take good care of the cat. Intuitively, however, we would take the sentence to be true in the given context. After all, most students that have one or more cats are good caretakers. These observations suggest that a sentence like (24) expresses quantification over cat-owning students rather than over pairs of a student and a cat. This problem arises with all quantifiers which have more than existential, but less than universal force, such as most, many and few. Because the problem has to do with proportions, it has been dubbed the proportion problem.
The proportion problem spreads through the whole system, because unselective binding is a central property of DRT. For instance, we can come up with variants of the proportion problem which involve temporal anaphora (see Partee 1984 and de Swart 1991 for discussion). We can devise solutions to the proportion problem within DRT, but it is clear that the existence of the problem as such casts doubt on whether we really want to treat natural language determiners as unselective quantifiers (see Kadmon 1987, Heim 1990, Chierchia 1992 for further discussion).
DRT proposes a radical difference between indefinite NPs and real quantificational NPs. This distinction explains the difference in licensing of discourse anaphora between NPs like a student and every student.
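Returning to the proportion problem, the scenario with ten students described above can be tallied directly. The sketch below is illustrative only: the names are invented, 'most' is assumed to mean 'more than half', and the selective reading is assumed to count a student as a good caretaker only if she cares for every cat she owns.

```python
# Scenario from the text: 10 students; 9 own one cat each and care for it,
# the 10th owns 15 cats and neglects them. Names are illustrative.
owns = {f"s{i}": [f"cat{i}"] for i in range(1, 10)}
owns["s10"] = [f"stray{j}" for j in range(1, 16)]
cares_for = {(s, c) for s in owns if s != "s10" for c in owns[s]}


def most(hits, total):
    """'Most' read as 'more than half' (an assumption of this sketch)."""
    return hits > total / 2


# Unselective reading: quantify over student-cat pairs.
pairs = [(s, c) for s, cats in owns.items() for c in cats]
good_pairs = [p for p in pairs if p in cares_for]
unselective = most(len(good_pairs), len(pairs))

# Selective reading: quantify over cat-owning students
# (here: students who care for every cat they own).
owners = list(owns)
good_owners = [s for s in owners if all((s, c) in cares_for for c in owns[s])]
selective = most(len(good_owners), len(owners))

print(len(good_pairs), len(pairs), unselective)
print(len(good_owners), len(owners), selective)
```

Counting pairs gives 9 out of 24, so the unselective reading comes out false, while counting cat-owning students gives 9 out of 10, which matches the intuitive judgment.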
Several people have observed that this means that we lose the nice, uniform analysis of NPs that has been developed within Montague Grammar and which we exploit in generalized quantifier theory. This also relates to other empirical problems. It has been observed that the difference between a and every, no is gradual, rather than absolute. That is to say, in certain cases these quantifiers also allow binding beyond the scope of the operator, as shown by the examples in (25) (from Fodor and Sag 1982 and Roberts 1989 respectively):
(25) a. Each student in the syntax class was accused of cheating on the exam, and he was reprimanded by the dean.
     b. Either there is no bathroom in this house, or it is in a funny place.
(25a) means that every student who cheated was reprimanded by the dean and (25b) means that the bathroom is in a funny place if there is one. In
standard DRT it is hard to account for these cases. In (25a) the anaphoric pronoun is dependent on a real quantificational NP, and not on an indefinite NP which does not have any quantificational force of its own. The negation operator in (25b) is supposed to close off the binding domain of the indefinite NP, which should therefore be inaccessible to the pronoun in the right disjunct. If we take these examples seriously, we seem to need a more general approach to binding beyond the scope of the operator. Within DRT, we find several proposals. For instance, Roberts (1989, 1996) explores the use of domain restrictions on quantificational and modal operators to account for cases like (25a), and Krahmer and Muskens (1995) give a revised definition of negation in DRT which handles examples like (25b). But the problems have also inspired researchers to develop radically different approaches to discourse and donkey anaphora. In order to bridge the gap between indefinite NPs and other quantificational NPs, alternative analyses have been developed which propose an extension of the binding domain of certain quantifiers (in particular, existential ones), but not others. The proposals for dynamic binding follow ideas developed by Groenendijk and Stokhof (1990, 1991). This brings us to the third type of solution to the problems raised by the phenomena of discourse and donkey anaphora, namely a change of the concept of binding domain of a quantifier.
4. Dynamic binding
In the literature, we find several systems which try to combine the traditional interpretation of indefinite NPs in terms of existential quantifiers with the idea that indefinite NPs introduce a discourse referent that can be picked up by an anaphoric pronoun in subsequent sentences or in the consequent of a conditional. This provides new analyses of discourse and donkey anaphora. Examples of such combined dynamic systems are Barwise (1987), Schubert and Pelletier (1989) and Groenendijk and Stokhof (1990, 1991). We will concentrate here on the dynamic predicate logic (DPL) developed by Groenendijk and Stokhof (1991). The starting point for DPL is the view that the utterance of a sentence brings the hearer from a certain state of information to another one. As long as we restrict ourselves to information about individuals that are in our domain of discourse, we can identify an information state with an assignment of objects to variables. Context change is then the change from an input assignment to an output assignment. Accordingly, the interpretation of a formula in DPL is a set of ordered pairs of assignments, the set of its possible 'input-output' pairs. For instance, (26a) is interpreted as in (26b):
(26)
a. $\exists x\, P(x)$
b. $[\![\exists x\, P(x)]\!] = \{\langle g, h\rangle \mid h[x]g \wedge h(x) \in F(P)\}$
h[x]g means that assignment h differs from g at most with respect to the value it assigns to x. F is the interpretation function which assigns individuals to individual constants and sets of n-tuples of individuals to n-place predicates. All assignments h that differ from g at most in x and assign to x an individual in the interpretation of P are taken to be possible outputs with g as input. In order to treat discourse anaphora, we need to combine the dynamic interpretation of indefinite NPs with an analysis of sentence sequencing. In DPL, sentence sequencing is represented as dynamic conjunction. The definition is in (27): (27)
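$[\![\varphi \wedge \psi]\!] = \{\langle g, h\rangle \mid \exists k : \langle g, k\rangle \in [\![\varphi]\!] \wedge \langle k, h\rangle \in [\![\psi]\!]\}$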
According to this definition, the interpretation of φ ∧ ψ with input g may result in output h if and only if there is some k such that interpreting φ in g may lead to k, and interpreting ψ in k gets us from k to h. The fact that the second conjunct is interpreted with respect to the assignment which results after interpreting the first conjunct means that we have an incremental interpretation. That is, the second sentence is interpreted in the context which results after processing the first sentence. Suppose now that φ contains an existential quantifier. In that case, the output of processing the first conjunct is a variable assignment which assigns a certain object to the variable x bound by the existential quantifier. Given that this is the input assignment for the second conjunct, any occurrence of x in the second conjunct will be taken to refer to that same object. Because of its power to pass on variable bindings from the left conjunct to the right one, conjunction is called an internally dynamic connective. And because of its capacity to keep passing on bindings to conjuncts yet to come, it functions as an externally dynamic connective as well. The combined dynamic treatment of the existential quantifier and conjunction solves the problem of discourse anaphora in sequences like (28): (28)
A student_i called Jane. She_i asked a question about the exam.
The pronoun and the indefinite NP bear the same index, which means that they use the same variable in their interpretation. The output assignment of the first sentence fixes the value assigned to the variables bound by the existential quantifier. Given that this is the input assignment for the second sentence, the value of x is passed on to the next sentence. An occurrence of the same variable in the second sentence is not in the scope of the existential quantifier in the ordinary sense. However, the dynamic
interpretation of conjunction implies that it is bound by the quantifier with the same force as an occurrence of the same variable in the first sentence. Implication is another internally dynamic connective. With respect to a certain input assignment, the antecedent of an implication results in a set of assignments that verify the antecedent. For the implication as a whole, it is required that each of these verifying assignments be a possible input for the consequent. This is reflected in the following interpretation of implication: (29)
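$[\![\varphi \rightarrow \psi]\!] = \{\langle g, h\rangle \mid h = g \wedge \forall k : \langle h, k\rangle \in [\![\varphi]\!] \Rightarrow \exists j : \langle k, j\rangle \in [\![\psi]\!]\}$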
Suppose that the antecedent contains an indefinite NP which translates in terms of a dynamic existential quantifier binding a variable x. Suppose further that the consequent contains a pronoun which is coindexed with the indefinite NP, so that it introduces an occurrence of the same variable x. In that case, the quantification over output assignments of the antecedent guarantees that the value assigned to x is passed on to the consequent. This guarantees the right anaphoric bindings in donkey sentences like (30a) and (30b): (30)
a. If a farmer_i owns a donkey_j, he_i beats it_j.
b. Every farmer who owns a donkey_i beats it_i.
The binding effect of the existential quantifier occurring in the antecedent extends to occurrences of the corresponding variable in the consequent. Universal quantification over the output assignments of the antecedent indirectly gives the indefinite NP universal force. In this way, DPL yields the same interpretation for donkey sentences as Kamp (1981) and Heim (1982) obtain by interpreting indefinites as variables. The main advantage of the dynamic framework is that we stay close to traditional predicate logic and provide a unified treatment of indefinite and quantificational NPs, which is compatible with Montague grammar and generalized quantifier theory. Thus it is easy to account for the observation that the difference between indefinite NPs and other quantified NPs is not absolute, but only relative. Universal quantifiers and negation are usually static, but have dynamic definitions as well to deal with examples like (23) above. Chierchia (1992) shows that we can extend Groenendijk and Stokhof's definition to generalized quantifier interpretations of determiners like most. He shows that in his system the proportion problem does not arise, because determiners only bind the variable of the 'head noun'. Thus in cases like (24) above, we automatically get quantification over cat-owning students, rather than over pairs of students and
cats. Chierchia (1992, 1995) further argues that we need to combine dynamic logic with an E-type approach in order to give the various donkey sentences their correct interpretations. More generally, the developments within the E-type approach, DRT and dynamic logic are such that they converge into one general dynamic analysis. There are more and more 'mixed' systems, which combine the insights of the three approaches (compare Muskens 1995).
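The DPL clauses in (26), (27) and (29) can be made concrete with a small interpreter. What follows is only a sketch under stated assumptions, not Groenendijk and Stokhof's own formulation: formulas are encoded as nested tuples, assignments are partial Python dictionaries (DPL itself uses total assignments), the existential clause is stated for an arbitrary body rather than just P(x), and the model with individuals f1, f2, d1, d2 is hypothetical.

```python
# Toy model (hypothetical individuals): two farmers, two donkeys.
DOMAIN = {"f1", "f2", "d1", "d2"}
F = {
    "farmer": {("f1",), ("f2",)},
    "donkey": {("d1",), ("d2",)},
    "owns": {("f1", "d1"), ("f2", "d2")},
    "beats": {("f1", "d1")},
}

def pred(name, *variables):
    """Build an atomic formula such as owns(x, y)."""
    return ("pred", name, variables)

def outputs(phi, g):
    """Return every assignment h such that <g, h> is in [[phi]]."""
    op = phi[0]
    if op == "pred":      # atomic formulas are tests: output equals input
        _, name, variables = phi
        return [g] if tuple(g[v] for v in variables) in F[name] else []
    if op == "exists":    # reset x to any individual, then interpret the body
        _, x, body = phi
        return [h for d in sorted(DOMAIN) for h in outputs(body, {**g, x: d})]
    if op == "and":       # (27): feed each output of the left conjunct to the right
        _, left, right = phi
        return [h for k in outputs(left, g) for h in outputs(right, k)]
    if op == "imp":       # (29): a test; every antecedent output must verify the consequent
        _, ante, cons = phi
        ok = all(outputs(cons, k) for k in outputs(ante, g))
        return [g] if ok else []
    raise ValueError(f"unknown operator: {op}")

# "a farmer owns a donkey", with x and y dynamically bound
farmer_owns_donkey = ("exists", "x", ("and", pred("farmer", "x"),
                      ("exists", "y", ("and", pred("donkey", "y"),
                                       pred("owns", "x", "y")))))

# Discourse: "A farmer owns a donkey. He beats it."
discourse = ("and", farmer_owns_donkey, pred("beats", "x", "y"))
discourse_out = outputs(discourse, {})
print(discourse_out)          # [{'x': 'f1', 'y': 'd1'}]

# Donkey sentence (30a): false while f2 does not beat d2 ...
donkey = ("imp", farmer_owns_donkey, pred("beats", "x", "y"))
donkey_before = bool(outputs(donkey, {}))
F["beats"].add(("f2", "d2"))  # ... and true once every owner beats his donkey
donkey_after = bool(outputs(donkey, {}))
print(donkey_before, donkey_after)  # False True
```

In the discourse case the occurrences of x and y in beats(x, y) lie outside the syntactic scope of the quantifiers, yet they receive the values fixed by the first conjunct. In the conditional, the check over all outputs of the antecedent is what gives the indefinites their universal force.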
5. Anaphora resolution
So far we have forced the pronoun to take up a certain interpretation by coindexing it with the NP we intended to be its antecedent. Discourses in real life do not normally come with subscripts, so the question arises how we know where to put them. We have emphasized that coindexing is a syntactic device which reflects the outcome of a process of anaphora resolution. One of the reasons why it is hard to develop a good theory of natural language processing is that anaphora resolution is not a fully compositional process. In section 3 above, we argued that the instruction on the pronoun is to unify its referent with an appropriate and accessible antecedent. Usually we can determine which antecedents are accessible to certain pronouns in a compositional procedure. However, different factors come into play to determine which of the accessible antecedents provides the actual interpretation of the pronoun. In this section, we study strategies which make predictions about the actual choice of the antecedent. For some cases grammar is all we need, even if there is more than one accessible antecedent for the pronoun. Consider the following example, taken from Kameyama (1996): (31)
John hit Bill. He hit him back.
The discourse referents introduced for John and Bill are accessible antecedents for the two pronouns he and him. However, hearers typically interpret the second sentence of (31) as conveying the information that Bill hit John. This is a result of the presupposition introduced by back. Back presupposes that a similar event has already occurred, but the distribution of roles in that previous event is exactly the reverse of the one in the current event. The presupposed event is spelled out in the first sentence of (31). This means that the names of the hitter and the hittee are fixed. The reversal of the argument roles implies that the second sentence unambiguously describes an event of Bill hitting John. The example in (31) nicely shows that purely linguistic information can fully determine anaphora resolution. However, in most other cases in
which linguistic information plays a role, it is either as a default inference or in combination with some other source of information. This is illustrated by the following example, taken from Vermeulen (1994): (32)
a. The brick was thrown against the window. It broke.
b. John was surprised by the strength of the window. The brick was thrown against the window. It broke.
Lexical semantics provides us with information about the fragility of objects. From this, we know that windows typically have a higher value on the scale of fragility than bricks. Thus in (32a), it is more likely that the window broke than the brick. (32b) shows that this is a default inference which can be overridden in a richer context. The interaction between lexical information and discourse interpretation is an important object of study in computational semantics. For more discussion, see Webber (1991), Asher (1993), Asher and Lascarides (1995), Nunberg (1995), etc. Aside from linguistic information, other types of information contribute to the resolution of the anaphor. According to Reinhart (1982), the topic of a sentence is pragmatically defined as what the sentence is about, and the comment is what we predicate of the topic. It is quite generally assumed that topics make good antecedents for pronouns, because their referents are maximally salient at the point in the discourse at which the pronoun is used (e.g. Lambrecht 1994, Fox 1987). Although subjects are generally taken to be default topics, Reinhart (1982) argues that this preference is much stronger in passive sentences than in active ones. Consider the pair of examples in (33): (33)
a. When she entered the room, Lili was greeted by Lucie.
b. When she entered the room, Lucie greeted Lili.
In (33a), the cataphoric (forward looking) pronoun she must be interpreted as referring to Lili, whereas in (33b), it can refer to either Lucie or Lili. Reinhart's observations suggest that there are connections between a pragmatic phenomenon like topichood and grammatical phenomena like subjecthood and voice. Vallduvi (1994) discusses a wide range of constructions in which information structuring and grammar interact. Information structure is more generally dependent on the coherent interpretation of the discourse. Hobbs (1979) argues that we need to develop a theory of coherence relations (also called discourse or rhetorical relations), which give structure to a discourse. Anaphora resolution then comes about as a side effect of the coherence relations between sentences in the discourse. Consider Hobbs' examples (34a) and (b): (34)
a. John can open Bill's safe. He knows the combination.
b. Bill is worried because his safe can be opened by John. He knows the combination.
Grammatical parallelism allows us to resolve he in the second sentence of (34a) as referring to John. But that strategy would not work for (34b). Hobbs argues that the relevant coherence relation in (34b) is Elaboration. If someone knows the combination of the safe, he knows how to open the safe. The pronoun he in (34b) then naturally unifies with John. The example illustrates that coherence rules use a mixture of linguistic and non-linguistic knowledge to infer a particular anaphoric relation.
6. Current problems
Current research on discourse and donkey anaphora concerns both new empirical problems and attempts to solve existing puzzles. Kamp and Reyle's (1993) extension of DRT to plural NPs and anaphora opened a new area of research. Kamp and Reyle argue that plural indefinites license discourse and donkey anaphora, but plural quantificational NPs do not, or at least not in the same way. (35)
a. Two students called. They had a question about the exam.
b. All students came to the meeting. They requested a new class schedule.
c. At least two students called. They had a question about the exam.
As pointed out by Kamp and Reyle, the bare numeral NP in (35a) introduces a plural discourse referent that sets up an anchor for further reference. In (35b) and (c), it is not the NP which provides the antecedent for the pronoun they, but the sentence as a whole. That is, we need to build a plural discourse referent for the entire set of students that called/came. As emphasized by Szabolcsi (1997, 25), we can continue (35a) with the sentence 'Perhaps there were others who did the same'. This shows that bare numerals allow non-maximal anaphora, just like singular indefinites, whereas with the real quantifier all in (35b) and the modified numeral at least two in (35c) only maximal anaphora are allowed. Note that both two N and at least two N are weak monotone increasing NPs in the sense of generalized quantifier theory (Barwise and Cooper 1981), so the Heim/Kamp contrast between indefinites and quantificational NPs does not reduce to the well-known weak/strong distinction. The intriguing question why two N and at least two N do not have the same licensing capacity has received different answers in the literature (cf. Ruys 1992; Kamp and Reyle
1993; Szabolcsi 1997; Farkas 1997; Beghelli and Stowell 1997; Reinhart 1997; Honcoop 1998; Winter 1998; Landman 1998; Krifka 1999; de Swart 1999, 2001). The extension of the analysis to plurals not only raises new questions as to which NPs can license discourse and donkey anaphora, it also makes it necessary to look at the different interpretations of the anaphor. If the determiner D of a plural NP denotes a relation between the set A provided by the common noun and the set B corresponding to the VP, the anaphoric pronoun they in the next sentence can refer to the intersection of A and B (A ∩ B, called the refset), the set A (called the maxset) or the difference between A and B (A - B, called the compset), as illustrated in (36): (36)
a. Most students came to the meeting. They asked many questions. [refset]
b. Most students came to the meeting. They are very concerned about the safety issues. [refset/maxset]
c. Few students came to the meeting. They were too busy preparing their exams. [compset]
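The three candidate antecedent sets are simple set-theoretic constructs, as the following sketch in Python shows. The individuals and the two sets are hypothetical toy data, and the final check uses the common generalized quantifier reading of most, on which the refset must be larger than the compset.

```python
# Hypothetical toy data: A is the common-noun set, B the VP set.
students = {"ann", "bob", "cem", "dee", "eva"}   # A
came_to_meeting = {"ann", "bob", "cem", "zoe"}   # B (zoe is not a student)

refset = students & came_to_meeting    # A ∩ B: the students who came
maxset = students                      # A: all the students
compset = students - came_to_meeting   # A - B: the students who did not come

print(sorted(refset))    # ['ann', 'bob', 'cem']
print(sorted(maxset))    # ['ann', 'bob', 'cem', 'dee', 'eva']
print(sorted(compset))   # ['dee', 'eva']

# 'Most A are B' on the assumed reading |A ∩ B| > |A - B|
most_students_came = len(refset) > len(compset)
print(most_students_came)  # True
```

Which of these sets a pronoun like they actually picks up is an empirical matter that the code does not decide; it only makes explicit what each option denotes.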
The psycholinguistic study conducted by Moxey and Sanford (1987, 1993) reveals that compset reference is not possible following non-downward entailing quantifiers. However, not all monotone decreasing quantifiers license complement anaphora: with NPs like less than half of the students or less than twenty students compset reference is marginal. The question whether compset reference really exists, and how to characterize the set of plural NPs that allows complement anaphora, has been answered in different ways in the literature (cf. Moxey and Sanford 1987, 1993; Link 1991; Corblin 1996; Kibble 1997a, b; Geurts 1997; Hendriks and de Hoop 2001).
The analyses discussed so far always concern discourses with anaphoric pronouns (he, it, they). Of course, other expressions have anaphoric properties as well, the standard example being the definite NP, which picks up an 'old' discourse referent whereas an indefinite NP introduces a 'new' discourse referent (cf. Heim 1982). Anaphoric definites have a wider range of interpretations than pronouns. In particular, they do not need to identify with their antecedent, but can refer to a member or a subset of a previously introduced set (37a), or to another individual that is related to the antecedent by a relation called bridging (37b, c). As (37d) shows, the bridging relation also exists in 'donkey' contexts: (37)
a. The children were in the playground. The girls were playing soccer and the boys were watching.
b. We entered a small room. The ceiling was quite low.
c. We entered the ball room. The chandelier gave a warm light.
d. When I go to a bar, the barkeeper always throws me out.
The analysis of membership, subset and bridging relations requires an extension of the linking relation between antecedent and anaphor (cf. Clark [1975] 1977, van Deemter 1992, 1994, Bos, Buitelaar and Mineur 1995, Matsui 1995, Asher and Lascarides 1998, Krahmer and van Deemter 1998, Piwek and Krahmer 2000, Poesio and Vieira 1998, Gardent and Striegnitz 2001, for discussion and various proposals). We conclude that the interpretation of discourse and donkey anaphora provides an interesting example of the kind of problems that arise when we start looking at meaning at the discourse level. On the one hand, we see that the traditional sentence-based semantic theories have been extended in order to deal with the dynamic binding phenomena induced by indefinite NPs. On the other hand, we observe that interpretation strategies that have been developed in text linguistics and pragmatics prove helpful in the search for the appropriate antecedent of a discourse anaphor. The definition of these research questions also proves to be an interesting way of bringing together insights from lexical semantics, model-theoretic semantics, formal pragmatics and discourse analysis in the search for a better understanding of discourse phenomena.
Note
This article is an abridged and revised version of chapter 6, 'Discourse and donkey anaphora', of a textbook on semantics: Henriëtte de Swart (1998). Introduction to Natural Language Semantics, Stanford: CSLI publications.
References
Asher, N. (1993). Reference to abstract objects in discourse. Dordrecht: Kluwer.
Asher, N. & A. Lascarides (1995). Lexical disambiguation in a discourse context. Journal of Semantics 12, 69-108.
Asher, N. & A. Lascarides (1998). Bridging. Journal of Semantics 15, 83-113.
Barker, C. (1996). Presuppositions for proportional quantifiers. Natural Language Semantics 4, 237-259.
Barwise, J. (1987). Noun phrases, generalized quantifiers and anaphora. Generalized quantifiers, edited by P. Gärdenfors, 1-30. Dordrecht: Reidel.
Barwise, J. & R. Cooper (1981). Generalized quantifiers and natural language. Linguistics and Philosophy 4, 159-219.
Beaver, D. (1997). Presuppositions. Handbook of logic and language, edited by J. van Benthem and A. ter Meulen, 939-1008. Amsterdam: Elsevier.
Beghelli, Filippo & Tim Stowell (1997). Distributivity and negation: the syntax of each and every. Ways of scope taking, edited by A. Szabolcsi, 71-107. Dordrecht: Kluwer Academic Publishers.
Bos, J. & P. Buitelaar & A.-M. Mineur (1995). Bridging as coercive accommodation. Computational logic for natural language processing. South Queensferry, Scotland.
Cheng, L. & Ch. Huang (1996). Two types of donkey sentences. Natural Language Semantics 4, 121-163.
Chierchia, G. (1992). Anaphora and Dynamic Binding. Linguistics and Philosophy 15, 111-183.
Chierchia, G. (1995). Dynamics of meaning. Chicago: University of Chicago Press.
Clark, H. (1975). Bridging. Theoretical issues in Natural Language Processing, edited by R. Schank and B. Nash-Webber. Reprinted in P.N. Johnson-Laird and P.C. Wason (eds.) (1977). Thinking: Readings in Cognitive Science, 411-420. Cambridge: Cambridge University Press.
Cooper, R. (1979). The interpretation of pronouns. Syntax and Semantics, edited by F. Heny and H. Schnelle, 61-92. New York: Academic Press.
Cooper, R. (1996). The role of situations in generalized quantifiers. Handbook of contemporary semantic theory, edited by S. Lappin, 65-86. Oxford: Blackwell.
Corblin, F. (1996). Quantification et anaphore discursive. Langages 123, 51-74.
Cruse, D. (1986). Lexical semantics. Cambridge: Cambridge University Press.
Deemter, K. van (1992). Towards a generalization of anaphora. Journal of Semantics 9, 27-51.
Deemter, K. van (1994). What's new? A semantic perspective on sentence accent. Journal of Semantics 11, 1-31.
Dekker, P. (1993). Transsentential meditations: ups and downs in dynamic semantics. Ph.D. thesis, University of Amsterdam.
Diesing, M. (1992). Indefinites. Cambridge, MA: MIT Press.
Does, J. van der (1994). Formalizing e-type anaphora. Proceedings of the ninth Amsterdam Colloquium. Amsterdam: ILLC publications.
Does, J. van der & J. van Eijck (eds.) (1996). Quantifiers, logic and language. Stanford, CA: CSLI publications.
Egli, U. (1979). The Stoic concept of anaphora. Semantics from different points of view, edited by U. Egli and R. Bäuerle and A. von Stechow. Berlin: Springer.
Egli, Urs & Klaus von Heusinger (1995). The epsilon operator and e-type pronouns. Lexical knowledge in the organization of languages, edited by U. Egli and others, 121-141. Amsterdam: Benjamins.
Evans, G. (1977). Pronouns, quantifiers and relative clauses (i) and (ii). The Canadian Journal of Philosophy 7, 467-536, 777-797. Reprinted in G. Evans (1985). Collected Papers. Dordrecht: Foris.
Evans, G. (1980). Pronouns. Linguistic Inquiry 11, 337-362. Reprinted in G. Evans (1985). Collected Papers. Dordrecht: Foris.
Farkas, D. (1997). Evaluation indices and scope. Ways of scope taking, edited by A. Szabolcsi. Dordrecht: Kluwer Academic Publishers.
Fodor, J. & I. Sag (1982). Referential and quantificational indefinites. Linguistics and Philosophy 5, 355-398.
Fox, B. (1987). Discourse structure and anaphora. Cambridge: Cambridge University Press.
Franck, A. (1996). Context dependence in modal constructions. Ph.D. thesis, University of Stuttgart.
Gardent, C. & K. Striegnitz (2001). Generating indirect anaphora. Proceedings of the International Workshop on Computational Semantics 4. Tilburg: Tilburg University Press.
Geach, P. (1962). Reference and generality. Ithaca, NY: Cornell University Press.
Groenendijk, J. & M. Stokhof (1984). Studies on the semantics of questions and the pragmatics of answers. Ph.D. thesis, University of Amsterdam.
Groenendijk, J. & M. Stokhof (1990). Dynamic Montague grammar. Papers from the second symposium on logic and language, edited by L. Kálmán and L. Polos, 3-48. Budapest: Akadémiai Kiadó.
Groenendijk, J. & M. Stokhof (1991). Dynamic predicate logic. Linguistics and Philosophy 14, 39-100.
Groenendijk, J. & M. Stokhof (1997). Questions. Handbook of logic and language, edited by J. van Benthem and A. ter Meulen, 1055-1124. Amsterdam: Elsevier.
Groenendijk, J. & M. Stokhof & F. Veltman (1996). Coreference and modality. Handbook of contemporary semantic theory, edited by S. Lappin, 179-213. Oxford: Blackwell.
Geurts, B. (1997). Book review of Moxey and Sanford (1993). Journal of Semantics 14, 87-94.
Grosz, B. & C. Sidner (1986). Attention, intentions and the structure of discourse. Computational Linguistics 12, 175-204.
Heim, I. (1991). Artikel und Definitheit. Semantics: an international handbook, edited by A. von Stechow and D. Wunderlich, 487-535. Berlin: De Gruyter.
Heim, I. (1982). The semantics of definite and indefinite NPs. Ph.D. thesis, University of Massachusetts, Amherst.
Heim, I. (1990). E-type pronouns and donkey anaphora. Linguistics and Philosophy 13, 137-178.
Hendriks, P. & H. de Hoop (2001). Optimality theoretic semantics. Linguistics and Philosophy 24, 1-32.
Hinrichs, E. (1996). Temporal anaphora in discourses of English. Linguistics and Philosophy 9, 63-82.
Hobbs, J. (1979). Coherence and coreference. Cognitive Science 3, 67-90.
Hobbs, J. & M. Stickel & D. Appelt & P. Martin (1993). Interpretation as abduction. Artificial Intelligence 63, 69-142.
Honcoop, Martin (1998). Dynamic excursions on weak islands. Ph.D. thesis, University of Leiden.
Johnson-Laird, P. (1983). Mental models. Cambridge: Cambridge University Press.
Kadmon, N. (1989). On unique and non-unique reference and asymmetric quantification. Ph.D. thesis, University of Massachusetts at Amherst.
Kadmon, N. (1990). Uniqueness. Linguistics and Philosophy 13, 273-324.
Kameyama, M. (1996). Indefeasible semantics and defeasible pragmatics. Quantifiers, deduction and context, edited by M. Kanazawa and C. Piñón and H. de Swart, 110-138. Stanford: CSLI Publications.
Kamp, H. (1981). A theory of truth and semantic representation. Formal methods in the study of language, edited by J. Groenendijk and T. Janssen and M. Stokhof. Amsterdam: Mathematisch Centrum. Reprinted in J. Groenendijk, T. Janssen and M. Stokhof (eds.) (1984). Truth, interpretation and information, 1-41. Dordrecht: Foris.
Kamp, H. & C. Rohrer (1983). Tense in texts. Meaning, use and interpretation in language, edited by R. Bäuerle and C. Schwarze and A. von Stechow, 250-269. Berlin: De Gruyter.
Kamp, H. & U. Reyle (1993). From discourse to logic. Dordrecht: Kluwer Academic Publishers.
Kamp, H. & A. Roßdeutscher (1994). Remarks on lexical structure and DRS construction. Theoretical Linguistics 20, 97-164.
Kanazawa, M. (1994). Weak vs. strong readings of donkey sentences and monotonicity inference in a dynamic setting. Linguistics and Philosophy 17, 109-158.
Karttunen, L. (1976). Discourse referents. Syntax and Semantics vol. 7, edited by J. McCawley, 363-385. New York: Academic Press.
Kibble, R. (1997a). Complement anaphora and dynamic binding. Proceedings of SALT 7. Ithaca, NY: Cornell University Press.
Kibble, R. (1997b). Complement anaphora and monotonicity. Formal grammar, edited by G. Morrill and G.-J. Kruijf and R. Oehrle, 125-136. Stanford: CSLI publications.
Krahmer, E. & R. Muskens (1995). Negation and disjunction in discourse representation theory. Journal of Semantics 12, 357-367.
Krahmer, E. & K. van Deemter (1998). On the interpretation of anaphoric noun phrases: Towards a full understanding of partial matches. Journal of Semantics 15, 355-392.
Kratzer, A. (1995). Stage-level and individual-level predicates. The generic book, edited by G. Carlson and F. Pelletier, 125-175. Chicago: University of Chicago Press.
Kratzer, A. (1998). Scope or pseudoscope? Are there wide-scope indefinites? Events and grammar, edited by S. Rothstein, 163-196. Dordrecht: Kluwer.
Krifka, M. & F. Pelletier & G. Carlson & A. ter Meulen & G. Chierchia & G. Link (1995). Genericity: an introduction. The generic book, edited by G. Carlson and F. Pelletier, 1-124. Chicago: University of Chicago Press.
Krifka, M. (1999). At least some determiners aren't determiners. The semantics/pragmatics interface from different points of view, edited by K. Turner. Amsterdam: Elsevier Science.
Lambrecht, K. (1994). Information structure and sentence form. Cambridge: Cambridge University Press.
Landman, Fred (1998). Plurals and maximalization. Events and grammar, edited by S. Rothstein, 237-271. Dordrecht: Kluwer Academic Publishers.
Lappin, S. & N. Francez (1994). E-type pronouns, I-sums and donkey anaphora. Linguistics and Philosophy 17, 391-428.
Lewis, D. (1975). Adverbs of quantification. Formal semantics of natural language, edited by E. Keenan, 3-15. Cambridge: Cambridge University Press.
Link, G. (1991). Plural. Semantik. Ein internationales Handbuch der zeitgenössischen Forschung, edited by D. Wunderlich and A. von Stechow, 418-440. Berlin: de Gruyter.
Matsui, T. (1995). Bridging and relevance. Ph.D. thesis, University College London.
Moxey, L. & A. Sanford (1987). Quantifiers and focus. Journal of Semantics 5, 189-206.
Moxey, L. & A. Sanford (1993). Communicating quantities. A psychological perspective. Lawrence Erlbaum Associates.
Muskens, R. (1996). Combining Montague Semantics and discourse representation. Linguistics and Philosophy 19, 143-186.
Muskens, R. & J. van Benthem & A. Visser (1997). Dynamics. Handbook of logic and language, edited by J. van Benthem and A. ter Meulen, 587-648. Amsterdam: Elsevier.
Neale, S. (1990). Descriptions. Cambridge, Massachusetts: MIT Press.
Nunberg, G. (1995). Transfers of meaning. Journal of Semantics 12, 109-132.
Partee, B. (1973). Some structural analogies between tenses and pronouns in English. Journal of Philosophy 70, 601-609.
Partee, B. (1984). Nominal and temporal anaphora. Linguistics and Philosophy 7, 243-286.
Piwek, P. & E. Krahmer (2000). Presuppositions in context: Constructing bridges. Formal Aspects of Context, edited by P. Bonzon and M. Cavalcanti and R. Nossum. Dordrecht: Kluwer.
Poesio, M. & R. Vieira (1998). A corpus-based investigation of definite description use. Computational Linguistics 24, 183-216.
Reinhart, T. (1982). Pragmatics and linguistics: an analysis of sentence topics. Distributed by the Indiana University Linguistics Club.
Reinhart, Tanya (1997). Quantifier scope: how labor is divided between QR and choice functions. Linguistics and Philosophy 20, 335-397.
Roberts, C. (1989). Modal subordination and pronominal anaphora in discourse. Linguistics and Philosophy 12, 683-721.
Roberts, C. (1996). Anaphora in intensional contexts. The handbook of contemporary semantic theory, edited by S. Lappin, 215-246. Oxford: Blackwell.
Rooth, M. (1987). Noun phrase interpretation in Montague Grammar, File Change Semantics and Situation Semantics. Generalized quantifiers: linguistic and logical approaches, edited by P. Gärdenfors, 237-268. Dordrecht: Reidel.
Russell, B. (1905). On denoting. Mind 14, 479-493. Reprinted in B. Russell (1956). Logic and Knowledge, edited by R.C. Marsh, 41-46. London: Macmillan/New York: Allen and Unwin.
Ruys, Ed (1992). The scope of indefinites. Ph.D. thesis, Utrecht University.
Saint-Dizier, P. & E. Viegas (1995). Computational lexical semantics. Cambridge: Cambridge University Press.
Sandt, R. van der (1992). Presupposition projection as anaphora resolution. Journal of Semantics 9, 333-377.
Schiffrin, D. (1994). Approaches to discourse. Oxford: Blackwell.
106
Henriëtte de Swart
Schubert, L. & F. Pelletier (1989). Grenerically speaking or using discourse Representation theory to interpret generics. Properties, types and meaning, edited by G. Chierchia, B. Partee and R. Turner Π, 193-268. Dordrecht: Reidel. Swart, H. de (1991). Adverbs of quantification: a generalized quantifier approach, Ph.D. thesis. University of Groningen. Published by Garland New York in 1993. Swart, H. de (1999). Indefinites between predication and reference. Proceedings of SALT, 9, 273-297. Ithaca, NY: Cornell University Press. Swart, H. de (2001). Weak readings of indefinites: type shifting and closure, The linguistic review 18, 69-96. Szabolcsi, Anna (1997). Background notions in lattice theory and generalized quantifiers. Will's of scope taking, edited by Szabolcsi, Α., 1-27. Dordrecht: Kluwer Academic Publishers. Vallduvi, E. (1994). Detachment in Catalan and information packaging. Journal of Pragmatics 22, 573-601. Van Geenhoven, V. (1998). Semantic incorporation and indefinite descriptions, Stemford, CA: CSLI publications. Vermeulen, C. (1994). Explorations of the dynamic environment, Ph.D. thesis. University of Utrecht. Webber, B. (1991). Structure and ostension in the interpretation of discourse deixis. Language and Cognitive Processes 6,107-135. Winter, Yoad (1998). Flexible Boolean semantics, Ph.D. thesis, Utrecht University.
Floating quantifiers: Handle with care
Jonathan David Bobaljik

What is the relationship between the two sentences in (1)?

(1) a. All the students have finished the assignment.
    b. The students have all finished the assignment.

More precisely, what is the nature of the relationship between all and [DP the students] in (1b), and what can this relationship tell us about grammar? The meanings of the two sentences are obviously quite similar and they involve (apparently) the same collection of words. This observation has led to a series of proposals based on the idea that there is a transformational relationship between these sentences, and thus a syntactic relation between the DP and the "floating" quantifier (FQ), so-called since the earliest proposals took the quantifier to float rightwards, away from the DP. In this brief overview, I will examine some of the central proposals concerning such constructions, and try to flesh out a sense of what we have collectively learned since attention was focused on this phenomenon in the early 70s. I will argue that despite significant progress in our understanding of the syntactic, semantic and morphological properties of constructions such as (1b), there is still a great deal more to learn. One proposal in particular (i.e., that all in (1b) marks a subject trace, due to Sportiche 1988), if substantiated, offers a very powerful tool for the investigation of phrase structure and movement properties, and has had a significant influence on (especially the syntactic) literature of the past decade. Given the potential of this account, the hypothesis that FQs mark positions from (or through) which a DP has moved deserves close scrutiny. Such scrutiny reveals, however, that the evidence is not as clear as often assumed and that many crucial questions are still unanswered. I hope, though, to offer with this overview a sense of where research into the matter stands currently, and what the major issues are that still loom before us.

1. Context

It was noticed early in the generative tradition (see especially Kayne 1969, 1975) that in some languages, sentences with certain quantified DPs may be paraphrased quite closely by sentences in which the quantifier [Q] is separated from the DP, surfacing in apparent adverbial positions. Example (1) is a canonical example from English and (2) gives a similar pair from French.

(2) a. Tous les enfants ont vu ce film.
       all the children have seen this movie
       'All the children have seen this movie.'
    b. Les enfants ont tous vu ce film.
       the children have all seen this movie
       'All the children have seen this movie.' (Sportiche 1988: 426)

Not all Qs may occur in such pairs; abstracting away from the presence of of or French de 'of' (see below), the universal quantifiers all, each and both and French tou(te)s 'all', chacun 'each' may alternate between positions. In the earliest proposals, a transformation took the Q from its position at the left edge of the DP and moved it to a different position in the clause; the phenomenon was soon dubbed Q-float. Kayne (1969, 1975) identified two Q-float operations in French: Q-Post/R-Tous, in which the Q moves to the right from the DP with which it is associated (as in (2)), and L-Tous, in which the Q moves to the left from its associate (3).

(3) Elle a tous voulu les lire.
    she has all wanted them to-read
    'She wanted to read them all.' (Kayne 1975: 4)

There are two fundamental properties of Q-float which motivated the initial transformational proposals (see Kayne 1975: 2) and which have continued to be primary motivations for all approaches which maintain that there is a syntactic relationship between the FQ and the DP (see, e.g., Sportiche 1988: 426; Doetjes 1997: 201-205). First is the intuition that the FQ quantifies over the DP in the (b) examples in (1) and (2) in the same way that it does in the (a) examples, i.e., that the sentences are logically equivalent, or that their "quantificational properties" are "identical" (Sportiche 1988: 426).
Second is the fact that in many languages, FQs show agreement (typically for case, number and gender) with the DP that they are associated with:

(4) a. Elles sont toutes / *tous allées à la plage.
       they-F are all-F.PL/*all-M.PL gone-F.PL to the beach
       'They (the women) all went to the beach.' (French, Doetjes 1997: 205)
    b. Diesen Studenten habe ich gestern allen/*alle geschmeichelt.
       these-DAT.PL students have I yesterday all-DAT.PL/*-0 flattered
       'I flattered all of these students yesterday.' (German, Merchant 1996: 4)

Agreement is a property of the nominal system and the agreement morphology borne by FQs in French and German is adjectival, the same morphology that these Qs bear when they occur at the left edge of the DP. Having thus identified Q-float as a likely candidate for a transformation, a good deal of work in the 1970s and early 1980s was devoted to discovering and refining the conditions under which the transformation could apply, that is, in describing and explaining the distributional properties of FQs (see, e.g., Baltin 1978 for data from a range of languages). For instance, it was understood early on that FQs occupy positions in which adverbs canonically surface, especially to the left of verbs and verbal elements (e.g., auxiliaries and modals) (5).

(5) a. The children {all} would {all} have {all} been {all} doing that.
    b. Les soldats ont {tous les deux} été {t.l.d.} présentés {t.l.d.} à Anne par ce garçon.
       the soldiers have all the two been all 2 introduced all 2 to A. by this boy
       'Both soldiers were introduced to Anne by this boy.' (Kayne 1975: 46)

This becomes clearer when one examines various contrasts between English and French: the differences in possibilities for adverb placement between the two languages correspond to differences in admissible sites for the FQ. For instance, English, but not French, allows an adverb or an FQ to immediately follow the subject.

(6) a. My friends all/probably will leave.
    b. *Les enfants tous/bientôt vont partir.
        the children all/soon will leave (Pollock 1989: 368)
    c. *Les soldats tous les deux ont été présentés à Anne par ce garçon.
        the soldiers all the two have been introduced to A. by this boy
        (cf. (5b)) (Kayne 1975: 47)

As (6a) shows, this is true even in sentences with an auxiliary (or modal) which is standardly taken to be in Infl (or the highest functional projection in a split Infl). An additional English versus French contrast in which FQs pattern with adverbs concerns their use as a diagnostic for the left edge of the VP. Thus, the argument from Emonds (1978) (expanded by Pollock 1989) that English finite main verbs remain in the VP (at S-structure), while in French all finite verbs raise to Infl, is in part based on the fact that a certain class of adverbs must precede finite main verbs in English and follow them in French (7a-b). Examples (7c-d) show that FQs pattern with the left-edge of VP adverbs in this regard.

(7) a. Jean (*souvent) embrasse (souvent) Marie.
       John often kisses often Mary
       'John often kisses Mary.'
    b. John (often) kisses (*often) Mary.
    c. Mes amis (*tous) aiment (tous) Marie.
       my friends all love all Mary
       'My friends all love Mary.'
    d. My friends (all) love (*all) Mary. (Pollock 1989: 367)

Pollock (1989) uses adverbs, FQs and negation to diagnose the left edge of the VP. While negation and adverbs do not behave alike under all tests for position, Sag (1978) observes that FQs pattern with adverbs (and as opposed to negation) in tests such as the licensing of VP-ellipsis:

(8) a. Otto has read this book, and my brothers have (all / certainly) read it, too.
    b. Otto has read this book, and my brothers have (*all / *certainly) , too.
    c. Otto has read this book, but my brothers have (n't / not) .

By and large then, it appeared (and was certainly assumed) that FQs occupy adverbial positions. It was also known that there were certain locality restrictions on the dependency between an FQ and the DP it modifies. These were originally investigated in terms of linear precedence (e.g., Baltin 1978, though Fiengo and Lasnik 1976 already note the relevance of subjacency, the precursor to c-command). In the early 1980s an important discovery was made, namely, that the dependency between an FQ and a DP obeys in essence the same locality constraints as those holding between an anaphor and its antecedent (Kayne 1981: 196; Belletti 1982: 114). Thus, the DP must c-command the FQ (9) (and perhaps (10)), and no finite clause boundary or specified subject may intervene between them (11).

(9) a. *[The mother of my friends_i] has all_i left.
    b. *La mère de mes amis_i est tous_i partie.
        the mother of my friends is all left
        intended: 'The mother(s) of all my friends left.' (Kayne 1981: 196)

(10) *There (had) all hung on the mantelpiece portraits by Picasso.
     vs. The portraits by Picasso (had) all hung on the mantelpiece.
         There hung on the mantelpiece all (of) the portraits by Picasso. (Baltin 1978: 26)

(11) a. *My friends_i think that I have all_i left.
     b. *Mes amis_i pensent que je suis tous_i parti.
         my friends think that I am all left
         intended: 'My friends all think that I have left.' (Kayne 1981: 196)
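
The c-command requirement illustrated in (9) can be made concrete with a small computational sketch. The Python below is an illustration only: the `Node` class, the `c_commands` function, and the simplified bracketings assumed for (1b) and (9a) are hypothetical and do not come from the works reviewed here. The definition used (the first branching node above A must dominate B) is one common textbook formulation of c-command.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(eq=False)
class Node:
    """A node in a constituent tree: a label plus ordered children."""
    label: str
    children: List["Node"] = field(default_factory=list)
    parent: Optional["Node"] = None

    def __post_init__(self):
        # Record the mother of each daughter so we can walk upwards.
        for child in self.children:
            child.parent = self

    def dominates(self, other: "Node") -> bool:
        """True if `other` is properly contained in this node."""
        node = other.parent
        while node is not None:
            if node is self:
                return True
            node = node.parent
        return False


def c_commands(a: Node, b: Node) -> bool:
    """a c-commands b iff neither dominates the other and the first
    branching node above a dominates b."""
    if a is b or a.dominates(b) or b.dominates(a):
        return False
    ancestor = a.parent
    while ancestor is not None and len(ancestor.children) < 2:
        ancestor = ancestor.parent
    return ancestor is not None and ancestor.dominates(b)


# (1b) The students have all finished the assignment. (assumed bracketing)
fq_ok = Node("all")
subject = Node("DP: the students")
ok_tree = Node("IP", [subject, Node("I'", [
    Node("have"), Node("VP", [fq_ok, Node("VP: finished the assignment")])])])

# (9a) *[The mother of my friends] has all left. (assumed bracketing)
fq_bad = Node("all")
friends = Node("DP: my friends")
big_dp = Node("DP", [Node("the mother"), Node("PP", [Node("of"), friends])])
bad_tree = Node("IP", [big_dp, Node("I'", [
    Node("has"), Node("VP", [fq_bad, Node("left")])])])

print(c_commands(subject, fq_ok))   # True
print(c_commands(friends, fq_bad))  # False
print(c_commands(big_dp, fq_bad))   # True
```

On these assumed structures, the embedded plural DP my friends fails to c-command all in (9a). The containing DP does c-command it, but that DP is singular and so is not a suitable associate for all.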

By the mid-1980s, the leading view of Q-float as an extraposition rule, a transformation moving the quantifier to the right, was being gradually replaced by a view in which FQs were "anaphoric adverbs", related to their hosts via Binding. To be sure, there were variations in the implementation of this idea. Belletti (1982), for example, proposed that the anaphoric status was not inherent, but rather the result of a requirement that distributive elements including reciprocals and FQs need to undergo LF A-movement to their host, an idea which is picked up in Heim, Lasnik and May (1991) and extended to anaphors generally in Lebeaux (1983) and Chomsky (1986).

2. Stranding

2.1. Sportiche (1988), Shlonsky (1991)

In the late 1980s, four properties of FQs were considered especially salient: (i) FQs appeared to modify DPs in the same way as DP-initial Qs; (ii) FQs in some languages display determiner-like agreement with the DP they modify; (iii) FQs surface in the left periphery of (certain) maximal projections, especially VP; and (iv) the relationship between an FQ and the DP it modifies obeys an anaphor-like locality condition. In this context, Sportiche (1988) proposed that all of these properties can be made to follow from the observation (12), on certain independently motivated assumptions about movement and phrase structure (I turn to a contemporaneous proposal by Miyagawa (1989) in section 2.2).

(12) Qs may appear in NP-initial position. (Sportiche 1988: 427)

Sportiche argues that since (12) is an independently necessary statement, the most parsimonious theory of the distribution of FQs is therefore "one in which nothing essential needs to be said beyond [(12)]" (p. 427). Now, since it was by this time well understood that the locality conditions applying to DP-traces (i.e., traces of A-movement) were the same as those for anaphors, Sportiche (1988) proposed that the cluster of properties of FQs discussed above could be explained if the FQs formed a constituent with the DP at D-structure, and the phenomenon of Q-float was actually the stranding of the Q in a position adjacent to the trace of the DP. Such a theory would work if the subject in Spec,IP occupied a derived position not only in raising, passive and unaccusative environments but also in simple clauses (i.e., the VP-internal subject hypothesis, a proposal independently gaining attention at that time; see, e.g., Koopman and Sportiche 1991). Thus, a sentence like (1b) was more accurately represented as (13a), i.e., with a D-structure as in (13b).

(13) a. The students_i have [all t_i] finished the assignment.
     b. [IP [INFL have] [VP* [DP all the students] finished the assignment]]
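
The derivation in (13) can be sketched as a toy procedure. The Python below is a minimal illustration under stated assumptions, not Sportiche's formalism: trees are nested lists whose first element is a category label, traces are strings beginning with `t_`, and the function names `raise_subject` and `spell_out` are hypothetical. The sketch shows only that raising the whole [Q DP] yields (1a), while raising the inner DP alone strands the Q next to the trace and yields (1b).

```python
def spell_out(tree):
    """Return the overt words of a tree, left to right, skipping traces."""
    if isinstance(tree, str):
        return [] if tree.startswith("t_") else [tree]
    _label, *children = tree  # first element is the category label
    words = []
    for child in children:
        words.extend(spell_out(child))
    return words


def raise_subject(q, dp_words, infl, vp_words, strand):
    """Build an S-structure from a D-structure in which [DP Q DP] sits
    inside VP*. If `strand` is true, only the inner DP moves to Spec,IP
    and Q stays next to the trace; otherwise the whole [Q DP] moves."""
    inner_dp = ["DP", *dp_words]
    if strand:
        moved, residue = inner_dp, ["DP", q, "t_i"]
    else:
        moved, residue = ["DP", q, inner_dp], "t_i"
    return ["IP", moved, ["INFL", infl], ["VP*", residue, *vp_words]]


args = ("all", ["the", "students"], "have", ["finished", "the", "assignment"])

floated = " ".join(spell_out(raise_subject(*args, strand=True)))
unfloated = " ".join(spell_out(raise_subject(*args, strand=False)))

print(floated)    # the students have all finished the assignment
print(unfloated)  # all the students have finished the assignment
```

Nothing in this sketch says which positions may host a stranded Q; that is exactly the empirical question taken up in the following sections.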

Sportiche's proposal captured the observations that were the original motivation for a transformational relationship between (1a) and (1b): the Q is able to modify the DP, and in some languages to agree with it, since at D-structure, [Q DP] is a single constituent. Moreover, the proposal appeared to capture the major distributional properties of FQs: FQs appeared to occupy adverbial positions such as the left periphery of VP since the adverbial positions were adjacent to the base position of the subject, and the locality conditions looked like those for NP-movement, since they were holding not between the DP and the FQ directly, but between the DP and its trace. While a number of questions were still unanswered (see below), Sportiche's proposal appeared to be a major breakthrough in our understanding of the phenomena, and at the same time, was considered to be compelling empirical support for the VP-internal subject hypothesis.

Shlonsky (1991) refines the stranding proposal in an important way, in doing so expanding its empirical coverage. While Sportiche (1988) remains vague about the mechanics of the extraction (just how does a subconstituent DP move out of the larger DP without violating conditions on extraction?), Shlonsky offers an account, drawing on Hebrew data of the following sort:

(14) a. Katafti ?et kol / *kul-am ha-praxim bi-zhirut.
        (I) picked ACC all / *all-[3MPL] the-flowers with-care
        'I picked all the flowers carefully.'
     b. Katafti ?et ha-praxim kul-am / *kol bi-zhirut.
        (I) picked ACC the-flowers all-[3MPL] / *all with-care
        'I picked all the flowers carefully.'
     c. Ha-yeladim yaSnu kul-am / *kol.
        the-children slept all-[3MPL] / *all
        'The children all slept.' (Shlonsky 1991: 160-1, 167)

In Hebrew, a Q such as kol 'all' may occur before or after the DP which it modifies. When it precedes the DP, the Q must be bare (14a), but following the DP, the Q obligatorily hosts a pronominal (agreement) clitic (14b). Shlonsky proposes that a quantified DP such as [all the flowers] is a QP, headed by the Q which in turn takes the DP as its complement: [QP all [DP the flowers]]. In (14b), the DP has raised to the specifier of the QP, and the agreement clitic is a reflection of this movement (Shlonsky relates this to the ECP; it could also be interpreted as an instance of Specifier-Head agreement). Finally, Shlonsky demonstrates that Hebrew, like French and English, has a Q-float phenomenon. As illustrated in (14c), an FQ requires the appropriate clitic, just as does a post-nominal Q. This suggests that the stranding of the FQ involves a step of DP movement to the specifier of QP, allowing the DP to be further extracted.

Shlonsky's proposal appears also to shed light on English facts discussed in Postal (1974b) and dubbed Q-Pro-Flip in Maling (1976). Thus in constructions without of, the Q all cannot follow a plural DP, but must (or is strongly preferred to) follow a plural pronoun with which it forms a constituent:

(15) a. *Sam saw [the students all]. vs. Sam saw all (of) the men.
     b. Sam saw [us / them all]. vs. Sam saw all *(of) us / them.

Studies of DP syntax have often noted asymmetries between pronouns and NPs, and Q-Pro-Flip can thus be seen as an example of such an asymmetry, wherein the pronoun obligatorily undergoes the short movement to the specifier of QP which Shlonsky takes to underlie Q-float in general.

2.2. Stranding in Japanese (Miyagawa 1989)

At roughly the same time as Sportiche (1988) introduced the stranding analysis for English and French Q-float, a similar proposal was advanced to account for the distribution of Japanese numeral quantifiers (NQ) (the analysis is developed and defended in Miyagawa 1989, chapter 2; the relevance of traces to the distribution of NQs is also mentioned in Kuroda 1980, 1983). NQs need not always appear adjacent to the NP they are associated with, and had already been treated as a Q-float phenomenon in the literature. Miyagawa considers contrasts of the following sort.

(16) a. Gakusei ga kyoo 3-nin kita.
        students NOM today 3-CL came
        'Three students came today.'
     b. ?*Gakusei ga hon o 4-nin katta.
        students NOM book ACC 4-CL bought
        ('Four students bought books.')
     c. Yuube, kuruma ga doroboo ni 2-dai nusum-are-ta.
        last night cars NOM thief by 2-CL steal-PASS-PAST
        'Last night, two cars were stolen by a thief.' (Miyagawa 1989: 21, 38)

Miyagawa observed that an NQ occurring to the right of the DP it modifies could be separated from that DP if the DP is the subject of an unaccusative (16a), or passive (16c) verb, but that the direct object may not intervene between a transitive subject and an NQ (16b). (The classifier, glossed CL, like agreement shows clearly which DP the NQ is associated with.) Miyagawa proposes that the NQ must be in a relation of mutual c-command with the phrase it quantifies over, at D-structure [Miyagawa admits ternary-branching structures]. Since both passive and unaccusative subjects are taken to be derived by movement from a VP-internal position, the legitimate positions of the NQ in (16) are those which mutually c-command the trace of the moved DP. Miyagawa (1989) assumes that there is no subject trace to the right of the direct object in the (b) example, thus accounting for its ungrammaticality. Since Miyagawa's proposal makes reference to D-structure (or equivalently, relations among traces), he correctly rules out examples in which the DP fails to c-command the FQ at any level (17a), while admitting examples in which the c-command condition is met at D-structure, but not at S-structure, as when the NQ is scrambled (17b).

(17) a. *[NP Tomodati no kuruma] ga 3-nin kosyoosita.
         friends GEN car NOM 3-CL broke down
         ('Three friends' cars broke down.')
     b. 3-mai_i, kodomo ga sara o t_i watta (koto).
        3-CL child NOM plate ACC broke (fact)
        '(The fact that) the child broke three plates.' (Miyagawa 1989: 29-30)
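
The mutual c-command condition can be checked mechanically once a structure is fixed. The sketch below is illustrative only: nodes are identified by their path of child indices from the root, the flat (ternary or larger) VP bracketings given in the comments are assumptions made for the example, and the names `c_commands`, `mutual_c_command` and `nq_licensed` are hypothetical. Under this simple definition, mutual c-command amounts to sisterhood, which is why flat structures matter for the account.

```python
from typing import List, Tuple

Path = Tuple[int, ...]  # position of a node: child indices from the root


def c_commands(a: Path, b: Path) -> bool:
    """a c-commands b: a's mother dominates b, and neither of a and b
    dominates the other. Assumes every non-terminal node branches."""
    if not a or not b:
        return False  # the root neither c-commands nor is c-commanded
    if a == b[:len(a)] or b == a[:len(b)]:
        return False  # identical, or one dominates the other
    mother = a[:-1]
    return b[:len(mother)] == mother


def mutual_c_command(a: Path, b: Path) -> bool:
    return c_commands(a, b) and c_commands(b, a)


def nq_licensed(nq: Path, chain: List[Path]) -> bool:
    """The NQ must mutually c-command the DP or one of its traces."""
    return any(mutual_c_command(nq, link) for link in chain)


# (16a), assumed structure: [IP gakusei-ga_i [VP kyoo t_i 3-nin kita]]
unaccusative = nq_licensed(nq=(1, 2), chain=[(0,), (1, 1)])

# (16b), assumed structure: [IP gakusei-ga [VP hon-o 4-nin katta]]
# (no subject trace inside VP, following Miyagawa's assumption)
transitive = nq_licensed(nq=(1, 1), chain=[(0,)])

# (17a), assumed structure:
# [IP [NP tomodati-no kuruma]-ga [VP 3-nin kosyoosita]]
possessor = nq_licensed(nq=(1, 0), chain=[(0, 0)])

print(unaccusative, transitive, possessor)  # True False False
```

The chain passed to `nq_licensed` contains only the positions of the DP that the classifier picks out, so the object in (16b) is deliberately not offered as a candidate associate.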

Note that a subject trace following the direct object would not necessarily be unexpected under the VP-internal subject hypothesis, since examples such as (18a) indicate the possibility of short, leftwards movement of the direct object (i.e., across the adverb). Nakayama and Koizumi (1991) and Koizumi (1995) take (18a), together with the possibility of an NQ separated from the subject DP by an adverb (18b), to indicate that the base position of the transitive subject is VP-internal in Japanese, but is nevertheless above the position occupied by "shifted" objects.

(18) a. Hanako ga pen o kyoo 3-bon katta.
        Hanako NOM pen ACC today 3-CL bought
        'Hanako bought three pens today.'
     b. Gakusei ga kyoo 3-nin hon o katta.
        students NOM today 3-CL book ACC bought
        'Three students bought the book today.' (Miyagawa 1989: 28)

While Miyagawa's analysis differs from those of Sportiche (1988) and Shlonsky (1991) in that for Miyagawa, the FQ at no level forms a constituent with the DP it modifies, the analyses share the fundamental idea that the FQ is underlyingly adjacent to the position of traces.

2.3. A refinement, re: PRO

Limiting the positions of FQs to those occupied by traces is known to be too restrictive. Indeed, though I follow common practice in referring to the analyses of Sportiche (1988), Miyagawa (1989) and others as "stranding" analyses, implying movement, Sportiche himself proposed that the FQ need only be sister to certain types of empty category, including, but not limited to, DP-traces. The positions available to FQs in non-finite clauses suggest that an FQ may also be adjacent to (arbitrary or controlled) PRO in French (19) and English (20).

(19) a. Il aurait fallu tous partir.
        it would-have been-necessary all PRO to-leave
        'It would have been necessary for all to leave.'
     b. Ils ont décidé de tous partir.
        they have decided to all PRO leave
        'They decided to all leave.' (Sportiche 1988: 436)

(20) a. To all have been doing that would have been inconvenient.
     b. I persuaded the men all to resign.
     c. The men promised me all to resign. (Baltin 1995: 211, 222)

The proposal that FQs can modify PRO as well as traces offers a straightforward account of the following examples (mostly from Fiengo and Lasnik 1976 and Maling 1976), as noted by Sportiche (1988), Bowers (1993) and Baltin (1995).

(21) a. I gave the boys both {a quarter / quarters}.
     b. The tooth fairy promised the kids each a quarter.
     c. Cinderella's fairy godmother turned the pumpkins all into handsome coaches.
     d. She called the men both bastards.
     e. The vision struck the shepherds all blind.
     f. Three of my friends came into the café all very drunk.

Maling (1976: 716) notes that an FQ at the left edge of an NP, PP or AP constituent is felicitous "only if the following phrase can reasonably be associated (semantically) with the NP that the quantifier binds," noting also that a similar restriction is suggested by Kayne (1975: 49) for French. Bowers (1993) and Baltin (1995) translate this proposal into a theory of predication, which involves PRO or DP-trace in the specifier of all predicative constituents. [Baltin does not adopt a stranding analysis, but argues that FQs are preverbs: a class of adverbs adjoined to the left edge of a predicate. The point is nevertheless that the possibility of an FQ in the examples in (19)-(21) is tied to the presence of PRO or DP-trace in that constituent: for Bowers, the FQ forms a constituent with the empty category, while for Baltin, the FQ adjoins to the class of XPs which require an empty category in their specifier, following from the theory of predication he develops.]

If the stranding analyses are correct that the FQs are adjacent to the positions of empty categories, then they constitute one of our most direct and thus most powerful tools for the investigation of phrase structure and movement. Indeed, FQs are now routinely and often unquestioningly used in this fashion, in introductory texts (e.g., Haegeman [1991] 1994) and in the syntax literature more generally. Given this potential, the hypothesis that FQs mark positions from (or through) which a DP has moved deserves close scrutiny. Such scrutiny reveals, however, that the evidence for this proposal is thin and that many crucial questions are still unanswered more than a decade later.

3. Some problems for stranding analyses

3.1. The passive/unaccusative problem

Returning to English and French, the possibility of FQs at the left periphery of VP (as in (1) and (2)) is taken, under the stranding analysis, to mark a subject DP-trace in that position. An initial problem for this approach, noted already by Sportiche, is that FQs are impossible in canonical NP-trace positions such as the complement of unaccusative and passive verbs:

(22) a. The students_i have arrived (*all) t_i.
     b. The students_i were seen (*all) t_i.

Towards the end of the article, Sportiche (1988: 444) is forced into a rather unwieldy analysis of English passive and unaccusative constructions, in which the surface subjects of these constructions originate neither in the base position of transitive subjects nor in the base position of direct objects. Similarly, Sportiche claims, based on the (somewhat degraded) acceptability of (23a-b), that FQs can mark passive and unaccusative subject traces in French, noting that the examples improve with emphasis on the Q or the addition of the modifier presque 'almost'.

(23) a. Les enfants_i ont été vus t_i ?tous / presque tous.
        the children have been seen all almost all
        'The children have (almost) all been seen.'
     b. Les enfants_i sont venus t_i ?tous / presque tous.
        the children are came all almost all
        'The children have (almost) all arrived.'
     c. Les enfants_i ont t_i dormi ?tous / presque tous.
        the children have slept all almost all
        'The children have (almost) all slept.'
     d. Les enfants_i ont t_i vu ce film ?tous / presque tous.
        the children have seen this movie all almost all
        'The children have (almost) all seen this movie.' (French, Sportiche 1988: 427, 437)

This proposal is however undermined by examples like (23c-d). FQs may appear clause-finally in transitive and unergative clauses as well. Acceptability does not vary among the different clause types, though there is a DP-trace in the relevant position [adjacent to the FQ] only in the first two examples. [For some speakers, sentences like (23) are uniformly bad; what is important is that the predicted contrast (a-b) vs. (c-d) is unattested.] Sportiche suggests that the subject position (Spec,VP or [NP,VP]) must be on the left when overt, but may be ordered freely when it is headed by an empty category. However, the effect of heaviness on acceptability in clause-final position is exactly that attested independently for adverbs in French generally, as noted by Jaeggli (1982: 65). In these cases then, the stranding analysis would seem to make the wrong predictions for the most well-motivated positions of DP-traces. Déprez (1989) has suggested that the stranding analysis may be maintained if, in English and French (but apparently not Japanese and Hebrew), FQs may remain in the positions of intermediate DP-traces, but not in thematic (i.e., base) positions, though such a step would require rethinking the VP-internal subject hypothesis to have a DP-trace even lower than Sportiche originally proposed.

3.2. The A/A' distinction

Another potential problem for the stranding analyses is explaining the anaphor-like locality restrictions on FQs (see (9)). Thus, in addition to c-command, in (standard) English a DP which has undergone A-movement (24) may antecede an anaphor or an FQ, but a DP which has undergone A'-movement (25) may not (unless of course, it had previously undergone short A-movement; McCloskey 2000 reports on a dialect of Irish English which differs in this regard; see section 3.3 below).

(24) a. The runners_i seem to themselves_i [IP trace_i to be moving very slowly].
     b. The lions_i might all seem (to you) [IP trace_i to have large teeth].
     c. The lions_i might all have been seen trace_i (by the tourists).

(25) a. *[NP the professors who Taylor will have all met before the end of term] (relativization)
     b. *These professors, Taylor will have all met before the end of term. (topicalization)
     c. *Which professors will Taylor have all met before the end of term? (wh-question)

It has been assumed that the stranding analyses account for this restriction directly. For example, Shlonsky (1996: 14) states: "If FQs are adverbs, the fact that they must be c-commanded by an (A) antecedent requires a special explanation. The most intuitive explanation of this fact is that FQs are associated with DP-trace positions and hence mark a link in an A-chain." Though this seems to be appealing (since we know anaphors and DP traces have the same distribution, we can subsume an anaphora-like relation to movement theory), it only pushes the need for a "special explanation" back one step: why is it the case that in English, FQs must be associated with DP-trace positions, and not, e.g., wh-traces? Note that this is all the more peculiar a fact about Q-float in that other, well-attested stranding processes are not restricted to A-movement, and if anything, are licit only with A'-movement (e.g., left-branch violations in French, Split Topicalization and was ... für split in German). Déprez (1989) suggests one account of the restriction to A-movement, proposing that intermediate traces of A'-movement, but not of A-movement, delete at LF, and that FQs must be licensed by LF-adjacency to an intermediate trace. If the deletion of intermediate traces of A'-movement at LF could be independently motivated, and the restriction to intermediate trace positions could be explained, then this account would provide the missing piece of the puzzle, showing why FQs may only be associated with DP-trace positions.

There is also some debate as to the universality of this restriction. For English (with the exception of the dialect reported by McCloskey (2000)), the A/A' contrast represented in (24)-(25) appears to be straightforward, and something similar seems to be true for Hebrew (Shlonsky 1991: 173). Regarding other languages, though, the situation is less clear. For German, Dutch and French, opposing views have been presented in the literature. Merchant (1996) and Doetjes (1997) for example argue that these languages do allow A' licensers for FQs, while Déprez (1989) and Bobaljik (1995) argue that they do not. At issue are examples such as the following (note that some variation is reported concerning the French judgments, Marie-Hélène Côté, personal communication).

(26) a. Ces livres_i, que j'ai tous lus t_i sont très intéressants.
        these books which I have all read are very interesting
        'These books, all of which I read, are very interesting.' (French, Doetjes 1997: 208)
     b. Deze boeken_i heb ik allemaal t_i gelezen.
        these books have I all read
        'I read all these books.' (Dutch, Doetjes 1997: 209)
     c. Welche Bücher_i hast du alle t_i lesen müssen?
        which books have you all to-read had
        'All (of) which books did you have to read?' (German, handout to Merchant 1996: 3)

On the surface, these languages appear to differ from English in allowing DPs in A'-positions to license FQs. However, there are complications lurking beneath the surface. Déprez (1989) observes that drawing a conclusion from (26a) is complicated by the existence of L-tous, the name given by Kayne to two processes which appear to move the FQ leftwards. One example of L-tous was given in (3) and contrasts minimally with (27).

(27) *Elle a tous_i voulu lire ces livres_i.
      she has all wanted to-read these books
      ('She wanted to read all these books.') (Kayne 1975: 5)
This pair shows that L-tous is not possible with an in situ DP object (27), while L-tous is possible when the object is a clitic (3). Déprez assumes (with Kayne 1975, Sportiche 1988) that L-ÍOMS and Q-float are distinct (if related) processes, and that (26a) is an example of L-tous. Doetjes (1992, 1997) has argued that L-tous and Q-float are in fact the same process, a position which would avoid this objection of Déprez's. (Doetjes's proposal is that the FQ must bind a trace of the DP over which it should quantify; this is discussed in section 5 below). Independently of whether or not L-tous and Q-float are separate phenomena, there is a further set of complications noted also by Déprez (1989: 477fF), having to do with the possibility of short A-movement. Primarily on the basis of facts from past participle agreement Kayne (1989) has argued that A'-extraction in French involves (perhaps optionally) an intermediate stage of A-movement to the left of the participle and he posits an Agr-P dominating the participle. If this analysis could be sustained, it would mean that (26a) tells us nothing about the possibility of A'-movement licensing FQs in French, since it could be the prior, short A-movement which licenses the FQ. Likewise for Dutch and Grerman, a now common analysis of A'-movement which originates with Vanden Wyngaerd (1989) and Mahajan (1990) posits intermediate A-movement through the specifier of an Agr-P. To the extent that such an analysis can be maintained, the possibility of an intermediate A-trace being responsible for the grammaticality of the FQ in (26b-c) cannot be excluded. It is therefore not clear from such examples that French and Dutch differ from English in permitting FQs licensed by A'-movement, but perhaps only that French and Dutch differ in having short A-movement as a (possible) prior step in A'-extraction.
Floating quantifiers: Handle with care
These complications require more complex examples, involving, e.g., long-distance A'-movement and a FQ in a higher position than the highest A-position. The examples in (28) show that long-distance A'-moved DPs cannot license FQs in the matrix clause in French, German and Dutch. (28)
a. *[ces hommes, que j'aurais tous cru
     these men who I would have all believed
     qui auraient été arrêtés]
     who had been arrested
     ('these men, whom I had believed to have all been arrested') (French, Déprez 1989: 94)
b.  Welche Würste hat der Peter (*alle) bezweifelt
    which sausages has the Peter all doubted
    ob der Hund gegessen hat?
    whether the dog eaten has
    'Which sausages did Peter doubt whether the dog has eaten all (of)?' (German)
c.  De dronken taalkundigen heeft Freek (*allemaal) gezegd
    the drunk linguists has Freek all said
    dat Marie uitlachte.
    that Marie made fun of
    'Freek said that Marie made fun of all the drunk linguists.' (Dutch)
The ungrammaticality of the examples in (28) is of course expected on the stranding analyses if there are no intermediate trace positions in the higher clause. The examples in (29) demonstrate that it is also impossible to strand an FQ in an embedded Spec,CP, where, presumably, there is an intermediate trace of the wh-moved element. This is true whether the long-distance extraction takes place out of an embedded complementizer-initial CP (29a) or an embedded V2 clause (29b). Both examples are fine without the floated quantifier, as indicated. (29)
a.  Welche Würste hat der Peter gesagt
    which sausages has the Peter said
    [CP (*alle) daß der Hund gegessen hat?]
    all that the dog eaten has
    'Which sausages did Peter say (*all) that the dog ate?'
b.  Welche Würste hat der Peter gesagt
    which sausages has the Peter said
    [CP (*alle) hat der Hund gegessen?]
    all had the dog eaten
    'Which sausages did Peter wonder whether the dog has eaten all (of)?'
    (German, S. Wurmbrand p.c. 4/01)
Jonathan David Bobaljik
3.3. An Irish English

In many varieties of English (including those typically referred to as "standard") it is quite clear that A'-movement does not license quantifier float. In addition, no unambiguous cases of A'-licensed quantifier float have been adduced from other well-studied languages, as just discussed. However, in a recent paper, McCloskey (2000) has presented data from a particular variety of Irish English (which he labels "West Ulster English" [WUE]) which displays a striking contrast with more familiar varieties. In WUE, wh-movement does license floating quantifiers, thus allowing examples like the following (which are sharply ungrammatical in other varieties, e.g., to my ear). (30)
a.  Who did you meet all when you were in Derby?
b.  I can't remember what I said all.
c.  Where did they go all for their holidays?
d.  What did he say all that he wanted?
    (WUE, McCloskey 2000: 58)
That the data from this dialect is unlikely to fall to a short A-movement analysis (as mentioned above for French, German and Dutch) is suggested by: (i) examples like (30c) in which the FQ is associated with a wh-adjunct (an unlikely candidate for short A-movement) and (ii) the apparent stranding in an intermediate Spec,CP (30d) (contrast German (29)). Of course, one does need to ask if there are alternatives to analysing the FQ as occupying the intermediate Spec,CP in (30d). For example, it does not seem a priori implausible to analyse this example as involving adjunction of all to VP, either as right-adjunction (with extraposition of the embedded CP), or as left-adjunction (with short verb movement, which McCloskey proposes for independent reasons). This may shed light on the observation that the sequence [main verb + all] in such examples "are prosodic units whose most prominent element is the verb" and that "[t]here is a strong intonational break following this prosodic unit" (McCloskey 2000). While such an analysis is suggested by parallels with adverbs like exactly, precisely in standard English noted in McCloskey's fn. 8 (p. 63) (e.g., What did he say exactly that he wanted?), the parallels are not exact and developing such an analysis would not be trivial. Minimal pairs like (31) also weigh in against an analysis invoking short A-movement for this variety of English, to the extent that wh-movement licenses FQs in positions where normal A-movement does not. (31)
a.  Who was throwing stones all around Butcher's Gate? [= who ... all]
b. *They were throwing stones all around Butcher's Gate.
    (WUE, McCloskey 2000: 77)

The search for the proper analysis of the West Ulster English facts poses interesting challenges for any theory, but more pressing is the question of accounting for the variation. Quite simply: if the distribution of FQs in standard English and other languages (e.g., the restriction to A-movement) follows in any straightforward way from deeper principles (as I will suggest in section 5) then it is not at all clear how West Ulster English could be permitted to show the properties it does. At the same time, it would be unfortunate (and would raise familiar questions about acquisition) if the restriction to A-movement for English and other languages' FQs needs to simply be stipulated. Without having shed any particular light on this issue, we may note that there are really only two avenues to pursue, namely, attributing the difference to different lexical properties of the quantifier all, or pinning the difference on some yet-to-be uncovered independent syntactic parameter distinguishing WUE on the one hand and other varieties of English (including apparently other varieties in Ulster) on the other.
3.4. The underlying constituents problem
Returning to the main thread of this section, we might consider a final class of examples which are challenging for the hypothesis that the FQ and DP are derived from an underlying constituent [Q-DP] (or [Q-PRO]). These are cases in which the (floated) Q and DP cannot form a grammatical constituent together, for example, cases in which the Q occurring pre-DP requires the preposition of (French de). The examples in (32)-(33) illustrate this for English each and French chacun 'each', which float easily yet require a preposition of/de when combining with a plural DP. (Determiner each, which requires no preposition when combining with a singular noun, as in each child, cannot float: *Child has each read a different book.) (32)
a.  These children have each (*of) read a different book.        DP-PL ... each
b.  [Each *(of) these children] has read a different book.       *[each DP-PL]

(33)
a.  Ces enfants ont chacun lu un livre différent.
    these children have each read a book different
    'These children have each read a different book.'
b.  Chacun *(de) ces enfants a lu un livre différent.
    each of these children has read a book different
    'Each of these children has read a different book.'
    (Doetjes 1997: 201)
The mismatch here is relatively minor, and has (perhaps correctly) generally been relegated to the dustbin of morphological or phonological processes inserting (or deleting) of (see Sportiche 1988, fn. 3, though note the obligatory contrast in number agreement in the (a) vs. (b) examples). Further examples in which an FQ is licit but the hypothetical underlying NP is not are illustrated by the following pairs (on examples like (34) see Carden 1976: 94; parallel facts obtain in French, Sportiche 1988: 440; Junker 1995: 88): (34)
a.  Larry, Darryl and Darryl have all come into the café.
b. ?*All (of) Larry, Darryl and Darryl have come into the café.

(35)
a.  Some (of the) students might all have left in one car.
b. *All (of) some (of the) students might have left in one car.
A final and striking set of apparent problems for analyses based on (12) includes complex quantifying expressions in apparently floated positions, which cannot occur prenominally at all, and which in some cases even include pronouns and determiners (see also section 4.1, below). This includes expressions such as all/none of them, the both of them, all three (of them), and similar complex FQs in French as in (5b) above and (36) (see especially Baltin 1978; Kayne 1975; Torrego 1996). (36)
a.  We have all three of us completed the assignment on time.
   *[NP All three of us we] ...
b.  Elles sont [toutes les trois] intelligentes.
    they-F are all-F.PL the three intelligent
    'They are all three (of them) intelligent.'
   *[NP Toutes les trois elles] ...
    (French, Kayne 1975: 44)

Examples (5b) and (36b) illustrate another important point. Recall that a significant part of the motivation for the stranding analysis is the fact that the FQ in languages such as French shows agreement with the DP, manifesting the same agreement paradigm as when the Q occurs as a part of the DP constituent. Thus, when Sportiche asserts that "nothing essential needs to be said beyond [(12)]" (p. 427) in order to explain the occurrence of agreement on the FQ, he is claiming that agreement arises as a result of the FQ forming an underlying constituent [Q DP]. However, the larger quantifying expressions as in (36b) also agree obligatorily, again manifesting the standard agreement paradigm for tous. In such cases, though, there is no corresponding constituent which would underlie the example. Examples of the sort considered here thus appear to present a strong challenge to the assumption that agreement on FQs entails underlying constituency.

4. The semantics of FQs
Given the vast literature on the semantics of quantification, it is somewhat surprising that FQs have received very little attention from a semantic perspective (notable recent exceptions being Junker 1995 and Doetjes 1997). One question which has, to my knowledge, never been answered is why only a specific class of quantifiers may float, in particular universal Qs (English each, all, both; French tous, chacun; German alle), but not other partitive quantifiers such as many, some, most, nor the universal every. But there are other aspects of the semantics which have been partially investigated and which should shed light on the syntax as well as the semantics of the construction. Recall that an initial motivation for investigating pairs such as (1)-(2) is the intuition that they are equivalent in meaning. Moreover, transformational approaches (floating or stranding) are built on the sometimes implicit assumption that such equivalence implies an underlying constituency. The clearest formulation of this is again Sportiche's parsimony argument from (12). The alternatives to stranding analyses take FQs to be adverbs, e.g., VP-modifiers (Klein 1976; Dowty and Brodie 1984; Milner 1987). Sportiche argues, in part, that the stranding analysis is superior to the VP-modifier analysis since "the semantics of floating Q constructions and partitive Q constructions are so similar" and thus it would be "a priori undesirable to assign the 'same' Q to two different logical types: [DP] quantifiers and VP quantifiers..." (p. 446). This last claim requires a certain qualification, though, as the following examples illustrate. (37)
a.  Jean a lu [beaucoup de livres].                  DP-Q
    Jean has read a lot of books
    'John has read a lot of books.' (Doetjes 1997: 254)
b.  Jean a beaucoup travaillé.                       Adv-Q
    John has a lot worked
    'John has worked a lot.' (Doetjes 1997: 271)
The French examples in (37) (and the English glosses) show that it is independently necessary to assign the "same" quantifying expression [French beaucoup 'a lot', English a lot] to two logical types, DP quantifier and VP quantifier (or, as Doetjes proposes, to allow the quantifiers to be underspecified in some way, and therefore permissible in either context). In order to maintain the view that (12), along with similarity in meaning, entails an underlying constituent [Q DP], what is important to show are the following: (38)
a.  FQs quantify over the DPs in a way that adverb Qs cannot, and
b.  FQs quantify over DPs in the way that (pre-)determiner Qs do.
Without showing these, the semantic motivation for an underlying constituent analysis would disappear, especially given the examples above which show that (12) is perhaps not always (surface) true. Let us therefore examine each thesis in turn.

4.1. FQs and adverbial quantification
Consider in light of the above the examples in (39) and (40); (39a) illustrates adverbial quantification, and (40a) is an example of "Quantification at a Distance". (39)
a.  Horses will always eat sugar.
b.  Horses will all eat sugar.
c.  All horses will eat sugar.

(40)
a.  Jean a beaucoup lu de livres.
    Jean has a lot read of books
    'John has read a lot of books.'
b.  Jean a lu [beaucoup de livres].
    Jean has read a lot of books
    'John has read a lot of books.'
    (Doetjes 1997: 254)
Examples (39a-c) can all be read as effecting universal quantification over horses, i.e., all three have the reading: For every x, x a horse, x will eat sugar. It is of course true that (39a) has readings which are not available to the other two sentences, but these are not at issue. What is at issue is the following: for those readings in which (39a-b) are synonymous, is the effect of universal quantification over horses achieved in the same manner in both examples? Proponents of a transformational or movement analysis of FQs must show that it is not (i.e., that (38a) holds). This point can be made clearer perhaps with examples such as (41), after Lahiri (1991: 120f), in which it is not at all immediately clear that there is a significant difference in meaning between a sentence with an adverbial quantifier and a paraphrase with a DP-quantifier:
(41)
a.  Media experts in the U.S. tend mostly to be too indoctrinated.
b.  The children, for the most part, were playing in the garden at 6pm.
a'. Most media experts in the U.S. tend to be too indoctrinated.
b'. Most of the children were playing in the garden at 6pm.
Such examples provide a prima facie challenge to the validity of (38a) and lead to the conclusion that a similarity in meaning or quantificational properties does not lead inescapably to a transformational relationship. In order to establish that (38a) holds, it must be shown (assuming that the boldfaced elements in (41) are adverbs) that the quantification in these examples is of a different sort than that which obtains with determiner quantifiers, despite the similarity in meaning. Of the two major approaches to adverbial quantification only one might have this property. Thus, on the approach which takes adverbial Qs to be unselective binders (e.g., the line initiated by Lewis 1975), universal quantification over horses is achieved in both (39a) and (b) via binding of an open variable in the DP. In an important sense, then, this approach fails to support (38a) to the extent that the unselective nature of the Q in (39a) does not follow necessarily from its status as a VP-modifier. The second set of approaches to adverbial quantification takes the adverb in (39a) to quantify over situations, times or events and not directly over individuals such as horses (see, e.g., de Swart 1991; von Fintel 1994). If it can be shown that event quantification is excluded in sentences such as (39b), then this type of analysis of adverbial quantification would support the claim in (38a); always would quantify over events, and all over individuals. Similarly, the literature on QAD constructions (see, e.g., Obenauer 1994, Doetjes 1997 and references therein) has argued persuasively that there are (sometimes subtle) interpretive differences between (40a-b), but it is somewhat less clear that these differences entail a difference in logical type between the two uses of the degree quantifiers.
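The contrast between the two approaches to adverbial quantification can be made concrete with a rough logical sketch for (39a-b). The notation is an illustrative assumption, not the formalism of the works cited: C stands for a contextually supplied restriction on situations, and the predicate for eating sugar is given a situation argument only on the second approach.

```latex
% Unselective-binding approach (the line initiated by Lewis 1975):
% 'always' in (39a) and 'all' in (39b) both bind the open variable x
% contributed by the DP 'horses'.
\[
\mbox{(39a), (39b):}\quad
\forall x\,[\mathrm{horse}(x) \rightarrow \mathrm{eat\_sugar}(x)]
\]
% Situation/event approach: 'always' in (39a) binds a situation variable s;
% quantification over horses arises only indirectly, through the
% (hypothetical) contextual restriction C on situations.
\[
\mbox{(39a):}\quad
\forall s\,[C(s) \rightarrow \exists x\,[\mathrm{horse}(x) \wedge \mathrm{eat\_sugar}(x,s)]]
\]
```

On the first rendering nothing distinguishes the adverb from the FQ, which is why that approach lends no support to (38a); on the second, the two differ in what the quantifier binds (s versus x).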
Doetjes (1997) argues that degree Qs such as beaucoup 'a lot' are underspecified for their categorial type and that the differences in meaning between (40a-b) follow from the positions in which such quantifiers occur, and not the other way around. A final, but important, point to make concerning the claim that "sameness" in meaning entails underlying constituency is that there are many languages in which sentences parallel to (1b) (i.e., putative examples of FQs) involve quantificational elements which are morphologically distinct from their pre-nominal counterparts. This is true of Dutch allemaal 'all' (42) and Mandarin Chinese dou 'all' (43), both of which occur in floated positions, but neither of which is generally permitted prenominally as a strong universal Q (Hoeksema 1996; Dowty and Brodie 1984). (42)
a.  De kinderen zijn allemaal gekomen.
    the children are all come
b. *Allemaal (de) kinderen zijn gekomen.
c.  Alle kinderen zijn gekomen.
    all the children are come
    'The children have all come.'
    (Doetjes 1997: 210-11)

(43)
a.  ren dou zou le
    people all left ASP
    'The people have all left.'
b.  suo you de ren zou le
    all PRT people left ASP
    'All the people have left.'
    (Dowty and Brodie 1984: 82)
In Dutch, the FQ is allemaal (although the prenominal Q alle(n) may apparently be used in floated positions, at least in more formal or older registers). This Q contains the adverbial suffix -maal which occurs in frequency expressions such as eenmaal 'once', andermaal 'once more' (from een 'one' and ander 'other'). To my knowledge, it has not been fully investigated to what extent quantification by allemaal in examples like (42a) differs from quantification by alle in (42c) (though Doetjes 1997 does not discuss any difference). If sentences (42a-c) are truly equivalent in meaning, even though the Qs are different (alle vs. allemaal), then this would appear to pose a cross-linguistic challenge to the claim that sameness of meaning entails an underlying constituency. The Mandarin examples make the same point (see Chiu 1990, 1993 for an analysis taking dou to be an FQ, and Cheng 1995, 1997 for arguments that dou is an adverb). If it is to turn out that floating quantifiers are semantically distinguishable from true adverbial quantifiers, and moreover, that the examples in (42)-(43) truly do have the semantics of floating quantifiers, then these examples add to the problems for a stranding account mentioned in 3.4. As it stands, these examples constitute a challenge to the thesis in (38a) in that they appear to be adverbial quantifiers, but have not been shown to have a semantics distinct from true floating quantifiers.

The examples in this section have been intended to show that apparently similar quantificational readings appear to arise in independent configurations, thus challenging (38a). Note though that the truth of (38a) is not necessary for a stranding analysis of FQs. It could well be that FQs and adverbial Qs do (or may) quantify over DPs in the same way (e.g., by binding a variable in the DP), but that nevertheless FQs do not occur in adverbial positions and that the constructions have quite different derivations. What is important about (38a) is that it is often implicitly assumed to be true, and that it is generally given as a primary motivation for transformational analyses, including stranding. The claim that since (1a) and (1b) "mean" the same, they must be transformationally related relies on the tacit assumption that there would be no other way for the two sentences to mean the same. The brief discussion of adverbial quantification above is intended to show not that this assumption is false, but that the question is certainly still open.

4.2. FQs and determiner Qs
The other aspect of the "sameness of meaning" motivation for a transformational analysis of floating quantifiers is (38b), the thesis that floating quantifiers quantify over DPs in the same manner as their non-floating counterparts. While most of the sentences with FQs considered thus far seem to mean the same as their counterparts with a DP-Q, it is not always clear that this is true. For example, there are cases in which an interpretation is possible or preferred with an FQ which is not possible when the Q is in prenominal position. Consider the interpretative differences in the following pairs (originally drawn to my attention by Heidi Harley, personal communication, ca. 1995). (44)
a.  All lions, tigers and bears are scary.
b.  Lions, tigers and bears are all scary.

(45)
a.  All students, professors and clowns have come to the meeting.
b.  Students, professors and clowns have all come to the meeting.
The example in (44a), on its most salient reading, asserts that every lion is scary, every tiger is scary, and every bear is scary, that is, all quantifies over [lions, tigers and bears]. Example (44b) allows this reading as well. However, there is an additional reading which is available only with the FQ. That is, (44b) can be taken to assert that lions are generally scary, and tigers are generally scary, and bears are generally scary. Loosely put, the requirement is that the predicate be scary be true of all of the terms in the subject DP, but it allows for the individual plural nouns to be interpreted as generics. This generic reading is unavailable in (44a). The pair in (45) shows a similar contrast: in (45a), all quantifies over [students, professors and clowns], asserting roughly that every member of each group is in attendance. The sentence (45b) makes a different assertion, namely that each of the groups is represented at the meeting, but it does not require that all students, all professors, and all clowns have been at the meeting. (Note that contrary to Bobaljik 1995: 225, the same is not true of both which, with the proper intonation, allows the non-quantificational reading in both floated and DP-initial position; that is, Both students and professors came to the show does not necessarily mean both students and both professors came to the show.)

Another important semantic difference between sentences with FQs and those in which the Q is a part of the DP constituent was apparently first observed in Williams (1982) and elaborated on in Dowty and Brodie (1984) and Déprez (1994b). FQs are restricted to taking scope in their surface position, while Qs which are part of DPs may undergo scope-changing operations such as Quantifier Raising and Reconstruction. Consider (46). (46)
a.  All the contestants could have won.        ◇ > ∀, ∀ > ◇
b.  The contestants could have all won.        ◇ > ∀, *∀ > ◇
Dowty and Brodie (1984) observe that a sentence such as (46a) is ambiguous with respect to the relative scope of the universal quantifier and the modal. On one reading (wide scope for the universal), the sentence asserts that the predicate [can win] is true of all the contestants, i.e., that any one of them can win. On the second reading, the universal takes scope under the modal, and on this latter reading, (46a) would be taken to assert that a universal tie is possible, e.g., every one of the contestants will receive a prize. The example in (46b), however, has only the second reading, i.e., in which the FQ takes scope in its surface position, beneath the modal. There is one class of exceptions, noted already by Dowty & Brodie: an FQ seems to be able to take scope under a following negation just in case that negation immediately follows the finite auxiliary. Thus, while (46b) is unambiguous, (47) is ambiguous. On one reading no contestants won, while on the second, i.e., lowered, reading, it asserts only that some contestants didn't win (i.e., it could be paraphrased as Not all the contestants won). The existence of the covert, lowered reading is demonstrated by considering contexts in which some (but not all) contestants did win. As the truth conditions for the surface scope reading are not met, the fact that (47) may be truthfully uttered in such a context establishes that the negation can scope over the universal. (47)
    The contestants all didn't win.            ∀ > not, not > ∀
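The scope readings annotated in (46) and (47) can be spelled out as follows. This is a minimal sketch in standard modal predicate logic; the predicate names and the use of ◇ for the modal could are assumptions made for illustration, and the contribution of the perfect (have) is ignored.

```latex
% Requires the amssymb (or latexsym) package for \Diamond.
% (46a) allows both readings; (46b), with the FQ below the modal,
% has only the second.
\[
\forall > \Diamond:\quad
\forall x\,[\mathrm{contestant}(x) \rightarrow \Diamond\,\mathrm{win}(x)]
\]
\[
\Diamond > \forall:\quad
\Diamond\,\forall x\,[\mathrm{contestant}(x) \rightarrow \mathrm{win}(x)]
\]
% (47): surface scope (first line) versus the lowered reading (second line).
\[
\forall > \neg:\quad
\forall x\,[\mathrm{contestant}(x) \rightarrow \neg\,\mathrm{win}(x)]
\]
\[
\neg > \forall:\quad
\neg\,\forall x\,[\mathrm{contestant}(x) \rightarrow \mathrm{win}(x)]
\]
```

In a context where some but not all contestants won, the first formula for (47) is false and the second is true, which is the diagnostic the text relies on.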
Controlling for this local interaction with negation requires rather complex examples. However, when such examples are constructed, Dowty & Brodie's observation that FQs must take scope in their surface positions appears to be valid. This is perhaps clearer when one of the readings is pragmatically infelicitous. Consider in this light the pair in (48), where only one of the two FQ positions relative to the modal expression yields a felicitous interpretation. (48)
a.  Gore and Bush should each be 50% likely to beat the other.
b. #Gore and Bush should be 50% likely to each beat the other.
Related examples discussed by Williams (1991) show that a quantified DP in embedded subject position may enter into scope relations with elements in the matrix clause (at least for some speakers); (49a) is modeled on Williams's (43), p. 171. However, an FQ related to an embedded subject DP is frozen with respect to scope. (49)
a.  Someone said that [IP [each/all of the men] have/had voted for Mary.]
b.  Someone said that [IP [the men] have each/all voted for Mary.]
Déprez (1994b) also arrives at the conclusion that FQs scope in their surface positions from a consideration of the possibilities for pair-list answers for questions in which there is an FQ. A final environment in which FQs and Q-DPs seem to differ semantically has to do with distributivity and interaction with certain adverbials. Consider the pair in (50) (including the parallel English paraphrases) from Junker (1995: 82-83). (50)
a.  Les enfants prendront chacun un ballon l'un après l'autre.
    the children took each a ball the one after the other
    'The children each took a ball one after the other.'
b. ???Chacun des enfants prendra un ballon l'un après l'autre.
    each of the children took a ball the one after the other
    'Each of the children took a ball one after the other.'
The Q each (French chacun) enforces a distributive reading in both floated and non-floated uses; i.e., in both (50a) and (b) there is one ball per child. Junker notes that adverbials such as "one after the other" and "at the same time" can only modify multiple events, and these adverbials are only felicitous with floated each. From this, she makes the argument that the FQ distributes over events, not over individuals. Further evidence that an analysis along these lines is on the right track may come from the fact that in both English and French, the partitive Qs each (of) (chacun des) trigger singular agreement on the verb, while in the FQ constructions, the verb shows plural subject agreement. The interaction of FQ all with distributivity is more difficult to pinpoint. Junker gives pairs parallel to (50a-b) with tous 'all,' arguing that the universal FQ distributes over events, and it is often claimed in the literature that all is distributive. Nevertheless it is well-known that all may combine with collective (i.e., non-distributive) predicates, e.g., The students all gathered in the hall (see Dowty 1986 for a discussion of all, and the suggestion that collective predicates involve a specific type of distributive "sub"-entailment). The importance of (50) and the further evidence presented by Junker (1995) is that they provide a range of evidence suggesting that (38b) may not be true: perhaps FQs and partitive (i.e., pre-DP) Qs in fact do involve different types of quantification.

Sections 4.1-4.2 together question part of the initial motivation for a transformational or movement based approach to FQs. A central assumption which has driven these analyses is that the floated and non-floated alternates in pairs such as (1) have the same meaning, and that moreover, the same meaning is indicative of an underlying constituency. The examples in section 4.2 question the first part of this assumption, i.e., that the meaning is always the same, while those in section 4.1 question the notion that sameness of meaning entails a derivational relationship. More thorough investigation of quantification may reveal that these questions do not threaten analyses positing a syntactic relationship between the FQ and the DP. For instance, one could imagine that the effects of scope freezing are derivative of the syntactic derivation which underlies Q-float and do not imply a different logical type for FQs. For the time being, though, this has not been shown and we would thus be premature in concluding either that the relationship of (1b) to (1a) is one of systematically identical meaning, or that to the extent they are the same, this sameness must be the reflection of a derivational relationship between them (e.g., a common D-structure).
4.3. On variation: Japanese numeral quantifiers once more

The discussion of scope allows us to return once again briefly to the Japanese numeral quantifiers discussed above. In addition to the fact that the elements involved are different (only universals and distributive quantifiers in English, French, Hebrew etc., while in Japanese numerals as well as certain universal and existential quantifiers may be discontinuous with their host NP), a striking point of difference between Japanese NQs (on the one hand) and English and French FQs (on the other) is in the possibility of stranding in a canonical trace position such as passive or unaccusative (see (16a), (c)). In a series of recent papers, Yamashita (2000, 2001) has brought to light another difference between Japanese and the other languages studied, a difference concerning interpretive properties such as scope and binding. Yamashita starts with the much-discussed observation that the relative scope of subject and object quantifiers, and binding possibilities among subject and object, in Japanese are fixed in a simple transitive clause (e.g., the object cannot scope over the subject in (51a)), but scope ambiguities (51b) and new binding relations (not shown here) emerge when the object is scrambled to the left of the subject. The examples in (51) illustrate with numeral expressions. (51)
a.  [Otoko-ga 3-nin] [neko-o 2-hiki] mita (koto)
    man-NOM 3-CL cat-ACC 2-CL saw (fact)
    'Three men saw two cats.' (3 > 2, *2 > 3)
b.  [Neko-o 2-hiki]ᵢ [otoko-ga 3-nin] tᵢ mita (koto)
    cat-ACC 2-CL man-NOM 3-CL saw (fact)
    'Three men saw two cats.' (3 > 2, OK 2 > 3)
    (Yamashita 2001: 231, 238)
Interestingly, the possibility that scrambling has of altering scope and binding relations does not arise if the NP is scrambled alone stranding the NQ (52). (52)
    [Neko-o]ᵢ [otoko-ga 3-nin] (kinoo) tᵢ 2-hiki mita (koto)
    cat-ACC man-NOM 3-CL (yesterday) 2-CL saw (fact)
    'Three men saw two cats.' (3 > 2, *2 > 3)
    (Yamashita 2001: 232)
While at this point one might be tempted to assimilate this to Dowty & Brodie's observation that an FQ (in English) generally scopes in its surface position, Yamashita shows that this would be a mistake for the Japanese constructions; the example in (53) illustrates that even when the object NQ is scrambled alone (i.e., to a position higher than the subject), the object must still be interpreted as having scope beneath the subject. (53)
    [2-hiki]ᵢ [otoko-ga 3-nin] neko-o tᵢ mita (koto)
    2-CL man-NOM 3-CL cat-ACC saw (fact)
    'Three men saw two cats.' (3 > 2, *2 > 3)
    (Yamashita 2001: 231)
The paradigm in (51)-(53) can be replicated for the binding of reciprocals and pronouns embedded in the subject. The correct generalization for Japanese appears to be that whenever a nominal constituent consisting of a noun and an associated numeral is split, that nominal constituent is restricted to its base position (or frozen) for scope and binding. If this is on the right track, then the Japanese NQ constructions have less in common with quantifier float than they do with another family of constructions, loosely grouped under the term "Split NPs" and including the German was-für split or split topicalization, the latter illustrated in (54) (see van Geenhoven 1998 for discussion and a recent analysis). (54)
Fragenᵢ hat Johann sieben tᵢ richtig beantwortet.
questions has J. seven correctly answered
'Johann has answered seven questions correctly.' (van Geenhoven 1998: 43)
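The Japanese pattern in (51)-(53) can be restated as a small decision procedure. The Python sketch below is only an illustration of the descriptive generalization (a split noun-numeral constituent is frozen in its base position for scope); it is not an analysis from Yamashita or any other source. The names `ObjectPlacement` and `available_scopes` and the two boolean features are hypothetical simplifications introduced here, and binding is left out entirely.

```python
from dataclasses import dataclass

SUBJ_WIDE = "subject > object"
OBJ_WIDE = "object > subject"


@dataclass(frozen=True)
class ObjectPlacement:
    """Surface placement of a numeral-quantified object (noun + NQ).

    Hypothetical encoding: each flag records whether that piece of the
    object appears to the left of the subject.
    """
    noun_scrambled: bool
    nq_scrambled: bool


def available_scopes(obj: ObjectPlacement) -> set[str]:
    """Scope readings under the 'split means frozen' generalization."""
    readings = {SUBJ_WIDE}  # the base-position reading is always available
    is_split = obj.noun_scrambled != obj.nq_scrambled
    if is_split:
        return readings  # frozen: only the base-position reading
    if obj.noun_scrambled:  # noun and NQ scrambled together as a unit
        readings.add(OBJ_WIDE)
    return readings


# The four configurations discussed in the text.
EXAMPLES = {
    "51a": ObjectPlacement(noun_scrambled=False, nq_scrambled=False),
    "51b": ObjectPlacement(noun_scrambled=True, nq_scrambled=True),
    "52": ObjectPlacement(noun_scrambled=True, nq_scrambled=False),
    "53": ObjectPlacement(noun_scrambled=False, nq_scrambled=True),
}

if __name__ == "__main__":
    for label, placement in EXAMPLES.items():
        print(label, sorted(available_scopes(placement)))
```

Only (51b), where noun and numeral scramble together, comes out ambiguous; the two split configurations (52) and (53) pattern with the unscrambled (51a).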
In sum, while the conclusion pointed to by Yamashita's work challenges the long-standing hypothesis that Japanese NQ "float" and floating quantifiers of the English kind are fundamentally similar processes, it should be noted that it was only by attempting to pursue this hypothesis that the differences between the two processes have been brought into sharp focus. Importantly, as we understand the differences better, we find that it is only the Japanese NQ-type phenomena that display the kinds of behaviour we might expect from a straightforward stranding analysis.

5. Taking stock: What's really at issue?

The hypothesis advanced by Sportiche (1988) is that the distribution and semantics of FQs would follow directly from the statement in (12) on independently motivated assumptions about phrase structure and movement, prominent among these being the VP-internal subject hypothesis. A corollary of Sportiche's proposal (and related ones) was that FQs could be used as a convenient diagnostic for the exact positions of empty categories, including traces of A-movement and PRO. Though this hypothesis has gained widespread currency, it should come as no surprise that language is not so obligingly straightforward. In the critical examination above of the stranding hypotheses and of the assumptions underlying transformational analyses generally, I have suggested that there are numerous significant questions which remain unanswered, but which suggest caution in using FQs as direct reflections of D-structure or the positions of empty categories. This conclusion of course does not entail that the stranding hypotheses are entirely wrong, nor does it entail that the positions of FQs can tell us nothing about the positions of empty categories.
A series of works by Doetjes (1991, 1992, 1997) has argued that FQs are indeed adverbial in their distribution and thus not related to the DPs they appear to quantify over through movement, but that nonetheless FQs must c-command and bind a trace of that DP. One of the primary motivations for this proposal is that it permits a unified account of L-tous (3) with the more widely investigated cases where the Q is to the right of its antecedent. Importantly, it explains the fact that L-tous is possible when the object is a clitic, but not when the object is a full DP. [Again, some variability in judgments is reported. For some, the clitic/pronoun versus DP contrast does not obtain in (55), though it surfaces in other L-tous environments such as (57), below (Marie-Hélène Côté, p.c.)]. (55)
a. Elle a tous voulu les lire. (= (3))
   she has all wanted them to-read
   'She wanted to read them all.'
b. * Elle a tous voulu lire ces livres. (= (27))
   she has all wanted to-read those books
   ('She wanted to read all those books.')
Some earlier accounts have treated L-tous as a form of cliticization or head-movement; see especially Bonneau and Zushi (1993), who point out that L-tous is possible in certain varieties of Spanish as well, but that in all these languages, long-distance L-tous (i.e., out of an infinitival or subjunctive clause) is restricted to the class of "restructuring" verbs, e.g., those which allow clitic-climbing in other Romance languages (see Wurmbrand 2001 for an overview and recent treatment of restructuring). On Doetjes's account, tous is instead base-generated in the floated (adverbial) position in (55a), and is licit there by virtue of the trace of the clitic in direct object position, which the FQ binds. Movement is necessary to leave a trace for the FQ to bind, and most movement operations take the relevant DP to a position higher than the FQ. Short clitic movement is one case (in French) in which the moved element does not move to a position c-commanding the FQ. Like the stranding analyses, then, Doetjes's account sheds light on the distribution of empty categories, though it does not require that there be a trace in every position which may host an FQ in, e.g., (5). This difference between the stranding analyses and Doetjes's analysis is especially clear in examples such as (56). (56)
The studentsᵢ don't all seem [ tᵢ to be from New York City ].
Since all in this example is lower than the negation in the matrix clause, it must be at the left periphery of the matrix VP. On the stranding analyses, this example would entail that raising, i.e., movement from the embedded Spec,IP position to the matrix Spec,IP position, had an intermediate (A)-movement through the matrix Spec,VP position, a surprising conclusion on many current assumptions about raising. Doetjes's account would take the FQ all to be in an adverbial position (left edge of VP), licensed presumably by the trace of the subject in the embedded clause. No intermediate movement is required.
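The structural configuration at stake in (56) can be checked mechanically. The sketch below is a hedged illustration only: it encodes a deliberately simplified tree for (56) and tests c-command using the common textbook definition (A c-commands B if neither dominates the other and the first branching node dominating A also dominates B). The tree shape, the node labels, and the helper names `find_path`, `node_at` and `c_commands` are assumptions made for this example and are not taken from Doetjes's work; the sketch covers only c-command, not binding or licensing.

```python
# A node is a tuple: (label, child, child, ...). Leaves have no children.
# Simplified, hypothetical structure for (56):
#   The students don't all seem [ t to be from New York City ].
RAISING_56 = (
    "IP",
    ("DP:the-students",),
    ("I'",
        ("I:don't",),
        ("VP",
            ("FQ:all",),  # adverbial position at the left edge of VP
            ("VP",
                ("V:seem",),
                ("IP",
                    ("t:subject",),  # trace in embedded Spec,IP
                    ("I':to-be-from-NYC",),
                ),
            ),
        ),
    ),
)


def find_path(tree, label, path=()):
    """Return the child-index path from the root to the node named label."""
    if tree[0] == label:
        return path
    for i, child in enumerate(tree[1:]):
        found = find_path(child, label, path + (i,))
        if found is not None:
            return found
    return None


def node_at(tree, path):
    """Follow a child-index path down from the root."""
    for i in path:
        tree = tree[i + 1]
    return tree


def c_commands(tree, a, b):
    """True if node a c-commands node b (first-branching-node definition)."""
    pa, pb = find_path(tree, a), find_path(tree, b)
    if pa is None or pb is None:
        raise ValueError("label not found in tree")
    # If one path is a prefix of the other, one node dominates the other.
    if pa == pb[:len(pa)] or pb == pa[:len(pb)]:
        return False
    # Climb from a's mother to the first branching node.
    mother = pa[:-1]
    while mother and len(node_at(tree, mother)) - 1 < 2:
        mother = mother[:-1]
    return pb[:len(mother)] == mother


if __name__ == "__main__":
    print(c_commands(RAISING_56, "FQ:all", "t:subject"))   # FQ over trace
    print(c_commands(RAISING_56, "t:subject", "FQ:all"))   # trace over FQ
    print(c_commands(RAISING_56, "I:don't", "FQ:all"))     # negation over FQ
```

In this toy structure the FQ c-commands the embedded subject trace without any trace in matrix Spec,VP, which is the configuration the adverbial account relies on, while negation in turn c-commands the FQ.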
Another environment which licenses L-tous takes the FQ in a matrix clause to be associated with the subject of a subjunctive (i.e., tenseless) complement of a restructuring verb, as in (57). (57)
a. Je veux tous qu'ils viennent.
   I want all that they come (SUBJUNCTIVE)
   'I want them all to come.'
b. * Je veux tous que les enfants viennent.
   I want all that the children come (SUBJUNCTIVE)
   ('I want the children all to come.') (Doetjes 1997: 207)
Doetjes (1997: 207-8) proposes that even though there is a trace for the FQ to bind in both the (a) and (b) examples, the ungrammaticality of (57b) should be attributed to a Binding Theory (Principle C) violation [the pronouns and clitics in (55a) and (57a) are licit since tous is outside of their binding domain]. Though Doetjes does not note it, a consequence of the hypothesis that FQs are relevant for Binding Theory is that the A/A' distinction (24)-(25) is predicted. If the FQ can trigger a Principle C violation, then A'-extraction across it should trigger a strong crossover violation, while A-movement should be unproblematic. If correct, Doetjes's analysis would imply that FQs do indicate something about the position of empty categories: though not revealing the exact position of subject traces, it would entail that there is a subject trace somewhere low in the structure, e.g., internal to VP. Certain important questions remain nevertheless. Besides some of the problems discussed in section 4.2, Doetjes's analysis leaves unanswered the question of why FQs need to bind a trace, as opposed, e.g., to an analysis in which it is the FQs themselves which need to be bound. For example, an analysis in which the FQs are themselves anaphors (subject to Principle A) would equally predict the A/A' differences and would be empirically distinguished only by the L-tous cases. Should the clitics raise at LF across tous (plausible, since clitics raise overtly in these environments in, e.g., Italian), then Doetjes's analysis and one in which the FQs are themselves anaphors would perhaps be empirically indistinguishable. Note that Doetjes's analysis appears to make the wrong prediction with respect to the generality of L-tous cross-linguistically. It is commonly assumed that the pre-adverbial position of the pronominal object in German examples such as (58a) (or their Dutch equivalents) is the result of a short, leftwards movement of the DP.
The definite DP object is assumed to be base-generated in a position following the adverb gestern 'yesterday', i.e., the position occupied by the indefinite Kekse 'cookies' in (58b). Indeed, on Doetjes's analysis, this must be true in order for the FQ in (58a) to be
licensed. For discussion of Q-float and scrambling/object shift in German, see especially Giusti (1990a, 1990b) and Merchant (1996). (58)
a. Im Garten hat der Hans sieᵢ gestern alleᵢ gegessen.
   In the garden has the Hans them yesterday all eaten
   'Hans ate them all yesterday in the garden.'
b. Im Garten hat der Hans gestern Kekse gegessen.
   In the garden has the Hans yesterday cookies eaten
   'Hans ate cookies yesterday in the garden.'
Doetjes's analysis of L-tous takes it that the FQ in (55a) is licit preceding the DP it modifies because the DP has undergone a short movement (cliticization) leaving a trace. In her analysis, there is nothing special about cliticization per se, and unlike other analyses, no requirement that the DP being modified c-command or precede the FQ. Her analysis thus predicts that an FQ in Dutch or German should be able to precede the pronoun it modifies, so long as that pronoun has undergone the short scrambling exemplified in (58a) leaving a trace. The sharply ungrammatical (59a) is therefore incorrectly predicted to have the same status as French L-tous (55a). On Doetjes's assumptions, it is hard to see how to exclude (59a) without also excluding (55a). (The (b) example is included as a control to show that the position of alle in (59a) is independently available.) (59)
a. * Im Garten hat alleᵢ der Hans sieᵢ gestern gegessen.
   In the garden has all the Hans them yesterday eaten
   ('Hans ate them all yesterday in the garden.')
b. Die Kekseᵢ hat alleᵢ der Hans gegessen.
   The cookies has all the Hans eaten
   'Hans ate all the cookies.'
Junker (1995) and Doetjes (1997) propose to relate their analyses of FQs to other analyses of adverbial quantification and of "binominal each" (i.e., as in The children received three balloons each), by positing that even the simplest FQs (e.g., all) have a complex internal structure. In particular, they argue that FQs contain empty nominal positions, which are involved in the binding relations, and (for Junker) in agreement. For example, while Sportiche (1988) would assign the surface representation [tous e] to an FQ with the restriction that the e be a trace, pro or PRO, Doetjes (1997) assigns FQs essentially the same structure, arguing that it is this empty category (and not the quantifier) which must bind the trace. One might also interpret Belletti's (1982) proposals in these terms, suggesting that the empty category in FQs could itself be anaphoric. This is the route taken by Sportiche (1988: 445) in analysing larger FQs which cannot form legitimate surface constituents with their DP antecedents, such
as in (36). Sportiche assigns this the structure [tous les trois [N e]], where e is an empty anaphor, forming a chain with its antecedent, and not a trace of movement. Such an account might treat the pronominal clitics in Hebrew as overt manifestations of the covert pronominal or anaphoric element. Intriguingly, such proposals could also relate the distribution of FQs to other anaphor-like elements which display agreement and seem to appear in adverbial positions or the left periphery of VPs, under ill-understood conditions (on the origin and rise, c. 1000 AD, of himself forms used as "subject intensifiers" as in (60c-d), see Keenan 1996; see Tremblay 1990 for discussion of the French examples, and the suggestion that they are related to FQs; see also Torrego 1996 for an analysis of certain NPs and pronouns in Spanish as FQs, roughly along these lines). (60)
a. The professors have all been working on this very problem.
b. Chomsky and Halle have the two of them been working on this very problem.
c. Chomsky has himself been working on this very problem.
d. You know yourself that this can't be all there is to say.
(61)
Les enfantsᵢ ont eux-mêmesᵢ donné un cadeau à Marie.
the children have them-selves given a present to Marie
'The children have themselves given a present to Marie.' (French, Tremblay 1990: 236)
(62)
Mne budet samomu interessno, kak reshitsja ètot vopros.
me-DAT will.be self-M.SG.DAT interesting how resolves this question
'It will be interesting to me myself, how this question turns out.' (Russian)
Finally, the investigation is made more complex if we accept that what appear to be FQs may be different elements in different languages (Bonneau and Zushi 1993) or even internal to one language (Shlonsky 1996: 14) (see also sections 3.3 and 4.3 above). Indeed, most authors who take all to be the same element in both prenominal and floated positions would presumably nevertheless accept an adverbial analysis of all in phrases such as [all alone], [all wet], where it cannot float and has the meaning 'completely, entirely'. In this article, I have lumped together various approaches to Q-float to highlight aspects salient for the present discussion. To be sure, they differ in many ways as well, and I have not attempted to provide a thorough listing of important details of many of the approaches. I have instead focused on what I see as a deeper issue, namely, can the position of
FQs be taken as a diagnostic for traces and their positions? I have suggested that no analysis to date has been successful in predicting the distribution and properties of FQs with full generality. Nevertheless, in the years since pairs such as (1) became an object of study from a transformational perspective, and in particular in the decade or so since the stranding analysis was formulated by Sportiche (1988) and Miyagawa (1989), we have learned a great deal. Though FQs cannot be used unequivocally as tests for underlying constituent structure, it is clear that they are in some way intimately connected to predication and their distribution is connected to either movement or binding (or both). While we may not have found the answer to the question with which we began the article, we at least know now some of the major questions that may lead us to the answers: (63)
• Why do only certain universal Qs float?
• Why is the scope of an FQ fixed?
• Why do A and A' antecedents/traces behave differently in (standard) English?
• What is the relationship of Q-float to split NP constructions?
• What permits and constrains the attested cross-linguistic variation?
Acknowledgement

I would like to thank the participants of the LF Reading Group at MIT, Hideaki Yamashita and especially Susi Wurmbrand for extensive comments on and discussion of some of the material presented here. Thanks also to Rachel Turk for her assistance in updating the bibliography.
References not in Bibliography

Chomsky, Noam (1986). Knowledge of Language. New York: Praeger.
Dowty, David R. (1986). Collective predicates, distributive predicates, and all. In: F. Marshall, A. Miller and Z.-S. Zhang (eds.), Proceedings of ESCOL 3, 97-115. Ohio: Ohio State University.
Emonds, Joseph (1978). The Verbal Complex V'-V in French. Linguistic Inquiry 9, 151-175.
Haegeman, Liliane [1991] (1994). Introduction to Government and Binding Theory. London: Basil Blackwell.
Heim, Irene, Howard Lasnik and Robert May (1991). Reciprocity and Plurality. Linguistic Inquiry 22, 63-102.
Keenan, Edward (1996). Creating Anaphors: An Historical Study of the English Reflexive Pronouns. Ms. Los Angeles: UCLA.
Lahiri, Utpal (1991). Embedded Interrogatives and Predicates that Embed them. Ph.D. Dissertation, MIT, Cambridge, Mass.
Lebeaux, David (1983). A Distributional Difference Between Reciprocals and Reflexives. Linguistic Inquiry 14, 723-730.
Lewis, David (1975). Adverbs of Quantification. In: Edward Keenan (ed.), Formal semantics of natural language: Papers from a Colloquium Sponsored by the King's College Research Centre, 3-15. Cambridge: Cambridge University Press.
Mahajan, Anoop K. (1990). The A/A-Bar Distinction and Movement Theory. Ph.D. Dissertation, MIT, Cambridge, Mass.
Obenauer, Hans (1994). Aspects de la syntaxe A-barre: Effets d'intervention et mouvements des quantifieurs. Thèse de doctorat d'Etat, Université de Paris VIII, Paris.
Pollock, Jean-Yves (1989). Verb Movement, Universal Grammar and the Structure of IP. Linguistic Inquiry 20, 365-424.
de Swart, Henriette (1991). Adverbs of Quantification: A Generalized Quantifier Approach. Ph.D. Dissertation, University of Groningen, Groningen.
Tremblay, Mireille (1990). Emphatic anaphoric expressions in French and Binding Theory. In: Anne-Marie DiSciullo and Anne Rochette (eds.), Binding in Romance: Essays in Honour of Judith McA'Nulty, 233-258. Ottawa: Canadian Linguistics Association.
Williams, Edwin (1982). The NP Cycle. Linguistic Inquiry 13, 277-295.
Williams, Edwin (1991). Reciprocal Scope. Linguistic Inquiry 22, 159-173.
Wurmbrand, Susi (2001). Infinitives: Restructuring and Clause Structure. Berlin: Mouton de Gruyter.
Wyngaerd, Guido Vanden (1989). Object shift as an A-movement Rule. MIT Working Papers in Linguistics 11, 256-271.
A Floating Quantifier Bibliography
[Note: Papers dealing exclusively with binominal each or reciprocal constructions such as each... other are not included in this bibliography, though these elements may well bear some relationship to FQs.]

Akiyama, Masahiro (1994). On quantifier floating. English Linguistics 11, 100-122.
Alam, Yukiko Sasaki (1997). Numeral Classifiers as Adverbs of Quantification. In: Ho-min Sohn and John Haig (eds.), Japanese/Korean Linguistics, volume 6, 381-397. Stanford: Center Study Language & Information.
Altaha, Fayez M. (1994). Kashmiri causative constructions and the antipassive analysis. Indian Linguistics: Journal of the Linguistic Society of India 55, 1-22.
Anderson, John M. (1973a). A note on the placement of universal quantifiers. Edinburgh Working Papers in Linguistics 2, 24-36.
Anderson, John M. (1973b). Universal quantifiers. Lingua 31, 125-176.
Anderson, John M. (1974). All and Equi ride again. Archivum Linguisticum (new series) 5, 1-10.
Bach, Emmon, Eloise Jelinek, Angelika Kratzer and Barbara H. Partee (eds.) (1995). Quantification in natural languages. Dordrecht: Kluwer.
Baltin, Mark R. (1978). Towards a theory of movement rules. Ph.D. Dissertation, MIT, Cambridge, Mass.
Baltin, Mark R. (1982). A Landing Site Theory of Movement Rules. Linguistic Inquiry 13, 1-38.
Baltin, Mark R. (1995). Floating quantifiers, PRO and predication. Linguistic Inquiry 26, 199-248.
Bayer, Josef (1986/1987). The Syntax of Scalar Particles and so-called 'Floating Quantifiers'. Ms. Nijmegen: Max Planck Institut für Psycholinguistik.
Beghelli, Filippo (1995). The phrase structure of quantifier scope. Ph.D. Dissertation, UCLA, Los Angeles.
Belletti, Adriana (1982). On the anaphoric status of the reciprocal construction in Italian. The Linguistic Review 2, 101-138.
Benmamoun, Elabbas (1999). The syntax of quantifiers and Quantifier Float. Linguistic Inquiry 30, 621-642.
Bobaljik, Jonathan David (1995). Morphosyntax: the syntax of verbal inflection. Ph.D. Dissertation, MIT, Cambridge, Mass.
Bonneau, José (1986). Quantificateurs flottants, restructuration et théorie des chaînes dans les langues Romanes. MA Thesis, Université du Québec à Montréal, Montreal.
Bonneau, José and Mihiko Zushi (1993). Quantifier Climbing, Clitic Climbing, and Restructuring in Romance. McGill Working Papers in Linguistics 8.1, 1-35.
Boster, Carole Tenny (1996). On the Quantifier-Noun Phrase Split in American Sign Language and the Structure of Quantified Noun Phrases. International Review of Sign Linguistics 1, 159-208.
Bowers, John (1993). The syntax of predication. Linguistic Inquiry 24, 591-656.
Brandon, Frank Roberts (1982). Q-Float and Conjunction Reduction as Evidence for NEG-Placement. Papers from the Regional Meetings Chicago Linguistic Society 18, 29-39.
Brodie, Belinda (1983). English adverb placement in GPSG. M.A. Thesis, Ohio State University.
Carden, Guy (1976). English Quantifiers: Logical Structure and Linguistic Variation. New York: Academic Press.
Cardinaletti, Anna (1991). On Pronoun Movement: The Italian Dative Loro. Probus 3, 127-153.
Cheng, Lisa Lai-Shen (1995). On dou-quantification. Journal of East Asian Linguistics 4, 197-234.
Cheng, Lisa Lai-Shen (1997). On the typology of wh-questions. (Outstanding Dissertations in Linguistics.) New York: Garland.
Chiu, Bonnie (1990). A case of quantifier float in Mandarin Chinese. Paper presented at Cornell.
Chiu, Bonnie (1993). The inflectional structure of Mandarin Chinese. Ph.D. Dissertation, University of California, Los Angeles.
Comrie, Akiko Kumahira (1988). On so-called quantifier floating in Japanese. Ph.D. Dissertation, University of Southern California, Los Angeles.
Cook, Kenneth W. (1987). A new relational account of Samoan quantifier float, case marking and word order. Proceedings of BLS 13, 53-64.
Déprez, Viviane (1989). On the Typology of Syntactic Projections and the Nature of Chains: Move-α to the Specifier of Functional Projections. Ph.D. Dissertation, MIT, Cambridge, Mass.
Déprez, Viviane (1994a). Questions with floating quantifiers. In: M. Harvey and L. Santelmann (eds.), Proceedings of SALT IV, 96-113. Ithaca: Cornell University.
Déprez, Viviane (1994b). The weak island effect of floating quantifiers. In: Elena Benedicto and Jeff Runner (eds.), Functional Projections: University of Massachusetts Occasional Papers 17, 63-84.
Déprez, Viviane (1995). Pair-list answers with floating quantifiers. In: Raul Aranovich, William Byrne, Susanne Preuss and Martha Senturia (eds.), Proceedings of the 13th West Coast Conference on Formal Linguistics, 205-220. Stanford, CA: Center for the Study of Language and Information.
Doetjes, Jenny (1991). L-tous: A unifying account of quantifier float in French. M.A. Thesis, Leiden University, Leiden.
Doetjes, Jenny (1992). Rightward floating quantifiers float to the left. The Linguistic Review 9, 313-332.
Doetjes, Jenny (1997). Quantifiers and Selection: on the distribution of quantifying expressions in French, Dutch and English. Ph.D. Dissertation, University of Leiden, Leiden.
Downing, Pamela (1993). Pragmatic and semantic constraints on numeral quantifier position in Japanese. Journal of Linguistics 29, 65-93.
Dowty, David R. and Belinda Brodie (1984). The semantics of "floated" quantifiers in a transformationless grammar. In: Mark Cobler, Susannah MacKaye and Michael T. Westcoat (eds.), Proceedings of the 3rd West Coast Conference on Formal Linguistics (WCCFL 3), 75-90. Stanford, CA: Stanford Linguistics Association.
Drijkoningen, Frank (1997). Morphological Strength: NP Positions in French. In: Dorothee Beerman, David LeBlanc and Henk C. van Riemsdijk (eds.), Rightward Movement, 81-114. Amsterdam: John Benjamins.
Dubinsky, Stanley (1990). Japanese object to indirect object demotion. In: Paul Postal and Brian Joseph (eds.), Studies in Relational Grammar 3, 49-86. Chicago: University of Chicago Press.
Fauconnier, Gilles (1979). Theoretical implications of some global phenomena in syntax. New York: Garland.
Fiengo, Robert and Howard Lasnik (1976). Some issues in the theory of transformations. Linguistic Inquiry 7, 182-191.
von Fintel, Kai (1994). Restrictions on Quantifier Domains. Ph.D. Dissertation, University of Massachusetts, Amherst, Mass.
Frota, Sonia (1994). Aspects of the Prosody of Focus in European Portuguese; Aspectos da prosodia do foco no portugués europeu. Letras de Hoje 29, 77-99.
Fujita, Naoya (1994). On the nature of modification: A study of floating quantifiers and related constructions. Ph.D. Dissertation, University of Rochester.
Fukushima, Kazuhiko (1991a). Generalized floating quantifiers. Ph.D. Dissertation, University of Arizona.
Fukushima, Kazuhiko (1991b). Phrase Structure Grammar, Montague semantics and floating quantifiers in Japanese. Linguistics and Philosophy 14, 581-628.
Fukushima, Kazuhiko (1993). Model theoretic semantics for Japanese floating quantifiers and their scope properties. Journal of East Asian Linguistics 2, 213-228.
van Geenhoven, Veerle (1998). Semantic incorporation and indefinite descriptions: Semantic and syntactic aspects of noun incorporation in West Greenlandic. Dissertations in Linguistics. Stanford, CA: CSLI.
Gerdts, Donna B. (1987). Surface case and grammatical relations in Korean: the evidence from quantifier float. Studies in Language 11, 181-197.
Giusti, Giuliana (1990a). Floating Quantifiers in Germanic. In: Joan Mascaro and Marina Nespor (eds.), Grammar in progress: GLOW Essays for Henk van Riemsdijk, 137-146. Dordrecht: Foris.
Giusti, Giuliana (1990b). Floating Quantifiers, scrambling and configurationality. Linguistic Inquiry 21, 633-641.
Giusti, Giuliana (1990c). The syntax of floating 'alles' in German. In: Werner Abraham, Wim Kosmeijer and Eric Reuland (eds.), Issues in Germanic Syntax, 327-350. New York/Berlin: Mouton/Walter de Gruyter.
Giusti, Giuliana (1991). The Categorial Status of Quantified Nominals. Linguistische Berichte 136, 438-452.
Giusti, Giuliana (1992). La sintassi dei sintagmi nominali quantificati: Uno Studio comparativo. Ph.D. Dissertation, University of Venice & Padua, Venice and Padua.
Guilfoyle, Eithne, Henrietta Hung and Lisa deMena Travis (1992). SPEC of IP and SPEC of VP: two subjects in Austronesian languages. Natural Language and Linguistic Theory 10, 375-414.
Haig, John (1980). Some observations on quantifier floating in Japanese. Linguistics 18, 1065-1083.
Halpern, Richard Neil (1977). Notes on the origin of quantifier floating. Studies in the linguistic sciences 7, 41-45.
Hamano, Shoko (1997). On Japanese Quantifier Floating. In: Akio Kamio (ed.), Directions in functional linguistics, 173-197. Amsterdam: John Benjamins.
Hoeksema, Jacob (1996). Floating quantifiers, partitives and distributivity. In: Jacob Hoeksema (ed.), Partitives, 57-106. Berlin: Mouton de Gruyter.
Hoekstra, Eric, Ema Vermeulen, Pim Wehrmann and Guido Vanden Wyngaerd (1989). Anaphoric adverbs and quantifier float. Ms. Groningen, Leiden and Amsterdam: University of Groningen, Leiden University and University of Amsterdam.
Hoekstra, Teun (1992a). Small clause theory. Belgian Journal of Linguistics 7, 125-151.
Hoekstra, Teun (1992b). Subjects inside out. Revue Québécoise de Linguistique 22, 45-75.
Hogg, Richard M. (1977). English quantifier systems. Amsterdam: North Holland.
Hong, Ki-Sun (1990). Quantifier Float in Korean. Proceedings of the Berkeley Linguistics Society 16, 175-186.
Jaeggli, Osvaldo A. (1982). Topics in Romance Syntax. Dordrecht: Foris.
Junker, Marie-Odile (1990a). Floating quantifiers and distributivity. Cahiers linguistiques d'Ottawa 18, 13-42.
Junker, Marie-Odile (1990b). Floating quantifiers and Georgian distributivity. In: Karen Deaton, Manuela Noske and Michael Ziolkowski (eds.), Papers from the 26th regional meeting of the Chicago Linguistics Society. Chicago: Chicago Linguistics Society.
Junker, Marie-Odile (1993). Distributivité en sémantique conceptuelle: Le cas des quantifieurs flottants. Ph.D. Dissertation, Université de Sherbrooke, Sherbrooke, QC.
Junker, Marie-Odile (1995). Syntax et sémantique des quantifieurs flottants tous et chacun: distributivité en sémantique conceptuelle. Genève: Librairie Droz.
Katagiri, Masumi (1991). Review Article: Structure and Case Marking in Japanese (Miyagawa 1989). Studies in Language 15, 399-414.
Kawashima, Ruriko (1998). The structure of extended nominal phrases: The scrambling of numerals, approximate numerals, and quantifiers in Japanese. Journal of East Asian Linguistics 7, 1-26.
Kayne, Richard (1975). French Syntax. Cambridge, Mass: MIT Press.
Kayne, Richard (1981). Binding, quantifiers, clitics and control. In: Frank Heny (ed.), Binding and Filtering, 191-211. Cambridge, MA: MIT Press. [Reprinted in Richard Kayne (1984) Connectedness and Binary Branching. Dordrecht: Foris.]
Kayne, Richard (1989). Facets of Past Participle Agreement. In: Paola Benincà (ed.), Dialect Variation and the Theory of Grammar, 85-103. Dordrecht: Foris.
Kayne, Richard S. (1969). The transformational cycle in French syntax. Ph.D. Dissertation, MIT, Cambridge MA.
Kayne, Richard S. (1978). Le condizioni sui legamento, il collocamento dei clitici e lo spostamento a sinistra dei quantificatori. Rivista di grammatica generativa 3, 147-171.
Kim, Alan Hyun Oak (1995). Word Order at the Noun Phrase Level in Japanese: Quantifier Constructions and Discourse Functions. In: Pamela Downing and Michael Noonan (eds.), Word order in discourse, 199-246. Amsterdam: John Benjamins.
Klein, S. (1976). A base analysis of the floating quantifier in French. Proceedings of NELS 7 (MIT).
Koizumi, Masatoshi (1995). Phrase Structure in Minimalist Syntax. Ph.D. Dissertation, MIT, Cambridge, Mass.
Koopman, Hilda and Dominique Sportiche (1991). The position of subjects. Lingua 85, 211-258.
Kuroda, S.-Y. (1980). Bunkoozoo-no hikaku. In: T. Kimihiro (ed.), Nitieigo Hikaku-Kooza 2, Bunpoo, Tokyo: Taikushan.
Kuroda, S.-Y. (1983). What can Japanese say about Government and Binding? In: Michael Barlow, Daniel P. Flickinger and Michael T. Westcoat (eds.), Proceedings of the West Coast Conference on Formal Linguistics (WCCFL) 2. Stanford:.
Lancri, Annie (1992). L'ordre des quantificateurs flottants. In: Jacqueline Guéron (ed.), L'ordre des mots: Domaine anglais, 121-140. Saint-Etienne: Travaux du centre interdisciplinaire d'études et de recherches sur l'éxpression contemporain.
Lehman, Frederick K. (1985). On quantifier floating in Lushai and Burmese with some remarks on Thai. In: Graham Thurgood, James A. Matisoff and David Bradley (eds.), Linguistics of the Sino-Tibetan Area: The state of the art, 264-278. (Pacific Linguistics, Series C 87.) Canberra: Australian National University.
Lemieux, Monique, Marielle Saint-Amour and David Sankoff (1985). /TUT/ en français de Montréal: un cas de neutralisation morphologique. In: Monique Lemieux and A. J. Cedergren (eds.), Les tendances dynamiques du français parlé à Montréal, vol. 2. Québec: Gouvernement du Québec.
Léard, Jean-Marcel (1995). Grammaire québécoise d'aujourd'hui: comprendre les québecismes. Montréal: Guérin universitaire.
Link, Godehard (1974). Quantoren-floating im Deutschen. In: Ferenc Kiefer and David M. Perlmutter (eds.), Syntax und generative Grammatik 2, 105-127. Frankfurt am Main: Athenion.
Maling, Joan M. (1976). Notes on Quantifier-Postposing. Linguistic Inquiry 7, 708-718.
McCawley, James D. (1988). The syntactic phenomena of English. Chicago: University of Chicago Press.
McCawley, James D. (1999). Why Surface Syntactic Structure Reflects Logical Structure as Much as It Does, but Only That Much. Language 75, 34-62.
McCloskey, James (1998). The Prosody of Quantifier Float under WH-Movement in West Ulster English. Ms. Santa Cruz: University of California at Santa Cruz. Available at http://ling.ucsc.edu/~mcclosk/.
McCloskey, James (2000). Quantifier float and wh-movement in an Irish English. Linguistic Inquiry 31, 57-84.
Merchant, Jason (1996). Object Scrambling and Quantifier Float in German. Proceedings of NELS 26, 179-193. Amherst: Graduate Student Linguistics Association.
Miller, Philip H. (1995). Une analyse lexicaliste des affixes pronominaux en français. Revue québécoise de linguistique 24, 135-171.
Milner, Jean-Claude (1978). De la syntaxe à l'interprétation. Paris: Éditions de Seuil.
Milner, Jean-Claude (1987). Interpretive chains, floating quantifiers and exhaustive interpretation. In: Carol Neidle and Rafael Nunez (eds.), Studies in Romance languages, 181-202. Dordrecht: Foris.
Miyagawa, Shigeru (1989). Structure and case marking in Japanese. (Syntax and Semantics 22.) New York: Academic Press.
Miyamoto, Yoichi (1994). Secondary predicates and tense. University of Connecticut, Storrs.
Munro, Pamela (1984). Floating quantifiers in Pima. In: Eung-Do Cook and Donna B. Gerdts (eds.), The syntax of Native American languages, 269-287. (Syntax and Semantics 16.) New York: Academic Press.
Naito, Seiji (1993). Concept of command in Japanese syntax. Ph.D. Dissertation, Harvard University, Cambridge, MA.
Nakayama, Mineharu and Masatoshi Koizumi (1991). Remarks on Japanese subjects. Lingua 85, 303-319.
Napoli, Donna Jo (1975). Consistency. Language 51, 831-844.
Oosthuizen, Johan (1989). An interpretive analysis of quantifier postposing phenomena in Afrikaans. Stellenbosch Papers in Linguistics 19.
Pafel, Jürgen (1995). Kinds of Extraction from Noun Phrases. In: Uli Lutz and Jürgen Pafel (eds.), On extraction and extraposition in German, 145-177. Amsterdam: John Benjamins.
Postal, Paul (1974a). Avoiding reference to subject. Linguistic Inquiry 7, 151-182.
Postal, Paul (1974b). On raising: One rule of English and its theoretical implications. Cambridge, Mass: MIT Press.
Quicoli, A. Carlos (1976). Conditions on quantifier movement in French. Linguistic Inquiry 7, 583-607.
Reis, Marga (1992). The category of invariant alles in wh-clauses: On syntactic quantifiers vs. quantifying particles in German. In: Rose-Marie Tracy (ed.), Who climbs the grammar tree?, 465-492. Tübingen: Niemeyer.
Reis, Marga and Heinz Vater (1980). Beide. In: Günther Brettschneider and Christian Lehmann (eds.), Wege zur Universalienforschung, sprachwissenschaftliche Beiträge zum 60. Geburtstag von Hansjakob Seiler, 360-386. (Tübinger Beiträge zur Linguistik 145.) Tübingen: Gunter Narr.
Sag, Ivan (1978). Floating quantifiers, adverbs and extraction sites. Linguistic Inquiry 9, 146-150.
Shimozaki, Minoru (1989). The quantifier float construction in Japanese. Gengo Kenkyu: Journal of the Linguistic Society of Japan 95.
Shlonsky, Ur (1991). Quantifiers as functional heads: A study of Quantifier Float in Hebrew. Lingua 84, 159-180.
148
Jonathan David Bobaljik
Shlonsky, Ur (1996). Review of Bobaljik (1995) 'Morphosyntax'. Glot International 2,11-14. SigurÖsson, Halldór Ármann (1991). Icelandic Case-marked PRO and the Licensing of Lexical Arguments. Natural Language and Linguistic Theory 9, 327-364. Sportiche, Dominique (1988). A Theory of Floating Quantifiers and its Corollaries for Constituent Structure. Linguistic Inquiry 19, 425-449. Takano, Yasukvmi (1984). The lexical nature of quantifiers in Japanese. Linguistic Analysis 14, 289-311. Terada, Hiroshi (1991). Тзфез of floating quantifiers and the landing site of scrambling. Nagoya Working Papers in Linguistics 7, 47-101. Torrego, Esther (1996). On quantifier float in Control clauses. Linguistic Inquiry 27,111-126. Ueda, Masanobu (1986). Quantifier Float in Japanese. Sophia Linguistica 20/21,103-112. Vater, Heinz (1980). Quantifier floating in German. In: Johan Van der Auwera (ed.), T%e semantics of determiners, 232-249. London and Baltimore: Croom Helm and University Park Press. Veld, E. i n ' t (1990). Een analyse van allemaal. M.A. Thesis, Leiden University, Leiden. Visser, Saskia J. (1991). A separate status for beide; Een status aparte voor Ъeide'. Tabu 21,117-126. Wongbiasaj, Soranee (1979). Quantifier floating in Thai and the notions of cardinality/ordinality. Studies inthe linguistic sciences 9,189-199. Yamashita, Hideaki (2000). A note on NQ-Scrambling in Japanese. M.A. Thesis, Nanzan University, Nagoya, Japan. Yamashita, Hideaki (2001). "FNQ-Scrambling" in Japanese. In: Second Report of Minimalization of Each Module in Generative Grammar, 221-252. Nagoya: Nagoya University. Yatabe, Shuichi (1990). Quantifier floating in Japanese and the (theta) hierarchy. Papers from the regional meetings Chicago Linguistic Society 26,437-451. Yoshida, Tomoyuki (1993). Quantifiers and the theory of movement. Ph.D. Dissertation, Cornell University, Ithaca, NY.
No lack of determination Greg Carlson
1. Introduction
What are traditionally called Noun Phrases seem to come in two varieties: those that begin with a determiner (or a quantifier-like expression), and those that don't. So, at first glance, while phrases like those desks and most new cars show both a determiner-type element and a nominal, phrases like Fred, her, linguistics papers, and wheat do not. The question of whether to analyze these latter types of noun phrases as being similar in structure to the former (and if so, how) has often boiled down to the question of the exact identity of the missing determiner element. It has been suggested, for instance, that proper names have a covert definite article associated with them, so that Fred should be analyzed syntactically and semantically along the lines of the Fred, with the proviso that the definite article is deleted or otherwise fails to surface phonetically in these and similar instances (see especially Sloat 1969). Or, with pronouns, it has been proposed (Postal 1969a) that they are a species of definite determiners themselves, and so she for instance should be analyzed as something like she one (parallel to that one), with the proviso that the nominal element be phonetically unrealized.
The set of issues with the other determinerless noun phrases like linguistics papers or wheat is a bit different, and it's what we're going to be discussing here. These have been discussed most often in connection with genericity (some of the basic background works being Dahl 1975, Smith 1975, and Lawler 1973). The terminology of "bare plural" and "mass term" to describe these has become most familiar in current generative linguistics, though a current term encompassing both is lacking; however, as the two have a great deal in common despite their well-known differences, the lack of an appropriate cover term is apparent. So, somewhat irrationally, I'll use the term "Bare Plurals" (BP's), but in so doing I also intend to include mass terms (unless otherwise noted).
The basic fact about BP's is, first, that they appear to have more than one interpretation. In a given sentence they might be interpreted existentially, as in (1a), or as something like a universal, as in (1b).
(1) a. Curious people crowded around the site of the accident (i.e. some curious people)
    b. Curious people like to travel a lot (all, or nearly all, curious people; curious people in general)
They also typically have just one of these interpretations available in any given sentence (on a constant interpretation of the sentence less the BP). So, for instance, (1a) has no universal or general reading, nor does (1b) have an existential reading. This observation is by no means hard and fast, however. In a sentence like (2), from Longobardi (1994), both readings appear accessible:
(2) I only excluded old ladies. (= Longobardi's (41a))
Such a sentence can be understood as excluding only some older women (and admitting others), or as excluding all who are older. Facts such as these have spurred a great deal of work, and controversy. It is possible to separate out two closely-related issues concerning the syntax and semantics of BP's:
(A) What do BP noun phrases mean? Do they have a single, unified meaning which appears to be different in different contexts, or are there two or more meanings?
(B) How do the syntactic and semantic (and pragmatic) contexts determine which interpretation(s) of the BP is/are appropriate?
While these two questions are intertwined, we are going to focus on the former question.
2. An early unified analysis
Traditional grammars of English assume that BP's have covert determiners associated with them. One of these is the plural form of the indefinite singular a(n) and accounts for the existential interpretation; the other is a universal sort of determiner or quantifier, having something like the force of all or any. This analysis presents two quite different problems. First, it would appear to predict systematic ambiguity of BP's, when what one finds more generally is the lack of it. Second, it is no trivial matter to specify the exact identity of the "universal" null determiner, as it is clearly not universal, nor quite like any of the other non-null determiners/quantifiers.
These problems were discussed in Carlson (1977, 1980), the work inspired by that of Milsark (1974). The reason, I argued there, that bare plurals are not generally ambiguous in a given sentence is that they are not ambiguous in and of themselves. Rather, it is the syntactic/semantic context in which they appear that makes them appear to have different interpretations, not any difference in the determiner or any other element in the noun phrase. Thus, the interpretation of the noun phrase 'curious people' in (1a), in terms of the contribution that that noun phrase makes to the meaning of the whole, is identical to the interpretation it receives in (1b). That interpretation, it was argued, was the name of a kind of thing, thus aligning the interpretation of bare plurals with that of proper names and definites as unquantified/referential noun phrases. If one were to assign the covert determiner a meaning (though there it was accomplished syncategorematically), it would be a function from predicate meanings to generalized quantifier meanings of the logical sort of names and definite singular terms. This identification was supported by some data showing similarities between BP's and proper names, such as the ability of the phrase "so-called" to appear with each (Postal 1969b) or the ability of both to appear in contexts favoring definites over indefinites (Postal 1969a); but it appeared to leave other data not easily accounted for which aligned BP's with indefinites or noun phrases with weak determiners, such as the easy appearance of BP's in English existential constructions.
The mechanism for providing the apparently differing interpretations was based on an analysis of the stage-level / individual-level distinction, in which stage-level predicates (as found in (1a)) introduced as a part of their meanings existential quantifiers, giving rise to the existential readings found in such examples as (1a); individual-level predicates, as in (1b), did not introduce any similar existential quantifiers, so that an existential reading does not appear. That is, existential readings of BP's were attributed to an existential quantifier in the interpretation of the sentence that is not a part of the meaning of the BP itself.
The "kinds" treatment of bare plurals is motivated chiefly by several factors, only a couple of which I'll mention here. One is the existence of predicates which appear only applicable to kinds, such as:
(3) a. Ground squirrels are widespread/common/rare
    b. Forks are a type/kind of table utensil
    c. Pick-up trucks come in four basic sizes
Predicates such as these do not involve generalizations over corresponding sentences with individual variables. Some facts about anaphora are also relevant.
(4) Cats think very highly of themselves
A sentence like (4) has two readings, one where each individual cat thinks it is wonderful, and another reading in which cats hold no attitude towards themselves directly, but think highly of that species in general. Similarly, there is one reading for examples such as (5):
(5) John polished apples, and Mary ate them
which allows John to polish some apples and Mary to eat some other apples (that he never polished); on the other reading the apples are, of course, identical. In Carlson (1980) this difference was attributed to whether the pronoun was interpreted coreferentially or as an E-type pronoun. Further, NP's of the form "that kind of x" also, it is claimed, exhibit "generic" and "existential" readings:
(6) a. That kind of animal eats wood ("generic")
    b. I saw that type of animal at the pet store yesterday ("existential")
Additional arguments can be found in Carlson (1977, 1980) and elsewhere. This was, I believe, the first attempt to deal with the phenomena systematically within a formal semantics framework (though see especially Lawler 1973), and it didn't take long for researchers to work on improvements.
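
The mechanism of this early analysis can be made concrete with a rough logical sketch. The notation is an illustrative assumption rather than a quotation from Carlson (1977, 1980): c stands for the kind named by curious people, R(y, c) says that y is a stage realizing c, and the predicate constants are invented labels for the predicates of (1a) and (1b).

```latex
% Sketch only: c, R, and the predicate constants are illustrative assumptions.
\begin{align*}
\text{BP (same in both sentences):}\quad
  & \textit{curious people} \mapsto c \\
\text{stage-level predicate, as in (1a):}\quad
  & \lambda x\,\exists y\,[R(y,x) \wedge \mathit{crowd}'(y)] \\
\text{resulting reading of (1a):}\quad
  & \exists y\,[R(y,c) \wedge \mathit{crowd}'(y)] \\
\text{individual-level predicate, as in (1b):}\quad
  & \lambda x\,[\mathit{like\text{-}to\text{-}travel}'(x)] \\
\text{resulting reading of (1b):}\quad
  & \mathit{like\text{-}to\text{-}travel}'(c)
\end{align*}
```

On this sketch the existential quantifier in (1a) is contributed by the predicate, while the BP contributes the same term c in both sentences.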
3. Critiques and criticisms
The unified "kinds" view was certainly not beyond criticism. DeMey (1980, 1982) was among the first to question the necessity of "stages" for an analysis. More detailed presentations of alternative views are found in ter Meulen (1979), on mass terms, and more specifically in Wilkinson (1991), on bare plurals, who offers perhaps the most comprehensive critique to date. Kratzer (1980) presents a very interesting criticism of the "kinds" analysis that has had some reply (Carlson 1996; É. Kiss 1998). Lasersohn (1997) likewise critiques some of the semantic claims associated with a "kinds" analysis from examining the detailed semantics of donkey sentences. Schubert & Pelletier (1987) have a detailed critical discussion of the framework. Condoravdi (1994) and É. Kiss (1998) argue that a notion of specificity (though characterized differently in each case) distinguishes apparently universal from indefinite appearances of bare plurals; Condoravdi (1994), like Wilkinson, presents criticisms fairly comprehensively. Even on a unitary "kinds" analysis, there remain several alternative points of view (e.g., see Ojeda 1993), and while a unified analysis would seem a priori desirable it is by no means taken for granted.
As the semantic theory of indefinites developed during the 1980's (Lewis 1975; Kamp 1981; Heim 1982), another analysis of BP's appeared that made quite different assumptions about their character. The most detailed proposals are to be found in Krifka and Gerstner-Link (1986), Wilkinson (1991), Kratzer (1995, initially written in 1989) and Diesing (1992). On this type of analysis, BP's are always indefinite noun phrases (and not like names) whose contribution to the meaning of the whole is a predicate condition with a free variable in it. Thus, roughly, a BP like stars in any context would be interpreted as star(x), much like the indefinite singular noun phrase a star would if one set aside the plurality (or the definite the star(s), for that matter). In this theory, the quantificational force associated with a BP in a given sentence is, as before, provided by syntactic/semantic elements outside the noun phrase itself. The mechanisms providing for this differ from the earlier analysis, some attributable directly to the DRT framework itself. One very fertile version is Diesing (1992) and related work.
In this theory, there is a simple algorithm for determining how the free variable introduced by the noun phrase gets bound: if (at LF) an NP is found within the VP of the sentence, it gets bound by an existential quantifier ("existential closure") and mapped to a nuclear scope; and if it appears in the IP of the sentence, it gets bound by something else and appears in a restrictor. That "something else," in the case of quantified noun phrases that have undergone QR, would be the quantifier expression (e.g., the universal all in the noun phrase all men), but there are other possible binders as well. For instance, adverbs of quantification, as found in (7) below, can bind the free variable in the (interpretation of the) subject noun phrase linguists, appearing at LF in the IP of the sentence.
(7) Linguists are often good musicians
For a sentence like (8), in which there is no overt element to provide binding, a generic operator GEN is usually assumed, of the sort outlined in Krifka et al. (1995) (see also Krifka 1987, and, earlier, Farkas and Sugioka 1983), to serve as a binder for the variable introduced by the BP subject noun phrase:
(8) Linguists like to read
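
The binding procedure just described can be mimicked with a toy function. This is a minimal sketch, not Diesing's formal system: the class BareNP, the function choose_binder, and the string labels are hypothetical names introduced here, the LF position of each noun phrase is simply stipulated as input, and quantified noun phrases that undergo QR are left out.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class BareNP:
    """A bare plural reduced to a predicate with a free variable, e.g. linguist(x)."""
    predicate: str
    lf_position: str  # "VP" or "IP": where the NP sits at LF (stipulated here)


def choose_binder(np: BareNP, adverb: Optional[str] = None) -> Tuple[str, str]:
    """Return (binder, partition) for the free variable introduced by np.

    VP-internal NPs undergo existential closure and land in the nuclear scope.
    NPs in IP land in the restrictor and are bound by an overt adverb of
    quantification if there is one, otherwise by the generic operator GEN.
    """
    if np.lf_position == "VP":
        return ("EXISTS", "nuclear scope")
    if np.lf_position == "IP":
        return (adverb.upper() if adverb else "GEN", "restrictor")
    raise ValueError(f"unknown LF position: {np.lf_position!r}")


# (7) Linguists are often good musicians: subject in IP, bound by the adverb.
print(choose_binder(BareNP("linguist", "IP"), adverb="often"))
# (8) Linguists like to read: subject in IP, no overt binder, so GEN.
print(choose_binder(BareNP("linguist", "IP")))
# A VP-internal bare plural, as with the stage-level predicate in (1a).
print(choose_binder(BareNP("curious_person", "VP")))
```

The point of the sketch is only that the noun phrase itself carries no quantificational force; the binder is read off its position and its surroundings.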
The stage-level/individual-level contrast in this framework gets indirectly reflected in whether the subject and other arguments of the predicate require the argument to appear outside the VP at the level of LF, or within it. This analysis has both been supported (e.g., Longobardi 2000) and questioned (e.g., de Hoop 1996; Bobaljik and Jonas 1996) on empirical grounds.
There is something of an obvious smallish cost associated with accommodating BP's to indefinites within a DRT-type framework: you give up a unified analysis. In these treatments a distinction is fairly systematically drawn between BP's that can be treated as indefinites (as predicates) in the DRT framework and those that are subjects of kind-level predicates (e.g., in (3)), in which case the BP's are kind-denoting and cannot convincingly be treated as predicates of individuals (or groups of individuals). Wilkinson (1991) in particular wishes to argue that this is an acceptable outcome. Further, one does not give up on the idea that many instances of BP's in generic sentences with universal-like readings are syntactically and semantically identical to BP's interpreted existentially. In the extensive summary paper of Krifka et al. (1995), the point of view espoused by Krifka & Gerstner-Link, Kratzer, Wilkinson, Diesing, and others was presented and treated as the "center-of-opinion" and prevailing view (even if there may have been some very minor split of opinion among the numerous co-authors).
However, looking at things in this way also has some distinct advantages. For instance, in the Carlson (1977) analysis, the fact that indefinite singulars in English and other languages may also have generic readings along with their usual existential readings, as in (9), takes a bit of extra work. But within the DRT framework coupled with some of the principles mentioned above (along with a few others), this appears to fall out naturally and, in fact, would be a bit hard to prevent given these assumptions.
(9) a. A curious person knocked on the door (existential)
    b. A curious person likes to travel (all curious people, curious people in general)
So this approach certainly has its advantages, though other alternatives are clearly possible (e.g., Cohen 2000). However, the DRT approach exemplified in Diesing and elsewhere does not, to this point, take account of certain features of bare plurals that other analyses have focused attention on.
4. Some additional facts about English BP's
For example, the similarities noted earlier between bare plurals and proper names (the other side of the coin) haven't been dealt with. And long-noted scoping facts about existential readings of bare plurals have also tended to get set aside. It has been observed, and generally agreed, that bare plurals exhibit only narrowest-scope readings, in contrast to overt indefinites, which exhibit variable scope. For instance, a sentence like (10a) does not mean there are specific shoes that are being sought; nor does (10b) have a meaning equivalent to "There are some cows that are not in the garden", thereby allowing some (other) cows to be there:
(10) a. Mary is looking for shoes
     b. Cows are not in the garden
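
The scope contrast for (10b) can be spelled out schematically. The rendering below is a simplified first-order sketch with assumed predicate names (cow, in-garden) that ignores plurality and tense; it is meant only to show which of the two logically possible readings is the one the bare plural has.

```latex
% Sketch only: predicate names are assumptions; plurality and tense are ignored.
\begin{align*}
\text{narrow scope (the only reading of (10b)):}\quad
  & \neg\,\exists x\,[\mathit{cow}(x) \wedge \mathit{in\text{-}garden}(x)] \\
\text{wide scope (unavailable for the BP):}\quad
  & \exists x\,[\mathit{cow}(x) \wedge \neg\,\mathit{in\text{-}garden}(x)]
\end{align*}
```

The second formula corresponds to the paraphrase "There are some cows that are not in the garden", which is compatible with other cows being there.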
These and other unexpected properties of BP's on their existential readings are missed on any analysis such as this which equates BP empty determiners with the indefinite plural. One particular issue that has received only minor attention in the literature (see for instance Longobardi 1994) is whether BP's are real contrasting plurals in the sense of excluding singular objects from their denotations. It appears to make some sense, at least, to claim that a question like "Are there holes in the wall?" is truly answerable with "Yes" under the circumstance where just one hole is in the wall and no more. If this is so, it argues that BP's are not indefinite plurals that stand in contrast to the indefinite singular, but rather forms whose interpretation encompasses both.
There are several other aspects of the interpretation of BP's that remain nearer the periphery of research, but which have arisen in the course of this research, and which have motivated more detailed examination of BP's. For instance, Condoravdi (1994) has successfully focused attention on examples where bare plurals appear to be interpreted existentially, yet appear as subjects of individual-level predicates.
(11) (There was a ghost haunting campus.) Students were aware of this danger
As Condoravdi points out, this is not a simple existential statement, as (11) above does not mean the same as (12):
(12) (There was a ghost haunting campus.) There were students who were aware of this danger
But this is interpreted much more like a sentence with a definite article:
(13) ... The students were aware of this danger
Such "functional" readings, as Condoravdi calls them, can be teased apart from truly generic readings; she ultimately argues that there is an extensional generic reading, the functional reading, that stands alongside the generic and existential readings. É. Kiss (1998) also points out some facts about bare plurals when focused, in that they can take on purely existential readings, unlike their unfocused counterparts. So, for instance, in (14):
(14) GIRLS know mathematics the best in my school
it can mean that those students who know it best are among the girls in the school; it may also be read generically, as about "all" girls. É. Kiss argues that this possibility of interpretation results from the fact that something must be interpreted specifically in order to be topicalized (or contrastively focused). A further fact about bare plurals, noted in Longobardi (1994, 2000) though also recognized (but not accounted for) in Carlson (1980), is that when a relative clause or other postnominal modifier (in English) is appended, an existential interpretation may arise where none was possible before. So, for instance, with:
(15) Neighbors are tall
only the (slightly implausible) generic reading seems possible, but in (16):
(16) a. Neighbors of mine are tall
     b. Neighbors that live just down the block are tall
an existential reading may appear (along with generic readings of varying degrees of plausibility). Similar facts obtain with singular indefinites in English:
(17) a. A neighbor is tall (generic only)
     b. A neighbor of mine is tall (both)
     c. A neighbor that lives just down the block is tall (both)
If we change the type of relative clause in the examples above, the existential reading seems to disappear:
(18) a. Neighbors that eat lots of vegetables are tall
     b. Neighbors from Scotland are tall
This stands in contrast to what occurs with the indefinite singular, where the existential reading appears to remain a possibility:
(19) a. A neighbor that eats a lot of vegetables is tall
     b. A neighbor from Scotland is tall
The types of postnominal modifiers that allow for existential readings are, intuitively, those that locate the corresponding individuals in time and/or space. Under most circumstances, a BP will, as noted above, exhibit most clearly only a narrow scope reading. However, with an appropriate postnominal modifier, not only is an existential reading possible, but the NP also can exhibit scopal properties just like an indefinite singular. Compare, for instance:
(20) a. John is looking for old books (narrow scope only)
     b. John is looking for old books that he forgot to return to the library (narrow or wide scope)
In the second sentence, but not in the first, the object of John's search can be a specific set of books; both sentences have a clear narrow-scope reading. Again, these facts are different from what we observe with indefinite singulars in corresponding cases. These facts are discussed in more depth by Chierchia (1998b). One further lingering fact, noted by Barbara Partee (1985), is that when bare plurals (though not mass terms, in this instance) function as "dependent plurals", they show scoping effects as well. Exactly what the facts are, and how all these relate to each other, has not yet been widely examined, but there is now a rich enough set of data and sufficient theoretical development to support growing work in this somewhat obscure area.
5. BP's in Romance (and Germanic)
Somewhat ironically, some of the most interesting work on BP's comes from consideration of languages which don't have lots of them. The most highly developed body of literature is on Romance, especially Spanish and Italian, which have fewer BP's than English, and French (which has virtually none). To foreshadow somewhat, the problem raised by the "kinds" analysis as well as by the more commonly assumed indefinites analysis is that one would expect any language with BP's to exhibit them fairly freely, as in Germanic, and for BP's to have both generic and existential readings. However, consideration of other languages shows this is not always the case. What has resulted thus far from this line of research has been a return to a more sophisticated "kinds" analysis, which nonetheless makes critical use of the insights of the theory of indefiniteness. One such analysis is found in de Swart (1993), who bases her analysis on facts from English and French, and others I discuss below. (Here, as above, I can only point to the individual works and cannot do full justice to their scope; they contain a great deal more than the few facts presented here.)
It has been known for some time that in Spanish, the distribution and interpretation of BP's is limited. Contreras (1986) notes such facts as these:
(21) a. Quiero cafe
        want-1sg coffee
        'I want coffee'
     b. El cafe me gusta
        def coffee me pleases
        'I like coffee'
     c. *Me gusta cafe
        me pleases coffee (subj)
     d. *Cafe me gusta
        coffee me pleases
     e. Hablamos con amigos
        speak-1pl with friends
        'We spoke with some friends'
BP's may not occur as subjects of non-ergative verbs, whether post- or preverbal, and in general cannot be interpreted generically, only existentially; further, when contrastively stressed, BP's may appear. The type of account offered by Contreras centers on the notion of proper government as applied to an NP with an empty determiner position; Torrego (1989) gives an account very similar in spirit. That is, BP's are claimed to have a determiner position represented in the syntax that, like other empty categories, requires proper government, and the N within the NP (or DP) itself cannot govern that position. The account of the data above, and much more, centers around defining government and the syntactic structures of Spanish in such a way as to account for patterns such as those found in (21). Governing items include verbs and prepositions, but subjects (as in (21c,d)) have no governor, and the empty determiner position remains unlicensed, resulting in ungrammaticality. Note that it is necessary, on this account, to have an actual empty D position in the DP; the data from English on any of the accounts reviewed above do not motivate such an analysis, as the appearance of BP's in English is basically unrestricted.
On the semantic side, Laca (1990) presents a number of keen observations about how one expresses generic objects in such a language, which has restricted occurrences of BP's. Spanish generally uses the definite article to express what we are calling the generic reading (though Laca argues the informational notion of "inclusive" presents a better understanding), and the bare plural form is generally reserved for existential (= "non-inclusive") readings. Consider, for instance, the ambiguity inherent in the English:
(22) The Gwamba-Mamba worship bears
The preferred reading for this is that the species represents the object of worship; however, there is also a reading where there are some specific bears they keep caged up, which they worship to the exclusion of other bears. This is the reading most favored for:
(23) The Gwamba-Mamba worship idols
That is, the object of worship is some specific group of idols, not idols in general (though this is still a possible reading). The following Spanish sentences express these preferred readings:
(24) Los G-M adoran a los osos
     def G-M worship (to) def bears
     'The G-M worship bears (in general)'
(25) Los G-M adoran ídolos
     def G-M worship idols
     'The G-M worship (some) idols'
But this distinction between definites and bare plurals is not limited to intensionalizing verbs such as "worship". So, for example, both are possible after an extensional verb like "chase", with differential effects:
(26) Mi perro persigue a los gatos
     my dog chases (to) def cats
     'My dog chases cats' = 'What my dog does with cats is chase them'
(27) Mi perro persigue gatos
     my dog chases cats
     'My dog chases cats' = 'My dog has a habit of cat-chasing'
In this case, the use of the definite form is correlated with focus on the verb. Some very interesting proposals can be found in Vergnaud and Zubizarreta (1992) regarding the possibility of generic interpretations in a comparison of French and English (focusing on expressions of inalienable possession). They lay out the idea that in the DP, specific reference arises from the Determiner itself, whereas the NP is the source of type-level reference (or denotation) (Svenonius 1996 recasts this as a distinction based on whether reference to context is available). Vergnaud and Zubizarreta express this as their "Correspondence Law" (p. 612):
(28) When a DP or an NP denotes, the DP denotes a token and the NP denotes a type
Phrases exhibiting determiners that nonetheless denote types require a notion of 'expletive determiner', that is, a determiner that appears without semantic effect, except to allow the denotation of the NP to serve as the whole DP's denotation. The claim is that French allows expletive determiners (the definite article, in most cases), whereas English does not, and that this accounts for many differences between French and English discussed in the article.
Longobardi (1994), while focusing on proper names in Italian, presents an analysis with implications for the use of bare plurals which draws inspiration from Delfitto and Schroten (1992), and can be seen as a reinterpretation of some of the Vergnaud and Zubizarreta facts. Longobardi takes some of the crucial assumptions also presented in Contreras, and focuses on data from Italian. Like Spanish, Italian has some bare plurals in about the same restricted positions which are (almost) always interpreted existentially (and, according to Chierchia, have a slightly literary flavor to them); the generic reading is conveyed with a definite article or a singular indefinite article in some cases, as in French and Spanish. Longobardi's main thesis is that in the case of proper names (and pronouns) there is movement within the DP from the N position into the empty D position, resulting in a structure which does not have an empty D that must be governed externally. This means that proper names, like (nearly) any noun phrase with an overt determiner, will appear in any position a DP may also occupy.
(29) [DP [D e ] [NP N ]] (with N raising into the empty D position)
Evidence comes largely from facts about Italian word-order. In the case of empty D's in Italian, one gets about the right results by assuming that common nouns, lacking reference of their own, do not move into the D position like proper names can, leaving only those positions where the D is governed, giving rise to a narrow-scope existential reading. But then, what of English (Germanic) BP's? In these languages there is little word-order evidence of the type to be found in Italian regarding movement of names and pronouns into D, and BP's may appear in any DP position at all; further, BP's can have generic interpretations. Longobardi's proposal here is that in Germanic, determinerless common nouns can move into the empty D position at the level of LF, which gives rise to a referential, generic reading for BP's; failure to move into the D position will result in a "default" instance of existential interpretation. In this case, the presumption would need to be that the existential interpretation is available only for those positions which in Germanic would count as governed positions, generic readings being the only ones available in ungoverned positions. Thus, to speculate for a moment, the IP position that Diesing has suggested for generic subjects would probably count as an ungoverned position, allowing only the generic reading; the VP-internal position Diesing assumes for stage-level subjects would be governed, and thus would allow for existential readings, and generic readings as well, unless otherwise restricted.
The notion of an expletive determiner is also developed in Brugger (1993), who focuses on German as well as Italian, comparing them both with English. Brugger argues that the definite article in German, but not in English (at least for the constructions considered), can be expletive. One basic fact pointed out is that while English plural definites cannot be interpreted generically (or if so, only marginally), in German this is an entirely natural way of expressing genericity. Thus, (30) has a generic reading, while its English counterpart does not, referring instead only to some contextually determined set of elephants, which is also a possibility for the German.
(30) ...die Elefanten wertvolle Zähne haben
     the elephants precious teeth have
     'Elephants have precious teeth'
German does have bare plurals, like English, but these cannot occur with true kind-level predicates (31), though they occur fairly freely in generic sentences (32):
(31) *... Dinosaurier dabei sind auszusterben
     dinosaurs PRT become extinct
     (OK with die Dinosaurier)
(32) ... Elefanten wertvolle Zähne haben
     elephants precious teeth have
     'Elephants have precious teeth'
Brugger concludes that German (and Dutch) BP's cannot be kind-denoting; the definite plurals, however, have an expletive determiner in them, which fills the D position which would otherwise have to be bound by another operator. The possibility of the English definite functioning this way is precluded because the English definite article carries no grammatical features (such as case, number, gender), and hence must function semantically.
The most comprehensive attempt to deal with both the syntax and the formal semantics of BP's in Romance and Germanic is found in Gennaro Chierchia's work (1998a, 1998b). Chierchia takes as his starting point a "kinds" approach that he also developed in his dissertation (Chierchia 1988, written in 1984), which involves a formal semantics making use of
type-shifting, chiefly as a way of characterizing the meanings of nominalizations more generally. Type-shifting is also a means of resolving type mismatches between function and argument (Partee, 1987). Chierchia takes the point of view, pace Longobardi, that NP's can, subject to parameterization, function as arguments just as DP's can: that is, on his analysis there is no empty determiner slot in the case of BP's or determinerless mass terms, on this parameter setting. Another parameter setting, however, takes NP's to be predicational, and so NP's cannot enter into argument positions without being a part of a DP (this would be the case of French, though for Spanish/Italian an empty D is posited). One of the main features of his approach is that for Germanic and Romance (and many other languages), the semantics of the type-shifting itself demands that the NP be either plural or a mass term; type-shifting defined on singular count nouns will not yield a kind, so this rules out the appearance of bare singulars (for the most part) in these languages. One main point of Chierchia's analysis is to account for the scopelessness of the existential reading of BP's, and to answer some questions raised by the prevailing view that generics should be analyzed as indefinites: (a) why would languages consistently use the same device (BP's or, in many languages, determinerless singulars) to express both kind-reference and existential indefiniteness? (b) why, on the indefinites view, would there be an ambiguity posited between kind reference and a weak indefinite reading; why not a strong one? In Chierchia's account, very briefly, the scopelessness is the result of an existential quantifier introduced in situ by type-shifting to resolve a type mismatch between predicate and argument. The approach taken here also has the merit of resolving some objections to the kinds analysis that Carlson's original Montague grammar analysis left lingering.
For instance, the fact that BP's sit well in existential "there" sentences seems at odds with their referential treatment, as proper names, definites, and other similarly referential phrases are generally excluded. However, as McNally (1992, 1998) points out, "kind" phrases do not seem to obey the definiteness restriction ("There was every kind of animal in the garden" vs. ??"There was every guest in the garden"). McNally (1998) proposes, along with Chierchia, that the definiteness restriction should be supplanted by a semantics for existentials that requires the denotation of the NP to be something that can have individual instantiation, like kinds.
In unpublished work, Delfitto (1998) has explored many of these same issues, again with emphasis on Romance and Germanic, and also arguing for a unitary analysis of BP's. One of Delfitto's main theses is that the existential readings of BP's arise from an existential quantifier associated with the event-argument position of verbs, and that genericity arises not from the presence of a GEN-type quantifier, but from the aspectual character of the sentence itself. Possibly taking a cue from Diesing and Longobardi, Delfitto proposes that generic sentences involve an aspectual structure which requires one of its arguments (typically the subject) to be marked as an external argument (in Diesing's terms, in the IP). The effect is to create a predicational structure which demands that type-shifting take place on the external argument, whereby it gets interpreted intensionally, as a property set. Delfitto also deals with the apparent scopelessness of existential readings of BP's and the cases where they take on scopal properties, also discussed in Chierchia. One particular issue Delfitto wishes to deal with is the fact that the presence of a modifier, such as a relative clause, can make for acceptable BP's in cases where a bare noun seems unacceptable. For instance:
(33) a. *Cane creano guai seri
        dogs create troubles serious
     b. Cane di grosse dimensioni creano guai seri
        dogs of large size create troubles serious
        'Dogs of great size create serious trouble'
In order to account for such differences, an empty D position is assumed within a minimalist framework; the issue is what types of features are transmitted to the D position. It is proposed that when there is a postnominal modifier, the system works in such a way that the D position remains devoid of nominal features, thus allowing for the D to be "identified" and interpreted. Delfitto in this respect presents an alternative to the somewhat simpler analysis offered by Longobardi. Delfitto offers a perspective on BP's that aims for a unitary analysis, and, further, does not rely upon the standardly assumed GEN operator, nor on any existential quantifiers over and above the one to bind event-arguments in VP's.
This work, along with Chierchia's, is much in the same vein as work by Dobrovie-Sorin (1996) and Dobrovie-Sorin and Laca (1997), who aim for a unitary analysis of BP's in English as well as Romance which again treats them as distinct from indefinites, and, additionally, incorporates an analysis of Condoravdi's functional readings of BP's. One of the main contributions of this work is to emphasize that the stage/individual-level contrast is cross-cut by another relevant contrast, that of spatio-temporal localization, and it is this dimension which determines the possibility of existential readings. The evidence comes from stative (adjectival) stage-level predicates, which appear capable of taking only generic subjects. So, for instance, emotional-type predicates appear to be stage-level, but do not readily accept localizers:
(34) a. ??During Chomsky's lecture, top-models were hungry/tired/drunk
     b. ??Look! Top-models are drunk/hungry in the street.
     c. ??Children are nasty/sick/happy in school (no existential reading; a frequentative/conditional)
     d. ??Where is John happy/nasty/angry? (only frequentative/conditional reading)
The emphasis on localization is highly reminiscent of many analyses of "stage-level" predicates (Kratzer 1995; McNally 1995, to mention but two), but Dobrovie-Sorin and Laca put this common theme into a new and richer setting which addresses the types of issues also discussed extensively by Fernald (1994).
6. A little on bare singulars
Less work has focused on singular count common nouns lacking determiners, as in the Romance and Germanic languages these do not appear systematically in argument positions (though they may in vocative and predicative constructions). However, this does not mean they are totally lacking. English has them sporadically ("I saw it on television" or, as Richard Oehrle pointed out to me, "The special relation between doctor and patient deserves special legal protection."). In Scandinavian languages, they appear quite a bit more systematically. Borthen (1998) discusses these in Norwegian. Here we find, for instance:
(35) a. Jeg kjører bil
        I drive car
        'I drive a car'
     b. Petter spiser helst med skje
        Petter eats rather with spoon
        'Petter would rather eat with a spoon'
     c. Jeg har bestilt billett
        I have ordered ticket
        'I ordered a ticket'
However, bare singulars may not appear in many other instances:
(36) a. *Jeg ødela datamaskin
        I destroyed computer
     b. *Bil kjører bortover veien
        car drives along road-the
Borthen considers the semantics of these bare singulars where they may occur, detailing how, on almost anyone's analysis, they must be regarded as non-specific (i.e. the observations are similar to those made in Enç 1991 for determinerless bare singulars in Turkish); further, they exhibit the same sort of scopelessness as (most instances of) BP's. Borthen does not find a specific syntactic mechanism to account for the distribution and interpretation of bare singulars in Norwegian, in the end settling on a semantic account which limits their appearance to argument positions exhibiting only certain semantic roles, which are enumerated and motivated in the account.
Work on bare singulars in other languages that likewise have articles and/or plurality has yielded a very similar pattern of syntactic and semantic observations. The work on Albanian (Kallulli 1996, 1999) shows a pattern there strikingly similar to the facts presented above. Bare singulars in Brazilian Portuguese have been investigated in detail by Schmidt and Munn (1999) and Munn and Schmidt (1999, 2000), where a similar set of facts seems to fall out, though there bare singulars also may function as subjects. Dayal (1999) examines bare singulars in Hindi, which are restricted to object position, and again the same array of semantic observations holds. A quite different set of observations about bare singulars, though, appears to hold for English (e.g., "He went to prison"; "She is at school"), as discussed in detail by Stvan (1998).
It is widely recognized that bare singulars occur in any of the large number of languages which lack overt definite and indefinite articles, but their study in the context of the issues raised within the framework of BP's has lagged somewhat in comparison. Chierchia's work makes an attempt to deal with such languages as the Slavic languages, Japanese, and Chinese.
A partial effort in this direction for Slavic languages can be found in Filip (1993) and elsewhere, but the most detailed effort to date that I am aware of is to be found in Cheng and Sybesma (1999), who undertake a detailed comparison between Mandarin and Cantonese within the context of these issues (see also Gelman & Tardif 1997 and Basilico 1998 for some observations regarding Chinese bare nouns as well). Mandarin Chinese, like many other languages, lacks plural morphology and articles, so many noun phrases have the appearance of bare singulars. These may be interpreted as definite, indefinite (existential), or as generic, but not all these interpretations may appear in all argument positions. So, for instance, in preverbal position, these noun phrases cannot (typically) be interpreted as indefinites, only as definites or generically; but postverbally an indefinite (and non-specific) reading may emerge.
(37) a. Gou jintian tebie tinghua
        dog today very obedient
        'The dog was very obedient today'
        But NOT: 'A dog was very obedient today'
     b. Gou ai chi rou
        dogs love eat meat
        'Dogs (in general) like to eat meat'
     c. Hufei mai shu qu le
        Hufei buy book go PRT
        'Hufei went to buy a book/some books'
In Cantonese, however, bare nouns cannot be interpreted as definites. Definiteness is expressed by the use of a classifier (CL); as a consequence, bare nouns cannot be interpreted as definites preverbally, though they may be interpreted generically.
(38) a. *Gau soeng gwo maalou
        dog want cross road
        (Not possible for 'The dog wants to cross the road')
     b. Zek gau zung-ji sek juk
        CL dog like eat meat
        'The dog (NOT dogs in general) likes to eat meat'
     c. Gau zung-ji sek juk
        dog like eat meat
        'Dogs like to eat meat'
Mandarin, Cheng and Sybesma argue, has CL+Noun phrases as well; however, there, the interpretation is always indefinite, and never definite as in Cantonese. The thrust of the work is to argue that classifiers in both Chinese languages function very much like D positions, in that any bare noun is a part of a classifier phrase, the basic structure being:
(39) [ClP Cl [NP N]]
That is, bare nouns have more structure than just the noun (or NP) itself. To achieve a definite interpretation, the N moves into the empty Cl position in Mandarin (similar to the Longobardi analysis of proper names). As a consequence, the empty position is filled and it need not be governed, which allows it to appear in preverbal position. In essence, the indefinite interpretation arises when the N does not move into Cl, and as a result the empty position must be governed, as in the Spanish/Italian analyses with an empty D position that must also be governed. In Cantonese, on the other hand, the fact that an overt Cl is used to express definiteness precludes the possibility of also using the covert strategy of movement into Cl to express definiteness (this also follows a suggestion of Chierchia), and as a result the empty Cl is interpreted indefinitely if governed, or if the N moves into Cl it may also be interpreted generically. Presence of an overt Cl blocks a generic interpretation. This work certainly sets the stage for future work in languages lacking articles, making use of the body of literature discussed above.
Before closing, it is worthwhile mentioning some other recent work. Brockett (1991) contains a detailed examination of Japanese; we also find Dayal (1992) on the situation in Hindi and Porterfield & Srivastav (1988) on the contrast between Hindi and Indonesian. Chung (2000), in a reply to Chierchia, also examines Indonesian in detail. Petronio (1995) discusses ASL (which has no plurality or articles; see also the other papers in the same volume). Greenberg (1994) comprehensively presents facts about Hebrew, and É. Kiss (1998), Hungarian. Bittner (1994) and van Geenhoven (1998) discuss West Greenlandic incorporated nominals within this tradition, where one finds the most detailed semantic observations about these structures (along with the work on Hindi bare singulars, which are arguably incorporated forms, mentioned above). It appears that incorporated nominals (chiefly, objects of verbs) follow the general pattern of semantic interpretation characteristic of BP's and bare singulars as well, chiefly in having nearly always weak indefinite existential and number-neutral interpretations. The same range of interpretations also appears to hold for "pseudo-incorporated" forms (Massam 2001). This suggests that incorporated nominals and BP's have a lot in common that deserves closer examination, as argued most pointedly by van Geenhoven (1995).
In some languages, such as West Greenlandic, incorporated nominals can be modified or quantified from outside the word; this gives rise to a discontinuous syntactic form (a "split" construction) which likewise raises interesting questions about the semantics of determinerless nouns even in languages which do not have incorporation (Diesing 1992; Beerman 1997; also Geurts 1996).
Perhaps the most comprehensive survey of the range of nominal forms used to express genericity is to be found in Gerstner-Link (1998), who compares forty disparate languages and summarizes the results in a series of proposed universals. The patterns she finds are largely in keeping with the detailed work on a more limited set of languages (she uses German as her base case), but there are some surprises: for example, not all languages with indefinites can use indefinites generically, but definites can be consistently used that way.
7. A codicil on stages
I would be remiss not to bring up one closely related issue before concluding. The Carlson (1977, 1980) analysis makes use of a construct of "stages", temporally-restricted portions of individuals, as a means of characterizing existential readings of bare plurals. Most researchers, including the majority of those discussed above, have found such constructs dispensable, using existential quantification over individuals instead for the indefinite reading. This results in equivalent truth-conditions (in most cases) but also in an ontologically more parsimonious framework (though many make use of an event semantics that introduces very similar types of entities into the model). However, a good number of researchers have found that stages themselves are useful constructs in their own right, as in Stump's (1981) analysis of constructions such as "an occasional sailor walked by". The literature making use of stages is more scattered than the literature reviewed above, but the issue occasionally surfaces when temporal restrictions are examined more closely. One recent comprehensive discussion is to be found in Musan (1995), who builds on the work of Enç (1981). Yoon (1998) considers the semantics of English indefinite NP's with a proper name modified by an adjective.
(40) ... a handcuffed Jones protested as two Columbus police officers pushed him into the Franklin County jail
Yoon notes, first of all, that the adjectives appearing in this construction are stage-level and not individual-level adjectives ("a startled Kato Kaelin" vs. ??"an intelligent Kato Kaelin"); further, that despite their indefiniteness, they appear to make reference to individuals already introduced into the discourse. However, if the NP refers to a stage of an individual, then that stage itself constitutes a novel entity in the discourse, hence the indefinite. Demirdache (1997a, 1997b), in some provocative work, has also employed and defended stages in the analysis of St'at'imcets (Lillooet Salish) noun phrases in order to account for their temporal restrictedness. Lin (1999) has proposed that stages be countenanced in order to account for the semantics of shenme 'what' in donkey-type conditional sentences in Chinese (also examined in great detail by Cheng and Huang (1996), though they focus on the semantics of 'who', which turns out to have some different properties). The stage/individual contrast has also been invoked in unpublished work to form an account of the semantic distinction between the Japanese anaphoric expressions sore vs. kare. Carlson (1991) discusses the use of stages for the analysis of certain demonstratives in English; see Büring (1998) for an interesting and closely related discussion.
8. Outcomes and conclusions
Researchers now have on hand a large and sophisticated set of both data and analyses to draw from in considering the appropriate syntax and semantics of BP's. This presents us with an excellent base from which to work on this and related problems from a variety of perspectives, in a variety of the world's languages. On some matters, there is quite solid general agreement. One is that BP's on both existential and generic readings should be analyzed as having something basic in common. Another is that very close attention needs to be paid to issues of specificity and scoping for the indefinite readings. We have also seen a general trend towards taking elements of both the theory of indefinites and the kinds analysis, and trying to preserve something like a unitary analysis of BP's across languages: few if any of the researchers noted, in particular, defend an analysis in which there are multiple null D's. The idea of an empty D position that must be properly governed in some languages is also widely appealing, as is the idea that there is a connection between movement into D and definiteness/genericity.
An area that can use closer scrutiny, aside from extending research to a broader range of languages, is a more careful understanding of the relation between definite singular, definite plural, BP, singular indefinite generics, and incorporated nominals, and the relation of these expressions to overtly expressive "kind" NP's in general, in languages which allow them. This has not been entirely ignored by any means, but not enough has been done to create a consensus opinion. One also does not find convincing analyses (to my mind) of examples such as (2) above, where both a generic and an indefinite reading may appear, and I have to regard this as an open area of research. There has also not been quite enough work done in light of Condoravdi's observations about functional readings of BP's.
However, the area where one finds the most bewildering variety of proposals is accounting for the source of the existential quantification in the case of existentially-interpreted BP's. A sampling includes:
(41) — DRT existential closure (Diesing 1992, Krifka 1987)
     — Type-shifting (Chierchia 1998b)
     — Quantification over stages (Carlson 1980)
     — Binding of a situation variable (de Swart 1993; Delfitto 1997)
     — Default interpretation of an empty D (Longobardi 1994)
     — Binding due to sentence information structure (Glasbey 1993)
     — Location-argument binding (Dobrovie-Sorin 1996)
     — Categorical/thetic structure (Basilico 1998; Ladusaw 1994; Kuroda 1972)
     — Specificity (Condoravdi 1994)
     — Mapping from properties to propositions (Carlson 2000)
     — Referential anchoring (Löbner 2000)
and so forth. The variety here is perhaps best understood as a reflection of the differing theoretical assumptions and/or machinery that are available, but it certainly reflects the wide-open state of the area as it stands. The question is, how much difference do these various assumptions make? In some cases, they appear to make little difference. For instance, if one attributes VP-level existential quantification to existential closure, as Diesing suggests, or to an existential quantifier connected with the event structure, as Delfitto suggests, the effect on BP's is about the same. Type-shifting (Chierchia) quite clearly locates the source of the existential quantifier at the boundary between the NP and the predicate it is combining with, but then Carlson's existential quantification over stages can be looked upon in much the same way, though the ontologies differ. Longobardi's "default" existential quantifier is very much in the same vein. At the current state of research, there is no strong consensus about which source of existential quantification is correct (nor does there seem to be one about the precise source of generic readings), so it is possible to focus on the correlated structures within the DP itself and still make very productive contributions. As research continues and the theoretical issues become increasingly sharpened, however, I expect people are going to have increasing reason to choose among them.
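To make one of the options in (41) concrete, the type-shifting source can be sketched in a single rule. What follows is a hedged reconstruction of the rule usually cited from Chierchia (1998b) as Derived Kind Predication, not a formula taken from the discussion above; the notation (the raised cup for the shift from a kind to the property of being one of its instances, and the constants dogs, see, and j) is an assumption introduced here purely for illustration.

```latex
% Derived Kind Predication (sketch): if P applies to ordinary objects
% and k denotes a kind, the mismatch is repaired by existential
% quantification over instances of k.
P(k) \;=\; \exists x\,[\,{}^{\cup}k(x) \wedge P(x)\,]

% Illustration with negation ("John didn't see dogs"): the quantifier is
% introduced where predicate and argument combine, so it stays below
% the negation.
\neg\,\mathit{see}(j, \mathit{dogs}) \;=\;
  \neg\,\exists x\,[\,{}^{\cup}\mathit{dogs}(x) \wedge \mathit{see}(j, x)\,]
```

On this sketch the scopelessness of the existential reading follows from where the quantifier is introduced: it enters locally, at the point where the predicate meets its kind-denoting argument, and so has no opportunity to take scope over a higher operator.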
A Bare Plural
Bibliography
Abney, Steven (1987). The English noun phrase in its sentential aspect. Massachusetts Institute of Technology Ph.D. dissertation.
van der Auwera, Johan (1980). The semantics of determiners. London: Croom Helm.
Bach, Emmon, Eloise Jelinek, Angelika Kratzer, and Barbara Partee (eds.) (1995). Quantification in natural languages. Dordrecht: Kluwer.
Barker, Chris & David Dowty (eds.) (1992). Proceedings of SALT 2. Columbus, OH: Working papers in linguistics 40, Ohio State University.
Basilico, David (1998). Object position and predication forms. Natural Language and Linguistic Theory 16, 541-95.
Beerman, Dorothee (1997). Syntactic discontinuity and predicate formation: A study in German and comparative Germanic syntax. Tilburg, the Netherlands: Tilburg Dissertations in Language Studies.
Bittner, Maria (1994). Case, scope, and binding. Dordrecht: Kluwer.
Bobaljik, Jonathan and Dianne Jonas (1996). Subject positions and the role of TP. Linguistic Inquiry 27, 195-236.
Bok-Bennema, Reineke and Peter Coopmans (eds.) (1990). Linguistics in the Netherlands. Dordrecht: Foris.
Bordelois, I., Heles Contreras, and Karen Zagona (eds.) (1986). Generative studies in Spanish syntax. Dordrecht: Foris.
Borthen, Katja (1998). Bare singulars in Norwegian. Cand. philol. thesis, Norwegian University of Science and Technology, Department of Linguistics.
Brockett, Chris (1991). Wa-marking in Japanese and the syntax and semantics of generic sentences. Ph.D. dissertation, Cornell University.
Brugger, Gerhard (1993). Generic interpretations and expletive determiners. University of Venice working papers in linguistics 3, 1-30.
Büring, Daniel (1998). Two types of identity statements. Ms, University of Cologne.
Burton-Roberts, Noel (1976). On the generic indefinite article. Language 52, 427-448.
Carlson, Gregory (1977). A unified analysis of the English bare plural. Linguistics and Philosophy, 413-457.
Carlson, Gregory (1980). Reference to kinds in English. New York: Garland.
Carlson, Gregory (1991). Ostension and perception: cases of really direct reference. Paper presented at SALT I, Cornell University.
Carlson, Gregory (1996). A note on belladonnas. Paper presented at the annual meeting of the Linguistic Society of America, San Diego.
Carlson, Gregory (2000). Weak Indefinites. Paper presented at the NP to DP Conference at the University of Antwerp; University of Rochester ms.
Carlson, Gregory & Francis Jeffry Pelletier (eds.) (1995). The generic book. Chicago: University of Chicago.
Cheng, Lisa, & C-T. James Huang (1996). Two types of donkey sentences. Natural Language Semantics 4, 121-163.
Cheng, Lisa, and Rint Sybesma (1999). Bare and not-so-bare nouns and the structure of NP. Linguistic Inquiry 30, 509-42.
Chierchia, Gennaro (1988). Topics in the syntax and semantics of infinitives and gerunds. New York: Garland.
Chierchia, Gennaro (1995). Individual level predicates as inherently generic. In Gregory Carlson & Francis Jeffry Pelletier (eds.), 176-223.
Chierchia, Gennaro (1998a). Partitives, reference to kinds, and semantic variation. In Proceedings of SALT VII. Ithaca, NY: Cornell University.
Chierchia, Gennaro (1998b). Reference to kinds across languages. Natural Language Semantics 6, 339-405.
Chung, Sandra (2000). On reference to kinds in Indonesian. Natural Language Semantics 8, 157-71.
Cohen, Ariel (2000). On the generic use of bare indefinite singulars. To appear in Christopher Pinon and Paul Dekker (eds.), Uses of indefinite expressions. Stanford: Center for the Study of Language and Information.
Cohen, Ariel, and Nomi Erteschik-Shir (1997). Topic, focus, and the interpretation of bare plurals. In Proceedings of the 11th Amsterdam Colloquium. Amsterdam: Institute for Language, Logic, and Information. 31-6.
Cohen, Ariel, and Nomi Erteschik-Shir (1999). Are bare plurals indefinite? In Francis Corblin, Carmen Dobrovie-Sorin, and J. Marandin (eds.), 99-109.
Condoravdi, Cleo (1992). Strong and weak novelty and familiarity. In Chris Barker & David Dowty (eds.), 17-37.
Condoravdi, Cleo (1994). Descriptions in context. Ph.D. dissertation, Yale University.
Contreras, Heles (1986). Spanish bare NP's and the ECP. In Ivonne Bordelois, Heles Contreras, & Karen Zagona (eds.), 25-49.
Corblin, Francis, Carmen Dobrovie-Sorin, and J. Marandin (eds.) (1999). Empirical Issues in Syntax and Semantics 2. The Hague: Theus Publishing.
Dahl, Östen (1975). On generics. In Edward L. Keenan (ed.), 99-111.
Dayal, Veneeta (1992). The singular-plural distinction in Hindi generics. In Chris Barker and David Dowty (eds.), 39-58.
Dayal, Veneeta (1999). Bare NP's, reference to kinds, and incorporation. In Proceedings of SALT 9. Ithaca, NY: Cornell University.
Dekker, Paul & Martin Stokhof (eds.) (1993). Proceedings of the ninth Amsterdam Colloquium. University of Amsterdam: ILLC/Department of Philosophy.
Diesing, Molly (1992). Indefinites. Cambridge, Mass.: MIT Press.
Delfitto, Denis (1997). Aspect, genericity, and bare plurals. Ms, University of Utrecht.
Delfitto, Denis and Jan Schroten (1991). Bare plurals and the number affix in DP. Probus 3, 155-185.
Delsing, L.O. (1993). The internal structure of noun phrases in the Scandinavian languages. Ph.D. dissertation, University of Lund.
Demirdache, Hamide (1996). The chief of the United States sentences in Lillooet Salish. Papers for the International conference on Salish and neighboring languages. 79-100.
Demirdache, Hamide (1997). Predication times in St'at'imcets Lillooet Salish. In The Syntax and Semantics of Predication, Texas Linguistic Forum, University of Texas/Austin.
Dineen, F. (ed.) (1969). Monograph series on language and linguistics 19. Washington: Georgetown University Press.
Dobrovie-Sorin, C. (1996). Types of predicates and the representation of existential readings. Ms, University of Paris 7.
Dobrovie-Sorin, Carmen & Brenda Laca (1997). On the definiteness of generic bare NP's. Paper delivered at the Institute for Advanced Studies, The Hebrew University, Jerusalem.
É. Kiss, Katalin (1994). On generic versus existential bare plurals. Ms, Hungarian Academy of Sciences, Budapest.
É. Kiss, Katalin (1998). On generic and existential bare plurals and the classification of predicates. In Susan Rothstein (ed.), 145-162.
Enç, Mürvet (1991). The semantics of specificity. Linguistic Inquiry 22.1, 1-26.
Farkas, Donka, & Yoko Sugioka (1983). Restrictive if/when clauses. Linguistics and Philosophy 6, 225-258.
Fernald, Theodore (1994). On the nonuniformity of the individual- and stage-level effects. Ph.D. dissertation, University of California at Santa Cruz.
Fernald, Theodore (2000). Generalizations in Navajo. In Theodore Fernald and Paul Platero (eds.), 51-72.
Fernald, Theodore, and Paul Platero (eds.) (2000). The Athabaskan Languages: Perspectives on a Native American language family. Oxford Studies in Anthropological Linguistics 24. Oxford: Oxford University Press.
Filip, Hana (1993). On genericity: a case study in Czech. In J. Guenther and B. Kaiser (eds.), Proceedings of the Nineteenth Meeting of the Berkeley Linguistic Society. University of California at Berkeley. 125-142.
von Fintel, Kai (1994). Restrictions on quantifier domains. Ph.D. dissertation, University of Massachusetts/Amherst.
von Fintel, Kai (1997). Bare plurals, bare conditionals, and only. Journal of Semantics 14, 1-56.
Fodor, Janet D., & Ivan Sag (1982). Referential and quantificational indefinites. Linguistics and Philosophy 5, 355-398.
van Geenhoven, Veerle (1995). Semantic incorporation: A uniform semantics of West Greenlandic noun incorporation and West Germanic bare plural configurations. In Audra Dainora et al. (eds.), Papers from the 31st regional meeting of the Chicago Linguistic Society. Chicago: Chicago Linguistic Society.
van Geenhoven, Veerle (1998). Semantic incorporation and indefinite descriptions. Stanford: CSLI Publications.
Gelman, Susan and Twila Tardif (1997). Generic noun phrases in English and Mandarin: An examination of child-directed speech. Paper presented at the LSA Annual Meeting, Chicago.
Gerstner-Link, Claudia (1998). A typological approach to generics. Ms, University of Munich.
Geurts, Bart (1996). On no. Journal of Semantics 13, 67-86.
Glasbey, Sheila (1993). Event structure in natural language discourse. Ph.D. dissertation, University of Edinburgh.
Greenberg, Yael (1994). Hebrew nominal sentences and the stage/individual level distinction. M.A. thesis, Bar-Ilan University, Israel.
Greenberg, Yael (1998). An overt syntactic marker of genericity in Hebrew. In Susan Rothstein (ed.), 125-143.
Groenendijk, Jeroen, Theo Janssen, & Martin Stokhof (eds.) (1981). Formal methods in the study of language. Mathematisch Centrum, Amsterdam: Mathematical Center Tracts, 135.
Groenendijk, Jeroen & Martin Stokhof (eds.) (1987). Studies in discourse representation theory and the theory of generalized quantifiers. Dordrecht: Foris.
Heim, Irene (1984). The semantics of definite and indefinite noun phrases. Ph.D. dissertation, University of Massachusetts/Amherst.
de Hoop, Helen (1996). Case configuration and noun phrase interpretation. Outstanding Dissertations in Linguistics. New York/London: Garland Publishing.
de Hoop, Helen & Henriette de Swart (1990). Indefinite objects. In R. Bok-Bennema & P. Coopmans (eds.), 91-100.
Kallulli, Dalina (1996). Bare singulars and bare plurals: mapping syntax and semantics. In Proceedings of ConSole 5. University of Leiden Press.
Kallulli, Dalina (1999). The comparative syntax of Albanian: On the contribution of syntactic types to propositional interpretation. University of Durham Ph.D. thesis.
Kamp, Hans (1981). A theory of truth and semantic representation. In Jeroen Groenendijk, Theo Janssen, & Martin Stokhof (eds.), 277-322.
Katz, E. Graham (1995). Stativity, genericity, and temporal reference. Ph.D. dissertation, University of Rochester.
Keenan, Edward L. (ed.) (1975). Formal semantics of natural language. Cambridge: Cambridge University Press.
Krifka, Manfred (1987). An outline of genericity, partly in collaboration with C. Gerstner. University of Tübingen: SNS-Bericht 87-23.
Krifka, Manfred (1992). Definite NP's aren't quantifiers. Linguistic Inquiry 23, 156-163.
Krifka, Manfred, F. Jeffry Pelletier, Gregory Carlson, Alice ter Meulen, Gennaro Chierchia, & Godehard Link (1995). Genericity: an introduction. In Gregory Carlson and Francis Jeffry Pelletier (eds.), 1-124.
Kratzer, Angelika (1980). Die Analyse des bloßen Plural bei Gregory Carlson. Linguistische Berichte 70, 47-50.
Kratzer, Angelika (1995). Stage-level and individual-level predicates. In Gregory Carlson and Francis Jeffry Pelletier (eds.), 125-174.
Kuroda, Sige-Yuki (1972). The categorical judgement and the thetic judgement. Foundations of Language 9, 153-185.
Laca, Brenda (1990). Generic objects: some more pieces of the puzzle. Lingua 81, 25-46.
Laca, Brenda and Liliana Tasmowski (1994). Le pluriel indéfini de l'attribut métaphorique. Lingvisticae Investigationes 18, 27-47.
Ladusaw, William (1994). Thetic and categorical, stage and individual, weak and strong. In Proceedings of SALT IV. Ithaca, NY: Cornell University Department of Modern Languages and Linguistics. 220-229.
van Langendonck, Willy (1980). Indefinites, exemplars, and kinds. In Johan van der Auwera (ed.), 211-231.
Lasersohn, Peter (1997). Bare plurals and donkey anaphora. Natural Language Semantics 5, 79-86.
Lawler, John (1973). Studies in English generics. Ann Arbor: University of Michigan Papers in Linguistics 1, 1, University of Michigan Press.
Le Pore, Ernie (ed.) (1987). New directions in semantics. London: Academic Press.
Lewis, David (1975). Adverbs of quantification. In Edward L. Keenan (ed.), 3-15.
Lin, Jo-Wang (1999). Double quantification and the meaning of shenme 'what' in Chinese bare conditionals. Linguistics and Philosophy 22, 573-93.
Löbner, Sebastian (2000). Polarity in natural language: Predication, quantification and negation in particular and characterizing sentences. Linguistics and Philosophy 23, 213-308.
Longobardi, Giuseppe (1994). Reference and proper names: a theory of N-movement in syntax and logical form. Linguistic Inquiry 25, 609-669.
Longobardi, Giuseppe (2000). "Postverbal" subjects and the mapping hypothesis. Linguistic Inquiry 31, 691-702.
Longobardi, Giuseppe (2001). How comparative is semantics? A unified parametric theory of bare nouns and proper names. To appear in Natural Language Semantics.
Massam, Diane (2001). Pseudo noun incorporation in Niuean. Natural Language and Linguistic Theory 19, 153-97.
McNally, Louise (1992). An interpretation for the English existential construction. Ph.D. dissertation, University of California/Santa Cruz.
McNally, Louise (1995). Bare plurals in Spanish are interpreted as properties. In Glyn Morrill and Richard Oehrle (eds.), 197-212.
McNally, Louise (1998). Existential sentences without existential quantification. Linguistics and Philosophy 21, 353-392.
ter Meulen, Alice (1980). Substances, quantities, and individuals. Ph.D. dissertation, Stanford University.
de Mey, Sjaak (1980). Stages and extensionality: The Carlson problem. In S. Daalder and M. Gerritsen (eds.), Linguistics in the Netherlands. Amsterdam: North-Holland. 191-202.
de Mey, Sjaak (1982). Aspects of the interpretation of bare plurals. In S. Daalder and M. Gerritsen (eds.), Linguistics in the Netherlands. Amsterdam: North-Holland. 115-26.
Milsark, Gary L. (1974). Existential sentences in English. Ph.D. dissertation, Massachusetts Institute of Technology.
Morrill, Glyn and Richard Oehrle (eds.) (1995). Formal grammar: proceedings of the conference of the European Summer School in Logic, Language, and Information.
Munn, Alan, and Christina Schmidt (1999). Bare nouns and the morphosyntax of number. Proceedings of the Linguistic Symposium on Romance Languages 1999.
Munn, Alan, and Christina Schmidt (2000). Bare nominals, morpho-syntax and the Nominal Mapping Parameter. Michigan State University ms.
Musan, Renate (1995). On the temporal interpretation of noun phrases. Ph.D. dissertation, Massachusetts Institute of Technology.
Musan, Renate (1999). Temporal interpretation and information-status of noun phrases. Linguistics and Philosophy 22, 621-61.
Ojeda, Almerindo (1991). Definite descriptions and definite generics. Linguistics and Philosophy 14, 367-397.
Ojeda, Almerindo (1993). Linguistic individuals. Stanford: Center for the Study of Language and Information.
Partee, Barbara H. (1985). 'Dependent plurals' are distinct from bare plurals. University of Massachusetts/Amherst ms.
Partee, Barbara H. (1987). Noun-phrase interpretation and type-shifting principles. In Jeroen Groenendijk & Martin Stokhof (eds.), 115-143.
Platteau, F. (1980). Definite and indefinite generics. In Johan van der Auwera (ed.), 112-123.
Petronio, Karen (1995). Bare noun phrases, verbs and quantification in ASL. In Emmon Bach, Eloise Jelinek, Angelika Kratzer, and Barbara Partee (eds.), 603-618.
Porterfield, Leslie & Veneeta Srivastav (1988). Indefiniteness in the absence of articles: evidence from Hindi and Indonesian. Proceedings of WCCFL 7. Stanford: Stanford Linguistics Association. 265-76.
Postal, Paul (1969a). On so-called 'pronouns' in English. In F. Dineen (ed.), 177-206.
Postal, Paul (1969b). Anaphoric islands. In Papers from the fifth regional meeting of the Chicago Linguistic Society. Chicago: Chicago Linguistic Society.
Reuland, Eric, & Alice ter Meulen (eds.) (1984). The representation of indefiniteness. Cambridge, MA: MIT Press.
Rothstein, Susan (ed.) (1998). Events and grammar. Dordrecht: Kluwer.
Schmidt, Christina and Alan Munn (1999). Against the nominal mapping parameter: Bare nouns in Brazilian Portuguese. Proceedings of NELS 29.
Schubert, Lenhart & Francis Jeffry Pelletier (1987). Problems in the representation of the logical form of generics, plurals, and mass nouns. In Ernie Le Pore (ed.), 385-451.
Sloat, Clarence (1969). Proper nouns in English. Language 45, 26-30.
Smith, Neil (1975). On generics. Transactions of the Philological Society, 27-48.
Stump, Gregory (1981). The interpretation of frequency adjectives. Linguistics and Philosophy 4, 221-258.
Stump, Gregory (1985). The semantic variability of absolute constructions. Dordrecht: Reidel.
Stvan, Laurel (1998). The semantics and pragmatics of bare singular noun phrases. Ph.D. dissertation, Northwestern University.
Svenonius, Peter (1996). Predication and functional heads. In Proceedings of the Fourteenth West Coast Conference on Formal Linguistics. Stanford: Center for the Study of Language and Information. 493-507.
de Swart, Henriette (1993). Definite and indefinite generics. In Paul Dekker & Martin Stokhof (eds.), 625-644.
de Swart, Henriette (1999). Indefinites between predication and reference. In Proceedings of SALT 9. Ithaca, NY: Cornell University. 273-97.
de Swart, Henriette (2001). Weak readings of indefinites: type-shifting and closure. The Linguistic Review 18, 69-96.
Torrego, Esther (1984). On inversion in Spanish and some of its effects. Linguistic Inquiry 15, 103-129.
Torrego, Esther (1989). Unergative-unaccusative alternations in Spanish. MIT working papers in linguistics. Department of Linguistics, MIT.
Vergnaud, Jean-Roger, & Maria Luisa Zubizarreta (1992). The definite determiner and the inalienable constructions in French and English. Linguistic Inquiry 23, 595-652.
Wilkinson, Karina (1991). Studies in the semantics of generic noun phrases. Ph.D. dissertation, University of Massachusetts/Amherst.
Yoon, J-H. (1998). Indefinite proper nouns and stages. Korean Journal of Linguistics.
Zamparelli, Roberto (1995). Layers in the determiner phrase. Ph.D. dissertation, University of Rochester.
Zuber, Richard (1987). To be, to be called and generics. Ms., CNRS, University of Paris.
Partitivity Helen de Hoop
The part-of relation is reflected in language in many different ways, with different pragmatic, semantic, and syntactic properties. Partitivity plays an important role in theories on mass and count nouns as well as aspect (cf. a.o. Ter Meulen 1980; Link 1983; Moltmann 1998; Bach 1986; Krifka 1992; Verkuyl 1993; Kiparsky 1998; Doetjes 1997; Bosveld-de Smet 1998). Hoeksema (1996) distinguishes between full or headed partitives and determinerless or bare partitives. In this article I will not be concerned with the (structural) differences between these two types of partitivity. Instead I will focus on the correspondences between ordinary partitive constructions on the one hand and other types of partitives, such as pseudopartitives, faded partitives, and partitive Case bearing noun phrases, on the other. I will attempt to develop at least part of a comprehensive view on partitivity.
1. The Partitive Constraint
Ordinary partitive constructions are well-known for a constraint that has been dubbed the Partitive Constraint by Jackendoff (1977). The Partitive Constraint implies that the embedded noun phrase within a partitive must be definite, i.e., it must contain a definite article, a demonstrative, or a possessive (cf. Jackendoff 1977; Selkirk 1977). Other strong noun phrases as well as weak noun phrases are generally taken to be excluded from that position. Some examples are given in (1):
(1) a. one of these / the / my books
    b. * one of all / most books
    c. * one of some / three / no books
Jackendoff considers the Partitive Constraint to be part of the semantic component, yet makes no attempt to give a semantic explanation or motivation for its existence. Apart from the Partitive Constraint on the embedded determiner within a partitive construction, it is well-known that the upstairs determiners in partitive constructions are also subject to certain restrictions, as not all determiners can occur in this upstairs position. Hoeksema (1984) observes that upstairs determiners are never transitive or indexical. This is a syntactic characterization that distinguishes transitive determiners that obligatorily combine with a noun (like the, every, a, no, and my) from the intransitive ones that cannot combine with a noun (such as everything, he, him) and the pseudotransitives (e.g., all, some, few, these). Transitive determiners are excluded from the upstairs determiner position, but not all intransitives or pseudotransitives are allowed. In fact, the indexical determiners (basically, demonstratives and personal pronouns) must be excluded as well. What Hoeksema (1984) has to say about the differences between *every / no / the of the students on the one hand and every one / none / all of the students on the other has not lost much of its plausibility yet. Unfortunately, Hoeksema's (1984) paper has never been published, despite its status as one of the pioneer studies on the partitive construction.
As for the Partitive Constraint, Barwise and Cooper (1981) give a formal semantic definition of definite noun phrases and argue that exactly these definite noun phrases are allowed in the embedded position of partitive constructions. Barwise and Cooper define definite noun phrases as noun phrases for which there is some non-empty set which is a subset of all sets contained in the family of sets the noun phrase denotes. Such a subset (empty or non-empty) is called the generator of the noun phrase. For example, the set of books is the generator of every book, and the set of three contextually indicated books is the generator of these three books. Noun phrases like three books or most books do not have such a generator: there is not necessarily one set of books that is a subset of all sets contained in the denotation of three / most books. That is, any three books will make a sentence such as Three books are on the table true (unlike These three books are on the table). For a noun phrase to be definite, however, the generator must be non-empty, which means that in Barwise and Cooper's (1981) definition, a noun phrase like every book is not definite, as it does not presuppose the existence of books.
Barwise and Cooper interpret the partitive phrase of NP as the generator set of the noun phrase denotation if and only if this noun phrase has a definite determiner. Thus, the of NP phrase gets a common noun denotation (that is, a set of individuals) and can combine with the first determiner to form a partitive construction. Barwise and Cooper note that they do not have an explanation for the contrast between the two and both in the embedded determiner position of a partitive. They should both be acceptable but only the two actually is, witness (2):
(2) a. one of the two books
    b. * one of both books
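The notion of a generator can be made concrete in a small finite model. The following is only a minimal sketch, not part of Barwise and Cooper's own presentation: the five-element universe, the names `every`, `definite`, `at_least`, `generator` and `is_definite`, and the treatment of an undefined denotation as `None` are all assumptions made for illustration. A noun phrase is modelled as a family of sets, and a generator is accepted only if the family consists of exactly the supersets of the intersection of its members.

```python
from itertools import combinations

def powerset(universe):
    """All subsets of a finite universe, as frozensets."""
    items = sorted(universe)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

def every(noun, universe):
    """'every N': all sets that contain the whole noun set."""
    return {X for X in powerset(universe) if noun.issubset(X)}

def definite(noun, universe):
    """'the / these N': like 'every N', but undefined (None) if N is empty."""
    if not noun:
        return None
    return {X for X in powerset(universe) if noun.issubset(X)}

def at_least(n, noun, universe):
    """'three N' on its 'at least' reading: sets with n or more Ns."""
    return {X for X in powerset(universe) if len(noun & X) >= n}

def generator(family, universe):
    """Return B if the family is exactly the supersets of B, else None."""
    if not family:
        return None
    candidate = frozenset.intersection(*family)
    supersets = {X for X in powerset(universe) if candidate.issubset(X)}
    return candidate if family == supersets else None

def is_definite(family, universe):
    """Definite in the sense sketched here: a non-empty generator exists."""
    gen = generator(family, universe)
    return gen is not None and bool(gen)

# Hypothetical model: four books and a table.
UNIVERSE = frozenset({"b1", "b2", "b3", "b4", "table"})
BOOKS = frozenset({"b1", "b2", "b3", "b4"})
THESE_THREE = frozenset({"b1", "b2", "b3"})

print(generator(every(BOOKS, UNIVERSE), UNIVERSE) == BOOKS)
print(generator(definite(THESE_THREE, UNIVERSE), UNIVERSE) == THESE_THREE)
print(generator(at_least(3, BOOKS, UNIVERSE), UNIVERSE))
print(is_definite(every(frozenset(), UNIVERSE), UNIVERSE))
```

In this model every book and these three books both have a generator, three books has none, and every book fails the definiteness test once the set of books is empty, which mirrors the remark that every book does not presuppose the existence of books.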
In order to account for this salient difference, Ladusaw (1982) and Hoeksema (1984) give an extension of Barwise and Cooper's analysis. They both recognize that the embedded noun phrase must have a group reading. Both, however, can only get a distributive reading and it cannot denote a group. This can be verified when both is combined with a collective, group level predicate. Unlike the two, it gives rise to ill-formedness.
(3) a. * Both cats lick each other.
    b. The two cats lick each other.
    c. * Both cats are a happy couple.
    d. The two cats are a happy couple.
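The contrast in (3) can be illustrated with a toy model in which predicates are sorted as entity level or group level. This is a sketch under stated assumptions rather than an implementation of any published analysis: the `Predicate` class, the functions `both` and `the_two`, the cat names, and the choice to return `None` for an ill-formed combination are all hypothetical.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Predicate:
    name: str
    level: str            # "entity" or "group"
    extension: frozenset  # atoms (entity level) or frozensets of atoms (group level)

def both(nouns, pred):
    """Distributive only: the predicate must hold of each atom separately."""
    if len(nouns) != 2:
        raise ValueError("'both' presupposes exactly two individuals")
    if pred.level == "group":
        return None  # no single atom can have a group level property
    return all(atom in pred.extension for atom in nouns)

def the_two(nouns, pred):
    """Can denote the group as a whole, so group level predicates apply."""
    if len(nouns) != 2:
        raise ValueError("'the two' presupposes exactly two individuals")
    if pred.level == "group":
        return frozenset(nouns) in pred.extension
    return all(atom in pred.extension for atom in nouns)

# Hypothetical model with two cats.
CATS = frozenset({"cat1", "cat2"})
LICK_EACH_OTHER = Predicate("lick each other", "group", frozenset({CATS}))
PURR = Predicate("purr", "entity", frozenset({"cat1", "cat2"}))

print(both(CATS, LICK_EACH_OTHER))     # undefined, as in (3a)
print(the_two(CATS, LICK_EACH_OTHER))  # well-formed, as in (3b)
print(both(CATS, PURR))                # distributive reading is fine
```

The design choice here is to treat the clash in (3a) and (3c) as undefinedness rather than falsity, which is one way of encoding the claim that both cannot denote a group.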
Ladusaw takes a constituent such as the (two) cats to denote a group level individual, analogous to the cat denoting an entity level individual. Individuals are noun phrases of which the generator is a singleton set. A group level individual is generated by the singleton set of the group. For instance, Jane and Jacky on the group reading denotes such an individual, generated by the singleton set of the group consisting of Jane and Jacky. Such a group level individual denotes the set of all properties that this group has. This means that Jane and Jacky love each other will be true if and only if the property love each other is a member of the set denoted by the group level individual Jane and Jacky. The set of all groups G should contain the non-empty non-singleton sets of entities as its members. According to Ladusaw, both cats denotes the set of properties that the two cats share, in other words, the intersection of the properties that each of the two cats has. One cat cannot have the property of licking each other, hence the ungrammaticality of (3a). The two cats, however, denotes a group level individual and can contain a group level property such as lick each other.
Ladusaw's interpretation rule for partitive constructions is a "downstepping" consists-of function that maps the atoms which generate individuals into their components. Thus, in one of the two cats the argument of the determiner one is the set of entities which serves as the generator of the individual denoted by the two cats. A similar story can be told for partitives containing mass nouns or singular count nouns. For example, in some of the book the determiner takes as its argument the stuff the count atom consists of. The Partitive Constraint can be restated such that the embedded noun phrase within a partitive construction must always denote an individual, either entity-level or group-level. That is, of NP is g(a) if the noun phrase denotes the individual a and is undefined otherwise.
The observation that the embedded noun phrase must have a group reading and can therefore be considered a group level individual has been a significant step forward in explaining the characteristics of the embedded determiner in a partitive construction. There remain some problems within this approach, however, as we will see below.
2. Problematic partitives
Ladusaw (1982) reformulated the Partitive Constraint merely in terms of the individual denoting versus quantifier distinction. Some problems immediately arise, since not only definites can have the required collective reading. Notoriously, the determiner all can have not only a distributive but also a collective reading. This would account for an example as in (4), taken from De Jong and Verkuyl (1985):
(4) de  helft van alle kinderen
    the half  of  all  children
So, universal quantifiers are sometimes allowed in partitive constructions such as in (4), yet not in all, witness (5):
(5) * een van alle kinderen
      one of  all  children
In De Hoop (1997) it is argued that the partitive constructions in (4) and (5) are instantiations of different types of partitive constructions and that the explanation for this difference lies in the nature of the upstairs determiner. The analysis also accounts for the difference between (6a) and (6b), observed by Roberts (1987), and similarly for the difference between (7a) and (7b) as well as the grammaticality of (8):
(6) a. half of Jane and Jacky
    b. * one of Jane and Jacky
(7) a. half of the water
    b. * one of the water
(8) half of a cookie
We have seen that according to Ladusaw (1982), Jane and Jacky can denote a group level individual and therefore, this noun phrase should be allowed in the embedded determiner position of a partitive. But it is only allowed in the type of partitive construction that also allows for the determiner all (compare (6) to (5)), which is, moreover, the one that can also have an indefinite noun phrase like a cookie in its embedded position. The well-formedness of (8) is problematic because a cookie does not have a generator at all. Hoeksema (1984) and Ladusaw (1982) cannot explain the data in (4)-(8).
Another problem for the Partitive Constraint is constituted by partitives that contain a weak determiner in their embedded position. Weak noun phrases do not have a generator, hence should not be allowed. Consider Ladusaw's (1982) examples in (9)-(11):
(9) That book could belong to one of three people.
(10) This is one of a number of counterexamples to the Partitive Constraint.
(11) John was one of several students who arrived late.
Yet, as Ladusaw points out, in the above examples the speaker does have a particular group of individuals in mind. For instance, (9) invites a continuation like namely, Jane, Jacky or Robert, or it might be that the particular group the speaker of (9) has in mind consists of three people who have been looking at the book just before the time of utterance (Teun Hoekstra, p.c.). Either way, it is not the case that the book in (9) could belong to just any three people for the sentence to be true. Therefore, although the embedded noun phrase in sentences like (9) is not syntactically definite, it might be characterized as semantically referential or specific. Then, the particular group of individuals the speaker has in mind functions as the generator set in the denotation of the weak noun phrase. Abbott (1996) rejects this approach on the basis of examples like Every year only one of many applicants is admitted to the program, where there is not one particular group of individuals that the weak noun phrase refers to. Abbott argues that since the embedded weak noun phrase has narrow scope rather than wide scope relative to the universal quantifier, it cannot be semantically referential or specific after all. Thus, partitives containing weak noun phrases are still problematic for Ladusaw's (1982) semantic analysis of the Partitive Constraint. This led several people to argue in favour of a pragmatic rather than a semantic account of the Partitive Constraint (cf. Reed 1991; Abbott 1996).
In De Hoop (1997) the Partitive Constraint is reformulated as a semantic restriction on the type of noun phrases that can occur in partitives. It is claimed that within the class of determiners a distinction should be made between determiners that select semantic entities as their domain of quantification and those that take sets of entities as their domain of quantification. In English, determiner expressions like half (of), 20% of, one third of, and much (of) are of the former class, determiners such as three (of) and many (of) of the latter, whereas determiners like some (of), all (of), and most (of) are ambiguous in this respect, as they can take arguments of both types. Let me emphasize that whether determiner expressions belong to either one or the other or to both classes seems to be a lexical, language specific matter. In Dutch, as opposed to English, enkele (van) 'some (of)' selects only sets of entities, whereas veel (van) 'many/much (of)' takes either entities or sets of entities. The point of view taken here is in accordance with Doetjes (1997), who convincingly shows that the selectional properties of quantifiers are only partially determined by their meaning. For example, it is argued in Doetjes that the presence of minimal parts in the semantic structure is not the factor that determines compatibility with determiners that select either a singular or a plural; instead, these determiners are dependent on grammatical elements (which can be number morphology in certain languages or classifiers in others).
So, I distinguish two types of partitive constructions, which I call entity partitives and set partitives, the type crucially depending on the class the upstairs determiner expression belongs to. Entity partitives are headed by determiner expressions that select entities as their first arguments, set partitives by determiners that select sets of entities as their arguments.
At this point reconsider (4)-(8), examples that were presented as problems for the Partitive Constraint. We can account for the fact that half of the water in (7a) is well-formed, while *one of the water in (7b) is not. The reason is that the determiner one is looking for a set of entities to function as its first argument, but such a set is not available, since the water denotes a semantic element of type e (following Lønning 1987a). This also explains the well-formedness of half of a cookie in (8) and constructions with definite singular count nouns such as half of the population, compared to the ungrammaticality of *one of the population and *one of a cookie. Definite and indefinite singular count nouns denote entities, and these entities can be made available to the upstairs determiner in an entity partitive, irrespective of their having a generator set or not. Notoriously, the determiner all can have not only a distributive reading, but also a collective reading. It is well-known that all differs in this respect from its truly quantificational or distributive counterpart every. This would follow from the fact that alle katten 'all cats' can be entity-denoting, hence it is allowed in an entity partitive such as (4).
With respect to a noun phrase such as Jane and Jacky, recall that according to Ladusaw, the embedded noun phrase within a partitive construction must always denote an individual (a noun phrase of which the generator is a singleton set), either entity-level or group-level. Therefore, Ladusaw cannot account for the ungrammaticality of *one of Jane and Jacky. In fact, he explicitly claims that a noun phrase such as Jane and Jacky can denote a group level individual, hence the partitive construction should be well-formed. In accordance with Link (1983) and Lønning (1987b) I assume that Jane and Jacky can denote a complex entity, which explains the grammaticality of half of Jane and Jacky in (6a). At the same time, I conclude that Jane and Jacky cannot denote a set of entities and this accounts for the ungrammaticality of *one of Jane and Jacky in (6b). Note that half of Jane and Jacky does not refer to Jane or Jacky. If we consider Jane and Jacky to denote a composed entity, then half of it can be any half. This becomes clear in Hoeksema's (1996) example: Only about half of Jane and Jacky was visible for the sniper.
The function of partitive of in both types of partitives is to make expressions that are not directly accessible to the upstairs determiner (which basically means, expressions other than bare nouns) accessible. I will follow Ladusaw (1982: 240-241) who has put it as follows:
"Of the vast array of sets of entities that might serve as the basis of a quantifier NP, a language will lexicalize as CNs relatively few. The resources of modification by adjectives and relative clauses increase the expressive power of NPs though they do not guarantee that any arbitrary set can serve as the argument of a determiner to express a QNP economically. Deictic pronouns and articles and discourse sensitive articles like the do guarantee that an arbitrary individual may be denoted, but syntactically they by-pass the determiner category that builds quantifier NPs. The partitive construction of a language provides a means of bypassing this syntactic bind, by allowing any arbitrary set to serve as the basis of a QNP."
Note that in Ladusaw's view, the function of partitive of is not just to make any arbitrary set accessible to the upstairs determiner; rather, only individual denoting noun phrases (although these might be group level individuals) can be mapped onto their components by partitive of. This is different in my analysis. Noun phrases that denote entities can be made available by partitive of to an upstairs determiner that selects an entity as its argument, whereas noun phrases that denote sets of entities can be made available to determiners that choose sets of entities as their arguments. It will be clear that the Partitive Constraint can thus no longer be formulated in terms of individual denoting noun phrases alone. The Partitive Constraint can be restated very simply at this point:
(12) Partitive Constraint
     Only noun phrases that can denote entities are allowed in entity partitives; only noun phrases that can denote sets of entities are allowed in set partitives.
So, I follow Westerståhl (1985) in his claim that in (set) partitive constructions the embedded noun phrase actually denotes a contextually determined or otherwise restricted set of entities, instead of adopting an analysis such as Barwise and Cooper's (1981) or Ladusaw's (1982) in which this set has to be recovered as the generator set from the generalized quantifier denotation of the noun phrase. In De Hoop (1997) further arguments are provided in favour of the claims that noun phrases that occur in entity partitives can indeed denote entities, and noun phrases that occur in set partitives can indeed denote sets of entities. In Anttila and Fong (2000) it is observed that the two classes of noun phrases induce different Case alternations in Finnish. Anttila and Fong do not pursue the difference between entity partitives and set partitives further, but focus on the Case alternation that occurs with entity partitives. They account for that Case alternation in terms of ranking two potentially conflicting constraints, one semantic and one structural.
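The restated Partitive Constraint in (12) amounts to a compatibility check between what the upstairs determiner selects and what the embedded noun phrase can denote. The sketch below is illustrative only: the two lexicons are small hand-coded tables based on the examples and classifications mentioned in this section, the names `UPSTAIRS`, `EMBEDDED` and `partitive_ok` are hypothetical, and the entry for these books records only the set denotation that the discussion of (1a) relies on. Classifying Dutch de helft and een by analogy with half and one is likewise an assumption.

```python
ENTITY, SET = "entity", "set"

# What the upstairs determiner expression selects as its argument.
UPSTAIRS = {
    "half": {ENTITY}, "one third": {ENTITY}, "much": {ENTITY},
    "one": {SET}, "three": {SET}, "many": {SET},
    "some": {ENTITY, SET}, "all": {ENTITY, SET}, "most": {ENTITY, SET},
    # Dutch entries; 'de helft' and 'een' are classified by analogy (assumption).
    "de helft": {ENTITY}, "een": {SET},
    "enkele": {SET}, "veel": {ENTITY, SET},
}

# What the embedded noun phrase can denote (simplified, illustrative).
EMBEDDED = {
    "the water": {ENTITY},
    "a cookie": {ENTITY},
    "the population": {ENTITY},
    "Jane and Jacky": {ENTITY},
    "alle kinderen": {ENTITY},
    "these books": {SET},
}

def partitive_ok(determiner, noun_phrase):
    """True if the determiner selects a type the noun phrase can denote."""
    return bool(UPSTAIRS[determiner] & EMBEDDED[noun_phrase])

for det, np in [("half", "the water"), ("one", "the water"),
                ("half", "a cookie"), ("one", "Jane and Jacky"),
                ("de helft", "alle kinderen"), ("een", "alle kinderen"),
                ("one", "these books")]:
    mark = "" if partitive_ok(det, np) else "* "
    print(f"{mark}{det} of/van {np}")
```

Because the check is a plain set intersection, adding a language or a determiner only means adding a table entry, which fits the claim that class membership is a lexical, language specific matter.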
3. The Partitive Constraint reconsidered: semantics or pragmatics?
So far, we have discussed several semantic analyses of the Partitive Constraint. Reed (1991) and Abbott (1996) argue on the basis of problematic examples such as the ones in (9)-(11) above, however, that the Partitive Constraint cannot be maintained as a semantic restriction, but that instead pragmatic principles determine the well-formedness of partitives. Reed (1991) considers only partitives in which the embedded noun phrase is plural and argues that the function of partitives is to evoke subgroups of previously evoked discourse groups. She claims that there is no formal restriction on determiners in partitives, but that the interpretation for partitives demands that the embedded noun phrase access a discourse group. Therefore, weak noun phrases may occur in partitives if explicit modification or the discourse context makes the discourse entity evoked by the indefinite more accessible. She discusses the following examples in this respect:
(13) The dog was stoned by two of some boys playing in that field.
(14) Only one of many people who saw the accident would testify.
Note that the examples in (13) and (14) are reminiscent of Ladusaw's (11). Abbott (1996) takes a view similar to Reed's, yet argues against Reed's claim that partitives have a particular discourse function, i.e., the function of evoking subgroups of discourse groups. Abbott considers the analysis of embedded indefinite noun phrases in terms of accessing discourse groups inadequate, partly because of examples such as in (15), where obviously the students referred to by the embedded noun phrase need not already exist in the discourse: (15)
John was apparently one of several students who arrived late—I have no idea how many, or who the others were.
Another problem that Abbott notes with respect to Reed's analysis is that the basis of her analysis must be stipulated. That is, why should partitives be confined to introducing subgroups of existing discourse groups and why should they be unable to introduce subgroups of new groups? Abbott, like Reed, claims that there is no formal (syntactic or semantic) restriction on the embedded noun phrases in partitives, and that the examples that have been cited as ungrammatical are only pragmatically odd. The pragmatic principle that Abbott claims is involved here is a very general principle that prohibits mentioning entities unless there is some reason for mentioning them. So, Reed's idea that the embedded noun phrase should refer to an already existing discourse group is replaced by the idea that the embedded noun phrase should be worth mentioning somehow. Two examples Abbott discusses in developing her analysis are given in (16) and (17):
(16) Ants had gotten into most of some jars of jam Bill had stored in the basement.
(17) All of three people (out of the 50 I wrote to) had the politeness to respond to my invitation.
In conclusion, both Reed and Abbott claim that if there is a restriction on embedded noun phrases in set partitives, then this restriction is pragmatic rather than semantic in nature. In general, contextualization should turn examples that are judged ill-formed into well-formed constructions. I claim that it is not a coincidence, however, that all their crucial examples contain weak embedded determiners. In my opinion one of some linguists might be ill-formed, but one of some linguists who... is well-formed, independent of the exact content of the modifying phrase. I take this to indicate that the relative clause syntactically and not pragmatically turns the expression into a grammatical one. Consider (18):
(18) a. ?? one of some linguists
     b. one of some linguists that have a cat
     c. one of some visiting linguists
     d. one of some linguists who are drinking whiskey
What is important is that the grammaticality judgements in (18) are independent of further context. Likewise, this holds for ungrammatical partitives that involve a quantificational determiner such as most: one of most linguists is bad and one of most linguists who... is just as bad, even if one wishes to use such a construction pragmatically. Consider for instance a situation in which most linguists drink whiskey. One of them wants to take the car. Hence, this should be sufficient contextualization for the construction in (19):
(19) * One of most linguists who are drinking whiskey wants to take the car.
Yet, the construction is ill-formed. The explanation lies in the fact that most is not a weak determiner. The most obvious explanation for the fact that weak determiners are allowed in set partitives is that they can get a non-quantificational, collective reading. In those cases, one can indeed maintain that the weak noun phrases denote contextually determined sets of entities in the embedded position of set partitives, and there is no problem for the Partitive Constraint as formulated in (12). Yet, I do acknowledge in accordance with Abbott that there are examples for which one can hardly claim that the embedded noun phrase denotes a contextually determined set of entities. Therefore, I propose that not only weak noun phrases that denote a contextually determined set of entities can be of type ⟨e,t⟩ and therefore occupy the embedded position in set partitives. In fact, all weak noun phrases can denote sets of entities, in accordance with the observation in the literature that weak noun phrases live naturally in type ⟨e,t⟩ in their predicative use (cf. a.o. Partee 1987; Van Geenhoven 1996; De Swart 1997; Van der Does and De Hoop 1998).
To sum up, all noun phrases that are felicitous in the embedded position of set partitives are set-denoting, in accordance with the Partitive Constraint. Apart from the set-denoting noun phrases we have discussed before (noun phrases that denote contextually determined sets, that is, noun phrases introduced by context indicators, such as the definite article, demonstratives, and possessives), noun phrases that have weak determiners can also denote restricted sets. In these cases, the set they denote is not necessarily contextually or lexically restricted but it can also be restricted noun phrase-internally by modifying phrases.
Now the obvious question is why the sets denoted by embedded noun phrases in set partitives must be restricted. Why does a bare, unrestricted weak noun phrase lead to an ill-formed result in the embedded position of partitives, and why does that hold for bare plurals in particular? It is Abbott (1996) who claims that bare noun phrases are the only kind of noun phrases genuinely unacceptable in partitives:
(20) * some of books
Abbott's analysis of the ill-formedness of partitives containing bare noun phrases is based on the idea that the embedded noun phrase in a partitive construction always has wide scope over the upstairs determiner. Bare plurals always take the narrowest possible scope (cf. Carlson 1977) and this would rule out bare plurals in partitive constructions, according to Abbott. As far as I can tell, there are a number of problems with this analysis, however. First of all, compare the example (21a) with the example in (21b), for instance:
(21) a. I ate a quarter of all cookies.
     b. I hate a quarter of all students.
Whereas in (21a) we can indeed get the reading in which the embedded determiner has scope over the upstairs one (all cookies are such that I ate a quarter of it), that reading is not preferred in (21b) (#all students are such that I hate a quarter of him/her). Instead, the obvious reading in (21b) is one in which the embedded determiner is in the scope of the upstairs one. So, in my opinion, the embedded noun phrase in a partitive construction does not at all necessarily take wide scope over the upstairs determiner. In fact, I argue elsewhere that on the wide scope reading the embedded noun phrase escapes the Partitive Constraint because the upstairs determiner does not quantify over the embedded noun phrase in these cases (De Hoop 1997). Furthermore, the claim that bare plurals always take narrow scope is rejected by Giannakidou (1997). She shows that although the semantic incorporation analysis of Van Geenhoven (1996) (to be discussed below) successfully accounts for the lack of wide scope readings for bare plurals under negation (see Carlson's 1977 example in (22)), bare plurals can have these readings in certain cases (cf. (23)).
(22) John didn't see spots on the floor.
     a. It is not the case that John saw spots on the floor.
     b. # There were spots on the floor that John did not see.
(23) Paul didn't buy books after all. (They were sold out.)
     a. There were books that Paul did not buy.
The question remains why bare plurals are so bad in ordinary partitive constructions (this is one of the characteristics that distinguishes pseudopartitives from ordinary partitives, as shown below). In my opinion, this question is related to the semantic function of partitive of, as mentioned before. That is, the ill-formedness of (24) is related to the ill-formedness of (25):
(24) * one of students
(25) * one the students
A restricted set of entities denoted by a noun phrase such as the one in (25) cannot be directly quantified over by the upstairs determiner. The function of of is to make such a set accessible for quantification by an upstairs determiner. In that sense, the use of of in general is not optional. Either a determiner quantifies over a set of entities directly or it has to make use of partitive of. There is hardly any optionality (there are some exceptions, compare all the cats/all of the cats) and superfluous of gives rise to ill-formedness. This would also explain the examples in (26) that have been noted by Hoeksema (1996) as counterexamples to the Partitive Constraint:
(26) a. the most eloquent of men
     b. the best of friends
Compare (26a,b) to their counterparts in (27) that receive a totally different meaning:
(27) a. the most eloquent men
     b. the best friends
Partitive of in (26) is not superfluous, and therefore the examples are well-formed. I already noted that languages can differ with respect to the selectional properties of semantically related determiners. In Dutch the examples in (26) would be ill-formed. Only entity-denoting expressions could be used, suggesting that singular superlatives only quantify over semantic entities in Dutch:
(28) a. de welsprekendste van alle vrouwen
        the most eloquent of all women
     b. de welsprekendste van Jane en Jacky
        the most eloquent of Jane and Jacky
This view actually implies a degradation of the status of partitives. Partitive constructions turn out to be nothing special. Either a determiner quantifies over a set of entities directly or it has to make use of partitive of. Whether or not a partitive element is needed is not just a matter of the semantic or pragmatic properties of the noun phrases involved. It is also determined by the syntactic and lexical characteristics of these noun phrases, which may differ cross-linguistically (cf. Doetjes 1997). In ordinary (set) partitives of is used to provide access to a restricted set of entities. In some cases unrestricted sets (as denoted by bare plurals, for example) also need a partitive element before they can be quantified over. In the following sections I will discuss several instantiations of this type of (unbounded) partitivity.
4. Pseudopartitives
In Selkirk (1977) a distinction is made between ordinary ("real") partitives and pseudopartitives. Examples of each type of construction are given in (29) and (30), respectively:
(29) a. a number of her cats
     b. three glasses of the wine
     c. four pounds of those apples
(30) a. a number of cats
     b. three glasses of wine
     c. four pounds of apples
Selkirk provides several syntactic tests in order to show that pseudopartitives and ordinary partitives are different. I will not discuss these tests in detail, but only give one example. This test involves extraposition of the of-phrase, which appears to be only allowed in the case of ordinary partitives:
(31) a. A lot had been eaten of the leftover turkey.
     b. *A lot had been eaten of leftover turkey.
Hoeksema (1984) notes that in Dutch the distinction between real partitives and pseudopartitives is very obvious as partitive van 'of' is obligatorily absent in pseudopartitives and obligatorily present in ordinary partitives. This is illustrated in the Dutch counterparts of (29) and (30):
(29') a. een aantal *(van) haar katten
      b. drie glazen *(van) de wijn
      c. vier pond *(van) die appels
(30') a. een aantal (*van) katten
      b. drie glazen (*van) wijn
      c. vier pond (*van) appels
Hoeksema follows Selkirk in claiming that pseudopartitives are ordinary noun phrases, having the structure DET N, where DET also stands for complex determiners such as een aantal 'a number of', whereas ordinary partitives have the structure DET of NP. The Dutch facts given in (29')-(30') might indicate that the occurrence of of in the English pseudopartitives in (29)-(30) is merely a coincidence. Other languages, however, contradict this view; it is not only true for English that pseudopartitives have a partitive 'flavour' although they differ from ordinary partitives in syntactic behaviour. For example, in French pseudopartitives partitive de 'of' is used ((32a)), and in Finnish the embedded noun in pseudopartitives bears partitive Case ((32b)):
(32) a. un verre de vin rouge
        a glass of wine red
     b. lasi punaviiniä
        glass red-wine-PART
The question which determiner expressions in pseudopartitives require a partitive preposition or Case is a lexical, language-specific matter again. For instance, English of occurs in a glass of wine, but not in much wine, whereas French de as well as Finnish partitive Case shows up in both constructions. In Finnish, moreover, numerals also require partitive Case on their embedded nouns (e.g., kaksi tyttöä 'two girl-PART'), whereas their French and English counterparts lack de and of (compare deux filles, two girls). The strongest argument in favour of distinguishing pseudopartitives from ordinary partitives is the embedding of bare plurals and mass nouns in pseudopartitives, which is prohibited in true partitives. Indeed this characteristic is essential for Selkirk's structure for pseudopartitives, according to which what is embedded in a pseudopartitive is an N' rather than a noun phrase. There is one further difference pointed out by Selkirk and that is the deletability of of in pseudopartitives as compared to ordinary partitives: one cup (of) flour. Note, however, that this characteristic is not completely reliable (compare all (of) the cats and half (of) the water). In the previous sections it was argued that in ordinary partitives the function of partitive of is always to make the set or entity denoted by the embedded noun phrase within a partitive construction available or accessible to the upstairs determiner. It often depends on the embedded noun phrase as well as on the upstairs determiner whether partitive of is necessary. Very rarely it is optional as well. Following this line of reasoning, one might argue that determiner expressions such as a number never have direct access to the set denoted by the embedded noun or noun phrase, and therefore partitive of shows up in all cases (cf. *a number cats versus a number of cats and a number of the cats). In this way we can maintain that the function of partitive of is the same in pseudopartitives and ordinary partitives, although syntactically the two constructions may differ. So, the function of of in both (29a) and (30a) is to make the sets denoted by the embedded constituents accessible to the upstairs determiner a number. Other upstairs determiners such as several only need this partitive of when they quantify over a restricted or bounded set that is not denoted by a common noun. I claim that this difference between a number and several is not a fundamental, but rather a lexically specific matter.
5. Faded partitives
In Dutch a peculiar construction exists which looks like a prepositional phrase, but actually is a noun phrase, and which might be called a faded partitive, following Van der Lubbe (1982). Consider the following Dutch sentence, which is ambiguous:
(33) Els at van die smerige bonbons.
     Els ate of those filthy bonbons
Sentence (33) can mean that there is a certain set of filthy bonbons in the domain of discourse, and Els ate (some) of them, or the sentence can mean that Els ate some of those filthy bonbons ("you know"). In the first reading the van die-constituent is a prepositional phrase, in the second it is a noun phrase. That a real categorial difference is involved has convincingly been shown in the literature (cf. Haegeman 1987; De Hoop, Vanden Wyngaerd, and Zwart 1990). Several syntactic tests can be used to illustrate this. For instance, prepositional phrases in Dutch can generally appear to the right of the verb in subordinate clauses (a phenomenon that bears the descriptive term PP-over-V), whereas noun phrases cannot. Therefore, while the van die-phrase in (34a) has the same two readings as (33), only the prepositional phrase interpretation remains in (34b):
(34) a. dat Els van die smerige bonbons at.
        that Els of those filthy bonbons ate
     b. dat Els at van die smerige bonbons.
        that Els ate of those filthy bonbons
Here I will focus on the noun phrase case and address the question what is the connection between the partitive form of this construction and its meaning. Consider the examples under (35). Native speakers of Dutch will confirm that there is a certain difference in meaning between (35a) with a bare plural object and (35b) with a faded partitive. The difference is, intuitively, that (35a) can mean that I never read thick books, because I never see or buy thick books, or maybe because I never knew there exist any thick books. This cannot be the case in (35b). In (35b) I know there are thick books and I do not read them, for some reason or another, but it cannot be just a coincidence, like in (35a).
(35) a. Ik lees nooit dikke boeken.
        I read never thick books
     b. Ik lees nooit van die dikke boeken.
        I read never of those thick books
Zwarts (1987) observes that faded partitives are weak or indefinite because they can occur in existential sentences, just like bare plurals. This is illustrated in (36):
(36) a. Er lagen dikke boeken op de tafel.
        there lay thick books on the table
     b. Er lagen van die dikke boeken op de tafel.
        there lay of those thick books on the table
Faded partitives are also similar to bare plurals in that they can be preceded by a determiner, as Zwarts notes:
(37) a. Er lagen drie dikke boeken op de tafel.
        there lay three thick books on the table
     b. Er lagen drie van die dikke boeken op de tafel.
        there lay three of those thick books on the table
Therefore, Zwarts analyzes van 'of' in a faded partitive as an inverse determiner that takes a noun phrase as its argument and yields a common noun again such that van in fact cancels the meaning of the determiner die. There are some complications with this view, however. First of all, common bare nouns can be combined with all determiners, whereas faded partitives cannot, as can be seen in (38):
(38) a. *deze / *alle van die dikke boeken
        these / all of those thick books
     b. ordinary partitive reading only:
        de meeste / sommige van die dikke boeken
        most / sòme (certain) of those thick books
     c. ordinary and faded partitive reading:
        twee / enkele / veel van die dikke boeken
        two / some (sm) / many of those thick books
The constructions in (38) show that faded partitives can only combine with weak determiners (see (38c)). If they are preceded by strong determiners, they lose their characteristic meaning and become ordinary partitive constructions (cf. (38b)). That also explains the ill-formedness of the examples in (38a): the strong determiners deze 'these' and alle 'all' cannot occupy the upstairs determiner position in ordinary partitive constructions. In some cases, however, faded partitives can be preceded by strong determiners, but only if a generic reading is possible. I will come back to generic faded partitives below. A second difference between faded partitives and bare plurals is their difference in meaning, of which I already gave an intuitive description with respect to the sentences in (35). At this point, reconsider the sentences in (37). In (37a), there are three thick books introduced in the discourse, and therefore we say that those thick books are new in the discourse. These may be the first thick books in the world; the speaker as well as the hearer may never have seen a thick book before. That situation is not possible in (37b), however. In (37b) the concept of a thick book should be well-known, and, as far as I can see, both to the speaker and to the hearer. In this way we can account for the fact that (39b) is an ill-formed sentence from a pragmatic point of view:
(39) a. Er liepen roze gespikkelde kippen in de tuin.
        there walked pink speckled chickens in the garden
     b. #Er liepen van die roze gespikkelde kippen in de tuin.
        there walked of those pink speckled chickens in the garden
The hearer of (39a) might be really surprised by the fact that there were some pink speckled chickens walking around in the garden, yet the sentence is perfectly well-formed. In (39b), however, the speaker appeals to the hearer's knowledge-store (cf. Vallduví 1990) in which there should be a set of pink speckled chickens ("you know what I mean, those pink speckled chickens") and because such a set does not exist in most hearers' knowledge-stores, the sentence is odd. That it is not just a matter of the non-emptiness of the set, as some readers might think, is illustrated by an example such as in (40):
(40) Van die lekkere koekjes worden niet meer gebakken.
     of those nice cookies are not anymore baked
I think the explanation for the paradox of being new and familiar at the same time can be derived from the structure of the model. The thick books in (37b) are new in the domain of discourse, but as they are taken from an already existing set in the hearer's knowledge-store, they are in a sense familiar as well. How should that be understood? It is important to realize that pragmatic, information structuring notions such as focus and topic are in principle independent of notions such as familiarity or referentiality (cf. Reinhart 1982; Vallduví 1990; De Swart and De Hoop 2000). The terms old and new with respect to the domain of discourse are often confusing, therefore, as they sometimes refer to what is old and new information in a sentence (topic, focus, and related notions) and sometimes to old (familiar or anaphoric) and new discourse referents. Consider Vallduví's (1990) example:
(41) [The BOSS called.]_F
From an informational perspective, the sentence in (41) is an all-focus sentence. Yet, the referent the boss is not new; it is supposed to exist in the hearer's knowledge-store. I suppose that the sets of individuals given by faded partitives are also stored. They represent well-known phenomena, concepts. It is not a coincidence, therefore, that their form is partitive. Partitives can be conceived as quantification over sets of entities that cannot be denoted by simple nouns. As Ladusaw (1982) observed (see the quotation above), there are infinitely many sets of entities that may serve as the domain of quantification of determiners, and the sets denoted by common nouns form only a small part of these. Modifiers such as adjectives and relative clauses already increase the possibilities, but deictic articles and discourse sensitive articles such as those and the actually guarantee that any arbitrary set may be denoted. However, these sets are not directly accessible for a quantifier, and again, partitive of is inserted to make these sets accessible. At this point, one could say that faded partitives only differ from ordinary partitives in the location of the set the van die-constituent denotes. So, in the case of an ordinary partitive, the set is contextually determined (following Westerståhl 1985), whereas in the case of a van die-NP the set is located in the hearer's knowledge-store. Hence, we could merely analyze the ambiguity of a van die-constituent as the one in (33) in terms of an ambiguity in the demonstrative die. In the case of ordinary partitives, die is a context-set indicator, such that the embedded noun phrase denotes a contextually determined set of entities, whereas in the case of faded partitives, die signals the presence of a set of entities in the hearer's knowledge-store and this set can be made accessible by partitive of, in the same way as contextually determined sets of entities in ordinary partitives can.
I will argue, however, that such an account cannot fully explain the differences between ordinary partitives and faded partitives. Consider the sentences in (42):
(42) a. Er zitten weinig van de jonge taalkundigen in de kroeg.
        there sit few of the junior linguists in the pub
        'Few of the junior linguists are in the pub.'
     b. Er zitten weinig van die jonge taalkundigen in de kroeg.
        there sit few of those junior linguists in the pub
        'There are few of those, you know, junior linguists in the pub.'
     c. Er zitten weinig jonge taalkundigen in de kroeg.
        there sit few junior linguists in the pub
        '(There are) few junior linguists (are) in the pub.'
Take a situation in which there are 10 linguists, 2 of which are junior linguists and the others are senior linguists. There are also 6 non-linguists, say paleontologists, and there is me. Everybody except for me is in the pub. I enter the pub and utter one of the statements that are given under (42). In this situation, it turns out that (42a) is false, whereas (42b) is true (under the assumption that everybody agrees on what is few in a certain situation, of course). Let me clarify this point. In (42a), the set of junior linguists is known in the domain of discourse to contain two members, and since they are all in the pub, it cannot be simultaneously true that there are few of them in the pub (see Partee 1988 on the incompatibility of few of the and all). In (42b), although all the junior linguists are in the pub, it can still be true that there are only few of them in the pub (namely, two), in particular in comparison with the number of senior linguists in the pub. Weinig 'few' in (42b) cannot mean few in comparison with the remainder of the set of those junior linguists (as was its interpretation in (42a)), unless of course the faded partitive gets an ordinary partitive reading (like in (42a)). Sentence (42c) can get both readings, either a partitive reading or an existential reading, with concomitant truth values. Thus, given two definitions of few (one truly cardinal: (43a); one truly proportional: (43b), cf. Partee 1988), we see that bare plurals can combine with both definitions (see (42c)), whereas ordinary partitives as in (42a) combine with the proportional definition, and faded partitives such as in (42b) with the cardinal one.
(43) a. few_card A B iff |A ∩ B| < n
     b. few_prop A B iff |A ∩ B| / |A| < k (k a fraction or %)
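The pub scenario can be checked mechanically against the two definitions in (43). The following is a minimal sketch, not part of the original analysis: the function names, the individual labels, and above all the thresholds n = 3 and k = 0.5 are assumptions chosen for illustration, since the text deliberately leaves open what counts as few in a given situation.

```python
# Hypothetical illustration of the cardinal and proportional definitions
# of "few" in (43), applied to the pub scenario described in the text.

def few_cardinal(a: set, b: set, n: int) -> bool:
    """(43a): true iff the number of As that are B is below n."""
    return len(a & b) < n

def few_proportional(a: set, b: set, k: float) -> bool:
    """(43b): true iff the proportion of As that are B is below k."""
    if not a:
        raise ValueError("proportional 'few' is undefined for an empty set A")
    return len(a & b) / len(a) < k

# The situation: 2 junior linguists, 8 senior linguists, 6 paleontologists.
junior = {"j1", "j2"}
senior = {f"s{i}" for i in range(1, 9)}
paleontologists = {f"p{i}" for i in range(1, 7)}

# Everybody except the speaker is in the pub.
in_pub = junior | senior | paleontologists

N_THRESHOLD = 3    # assumed value of n
K_THRESHOLD = 0.5  # assumed value of k

# Faded partitive (42b): cardinal reading.
faded_reading = few_cardinal(junior, in_pub, N_THRESHOLD)
# Ordinary partitive (42a): proportional reading.
ordinary_reading = few_proportional(junior, in_pub, K_THRESHOLD)

print("(42b), cardinal:", faded_reading)
print("(42a), proportional:", ordinary_reading)
```

Under these assumed thresholds the cardinal definition comes out true (two junior linguists is a small number) while the proportional one comes out false (all of the junior linguists are present), which mirrors the judgements reported for (42b) and (42a).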
This sufficiently shows that the two types of constructions are not instantiations of one and the same phenomenon after all. The question is how to account for the difference between ordinary and faded partitives, while maintaining the insight that the similarity in form between ordinary and faded partitives is not a coincidence after all. In order to formalize the function of van 'of' in a faded partitive, we can use the notion of a partition, defined below:

Given a non-empty set A, a partition of A is a collection of non-empty subsets of A such that (1) for any two distinct subsets X and Y, X ∩ Y = ∅ and (2) the union of all the subsets in the collection equals A. The notion of a partition is not defined for an empty set. The subsets that are members of a partition are called cells of that partition. (Partee et al. : 46)

I would like to claim that the function of van in a faded partitive is to give access to a cell of a partition of the set denoted by die (A) N, a set which is non-empty indeed (we know that instantiations of the phenomenon do actually exist, albeit not necessarily in the present discourse; as such the set is stored in our knowledge domain). The modifier can actually help to define the nature of the partition. We can assume, for instance, that it is the equivalence relation has the same colour as that induces a partition of the set of cats in van die zwarte katten, in which case the cell defined by {x | black(x)} is the non-empty set that is intuitively chosen when one interprets van die zwarte katten. Equivalence relations are reflexive, symmetric, and transitive, and they structure a domain into subsets whose members are regarded as equivalent with respect to that relation. Clearly, being in the same cell of a partition is also an equivalence relation in itself. A cell denotes an unrestricted or unbounded set of individuals, like a bare plural, but it is not directly accessible for quantification. It needs partitive van before it can be quantified over. This explains the similarities between faded partitives on the one hand and bare plurals and pseudopartitives on the other. On the other hand, we can maintain the idea that the function of van 'of' remains the same: it makes an otherwise inaccessible set of entities accessible for quantification. Usually, a partition need not be specified, and that is why a modifier does not have to be present in a faded partitive. There is one exception, however. If a faded partitive gets a generic reading, then a modifier has to be present, as shown in the paradigm in (44).
(44) a. *Van die katten brengen geluk.
        of those cats bring happiness
     b. Van die zwarte katten brengen geluk.
        of those black cats bring happiness
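The partition-based view of van die zwarte katten can be made concrete with a small sketch. It only illustrates the set-theoretic definition quoted above; the cat names, their colours, and the helper functions partition_by and is_partition are hypothetical and not part of the analysis itself.

```python
# Hypothetical illustration: the equivalence relation "has the same colour as"
# induces a partition of a set of cats; one cell is the set of black cats.
from collections import defaultdict
from itertools import combinations

def partition_by(items, key):
    """Group items into cells; two items share a cell iff key(x) == key(y)."""
    cells = defaultdict(set)
    for item in items:
        cells[key(item)].add(item)
    return list(cells.values())

def is_partition(cells, universe):
    """Check the quoted definition: non-empty universe, non-empty cells,
    pairwise disjoint cells, and union of the cells equal to the universe."""
    if not universe or any(not cell for cell in cells):
        return False
    pairwise_disjoint = all(x.isdisjoint(y) for x, y in combinations(cells, 2))
    covers_universe = set().union(*cells) == set(universe)
    return pairwise_disjoint and covers_universe

# Assumed toy model: cat names mapped to colours.
CATS = {"Tom": "black", "Minoes": "black", "Garfield": "orange", "Snowy": "white"}

# Iterating over the dict yields the cat names; CATS.get gives each cat's colour.
cells = partition_by(CATS, key=CATS.get)

# The cell picked out by the modifier zwarte 'black'.
black_cell = next(c for c in cells if all(CATS[x] == "black" for x in c))

print(is_partition(cells, set(CATS)))
print(sorted(black_cell))
```

Grouping by a key function is one simple way to realize an equivalence relation: sameness of key is automatically reflexive, symmetric, and transitive, so the resulting cells always satisfy the two conditions of the definition.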
Maybe the reader did not expect a generic reading to be possible at all for a faded partitive, for instance because faded partitives look like French plural indefinites which do not get a generic reading (cf. Bosveld-de Smet 1998 and De Swart 1993). However, (44b) shows that a generic operator can quantify over the individuals in a cell of a partition denoted by a faded partitive. Modifiers can define a particular cell of a partition in a way that appears to involve association with focus, as can be witnessed in (45):
(45) Van die zwarte katten met witte pootjes brengen geluk.
     of those black cats with white feet bring happiness
     a. Van die zwarte katten MET WITTE POOTJES brengen geluk.
     b. Van die ZWARTE katten MET WITTE POOTJES brengen geluk.
     c. Van die ZWARTE katten met witte pootjes brengen geluk.
The interpretation of (45a) involves a partition of black cats such that there is a non-empty cell of which the members have white paws. In (45b) the partition is of the set of cats again, and the cell concerned contains elements that are both black and have white paws. Finally, (45c) involves a partition of the set of cats with white paws, such that there is a cell of which the members are black. In fact, we can even turn the ill-formed (44a) into a well-formed sentence by focussing die 'those', which gives a contextually specified (specified by means of pointing, for instance) cell of a partition of the set of cats, see (44a'):
(44) a.' Van DIE katten brengen geluk.
         of THOSE cats bring happiness
In fact, ordinary definite noun phrases behave exactly the same in this respect (cf. Hoekstra and Wehrmann 1985; De Hoop, Vanden Wyngaerd, and Zwart 1990). I will assume that the partitive character of definites (cf. Westerståhl 1985) and that of faded partitives is in principle incompatible with generic or universal quantification. Only if the set denoted by the definite noun phrase or faded partitive is explicitly specified does universal quantification over its members become possible. However, whereas an ordinary definite noun phrase denotes a contextually restricted set of entities, the definite noun phrase inside a faded partitive denotes a cell of a partition. Van 'of' is used to make the cell accessible for a (possibly generic) quantifier. Another recent use of the definition of a partition in linguistic analysis is found in Lipták (2001) on the multiple partitive construction in Hungarian.
6. Partitive Case
In addition to extra-ordinary partitive constructions such as pseudopartitives and faded partitives, there is another type of partitive which we find in languages that have morphological partitive Case. In Finnish, for example, there are two possible Cases for an object noun phrase. Depending on the reading associated with it, the object of a transitive verb will be marked either with accusative or with partitive Case. This is illustrated in (46a) and (46b):
(46) a. Anne joi maidon.
        Anne drank milk-ACC
        'Anne drank (up) the milk.'
     b. Anne joi maitoa.
        Anne drank milk-PART
        'Anne drank (some) milk.'
Belletti (1988) generalizes this observation to other languages, claiming that in general transitive verbs assign either structural or inherent Case to their objects, while assuming that accusative Case is structural, whereas partitive Case is inherent. Belletti furthermore proposes that unaccusative verbs only lack the capacity to assign accusative Case, whereas their capacity to assign partitive Case is maintained. This is also suggested by Finnish sentences; consider some Finnish existential sentences containing an unaccusative verb:
(47) a. Syntyi vaikeuksia.
        arose difficulties-PART
        'Difficulties arose.'
     b. Sellaisia virheitä esiintyy usein.
        such-PART mistakes-PART occur often
        'Such mistakes occur often.'
Belletti considers the fact that unaccusative verbs are inherent Case assigners to provide a straightforward explanation of the definiteness effect in existential sentences. Her argumentation is as follows. In a language such as English only a restricted class of unaccusative verbs is allowed in existential sentences. So, the verb phrase-internal subject in an existential sentence receives its Case directly from the ergative verb. This is partitive Case, the only Case ergative verbs can assign to their D-structure objects. Partitive Case selects an indefinite meaning, a claim that is based on examples such as (46a) and (47). Therefore, we get a definiteness effect in the object position of unaccusative verbs.
In De Hoop (1992), it is argued that Belletti's point of view concerning the relation between the meaning associated with partitive Case and indefiniteness cannot possibly be correct. Belletti claims that partitive Case is only compatible with an indefinite interpretation, the reason being that this Case always has a meaning such as 'some of', 'part of a larger set'. It is not true, however, that there is an incompatibility between partitive Case and a definite noun phrase in Finnish. In traditional Finnish grammar (cf. Karlsson 1983), the alternation between a partitive object and an accusative object is attributed to two semantic distinctions, namely indefiniteness versus definiteness on the one hand, and irresultativity versus resultativity on the other. An example of the latter distinction is found in (48):
(48) a. Anne rakensi taloa.
        Anne built house-PART
        'Anne was building a/the house.'
     b. Anne rakensi talon.
        Anne built house-ACC
        'Anne built a/the house.'
Note that the partitive object need not express indefiniteness, when the sentence is interpreted irresultatively. This becomes even more clear in the following example where the partitive Case bearing object contains the strong universal quantifier kaikki 'all':
(49) Presidentti ampui kaikkia lintuja.
     president shot all-PART birds-PART
     'The president shot at all birds.'
This runs counter to Belletti, who actually argues that the fact that universal quantifiers are excluded in existential sentences is a direct consequence of their being intrinsically incompatible with partitive Case. In Finnish, partitive objective Case is a syntactic Case that should be distinguished from lexical Cases such as elative, inessive, adessive, etcetera. This is in accordance with Nikanne (1990) who also distinguishes between these lexical Cases on the one hand, and syntactic Cases (among which the partitive) on the other, while arguing that phrases bearing lexical Cases are in fact prepositional phrases, whereas the ones with syntactic Case are real noun phrases. Vainikka (1989) also argues that partitive Case in Finnish is not an inherent Case, but rather, as she puts it, a structural default Case. This term refers to Cases which establish a direct relation between structural position and type of Case. Vainikka claims that in Finnish, partitive Case is the default Case for the object position of a transitive verb.
The claim that partitive Case in Finnish is a structural Case gets additional support from some Exceptional Case Marking (ECM) contexts. In Belletti's theory, it is crucial that partitive Case cannot be assigned under ECM because it is an inherent theta-related Case. In this way she can account for the paradigm in (50):
(50) a. * Consideravo [SC studenti intelligenti].
          I-considered students intelligent
     b. Consideravo [SC gli studenti intelligenti].
        I-considered the students intelligent
     c. I consider students intelligent.
According to Belletti, the matrix verb in (50a) cannot assign partitive Case to the small clause subject, due to the inherent nature of partitive Case, i.e., the subject is not θ-marked by the matrix verb but by the adjectival phrase. If the subject is a definite noun phrase (cf. (50b)), it receives structural accusative Case from the matrix verb under ECM. Comparing (50a) to its grammatical English counterpart in (50c), the assumption arises that in English the bare plural can have accusative Case, and indeed, bare plurals in English do not necessarily get a weak reading; they can also get a generic reading, which is what actually happens in (50c). What is going on in (50a) and (50c) is that the individual level predicate (be) intelligent triggers a strong reading on its subject, which is universally the case for individual level predicates (cf. Carlson 1977). Bare plurals in Italian can only get a weak (existential) reading, but bare plurals in English can get a strong (generic) reading. Therefore, (50a) is ill-formed, whereas (50c) is fine. This obviously holds independently of the ECM construction. In fact, the subject of a small clause selected by the verb consider bears partitive Case in Finnish (I owe this example to Anne Vainikka, p.c.):
(51) Anne pitää [SC helsinkiläisiä kummallisina].
     Anne considers inhabitants-of-Helsinki-PART strange
I think the above example sufficiently shows that ECM verbs do have the option of licensing partitive Case, just like transitives, and this supports the claim put forward in the literature that partitive Case is a structural rather than an inherent Case (see also Vainikka and Maling 1996 for additional arguments with respect to Finnish partitive Case and Kornfilt 1996 for the structural nature of Turkish partitive Case). So far, it has become clear that there are two types of structural Case for objects, and the choice between the two types in a particular context appeared to be a matter of either the strength of the object or the (ir)resultativity of the predicate. The question is what might be common to these two different semantic distinctions, one of which is nominal and one aspectual. In De Hoop (1992) it is claimed that noun phrases that bear weak (partitive) Case are always interpreted as part of the predicate, i.e. predicate modifiers, whereas noun phrases that bear strong (accusative) Case must be interpreted as real arguments, and that predicate modifiers and real arguments differ in semantic type. Certain syntactic operations, in particular Case assignment, reflect type-shifting operations, in such a way that noun phrases that bear strong structural Case have the semantic type of a quantifier. Apart from referential (type e), quantificational (type ⟨⟨e,t⟩,t⟩), and predicative types (type ⟨e,t⟩) for noun phrases (cf. Partee 1987), a fourth semantic type is introduced for noun phrases that bear weak Case, namely the predicate modifier type (type ⟨⟨e,t⟩,? definite complex DPs with relative clauses (but not indefinite ones; cf. Postal 1998)
(7)
What the police arrested was this video
•
definite complex DPs with complement clauses (but not indefinite ones; cf Rothstein 1988)
(8)
Which man did they consider
•
subjects (regardless of definiteness), unless clausal (presumably because so-called sentential subjects are topics rather than subjects; cf. Koster 1978b)
(9)
a. * Which man did shock you? b. ? This is something which would be futile (Kuno & Takami 1993)
•
coordinate structures, unless extraction is across-the-board (10b) or a "fake coordination" is involved (10c) (on the latter, cf. section 1.9 of Progovac, this vol., and especially Postal 1998 for detailed discussion)
(10)
a. * Which man did you invite ? b. Which man did you invite ? c. This is the beer that I (Jacobson 1996)
Anna Szabolcsi and Marcel den Dikken
•
left branches (in some languages) (see Corver 1990)
(11)
a. * Which (man's) did you see ?
b. Combien as-tu lu?
how-many have-you read of books
3.2. Types of explanation
The standard explanation of the islandhood of Complex DPs, Subjects, wh-complements, and Left Branches is in terms of Subjacency. Subjacency is classically viewed as a condition on movement requiring that movement not cross more than one bounding node, with bounding nodes originally defined in terms of a list: NP and S (= DP and IP) for English (Chomsky 1973, 1977), NP and S' (= DP and CP) for Italian (Rizzi 1978). (See Richards 1997 for detailed discussion of strong island effects arising at LF, and also of the lifting of such effects in the presence of another movement operation of the same type which does not cross the island and pays the "subjacency tax," thereby satisfying his Principle of Minimal Compliance.) Chomsky (1986) redefines bounding nodes as barriers, which are themselves defined in terms of blocking categories (BC). An XP is a BC for an element α iff XP dominates α and is not L-marked (i.e., θ-governed by a lexical category). All BCs for α except IP are also inherent barriers for α; in addition, a node YP which immediately dominates XP (a BC for α) will be a barrier by inheritance for α. Chomsky (1986) assumes that movement is constrained by 1-subjacency (i.e., not more than one barrier should separate a trace from its antecedent); but crossing even a single bounding node leads to a mild degradation, and moreover, Chomsky (1986) crucially invokes 0-subjacency as a constraint of "chain composition," operative in his analysis of parasitic gap constructions. Cinque (1990) takes a 0-subjacency approach, en passant rethinking subjacency as a constraint on binding chains, not just movement.
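The definitions of blocking category, inherent barrier, barrier by inheritance, and 1-subjacency just given can be sketched procedurally. The following is only a minimal illustration under simplifying assumptions that are not part of the original text: the path from the trace to its antecedent is given as a list of maximal projections, each entry is taken to immediately dominate the previous one, L-marking is supplied by hand, and VP is left out of the toy paths on the assumption that adjunction voids it. The names `Node`, `barriers`, and `obeys_1_subjacency` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    """A maximal projection on the path between a trace and its antecedent."""
    label: str
    l_marked: bool  # supplied by hand; not computed from a lexicon


def barriers(path):
    """Return the labels of the barriers on `path`.

    `path` lists the maximal projections dominating the trace, innermost
    first; each entry is assumed to immediately dominate the previous one.
    """
    found = []
    for i, node in enumerate(path):
        is_bc = not node.l_marked                       # blocking category
        inherent = is_bc and node.label != "IP"         # IP is exempt
        inherited = i > 0 and not path[i - 1].l_marked  # dominates a BC
        if inherent or inherited:
            found.append(node.label)
    return found


def obeys_1_subjacency(path):
    """At most one barrier may separate the trace from its antecedent."""
    return len(barriers(path)) <= 1


# Toy paths (assumptions): extraction out of a subject DP vs. an object DP.
subject_path = [Node("DP", l_marked=False), Node("IP", l_marked=False)]
object_path = [Node("DP", l_marked=True), Node("IP", l_marked=False)]

print(barriers(subject_path), obeys_1_subjacency(subject_path))
print(barriers(object_path), obeys_1_subjacency(object_path))
```

On these toy inputs the subject DP counts as an inherent barrier and IP as a barrier by inheritance, so two barriers are crossed; on the object path no barrier arises. The sketch says nothing about the finer degradation effects or about 0-subjacency.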
In Chomsky (1999), the classic theory of bounding nodes and locality is partially recast in terms of (strong and weak) phases, in conjunction with a Phase Impenetrability Condition which makes only the head and the edge of a phase accessible to syntactic operations (see also McCloskey 2000 for an approach to locality effects and resumption in Irish couched in the "derivation by phase" model). Phases include vP and CP, and arguably also DP. (Chomsky's difference between weak and strong phases does not translate into the distinction between weak and strong islands: the variety of weak islands canvassed below is such that no definition of "weak phase" is likely to capture it.)
Islands
Adjunct islands are standardly explained by Huang's (1982) Condition on Extraction Domains (CED), hence by the Empty Category Principle (ECP): an extraction domain needs to be properly governed. Likewise, Pesetsky (1982) subsumes the Coordinate Structure Constraint under the path containment version of the ECP. The ECP has also been held responsible for the fact that adjuncts are impossible to extract out of islands, strong or weak (cf. the c-examples in (3) and (4)). Manzini's (1992) integrated theory of locality is the only attempt to unify the effect of tense (cf. (4) vs. (5)) and definiteness (cf. (6)) with other locality phenomena: D and T block dependencies based on Case-addresses, which, in her theory, DPs otherwise rely on to escape from islands (dependencies based on categorial indices being blocked across all islands). Manzini's approach in terms of Case-addresses is only one of the extant ways of making the DP/PP distinction, which serves as Cinque's principal diagnostic for the strong/weak dichotomy (cf. (3a) vs. (3b)). Chomsky (1986: 32, 66), who notes it in passing and attributes the observation to Adriana Belletti, suggests an account of the contrast in (3a,b) in terms of the Barriers theory of adjunction (such that intermediate adjunction to the adjunct PP to circumvent the island is allowed only in the case of extraction of a DP). Alternatively, the DP/PP dichotomy is due to the fact that DP-gaps may be null resumptive pronouns while PP-gaps cannot be: there are overt resumptives for noun phrases, but according to Cinque (1990) there are none for PPs.
(12)
The DP-gap inside strong islands is not a trace but an A'-bound empty pronoun, pro.
Perlmutter (1972) originally proposed that all extractions leave invisible resumptive pronouns, regardless of whether the gap is in an island or not. Obenauer (1984/1985) claims that all extraction from islands involves null resumptives. Postal (1998) essentially follows this line of thought, while appealing to a different cutting of the strong/weak pie. Cinque (1990) narrows the application of the null resumption strategy down to A'-dependencies between a DP and a gap contained in a strong island. Rizzi (2000) picks up on the generalization that A'-dependencies across strong islands succeed only in the case of DP-dependencies, extending the account into the realm of weak islands, which we turn to next.
4. Weak islands
4.1. Bird's eye view
Leaving the strong islands behind, we now embark on a voyage in the archipelago of weak islands (WIs). These come in a variety of forms, listed in (A), below, according to the types of constituents which induce WI effects. The list in (B) enumerates the types of elements whose extraction is sensitive to WIs.
(A) What induces a WI?
(A1) tenseless wh-questions
(A2) VP-adverbs
(A3) negatives and other affective operators
(A4) response stance and non-stance vs. volunteered stance predicates
(A5) scope islands
(A6) extraposed constituents
(A7) anti-pronominal contexts (cf. Postal 1998)
(B) What constructions are sensitive to WIs?
(B1) extraction/wide scope of adjuncts and predicates (versus arguments)
(B2) extraction/wide scope of non-referential (versus referential) expressions
(B3) extraction/wide scope of non-D(iscourse)-linked (versus D-linked) expressions
(B4) extraction/wide scope of non-individuals such as manners, amounts, predicates, and collectives
(B5) functional readings and event-related readings
(B6) split constructions
(B7) negative polarity item (NPI) licensing
(B8) cross-sentential anaphora
For an exhaustive overview of the ins and outs of the WI-inducers and WI-sensitivity, we refer the reader to Szabolcsi (1999). In what follows, we will highlight the main theoretical issues in the domain of weak islands, zooming in on the three major players in the field, as listed in (C), where, for each approach, we have listed the data and generalizations accounted for (the % sign indicating a partial account). As a caveat, we should point out that C3 collapses two distinct versions of the Scope Theory, whose individual empirical coverage is not as broad as their sum total.
(C) Theories of weak islands
(C1) ECP and Subjacency (A1) (B1)
(C2) Relativized Minimality (A1, A2, A3%, A4%, A6%) (B2, B3, B6)
(C3) Scope Theory (A1, A2, A3, A4, A5, A6%) (B3-B8)
In the presentation to follow, we will bring up the various theories in conjunction with the empirical data to which they are most closely tied, more or less as a reflex of the fact that practically each new theory of island phenomena in the literature comes with its own new set of data. Space preventing a more detailed outline of the facts, we will often resort to illustrating weak islands just with adjunct extractions.
4.2. Types of explanation
4.2.1. ECP and Subjacency (C1)
The historical starting point when it comes to weak islands is the assumption, made in Huang (1982), Lasnik & Saito (1984, 1992) and Chomsky (1986), that the paradigmatic (if not the only) case of weak islands is (tenseless) wh-islands (A1):
(13)
a. ? Which man are you wondering ?
b. *How are you wondering ?
The distinction between arguments and adjuncts (B1) seen in (13) is standardly taken to follow from the division of labor between the Empty Category Principle and Subjacency (C1). While all extraction out of a wh-island violates the Subjacency Condition (hence delivers a degraded result, to a greater or lesser degree depending on factors such as finiteness and definiteness), adjunct extraction from such an island in addition incurs a violation of the ECP, in ways that differ in detail in the various approaches developed in the literature but which need not be made precise in the present context. The essence is that the extraction of some phrase is ECP-sensitive to wh-islands because of the fact that it originates in a non-argument position. On the ECP and Subjacency approach, the theory of weak islands is purely syntactic, both with respect to the WI-inducer (a constituent with a filled SpecCP, occupying the "escape hatch" position) and when it comes to the sensitivity of extractees to weak islands (originating in argument or non-argument positions). But through the years it has become clear that such a straight and simple syntactic classification of WI-inducers and WI-sensitive expressions is insufficient to cover the data uncovered in the archipelago of weak islands, on both counts. In the inducer domain, it is especially the scope islands (A5) that lay bare the inadequacy of a strictly
syntactic approach. And in the realm of WI-sensitive expressions it is not at all obvious how, alongside non-individual wh-phrases, we might capture the amount and event-related readings of numerical QPs, functionally interpreted which-phrases, definite dependents of "one time only" predicates and negative polarity items under one and the same syntactic umbrella. The first indication that the C1 approach was inadequate came from Obenauer's (1984/1985) observation that VP-adverbs (A2) block so-called "quantification at a distance" (QAD; cf. (14)), a case of "split constructions", B6 (see also de Swart 1992, Honcoop 1998 for cases of Dutch wat voor split blocked by VP-adverbs; see (34) for illustration, and see Rizzi 2000 for a similar case of "splitting" in Italian, involving wh... d'altro 'wh else'). That they also block adjunct extraction (B1) is illustrated in (15) (see Doetjes 1997 for discussion).
(14)
a. J'ai beaucoup consulté [_ de livres]
I have a-lot consulted of books
'I consulted a lot of books'
b. * Combien as-tu < beaucoup consulté [_ de livres]>?
how-many have-you a-lot consulted of books
(15)
a. *How did you ? b. *How did you ?
These island effects clearly do not fit the classic "escape hatch" model developed on the basis of wh-islands, for the simple reason that VP-adverbs do not occupy any escape hatch position.
4.2.2. Relativized Minimality (C2)
Rizzi (1990) capitalized on Obenauer's data in (14b) and used them as the key to a novel theory of locality. His Relativized Minimality (C2) deserves the credit of being the first relatively broad-scale attempt at providing the classic "escape-hatch based" approach to weak islands (C1) with a more empirically accurate successor (see also Cinque 1990, and Rizzi 2000 for an updated approach taking Cinque 1990 and Chomsky's 1995 copy theory of traces into account). Rizzi (1990) builds primarily on the theoretical analysis of QAD in Obenauer (1984/1985), whose crucial insight is that a local relation between an operator and its variable is blocked by the intervention of any third party that may be derivationally totally unrelated to them but is sufficiently similar to the operator. Relativized Minimality is a representational theory of "like" intervention. It replaces Chomsky's (1986) "rigid" approach to minimality (according to which only an intervening head governor induces a minimality barrier) by a theory which relativizes minimality to the kind of relationship that obtains between the governor and the dependent:
(16)
RELATIVIZED MINIMALITY
X α-governs Y only if there is no Z such that
(i) Z is a typical potential α-governor for Y, and
(ii) Z c-commands Y and does not c-command X.
Rizzi distinguishes four kinds of values for α: head, antecedent in an A-chain, antecedent in an A-bar chain, and antecedent in a head-chain. Recent developments of Relativized Minimality are Chomsky's (1995) Minimal Link Condition and its revision in Manzini (1998). Since we are concerned with chains headed by a wh-phrase in A'-specifier position, all and only A'-specifiers are relevant interveners in the discussion to follow. That wh-expressions such as who and whether count as such will not be surprising, so the classic wh-island effect is straightforwardly accommodated. The VP-adverb facts reviewed above also fit in, on the assumption that the VP-adverbs in question occupy an A'-specifier position. The theory can also be applied to explain the WI effects induced by negative and other affective operators (A3), illustrated in (17) (cf. Williams 1974 for the original observation that (unstressed) negatives block adjunct extraction; also cf. Ross's 1984 Inner islands, and Rizzi 1990 for a broadening of the empirical domain to all affective operators in the sense of Klima 1964):
(17)
a. b. c. d.
* I asked how John
wh 'For every boy, which book did he read?'
wh > every 'Which book is such that every boy read it?'
independent scope (uniformity presupposition) 'Taking for granted that every boy read the same book, what was this book?'
Reading (25a) is often called a pair-list reading, as it is answered by a list of pairs: 'Bill read Magic Mountain, Jim read The Russia House,...'. Readings (25b) and (25c) both ask for a single book that was read by every boy, but differ as to the possibility of what else each boy may have read. For instance, if Bill read Jurassic Park and Tom Jones, Jim read Jurassic Park and Airframe, and so on, reading (25b) is felicitous and the answer is 'Jurassic Park.' Reading (25c) is not felicitous in the same situation: it presupposes that each boy read just one book, and moreover, the same one, and merely asks to identify the book. The question is whether these three readings are equally possible when every boy interacts with a WI-sensitive expression. É. Kiss (1993) and de Swart (1992) make the fundamental observation that universals are harmless only when they scope above or independently of the sensitive wh-phrase. When they scope below it, they induce a WI. Thus, É. Kiss observes that the example in (26) is grammatical on two of its three readings only.
(26) How did every boy behave?
a. every > wh 'For every boy, how did he behave?'
b. * wh > every * 'What was the common element in the boys' non-uniform behavior?'
c. independent scope (uniformity presupposition) 'Taking for granted that every boy behaved the same way, what was it like?'
Szabolcsi (1997) also shows that non-affective QPs that cannot take wide or independent scope invariably induce WIs. The Scope Generalization emanating from these observations can be stated in either of the following ways:
(27)
a. If Op_i has scope over Op_j and binds a variable in the scope of Op_j, Op_i must be specific. (É. Kiss 1993)
b. A quantifier Q1 can only separate a quantifier Q2 from its restrictive clause if Q1 has wide scope over Q2 (or is scopally independent of it). (de Swart 1992)
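The three scope readings that the Scope Generalization appeals to can be made concrete with the book scenario used for (25). The sketch below is only an illustration: it computes what each reading would return over a small table of who read what, using the two boys mentioned in the text. The function names are hypothetical, and nothing in the code models the WI effect itself, i.e. why the wh > every reading is unavailable for manner questions such as (26).

```python
# Scenario from the discussion of (25): who read which books.
read = {
    "Bill": {"Jurassic Park", "Tom Jones"},
    "Jim": {"Jurassic Park", "Airframe"},
}


def pair_list(read):
    """every > wh: answer with a list of boy/book pairs."""
    return sorted((boy, book) for boy, books in read.items() for book in books)


def wh_over_every(read):
    """wh > every: the books such that every boy read them."""
    return set.intersection(*read.values())


def independent_scope(read):
    """Uniformity presupposition: every boy read just one book, the same one.

    Returns that book, or None when the presupposition fails.
    """
    answers = list(read.values())
    uniform = all(len(a) == 1 and a == answers[0] for a in answers)
    return next(iter(answers[0])) if uniform else None


print(pair_list(read))
print(wh_over_every(read))      # the single shared book
print(independent_scope(read))  # presupposition failure in this scenario
```

In this scenario the wh > every reading yields 'Jurassic Park', while the independent-scope reading is infelicitous because the boys did not each read just one and the same book.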
The Scope Generalization puts the whole WI phenomenon in an entirely new light (see also Frampton 1991). Just as Relativized Minimality was based on the observation that the range of WI-inducers is much wider than Subjacency can account for, the Scope Generalization expresses the observation that both the range and the nature of WI-inducers is different from what Relativized Minimality (in its original form or in its monotonicity reincarnation) can take care of. Tying everything in the domain of WI-inducers to the property of "being scopal" does not make the desired cut, however. After all, there are expressions that some well-established theories classify as scope-bearing operators but which nevertheless do not induce WIs. Such are indefinite DPs and intensional verbs like want:
(28)
a. How did a boy behave _?
b. How do you want me to behave _?
Confronted with such cases, one may either embrace an analysis according to which indefinites and intensional operators are not scopal, or draw some principled demarcation line between scopal expressions, predicting some of them to be innocuous. Szabolcsi & Zwarts (1993) and Honcoop (1998), who both seek to explain the Scope Generalization stemming from É. Kiss (1993) and de Swart (1992) in formal semantic terms, follow the latter strategy.
4.2.5. Scope Theory (C3): the Algebraic Semantics version
Szabolcsi & Zwarts 1993 (reprinted, with a handful of new notes, as Szabolcsi & Zwarts 1997) is an instantiation of the Scope Theory (C3) embedded in the theory of Algebraic Semantics. It is proposed that WI-violations are semantically incoherent, in much the same way as *six airs is, where a numeral is applied to a mass term. In both cases, the source of incoherence is the fact that an operator wants to perform an operation which cannot be performed in the denotation domain of the rest of the expression. It is well-known that the semantic contribution of many operators can be defined in terms of set-theoretic (Boolean) operations. Not is definable in terms of complement formation, every in terms of intersection, some in terms of union, and many other operators as combinations of these (and perhaps further non-Boolean ingredients). Szabolcsi & Zwarts propose to make this explicit in the interpretation of sentences and thereby use it to explain why certain expressions can, and others cannot, scope over certain operators. When an expression E scopes over some operator O, the operations that define O need to be performed in E's denotation domain. For instance, in calculating the denotation of Who didn't you see? we take the complement of the set of those whom you saw, and in calculating the denotation of Who did every girl see? (on the wh > every reading) we intersect the sets of those seen by individual girls. This is possible precisely because who ranges over individuals, and individuals form sets, on which complementation and intersection (as well as union) can be performed. On the other hand, Szabolcsi & Zwarts argue that the denotation domain of WI-sensitive amounts, manners, etc. does not lend itself to complementation and/or intersection (they form join semi-lattices). Therefore, these cannot scope over negation, universal quantifiers, or other operators whose definition involves similar operations. They can scope over existentials (whose definition is in terms of union) or intensional verbs (whose semantic contribution is not Boolean in nature). This proposal straightforwardly accounts for the WI-inducing effect of affective operators (A3) and (non-existential) quantifiers (A5). It is claimed, albeit somewhat programmatically, that the same analysis carries over to wh-expressions (A1), quantificational adverbs (A2) and response stance and non-stance predicates (A4). Szabolcsi & Zwarts also explain the absence of WI-effects in (28), with the aid of the observation that plain indefinites like a boy only rely on union (the operation that even join semi-lattices have), and want and should, while scopal, do not make a Boolean contribution. Hence the intervention of plain indefinites and intensionals is correctly predicted to be harmless.
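The algebraic idea can be illustrated with a toy model. The sketch below is an assumption-laden illustration rather than Szabolcsi & Zwarts' own formalization: individuals are modeled as a set with complement and intersection available, while manners are modeled as a free join semi-lattice in which only join (union of non-empty sums) is always defined. The domain `PEOPLE`, the manner atoms, and the function names are all hypothetical.

```python
# Individuals form a Boolean domain: complement and intersection exist.
PEOPLE = {"Ann", "Bob", "Cy"}  # hypothetical domain of individuals


def complement(seen):
    """'Who didn't you see?': complement of the set of those you saw."""
    return PEOPLE - seen


def intersection(sets):
    """'Who did every girl see?' (wh > every): intersect the per-girl sets."""
    return set.intersection(*sets)


# Manners as a free join semi-lattice: elements are non-empty sums of atoms.
def join(a, b):
    """Join is always defined: the sum of two manners."""
    return frozenset(a) | frozenset(b)


def meet(a, b):
    """Meet is partial: there is no bottom element for disjoint manners."""
    common = frozenset(a) & frozenset(b)
    if not common:
        raise ValueError("meet undefined: the domain has no bottom element")
    return common


print(complement({"Ann"}))
print(intersection([{"Ann", "Bob"}, {"Bob", "Cy"}]))
print(join({"politely"}, {"quickly"}))
try:
    meet({"politely"}, {"quickly"})
except ValueError as err:
    print("undefined:", err)
```

Because the toy manner domain lacks a bottom element, complements cannot be defined in it at all, and meets exist only for overlapping elements. This mirrors the claim that operators defined via complement or intersection cannot be computed there, whereas union-based existentials can.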
Szabolcsi & Zwarts' algebraic Scope Theory has the additional advantage of accounting for an original observation of theirs which seems problematic for all other approaches to what expressions are immune to Wis. They observe that extraction of arguments and adjuncts of non-iterable or "one time only" predicates (B4), which, by virtue of the very nature of the predicate, must be interpreted as collectives, is sensitive to weak islands. The extraction from a negative island in (29b) is acceptable only on the assumption that the same house can be destroyed more than once; i.e., it is unacceptable on the verb's natural "one time only" interpretation: (29)
a. Which soldier(s) didn't _ visit this house? b. ?? Which soldier(s) ED
In (32b), a new coat does not c-command it. Hence a new coat can only bind it in a dynamic fashion. Consequently, the application of ED is well-formed only in contexts that allow cross-sentential anaphora. But if the indefinite is inside an inaccessible domain created by some operator, and the pronoun is outside that domain (as depicted in (33)), binding, and ED, are not possible.
(33)
* {x:... OP [...indefinite_j...] and it_j is identical to x} where OP creates an inaccessible domain for anaphora
Honcoop now makes two crucial predictions. First, he predicts that scopal operators that create inaccessible domains for anaphora will make splitting impossible, on the assumption that the operator in split constructions
is related to the indefinite in the same way as an adverb of quantification (like usually in (31a)) is related to the indefinite it binds and that, hence, ED is required in both cases. (34)
Wat heeft hij (*niet / *twee keer) gezegd dat hij voor boeken heeft gelezen?
what has he not/two times said that he for books has read
This prediction is borne out, as the discussion of B6 has shown. Secondly, Honcoop predicts that any other phenomenon whose treatment necessitates an application of ED for some other reason will, similarly, be sensitive to WI-inducers, viz. inaccessible domain creators. An interesting novel domain that bears the second prediction out is that of NPI-licensing. Linebarger (1987) observed that the licensing relation between negative polarity items (B7) and their triggers is blocked by a variety of interveners. Picking up on this observation, Honcoop makes the novel argument that these are precisely the same interveners that create weak islands/inaccessible domains. Conclusive evidence for the WI-sensitivity of NPIs comes from scope islands (A5).
(35)
a.
John didn't give the beggar a red cent trigger: not; NPI: a red cent b. *John didn't give
To account for NPI licensing (B7), Honcoop (rather than relating this directly to splitting, which would be impossible since not all NPI-licensors can be analyzed as unselectively binding them) points out that all NPIs are associated with a scalar implicature. This requires computing entailment relations between alternative propositions, and the formation of these alternatives in turn requires an application of ED. Honcoop also notes that his Scope Theory provides for an explanation of the WI-sensitivity of Krifka's (1990) event-related readings (B5):
(36)
Four thousand ships passed through the lock last year
a. object-related: 'there were 4,000 distinct ships that passed through the lock'
b. event-related: 'there were 4,000 lock traversals by ships'
a. b.
How many ships
ho[ss']il 'single-ply thread' (Sohn 1994:469)
Sudanese Arabic also illustrates assimilation of a stop to a following fricative; specifically, assimilation occurs only when the two consonants share place of articulation. See the sources for additional data.
(11) Sudanese Arabic (from Kenstowicz 1989, based on Hamid 1984)
a. kitaa[b] 'book'
kitaa[f] fathi 'Fathi's book'
kitaa[p] samiir 'Samiir's book'
b. bi[t] 'girl'
bi[t] fariid 'Fariid's girl'
bi[s] saamya 'Saamya's girl'
?al-bi[S] Saafat 'the girl saw'
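The pattern in (11) can be restated as a small rule. The sketch below is only a descriptive summary of the seven forms cited, not an analysis from the sources: the place classes (grouping [f] with labial [b], and [s] and [S] with coronal [t]) are read off the data as an assumption, the devoicing of [b] to [p] before a fricative of a different place is likewise taken directly from kitaa[p] samiir, and the function name is hypothetical.

```python
# Place classes assumed from the data in (11); S stands for the palato-alveolar.
PLACE = {"b": "labial", "f": "labial", "t": "coronal", "s": "coronal", "S": "coronal"}
FRICATIVES = {"f", "s", "S"}   # all voiceless in the cited forms
DEVOICED = {"b": "p"}          # voicing change seen in kitaa[p] samiir


def realize_final_stop(stop, next_initial=None):
    """Surface form of a word-final stop before the next word's first consonant."""
    if next_initial in FRICATIVES:
        if PLACE[stop] == PLACE[next_initial]:
            # Shared place: the stop assimilates fully to the fricative.
            return next_initial
        # Different place: no manner assimilation, only devoicing if voiced.
        return DEVOICED.get(stop, stop)
    return stop


for stop, nxt in [("b", None), ("b", "f"), ("b", "s"),
                  ("t", None), ("t", "f"), ("t", "s"), ("t", "S")]:
    print(stop, nxt, "->", realize_final_stop(stop, nxt))
```

The seven calls correspond, in order, to the seven forms in (11a) and (11b).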
The D-effect in many Athapaskan languages provides evidence that continuants can show unmarked patterning and stops marked patterning. This process coalesces a stop /t/ with a following fricative in certain environments, creating a stop with the place of articulation of the fricative. This can be seen in Ahtna, as in the example in (12) (Kari 1990:25).
(12)
s-t-Roi → sqoi 'it broke' (R = uvular fricative; Ahtna, Lower dialect)
Both ∅/continuant (Korean, Sudanese Arabic) and stop/∅ patterns (Ahtna) are found. Emergence-of-the-unmarked environments likewise sometimes have stops and sometimes fricatives. In Korean, coronal fricatives neutralize to stops, as in (13).
(13)
/os/ o[t] 'garment' (Sohn 1994:473)
/ka-s'-ta/ ka[t]t'a 'went' (Sohn 1994:473)
However, in some cases, the continuant patterns as if its manner were unmarked. In Ahtna, which has a contrast between stops and continuants stem-finally, the stop-continuant contrast is neutralized in favour of the continuant word-finally, as in (14). (14)
/kuuq/ kuu[x] 'be warm, hot, momentaneous imperfective' (Kari 1990: 127)
In Korean and Ahtna, evidence from both emergence and submergence of the unmarked converges, showing that stop is unmarked in Korean and continuant in Ahtna. Returning to vowel systems, variation in patterning of height features occurs. Either high vowels or mid vowels can pattern as unmarked, as seen by either patterning as targets. For instance, in many Bantu languages high vowels are targets and mid vowels triggers for mid assimilation (e.g., Steriade 1995:122-123), while in many Spanish dialects with metaphony mid vowels are targets and high vowels triggers for high assimilation (Dyck 1995). (See Walker forthcoming for an interesting alternative view of metaphony based on the assumption that high vowels always represent the unmarked height, and an argument that assimilation to an unmarked feature is possible; see also Baković 2000.) Emergence of the unmarked reinforces these two possibilities: a high vowel is often epenthetic (e.g., Navajo (Athapaskan), Hargus and Tuttle 1997), but a mid vowel may be epenthetic as well (Gengbe (Abaglo and Archangeli 1989), Spanish, and many Athapaskan languages (Hargus and Tuttle 1997)). These examples point to a single conclusion: within a class, it is not necessarily possible to identify a single feature of an opposition as unmarked cross-linguistically, but the feature which patterns as unmarked can differ from language to language.
4.4. The absence of contrast: variation in the emergence of the unmarked
Another type of variation is found when languages are compared, namely variability in the detailed realization of a particular phonological representation. In this section, I focus on cases where there is no evidence for a lexical contrast between two features within a feature class; however, either contextually or in variation, different surface realizations are possible. These cases thus involve the emergence of the unmarked. Consider first contextual cases, or allophony, where no particular evidence is available for which feature is present lexically as no contrast exists. For instance, in the absence of a contrast between stop and fricative manner of articulation in obstruents, either a stop or a continuant can surface, depending upon context. This is seen in many Australian languages (e.g., Hamilton 1996). Spanish too exhibits variation between voiced stops and voiced fricatives/approximants in different contexts. Just as either stops or continuants can pattern as unmarked with respect to phonological processes, they can be positional variants of a single representation. Whether the stop or the continuant is found depends on environment, and it is not possible to speak of one as being less marked than the other in any absolute sense, but only in terms of syntagmatic context. Notice that in counting frequencies or determining implications (section 5), such patterns must be ignored: it is the linguistic analysis that determines the underlying manner status.
In many cases, free, or non-contextual, variation in the realization of a particular lexical representation can occur in a single position. In some Slave (Athapaskan) dialects, what is reconstructed as *n is realized variably as [d, n, nd] in the same position (Rice 1989, 1993). In White Mountain Apache (Athapaskan), a coronal (dental) and a velar stop occur stem-finally in free variation (Rice 1996). In Algonquin (Algonquian), [u] and [o] are in free variation phonetically, at least in stressed position. Ahtna (Athapaskan; Kari 1990) has variation between ts/tʃ, s/ʃ, etc. Steriade (1995: 155) remarks on the possibility of English alveolar stops being articulated either laminally or apically in the absence of a contrast. These cases illustrate that, in the absence of a contrast, variation in phonetic
Keren Rice
realization is possible. It thus may be difficult to pinpoint a single feature within a class as unmarked, given the possibility of variation in phonetic implementation. Variation thus may occur in the detailed realization of a particular sound in the absence of contrast. Variation in emergence of the unmarked features in epenthesis is also found. Briefly, epenthetic vowels can be front, central, or back in place and high, mid, or low in height; epenthetic consonants are drawn from laryngeal, coronal, and velar places of articulation; they can be obstruents or sonorants. Clearly, the emergence of the unmarked does not yield a single clear statement on any dimension.
4.5. Phonetic space and markedness
Another criterion for determining the marked member of an opposition concerns phonetic space. For example, the phonetic space assigned to a particular vowel in a system may vary considerably. For instance, in Diyari (Australian; Austin 1981), with an /i a u/ system, [u] is fixed in its position, while [i] can occupy a fairly large phonetic space, ranging from front to central (see Rice 1995 for discussion). In some Spanish dialects, phonemic mid vowels occupy phonological space from relatively low to mid, while phonemic high vowels are relatively fixed in their height; see Dyck (1995). Suppose that fixedness of phonetic realization correlates with markedness. The marked pole is more clearly defined and less subject to variation in interpretation; the unmarked pole can occupy whatever space the marked pole does not. In Diyari, this would suggest that /u/ is marked with respect to /i/, and in Spanish dialects that high vowels are marked with respect to mid vowels. This criterion is parallel to one that Battistella (1990: 27) notes is found in discussions of semantic markedness: "marked elements are characteristically determinate in meaning while the unmarked elements are characteristically indeterminate". Marked features in phonology show little variation in phonetic realization; unmarked elements are more subject to variability.
4.6. How much variation is possible?
In this section, I have focused on variation in the (un)marked feature within a class. Given the degree of variation, one might draw the conclusion that a theory of markedness based on phonological processes is hopelessly flawed. However, the variation is not without limit. The table in (15) provides information about featural classes and what can potentially serve as unmarked within a class based on phonological tests, attempting to abstract away from positional effects. I use the term 'unit'
rather than feature to avoid issues of feature substance. A few remarks will help in interpreting this table. In terms of vowel place, two possibilities are shown: a system may have two places or three places at a particular height. In terms of consonant place, three of a number of possible systems are illustrated, a two-way contrast and two three-way contrasts. Larger and smaller systems are evaluated slightly differently, and space precludes discussion here; see Rice forthcoming b.

(15) Variation in markedness (∅ indicates unmarked patterning)

class | possible units | possible unmarked units | markedness relations
vowel place 1 | front, back | front, back | ∅/back or front/∅
vowel place 2 | front, central, back | central | front/∅/back
consonantal place 1 | coronal, velar | coronal, velar | ∅/velar or coronal/∅
consonantal place 2 | coronal, labial, dorsal, (laryngeal) | coronal, (laryngeal) | ∅/labial/dorsal (ignoring laryngeal)
consonantal place 3 | coronal, labial, velar, (laryngeal) | velar, (laryngeal) | ∅/labial/coronal (ignoring laryngeal)
consonantal manner (obstruents) | stop, continuant | stop, continuant | ∅/continuant or stop/∅
vowel height 1 | high, low | high, low | high/∅ or ∅/low
vowel height 2 | high, mid, low | mid (?) | high/∅/low
laryngeal features | voiceless unaspirated; voiced (voiced unaspirated); spread glottis (voiceless aspirated); constricted glottis (voiceless glottalized) | voiceless unaspirated | ∅/voiced/sg/cg
tongue root | atr, rtr | atr, rtr | atr/∅ or ∅/rtr
tone 1 | H, L | H, L | ∅/L, H/∅
tone 2 | H, M, L | M | H/∅/L

Generally in classes with a two-way opposition (e.g., vowel place in the absence of a central vowel, consonantal manner, tongue root), either of the two poles of the opposition can pattern as unmarked, while in classes with a potentially larger number of oppositions (e.g., consonantal place of articulation, vowel place in the presence of a central vowel, laryngeal), not all features can pattern as unmarked. Complex segments involving more than one feature within a class (e.g., front rounded vowels) never exhibit unmarked patterning, nor do features that are not generally considered to be complex (e.g., labial and dorsal consonants; see section 6.4). Given a binary opposition between features [X] and [Y] of class [Z], either [X] patterns as marked and [Y] as unmarked, or vice versa. The introduction of a three-way opposition leads to two possibilities: both [X] and [Y] pattern as marked and the third member as unmarked (e.g., vowel place with front, central, and back vowels), or either [X] or [Y] patterns as unmarked and the others as marked (e.g., consonantal place).
4.7. Building up segments: relationships between feature classes

I have focused on variation as a complicating factor in coming to an understanding of markedness, looking strictly at what phonological processes reveal about potential markedness relationships within a class. In the creation of a sound, of course, it is necessary to take into account more than the simple notion of feature class: features of different classes combine to create segments. One way of viewing combinatorial properties of features is through a study of inventories; see section 5.1. A second way of investigating interaction of classes involves a study of the possible paths by which features from one class can combine with those from another class. For instance, consider an asymmetrical system such as /i e a u/. I assume that the relevant height features are high and low; see, however, Causley (1999) for an alternative analysis. On the one hand, one could start with height classes. The height classes are three: /i u/ are high, /e/ is mid, and /a/ is low. It is only within the high vowels that place is relevant. The mid vowel, /e/, is unmarked for place as it does not contrast with another vowel in its height class. Suppose, on the other hand, that place classes are defined first. This yields /i e/ as front, /a/ as central, and /u/ as back; now it is only within front vowels that height is contrastive. In this case, the mid vowel /e/ is marked for place; the back vowel /u/ is unmarked for height, again in the absence of an opposition within its place class. Steriade (1987) discusses notions of markedness along these lines, and Dresher (1998a, b, 2001) and Dresher and Zhang (2000) argue that languages differ in terms of these types of scope relationships, with the same phonetic inventory allowing different markedness relationships depending upon the order in which featural assignments are made.
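The effect of ordering described above can be sketched in code. This is a minimal illustration under stated assumptions, not the formalism of Steriade (1987) or of Dresher's work: each vowel of /i e a u/ is given a plain height label and a plain place label, and a feature is treated as contrastive for a vowel only if the vowels in its current group differ on that feature. The function name `contrastive_specs` and the dictionary layout are hypothetical.

```python
# Hypothetical labels for the /i e a u/ system discussed in the text.
# "mid" and "central" are used as plain labels for illustration only.
INVENTORY = {
    "i": {"height": "high", "place": "front"},
    "e": {"height": "mid", "place": "front"},
    "a": {"height": "low", "place": "central"},
    "u": {"height": "high", "place": "back"},
}


def contrastive_specs(inventory, order):
    """Assign features in the given order, keeping only contrastive ones."""
    specs = {segment: {} for segment in inventory}
    groups = [list(inventory)]
    for feature in order:
        next_groups = []
        for group in groups:
            values = {inventory[s][feature] for s in group}
            if len(values) < 2:
                # No opposition inside this group: the feature is not assigned.
                next_groups.append(group)
                continue
            for value in sorted(values):
                members = [s for s in group if inventory[s][feature] == value]
                for s in members:
                    specs[s][feature] = value
                next_groups.append(members)
        groups = next_groups
    return specs


height_first = contrastive_specs(INVENTORY, ["height", "place"])
place_first = contrastive_specs(INVENTORY, ["place", "height"])

for segment in INVENTORY:
    print(segment, height_first[segment], place_first[segment])
```

With height assigned first, /e/ receives no place label; with place assigned first, /e/ carries a place label and /u/ carries no height label, which corresponds to the two scenarios in the text.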
4.8. Summary
Markedness has more than one face. On the one hand, a range of factors lead to language-particular variation in the computation of markedness: positional requirements, contrasts, assessment of contrasts, and choice of feature may all differ. On the other hand, there is a universal base for markedness, with the range of variation being limited. Assuming that a theory of markedness should account for the types of phonological properties outlined above, a full theory of markedness must address the following.

(16)
i. Positions must be distinguished: Different positions can have different markedness demands.
ii. Inventories play a role in determining markedness: Depending upon the contrasts within a feature class, different features can emerge as unmarked with respect to submergence of the unmarked.
iii. Variation in markedness exists: Two languages with identical surface contrasts within a class may have different unmarked features with respect to both submergence and emergence of the unmarked.
iv. Variation in markedness is constrained: Not all features can show unmarked patterning. In general, variation, or local variability, exists with two-way contrasts, while fixedness, or global uniformity, occurs when more contrasts exist within a class.
v. Between-class interactions may vary: Interactions may create different markedness relations within systems that are identical on the surface.
5. Other markedness diagnostics: implication, frequency, acquisition
So far I have concentrated on phonological factors involved in assessing markedness. Perhaps because phonological processes do not converge on a single feature as unmarked, phonological processes have not been the major diagnostics of markedness. Instead, the factors that are most commonly cited are implication and frequency, factors that must be evaluated against cross-linguistic evidence. I begin with a general discussion of inventories (section 5.1), and then turn to implication (section 5.2) and frequency (section 5.3).
5.1. Inventories

This section is drawn from work with Peter Avery (Rice and Avery 1993, 1995). An investigation of inventories cross-linguistically reveals a striking generalization: the segmental make-up of inventories is not random. Smaller inventories contain statistically more common segments, and larger inventories contain the statistically more common segments as well as less frequent ones. While I have focused on features, segments are composed of features, and thus examining segments is worthwhile to further study combinatorial properties of features. The results are not surprising: segments that include more unmarked features are more likely to occur within inventories than segments that include more marked features. In work on consonant inventories, Lindblom (1988) recognizes three types of consonants, basic, elaborated, and complex. He states that "The number of segments a language uses in the basic or complex categories is predictable from the total size of its inventory. Small vowel or consonant inventories recruit only basic segments" (quoted in de Boysson-Bardies and Vihman 1991: 316). Lindblom identifies the following consonants as basic: three places of articulation of voiceless stops (p, t, k), three places of articulation of voiceless fricatives (f, s, ʃ), three places of articulation of nasals (m, n, ŋ), and two liquids (l, r). He argues that these are the most frequent consonants and elaborations on these (e.g., addition of laryngeal features, subdivisions of places of articulation, etc.) are added only after all basic articulations are present. An examination of the inventories in the UCLA Phonological Segment Inventory Database (Maddieson 1984) validates Lindblom's generalization: small inventories of similar size are similar and, as inventories expand, differences emerge.
For instance, abstracting away from laryngeal features, Yagaria (Indo-Pacific, 609), Roro (Austro-Tai, 420), and Finnish (Ural-Altaic, 53) have three- or four-way place distinctions in obstruents, two-way place distinctions in nasals, and, if they have liquids, single places of articulation for liquids (numbers refer to Maddieson 1984). Diyari (Australian, 367) and Nootka (Amerindian I, 730) have more complex consonant inventories, and illustrate some of the different ways in which consonant inventories can be expanded. Diyari expands places of articulation by adding coronal distinctions in stops and sonorants both, while Nootka adds post-velar distinctions in the stops. Vowel inventories show similar patterns. A typical three vowel inventory contains the vowels /i a u/ (e.g., Aranda, Australian 362), although /e a o/ is reported (e.g., Alabama, Amerindian I, 759), and a typical extension to a five vowel inventory involves the addition of mid vowels. In expanding to seven vowels, languages take different paths. Some add further distinctions in the mid vowels, as in Katcha (Niger-Kordofanian, 100), while others add front rounded vowels, as in Fuchow (Sino-Tibetan, 505). In consonant and vowel inventories, the overall observation can be made that the number of contrasts in place of articulation decreases with increasing sonority. If differences in numbers of places of articulation obtain within different manners, obstruents allow for more places than do nasals, and nasals in turn allow for more places than liquids. In vowel systems too, height, a sonority dimension (e.g., Schane 1984; Clements 1990), and place are differentiated. In a three-vowel system, high vowels may have place distinctions not shown by the low vowel. Thus, the prototypical three vowel system is /i a u/ rather than, for instance, /i, æ, a/, where place is phonologically contrastive for low vowels. A single contrastive low vowel, on the other hand, often exhibits a far greater range of phonetic variation in place than do the high vowels; see section 4.5. I have so far discussed combinatorial constraints involving manner and place features. Constraints in the elaboration of features within a class are also found. For instance, with rare exception, languages include a coronal obstruent, and the presence of other places of articulation implies the presence of coronals (if they are lacking a coronal obstruent, they have a velar one; see Rice 1996). Maddieson points out that the existence of a dorsal nasal presupposes the existence of both a labial and a coronal nasal. (Note that these generalizations take full rather than positional inventories into account.) A similar generalization exists within sonorants. In Maddieson's survey, five languages are reported as missing nasals, while eighteen languages are without liquids, but have nasals.
Maddieson lists only four languages with a liquid but no nasal; however, the languages that he identifies as nasal-less have voiced stops that pattern as sonorants (see Avery 1996; Piggott 1992; Rice 1993; Rice and Avery 1991 on this claim). The existence of liquids implies that of nasals or voiced obstruents that replace nasals in an inventory. Thus, constraints on the elaboration of inventories exist. First, the basic segments result from combining less marked features within their classes. Second, within a class, implicational constraints are found. Inventory shapes and the elaboration of inventories thus reinforce the conclusions about markedness drawn in section 4.

5.2. Implication and frequency
A diagnostic of markedness that is often cited in the literature is implication: a feature X is more marked than a feature Y if the presence of X implies the presence of Y. Related to implication is neutralization: neutralized elements form the unmarked member of an opposition, as they must since the marked counterpart can be present only in the presence of the unmarked counterpart. Unmarked features are often also identified by frequency: unmarked features are more frequent than marked features. Frequency can be investigated both language-internally and cross-linguistically. For example, Hamilton (1996) argues for the markedness of non-coronals in stem-final position in Australian languages on the basis of both implication and frequency. As discussed in section 4.1, all Australian languages with final labials and/or dorsals have final coronals; in addition, in languages with more than coronals allowed in this position, coronals are of greater frequency than non-coronals. Maddieson (1984) investigates cross-linguistic frequencies as well, reinforcing the conclusion that unmarked features/segments are more frequent than marked ones.

5.3. On the status of implication and frequency
In this section I raise two issues, one empirical and one theoretical, involving implication and frequency as diagnostics for markedness relationships. Recall from section 4.2 that if a central vowel is present in a system, it has unmarked phonological characteristics, all other things being equal. One might expect that central place should be implied by other places and be the most frequent place cross-linguistically. However, this is not the case judging from inventories. Maddieson lists 40 languages with a high central vowel, while 271 have the vowel /i/ and 254 the vowel /u/. These numbers may not be completely accurate, but they are indicative: phonetically, central place is neither implied by other places nor is it frequent compared to other places. Taken in this simplistic way, the notion of implication is unsuccessful at dealing with markedness facts: the phonological evidence that central vowel place is unmarked within its height is clear, but not reinforced by implication and frequency. This is not surprising, as not only do unmarked features emerge, but they are also frequently submerged, as discussed in section 3. Phonological patterning and frequency facts thus may conflict; see section 6 for discussion of how this problem might be resolved. In other cases, implication gives expected results. For instance, either front or back place could pattern as unmarked in the absence of central place; no implicational relationship is found and their frequencies are roughly the same. Implication also faces a theoretical complication. Consider the position of a child acquiring a language. The child does not know, for instance, that a dental or alveolar stop occurs in almost all the languages of Maddieson's survey (316; p. 32) while a uvular stop occurs in only 47 of the languages of the survey. As the child has input only from the language(s) to which s/he is exposed, no direct source is available to inform her/him that uvulars imply dentals/alveolars. Similarly, a child acquiring a language with only voiceless stops may not be aware of the existence of voiced stops; even the occurrence of both voiced and voiceless stops in a language is not in itself an indication of which is the marked pole. In short, implication cannot be determined on the basis of an individual grammar, but emerges from a theory of markedness. Frequency faces similar problems. Consider first cross-linguistic frequencies. No reason exists to believe that a child has access to these. Frequency within a language is of a different status. While it is reasonable to believe that a speaker has access to information about frequency, it must be asked whether that information is utilized linguistically. For instance, does an English speaker make use of the information that voiced alveopalatal fricatives are of low frequency and voiced alveolar fricatives of greater frequency in determining markedness relations? Further, the criteria for counting must be firmly established. Trask (1996), for instance, following Lass (1984: 132), states that the marked segment has lower text-frequency, while Battistella (1990: 48) claims that frequency must refer to frequency of contexts rather than text frequency. One could imagine counting either tokens or types. In addition, the masking of the unmarked discussed in section 3 affects frequency counts computed on surface representations. The notion of frequency thus is complex, and, I believe, not well-understood.
Despite the increase in work on frequency in recent years, further investigation, both linguistic and psycholinguistic, is required to determine whether frequency, and implication, are useful criteria in assessing markedness. My sense is that they are consequences, or emergent properties: high frequency might imply unmarked, but unmarked does not imply high frequency. Perhaps because of the difficulties in computing implication and frequency, two other criteria are often stated in discussions of markedness. Sometimes referred to as simplicity, these involve ease of articulation and saliency of perception; I return to these in section 6.2. One conclusion can be drawn from diagnostics of frequency and implication: their widespread appeal suggests that there is a universal basis to markedness. In the next section, I examine some of the models that are current in the literature that attempt to account for these various aspects of markedness. (Some references: Battistella 1990; articles in Broe and Pierrehumbert 2000; Greenberg 1966; Pierrehumbert 1994)
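The observation that implication and cross-linguistic frequency are properties of a survey, rather than of any single grammar, can be illustrated with a small sketch. The inventories below are invented for illustration and are not data from Maddieson (1984) or any other database; the function names `implies` and `frequency` are likewise hypothetical.

```python
# Invented toy survey: language label -> set of stops in its inventory.
# These are NOT real languages and NOT data from any database.
SURVEY = {
    "A": {"p", "t", "k"},
    "B": {"t", "k"},
    "C": {"p", "t", "k", "q"},
    "D": {"t", "k", "q"},
}


def implies(survey, marked, unmarked):
    """True if every inventory containing `marked` also contains `unmarked`."""
    return all(unmarked in inv for inv in survey.values() if marked in inv)


def frequency(survey, segment):
    """Number of inventories in the survey that contain `segment`."""
    return sum(1 for inv in survey.values() if segment in inv)


print(implies(SURVEY, "q", "t"))  # does a uvular stop imply a coronal stop?
print(implies(SURVEY, "t", "q"))  # the converse
print(frequency(SURVEY, "t"), frequency(SURVEY, "q"))
```

Both functions take the whole survey as their argument. Given only one inventory, say that of the toy language A, neither the implication nor the frequency ranking could be computed, which is the sense in which these diagnostics are unavailable to a learner exposed to a single language.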
6. Models
In previous sections I examined diagnostics for markedness and pointed out how they converge on which aspects of markedness are universal and which are subject to variation. An adequate model of markedness must be expected to account for the full range of facts: positions, the emergence of the unmarked, the submergence of the unmarked, variation, the role of contrast. In this section, I focus on variation in the emergence and submergence of the unmarked. I survey three recent models, a phonetically-based model, a substantively-based model, and a structurally-based model.

6.1. Underspecification
Before discussing these models, a detour into underspecification is in order. At least since Archangeli (1984), there has been a tendency to conflate markedness with underspecification: unmarked values are underspecified, or absent, lexically while marked values are present. The prevalence of the identification of markedness with underspecification can be seen through an examination of recent textbooks: Kenstowicz (1994) and Roca (1994), for instance, link featural markedness inextricably with underspecification. Steriade (1995) argues against the conflation of markedness and underspecification. In the interest of space, I set aside the issue of specification altogether, but it is in reality an area that cannot be ignored in understanding phonological patterning and markedness.

6.2. Phonetic models: cue-based phonology and dispersion theory

One model that aims to account for markedness facts is a phonetic one based on the importance of articulation, audition, and phonetic cues. Recall from section 1 that markedness is sometimes defined by notions of phonetic simplicity, namely ease of articulation and salience of perception. A major focus of a phonetic model is to account for phonotactic patterns. This encompasses two of the topics addressed in section 4, positions and neutralization. In particular, the following question is asked: why is a given feature likely to be neutralized (lexically or actively) in some positions but not in others? For instance, Hamilton (1996), in his survey of Australian languages, asks why laminal coronals are more likely than apical coronals word-initially, while the reverse holds true word-finally.
Hamilton argues that the answer lies in perceptual cues: laminals require robust release cues for perception, and these are available preceding a vowel; apicals, on the other hand, require attack cues and these are available following a vowel. Steriade (1997) makes similar arguments regarding the distribution of laryngeal features, arguing that, for instance, word-final position does not present good cues to voicing and thus neutralization of laryngeal features is likely in this position. Steriade derives a hierarchy which ranks environments in terms of the perceptual cues that they provide for supporting laryngeal contrasts and shows that if a language allows a contrast in a perceptually more marked environment, it also allows one in the less marked environments. The features that are licensed in the most marked contexts can be considered to be unmarked in that context as they are implied by the presence of other features. Consider the other emergence of the unmarked environment, epenthesis. Steriade (1995: 139) offers some conjectures about the quality of epenthetic vowels and suggests that a schwa-like vowel is expected as epenthetic based on articulatory simplicity. She speculates that variation results from an attempt "to identify the schwa-sound with a vowel quality that is phonemically present in the language." Such a vowel will not vary in any significant way from the neutral position; it will be neither low nor round as these gestures involve a significant departure from neutral position. As we have seen (section 4), the facts are more complex, as central, front, and back vowels all occur as epenthetic. The definition of articulatory simplicity/complexity needs to be formalized to see if such a theory can make appropriate predictions. Consider now the submergence of the unmarked. Flemming (1995), Hamilton (1996), and Ni Chiosáin and Padgett (1997) argue that articulatory and auditory hierarchies are both required.
For vowel place, for instance, articulatorily [ɨ] is the least marked and [i, u] are more marked. Articulatory markedness is important in the emergence of the unmarked. Auditory markedness is defined through contrasts: the contrast between [i] and [u] is unmarked, and other contrasts (e.g., [i]-[ɨ]) are marked. Auditory markedness plays an important role in the submergence of the unmarked. Flemming (1995) and Ni Chiosáin and Padgett (1997, 2001) examine assimilation, proposing that it exists in order to maximize the duration of a feature type, with more salient perceptual cues taking precedence over less salient ones. Thus, one would expect that a central vowel, with weaker cues, would assimilate to a vowel with stronger cues. The possibility of assimilation of a front vowel to a back vowel or of a back vowel to a front vowel (section 4.3) presents problems if these vowels are of differing perceptual salience universally. The account requires that perceptual salience can vary from language to language; for instance, in Yawelmani the back vowel is more salient while in Chamorro the front vowel is. In order to account fully for assimilation, perceptual factors must always take precedence over articulatory ones; if articulatory factors take priority, assimilation to the unmarked, such as central vowels, is expected, contrary to fact to my knowledge. Finally, Flemming (1995) and Ni Chiosáin and Padgett (1997) examine inventory shapes, building on work by Lindblom (1986, 1990) on dispersion theory. Basically, Flemming argues that languages aim both to maximize the number of and the distinctiveness of contrasts. An antagonism exists: maximizing the number of contrasts reduces the distinctiveness of the contrasts; maximizing the distinctiveness of contrasts leads to a reduced number of contrasts. If a language has a two-way contrast in vowel place, the distance between the vowels will be maximized, giving a front unrounded vowel and a back rounded vowel; if a language has a single place, with no contrast, then central place will surface, with ease of articulation taking precedence. This provides an account of the odd frequency of central vowels: they are unmarked only if they are the only place at their height or if they are the third member of a contrast. Markedness does not reside in a single feature, as considerations of contrast enter in a critical way. There is a simplicity to this theory which does not always meet with phonetic facts. For instance, in a position of reduced contrasts (e.g., epenthetic position, neutralization position), one would expect articulatory concerns to triumph, all other things being equal, but we have seen that this is not necessarily true. Further, one would expect maximal differentiation of contrasts in phonetic terms. However, this is often not the case.
For instance, some languages have a three-way place contrast between i-ü-u rather than i-ɨ-u (e.g., Rukai; Austro-Tai 417); some have a two-way contrast between i-ɯ (Adzera, Austro-Tai 419; Japanese, Ural-Altaic 71) rather than the expected i-u; some have a single high vowel which is front rather than central (e.g., Navajo, Athapaskan; Klamath, Penutian). These facts call into question the claim that articulatory simplicity governs in the absence of contrast and maximal distance in the presence of contrast. The empirical strength of the phonetic theories lies in their ability to account for positional neutralization effects. The model opens new empirical ground with respect to markedness theory, and it raises the question of which diagnostics are appropriate. In particular, given the variation in both the emergence and submergence of the unmarked, is it necessary to abandon phonological processes as a testing ground for markedness and appeal strictly to frequency and implication? Specifically, can an appeal to both articulatory and auditory scales account for the asymmetries and equipollencies in patterning and for the cross-linguistic variation that are characteristic of phonology?
(Some references: Flemming 1995; Hamilton 1996; Hume 1998; Hume and Johnson 2001; Ni Chiosáin and Padgett 1997, 2001; Steriade 1997, 2001)
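The dispersion reasoning summarized in this section (maximal distance when there is a contrast, ease of articulation when there is none) can be sketched as a small search. This is an illustration only, not Flemming's or Lindblom's actual model: the numeric positions on a front-to-back scale, the neutral point, and the function names are all assumptions made for the example.

```python
from itertools import combinations

# Hypothetical positions of high vowels on an abstract front-to-back scale.
# The numbers are illustrative assumptions, not measurements.
POSITION = {"i": 0.0, "y": 1.0, "ɨ": 2.0, "ɯ": 3.0, "u": 4.0}
NEUTRAL = 2.0  # assumed position of least articulatory effort (central)


def min_distance(vowels):
    """Smallest pairwise distance in a system: a proxy for distinctiveness."""
    return min(abs(POSITION[a] - POSITION[b]) for a, b in combinations(vowels, 2))


def effort(vowels):
    """Total departure from the neutral position: a proxy for effort."""
    return sum(abs(POSITION[v] - NEUTRAL) for v in vowels)


def best_system(n):
    """Pick n vowels, maximizing distinctiveness and then minimizing effort."""
    candidates = combinations(POSITION, n)
    if n == 1:
        # No contrast to maintain, so only effort matters.
        return min(candidates, key=effort)
    return max(candidates, key=lambda s: (min_distance(s), -effort(s)))


for n in (1, 2, 3):
    print(n, best_system(n))
```

Under these assumptions the sketch selects a central vowel for a one-place system, a front and back pair for a two-place system, and front, central, and back for a three-place system. Systems such as i-ü-u or i-ɯ, cited in the text, are exactly the cases that such a simple search does not select.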
6.3. Harmonic scales: featural markedness in Optimality Theory

While universal scales have long been used in phonology (e.g., sonority scale), they have recently been used by phonologists working in Optimality Theory in order to capture markedness relations. For instance, Prince and Smolensky (1993) introduce scales to accommodate facts of featural markedness. Following the literature on coronal unmarkedness, they propose the dominance hierarchy in (17).

(17) *labial, *dorsal » *coronal

This is to be read as follows: a labial or a dorsal is more marked than a coronal. Coronal unmarkedness follows: an epenthetic segment, for instance, is most likely to be coronal as that is the least marked place of articulation. Beckman (1997) proposes the fixed hierarchy for vowel height in (18).

(18) *mid » *high, *low

This scale states that low and high vowels are unmarked with respect to mid vowels. Because most attention has been given to consonantal place, I use this as an example in the following discussion. Lombardi (1997) provides the most elaborated version of the place hierarchy, in (19).

(19) *glottal » *labial, *dorsal » *coronal » *pharyngeal
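One way to read fixed scales such as (17) and (19) is as ordered lists of strata. The sketch below is a simplification offered for illustration: it assumes a single undifferentiated faithfulness constraint that can be placed at any point in the fixed scale, and it treats a place as permitted in outputs exactly when faithfulness outranks the markedness constraint against it. The names `least_marked` and `permitted_places` are hypothetical.

```python
# Strata are listed from highest-ranked to lowest-ranked, transcribing the
# scales in (17) and (19). Constraints in one set are mutually unranked.
SCALE_17 = [{"labial", "dorsal"}, {"coronal"}]
SCALE_19 = [{"glottal"}, {"labial", "dorsal"}, {"coronal"}, {"pharyngeal"}]


def least_marked(scale):
    """Places penalized only by the lowest-ranked markedness constraints."""
    return scale[-1]


def permitted_places(scale, faith_index):
    """Places dominated by faithfulness when it sits just above stratum
    `faith_index` (0 = above every stratum, len(scale) = below all of them).
    A single faithfulness constraint is an assumption made for brevity."""
    return set().union(*scale[faith_index:])


print(sorted(least_marked(SCALE_17)), sorted(least_marked(SCALE_19)))
for index in range(len(SCALE_17) + 1):
    print(index, sorted(permitted_places(SCALE_17, index)))
```

Because the order of the strata never changes, any placement of faithfulness that permits labials in this sketch also permits coronals, while the reverse does not hold.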
Lombardi observes that laryngeal segments often pattern as unmarked with respect to epenthesis. She argues that the hierarchy proposed by Prince and Smolensky (17), and the proposals that assert coronal unmarkedness in general, cannot account for this fact. Instead, she proposes that pharyngeal is the least marked place. Consider the emergence of the unmarked. Lombardi's scale has pharyngeals, including laryngeals, as unmarked, and thus the predicted place of articulation in consonantal epenthesis. She recognizes that coronals can also emerge as unmarked in epenthesis, and proposes that this is a consequence of morphological factors, resulting from the high ranking of a constraint ruling out the glottal feature. Some account of the cross-linguistic variability is thus provided; however, velar unmarkedness remains unexplained. Further, in the absence of morphological factors, epenthesis is often difficult to justify, weakening the claim that coronal epenthesis demands special circumstances. In neutralization cases, another instance of emergence of the unmarked, one would expect laryngeals to emerge as the unmarked consonantal place. This is found: neutralization to laryngeals occurs in many languages (e.g., Slave, Athapaskan; Rice 1989). However, as discussed in section 4.3, neutralization to coronal and velar places of articulation also occurs. The facts of the emergence of the unmarked thus require that the universal hierarchies be modified indirectly. Because they are universally fixed, this modification must come through the reranking of constraints around them. This analysis forces a single feature to be unmarked in its class, and accounts for alternative unmarked phonological patterning with distinct mechanisms. Now consider the submergence of the unmarked. The constraint hierarchy approach appeals to a different set of constraints to deal with this phenomenon, constraints requiring preservation of lexically specified features. Recall that in the submergence of the unmarked, marked features take precedence over unmarked features. As Kiparsky (1994) and Smolensky (1993) note, this creates a dilemma for the theory. Constraints on faithfulness to marked features must be higher ranked than those requiring faithfulness to unmarked features in order to account for the submergence of the unmarked, the opposite of the ranking needed for the emergence of the unmarked, again demanding an appeal to different mechanisms; see Causley (1999) for discussion. Consider next inventories. Recall that if a marked member of a feature class occurs in an inventory, the unmarked member does also.
Prince and Smolensky (1993) argue that this is a consequence of the fixed markedness scale: a ranking that motivates the violation of a higher-ranked markedness constraint also allows the violation of a lower-ranked markedness constraint. Thus if an output bears a marked feature, faithfulness must be ranked above the markedness constraint prohibiting that feature. If faithfulness is ranked above a constraint prohibiting a marked feature (e.g., *labial), it is necessarily ranked above a constraint prohibiting a less marked feature (e.g., *coronal) since the ranking of the markedness constraints is fixed. Since faithfulness dominates the constraint against the unmarked feature, the unmarked feature must be permitted in the output. See Prince and Smolensky (1993) for discussion and Causley (1999) for a critique. I note parenthetically that no meta-ordering of faithfulness constraints has been proposed. Thus, one might rank faithfulness to [labial] above faithfulness to [coronal], or vice versa. The prediction is that two classes of languages exist, one in which labial consonants are preserved over coronals in processes like assimilation, and one with the opposite effect. To my knowledge, the second class is not found. Further note that velar consonants are not present in the hierarchy in (19), an empirical problem. A theory incorporating universal constraint hierarchies makes strong predictions about markedness relations in the emergence of the unmarked. These predictions appear to be too strong, as greater variability exists in terms of what can be unmarked than is allowed by the fixed scales; however, too much variation would result from free ranking. The constraint ranking approach requires that careful consideration be paid to whether the types of variability identified in section 4 are related to featural markedness; if they are, the theory requires modification. The theory also relies on different devices to account for the differences in patterning of unmarked features in the emergence and submergence of the unmarked; again it must be investigated whether this is empirically adequate.

6.4. Structural accounts
The theories discussed so far deal with markedness as an issue of substance. Another approach to markedness is to treat markedness as primarily related to structure. Structural accounts of various sorts have long been available. In this section, I illustrate the structural theory arising from work done at the University of Toronto. In this theory, markedness correlates directly with structure, with more complex structures overall equating with greater markedness. The goals of this work are to provide an account of what universal grammar allows to be unmarked and what the universal and language-particular aspects of markedness are.

The general gist is as follows. In a study of language acquisition, Rice and Avery (1995) argue that two aspects of language demand an account. First, there is a great deal of cross-linguistic uniformity in the features that pattern phonologically as unmarked (global uniformity), and second, it is not always the same features that pattern phonologically as unmarked (local variability). Capturing both uniformity and variability is the goal of this theory.

The easiest way to illustrate this is through some specific examples; I draw on place of articulation in consonants (20) and vowels (21). The following representations are based on work by Avery and Rice (1989, 1994), Rice (1995), and Rice and Avery (1993).

(20) consonantal representations

     laryngeal   velar    coronal    labial       dorsal
     Root        Root     Root       Root         Root
                 |        |          |            |
                 Place    Place      Place        Place
                          |          |            |
                          Coronal    Peripheral   Peripheral
                                     |            |
                                     Labial       Dorsal

(21) vowel representations

     central   front     back         front rounded
     Place     Place     Place        Place
               |         |            /     \
               Coronal   Peripheral   Coronal Peripheral

Markedness is in large part a consequence of amount of structure: in general, the less structure, the less marked; the more structure, the more marked. It is this asymmetry in degree of structure that results in different patternings. However, languages can differ with respect to their evaluation of structure in several ways. First, a language (or position) may require that a segment have a head, or some structure; if this is the case, a laryngeal will be ruled out among the consonants and a central vowel among the vowels despite their low degree of structural complexity. Second, a language/position may require that Place be specified; if so, then velars as well as laryngeals will be absent from the consonants and coronals will show unmarked patterning. Thus, depending upon the constraints that hold, laryngeals, velars, and coronals can each pattern as unmarked among the consonants. Finally, in cases of features with equivalent structure (e.g., labials and dorsals in the consonants, front and back vowels in the vowels), a language-particular choice is made, and either can pattern as unmarked (within the peripheral class for the consonants). Thus, the theory makes available the representations and constraints on whether structure is required; the language/position makes demands on how much structure is minimally necessary.

The emergence of the unmarked, both epenthesis and neutralization, receives a structural account. Laryngeals are unmarked among the consonants if no structure is required; velars if Place is required; and coronals if Place is required to have a dependent. Labials and dorsals are not predicted as epenthetic segments because of their high degree of complexity (although within this class, either can function as unmarked because of their equivalent complexity). In vowels, in the absence of a constraint requiring that Place be specified, a central vowel patterns as unmarked; in the presence of this constraint, either the front or the back vowel shows this patterning, depending on the language. The various places that can result from neutralization follow in a similar way. As for the submergence of the
unmarked, the retention of marked features is predicted: faithfulness demands that structure be retained. Thus, more marked structure takes priority over less marked structure. Inventory patterns also emerge, as the theory incorporates a learning path, with the presence of more marked structure requiring the presence of less marked structure, modulo the minimal structure constraints discussed above. Little structure correlates with ease of articulation and more with perceptual salience in the phonetic model. Other sources of variation come from the manner in which different subpieces of a representation (e.g., place, manner) combine and from enhancement.

In this theory, variation has a source in structural properties. Languages may make different demands on whether structure is required, and, if some is, just how much; in cases of equipollent structure, languages make a choice as to which is marked. Implication and frequency are consequences of the way in which structure can be elaborated.

(Some references. Structural complexity: Causley 1999; Harris 1990; van der Hulst 1996; Rice 1992, 1996; Rice and Avery 1993, 1995; Rice and Causley 1998; Trubetzkoy 1939/1969. University of Toronto work: Avery 1996; Avery and Rice 1989; Causley 1997, 1999; Dyck 1995; Rice 1992, 1993, 1996, 1996, forthcoming a, b; Rice and Avery 1993, 1995; Rice and Causley 1998; Zhang 1996)

7. Final remarks
Markedness is something about which linguists come to have strong intuitions. In many areas there is agreement: Something called markedness exists. It is multidimensional, with several factors involved at various levels (e.g., featural, combinatorial, positional). Variation exists in what can pattern as unmarked, although it is not without limit. However, many questions remain. What is the substantive set of features? Much of markedness relies on the substance of these features, yet there is not agreement on this question. What is the status of the diagnostics? Are implication/frequency and phonological patterning valid? How is the variation in markedness revealed by the phonological tests to be accommodated? What is accessible to the learner? Are implication and frequency available? Is markedness reducible to phonetics? Is it based in substance? Is markedness in phonology a specific consequence of a more general linguistic, or cognitive, facility? The issues surrounding markedness do not appear to be ones that will find quick solutions, and the area promises to continue to be one of lively debate for some time to come.
Acknowledgments

Thank you to Peter Avery and, especially, to Bill Idsardi for feedback on earlier drafts, and to the phonology research group at the University of Toronto for discussion. Pieces of this work have been presented at the North American Phonology Conference 1, held at Concordia University in 2000, and in a symposium at the University of California, Berkeley in October 2000. Many thanks for the helpful feedback from those audiences. In addition, I thank the members of my 2001 phonology seminar, Susana Bejar, Do-Hee Jung, Sara Mackenzie, Jack Panster, Erik Jan van der Torre, and Robert Wiehe, for their contributions to this work. This work was partially funded by Social Science and Humanities Council of Canada research grant #410-96-0842 to B. Elan Dresher and Keren Rice and by research grant #410-99-1309 to Keren Rice and B. Elan Dresher.
The Featural Markedness Bibliography
Abaglo, Poovi & Diana Archangeli (1989). Language particular underspecification: Gengbe /e/ and Yoruba /i/. Linguistic Inquiry 20, 457-480.
Alderete, John (1997). Dissimilation as local conjunction. In Proceedings of the North Eastern Linguistic Society 27, 17-31. Amherst, Massachusetts: GLSA.
Alderete, John, Jill Beckman, Laura Benua, Amalia Gnanadesikan, John McCarthy, & Suzanne Urbanczyk (1999). Reduplication with fixed segmentism. Linguistic Inquiry 30, 327-364. ROA-226-1097.
Archangeli, Diana (1984). Underspecification in Yawelmani phonology and morphology. Doctoral dissertation, MIT. (New York: Garland, 1988).
Archangeli, Diana (1988). Aspects of underspecification theory. Phonology 5, 183-208.
Archangeli, Diana & Douglas Pulleyblank (1989). Yoruba vowel harmony. Linguistic Inquiry 20, 173-217.
Archangeli, Diana & Douglas Pulleyblank (1994). Grounded phonology. Cambridge, Massachusetts: MIT Press.
Austin, Peter (1981). A grammar of Diyari, South Australia. Cambridge: Cambridge University Press.
Avery, Peter & Keren Rice (1989). Segment structure and coronal underspecification. Phonology 6, 179-200.
Avery, Peter (1996). The representation of voicing contrasts. Doctoral dissertation, University of Toronto.
Bakovic, Eric (2000). Harmony, dominance and control. Doctoral dissertation, Rutgers University.
Battistella, Edwin (1990). Markedness: the evaluative superstructure of language. Albany: SUNY Press.
Beckman, Jill (1997). Positional faithfulness, positional neutralisation and Shona vowel harmony. Phonology 14, 1-46.
Beckman, Jill (1998). Positional faithfulness. Doctoral dissertation, University of Massachusetts, Amherst.
Boersma, Paul (1998). Functional Phonology: Formalizing the interactions between articulatory and perceptual drives. The Hague: Holland Academic Graphics.
Boysson-Bardies, Bénédicte de & Marilyn Vihman (1991). Adaptation to language: evidence from babbling and first words in four languages. Language 67, 297-319.
Bright, William (1975). The Dravidian enunciative vowel. Dravidian phonological systems, edited by H. Schiffman and C. Eastman. Institute for Comparative and Foreign Area Studies, University of Washington Press.
Broe, Michael B. & Janet B. Pierrehumbert (2000). Introduction. In M.B. Broe and J.B. Pierrehumbert (eds.), Papers in Laboratory Phonology V. Acquisition and the lexicon, 1-8. Cambridge: Cambridge University Press.
Broe, Michael B. & Janet B. Pierrehumbert (2000). Papers in Laboratory Phonology V. Acquisition and the lexicon. Cambridge: Cambridge University Press.
Broselow, Ellen (1984). Default consonants in Amharic morphology. MIT Working Papers in Linguistics 7, 15-31.
Cairns, Charles (1969). Markedness, neutralization, and universal redundancy rules. Language 45, 863-886.
Cairns, Charles (1988). Phonotactics, markedness, and lexical representations. Phonology 5, 209-236.
Calabrese, Andrea (1995). A constraint-based theory of phonological markedness and simplification procedures. Linguistic Inquiry 26, 373-463.
Causley, Trisha (1996). Markedness and underspecification in Optimality Theory. Toronto Working Papers in Linguistics 15.
Causley, Trisha (1997). Fixed hierarchies and markedness variability. Paper presented at the meeting of the North Eastern Linguistic Society 28.
Causley, Trisha (1999). Complexity and markedness in Optimality Theory. Doctoral dissertation, University of Toronto.
Cho, Young-mee Yu (1988). Korean assimilation. Proceedings of the West Coast Conference on Formal Linguistics 7, 41-52.
Cho, Young-mee Yu (1990). The parameters of consonantal assimilation. Doctoral dissertation, Stanford University.
Christdas, P. (1988). The phonology and morphology of Tamil. Doctoral dissertation, Cornell University.
Clements, G.N. (1985). The geometry of phonological features. Phonology Yearbook 2, 225-252.
Clements, G.N. (1989). On the representation of vowel height. Manuscript, Cornell University.
Clements, G.N. & Elizabeth V. Hume (1995). The internal organization of speech sounds. The handbook of phonological theory, edited by J. Goldsmith, 245-306. Oxford: Basil Blackwell.
Crowley, Terry (1983). Uradhi. Handbook of Australian languages, vol. III, edited by R.M.W. Dixon & Barry Blake, 307-428. Amsterdam: John Benjamins.
Dikken, Marcel den & Harry van der Hulst (1988). Segmental hierarchitecture. Features, segmental structure and harmony processes, vol. 1, edited by Harry van der Hulst & Norval Smith, 1-78. Dordrecht: Foris.
Dresher, B. Elan (1998a). Child phonology, learnability, and phonological theory. In Handbook of Language Acquisition, edited by Tej Bhatia and William C. Ritchie, 299-346. New York: Academic Press.
Dresher, B. Elan (1998b). On contrast and redundancy. Paper presented at the annual meeting of the Canadian Linguistic Association, Ottawa. Ms., University of Toronto.
Dresher, B. Elan (2001). Contrast and asymmetries in inventories. Paper presented at the Asymmetry Conference, Université du Québec à Montréal, May 2001.
Dresher, B. Elan & Xi Zhang (2000). Contrast in Manchu vowel systems. Paper presented at the International Conference on Manchu-Tungus Studies (ICMTS 1), University of Bonn, August 28-September 1, 2000.
Dresher, B. Elan & Harry van der Hulst (1999). Head-dependent asymmetries in phonology: complexity and visibility. Phonology 15, 317-352.
Dyck, Carrie (1995). Constraining the phonetics-phonology interface. Doctoral dissertation, University of Toronto.
Flemming, Edward S. (1995). Auditory representations in phonology. Doctoral dissertation, UCLA.
Frisch, Stefan (1997). Similarity and frequency in phonology. Doctoral dissertation, Northwestern University.
Fukazawa, Haruka (1999). Theoretical implications of OCP effects on features in Optimality Theory. Doctoral dissertation, University of Maryland at College Park.
Gafos, Adamantios (1996). The articulatory basis of locality in phonology. Doctoral dissertation, Johns Hopkins University. (New York: Garland, 1999).
Gafos, Adamantios & Linda Lombardi (1999). Consonant transparency and vowel echo. Proceedings of the North Eastern Linguistics Society 29.
Gnanadesikan, Amalia (1995). Markedness and faithfulness constraints in child phonology. ROA-67-0000.
Gnanadesikan, Amalia (1997). Phonology with ternary scales. Doctoral dissertation, University of Massachusetts, Amherst.
Goad, Heather (1992). On the configuration of height features. Doctoral dissertation, University of Southern California.
Goddard, Ives (1974). An outline of the historical phonology of Arapaho and Atsina. IJAL 40, 102-116.
Greenberg, Joseph (1966). Language universals. The Hague: Mouton.
Groves, T., R. Groves, & R. Jacobs (1985). Kiribatese: An outline description. Pacific Linguistics Series D, No. 64. Canberra: Australia National University.
Hale, Mark & Charles Reiss (1998). Formal and empirical arguments concerning phonological acquisition. Linguistic Inquiry 29, 656-683.
Hale, Mark & Charles Reiss (2000). Substance abuse and dysfunctionalism: current trends in phonology. Linguistic Inquiry 31, 157-169.
Halle, Morris (1957). In defense of the number two. Studies presented to Joshua Whatmough on his sixtieth birthday, edited by E. Pulgram, 65-72. The Hague: Mouton.
Hamid, A-H. (1984). A descriptive analysis of Sudanese colloquial Arabic. Doctoral dissertation, University of Illinois, Urbana.
Hamilton, Philip (1996). Phonetic constraints and markedness in the phonotactics of Australian Aboriginal languages. Doctoral dissertation, University of Toronto.
Hankamer, Jorge & Judith Aissen (1974). The sonority hierarchy. Chicago Linguistic Society 10, Parasession on Natural Phonology, 131-145. Chicago: Chicago Linguistic Society.
Hargus, Sharon & Siri Tuttle (1997). Augmentation as affixation in Athabaskan languages. Phonology 14, 177-220.
Harris, John (1990). Segmental complexity and phonological government. Phonology 7, 255-300.
Harris, John (1994). English sound structure. Oxford: Blackwell.
Harris, John & Geoffrey Lindsey (1995). The elements of phonological representation. Frontiers of phonology: atoms, structures, derivations, edited by Jacques Durand & Francis Katamba, 34-79. Harlow, Essex: Longman.
Hualde, José Ignacio (1991a). Basque phonology. London, New York: Routledge.
Hualde, José Ignacio (1991b). Unspecified and unmarked vowels. Linguistic Inquiry 22, 205-209.
Hulst, Harry van der (1996). Radical CV phonology: The segment-syllable connection. Current trends in phonology: Models and methods, edited by Jacques Durand & Bernard Laks, 333-361. Salford: ESRI.
Hume, Elizabeth V. (1992). Front vowels, coronal consonants, and their interaction in non-linear phonology. Doctoral dissertation, Cornell University. (New York: Garland, 1994).
Hume, Elizabeth V. (1998). The role of perceptibility in consonant/consonant metathesis. Proceedings of the West Coast Conference on Formal Linguistics 17, 293-307. Stanford: CSLI.
Hume, Elizabeth & Keith Johnson (editors) (2001). The Role of Speech Perception in Phonology. New York: Academic Press.
Hyman, Larry (1973). The feature [grave] in phonological theory. Journal of Phonetics 1, 329-337.
Hyman, Larry (1985). A theory of phonological weight. Dordrecht: Foris.
Hyman, Larry (1999). Privative tone in Bantu. Handout, Workshop on Tone, Konstanz.
Ito, Junko (1986). Syllable theory in prosodic phonology. Doctoral dissertation, University of Massachusetts, Amherst.
Ito, Junko, Ralf-Armin Mester, & Jaye Padgett (1995). Licensing and underspecification in Optimality Theory. Linguistic Inquiry 26, 571-615.
Iverson, Gregory & K.-H. Kim (1987). Underspecification and hierarchical feature representation in Korean consonantal phonology. Proceedings of the Chicago Linguistic Society 23, Parasession on Autosegmental and Metrical Phonology, 182-198. Chicago: Chicago Linguistic Society.
Jakobson, Roman (1941/1968). Child language, aphasia and phonological universals. The Hague: Mouton.
Jun, Jongho (1995). Perceptual and articulatory factors in place assimilation: an Optimality Theoretic approach. Doctoral dissertation, UCLA.
Kari, James (1990). Ahtna Athabaskan dictionary. Fairbanks, Alaska: Alaska Native Language Center.
Kawasaki, H. (1982). An acoustical basis for the universal constraints on sound sequences. Doctoral dissertation, University of California, Berkeley.
Kaye, Jonathan, Jean Lowenstamm, & Jean-Roger Vergnaud (1985). The internal structure of phonological elements: a theory of charm and government. Phonology Yearbook 2, 305-328.
Kean, Mary Louise (1975). The theory of markedness in generative grammar. Doctoral dissertation, MIT.
Kenstowicz, Michael (1989). Comments on 'The structure of (complex) consonants' by H. van der Hulst and N. Smith. Paper presented at the MIT Conference on Feature and Underspecification Theories, Cambridge, Massachusetts.
Kenstowicz, Michael (1994). Phonology in generative grammar. Oxford: Blackwell.
Kiparsky, Paul (1985). Some consequences of lexical phonology. Phonology Yearbook 2, 85-138.
Kiparsky, Paul (1994). Remarks on markedness. Talk given at TREND 2, Stanford University.
Lindblom, Björn (1986). Phonetic universals in vowel systems. Experimental phonology, edited by John Ohala and Jeri Jaeger. Orlando: Academic Press.
Lindblom, Björn (1990a). Phonetic content in phonology. Perilus 11.
Lindblom, Björn (1990b). On the notion 'possible speech sound.' Journal of Phonetics 19, 135-152.
Lombardi, Linda (1991). Laryngeal features and laryngeal neutralization. Doctoral dissertation, University of Massachusetts, Amherst. (New York: Garland, 1994).
Lombardi, Linda (1997). Coronal epenthesis and markedness. Maryland Working Papers in Linguistics, Proceedings of the Hopkins Optimality Workshop/Maryland Mayfest, 156-175. ROA-245-0298.
Lombardi, Linda (1999). Positional faithfulness and voicing assimilation in Optimality Theory. Natural Language and Linguistic Theory 17, 267-302.
Mackridge, Peter A. (1985). The Modern Greek language: a descriptive analysis of standard Modern Greek. Oxford: Oxford University Press.
Maddieson, Ian (1984). Patterns of sounds. Cambridge: Cambridge University Press.
McCarthy, John (1988). Feature geometry and dependency: a review. Phonetica 43, 84-108.
McCarthy, John & Alan Prince (1994). The emergence of the unmarked: Optimality in Prosodic Morphology. Proceedings of the North Eastern Linguistic Society 24, 333-379. Amherst, Massachusetts: GLSA.
McCarthy, John & Alan Prince (1995). Faithfulness and reduplicative identity. Papers in Optimality Theory. University of Massachusetts Occasional Papers 18, edited by Jill N. Beckman, Laura Walsh Dickey, & Suzanne Urbanczyk, 249-384. Amherst: GLSA.
Mithun, Marianne & Hasan Basri (1986). The phonology of Selayarese. Oceanic Linguistics 25, 210-254.
Mohanan, K.P. (1991). On the bases of radical underspecification. Natural Language and Linguistic Theory 9, 285-325.
Mohanan, K.P. (1993). Fields of attraction in phonology. The last phonological rule: reflections on constraints and derivations, edited by J. Goldsmith, 61-116. Chicago: University of Chicago Press.
Morelli, Frida (1998). Markedness relations and implicational universals in the typology of onset obstruent clusters. ROA-251-0398.
Morgan, J. & Katherine Demuth (1996). Signal to syntax: Bootstrapping from speech to grammar in early acquisition. Mahwah, New Jersey: Lawrence Erlbaum Associates.
Ni Chiosáin, Maire & Jaye Padgett (1997). Markedness, segment realization, and locality in spreading. Report no. LRC-97-01, Linguistics Research Center, University of California at Santa Cruz.
Ni Chiosáin, Maire & Jaye Padgett (2001). Markedness, segment realization, and locality in spreading. In Linda Lombardi (ed.), Segmental phonology in Optimality Theory: Constraints and Representations. Cambridge: Cambridge University Press.
Noss, Richard B. (1964). Thai reference grammar. Washington, D.C.: Foreign Service Institute, Department of State.
Odden, David (1987). Dissimilation as deletion in Chukchi. Proceedings of the Eastern States Conference on Linguistics 3, 235-246. Columbus, Ohio: Ohio State University.
Onishi, Masayuki (1994). A grammar of Motuna (Bougainville, Papua New Guinea). Doctoral dissertation, Australia National University.
Padgett, Jaye (1994). Stricture and nasal place assimilation. Natural Language and Linguistic Theory 12, 465-513.
Padgett, Jaye (1995). Feature classes. Papers in Optimality Theory. University of Massachusetts Occasional Papers 18, edited by J.N. Beckman, L. Walsh Dickey, & S. Urbanczyk, 285-420. Amherst: GLSA.
Paradis, Carole & Jean-Francois Prunet (1989). On coronal transparency. Phonology 6, 317-348.
Paradis, Carole & Jean-Francois Prunet (1991). The special status of coronals: internal and external evidence. Phonetics and Phonology 2. San Diego: Academic Press.
Paradis, Carole (1992). Lexical phonology and morphology: the nominal classes in Fula. New York: Garland.
Payne, Doris (1981). The phonology and morphology of Axininca Campa. Summer Institute of Linguistics Publications in Linguistics No. 66, University of Texas at Arlington.
Pham, Hoa (1998). Coronal-velar relationship in Vietnamese dialects: a prosodic account. Asia Pacific Language Research 1. (http://asiapacific.webjump.com/).
Picard, Marc (1977). Vowel assimilation and morphophonemic rules. CUNY Linguistics conference on vowel harmony. Issues in vowel harmony, edited by R. Vago. New York: CUNY.
Pierrehumbert, Janet (1994). Syllable structure and word structure: a study of triconsonantal clusters in English. In P. Keating (ed.), Phonological structure and phonetic form. Papers in Laboratory Phonology III. Cambridge University Press.
Piggott, Glyne (1992). Variability in feature dependency: The case of nasality. Natural Language and Linguistic Theory 10, 33-78.
Piggott, Glyne (1999). At the right edge of words. The Linguistic Review 16, 143-185.
Prentice, D.J. (1971). The Murut language of Sabah. The Australian National University: Pacific Linguistics.
Prince, Alan & Paul Smolensky (1993). Optimality theory: constraint interaction in generative grammar. TR-2, Rutgers University Cognitive Science Center. To appear, MIT Press.
Pulleyblank, Douglas (1988). Vocalic underspecification in Yoruba. Linguistic Inquiry 19, 233-270.
Pulleyblank, Douglas (1998). Yoruba vowel patterns: deriving asymmetries by the tension between opposing constraints. ROA-270-0798.
Rice, Keren (1989). A grammar of Slave. Berlin: Mouton de Gruyter.
Rice, Keren (1992). On deriving sonority: a structural account of sonority relations. Phonology 9, 61-99.
Rice, Keren (1993). A reexamination of the feature [sonorant]: the status of 'sonorant obstruents.' Language 69, 308-344.
Rice, Keren (1994). Peripheral in consonants. Canadian Journal of Linguistics 39, 191-216.
Rice, Keren (1995). On vowel place features. Toronto Working Papers in Linguistics 14, 73-116.
Rice, Keren (1996). Default variability: the coronal-velar relationship. Natural Language and Linguistic Theory 14, 493-543.
Rice, Keren (forthcoming a). Vowel place contrasts. In M. Amberber & P. Collins (eds.), Language universals and variation. Ablex Publishing Co.
Rice, Keren (forthcoming b). Featural markedness. Cambridge: Cambridge University Press.
Rice, Keren & Peter Avery (1991). On the relationship between laterality and coronality. The special status of coronals: Internal and external evidence. Phonetics and phonology 2, edited by Carole Paradis & Jean-Francois Prunet, 101-124. San Diego: Academic Press.
Rice, Keren & Peter Avery (1993). Segmental complexity and the structure of inventories. Toronto Working Papers in Linguistics 12, 131-153.
Rice, Keren & Peter Avery (1995). Variability in a deterministic model of language acquisition: a theory of segmental elaboration. Phonological acquisition and phonological theory, edited by John Archibald, 23-42. Hillsdale, New Jersey: Lawrence Erlbaum.
Rice, Keren & Trisha Causley (1998). Asymmetries in featural markedness: place of articulation. Talk presented at GLOW, Tilburg University.
Roca, Iggy (1994). Generative phonology. London: Routledge.
Rose, Sharon (1993). Coronality and vocalic underspecification. Toronto Working Papers in Linguistics 12, 155-176.
Rose, Sharon (1996). Variable laryngeals and vowel lowering. Phonology 13, 73-118.
Sagey, Elizabeth (1986). The representation of features and relations in nonlinear phonology. Doctoral dissertation, MIT. (New York: Garland, 1991).
Schane, Sanford (1984). The fundamentals of Particle Theory. Phonology Yearbook 1, 129-155.
Smolensky, Paul (1993). Harmony, markedness, and phonological activity. Talk presented at Rutgers Optimality Workshop 1.
Smolensky, Paul (1996). On the comprehension/production dilemma in child language. Linguistic Inquiry 27, 720-731.
Sohn, H-S. (1987). On the representation of vowels and diphthongs and their merger in Korean. Proceedings of the Chicago Linguistic Society 23, 307-323.
Sohn, Ho-min (1994). Korean. London: Routledge.
Spencer, Andrew (1996). Phonology: theory and description. Oxford: Blackwell.
Steriade, Donca (1987). Redundant values. Proceedings of the Chicago Linguistic Society 23, 339-362.
Steriade, Donca (1995). Underspecification and markedness. The handbook of phonological theory, edited by John Goldsmith, 114-174. Oxford: Basil Blackwell.
Steriade, Donca (1997). Phonetics in phonology: the case of laryngeal neutralization. Manuscript, UCLA.
Steriade, Donca (2001). The phonology of perceptibility effects: the P-map and its consequences for constraint organization.
Thompson, Laurence C. (1965). A Vietnamese grammar. Seattle: University of Washington Press.
Topping, Donald (1968). Chamorro vowel harmony. Oceanic Linguistics 7, 67-79.
Trask, R.L. (1996). A dictionary of phonetics and phonology. London, New York: Routledge.
Trigo, Loren (1988). On the phonological derivation and behavior of nasal glides. Doctoral dissertation, MIT.
Trubetzkoy, Nikolay S. (1939/1969). Principles of phonology. Translated by C.A.M. Baltaxe. University of California Press.
Tserdanelis, Georgis & Elizabeth Hume (2000). Nasal assimilation in Sri Lankan Portuguese Creole: Implications for markedness. Talk presented at the MOT Workshop on Phonology, February 2000.
Vance, Timothy (1987). An introduction to Japanese phonology. Albany: SUNY Press.
Walker, Rachel (1993). A vowel feature hierarchy for contrastive specification. Toronto Working Papers in Linguistics 12, 179-197.
Walker, Rachel (1998). Nasalization, neutral segments, and opacity effects. Doctoral dissertation, University of California at Santa Cruz.
Walker, Rachel (forthcoming). Positional markedness in vowel harmony. To appear in Proceedings of HILP 5.
Weijer, Jeroen M. van de (1994). Segmental structure and complex segments. Doctoral dissertation, University of Leiden.
Yip, Moira (1991). Coronals, consonant clusters, and the coda condition. The special status of coronals: Internal and external evidence. Phonetics and Phonology 2, edited by C. Paradis & J-F Prunet, 61-78. San Diego: Academic Press.
Zhang, Xi (1996). Vowel systems of the Manchu-Tungus languages of China. Doctoral dissertation, University of Toronto.
Zhou Hong (1999). Markedness in Mandarin vowel inventories. Doctoral dissertation, University of Toronto.
Schwa in phonological theory
Marc van Oostendorp

1. Introduction
If a language has schwa in its vowel inventory, this segment usually has a special role to play in its phonology: the vowel can only occur in a simple type of syllable; or it is invisible for the stress system; or it is epenthetic; or it is the result of reduction; etc. Linguistic theory has to explain this special behaviour of schwa: why is it exactly this segment which behaves in exactly this way in so many languages? A lot of subtheories of phonology are at stake — syllable structure theory, stress theory, the theory of (sub)segmental representation, and of interaction between segments — and the questions surrounding schwa therefore will probably not be resolved until the perfect theory of phonology has been discovered. On the other hand, schwa offers an excellent testcase for many theories of phonology. It therefore should come as no surprise that the vowel has been studied intensively within various branches of generative phonology. This article offers an overview of some of the ideas that have been proposed and some of the facts that have been discovered over the past thirty years. No analysis seems perfect in the sense that it can handle all the relevant facts; but together the different proposals give us a rather precise picture of the behaviour of schwa. It is my aim in this paper to present in a more or less systematic way those facts about schwa which I consider to be most characteristic. I also briefly present some of the more influential theories and ideas surrounding this vowel within the generative paradigm.
I find it useful to distinguish pretheoretically between three types of schwa:
- E-schwa; this is the type of schwa that alternates with zero. The E in the name of this vowel refers to epenthesis because this is one of the possible sources of the alternation. It is of course also possible that this alternation is caused by deletion of schwa in certain environments, and in some (monostratal) theories one may deny the role of vowel epenthesis or deletion altogether and claim that the occurrence or non-occurrence of schwa in a certain position is a matter of phonetic interpretation (Charette 1991). In any case, if an alternation is observed, this counts as a diagnostic for E-schwa.
- R-schwa; this is the name for a schwa that alternates with a full vowel. The R stands for reduction, but again, it is also possible that the alternation is the result of a 'fortition' of schwa; and in many cases, nonderivational accounts are also available.
- S-schwa; this is stable schwa, which is a rest category from a descriptive point of view: if there is no reason to call a schwa E-schwa or R-schwa, I call it S-schwa. S-schwa is usually present in the underlying structure, but this is not a distinctive property of the vowel: E-schwa can also be underlyingly present (when the alternation is the result of deletion), and so can R-schwa (when the alternation is the result of fortition).

I do not think the taxonomy just presented should be awarded any theoretical status. A schwa may change its behaviour diachronically, and indeed we also find cases where a vowel is E-schwa and R-schwa at the same time (as in the French paradigm appeler [aple ~ apəle] 'to call', j'appelle [ʒapɛl] 'I call'). The schwa in the stem counts as E-schwa because it is optionally realised as zero, and as R-schwa because it alternates with the full vowel [ɛ]. The classification is presented here because it helps to distinguish the various roles that schwa can play in the phonology of a language.
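The diagnostics behind this descriptive classification can be sketched as a small piece of bookkeeping. The following Python sketch is an illustration only and carries no theoretical claim; the function name `classify_schwa` and its two boolean parameters are assumptions introduced here, not part of any proposal discussed in this article. It returns a set of labels because, as noted above, one and the same schwa may count as E-schwa and R-schwa at once.

```python
from typing import FrozenSet


def classify_schwa(alternates_with_zero: bool,
                   alternates_with_full_vowel: bool) -> FrozenSet[str]:
    """Return the descriptive label(s) for a schwa.

    Hypothetical helper: both parameters are observed diagnostics,
    not claims about underlying representations.
    """
    labels = set()
    if alternates_with_zero:
        # Alternation with zero is the diagnostic for E-schwa,
        # whatever its source (epenthesis, deletion, interpretation).
        labels.add("E-schwa")
    if alternates_with_full_vowel:
        # Alternation with a full vowel is the diagnostic for R-schwa
        # (reduction or 'fortition').
        labels.add("R-schwa")
    if not labels:
        # Rest category: no reason to call it E-schwa or R-schwa.
        labels.add("S-schwa")
    return frozenset(labels)


# The stem schwa of French appeler alternates both with zero
# and with a full vowel, so it receives both labels.
print(sorted(classify_schwa(True, True)))    # ['E-schwa', 'R-schwa']
# A schwa showing neither alternation falls into the rest category.
print(sorted(classify_schwa(False, False)))  # ['S-schwa']
```

The set-valued result is a design choice that mirrors the overlap in the French case; a single-label return value would force an arbitrary choice between the two diagnostics.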
Before we can study the behaviour of schwa in natural language, it may be useful to go into the question of how the segment should be represented phonologically.

2. The representation of schwa
Schwa is special not only from the point of view of phonological theory. We find that it is a special vowel also if we look at it from an articulatory point of view: it could be described as a 'targetless vowel' for which no inherent articulatory target has been specified, or as a vowel which targets a neutral vocalic position, 'the mean tongue-tract variable position for all the full vowels' (cf. Browman and Goldstein 1992). Some care is necessary here, since schwa is sometimes very close to e.g. the vowel [œ]. This is the case in French, Norwegian and Dutch, for instance. On the other hand, sometimes we find a so-called 'high' schwa, usually transcribed as [ɨ]. The definition of schwa used here will be a rather loose one from the point of view of phonetics; an element is schwa if it is phonetically rather central, and if its phonological behaviour warrants attribution of this label.
From a standard autosegmental/metrical point of view, we have several options to represent the emptiness of schwa. We can divide these options roughly into two sets. On the one hand, we can award schwa the status of an empty prosodic position without (auto)segmental material (an empty mora, an empty nucleus, an empty X-slot, an empty V-slot in a CV-model). On the other hand, we can assign some minimal segmental specification to this vowel. In the feature geometric model of McCarthy (1988), for instance, major class features constitute the root node. Schwa can then be seen as a bare [-cons] root, or as a root node with an empty (vocalic) place node.
It is hard to find good empirical tests that could distinguish between these options. Many things depend more on the other theoretical choices we make vis à vis the structure of syllables and segments than on the specific structure assigned to schwa itself. In the remainder of this article, I will rather arbitrarily assume that schwa could be pictured phonologically as a bare root node, as in (1):

(1) • [-cons]
Schwa can be literally seen as the maximally unmarked vowel: it is marked for being a vowel ([-cons]) but for nothing else. I believe that, in the ideal case, as many phonological properties as possible should be made to follow from the interaction of this fact and general principles of phonology. Ideally, no linguistic rule or constraint should specifically refer to 'schwa', which is not a theoretical primitive. I will take the fact that many languages do not have a schwa in their inventory as an example. Within constraint-based phonology, we could analyse this fact as a consequence of a specific constraint against schwa:

(2) *SCHWA: Don't have schwa
Yet it seems more reasonable to see this as a specific consequence of something more general, for instance the set of constraints that have been called FILL within Optimality Theory (Prince and Smolensky 1993):

(3) FILL: Empty segments are disallowed.
In Van Oostendorp (2000) it is alternatively proposed that the behaviour of schwa can be understood in terms of a relation between segmental material and prosodic structure. If a segment has certain features, it should project to certain positions (say, the head of a syllable or a foot), and if a segment appears in the head position of a certain syllable or a foot, it should have certain (vocalic) features. This is regulated by the following general constraint scheme:

(4) Where F = a (vocalic) feature, P = a prosodic position
    PROJECT (F, P): If a vowel is [+F], it should appear in position P.
    PROJECT (P, F): If a vowel appears in position P, it should be [+F].
Van Oostendorp (2000) shows that particular settings for the arguments F and P can explain the behaviour of vowels in a number of languages. Schwa is a special case of this. Since schwa does not have any vocalic feature at all, it is restricted to a very small number of positions. In certain languages this number may be so limited that it is null.
No matter how we formulate the constraint against schwa, it can probably help to explain why the vowel is particularly sensitive to vowel harmony. In Turkish, for instance, vowel harmony is a process working obligatorily across the boundary between a stem and an affix, but being subject to lots of exceptions within roots. One way of dealing with this is assuming that the harmony process is subject to a requirement of strict cyclicity: underived domains cannot be affected. In that case, harmonic roots in Turkish are more or less as accidental as they are in other languages without vowel harmony. Yet roots with a 'high schwa' are 'exceptions to the exception'; there are no roots which contain a high schwa and a front vowel. Whatever it is that blocks harmony within a root does not apply to schwa-like vowels (Clements and Sezer 1982; Bennink in preparation).
Schwa is apparently absent also in the initial stages of first language acquisition, even in languages such as Dutch and German which do have schwa in the adult language (Geilmann 1995; Levelt 2001). In these initial stages, schwa is often replaced by an (unrounded) full vowel, such as [a, ε, i], or not pronounced at all.
3. E-schwa

One of the most prominent characteristics of schwa is that it alternates with zero. As a matter of fact the term svarabhakti vowel was initially meant to refer to the epenthetic vowel. Epenthesis, the insertion of non-underlying material, is of course one possible source of the schwa-zero alternation that defines E-schwa. Another source is deletion of schwa. If schwa is very simple, it is easy to understand why this vowel is particularly eligible for both epenthesis and deletion. For epenthesis, it is the most economical choice to make if we have to insert a vowel: no material needs to be epenthesized beyond the feature [-consonantal]. For deletion, it is simpler to delete only the specification [-consonantal] than to delete this feature alongside others (which is needed in case of the deletion of a full vowel). It is thus predicted that if a language (i) allows schwa on the surface (of a given subphonology) and (ii) has a vowel-zero alternation (within that same subphonology), schwa will be among the alternating vowels. I have no evidence that would falsify this prediction; I know of no study which tries to systematically falsify it either.
Assuming for the sake of simplicity that there can be no constraints on underlying forms (this is of course the standard assumption within Optimality Theory), we have evidence for deletion if there are certain phonologically well-defined contexts in which schwa does not surface. We have evidence for epenthesis, on the other hand, if there are phonologically well-defined contexts in which we always find a schwa. Interestingly, there are languages in which we find evidence both for schwa epenthesis and for schwa deletion: there are certain contexts in which schwa always occurs, there are other contexts in which schwa never occurs, and there may even be contexts in which it is unpredictable whether we find a schwa or not. In languages of this type, epenthesis and deletion often seem to be defined in complementary contexts.
These languages are of particular interest to constraint-based theories of phonology, because they often involve some form of 'rule conspiracy': epenthesis and deletion seem to conspire to attain some absolutely well-formed (syllable) structure. A well-known instance of such a language is French. The distribution of schwa in French has been a topic of debate in virtually every period in the history of (generative) phonology (Anderson 1982; Basbøll 1981, 1988; Charette 1991; Dell 1973/1985; Durand 1976, 1986, 1990; Morin 1982, 1988; Noske 1982, 1988, 1992; Van Oostendorp 2000; Schane 1965, 1968, 1974; Selkirk 1978, 1982; Tranel 1981, 1987, 1994).
In the following discussion I distinguish between 'static' and 'dynamic' pieces of evidence. 'Static' evidence for a constraint (or rule) is distributional: we never find schwa in a certain position, therefore there should be a constraint or a morpheme structure condition against schwa in that position (or a rule deleting it there); or we always find a schwa in a certain position, therefore there should be a constraint forcing it to be present (or a rule epenthesizing it). In order to be able to accept 'static' evidence for an output constraint, we have to assume some version of OT's 'Richness of the Base' (Prince and Smolensky 1993): there are no constraints on underlying representations, and everything can be input to the grammar. Something that is logically possible but never actually found should therefore be blocked from surfacing.
'Dynamic' evidence for a rule or a constraint is based on alternations. A schwa in form A corresponds to a zero in exactly the same environment in the morphologically related form B. I take this to mean that there is a constraint against zero in A, or against schwa in B (or that there is an epenthesis rule applying to A or a deletion rule applying to B).
The facts of French are rather complicated, and a lot of dialectal and probably even idiolectal variation is involved. In any case, the French facts are quite typical of the behaviour of E-schwa in natural language. Noske (1992) presents a useful classification of the facts of French E-schwa, which I will take as my guideline here.
- Prevocalic schwa deletion. The evidence for this type of process is that schwa cannot be followed by a (full) vowel. Intuitively, the reason for this is that schwa is not 'needed' for syllable structure if a full vowel follows it. This vowel can take the responsibility for the onset preceding schwa. Since leaving the schwa in place would actually create hiatus, deleting it is preferable from the point of view of syllabification. It is actually hard to find dynamic evidence for the rule in French, because we need a context where /Xə#V/ surfaces as [X#V], and schwa is deleted at the end of a word or phrase in many cases anyway. The alternation of the masculine singular determiner le is sometimes cited as evidence for the rule. This clitic surfaces as /lə/ before a consonant-initial stem (le camarade 'the comrade' [ləkamrad]) and as /l/ before a vowel-initial stem (l'ami 'the friend' [lami]).
The problem is, however, that a similar alternation applies to the feminine singular determiner la (la camarade vs. l'amie), but we have no reason to postulate a general rule of a-deletion before a full vowel. The other dynamic pieces of evidence usually given are equally dependent on other assumptions. For instance, it is often assumed that the feminine singular suffix is a schwa (autre 'other, FEM.' /otR/+/ə/ [otR]). The problem is that this schwa never surfaces at the end of a phrase, and in those environments in which it does surface (autre femme 'other woman' [otRəfam]), this may well be due to schwa epenthesis (alternation type D discussed below). Although the indirect phonological evidence, which is based on the process of liaison but which I cannot review here (cf. Encrevé 1988 and references cited there), is favourable to the hypothesis that the feminine suffix is schwa, there is no really strong phonetic evidence that the schwa is present at all, or that it is ever deleted specifically in the context before a vowel.
Fortunately, we have more convincing static evidence that the constraint against schwa in front of a vowel is operative in French; although schwa can be rather freely distributed over all positions in the French word, there are no words in which it immediately precedes a vowel (either a full vowel or another schwa). French shares this property with many other languages, including Dutch, English, German and Indonesian. In Dutch we actually have some pieces of dynamic evidence: if the schwa-final word (elite [elitə]) is followed by a vowel-initial suffix (-air [er]), the schwa gets deleted (elitair [eliter] 'snobbish'). Similarly, in Norwegian we see that word-final schwas, which surface before a consonant-initial word, get deleted before a vowel-initial word (Kristoffersen 2000:328-329):

(5) a. å klemme gutten [o.klem.mə.gʉt.tn] 'to hug the boy'
       toget stopper [toː.gə.stop.pr] 'the train stops'
(6) b. å klemme onkelen [o.klem.muŋ.kln] 'to hug the uncle'
       toget ankommer [toː.gaŋ.kom.mr] 'the train arrives'
This behaviour of schwa can be understood within Optimality Theory in the following way. As we have seen above, we need to posit a constraint against the occurrence of schwa. This constraint outranks most faithfulness constraints on schwa, but it is itself dominated by several well-formedness constraints on syllable structure. Schematically:

(7) Well-formedness » *Schwa » Faithfulness
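The effect of a ranking like (7) can be made concrete with a small computational sketch. The Python fragment below is only an illustration under simplifying assumptions that are not part of the analysis itself: well-formedness is reduced to two hypothetical checks (no word-initial consonant cluster, no schwa adjacent to another vowel), faithfulness is a single all-or-nothing mark, and the candidate set contains only the clitic with and without its schwa. All function names are invented for the example.

```python
VOWELS = set("aeiouə")

def well_formedness(form):
    """Toy syllable well-formedness (an assumption of this sketch): one mark
    for a word-initial consonant cluster, one for each schwa in hiatus."""
    marks = 0
    if len(form) > 1 and form[0] not in VOWELS and form[1] not in VOWELS:
        marks += 1
    for left, right in zip(form, form[1:]):
        if left in VOWELS and right in VOWELS and "ə" in (left, right):
            marks += 1
    return marks

def star_schwa(form):
    """*Schwa: one mark per surface schwa."""
    return form.count("ə")

def make_faithfulness(underlying):
    """Faithfulness: one mark if the output differs from the input."""
    def faithfulness(form):
        return 0 if form == underlying else 1
    return faithfulness

def evaluate(underlying, candidates):
    """Pick the candidate with the best violation profile under
    Well-formedness » *Schwa » Faithfulness (lexicographic comparison)."""
    ranking = [well_formedness, star_schwa, make_faithfulness(underlying)]
    return min(candidates, key=lambda c: tuple(con(c) for con in ranking))

def surface(underlying):
    """Hypothetical candidate set: the clitic l(ə) with and without schwa."""
    stem = underlying[2:] if underlying[1] == "ə" else underlying[1:]
    return evaluate(underlying, ["l" + stem, "lə" + stem])

for form in ["ləkamrad", "lkamrad", "ləami", "lami"]:
    print(form, "->", surface(form))
```

Under these assumptions the inputs with and without an underlying schwa converge on the same output in each context, so the output alone does not reveal whether deletion or epenthesis has applied.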
This ranking implies that schwa never surfaces unless it is required to do so by the syllable structure constraints. This in turn means that it is almost impossible to say whether we see deletion or epenthesis at work in a given case: schwa is deleted wherever it is not necessary because of *Schwa and inserted wherever it is necessary because of high-ranking Well-formedness. If an underlying schwa is both preceded and followed by a consonant, it needs to surface in order for the word to satisfy general constraints on syllable structure, in particular constraints against complex onsets (*[lkamrad] vs. [ləkamrad]). Yet if schwa is followed by a full vowel, there is no necessity for it to surface ([lami] is perfectly well-formed as far as syllable structure is concerned). This may be the reason behind the deletion of schwa.
A famous exception to the deletion of schwa before full vowels in French is formed by a class of words with a so-called h aspiré, a consonant position that was probably filled by a /h/ in earlier stages of the language but that is presently phonetically empty, although phonologically it still behaves as a full-fledged consonant:

(8) mets le dessus [melədsy] 'put it on!'
    retourne la [rturnla] 'return it!'
    mets le dehors [melədəor] 'put it outside!'
    rehausse la [rəosla] 'raise it again'

It is standardly assumed that this behaviour can be understood by assuming some more or less abstract consonant in this position. The schwa is therefore not adjacent to the full vowel in cases like these; hence, it is not subject to the deletion process (Dell 1973/1986).
If this analysis is on the right track, something very similar happens in Norwegian if schwa is placed in front of a sonorant. For instance, if we add a def.sg. suffix -ən to the noun sele [seː.lə] 'suspender', we derive [seː.ln] rather than *[seː.lən]. Again, this can be understood if we assume that schwa is not necessary in cases like these, since the sonorant is able to take up a syllabic position and support the syllable.
- Postvocalic schwa deletion. A similar line of reasoning may help to explain why schwa does not surface on the righthand side of a full vowel in a language like French (like many other languages with schwa). Also in this context, there is no necessity for schwa to surface. The syllable structure of [ami] is actually better than the structure of hypothetical [amiə], because the latter has one extra hiatus, or onsetless syllable. Again it is rather hard to find dynamic evidence for a deletion rule. The evidence that is cited by Noske (1992), for instance, involves once again the feminine suffix, which reputedly gets deleted after a full vowel: risée 'laughed at, FEM' /rise/+/ə/ → [rise], but as I have already pointed out above, word-final schwa gets deleted in many other cases as well. Fortunately, there is some static evidence that the relevant constraint is at work in French. We don't find any (monomorphemic) words in which a schwa immediately follows a full vowel. The question may now arise why not all vowels get deleted when they occur next to another vowel. Also these other vowels create hiatus.
I think that the reason for this difference between schwa and the other vowels is that deletion of schwa is less costly than deletion of other vowels, due to the fact that the schwa literally is a substructure of all the other vowels. If it is allowed to delete a full vowel in a certain context, this implies that it is allowed to delete a root node in that context; hence, also schwa will be deletable in that context.
- Schwa deletion in a 'two sided open syllable'. A schwa which occurs in an open syllable that is preceded by another open syllable may get deleted in French. In practice this means that an underlying schwa occurring in the context ...V$Cə$C(C)V... can disappear on the surface. In contradistinction to the two previous cases, this type of alternation is either optional or otherwise subject to stylistic conditions. The following relatively famous examples serve to illustrate this point.

(9) Henri devrait partir 'Henri would have to leave'
    [ãRidəvRεpaRtiR]
    [ãRidvRεpaRtiR]
    Jacques devrait partir 'Jacques would have to leave'
    [ʒakdəvRεpaRtiR]
    *[ʒakdvRεpaRtiR]

The relative wellformedness of the resulting syllables is much more questionable than in the previous cases. This might be the reason why 'two-sided open syllable deletion' is much less widespread than the previous two cases. It looks as if underlying schwas should be deleted in principle (because of whatever the constraint *Schwa says), but deletion is blocked if it would result in an ill-formed cluster of consonants, in particular one which cannot form a reasonable syllable structure. Certain additional stipulations are necessary, however, since [dvR] is not otherwise a wellformed onset cluster of French. Yet the syllabification that most speakers would assign to Henri devrait partir seems to be [ã.Ri.dvRε.paR.tiR].
This can be understood in several ways. It is possible for instance that there is still a 'trace' of schwa between [d] and [v] in the form of an empty vowel; this is the solution adopted in Charette (1991) modulo technical details. It is also possible that there is a distinction between two rounds of syllabification (e.g. lexical and postlexical); again modulo technical details, this seems the type of solution favoured by Noske (1992).
- Schwa epenthesis in the environment CC] _ [C. A schwa is (optionally) inserted in French between two words, if the first word ends in a cluster, and the second word starts with a consonant. Also in this case syllable structure seems the driving force (rather than a force blocking deletion); clusters of consonants are avoided also in this case:

(10) un contact pénible [œkotakt(ə)penibl] 'a painful contact'
     un index formidable [œedeks(ə)foRmidabl] 'a terrific index'
Maybe we can understand these examples in the following way: complex codas are dispreferred in French. A schwa is needed to support the final consonant in a cluster. However, schwa can never surface at the end of a phrase in French, as we will see below. It also cannot appear (is deleted) before a vowel. The only case in which we therefore see the supporting schwa appear on the surface is phrase-internal before a consonant. The fact that we have epenthesis in these cases is therefore readily understood; the fact that the epenthetic vowel is schwa also is readily understood. Schwa is the simplest vowel of all; therefore it is more economical to insert this vowel than it is to insert any other.
Charette (1991) observes that within compounds, stress also seems to play a role in the behaviour of schwa. A schwa has to surface if the first component ends in a consonant cluster and the second is monosyllabic; if the second is polysyllabic, the schwa cannot surface (according to Charette, a similar pattern is found in Québec French within phrases as well).

(11) portə-clefs 'keyring'
     gardə-fou 'railing'
     port∅-manteau 'coat rack'
     gard∅-manger 'meat safe'
Since stress is on the final syllable of the compound, it does not seem unreasonable to suppose that it plays a role in these examples. The question is of course why schwa would surface when it is immediately followed by a stressed syllable. Charette's answer to this is that French feet are iambic, and that the unstressed syllable in the foot should be a schwa. These assumptions are not uncontroversial, but as far as I know alternatives to Charette's proposal remain to be worked out.
- Schwa deletion in phrase-initial syllables. The existence of this property, mentioned by Noske (1992), seems subject to debate: not all speakers of French agree on the data (Caroline Féry p.c.). In any case, some speakers allow alternations such as the following:

(12) revenez (demain) [rəvəne] ~ [rvəne] 'come back tomorrow'
     te fais (pas de bile) [təfε] ~ [tfε] 'don't worry'
Syllable onsets [rv] or [tf] are not normally allowed in French. It is possible that the initial syllable of the word allows a more complex structure (as it does in Polish, cf. Rubach and Booij 1990). According to Charette (1991) this type of deletion is restricted to bisyllabic words: cheval 'horse' may be pronounced as [ʃfal], but chevalier 'knight' is [ʃəvalje]. In this case, it looks as if stressed syllables allow more complex onsets. On the other hand, the explanation given here seems to run counter to the explanation of porte-clefs vs. porte-manteau just given. In this case, French speakers do not try to build a 'perfect' iamb [ʃəval].
- Schwa deletion in phrase-final syllables. This final type of 'alternation' is one of the most important ones within the phonology of French: schwa never surfaces phrase-finally (except in Midi French, Durand 1986).

(13) je vois l'autre [ʒəvwalotR] 'I see the other'
     voilà mon oncle [vwalamonokl] 'there is my uncle'
     la terre est plate [latereplat] 'the earth is flat'
     la route est longue [laRutelɔ̃g] 'the road is long'
It is not immediately clear why schwa should be deleted in final position. One possibility, suggested in Van Oostendorp (2000), is to extend the scope of the constraint FINALC (proposed by Prince and Smolensky [1993] to hold at the level of the word) to the phrase. This constraint would then say that all phrases have to end in a consonant. Deletion of full vowels under this view would be blocked by the fact that this would involve deletion of too many features. Deletion of schwa, on the other hand, is less costly, and, therefore, allowed in this particular configuration. The solution of Charette (1991) is technically a little bit different. According to this scholar, a domain-final empty nucleus is exceptionally licensed to remain empty in French. In any case, it seems as if something special needs to be said about the end of the phrase in French. No special theory is needed, however, in order to describe why it is exactly schwa that behaves as special in this particular spot, and not some other vowel.
4. R-schwa

Schwa not only alternates with zero in languages of the world: it also often alternates with full vowels. There is an instance of this in French: schwa alternates with [ε] in the following contexts in this language (Dell 1973/1985):
- when schwa is the final phonetically expressed segment of a word (paqu[ε]t)
- when schwa phonetically occurs in a word-internal or word-final closed syllable
It seems reasonable to assume that underlying schwa turns to [ε] in certain environments: namely when it is stressed or when it is in a closed syllable. This is not surprising; schwa disfavours stressed positions or closed syllables also in other languages, as we will see below. This thus seems to be an instance of fortition rather than reduction. We may of course wonder why schwa turns to [ε] rather than to some other vowel; a good theory of segmental structure should provide us with an answer to this question.
Reduction to schwa is well-known from the study of Germanic languages like English or Dutch and, to a lesser extent, the Scandinavian languages and German. A very simple theory of vowel reduction in English is presented in Chomsky and Halle (1968). The reduction rule in this work basically amounts to (14):

(14) Unstressed vowels turn to schwa.
Syllable structure plays an important role in the analysis of E-schwa; in the case of R-schwa it is often stress that seems to trigger the alternation. Within Optimality Theory, there are two possible ways to formalize (14). In Alderete (1999) it is proposed that in principle all vowels turn to schwa. Schwa is the unmarked vowel and Alderete assumes that there is a constraint which requires all vowels in the word to be unmarked (I will call this constraint TURNSCHWA for the sake of simplicity, but it can of course be seen as an instance of the general constraint *STRUCTURE). On the other hand there are position-specific faithfulness constraints that specifically require stressed syllables to be faithful to the underlying structure. These position-specific constraints I will shorthand here as FAITHFULSTRESS. (Alderete calls them Head-Dependence and formalizes them in terms of Correspondence Theory, but this seems immaterial to the discussion at hand.) An analysis for the English word apron would now run along the following lines.

(15)
/apron/     | FAITHFULSTRESS | TURNSCHWA | FAITHFUL
ápron       |                | **!       |
☞ áprən     |                | *         | *
ə́prən       | *!             |           | **
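The evaluation behind a tableau such as (15) is mechanical enough to be spelled out in a few lines of code. The following Python sketch is an illustration only, under assumptions of my own: the word is reduced to its two vowels (the qualities "a" and "o" are placeholders), stress is taken to be initial, each vowel may either surface unchanged or reduce to schwa, and the three functions are rough stand-ins for FAITHFULSTRESS, TURNSCHWA and FAITHFUL as described in the text.

```python
from itertools import product

SCHWA = "ə"
STRESSED = 0  # index of the stressed vowel (initial stress assumed)

def faithful_stress(inp, out):
    """One mark if the stressed vowel differs from its input value."""
    return int(inp[STRESSED] != out[STRESSED])

def turn_schwa(inp, out):
    """One mark for every surface vowel that is not schwa."""
    return sum(v != SCHWA for v in out)

def faithful(inp, out):
    """One mark for every vowel that differs from the input."""
    return sum(i != o for i, o in zip(inp, out))

RANKING = [faithful_stress, turn_schwa, faithful]  # highest-ranked first

def candidates(inp):
    """Every vowel either surfaces unchanged or is reduced to schwa."""
    options = [(v,) if v == SCHWA else (v, SCHWA) for v in inp]
    return list(product(*options))

def profile(inp, out):
    """Violation marks of a candidate, ordered by the ranking."""
    return tuple(constraint(inp, out) for constraint in RANKING)

def evaluate(inp):
    """The winner has the lexicographically smallest violation profile."""
    return min(candidates(inp), key=lambda out: profile(inp, out))

APRON = ("a", "o")  # vowel skeleton only; hypothetical input
winner = evaluate(APRON)
for out in candidates(APRON):
    pointer = "->" if out == winner else "  "
    print(pointer, "".join(out), profile(APRON, out))
```

Comparing profiles lexicographically is what strict domination amounts to: a single mark on a higher-ranked constraint outweighs any number of marks further down.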
Alderete (1999) claims that FAITHFULSTRESS constraints can also be put to use in analyses of phenomena other than reduction. For instance, he shows that the fact that epenthetic vowels usually avoid stressed positions (even if they are not schwa) can be understood in terms of this constraint. It is worse to epenthesize into a stressed position and violate both FAITHFULSTRESS and FAITHFUL, than to epenthesize into an unstressed position, in which case only FAITHFUL will be violated.
Yet there are also problems connected to an approach based on positional faithfulness. We have already seen evidence for the fact that we need to posit a constraint against schwa (e.g. *SCHWA, (2)) within the universal inventory of constraints, if only because there are languages which do not have schwa. We thus have a ranking FAITHFULSTRESS » *SCHWA » FAITHFUL within our factorial typology. This however gives us a result which is probably absurd: a language in which underlying schwa can only surface in a stressed position:

(16)
/əprən/     | FAITHFULSTRESS | *SCHWA | FAITHFUL
ə́prən       |                | **!    |
☞ ə́pren     |                | *      | *
épren       | *!             |        | **
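The typological point can be checked by brute force. The sketch below is a hedged illustration, not a reconstruction of any published implementation: it assumes a two-syllable word with initial stress, a two-way contrast between schwa and a generic full vowel "V", and Richness of the Base (every combination is a possible input and a possible output). It then runs through all rankings of three constraints modelled on FAITHFULSTRESS, *SCHWA and FAITHFUL and reports those whose surface inventory contains schwa in the stressed position but never in the unstressed one.

```python
from itertools import permutations, product

SCHWA, FULL = "ə", "V"
STRESSED, UNSTRESSED = 0, 1  # two-syllable word with initial stress (assumed)

def faithful_stress(inp, out):
    """One mark if the stressed vowel has been changed."""
    return int(inp[STRESSED] != out[STRESSED])

def star_schwa(inp, out):
    """One mark per surface schwa."""
    return out.count(SCHWA)

def faithful(inp, out):
    """One mark per changed vowel."""
    return sum(i != o for i, o in zip(inp, out))

FORMS = list(product((SCHWA, FULL), repeat=2))  # inputs and candidates alike

def surface_forms(ranking):
    """Winning outputs for every possible input (Richness of the Base)."""
    outputs = set()
    for inp in FORMS:
        best = min(FORMS, key=lambda out: tuple(c(inp, out) for c in ranking))
        outputs.add(best)
    return outputs

def stressed_only_schwa(outputs):
    """True if schwa surfaces under stress but never in unstressed position."""
    has_stressed = any(o[STRESSED] == SCHWA for o in outputs)
    has_unstressed = any(o[UNSTRESSED] == SCHWA for o in outputs)
    return has_stressed and not has_unstressed

CONSTRAINTS = [faithful_stress, star_schwa, faithful]
odd_rankings = [
    " » ".join(c.__name__ for c in ranking)
    for ranking in permutations(CONSTRAINTS)
    if stressed_only_schwa(surface_forms(ranking))
]
print(odd_rankings)
```

Under these assumptions exactly one of the six rankings produces the suspicious pattern, namely the one with positional faithfulness on top and general faithfulness at the bottom.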
More in general, a FAITHFULSTRESS account implies that there are more contrasts possible in stressed positions than in unstressed positions. While this is true in general, it is not true in the case of schwa. The prediction that there are languages in which we can only have schwa in a stressed position does not seem to be borne out.
An alternative is to make TURNSCHWA rather than faithfulness into a positional well-formedness constraint, or a set of such constraints. We then get some form of projection constraints, as proposed in Van Oostendorp (2000):

(17) PROJECT-FOOT
     a. Unstressed vowels turn into schwa
     b. Schwa wants to be unstressed
These constraints fit into the general theory of Projection outlined in (4): (17b) states that a vowel in the head position of a foot needs to have some minimal feature structure; (17a) states that vowels with a certain feature structure desire to be in the head of a foot. If they are not, they turn into schwa. An analysis of English apron would run along the following lines.

(18)
/apron/     | PROJECT-FOOT (a) | PROJECT-FOOT (b) | FAITHFUL
ápron       | *!               |                  |
☞ áprən     |                  |                  | *
ə́prən       |                  | *!               | **
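How a system with projection constraints behaves typologically can again be explored by enumeration. The following Python sketch rests on the same simplifying assumptions as before, all of them mine: a two-syllable word with initial stress, schwa versus a generic full vowel "V", and every combination available as input and as candidate. The functions project_foot_a and project_foot_b are rough renderings of (17a) and (17b); *SCHWA and FAITHFUL are added so that all four constraints are freely reranked.

```python
from itertools import permutations, product

SCHWA, FULL = "ə", "V"
STRESSED, UNSTRESSED = 0, 1  # two-syllable word with initial stress (assumed)

def project_foot_a(inp, out):
    """(17a): one mark if the unstressed vowel is not schwa."""
    return int(out[UNSTRESSED] != SCHWA)

def project_foot_b(inp, out):
    """(17b): one mark if the stressed vowel is schwa."""
    return int(out[STRESSED] == SCHWA)

def star_schwa(inp, out):
    """One mark per surface schwa."""
    return out.count(SCHWA)

def faithful(inp, out):
    """One mark per changed vowel."""
    return sum(i != o for i, o in zip(inp, out))

CONSTRAINTS = [project_foot_a, project_foot_b, star_schwa, faithful]
FORMS = list(product((SCHWA, FULL), repeat=2))

def inventory(ranking):
    """Set of winning outputs over all inputs for one ranking."""
    return frozenset(
        min(FORMS, key=lambda out: tuple(c(inp, out) for c in ranking))
        for inp in FORMS
    )

def stressed_only_schwa(forms):
    """True if schwa surfaces under stress but never in unstressed position."""
    has_stressed = any(f[STRESSED] == SCHWA for f in forms)
    has_unstressed = any(f[UNSTRESSED] == SCHWA for f in forms)
    return has_stressed and not has_unstressed

typology = {inventory(r) for r in permutations(CONSTRAINTS)}
offending = [inv for inv in typology if stressed_only_schwa(inv)]
print(len(typology), "distinct inventories;", len(offending), "with stressed-only schwa")
```

In this toy setting the form with a stressed schwa and an unstressed full vowel can only win through faithfulness, and any ranking that lets it win also preserves unstressed schwa, so no inventory of the suspicious type arises.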
A language that only allows schwa in stressed positions is not part of the typology in this system: if we rank *SCHWA very highly, we get a language in which schwa does not appear at all; if we rank FAITHFUL very highly, we get a language in which schwa can in principle occur in any position. Another advantage of this approach is that it explains why there are languages in which underlying schwas are just as stressless as the epenthetic ones or the ones which are the result of reduction. (Underlying schwas will be the topic of the next section.) There are also problems connected to this approach, however: most importantly, the fact that generally stressed positions do license more feature contrasts than unstressed ones, and that epenthetic vowels are unstressable, requires an explanation.
Regardless of which of the two lines of research will turn out to be most fruitful, the analysis of English vowel reduction should be more sophisticated. Burzio (1994) argues, quite convincingly in my view, that the traditional view on reduction (every unstressed vowel gets reduced to schwa) is too simple. It causes all sorts of problems, because it forces us to assign stress in an extremely complicated way and furthermore to introduce the formal device of destressing rules into the theory in order to undo some of the results of the stressing rules. Burzio (1994) proposes to simplify the stress rules. As a result of this, the conditions on reduction get more complicated. Burzio's main point is that reduction is generally blocked in unstressed closed syllables ending in an obstruent, but not in unstressed syllables ending in a sonorant; in this way, he can get rid of the so-called 'sonorant destressing' rules.
I find Burzio's arguments convincing because the difference between closed and open syllables, and the sonority of a following consonant, also play a role in the behaviour of other types of schwa: we have already seen that schwa cannot occur in closed syllables in French, and I will give an example of S-schwa in Dutch below. It therefore seems more natural to say that unstressed vowels resist reduction because they occur in closed syllables, than to say that they are stressed, and possibly later destressed. There is no independent evidence for the latter assumption.
Having observed this, we can now turn to a language in which reduction is subject to even more factors than it is in English. This language is Dutch (the following discussion is based on Booij 1981, 1995; Kager 1989; Martin 1969; Van Oostendorp 2000). Factors that presumably play a role in Dutch vowel reduction are:
i. Frequency: Vowels in more frequent words tend to get reduced more easily than those in non-frequent ones.
ii. Differences between sociolects and dialects, and differences between style levels. Certain dialects/sociolects/style levels show more reduction than others. (This is known also for Portuguese, cf. Mateus and d'Andrade 2000.)
It is hard to see how factors (i) and (ii) could possibly be taken into account within generative grammar; presumably, this is the reason why they are usually ignored in the literature (but see Hinskens et al. 1997 for several recent attempts to integrate some of these concepts into phonological theory). We will not go into these factors either. Other factors can be analysed more easily:
iii. Syllable type: Vowels in open syllables are easier to reduce than those in closed syllables; in some idiolects (e.g. the one of Booij 1981, and the one of the author of this article, but not in the one of Kager 1989) reduction in closed syllables is only possible if the following consonant is deleted.

(19) /benzinə/ [bəzínə] (*[bənzínə]) 'petrol'
     /kantoːr/ [kətóːr] (*[kəntóːr]) 'office'
This is the same restriction as the one that Burzio (1994) claimed to exist for English. Yet in the literature on Dutch there is less controversy that the purported restriction on open syllables belongs to the theory of reduction rather than to the theory of stress assignment. The reason for this is that reduction is optional in Dutch. Apparently it is more attractive to say that reduction can optionally apply to invariably assigned metrical structures than it is to say that the stress rules themselves are optional (it has to be admitted, though, that there seems to be no solid grounding for this theoretical preference).
I think that the restriction can be understood in the following way. A closed syllable is more complex than an open syllable. A minimal vowel is not sufficient to license such a complex structure. A projection constraint relating the quality of the vowel to the rhyme structure of the syllable in which that vowel occurs is therefore to be preferred.
iv. Stress position: In Dutch, as in English and apparently any other language in which reduction is dependent on the position in the metrical structure, only unstressed vowels get reduced. As a matter of fact, we can distinguish between two types of unstressed vowel (Van Zonneveld 1985). Vowels in so-called 'semiweak' positions, immediately following a stressed position, are easier to reduce than those in 'weak' positions, i.e. all other unstressed positions.

(20) /fonoloɣi/
     formal style        [fònoloɣí]
     less formal style   [fònəloɣí]
     informal style      [fònələɣí]
     impossible          *[fònoləɣí]
Similar facts can be found in other languages as well (Burzio 1994:113). In English, tat[ə]magouchi seems preferable to tatam[ə]gouchi. Jacobs (1989) observes a similar patterning for syncope in the history of French (but cf. Mateus and d'Andrade on Portuguese):

(21)  Latin: similitudinem > simlitudinem
      Old French: sembletume 'resemblance'
The precise way to describe the difference between the two types of unstressed position depends on our assumptions regarding metrical structure. If we assume, with Burzio (1994), that feet can be ternary, we should distinguish between 'foot-internal' and 'peripheral' unstressed syllables. If we work in a theory in which only binary feet are allowed, we should distinguish between weak syllables in a foot and syllables which are somehow left unfooted by the parsing algorithm. Vowel reduction to low vowels in Russian (/stol-á/ → stal-á) only occurs in the syllable directly preceding the stressed syllable (the pretonic syllable henceforth). All vowels reduce to [ə] in unstressed, non-pretonic syllables (the following facts have been copied from Alderete 1999).

(22)  Nom. Sg.   stól      slóv-o
      Gen.       stal-á    slóv-a
      Dat.       stal-ú    slóv-u
      Instr.     stal-óm   slóv-om
      Loc.       stal-é    slóv-e

      v[ó]d-y       'water' nom. pl.
      v[a]d-á       'water' nom. sing.
      v[ə]d-avóz    'water carrier'

Here it seems that a vowel in the position immediately to the left of the stress is less likely to undergo full reduction than vowels in other positions. The differences may be due to a different stress system for Russian, or possibly to the fact that the reduction system of this language is more complicated in other respects. Vowels do not just reduce to schwa; in some positions (and depending on their quality), they also reduce to [a] or [i].

v. Position in the word: Vowels in absolute word-initial or absolute word-final position do not reduce. This observation is absolutely surface-true in Dutch. No matter what the style of speech is, we do not find reduction of vowels that are absolutely word-initial or absolutely word-final. The /e/ is reduced quite easily in Dutch, for instance, but not if it is the first or the last segment of the word, as the following examples are intended to show:

(23)  /plezir/   [pləzir]                'fun'
      /eɣal/     [eɣál] (*[əɣal])        'even'
      /tofe/     [tófe] (*[tófə])        'toffee'

We may think of this as some kind of alignment effect of phonological to morphological structure: the edges of output words should correspond exactly to their morphological specifications.
An alternative explanation for the word-initial vowels is that syllables consisting of only schwa are disallowed; a syllable needs some minimal specification (Cohen et al. 1959). An argument for this alternative may be that vowels next to /h/ also cannot be reduced:

(24)  /plezir/   [pləzir]               'fun'
      /helas/    [helas] (*[həlas])     'alas'

We will see in the next section that underlying schwas also cannot occur in absolutely word-initial position; this is true for many other languages as well, including German, Indonesian and French. There is also an alternative explanation for the non-reduction at the right edge of the word. Some authors (e.g. Brink 1970) have claimed that Dutch schwa is always the result of reduction: words such as mode 'fashion' are underlyingly specified as [mode]; in that case reduction is obligatory in these environments and the word toffee [tofe] should somehow be marked as an exception (it is true that there is only a handful of words in Dutch ending in unstressed [e]).

vi. Vowel quality: The quality of the underlying vowel plays a role in the choice between reduction and non-reduction in many languages. In English, for instance, high vowels are not as likely to undergo reduction as non-high vowels (Chomsky and Halle 1968). Mid vowels in Dutch are easier to reduce than high and low vowels; front vowels are easier to reduce than back vowels; rounded vowels are easier to reduce than unrounded ones. This is illustrated in the following table (from Kager 1989):

(25)  Reduces in           /e/      /a/           /o/, /i/      /u/, /y/
      Weak position        formal   semi-formal   semi-formal   informal
      Semi-weak position   formal   semi-formal   informal      excluded

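The table in (25) can be read as a small lookup structure. The following Python sketch is only an illustration of that reading, not part of Kager's (1989) analysis. It rests on two assumptions that are not stated in the text: that each cell names the most formal style in which the vowel reduces, so that reduction is also available in every less formal style, and that the second vowel of the third column (garbled in the printed table) is /i/. The names `STYLES`, `MIN_STYLE` and `can_reduce` are hypothetical.

```python
# Style levels ordered from most formal to least formal (assumed ordering).
STYLES = ["formal", "semi-formal", "informal"]

# Most formal style in which each vowel reduces, per position, following (25).
# None stands for "excluded". Reading /i/ in the third column is an assumption.
MIN_STYLE = {
    "weak": {
        "e": "formal", "a": "semi-formal",
        "o": "semi-formal", "i": "semi-formal",
        "u": "informal", "y": "informal",
    },
    "semi-weak": {
        "e": "formal", "a": "semi-formal",
        "o": "informal", "i": "informal",
        "u": None, "y": None,
    },
}


def can_reduce(vowel, position, style):
    """Return True if `vowel` may reduce to schwa in `position` at `style`."""
    threshold = MIN_STYLE[position][vowel]
    if threshold is None:
        # The table marks this combination as excluded.
        return False
    # Reduction is possible at the threshold style and at any less formal one.
    return STYLES.index(style) >= STYLES.index(threshold)
```

For instance, `can_reduce("o", "weak", "semi-formal")` holds, whereas `can_reduce("o", "semi-weak", "semi-formal")` does not, which mirrors the difference between the two rows of the table.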
Interestingly, Dressler (1973) found more or less the same hierarchy of segments for Breton vowel reduction to schwa. In Norwegian, Id and Id are the only vowels that can alternate with schwa (Kristoffersen 2000). As far as I am aware, the structure of this hierarchy is still in need of an explanation.
5. S-schwa

R-schwa can be a non-underlying schwa which is the result of a reduction operation; E-schwa can be a non-underlying schwa which is the result of minimal vowel epenthesis. Both types of schwa can also be underlying; the alternation in the case of R-schwa can then be seen as a case of fortition. The alternation in the case of E-schwa can be analysed as a result of deletion. As a matter of logic, we also expect underlying schwas which do not alternate at all and are present in a morpheme under all circumstances; these vowels I call S-schwa. Even though S-schwas are stable in the sense that they occur both in the underlying and in the surface structure of a word, they still often display some special behaviour. The factors involved may be familiar by now: S-schwa does not occur in a stressed position or in a syllable that is too complex. The only evidence we have is therefore distributional, 'static' evidence. We have seen that schwa in French cannot occur in a closed syllable. In Dutch, there are also several restrictions on the syllable shape in which a schwa can occur. For one thing, we do not find schwa in an onsetless syllable. The static evidence for this is that the vowel does not occur as the first segment of a word (*[əɣal]), and that it does not occur immediately after a vowel (*[hiət]). This restriction seems quite common in languages of the world: it can be found also in Indonesian, Slavic languages, and French. On the other hand, Dutch schwa also cannot occur in a syllable with a complex onset (*[papavrə]). I have found no other languages yet in which this restriction seems to hold (except maybe related Germanic languages such as German and certain dialects of Norwegian, Kristoffersen 2000). The restrictions on schwa are not just restrictions on the onset; there are also restrictions on the rhyme of a schwa syllable. Unlike other (short) vowels (harp 'id.' [harp], Belg [bɛlx] 'Belgian', ramp [ramp] 'disaster'), schwa cannot be followed by two tautosyllabic consonants, except if the second consonant is coronal (*[adərp], *[katəlx], *[adəmp], anders [andərs] 'different'). Marked codas, just like marked onsets, are not allowed.
Even within the class of schwa syllables with a simple coda, we find a clear asymmetry. Schwa has a preference for syllables closed by a sonorant. Words such as hennep 'hemp', tinnef 'garbage', etc. are very rare. It is not uncommon for languages to require that codas can only contain a sonorant. There is no evidence, however, that syllables headed by a full vowel are subject to this requirement in Dutch. In general it seems that the requirements on schwa syllables are much stronger than those on syllables headed by other vowels. In Van Oostendorp (2000) I proposed that this is due to the theory of Projection in (4): schwa as an (almost) empty vowel does not have sufficient material to license all of the syllabic nodes that are necessary in order to get complex onsets or codas. Simple vowels are not allowed to project complex syllables.
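The categorical restrictions on Dutch schwa syllables described above (no empty onset, no complex onset, no second coda consonant unless it is coronal) can be stated as a simple well-formedness check. The Python sketch below is an illustration only, not the Projection analysis itself. It assumes a hypothetical representation in which onset and coda are strings with one character per segment, and the membership of `CORONALS` is my own assumption rather than something given in the text. The preference for sonorant codas is a tendency, not a categorical restriction, and is deliberately left out.

```python
# Assumed set of coronal consonants; the text does not list them.
CORONALS = set("tdsznlr")


def schwa_syllable_ok(onset, coda):
    """Check the onset and coda of a schwa-headed syllable.

    `onset` and `coda` are strings with one character per consonant,
    e.g. onset "d", coda "rs" for the final syllable of anders.
    """
    if len(onset) == 0:
        # Schwa does not occur in an onsetless syllable (*[əɣal], *[hiət]).
        return False
    if len(onset) > 1:
        # Schwa does not occur after a complex onset (*[papavrə]).
        return False
    # A second coda consonant must be coronal (*[adərp] vs. anders [andərs]).
    # Applying the same test to any further consonants is an assumption.
    if any(c not in CORONALS for c in coda[1:]):
        return False
    return True
```

With this check, `schwa_syllable_ok("d", "rs")` is accepted, in line with anders, while `schwa_syllable_ok("d", "rp")` and `schwa_syllable_ok("vr", "")` are rejected.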
- Schwa in stress. In most languages that have (underlying) schwa, the vowel is left unstressed. Roughly speaking, this can happen in two ways. First, schwa can preferably end up in the weak position in a foot; this seems to happen in a language like Dutch, in which stress always falls on the penultimate syllable if the final syllable contains a schwa (ballade [baládə], *[báladə], *[baladə́]); if the final syllable contains a full vowel, more (lexical) variation is possible (pánama, pyjáma, chocolá). Schwa thus is not completely invisible for stress; under the standard assumption that Dutch feet are trochaic, we should say that schwa can only occur in the weak syllable of a foot. Alternatively, schwa can behave as if it is completely invisible for the stress algorithm; an instance of this is Indonesian. In this language (Cohn 1989) primary stress is usually on the penultimate syllable of the word (26a-c); secondary stress is on the first syllable and on every odd syllable counting from the right-hand side of the word (26d-g). An exception to this are words which contain a schwa (26a'-h'). These can be best understood as if schwa were not present at all:

(26)  a.   cát               'print'
      b.   cári              'search for'
      c.   bicára            'speak'
      d.   bìjaksána         'wise'
      e.   kòntinuási        'continuation'
      f.   òtobìográfi       'autobiography'
      g.   àmerikànisási     'Americanization'
      a'.  bəri              'give'
      b'.  sətəláh           'after'
      c'.  gáməlan           'Indonesian orchestra'
      d'.  apártəmen         'apartment'
      e'.  cəritəra          'story'
      f'.  pərəmpúan         'woman'
      g'.  difərensiási      'differentiation'
      h'.  divərsifikási     'diversification'

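The idea that schwa is invisible to the Indonesian stress algorithm can be made concrete with a toy procedure. The Python sketch below is not Cohn's (1989) analysis; it is a minimal illustration under assumptions of my own. Words are given as pre-syllabified lists (the syllabification is assumed), schwa syllables are skipped entirely, primary stress goes to the penultimate remaining syllable, and secondary stress goes to the first remaining syllable and to every second syllable leftward from the primary stress, unless that would put two stresses on adjacent syllables. The function name `assign_stress` and the numeric stress levels are hypothetical.

```python
SCHWA = "ə"


def assign_stress(syllables):
    """Return one stress level per syllable: 2 primary, 1 secondary, 0 none.

    Syllables containing schwa are ignored when positions are counted.
    """
    # Indices of the syllables that the stress algorithm can see.
    full = [i for i, syll in enumerate(syllables) if SCHWA not in syll]
    levels = [0] * len(syllables)
    if not full:
        return levels

    # Primary stress: penultimate visible syllable (or the only one).
    primary = len(full) - 2 if len(full) >= 2 else 0
    levels[full[primary]] = 2

    # Secondary stress on the first visible syllable, unless it would be
    # adjacent to the primary stress.
    if primary >= 2:
        levels[full[0]] = 1

    # Further secondary stresses on every second visible syllable to the
    # left of the primary stress, avoiding a clash with the initial stress.
    position = primary - 2
    while position >= 2:
        levels[full[position]] = 1
        position -= 2
    return levels
```

Applied to the syllables of (26g'), `assign_stress(["di", "fə", "ren", "si", "a", "si"])` places secondary stress on the first syllable only and primary stress on the penultimate one, as if the word had five syllables rather than six.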
It is unclear, at least to me, how exactly these facts should be understood. We may understand why schwa cannot occur in the stressed position of a foot, but this does not clarify why it cannot occur in the unstressed position of a foot. Particularly intriguing are the cases in (g') and (h'). There is no secondary stress on a full vowel in these cases: we do not find [difərénsiási]. According to Cohn (1989) the reason for this is that stress only sees full vowels: a trochaic foot is therefore built over the first two full vowels of the word, i.e. the foot structure of this word is [(difəren)si(ási)]. It thus looks as if in this case schwa is not strong enough to project a grid mark at all, not even one which appears in the weak position of a foot.

Schwa remains unstressed also in most dialects of Yupik. Different dialects of Yupik choose different means to avoid a stressed schwa. Stress is generally on the second syllable of the word in Yupik; the stressed vowel is lengthened (/qayani/ → [qaya:ni]). If this syllable contains schwa, the vowel is deleted in the General Central dialect of Yupik (GCY), as if it wants to escape a stress position. Stress is then assigned to the first syllable of the word.

(27)  /qayapigkani/   [qayáápixkani]   'his own future kayak'
      /qanəqa/        [qánqa]          'my mouth'

GCY schwa therefore should be classified as an E-schwa; it is however different from the cases of E-schwa we have seen above because the triggering factor for the alternation is not syllable structure, but rather stress. A pattern that is fairly similar to the one of GCY can be found in Alutiiq Yupik. In other dialects of Yupik, schwa does get stressed, but this stress does not trigger lengthening, as it does on other vowels. Different dialects of Yupik may solve this problem in different ways. In the Norton Sound dialect, for instance, the consonant following the schwa gets lengthened: /atəpik/ → [atə́ppik]. In this way, the stressed syllable is still heavy, even though schwa need not be lengthened in order to achieve this result. In the Central Siberian dialect of Yupik, on the other hand, nothing happens to the stressed schwa-headed syllable at all: /atəpik/ → [atə́pik].
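The three Yupik strategies for a schwa in the stressed second syllable can be compared side by side in a small sketch. The Python code below is an informal illustration only and makes several assumptions not found in the text: syllables are given as (onset, nucleus, coda) triples, the dialect labels are plain strings, and the function name `realise` is hypothetical. It covers only the patterns mentioned above: lengthening of a stressed full vowel, schwa deletion with stress shift in General Central Yupik, gemination of the following consonant in Norton Sound, and no change in Central Siberian.

```python
SCHWA = "ə"


def realise(syllables, dialect):
    """Return (surface syllables, index of the stressed syllable).

    `syllables` is a list of (onset, nucleus, coda) string triples.
    """
    sylls = [list(s) for s in syllables]
    if len(sylls) < 2:
        # Words of one syllable are outside the scope of this sketch.
        return ["".join(s) for s in sylls], 0

    stressed = 1  # stress is generally on the second syllable
    onset, nucleus, coda = sylls[1]

    if nucleus != SCHWA:
        # A stressed full vowel is lengthened.
        sylls[1][1] = nucleus + "ː"
    elif dialect == "General Central":
        # Schwa is deleted; its consonants join the first syllable,
        # which then receives the stress.
        sylls[0][2] += onset + coda
        del sylls[1]
        stressed = 0
    elif dialect == "Norton Sound" and len(sylls) > 2 and not coda:
        # The consonant following the schwa is lengthened (geminated),
        # which makes the stressed syllable heavy.
        sylls[1][2] = sylls[2][0]
    # Central Siberian: the stressed schwa syllable is left unchanged.

    return ["".join(s) for s in sylls], stressed
```

For /qanəqa/ the General Central setting yields the syllables `qan` and `qa` with stress on the first, matching [qánqa]; for /atəpik/ the Norton Sound setting yields `a`, `təp`, `pik` with stress on the second.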
6. Conclusion
Schwa is special in the sense that it has a more limited distribution than other vowels: for instance, it can often only occur in a rather simple syllable type, and only in an unstressed position. Schwa is also special in the sense that it is the target of reduction and deletion; furthermore, it is easier to delete schwa than it is to delete a full vowel. Phonology should be able to describe all of these facts. In order to fully understand the behaviour of schwa, we need a fully developed theory of syllable structure, of metrical structure, of segmental structure, and of the way in which these different dimensions of phonological structure can interact. Conversely, while developing these subtheories, we sharpen our view of schwa. Most of the facts mentioned above could probably not have been raised without sufficient understanding of phonological structure, phonological derivation and phonological constraint interaction. Some of the facts mentioned here are still rather problematic for any phonological theory known to the author of this overview; chances are that new facts may be discovered while phonologists are developing their theories. I suspect that we will not have a satisfying theory of schwa until we have a satisfying theory of phonology as a whole.
Bibliography
Alderete, John (1999). Head dependence in stress-epenthesis interaction. In: Ben Hermans and Marc van Oostendorp (eds.). The Derivational Residue in Phonological Optimality Theory, 9-31. Amsterdam/Philadelphia: Benjamins.
Anderson, Stephen (1982). The analysis of French schwa; Or, how to get something from nothing. Language 58, 121-138.
Archangeli, Diane (1984). Underspecification in Yawelmani phonology and morphology. PhD dissertation, Massachusetts Institute of Technology.
Barry, William J. (1992). Comments on chapter 2, '"Targetless" schwa; An articulatory analysis'. In: Gerard Docherty and Robert Ladd (eds.). Papers in Laboratory Phonology II, Gesture, Segment, Prosody. Cambridge, UK: Cambridge University Press.
Barry, William J. (1995a). Schwa vs. schwa+/r/ in German. Phonetica 52, 228-235.
Barry, William J. (1995b). Variation in schwa + /r/ in German. In: Proceedings of International Congress of Phonetic Sciences 1995. Stockholm: ICPhS.
Basbøll, Hans (1981). Metrical theory and the French foot. In: Wolfgang Dressler, Oskar Pfeiffer and John Rennison (eds.). Phonologica 1980; Akten der vierten internationalen Phonologietagung, 12-36. Innsbruck: Institut für Sprachwissenschaft der Universität.
Basbøll, Hans (1988). Sur l'identité phonologique du schwa français et son rôle dans l'accentuation et dans la syllabation. In: Verluyten (1988).
Basbøll, Hans (1990). Distinctive features, syllable structure, and vowel space. In: Pier Marco Bertinetto, Michael Kenstowicz, and M. Loparco (eds.). Certamen Phonologicum II, Papers from the 1990 Cortona Phonology Meeting, 3-18. Torino: Rosenberg and Sellier.
Bell, Alan (1978). Syllabic consonants. In: Joseph Greenberg (ed.). Universals of Human Language, vol. II, Phonology, 153-201. Stanford: Stanford University Press.
Bennink, Clements, in prep. Harmony and disharmony in Hungarian and Turkish. PhD Dissertation, Tilburg University.
Berendsen, Egon (1986). The phonology of cliticization. PhD dissertation, University of Utrecht. Also appeared at Dordrecht: Foris.
Berendsen, Egon and Wim Zonneveld (1984). Nederlandse schwa-invoeging op z'n Deens. Spektator 14, 166-17.
Bergem, Dick van (1993). Acoustic vowel reduction. Speech Communication 12, 1-23.
Bergem, Dick van (1994). Reflections on aspects of vowel reduction. Proceedings of the Institute of Phonetic Sciences 18. Amsterdam, University of Amsterdam. http://fonsg3.let.uva.nl/Proceedings/Proceedings_18_contents.html.
Bergem, Dick van (1995a). A model of coarticulatory effects on the schwa. Speech Communication 14, 143-162.
Bergem, Dick van (1995b). Acoustic and lexical vowel reduction. PhD Dissertation, University of Amsterdam.
Bethin, Christina (1992). Polish Syllables. Columbus, Ohio: Slavica.
Bobaljik, Jonathan (1997). Mostly predictable: Cyclicity and the distribution of schwa in Itelmen. Manuscript, Harvard. Available at ftp://ruccs.rutgers.edu/pub/OT/TEXTS/archive208-07971/208-07971.ps.
Booij, Geert (1976). Klinkerreductie in het Nederlands. Leuvense Bijdragen 65, 461-470.
Booij, Geert (1981). Generatieve Fonologie van het Nederlands. Utrecht & Antwerpen: Het Spectrum.
Booij, Geert (1992). Fonologische en fonetische aspecten van klinkerreductie. Spektator 11, 295-301.
Booij, Geert (1983). French C/Ø alternations, extrasyllabicity and Lexical Phonology. The Linguistic Review 3, 102-134.
Booij, Geert (1995). The Phonology of Dutch. Oxford: Oxford University Press.
Bouchard, Denis (1981). A voice for "e muet". Journal of Linguistic Research 1, 17^7.
Brink, Daniel (1976). Problems in phonology; A generative phonology of Dutch. PhD Dissertation, University of Wisconsin.
Brockhaus, Wiebke (1995). Final Devoicing in the Phonology of German. Tübingen: Niemeyer.
Browman, Catherine and L. Goldstein (1992). "Targetless schwa": An articulatory analysis. In: Gerard Docherty and Robert Ladd (eds.). Papers in Laboratory Phonology II, Gesture, Segment, Prosody. Cambridge, UK: Cambridge University Press.
Burzio, Luigi (1989). Prosodic reduction. In: Carl Kirschner and Janet Decesaris (eds.). Studies in Romance Linguistics. Amsterdam: John Benjamins.
Burzio, Luigi (1994). Principles of English stress. Cambridge, UK: Cambridge University Press.
Charette, Monik (1991). Conditions on Phonological Government. Cambridge, UK: Cambridge University Press.
Chomsky, Noam and Morris Halle (1968). The sound pattern of English. Cambridge, Mass: The MIT Press.
Clements, Nick and Engin Sezer (1982). Vowel and consonant disharmony in Turkish. In: Harry van der Hulst and Norval Smith (eds.). The Structure of Phonological Representations, Vol. II, Dordrecht: Foris.
Cohn, Abigail (1989). Stress in Indonesian and bracketing paradoxes. Natural Language and Linguistic Theory 7, 167-216.
Cohn, Abigail and John McCarthy (1994). Foot alignment and apparent cyclicity in Indonesian. Manuscript, Cornell University and University of Massachusetts.
Dell, François (1973/1985). Les règles et les sons. Paris: Herrmann. First edition 1973, second, revised edition 1985.
Dell, François (1976). Schwa précédé d'un groupe obstruante-liquide. Recherches Linguistiques 4, 75-111.
Dell, François (1979). On French phonology and morphology and some vowel alternations in French. Studies in French linguistics 1.3.
Dell, François (1995). Consonant clusters and phonological syllables in French. Lingua 95, 5-26.
Donegan, J. (1978). On the natural phonology of vowels. PhD Dissertation, Ohio State University. Published by Garland Press, 1985.
Dressler, Wolfgang (1973). Allegroregeln rechtfertigen Lentoregeln; Sekundäre Phoneme des Bretonischen. Innsbruck: Innsbrucker Beiträge zur Sprachwissenschaft 9.
Dressler, Wolfgang (1975). Methodisches zu Allegro-Regeln. In: Wolfgang Dressler and F.V. Mares (eds.). Phonologica 1972, 219-234. München: Fink Verlag.
Durand, Jacques (1976). Generative phonology, dependency phonology, and Southern French. Lingua et Stile 11, 3-23.
Durand, Jacques (1990). Generative and Nonlinear Phonology. London: Longman.
Durand, Jacques, C. Slater and H. Wise (1988). Observations on schwa in Southern French. In: C. Slater, J. Durand and X. Bate (eds.). French Sound Patterns: Changing Perspectives, Association for French Language Studies no. 32, University of Essex.
Encrevé, Pierre (1988). La liaison avec et sans enchaînement; Phonologie tridimensionelle et usages du français. Paris: Éditions du Seuil.
Fidelholtz, J. (1979). Word frequency and vowel reduction in English. Proceedings of the Chicago Linguistics Society 11, 200-213.
Fouché, Pierre (1959). Traité de prononciation française; 2e édition. Paris: Klincksieck.
Geilmann, Jürgen (1995). The acquisition of German vowel quality. Two case studies. In: Harry van der Hulst and Jeroen van de Weijer (eds.). Leiden in Last/HIL Phonology Papers I. The Hague: HAG.
Giegerich, Heinz (1987). Zur Schwa-Epenthese im Standarddeutschen. Linguistische Berichte 112, 449-469.
Giegerich, Heinz (1992). English phonology; An introduction. Cambridge, UK: Cambridge University Press.
Gimson, Alfred Charles (1980). An Introduction to the pronunciation of English. London: Arnold Publishing.
Gussenhoven, Carlos (1978). Het Nederlandse diminutief-suffix: schwa-insertie anders bekeken. De Nieuwe Taalgids 71, 206-211.
Gussmann, Edmund (1991). Schwa and syllabic sonorants in a nonlinear phonology of English. Anglica Wratislaviensia 17, 27-39.
Haas, Wim de (1986). Partial syllabification and schwa epenthesis in Dutch. Gramma 10, 143-161.
Hall, Tracy (1989). German syllabification, the velar nasal and the representation of schwa. Linguistics 27, 807-842.
Hall, Tracy (1992). Syllable structure and syllable-related processes in German. Tübingen: Niemeyer.
Halle, Morris and Karuvannur Puthanveettil Mohanan (1985). Segmental phonology of Modern English. Linguistic Inquiry 16, 57-116.
Halle, Morris and Jean-Roger Vergnaud (1987). An Essay on Stress. Cambridge, Mass: The MIT Press.
Hamans, Camiel and Roland Noske (1988). The analysis of German schwa. Wiener Linguistische Gazette, Supplement/Beiheft 6, 17-19.
Harris, John (1990). Reduction harmony. Paper presented at the 13th GLOW Colloquium, London: SOAS.
Harris, John (1994). English Sound Structure. Oxford: Blackwell.
Hart, Johan 't (1969). Fonetische steunpunten. De Nieuwe Taalgids 62, 168-174.
Hayes, Bruce (1995). Metrical stress theory: principles and case studies. Chicago: University of Chicago Press.
Hawkins, W. (1985). Patterns of vowel loss in Macushi Carib. International Journal of American Linguistics 16, 87-90.
Hoeksema, Jack (1985). Wortels, stammen en de schwa. TABU 15, 150-153.
Hooper, Joan (1976). An Introduction to Natural Generative Phonology. New York: Academic Press.
Hulst, Harry van der (1984). Syllable Structure and Stress in Dutch. Dordrecht: Foris.
Hulst, Harry van der, and John van Lit (1987). Lettergreepstructuur. GLOT 10, 165-195.
Issatschenko, Alexander (1974). Das "schwa mobile" und "schwa constans" im Deutschen. In: Ulrich Engel and Paul Grebe (eds.). Sprachsystem und Sprachgebrauch; Festschrift für Hugo Moser, 141-171. Düsseldorf: Schwann.
Jacobs, Haike (1989). Non-linear studies in the historical phonology of French. PhD Dissertation, Nijmegen: Catholic University of Nijmegen.
Jong, Daan de and Martin Hietbrink (1994). The morphonology of the French prefix RE. In: Reineke Bok-Bennema and Crit Cremers (eds.). Linguistics in the Netherlands 1994, 43-52. Amsterdam: John Benjamins.
Kager, René (1989). A metrical theory of stress and destressing in English and Dutch. Dordrecht: Foris.
Kager, René (1990). Dutch schwa in moraic phonology. In: Michael Ziolkowski, Manuela Noske and Karen Deaton (eds.). Papers from the 26th Regional Meeting of the Chicago Linguistic Society, 189-214. Volume II, The parasession on the syllable in phonetics and phonology. Chicago: CLS.
Kager, René, Ellis Visch and Wim Zonneveld (1987). Nederlandse woordklemtoon (Hoofdklemtoon, bijklemtoon, reductie, voeten). GLOT 10, 197-226.
Kahn, Daniel (1976). Syllable-based generalizations in English phonology. PhD Dissertation, Cambridge, Mass: MIT. Also appeared with Garland Press.
Katz, Dovid (1987). A grammar of the Yiddish language. London: Duckworth.
Kaye, Jonathan (1990). Government in phonology: The case of Moroccan Arabic. The Linguistic Review 6, 131-160.
Kaye, Jonathan, Jean Lowenstamm and Jean-Roger Vergnaud (1985). The internal structure of phonological elements: A theory of charm and government. Phonology Yearbook 2, 305-328.
Kaye, Jean and Yves Morin (1977). Il n'y a pas de règles de troncation, voyons! Manuscript, Montreal: UQAM.
Keating, Patricia (1988). Underspecification in phonetics. Phonology 5, 275-292.
Kenstowicz, Michael (1994). Syllabification in Chuckchee: A constraints-based analysis. Manuscript, Cambridge, Mass: MIT. Available at ftp://ruccs.rutgers.edu/pub/OT/TEXTS/archive/30-10941/30-10941.ps.
Kleinhenz, Ursula (1991). Die Alternation von Schwa mit silbischen Sonoranten im Deutschen. Magisterarbeit, Köln: Institut für Deutsche Sprache und Literatur.
Kohler, Klaus (1990). Segmental reduction in connected speech in German: Phonological facts and phonetic explanations. In: William J. Hardcastle and Alain Marchal (eds.). Speech Production and Speech Modelling. Dordrecht: Kluwer Academic Publishers.
Koopmans-Van Beinum, Florien (1980). Vowel Contrast Reduction: An Acoustic and Perceptual Study of Dutch Vowels in Various Speech Conditions. Amsterdam: Academische Pers.
Koopmans-Van Beinum, Florien (1982). Akoestische en perceptieve aspecten van klinkercontrastreductie en de rol van de fonologie. Spektator 11, 284-294.
Koopmans-Van Beinum, Florien (1993/1994). What's in a schwa? IFA proceedings 1992, 16, 53-62.
Kristoffersen, Gjert (1992). The syllable structure of Norwegian. PhD Dissertation, Tromsø University.
Kristoffersen, Gjert (2000). The phonology of Norwegian. Oxford: Oxford University Press.
Ladefoged, Peter, and Ian Maddieson (1996). Sounds of the world's languages. Oxford: Blackwell.
Lahiri, Aditi and Jean Koreman (1988). On foot typology. NELS 18, 286-299.
Lapointe, Steven and Mark Feinstein (1982). The role of vowel deletion and epenthesis in the assignment of syllable structure. In: Harry van der Hulst and Norval Smith (eds.). The Structure of Phonological Representations, 69-120. Vol. II, Dordrecht: Foris.
Lass, Roger (1976). English phonology and phonological theory; Synchronic and diachronic studies. Cambridge, UK: Cambridge University Press. Cambridge Studies in Linguistics 17.
Lass, Roger (1984). Phonology; An introduction to basic concepts. Cambridge: Cambridge University Press.
Levelt, Claartje (2001). Schwa-schma. Manuscript, Leiden University.
Lindblom, Björn (1963). Spectrographic study of vowel reduction. Journal of the acoustic society 35, 1773-1781.
Lodge, Ken (1987). French phonology again: Deletion phenomena, syllable structure and other matters. In: John Anderson and Jacques Durand (eds.). Explorations in Dependency Phonology, 133-168. Dordrecht: Foris.
Lyche, Chantal (1979). French "schwa deletion" in Natural Generative Phonology. Nordic Journal of Linguistics 2, 91-111.
Martin, A. (1968). Klinkerreductie: Een casus. Manuscript, Universiteit Utrecht.
Martinet, André (1972). La nature phonologique d'e caduc. In: Albert Valdman (ed.). Papers in linguistics and phonetics to the memory of Pierre Delattre, 239-399. The Hague: Mouton.
Mascaró, Joan (1987). Vowel reduction as deletion. Handout, Going Romance Conference.
Mateus, Maria Helena, and Ernesto d'Andrade (2000). The phonology of Portuguese. Oxford: OUP.
Meinhold, Gottfried (1989). Das problematische [ə]. In: Edith Slembek (ed.). Von Lauten und Leuten; Festschrift für Peter Martens zum 70. Geburtstag. (Sprache und Sprechen volume 21.) Frankfurt am Main: Scriptor Verlag.
Melvold, Janis (1990). Structure and stress in the phonology of Russian. PhD Dissertation, Cambridge, Mass: MIT.
Morin, Yves Charles (1974). Règles phonologiques à domaine indéterminé: Chute du cheva en français. In: Cahiers de linguistique 4, 69-88. Presses de l'Université du Québec, Montréal.
Morin, Yves Charles (1978). The status of mute e. Studies in French linguistics 1, 312-345.
Morin, Yves Charles (1982). Cross-syllabic constraints and the French "e-muet". Journal of linguistic research 2, 41-56.
Morin, Yves Charles (1983). De l'ouverture des [e] du moyen français. Revue québécoise de linguistique 12, 37-61.
Morin, Yves Charles (1985). French data and phonological theory. Linguistics 25, 815-843.
Morin, Yves Charles (1988). De l'ajustement du cheva en syllabe fermé dans la phonologie du français. In: Verluyten (1988).
Moulton, William (1962). The vowels of Dutch; Phonetic and distributional classes. Lingua 11, 294-312.
Nagy, Naomi ? Stress and schwa in Faetar. Manuscript. http://english-l.unh.edu/nagy/papers/stress.schwa.html.
Neidle, Carol (1979). Syllable structure, markedness and the distribution of the "mute e" in French. Manuscript.
Nijen Twilhaar, Jan (1990). Generatieve fonologie en de studie van Oostnederlandse dialecten. Amsterdam: P.J. Meertens-Instituut.
Nijen Twilhaar, Jan (1994). Genus en fonologisch gedrag van nomina op schwa. In: Geert Booij and Jaap van Marle (eds.). Dialectfonologie. Amsterdam: P.J. Meertens-Instituut.
Nikiema, Norbert (1989a). Vocalic epenthesis reanalyzed: The case of Tangale. In: John Hutchison and Victor Manfredi (eds.). Current approaches to African linguistics, 47-51. Dordrecht: Foris.
Nikiema, Norbert (1989b). Gouvernement propre et licenciement en phonologie: Le cas du Tangale. In: D. Peeters (ed.). Langues orientales anciennes philologie et linguistique 2, 225-251. Paris: Louvain.
Nooteboom, Sieb (1972). Production and perception of vowel duration; A study of durational properties of vowels in Dutch. PhD Dissertation, Utrecht University.
Nord, Lennart (1986). Acoustic studies of vowel reduction in Swedish. Speech transmission laboratory quarterly progress and status report 4, 19-36.
Noske, Roland (1988). La syllabification et les règles de changement de syllabe en français. In: Verluyten (1988).
Noske, Roland (1992). A theory of syllabification and segmental alternation; With studies on the phonology of French, German, Tonkawa and Yawelmani. PhD Dissertation, Tilburg University. Also appeared at Tübingen: Niemeyer.
Oostendorp, Marc van (1997). Style levels in conflict resolution. In: Frans Hinskens, Roeland van Hout and Leo Wetzels (eds.). Variation, change and phonological theory, 207-228. Amsterdam: John Benjamins.
Oostendorp, Marc van (2000). Phonological Projection. Berlin/New York: Mouton.
Piggot, Glynn (1995). Epenthesis and syllable weight. Natural Language and Linguistic Theory 13, 283-326.
Piggot, Glynn and Rajendra Singh (1984). The empty node: An analysis of epenthesis. McGill working papers in linguistics 2, 30-55.
Piggot, Glynn and Rajendra Singh (1985). The phonology of epenthetic segments. Canadian Journal of Linguistics 1, 64-109.
Prince, Alan and Paul Smolensky (1993). Optimality theory; Constraint interaction in generative grammar. Manuscript, Rutgers and Colorado.
Prokosch, Eduard. A Comparative Germanic Grammar. Philadelphia: Linguistic Society of America/University of Pennsylvania.
Pulgram, Ernst (1961). French /ə/: Statics and dynamics of linguistic subcodes. Lingua 10, 305-325.
Pulleyblank, Douglas (1988). Vocalic underspecification in Yoruba. Linguistic Inquiry 19, 233-270.
Ramers, Karl Heinz (1988). Vokalquantität und -qualität im Deutschen. Tübingen: Niemeyer.
Rappaport, Malka (1984). Stress and ultra-short vowels in Tiberian Hebrew. Proceedings of WCFL.
Rennison, John (1980). What is shwa in Austrian German? The case for epenthesis and its consequences. Wiener Linguistische Gazette 24, 78-91.
Rennison, John (1993). Empty nuclei in Koromfe: A first look. Linguistique Africaine 11, 35-65.
Rennison, John (1994). Syllables, variable vowels and empty categories. In: Wolfgang Dressler, Martin Prinzhorn, and John Rennison (eds.). Phonologica 1992. Torino: Rosenberg and Sellier.
Rennison, John (1997). Syllables in Koromfe. Gur Papers / Cahiers Voltaiques 2. Universität Bayreuth.
Rialland, Anne (1985). Schwa et syllabes en français. In: Leo Wetzels and Engin Sezer (eds.). Studies in Compensatory Lengthening. Dordrecht: Foris.
Roca, Iggy (1994). Generative phonology. London: Routledge.
Ross, J.R. (1972). A reanalysis of English word stress. In: Michael Brame (ed.). Contributions to Generative Phonology. Austin: University of Texas Press.
Rowicka, Grazyna (1985). English word stress and empty vowel slots. Paper read at the Tamawy Conference on Contrastive Linguistics; cited in Szpyra (1996).
Rubach, Jerzy (1986). Abstract vowels in three-dimensional phonology: The yers. The Linguistic Review 11, 124-154.
Rubach, Jerzy and Geert Booij (1990). Syllable structure assignment in Polish. Phonology 7, 121-158.
Schane, Stanford (1965). The phonological and morphological structure of French. PhD Dissertation, Cambridge, Mass: MIT.
Schane, Stanford (1968). On the abstract character of French "E-muet". Glossa 2, 150-163.
Schane, Stanford (1974). There is no French Truncation Rule. In: Joe Campbell, Mark G. Goldin and Mary Clayton Wang (eds.). Linguistic Studies in Romance Languages, 89-100. Washington, D.C.: Georgetown University Press.
Selkirk, Lisa (1978). Comments on Morin's paper: The French foot: On the status of "mute e". Studies in French Linguistics 1, 24-28.
Sievers, Eduard (1878). Zur Accent- und Lautlehre der Germanischen Sprache. Beiträge zur Geschichte der Deutschen Sprache und Literatur 5, 63-163.
Sojijärvi, A. (1965). Der Mokschamordvinische ə-Vokal im Lichte der Sonagramme. Publicationes Instituti Phonetici Universitatis Helsingiensis 20, Helsinki.
Spencer, Andrew (1985). A non-linear analysis of vowel-zero alternations in Polish. Journal of Linguistics 22, 249-280.
Stemberger, Joe (1993). Glottal transparency. Phonology 10, 107-138.
460
Marc van Oostendorp
Steriade, Donca (1995). Underspecification and markedness. In: J. Groldsmith, (ed.). Handbook ofPhonological Theory, 114-174. London: Blackwell. Szpyra, Jolanta. (1995). Three tiers in Polish and English phonology. Postdoctoral dissertation, Lublin: Marie Curie-Sklodowska University Press. Szpyra, Jolanta (1996). "Incomplete" vowels in English. In: Henryk Kardela and Bogdan Szymanek, (eds.). A Festschrift for Edmund Gussmann from his Friends and Colleagues, 269-292. Lublin: The University Press of the Catholic University of Lublin. Trager, George Leonard and Henry Lee Smith (1951). An Outline of English Structure. Norman, Oklahoma: Battenbrug Press. Studies in Linguistics, Occasional Papers, 3. Tranel, Bernard (1974). Le cas de l'effacement facultatif du schwa en français: Quelques implications théoriques. Recherches linguistiques à Montréal 1. Tranel, Bernard (1981). Concreteness in generative phonology. Berkeley: University of Colorado Press. Tranel, Bernard (1984). Floating schwas and closed syllable adjustment in French'. In: Woflgang Dressier et al., (eds.). Phonologica 1985. Cambridge, UK: Cambridge University Press. Tranel, Bernard (1985). On closed syllable adjustment in French. In: Larry King and Catherine Maley (eds.). Selected Papers from the 13"' Linguistic Symposium on Romance Languages, 377-405. Amsterdam/ Philadelphia: John Benjamins. Tranel, Bernard (1987). French schwa and non-linear phonology. Linguistics 25, 845-866. Tranel, Bernard (1988). A propos de l'ajustement de e en français. In Verluyten (1988). Tranel, Bernard (1994). French liaison and elision revisited: A vmified account within Optimality Theory. Manuscript Irvine: University of California. Trommelen, Mieke (1983). The syllable in Dutch; With special reference to diminutive formation. Dordrecht: Foris. Velde, Hans Van der (1996). Variatie en verandering in het gesproken Standaard-Nederlands. PhD Dissertation, Nijmegen: Katholieke Universiteit Nijmegen. Verlu)^en, Sylvain (1985). 
Prosodie structure and the development of French schwa. In: Jacek Fisiak (ed.). Papers from the 6"' International Conference on Historical Linguistics, 549-559. Amsterdam: John Benjamins, and Poznañ: Adam Mickiewicz University Press.
Schwa in phonological theory
461
Verluyten, Sylvain (1988). La phonologie du schwa français. Amsterdam: John Benjamins. Verluj^n, Sylvain and M. Bertele (1987). Observational adequacy and formal complexity: A comparative evaluation of five phonological theories on French schwa deletion. Manuscript, University of Antwerp. Visser, Willem (1994). Schwa-appendixen in het Fries. In: Geert Booij and Jaap van Marie (eds.). Dialectfonologie. Amsterdam: P.J. Meertens-Instituut. Visser, Willem (1997). The syllable in Frisian. PhD Dissertation, Vrije Universiteit Amsterdam. Appeared at The Hague: HAG. Vijver, Ruben van de (1998). The iambic issue; Iambs as a result of constraint interaction. PhD Dissertation, Leiden University Appeared at The Hague: HAG. Wells, John (1982). Accents of English. Cambridge, UK: Cambridge University Press. Wiese, Richard (1986). Schwa and the structure of words in German. Linguistics 24, 696-724. Wiese, Richard (1988). Silbische und lexikalische Phonologie; Studien zum Chinesischen und Deutschen. Tübingen: Niemeyer. Yip, Moira (1987). English vowel epenthesis. Natural Language and Linguistic Theory 5,463-484. Zonneveld, Ron van (1985). Word rhythm and the Janus syllable. In: Harry van der Hulst and Norval Smith, (eds.). The Structure of Phonological Representations II, Dordrecht: Foris. Zonneveld, Wim (1993). Schwa, superheavies, stress and syllables in Dutch. The Linguistic Review 10,59-110.
Distributed Morphology Heidi Harley and Rolf Noyer
Whenever a major revision to the architecture of UG is proposed, it takes some time for sufficient work to accumulate to allow evaluation of the viability of the proposal, as well as for its broad outlines to become familiar to those not immediately involved in the investigation. The introduction of Distributed Morphology in the early 1990s, by Morris Halle and Alec Marantz, is a case in point. In the seven-year period since the first paper outlining the framework appeared, a reasonably substantial body of work has appeared, addressing some of the key issues raised by the revision. The goal of this article is to introduce the motivation and core assumptions of the framework, and at the same time provide some pointers to the recent work which revises and refines the basic DM proposal and increases DM's empirical coverage. Since the particular issues we discuss cover such a broad range of territory, we do not attempt to provide complete summaries of individual papers, nor, for the most part, do we attempt to relate the discussion of particular issues to the much broader range of work that has been done in the general arena. What we hope to do is allow some insight into (and foster some discussion of) the attitude that DM takes on specific issues, with some illustrative empirical examples.

This article is organized as follows. Section 1 sketches the layout of the grammar and discusses the division of labor between its components. The "distributed" of Distributed Morphology refers to the separation of properties which in other theories are collected in the Lexicon, and in section 1 we elaborate on the motivation for this separation and its particulars. Section 2 explicates the mechanics of Spell-Out, giving examples of competition among phonological forms from Dutch, introducing the notion of "f-morpheme" and "l-morpheme" and distinguishing allomorphy from suppletion with examples from English. Section 3 discusses the operations which are available in the Morphology component, addressing in turn Morphological Merger, Impoverishment and Fission, with examples from Latin, Serbo-Croatian, Norwegian, and Tamazight Berber. We also provide an illustration of the contrast between a "piece-based" theory such as DM and process-based morphological theories. Section 4 treats the relationship of the syntax to the morphology, Separationism and its limitations, the ways in which a mismatch between syntactic terminal nodes and morphosyntactic features may arise, and the distinct types of syntax/morphology mismatches conventionally classified as cliticization. We conclude in Section 5 with an agenda for future research.
1. The structure of the grammar
There are three core properties which distinguish Distributed Morphology from other morphological theories: Late Insertion, Underspecification, and Syntactic Hierarchical Structure All the Way Down. The grammar, still of the classic Y-type, is sketched in (1) below. Unlike the theory of LGB (Chomsky 1981) and its Lexicalist descendants, in DM the syntax proper does not manipulate anything resembling lexical items, but rather generates structures by combining morphosyntactic features (via Move and Merge) selected from the inventory available, subject to the principles and parameters governing such combination.

Late Insertion refers to the hypothesis that the phonological expression of syntactic terminals is in all cases provided in the mapping to Phonological Form (PF). In other words, syntactic categories are purely abstract, having no phonological content. Only after syntax are phonological expressions, called Vocabulary Items, inserted in a process called Spell-Out. It is further worth noting that this hypothesis is stronger than the simple assertion that terminals have no phonological content: as we will see below, there is essentially no pre-syntactic differentiation (other than, perhaps, indexing) between two terminal nodes which have identical feature content but will eventually be spelled out with distinct Vocabulary Items such as dog and cat.

Underspecification of Vocabulary Items means that phonological expressions need not be fully specified for the syntactic positions where they can be inserted. Hence there is no need for the phonological pieces of a word to supply the morphosyntactic features of that word; rather, Vocabulary Items are in many instances default signals inserted where no more specific form is available.
Syntactic Hierarchical Structure All the Way Down entails that elements within syntax and within morphology enter into the same types of constituent structures (such as can be diagrammed through binary branching trees). DM is piece-based in the sense that the elements of both syntax and morphology are understood as discrete constituents instead of as (the results of) morphophonological processes.

(1)
List A: Morphosyntactic features: [Det] [1st] [CAUSE] [+pst] [Root] [pl] etc...
Syntactic Operations (Merge, Move, Copy)
Morphological Operations
Logical Form
Phonological Form (Insertion of Vocabulary Items, Readjustment, phonological rules)
List B: Vocabulary Items: /dog/: [Root] [+count] [+animate]...; /-s/: [Num] [pl]...; /did/: [pst]...; etc...
Conceptual Interface ("Meaning")
List C: Encyclopedia (non-linguistic knowledge)
    dog: four legs, canine, pet, sometimes bites etc... chases balls, in environment "let sleeping ___ lie", refers to discourse entity who is better left alone...
    cat: four legs, feline, pet, purrs, scratches, in environment "the ___ out of the bag" refers to a secret... etc...
1.1. The Lexicalist Hypothesis and DM

There is no Lexicon in DM in the sense familiar from generative grammar of the 1970s and 1980s. In other words, DM unequivocally rejects the Lexicalist Hypothesis. The jobs assigned to the Lexicon component in earlier theories are distributed through various other components. For linguists committed to the Lexicalist Hypothesis, this aspect of DM may be the most difficult to accept, but it is nevertheless a central tenet of the theory. (For discussion of this issue from a Lexicalist viewpoint, see Zwicky & Pullum 1992.)

The fullest exposition of the anti-Lexicalist stance in DM is found in Marantz (1997a). There, Marantz argues against the notion of a generative lexicon, adopted in such representative examples of the Lexicalist Hypothesis as Selkirk (1982) or Di Sciullo and Williams (1987), using arguments from the very paper which is usually taken to be the source of the Lexicalist Hypothesis, Chomsky's (1970) "Remarks on Nominalization". Marantz points out that it is crucial for Chomsky's argument that, for instance, a process like causativization of an inchoative root is syntactic, not lexical. Chomsky argues that roots like grow or amuse must be inserted in a causative syntax, in order to derive their causative forms. If their causative forms were lexically derived, nothing would prevent the realization of the causativized stem in a nominal syntax, which the ungrammaticality of *John's growth of tomatoes indicates is impossible. Other lexicalist assumptions about the nature of lexical representations, Marantz notes, are simply unproven: no demonstration has been made of correspondence between a phonological "word" and a privileged type of unanalyzable meaning in the semantics or status as a terminal node in the syntax, and counterexamples to any simplistic assertion of such a correspondence are easy to find.
Because there is no Lexicon in DM, the term "lexical item" has no significance in the theory, nor can anything be said to "happen in the Lexicon", and neither can anything be said to be "lexical" or "lexicalized". Because of the great many tasks which the Lexicon was supposed to perform, the terms "lexical" and "lexicalized" are in fact ambiguous. (For a discussion of terminology, see Aronoff 1994.) Here we note a few of the more usual assumptions about lexicalization, and indicate their status in the DM model:

Lexical(ized) = Idiomatized. Because the Lexicon was supposed to be a storehouse for sound-meaning correspondences, if an expression is conventionally said to be "lexicalized" the intended meaning may be that the expression is listed with a specialized meaning. In DM such an expression is an idiom and requires an Encyclopedia Entry (see 1.4). There is no "word-sized" unit which has a special status with respect to the idiomatization process; morphemes smaller than word-size may have particular interpretations in particular environments, while expressions consisting of many words which obviously have a complex internal syntax may equally be idiomatized.

Lexical(ized) = Not constructed by Syntax. The internal structure of expressions is demonstrably not always a product of syntactic operations. In DM structure is produced both in syntax and after syntax in the Morphology component (labelled "Morphological Operations"; section 3). Nevertheless, because of "Syntactic Hierarchical Structure all the Way Down", operations within Morphology still manipulate what are essentially syntactic structural relations. The syntactic component produces a representation whose terminal elements are morphosyntactic features, which is then subject to operations such as "Merger Under Adjacency", "Fission" or "Fusion", accounting for non-isomorphic mappings from syntactic terminals to morphophonological constituents.

Lexical(ized) = Not subject to exceptionless phonological processes, i.e. part of "lexical" phonology in the theory of Lexical Phonology and Morphology (Kiparsky 1982 et seq.). In DM the distinction between two types of phonology, "lexical" and "postlexical", is abandoned. All phonology occurs in a single post-syntactic module. While Lexical Phonology and Morphology produced many important insights, DM denies that these results require an architecture of grammar which divides phonology into a pre-syntactic and post-syntactic module (see also Sproat 1985). Rather, post-syntactic Phonology itself may have a complex internal structure (Halle & Vergnaud 1987).
1.2. The status of Vocabulary Items and the lexical/functional distinction

In DM, the term morpheme properly refers to a syntactic (or morphological) terminal node and its content, not to the phonological expression of that terminal, which is provided as part of a Vocabulary Item. Morphemes are thus the atoms of morphosyntactic representation. The content of a morpheme active in syntax consists of syntactico-semantic features drawn from the set made available by Universal Grammar. A Vocabulary Item is, properly speaking, a relation between a phonological string or "piece" and information about where that piece may be inserted. Vocabulary Items provide the set of phonological signals available in a language for the expression of abstract morphemes. The set of all Vocabulary Items is called the Vocabulary.

(2) Vocabulary Item schema
    signal ↔ context of insertion

    Example Vocabulary Items
    a. /i/ ↔ [___, +plural]
       A Russian affix (Halle 1997)
    b. /n/ ↔ [___, +participant +speaker, plural]
       A clitic in Barceloni Catalan (Harris 1997a)
    c. /y-/ ↔ elsewhere
       An affix in the Ugaritic prefix conjugation (Noyer 1997)
    d. ∅ ↔ 2 plu
       A subpart of a clitic in Iberian Spanish (Harris 1994)
    e. /kæt/ ↔ [DP D [LP ___ ]]
       Root inserted in a nominal environment (Harley & Noyer 1998a)
Note that the phonological content of a Vocabulary Item may be any phonological string, including zero or ∅. The featural content or context of insertion may be similarly devoid of information: in such cases we speak of the default or "elsewhere" Vocabulary Item. Note that the two do not necessarily coincide: a null phonological affix in a given paradigm is not necessarily the default Vocabulary Item. For example, the zero plural affix inserted in the context of marked English nouns like sheep is not the English default plural.

In early work in DM, Halle (1992) proposed a distinction between concrete morphemes, whose phonological expression was fixed, and abstract morphemes, whose phonological expression was delayed until after syntax. More current work in DM, however, endorses Late Insertion of all phonological expression, so Halle's earlier concrete vs. abstract distinction has been largely abandoned. Harley & Noyer (1998a) propose an alternative to the concrete vs. abstract distinction; they suggest that morphemes are of two basic kinds, f-morphemes and l-morphemes, corresponding approximately to the conventional division between functional and lexical categories, or closed-class and open-class categories.
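
The schema in (2) and the point that a null signal need not be the elsewhere item can be made concrete with a small sketch. This is only an illustration, not part of the DM proposal: the class name `VocabularyItem`, the encoding of zero as an empty string, and the simplified feature labels are all assumptions introduced here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VocabularyItem:
    """A signal paired with its context of insertion, as in schema (2)."""
    signal: str                       # phonological string; "" encodes zero
    context: frozenset = frozenset()  # features required of the morpheme

    def is_null(self) -> bool:
        # True when the item has no phonological content.
        return self.signal == ""

    def is_elsewhere(self) -> bool:
        # True when the item places no requirement on the morpheme.
        return len(self.context) == 0


# Items (2c) and (2d), with feature names simplified for illustration.
ugaritic_y = VocabularyItem("y-")                          # /y-/ <-> elsewhere
spanish_zero = VocabularyItem("", frozenset({"2", "pl"}))  # zero <-> 2 plu

for item in (ugaritic_y, spanish_zero):
    print(repr(item.signal), item.is_null(), item.is_elsewhere())
```

In this encoding the Ugaritic item is an elsewhere item with overt phonology, while the Iberian Spanish item is phonologically null but featurally specific, so the two properties vary independently.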
F-morphemes are defined as morphemes for which there is no choice as to Vocabulary insertion: the Spell-Out of an f-morpheme is deterministic. In other words, f-morphemes are those whose content (as defined by syntactic and semantic features made available by Universal Grammar) suffices to determine a unique phonological expression. One prediction is that Vocabulary Items conventionally classified as "closed-class" should either express purely grammatical properties or else have meanings determined solely by universal cognitive categories (see 2.3 for further discussion). In contrast, an l-morpheme is defined as one for which there is a choice in Spell-Out: an l-morpheme is filled by a Vocabulary Item which may denote a language-specific concept. For example, in an l-morpheme whose syntactic position would traditionally define it as a "noun", any of the Vocabulary Items dog, cat, fish, mouse, table etc. might be inserted. Note that because the conventional categorial labels "noun", "verb", "adjective" etc. are by hypothesis not present in syntax (l-morphemes being acategorial), the widely adopted hypothesis that Prosodic Domain construction should be oblivious to such distinctions (Selkirk 1986; Chen 1987) follows automatically.

1.3. The syntactic determination of lexical categories
The conjecture we have just alluded to, which we will term the L-Morpheme Hypothesis (Marantz 1997a; Embick 1997, 1998a, 1998b; Harley 1995; Harley & Noyer 1998a, 1998b; Alexiadou 1998), contends that the traditional terms for sentence elements, such as noun, verb, and adjective, have no universal significance and are essentially derivative from more basic morpheme types (see also Sapir 1921, ch. 5). As noted above, Marantz (1997a) contends that the configurational definition of category labels is already implicit in Chomsky (1970). Specifically, the different "parts of speech" can be defined as a single l-morpheme, or Root (to adopt the terminology of Pesetsky 1995), in certain local relations with category-defining f-morphemes. For example, a "noun" or a "nominalization" is a Root whose nearest c-commanding f-morpheme (or licenser) is a Determiner; a "verb" is a Root whose nearest c-commanding f-morphemes are v, Aspect and Tense; without Tense such a Root is simply a "participle" (Embick 1997; Harley and Noyer 1998b). Thus, the same Vocabulary Item may appear in different morphological categories depending on the syntactic context that the item's l-morpheme (or Root) appears in. For example, the Vocabulary Item destroy is realized as a "noun" destruct-(ion) when its nearest licenser is a Determiner, but the same Vocabulary Item is realized as a "participle" destroy-(ing) when its nearest licensers are Aspect and v; if Tense appears immediately above Aspect, then the "participle" becomes a "verb" such as destroy-(s). However, it is probably the case that many traditional part-of-speech labels correspond to language-specific features present after syntax which condition various morphological operations such as Impoverishment (see 3.2) and Vocabulary Insertion.

1.4. Idioms: the content of the Encyclopedia
In DM, the Vocabulary is one list which contains some of the information which in lexicalist theories is associated with the Lexicon. Another such list is the Encyclopedia, which relates Vocabulary Items (sometimes in the context of other Vocabulary Items) to meanings. In other words, the Encyclopedia is the list of idioms in a language. The term idiom is used to refer to any expression (even a single word or subpart of a word) whose meaning is not wholly predictable from its morphosyntactic structural description (Marantz 1995, 1997a). F-morphemes are typically not idioms, but l-morphemes are always idioms.

(3) Some idioms
    cat                     (a fuzzy animal)
    (the) veil              (vows of a nun)
    (rain) cats and dogs    (a lot)
    (talk) turkey           (honest discourse)
The notion of "idiom" in DM, obviously, embraces more than the conventional use of the term implies. Idioms in the conventional sense (that is, groups of words in a particular syntactic arrangement that receive a "special" interpretation, for example kick the bucket, whose meaning is roughly "die") are represented in DM as subparts of the Encyclopedia entry for the Root (or Roots) which are involved. The Encyclopedia entry for "kick", for example, will specify that in the environment of the direct object the bucket, kick may be interpreted as "die". The study of conventional idioms has been an important source of evidence for locality restrictions on interpretation in DM; in particular, following the observations of Marantz (1984), the fact that external arguments are never included as part of the contextual conditioning of Roots in conventional idioms has led to the proposal whereby external arguments are projected by a separate "little-v" head, not by any Root, and they thus are not mentioned by Encyclopedia entries for Roots as a possible interpretive conditioner. (For an alternative, non-DM discussion of idioms, see Jackendoff 1997.) As indicated in the schema in (1) above, the "meaning" of an expression is interpreted from the entire derivation of that expression, including the information from the Encyclopedia which is considered extralinguistic. LF does not express or represent meaning; LF is merely a level of representation which exhibits certain meaning-related structural relations, such as quantifier scope. (The relationship of the Encyclopedia to the Vocabulary is the topic of much current debate; see, for example, Marantz 1997a; Harley and Noyer 1998a.)
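
The way an Encyclopedia entry pairs a Root with an interpretation, sometimes only in a given environment, can be sketched as a simple lookup. The sketch below is an illustration under strong simplifying assumptions: the names `ENCYCLOPEDIA` and `interpret` are hypothetical, environments are encoded as plain strings rather than syntactic structures, and only the interpretations given in (3) and in the kick the bucket example are listed.

```python
# Each Root maps to (environment, interpretation) pairs.
# An empty environment means the interpretation needs no special context.
ENCYCLOPEDIA = {
    "cat": [("", "a fuzzy animal"), ("rain ___ and dogs", "a lot")],
    "veil": [("the ___", "vows of a nun")],
    "turkey": [("talk ___", "honest discourse")],
    "kick": [("___ the bucket", "die")],
}


def interpret(root, environment=""):
    """Return the listed interpretation of a Root in an environment.

    Context-specific entries are preferred; a context-free entry is the
    fallback. None is returned when the sketch lists nothing suitable.
    """
    entries = ENCYCLOPEDIA.get(root, [])
    for required, meaning in entries:
        if required and required == environment:
            return meaning
    for required, meaning in entries:
        if not required:
            return meaning
    return None


print(interpret("kick", "___ the bucket"))
print(interpret("cat"))
print(interpret("cat", "rain ___ and dogs"))
```

Note that the external argument never figures in the environment strings, in keeping with the locality observation attributed to Marantz (1984) above.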
2. Spell-Out
Spell-Out inserts Vocabulary Items (phonological pieces) into morphemes. In the unmarked case, the relation between Vocabulary Items and morphemes is one-to-one, but as we have seen, several factors may disrupt this relation (Noyer 1997), including Fission of morphemes, removal of morphosyntactic features by Impoverishment, local displacements of Vocabulary Items by Morphological Merger and post-syntactic insertion of Dissociated morphemes. Spell-Out works differently depending on what type of morpheme is being spelled out, f-morphemes or l-morphemes. Regardless of the type of morpheme, however, Spell-Out is normally taken to involve the association of phonological pieces (Vocabulary Items) with abstract morphemes. Halle (1992) construes Spell-Out as the rewriting of a place-holder "Q" in a morpheme as phonological material. This operation is normally understood as cyclic, such that more deeply embedded morphemes are spelled out first.

2.1. Spell-Out of f-morphemes: the Subset Principle
Early work in DM was focused primarily on the Spell-Out of f-morphemes. In such cases sets of Vocabulary Items compete for insertion, subject to what Halle (1997) called the Subset Principle. (Lumsden 1987: 107 proposes a similar principle and calls it "Blocking". Halle's principle is not to be confused with the Subset Principle of Manzini and Wexler 1987, which deals with learnability issues.)

Subset Principle. "The phonological exponent of a Vocabulary Item is inserted into a morpheme... if the item matches all or a subset of the grammatical features specified in the terminal morpheme. Insertion does not take place if the Vocabulary Item contains features not present in the morpheme. Where several Vocabulary Items meet the conditions for insertion, the item matching the greatest number of features specified in the terminal morpheme must be chosen."
Below, we give an example from Sauerland (1995).

(4) a. Dutch strong adjectival desinences
                 [-neuter]    [+neuter]
       [-pl]     -e           ∅
       [+pl]     -e           -e

    b. Vocabulary Items
       ∅  ↔ [___, +neuter -plural] / Adj + ___
       -e ↔ Adj + ___
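
The competition among the Vocabulary Items in (4b) under the Subset Principle can be sketched as a small selection procedure. This is a minimal illustration, not an implementation from the literature: the function name `spell_out`, the encoding of features as strings, and the use of an empty string for the zero exponent are assumptions made here, and the contextual restriction "Adj + ___" is left implicit.

```python
def spell_out(morpheme, vocabulary):
    """Pick a signal for a morpheme following the Subset Principle.

    morpheme:   set of features on the terminal node
    vocabulary: list of (signal, features) pairs
    """
    # An item is eligible only if all of its features are on the morpheme.
    candidates = [(signal, feats) for signal, feats in vocabulary
                  if feats <= morpheme]
    if not candidates:
        raise LookupError("no Vocabulary Item matches")
    # Among eligible items, the one matching the most features wins.
    best = max(len(feats) for _, feats in candidates)
    winners = [signal for signal, feats in candidates if len(feats) == best]
    if len(winners) > 1:
        raise LookupError("the Subset Principle does not pick a winner")
    return winners[0]


# The two items in (4b); "" stands for the zero exponent.
DUTCH_DESINENCES = [
    ("", frozenset({"+neuter", "-pl"})),
    ("-e", frozenset()),  # elsewhere item
]

for gender in ("-neuter", "+neuter"):
    for number in ("-pl", "+pl"):
        signal = spell_out({gender, number}, DUTCH_DESINENCES)
        print(gender, number, repr(signal))
```

Only the neuter singular cell receives the zero exponent; the elsewhere item -e fills the other three cells, reproducing the paradigm in (4a).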
In Dutch, after syntax, a Dissociated morpheme (see 3.0 below) is inserted as a right-adjunct of those morphemes which are conventionally labeled "adjectives". The Vocabulary Items above compete for insertion into this morpheme. In the specific environment of the neuter singular, ∅ is inserted. In the remaining or elsewhere environment -e is inserted. The insertion of ∅ in the specific environment bleeds the insertion of -e because, under normal circumstances, only a single Vocabulary Item may be inserted into a morpheme. Note that the Vocabulary Items above are not specially stipulated to be disjunctive except insofar as they compete for insertion at the same morpheme.

Note that all Vocabulary Items may compete for insertion at any node; there is no pre-insertion separation of Vocabulary Items into "related" forms which may compete. However, since the insertion process is restricted by feature content, a certain collection of Vocabulary Items corresponding to the traditional notion of a "paradigm" may be the set under discussion when accounting for the phonological realization of a given terminal node. In some theories certain such collections have a privileged status or can be referred to by statements of the grammar (Carstairs 1987; Wunderlich 1996). But in DM, paradigms, like collections of related phrases or sentences, do not have any status as theoretical objects, although certain regularities obtaining over paradigms may result from constraints operating during language acquisition.

2.2. Feature Hierarchies, Feature Geometries and the Subset Principle

In some cases it would be possible to insert two (or more) Vocabulary Items into the same f-morpheme, and the Subset Principle does not determine the winner. Two approaches have been proposed in DM for such cases. Halle & Marantz (1993) suggest that such conflicts are resolved by extrinsic ordering: one Vocabulary Item is simply stipulated as the winner.
Alternatively, Noyer (1997) proposes that such conflicts can always be resolved by appeal to a Universal Hierarchy of Features (cf. also Lumsden 1987, 1992; Zwicky 1977; Silverstein 1976). Specifically, the Vocabulary Item that uniquely has the highest feature in the hierarchy is inserted.

(5) Fragment of the Hierarchy of Features
    1 person > 2 person > dual > plural > other features
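
How the fragment in (5) might break a tie left open by the Subset Principle can be sketched as follows. This is one possible reading, stated as an assumption: each candidate is ranked by the highest feature it mentions, and a winner is returned only if that rank is unique. The names `HIERARCHY`, `top_rank` and `resolve_tie`, and the signals /a/ and /b/, are hypothetical.

```python
# Fragment (5); any feature not listed counts as "other" and ranks lowest.
HIERARCHY = ("1", "2", "dual", "plural")


def top_rank(features):
    """Rank of the highest hierarchy feature in a set (0 is highest)."""
    ranks = [HIERARCHY.index(f) for f in features if f in HIERARCHY]
    return min(ranks, default=len(HIERARCHY))


def resolve_tie(candidates):
    """Return the signal that uniquely has the highest feature, else None.

    candidates: dict mapping signal -> set of features, assumed to be
    the items already tied under the Subset Principle.
    """
    ranked = sorted(candidates.items(), key=lambda kv: top_rank(kv[1]))
    if not ranked:
        return None
    if len(ranked) > 1 and top_rank(ranked[0][1]) == top_rank(ranked[1][1]):
        return None  # no candidate uniquely has the highest feature
    return ranked[0][0]


# Two hypothetical items, each matching one feature of a [2, plural] morpheme.
print(resolve_tie({"/a/": {"2"}, "/b/": {"plural"}}))
```

Here /a/ is selected because 2 person outranks plural in (5). Cases in which both candidates share their highest feature are left unresolved in this sketch.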
Harley (1994), following a proposal of Bonet (1991), argues that the conflict-resolving effects of the Feature Hierarchy can be derived from a geometric representation of morphosyntactic features, according to which the Vocabulary Item which realizes the most complex feature geometry is inserted in such situations. See also section 3.2 on Impoverishment, below.

2.3. Spell-Out of l-morphemes: competition, suppletion and allomorphy
For l-morphemes there is a choice regarding which Vocabulary Item is inserted. For example, a Root morpheme in an appropriately local relation to a Determiner might be filled by cat, dog, house, table or any other Vocabulary Item we would normally call a "noun". Harley & Noyer (1998a) note that it is clear that such Vocabulary Items are not in competition, as are the Vocabulary Items inserted into f-morphemes. Rather, these Vocabulary Items can be freely inserted at Spell-Out, subject to conditions of licensing. Licensers are typically f-morphemes in certain structural relations to the Root where the Vocabulary Item is inserted. And as outlined above, these structural relations typically determine the traditional notion of category. "Nouns" are licensed by an immediately c-commanding Determiner; different verb classes, such as unergatives, unaccusatives, and transitives, are each licensed by different structural configurations and relations to various higher eventuality projections.

Marantz (1997a) discusses the interesting case of l-morphemes which undergo apparent allomorphy in different environments, such as the rise/raise alternation. These pose a problem in that they appear to be in competition for insertion in different environments (that is, raise is inserted in the context of a commanding CAUSE head, while rise, the intransitive and nominal variant, is the elsewhere case). They cannot be separate Vocabulary Items, however, for if they were, raise should be a separate verb with the properties of the destroy class. The absence of nominalizations like *John's raise of the pig for bacon, however, indicates that raise is simply a morphophonological variant of the basic intransitive rise root, which is a member of the grow class. That is, in DM, l-morpheme alternations like rise/raise must not be determined by competition, as may be the case for allomorphy of f-morphemes, but rather must be the product of post-insertion readjustment rules.

DM, then, must recognize two different types of allomorphy: suppletive and morphophonological. Suppletive allomorphy occurs where different Vocabulary Items compete for insertion into an f-morpheme, as outlined in section 2.1 above. To give another example, Dutch nouns have (at least) two plural number suffixes, -en and -s. The conditions for the choice are partly phonological and partly idiosyncratic. Since -en and -s are not plausibly related phonologically, they must constitute two Vocabulary Items in competition.

Morphophonological allomorphy occurs where a single Vocabulary Item has various phonologically similar underlying forms, but where the similarity is not such that Phonology can be directly responsible for the variation. For example, destroy and destruct- represent stem allomorphs of a single Vocabulary Item; the latter allomorph occurs in the nominalization context. DM hypothesizes that in such cases there is a single basic allomorph, and the others are derived from it by a rule of Readjustment. The Readjustment in this case replaces the rime of the final syllable of destroy with -uct. (Alternatively such allomorphs might both be listed in the Vocabulary and be related by "morpholexical relations" in the sense of Lieber 1981.)

Traditionally it is often thought that there is a gradient between suppletion and other types of more phonologically regular allomorphy, and that no reasonable grounds can be given for how to divide the two or if they should be divided at all. Marantz (1997b) has recently proposed that true suppletion occurs only for Vocabulary Items in competition for f-morphemes, since competition occurs only for f-morphemes. An immediate consequence of this proposal is that undeniably suppletive pairs like go/went or bad/worse must actually represent the spelling of f-morphemes.
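
Morphophonological allomorphy, as opposed to competition, can be pictured as a rewriting step that applies to a single Vocabulary Item after it has been inserted. The sketch below is an orthographic approximation only: the table `READJUSTMENT`, the function `readjust`, and the context labels "nominalization" and "CAUSE" are hypothetical, and spelling stands in for phonological structure.

```python
# Readjustment rules keyed by (basic allomorph, conditioning context).
READJUSTMENT = {
    # Replace the final rime of destroy (spelled "oy") with -uct.
    ("destroy", "nominalization"): lambda stem: stem[:-2] + "uct",
    # Derive raise from the basic intransitive root rise.
    ("rise", "CAUSE"): lambda stem: stem.replace("i", "ai"),
}


def readjust(stem, context):
    """Apply a Readjustment rule if one is listed, else keep the stem."""
    rule = READJUSTMENT.get((stem, context))
    return rule(stem) if rule else stem


print(readjust("destroy", "nominalization"))
print(readjust("rise", "CAUSE"))
print(readjust("rise", "nominalization"))
```

The point of the design is that only one item per Root is listed: destruct- and raise are never candidates for insertion, which is what separates this kind of alternation from suppletive pairs like go/went.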
The class of f-morphemes is as a result considerably enriched, but since the class of f-morphemes is circumscribed by Universal Grammar, it is also predicted that true suppletion should be limited to universal syntactico-semantic categories. Moreover, given that some independent grounds might in this way divide suppletive from Readjustment-driven allomorphy, a theory of the range of possible Readjustment processes becomes more feasible. The controversial distinction between derivational and inflectional (Anderson 1982) has no explicit status in DM. However, the distinction between f-morphemes and 1-morphemes perhaps captures some of the intuition behind the derivational/inflectional distinction, although certainly not all f-morphemes would normally be considered "inflectional". DM also distinguishes between syntactic and non-sjmtactic (dissociated) mor-
Distributed Morphology
475
phemes, although again this distinction has no straightforward analogue in the derivational/inflectional debate. 3. Manipulating structured expressions: morphological operations In DM any given expression acquires at least two structural descriptions during its derivation. In a morphophonological description, an expression's phonological pieces (its Vocabulary Items) and their constituent structure are displayed. In a morphosyntactic description, an expression's morphemes and their constituent structure are displayed. (6)
The expression cows:
Morphosyntactic description: [Root [+plural]]
Morphophonological description: [kaw + z]
The morphosyntactic structure of an expression is generated by several mechanisms. Syntax, using conventional operations such as head-movement, plays a major role in constructing morphosyntactic structures, including "word"-internal structure. But DM also employs several additional mechanisms in a post-syntactic component, Morphological Structure. First, morphemes such as [passive] or [case] (in some instances, see Marantz 1991) which, by hypothesis, do not figure in syntax proper, can be inserted after syntax but before Spell-Out. These morphemes, which only indirectly reflect syntactic structures, are called Dissociated morphemes. For a full exposition of the mechanism of Dissociated morpheme insertion, see Embick (1997). Second, the constituent structure of morphemes can be modified by Morphological Merger, which can effect relatively local morpheme displacements.
3.1. Merger
Morphological Merger, proposed first in Marantz (1984), was originally a principle of well-formedness between levels of representation in syntax. In Marantz (1988: 261) Merger was generalized as follows:
Morphological Merger. At any level of syntactic analysis (d-structure, s-structure, phonological structure), a relation between X and Y may be replaced by (expressed by) the affixation of the lexical head of X to the lexical head of Y.
Heidi Harley and Rolf Noyer
What Merger does is essentially "trade" or "exchange" a structural relation between two elements at one level of representation for a different structural relation at a subsequent level. (Rebracketing under adjacency is also proposed and discussed at length in Sproat 1985.) Merger has different consequences depending upon the level of representation it occurs at. Where Merger applies in syntax proper it is essentially Head Movement, adjoining a zero-level projection to a governing zero-level projection (Baker 1988). Cases of Syntactic Lowering may be a type of Merger as well, presumably occurring after syntax proper but before Vocabulary Insertion; e.g. the Tense to verb affixation in English (see Bobaljik 1994) or perhaps C-to-I lowering in Irish (McCloskey 1996). The canonical use of Merger in Morphology is to express second-position effects. Embick & Noyer (1999, to appear) hypothesize that where Merger involves particular Vocabulary Items (as opposed to morphemes), the items in question must be string-adjacent; such cases of Merger are called Local Dislocation. Schematically Local Dislocation looks like this: (7)
Local Dislocation: X [Y ...] → [Y+X ...]
In Local Dislocation, a zero-level element trades its relation of adjacency to a following constituent with a relation of affixation to the linear head (peripheral zero-element) of that constituent. (Local Dislocation has also received considerable attention outside of DM from researchers working in Autolexical Syntax; Sadock 1991.) For example, Latin -que is a second-position clitic which adjoins to the left of the zero-level element to its right (* represents the relation of string adjacency; Q represents dissociated morphemes): (8)
Latin -que placement
Morphological structure: [[A-Q] [N-Q]] [cl [[A-Q] [N-Q]]]
Vocabulary insertion: [[bon i] [puer i]] [-que [[bon ae] [puell ae]]]
Linearization: [[bon*i] * [puer*i]] * [-que * [[bon*ae] * [puell*ae]]]
Local dislocation: [[bon*i] * [puer*i]] * [[[bon*ae]*que] * [puell*ae]]
good-nom.pl boy-nom.pl good-nom.pl-and girl-nom.pl
'Good boys and good girls'
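The Local Dislocation step in (8) can be made concrete with a small sketch. It assumes, purely for illustration, that the linearized string is a flat list of words and that each word is a list of Vocabulary Items; the function name `local_dislocation` and this list encoding are hypothetical conveniences, not part of the DM formalism.

```python
from typing import List

Word = List[str]  # a zero-level element, encoded as a list of Vocabulary Items


def local_dislocation(string: List[Word], clitic: str) -> List[Word]:
    """Apply X [Y ...] -> [Y+X ...] to a linearized string.

    The clitic trades its adjacency to the following constituent for
    affixation to the first (peripheral) word of that constituent.
    """
    result: List[Word] = []
    i = 0
    while i < len(string):
        word = string[i]
        # A free-standing clitic with a string-adjacent word to its right.
        if word == [clitic] and i + 1 < len(string):
            host = string[i + 1]
            result.append(host + [clitic])  # affix the clitic to its host
            i += 2
        else:
            result.append(word)
            i += 1
    return result


# Linearization line of (8), with constituent brackets flattened.
linearized = [["bon", "i"], ["puer", "i"], ["-que"], ["bon", "ae"], ["puell", "ae"]]
dislocated = local_dislocation(linearized, "-que")
surface = " ".join("".join(piece.strip("-") for piece in w) for w in dislocated)
print(surface)  # boni pueri bonaeque puellae
```

The sketch only inspects the immediately following word, which mirrors the string-adjacency requirement hypothesized for Local Dislocation; a clitic with nothing to its right is left in place.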
By hypothesis, Prosodic Inversion (Halpern 1995) is a distinct species of Merger at the level of PF, and differs from Local Dislocation in that the affected elements are prosodic categories rather than morphological ones. For example, Schütze (1994), expanding on Zec & Inkelas (1990), argues that the auxiliary clitic je in Serbo-Croatian is syntactically in C, but inverts with the following Phonological Word by Prosodic Inversion at PF (parentheses below denote Phonological Word boundaries):
(9) Serbo-Croatian second-position clitics
Morphological structure after Spell-Out: [je [VP [PP U ovoj sobi] klavir]]
Parse into Phonological Words: je (U ovoj) (sobi) (klavir)
Prosodic Inversion: ((U ovoj)+je) (sobi) (klavir)
In this AUX room piano
'In this room is the piano'
The positioning of the clitic cannot be stated in terms of a (morpho)syntactic constituent, since U ovoj 'in this' does not form such a constituent. Embick & Izvorski (1995) specifically argue that syntactic explanations, including those involving remnant extraposition, cannot reasonably be held accountable for this pattern. However, it should be emphasized that the extent to which Local Dislocation and Prosodic Inversion are distinct devices in the mapping to PF remains controversial, with many researchers seeking to reduce the two to a purely prosodic or a purely syntactic mechanism.
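The Prosodic Inversion step in (9) differs from Local Dislocation in its domain: it operates on Phonological Words rather than on morphological constituents. A minimal sketch, assuming the parse into Phonological Words is already given as a list of lists; the name `prosodic_inversion` and the encoding are hypothetical, chosen only for illustration.

```python
from typing import List

PWord = List[str]  # a Phonological Word, encoded as a list of its pieces


def prosodic_inversion(clitic: str, pwords: List[PWord]) -> List[PWord]:
    """Invert an initial clitic with the first Phonological Word.

    The clitic is adjoined to the right edge of the first Phonological
    Word, regardless of whether that word is a syntactic constituent.
    """
    if not pwords:
        return [[clitic]]  # nothing to invert with
    first, rest = pwords[0], pwords[1:]
    return [first + [clitic]] + rest


# Parse of (9): je (U ovoj) (sobi) (klavir)
parsed = [["U", "ovoj"], ["sobi"], ["klavir"]]
inverted = prosodic_inversion("je", parsed)
print(" ".join(" ".join(w) for w in inverted))  # U ovoj je sobi klavir
```

Because the host is picked out by the prosodic parse alone, the clitic lands after U ovoj even though that sequence is not a (morpho)syntactic constituent.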
3.2. Impoverishment
Impoverishment, first proposed in Bonet (1991), is an operation on the contents of morphemes prior to Spell-Out. In early work in DM, Impoverishment simply involved the deletion of morphosyntactic features from morphemes in certain contexts. When certain features are deleted, the insertion of Vocabulary Items requiring those features for insertion cannot occur, and a less specified item will be inserted instead. Halle & Marantz (1994) termed this the "Retreat to the General Case".
(10)
Adjectival suffixes in Norwegian (Sauerland 1995)
STRONG   [-neuter]   [+neuter]
[-pl]    Ø           -t
[+pl]    -e          -e
WEAK     [-neuter]   [+neuter]
[-pl]    -e          -e
[+pl]    -e          -e
In Norwegian, there is a three-way distinction (t ~ e ~ Ø) in adjectival suffixes in a "strong" syntactic position, but in the "weak" position one finds only -e. By hypothesis, it is not accidental that the affix -e is the Elsewhere affix in the strong context, and also appears everywhere in the weak context. Sauerland's (1995) Impoverishment analysis of the weak paradigms captures this insight. He proposes the following set of Vocabulary Items: (11)
Norwegian Vocabulary Items
/t/ ↔ [ ___ , -pl +neut] / Adj + ___
Ø ↔ [ ___ , -pl -neut] / Adj + ___
/e/ ↔ elsewhere / Adj + ___
In the weak syntactic position, a rule of Impoverishment applies, deleting any values for gender features: (12)
Impoverishment: [±neuter] → Ø
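The interaction between the Vocabulary Items in (11) and the Impoverishment rule in (12) can be sketched as follows. This is an illustrative encoding only: features are stored in a dictionary, and the order of the list stands in for the ranking of more specified items over less specified ones. The names `VOCABULARY`, `impoverish`, `insert` and `spell_out` are hypothetical.

```python
from typing import Dict, List, Tuple

Features = Dict[str, str]  # e.g. {"pl": "-", "neuter": "+"}

# Vocabulary Items from (11), listed from most to least specified.
# Each item pairs an exponent with the feature values it requires.
VOCABULARY: List[Tuple[str, Features]] = [
    ("-t", {"pl": "-", "neuter": "+"}),
    ("Ø", {"pl": "-", "neuter": "-"}),
    ("-e", {}),  # elsewhere item: requires no features
]


def impoverish(morpheme: Features, weak: bool) -> Features:
    """Rule (12): in the weak position, delete any value for [±neuter]."""
    if weak:
        return {f: v for f, v in morpheme.items() if f != "neuter"}
    return dict(morpheme)


def insert(morpheme: Features) -> str:
    """Insert the first item all of whose required features are present."""
    for exponent, required in VOCABULARY:
        if all(morpheme.get(f) == v for f, v in required.items()):
            return exponent
    raise ValueError("no Vocabulary Item matches")


def spell_out(pl: str, neuter: str, weak: bool) -> str:
    """Impoverish first, then perform Vocabulary Insertion."""
    return insert(impoverish({"pl": pl, "neuter": neuter}, weak))


for weak in (False, True):
    for pl in ("-", "+"):
        for neuter in ("-", "+"):
            label = "weak" if weak else "strong"
            print(label, f"[{pl}pl]", f"[{neuter}neuter]", spell_out(pl, neuter, weak))
```

Run over all eight cells, the sketch reproduces the paradigm in (10): once [±neuter] is deleted, neither of the two specified items can match, so the elsewhere item is the only candidate left.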
Impoverishment thus guarantees that neither the Vocabulary Items -t nor Ø can be inserted, since both require explicit reference to a value for [±neuter]. Insertion of the general case, namely -e, follows automatically. As we have noted above, in Bonet's original proposal (1991) and in several subsequent works (Harley 1994; Harris 1997a; Ritter & Harley 1998), morphosyntactic features are arranged in a feature geometry much like phonological features, and Impoverishment is represented as delinking. Consequently, the delinking of certain features entails the delinking of features dependent on them. For example, if person features dominate number features which in turn dominate gender features, then the Impoverishment (delinking) of number entails the delinking of gender as well: (13)
Impoverishment as Delinking
[Diagram: a feature geometry with [pl] dependent on [2], beside [2] alone after delinking]
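The entailment described for delinking, where removing number also removes gender, can be sketched with a toy geometry. The dominance relations below follow the hypothetical case given in the text (person dominates number, which dominates gender); the names `PARENT` and `delink` and the dictionary encoding are assumptions made for illustration.

```python
from typing import Dict, Optional

# Hypothetical geometry: each node maps to the node that dominates it.
PARENT: Dict[str, Optional[str]] = {
    "person": None,
    "number": "person",
    "gender": "number",
}


def delink(morpheme: Dict[str, str], node: str) -> Dict[str, str]:
    """Remove `node` together with every node it dominates."""

    def is_removed(n: Optional[str]) -> bool:
        # Walk upward; a node is removed if `node` is it or an ancestor.
        while n is not None:
            if n == node:
                return True
            n = PARENT[n]
        return False

    return {n: v for n, v in morpheme.items() if not is_removed(n)}


second_pl_fem = {"person": "2", "number": "pl", "gender": "f"}
print(delink(second_pl_fem, "number"))  # {'person': '2'}
```

Delinking gender alone would leave person and number intact, which is the asymmetry a geometric representation is meant to capture.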
Noyer (1997) rejects the use of geometries of this sort as too restrictive, and proposes instead that Impoverishments are better understood as feature-cooccurrence restrictions or filters of the type employed by Calabrese (1995) for phonological segment inventories. For example, the absence of a first person dual in Arabic is represented as the filter * [1 dual], and a Universal Hierarchy of Features dictates that where these features combine, because [dual] is a number feature and [1] is a (hierarchically higher) person feature, [dual] is deleted automatically. Calabrese (1994) and (1996) further expand this idea. The use of feature geometries in DM remains an unresolved issue at this time, but Feature Hierarchies, whether geometric or not, ensure that normally more marked feature values persist in contexts of neutralization. Feature-changing Impoverishment, which as a device has approximately the same power as Rules of Referral (Zwicky 1985b; Stump 1993), has in general been eschewed in DM. However, Noyer (1998a) discusses cases where feature-changing readjustments seem necessary. It is proposed that such cases always involve a change from the more marked value of a feature to the less marked value and never vice versa.
3.3. Fission and Feature Discharge
Fission was originally proposed in Noyer (1997) to account for situations in which a single morpheme may correspond to more than one Vocabulary Item. In the normal situation, only one Vocabulary Item may be inserted into any given morpheme. But where Fission occurs, Vocabulary Insertion does not stop after a single Vocabulary Item is inserted. Rather, Vocabulary Items accrete on the sister of the fissioned morpheme until all Vocabulary Items which can be inserted have been, or all features of the morpheme have been discharged. A feature is said to be discharged when the insertion of a Vocabulary Item is conditioned by the presence of that feature. However, Noyer (1997) argues that features conditioning the insertion of a Vocabulary Item come in two types. A Vocabulary Item primarily expresses certain features in its entry, but it may be said to secondarily express certain other features. This distinction corresponds (approximately) to the distinction between primary and secondary exponence (Carstairs 1987). Only features which are primarily expressed by a Vocabulary Item are discharged by the insertion of that Item. For example, in the prefix-conjugation of Tamazight Berber, the AGR morpheme can appear as one, two or three separate Vocabulary Items, and these may appear as prefixes or as suffixes: (14)
a. Tamazight Berber Prefix Conjugation, dawa 'cure'
        singular
3m      i-dawa
3f      t-dawa
2m      t-dawa-d
2f      t-dawa-d
1       dawa-y
b.
Vocabulary Items In-I