191 8 9MB
English Pages 256 [268] Year 1981
Locality Principles in Syntax
Studies in Generative Grammar The goal of Studies
in Generative
Grammar
is to publish those texts that are
representative of recent advances in the theory of formal grammar. T o o many studies do not reach the public they deserve because of the depth and detail that make them unsuitable for publications in article form. We hope that the present series will make these studies available to a wider audience than has been hitherto possible.
Jan Henk
Köster van
Riemsdijk
Jan Koster
jocality D rincipies in Syntax
1981 FORIS PUBLICATIONS Dordrecht - Holland/Cinnaminson - U.S.A.
Second printing 1981. ISBN 90 70176 06 8 © 1978 by J a n Koster, Amsterdam. No part of this book may be translated or reproduced in any form, by print, photoprint, or any other means, without written permission from the publisher. Printed in the Netherlands by Intercontinental Graphics, H.I. Ambacht.
for my parents for Charlotte
ACKNOWLEDGEMENTS The prehistory of my interest in generative grammar was dominated by Albert Kraak's inspiring syntax lectures and Frits Staal's articles on philosophy. The point of no return was finally reached, however, when I enrolled in a syntax class given by Wim Klooster. This was my first opportunity to combine vague philosophical interests with very concrete language puzzles, and the many enthusiastic discussions I had then together with Alied Blom, Saskia Daalder, Frederiek van der Leek, and Peter Nieuwenhuijsen are largely responsible for my complete addiction to the field. It has been a pleasure to work on this dissertation under Wim's liberal supervision, and his accurate and detailed comments have contributed much to the final result. My grant supervisor, Henk Verkuyl, pushed me to get everything ready in time without ever losing his patience and stimulating enthusiasm. Many improvements on the earlier draft of this thesis are due to his careful reading and comments. More than to anyone else, I am indebted to Henk van Riemsdijk. Our friendship and almost daily discussions have greatly influenced my ideas about linguistics as well as my professional attitudes. This thesis is in part the outcome of our discussions and my reactions to the exciting "Dutch P-soup". We all owe it to Henk1s hospitality and initiative that "each linguist knows the others" and that we have GLOW. This international conspiracy has united Hans den Besten, Ivonne Bordelois, Riny Huybregts, Lyle Jenkins, Richie Kayne, Jean-Yves Pollock, Luigi Rizzi, Alain Rouveret, Jean-Roger Vergnaud, Frans Zwarts, and many other stimulating friends and colleagues into a group that I always felt was reading over my shoulder while I was writing this thesis. I am sure it helped! I would like to express my gratitude to Simon Dik, the clearest and most efficient teacher from my student days, for his admirable generosity in having created a favorable climate for forms of linguistics other than his own. Particularly during the last few years, I have felt completely at home at the Amsterdam Linguistics department, thanks to the presence of Hans Bennis,
Hans den Besten, Reineke Bok, Anneke Groos, Pieter Muysken, Henk van Riemsdijk, Sie Ing Djiang, Norval Smith, and Irene Vogel. My special thanks go to Connie Menting for her beautiful preparation of the final text in such an incredibly short time. I would also like to thank my colleagues at the Institute for Translation Studies for the unique freedom they have permitted me, while working on my dissertation. Of our regular guests, I have profited from the fruitful influence of Joe Gmonds and Edwin Williams, who never came to Amsterdam without new ideas. I have particularly good memories of Mike Brame's vivid and challenging syntax class in Utrecht which definitely contributed to my view of linguistics. This dissertation could not have been written without the financial support of the Netherlands Organization for the Advancement of Pure Research (Z.W.O.) (grant 30-51). Z.W.O. also enabled Hans den Besten and me to go to M.I.T. for a semester (grant R3071). At M.I.T., some of my ideas grew to their present form through discussions with Noam Chomsky, Bob Freidin, Leland George, Norbert Hornstein, Marie-Louise Kean, Craig Thiersch, and many others. Some of the major topics of this dissertation have their origin in discussions with Bob Freidin and hints from Noam Chomsky whose kind interest in my work has meant a great deal to me and whose brilliant ideas have made linguistics the exciting field it is today. I feel deeply indebted to my parents, who have always stimulated my interests and offered their support in more ways than one. Most of all, I would like to thank Charlotte, for her patience, intelligent companionship and love, that are responsible for the fact that I would not be less happy in a world without linguistics.
TABLE OF CONTENTS CHAPTER 1
INTRODUCTION
1
1.1.
Background
1
1.2.
Locality principles
4
1.3.
Some methodological preliminaries
1.4.
Remarks about the base
1.4.1.
Base rules
1.4.2.
Functional notions
8 11 11 18
Footnotes to chapter 1
24
CHAPTER 2
30
2.1.
THE BOUNDING CONDITION
Introduction
30
2.1.1.
A difference between trace and PRO?
32
2.1.2.
Subjacency as a distinctive property
37
2.2.
Conditions and the redundancy problem
2.3.
The evidence for Subjacency
38 42
2.3.1.
The Complex NP Constraint
(CNPC)
43
2.3.2.
The Subject Condition
44
2.3.3.
Extraposition phenomena
48
2.3.4.
Wh-island phenomena
57
2.4.
Preliminary conclusion
57
2.5.
Core grammar and markedness
59
2.5.1.
Introductory remarks
59
2.5.2.
Coindexing: the core rule of syntax
64
2.6.
The Bounding Condition
68
2.6.1.
Introduction
68
2.6.2.
The Complex NP Constraint
70
2.6.3.
The Subject Condition
70
2.6.4.
Phrases as islands
71
2.6.4.1.
NP as a bounding node
2.6.4.2.
AP as a bounding node
82
2.6.4.3*
VP as a bounding node
85
2.6.4.3.1.
Wh-movement
2.6.4.3.2.
Infinitives
2.6.4.4.
PP as a bounding node
2.6.4.4.1.
Preposition stranding in English
2.6.4.4.2.
Preposition stranding in Dutch
2.6.5.
Gapping
72
86 90 93 94 97 104
2.6.6.
Is there a Head Constraint?
2.6.6.1.
The evidence for the Head Constraint
2.6.6.2.
The Bounding Condition and the Head Constraint
2.7.
108 108 . . . 120
Conclusion
122
Footnotes to chapter 2
126
CHAPTER 3
THE LOCALITY
PRINCIPLE
3.1.
Introductory remarks
3.2.
Scope of the Locality Principle
3.2.1.
The local properties of "move NP"
134 134 139 140
3.2.1.1.
Raising to Subject
142
3.2.1.2.
Raising to Object
144
3.2.2.
Clause-internal constraints on movement and construal . 150
3.2.2.1.
Introduction
150
3.2.2.2.
Passives in English and Dutch
154
3.2.2.3. 3.2.3.
Indirect object preposing in Dutch The SSC facts
157 164
3.2.3.1.
Introductory remarks
3.2.3.2.
Secondary predication
165
3.2.3.3.
Irreflexive predicates
166
3.2.4.
Rosenbaum's Minimal Distance Principle
164
171
3.2.4.1.
Introductory remarks
171
3.2.4.2.
Counterexamples to the MDP
173
3.2.4.3.
Opacity
176
3.2.5.
Grinder's Intervention Constraint
182
3.2.6.
The local properties of Wh-movement
185
3.2.7.
The superiority facts
193
3.2.8.
Word order phenomena in Dutch root sentences
199
3.2.8.1.
The root structure of Dutch
199
3.2.8.2.
Sentential adverbs
205
3.2.8.3.
Clitics
209
Cojacency
216
3.2.9. 3.3.
The form of the Locality Principle
227
3.4.
Conclusion
229
Footnotes to chapter 3
230
CHAPTER 4
240
CONCLUSION
BIBLIOGRAPHY
246
- 1 -
CHAPTER 1 INTRODUCTION
1.1.
Background The fundamental problem of linguistics, as formulated by
Chomsky, is to explain how language acquisition is possible on the basis of incidental and fragmentary evidence.1 At present, the problem can best be approached indirectly, by attempting to narrow the "variation space" of human languages, and by limiting the discussion to those parts of language for which promising explanatory theories can be developed. This thesis is meant to be a contribution in the domain of syntax. As a general framework, I presuppose the Extended Standard Theory, particularly those versions that have been developed since Chomsky (1973): Chomsky (1975, 1976, 1977a, 1977b, and 1978), Fiengo (1974, 1977), Chomsky and Lasnik (1977), and many others. Furthermore, I will stress some form of markedness theory, in the sense of the studies just mentioned, Koster (1977b), and Van Riemsdijk (1978). According to this conception, universal grammar provides a highly restricted system of "core grammar" which represents the unmarked form of syntax. Core grammar, which will be discussed more elaborately in chapter 2, includes at least 1. and 2. of the general model adopted here: 2
-
2
-
1. Base rules
(1)
2. Transformational rules
3a. Deletion rules
3b. Construal rules
4a. Filters
4b. Interpretive rules
5a. Phonological rules
5b. Conditions on binding
6a. Stylistic rules
6b.
As for the base rules, I assume some version of X-bar theory (see 1.4.). The transformational rules (2. of (1)) are restricted to very general formulations like: "move NP", "Wh-movement", or even "move a" (where a can be any category) (see Chomsky 1978) . The base rules (including the lexicon) create "deep structures" that are transformed to "surface structures" by "move a". Since movement rules leave a trace, their output is comparable to antecedent-anaphor configurations, and subject to the same conditions as these. In fact, I will argue in chapter 2 that "move a" is superfluous, because the only property that distinguishes movement transformations from coindexing rules (construal) is Subjacency, which is in part reduceable to a bounding condition that also applies to nonmovement rules. Another part of Subjacency can be reduced to a very general principle, which I will call the Locality Principle, and which also applies to rules of construal. Since the classification of rules in (1) is based on the properties of the various rule systems, it appears that the elimination of Subjacency enables us to simplify (1) as follows:
-
3
-
1. Base rules
(2)
2. Coindexing
3a. Deletion rules
3b. Interpretive rules
etc.
etc.
This is not to say that transformations do not exist. Many of the traditional transformations (Minor movements in the sense of Emonds (1976), stylistic rules, various deletions) are presupposed in the list on the left (2a., 3a., etc.). Only "move NP" and "Wh-movement" can be reduced to a form of construal. The syntax of core grammar consists of 1. and 2. of (1) or (2). I suppose that the base provides an index (non-negative integer) for all non-anaphoric categories, and that indices for (null)anaphors can be added by the coindexing rule 2. of (2). Surface structures in this framework differ from surface structures in earlier variants of generative grammar. The main difference is due to the role of unexpanded nodes (cf. trace, PRO) that are treated as bound anaphors by the coindexing rule. These surface structures are richer than the traditional ones in that they contain several such abstract elements, and in that they still contain the material to be deleted by later rules (cf. 3a. in (1)). Unexpanded nodes are possible since base rules (including lexicalization) are optional. If a category a is not expanded, we assume that the following convention applies (cf. Chomsky 1978) : (3)
a
where e is the identity element. This is the only way to create null
-
4
-
anaphors. Since I will propose to base-generate Wh-phrases and other "moved" phrases, e-nodes will not originate as a residue of movement. I do not, in other words, distinguish trace from PRO.^ The other movement rules (on the left side) have no semantic impact, and do not leave trace. The a-rules (left side) map surface structures into phonetic representations, while the b-rules (right side) associate these structures to "logical form" (LF). Universal Grammar defines a narrow set of possible grammars that are further limited by several conditions on rule applicability. Some of these conditions will be the main topic of this thesis. The structures that fall within the scope of core grammar are "optimal" in the sense of the evaluation metric. Possible grammars are ranked in terms of "optimality" or "simplicity" with respect to 4
core grammar. The fundamental empirical problem can be approached in several ways. One can try to limit the class of possible rules at all levels. One can also try to sharpen the evaluation metric, or approach the problem indirectly by extending the conditions on rule applicability. In what follows, it will be argued that these approaches are interrelated. By developing
the theory of markedness,
one can improve the conditions on rule applicability, which in turn enables us to limit the class of possible rules. 1.2.
Locality principles A research program can be characterized by several "thematic"
commitments.^ Besides our interest in a fundamental problem with its philosophical implications, we are committed to a certain ex-
-
5
-
planatory mode. Generative grammar has grown somewhat heterogeneous in this respect. Grammatical phenomena have not only been explained in terms of phrase structure, but also in terms of semantic notions like thematic relations, or in terms of notions borrowed from logic.6 In this thesis, the primary hypothesis will be stated in terms of phrase structure. At present, the purely formal mode seems most fruitful to me, and the most promising starting point for a mathematical
approach to linguistics in the (hopefully near) future.^
Powerful principles (defined in terms of phrase structure) have been the A-over-A principle, Subjacency, and the various notions of the cycle. Within this tradition of theorizing, there seems to me to be a family of principles that have grown into prominence. These principles are properly described as "locality principles", i.e. principles that aim at narrowing the "space" in which linguistic rules apply. For concreteness, I will give the following (incomplete) list: (4)
(i)
Clause mate principles
Q
9
(ii)
The strict cycle
(iii)
Subjacency1^
(iv) (v)
11 The Specified Subject Condition 12 The Minimal Distance Principle
(vi)
The Intervention Constraint1^ 14
(vii)
The Superiority Condition
(viii) Cojacency1^ (ix)
The Head Constraint16
This is only a selection from the literature, and it clearly shows 17 that there is a true proliferation of locality principles. If all
-
6
-
these principles were true, we would have an extremely redundant theory, because these principles largely overlap in their effects. This unsatisfactory picture suggests the main topic for this thesis. The problem that faces us is the following. How can we reduce the number of locality principles and construct a simpler, less redundant theory? It appears that the principles in (4) are in part superfluous and that the rest can be reduced to only two principles. The theory that will emerge is not only simpler, but also more adequate in that it covers facts that are beyond the scope of the principles listed in (4). The two principles that, as I will argue, are sufficient to account for the facts covered by all principles of (4) together, are the Bounding Condition (chapter 2) and the Locality Principle (chapter 3). Omitting some details and refinements, they can be stated as follows: (5)
THE BOUNDING CONDITION Y cannot be free in 3 (= Xn ) in: 18
This is a condition on null anaphora. It does not distinguish between trace and PRO, and it also applies to [ v e] in the case of Gapping and Verb Second in Dutch. The second major principle is a condition on rule application: (6)
THE LOCALITY PRINCIPLE No rule involves
Y in:
' - '"i+l' " --, a i' • ' -, y ' ' ' ' ,ai'- "'"i+X' • ' •
(i>D
-
7
-
The Bounding Condition is in part a generalization of Horn's NP Constraint, and replaces the Head Constraint and much of the con19 tent of Subjacency.
It also accounts for facts that are neither
explained by Subjacency nor by the Head Constraint. The Locality Principle replaces the Specified Subject Condition, the Minimal Distance Principle, the Intervention Constraint, the Superiority Condition, Cojacency, and the rest of Subjacency. It also makes predictions that cannot be derived from the reduced principles. The Locality Principle has been implicit in the literature for many years. Peter Rosenbaum came very close to it when he suggested a generalization of his Minimal Distance Principle (MDP) that was designed for Equi NP deletion: 20 "It is quite likely that the [MDP] as stated earlier is but a special case of a general principle of minimal distance. Such a possibility follows from the observation that if the terms of the principle include not only NP, but N, VP, and V, then one discovers that the principle (in its general sense) offers a natural explanation for the identity requirements of relative clause formation and for similar requirements in many instances of verb and verb phrase deletion." The principle was stated more or less in its present form in Koster (1976). Very similar principles were developed independently by Wilkins (1977a, 1977b) and De Haan (1977). These previous formulations all have the defect of too limited a scope. Rosenbaum1s principle was thought to apply to deletions only, and the other formulations applied only to transformations. The Locality Principle is considerably more general. It not only applies to deletions, but also to transformations, and all interpretive rules that link two terms (like bound anaphora). What holds for the Locality Principle holds even more for the
-
8
-
Bounding Condition: its truth and full generality can only be appreciated against the background of a theory of markedness. This presupposes a less empiricist methodology, and suggests a second major topic for this thesis. 1.3.
Some methodological preliminaries Linguistics today is still followed by the shadow of its 21
"Baconian" past.
According to much current practice, naive ex-
perience (judgments about sentences) is not only taken as a starting point, but as the ultimate test that immediately decides the fate of a theoretical idea.22 This naive falsificationism either leads to premature pessimism, or to a situation where every odd fact is des23
cribed by its own "constraint".
It is also at variance with the
practice of the more advanced natural sciences. The difference is properly characterized as follows: "Except in primitive sciences which eschew, or are little concerned with, the development of comprehensive explanatory theories, one finds little concern with refutation or inductive confirmation of theories in actual scientific practice. Rather the focus is on the use of24 reason, observation, and experiment to develop a promising theory." A science like physics has reached its great intellectual depth by taking seriously ideas that are in conflict with the judgments of naive experience. The main thrust behind this practice is the device of idealization. Idealizations are hooked up with the world of phenomena by the contrivance of auxiliary hypotheses. A classical physical theory like geometrical optics, for instance, is characterized by the "ideal of natural order" that light travels in straight lines. 25 This idea is in conflict with experience at
-
9
-
several points. At the boundary of two substances like air and water, for instance, light beams are refracted. It is typical for the natural sciences that deviances from the ideal of rectilinear propagation of light, like the refractory phenomena, are not taken as a "falsification" of the ideal. Deviations from ideals are taken as research problems, to be solved by auxiliary theories. In the case of geometrical optics, the problem was solved in the seven26 teenth century by Snell's law. In general, primary, idealized theories are used with tenacity, and the auxiliary hypotheses form the major area of experimentation. They form a "protective belt" 27 that shields the main theory from instantaneous falsification.
Putnam (1977, 426) summarizes this
picture as follows: "Thus, theories such as [Newton's theory of universal gravitation] lead to predictions only when augmented by [auxiliary statements] which are not part of the theory. These [auxiliary statements] are far more subject to revision than theories. Indeed, if the theory is accepted as a paradigm and no better alternative theory exists, the theory will be maintained even if there are known phenomena for which no [auxiliary statements] that have been tried lead to successful predictions." The search for suitable auxiliary hypotheses in order to apply idealizations to real-world phenomena is perhaps the most common scientific enterprise. It can be contrasted with the more traditional view of scientific activity by the following two schemata of Putnam"s: 2 8 (7)a.
Schema I
Theory Auxiliary Statements Prediction - True or False?
-
(7)b.
Schema II
10
-
Theory ???????????? Fact to be explained
According to Putnam, standard philosophy of science usually emphasizes Schema I. It certainly represents a common form of science, but Schema II (the search for missing auxiliary statements) is the most "normal" form. It is my opinion that linguistics can take advantage of these insights by constructing theories with the dual structure of primary idealizations and auxiliary hypotheses. The Bounding Condition and the Locality Principle are intended as paradigmatic, primary idealizations. Being idealizations, they are in conflict with many observations. But just as in other sciences it seems pointless to take these conflicts as refutations. Much of this thesis is an attempt to devise auxiliary statements in the sense of Putnam's Schema II. To give an example, the Bounding Condition (5) seems immediately falsified by the familiar phenomenon of extraction from PP in English, in cases like the following: (8)
What^ did they talk [ p p about e ± ] ?
This structure is in conflict with (5), because we have a free e in a PP (= P n ). The acceptability of (8) is therefore a problem. But rather than saying that (8) refutates (5), it is more appropriate to devise an auxiliary theory that reconciles (8) with (5). The answer in this case is a development of the theory of markedness, as will be argued in chapter 2. In general, my two principles cannot be falsified by a list of
-
11
-
anomalous phenomena, but only by better, more comprehensive prin29 • n ciples. 1.4.
Remarks about the base
1.4.1.
Base rules
Although I assume a version of X-bar theory for the base, I do not feel a strong commitment to any particular form of this theory. The "modular approach" that has been so successfully applied to the transformational component, has hardly affected the
b a s e T h e
familiar base rules hardly constitute a scientific theory, they do not explain anything, and largely stipulate what is observed. In spite of this unsatisfying state of the base theory, some phrase structure rules for Dutch will be given here, because our conditions on rules have - strictly speaking - no meaning without them. So, let us assume that Dutch
(and English) sentence structure is introduced
by the following rule:
(9)
E
Xn
VP
This rule introduces the kind of root structures mentioned in Koster (1978a) and discussed in Koster with the maximum number of bars (= PP), and A n
(1978b). X n stands for any category (i.e. V n
(= W ) ,
Nn
(= NP), P n
(= AP)). VP is usually represented as S, as will be
done throughout this thesis. The following V-projection is assumed for D u t c h : 3 1
-
(10)
12
-
V-projection Xn [+wh]
VP
{f,} V f
VP
->-
(Adv*)
V
-
(Adv*)
VP NP
(PP)
V
(VP)
(PP)
(NP) ({||})
(VP)
(iff))
(Prt)
V
(PP)
(VP)
X n stands for any phrase of type X n that bears the feature [+wh]. [+wh] 32 Only Wh-phrases may be inserted in this position. C stands for complementizer. Together, the Wh-position and C form what traditionally has been called COMP. There is good evidence from several languages that COMP consists in fact of two positions, that do not necessarily form a constituent.^ V^ stands for verbum finitum; it replaces the dat (= that = +tense) complementizer in Dutch root 34 sentences according to an idea of Hans den Besten's.
VP is the
traditional S, and V the VP. Adv* indicates an arbitrary number of APs, PPs, NPs and other categories with adverbial function.^ The distribution of adverbials is perhaps the most problematic part of phrase structure, for which no satisfactory solutions exist today. The rest of the rules are of a familiar type, with the exception perhaps of the extra bar on top of 37NP, etc., which purports to create a uniform three-bar level.
The third rule of (10) has the
V in (almost) final position in accordance with Koster (1975) . This third rule contains most deviances from 38 English word order. Here follow a few illustrative realizations: (ll)a.
V
b.
NP
V
dat Peter slaapt
dat Peter het boek leest
that Peter sleeps
that Peter the book reads
-
c.
NP
NP
13
-
V
dat Peter John het boek gaf that Peter John the book gave d.
PP
V
dat Peter naar Amsterdam ging that Peter to e.
NP
PP
Amsterdam went
V
dat Peter John naar Amsterdam stuurt that Peter John to AP
Amsterdam sends
V
g.
dat Peter ziek werd
3
i.
dat Peter weg
liep
that Peter away walked NP
NP
PP
V
that John Peter sick made
39
V
AP
dat John Peter ziek maakte
that Peter sick became
Prt
NP
NP
Prt
V
dat John Peter weg
that John Peter away send
V
dat John Peter de
bril
van
de
neus sloeg
that John Peter the glasses from the nose hit (that John hit Peter's glasses off his nose) k.
NP
NP
AP
V
dat John Peter het leven zuur maakte that John Peter the life
stuurde
sour made
(that John embittered Peter's life)
- 14 1.
PP
Prt
V
dat John naar Amsterdam af that John to
reisde 40
Amsterdam off traveled
(that John departed for Amsterdam) m.
NP
PP
Prt
V
dat John Peter naar Amsterdam weg that John Peter to
promoveerde
Amsterdam away promoted
(that John promoted Peter away to Amsterdam) The VP-complements (S-complements) undergo extraposition, or loose their V by the rule of V-raising. These processes have been studied in great detail by Evers (1975). 41 Note that we do not postulate a clitic position for Dutch. It has often been assumed that Dutch has a clitic position right behind the subject. This position has been based on the following observation. Although adverbs like waarschijnlijk (probably) can normally appear between subject and indirect object, and between indirect object and direct object, they cannot occur in these positions when indirect and direct object are clitics: 42 (12)a.
dat
iemand
waarschijnlijk Peter een boek gaf
that someone probably
Peter a
book gave
b.
dat
iemand
Peter waarschijnlijk een boek gaf
c.
*dat
iemand
waarschijnlijk 't 'm
that someone probably
gaf
it him gave
d.
*dat
iemand
't waarschijnlijk 'm
gaf
e.
dat
iemand
't 'm waarschijnlijk gaf
- 15 A special clitic position for J_t and Jjn after the subject iemand would explain the ungrammaticality of (12c) and (12d) and the grammaticality of (12e). This argument is without force for the following reason. Adverbs like waarschijnlijk can also precede the subject: (13)
dat
waarschijnlijk iemand
that probably
Peter het boek gaf
someone Peter the book gave
This front position for adverbs is somewhat marked, and acceptability of such sentences depends in part on the nature of the following subject NP. The crucial fact is that the subject itself can be a clitic (¿£ is the weak form of jij (= you), see 3.2.8.): (14)
dat
je
waarschijnlijk Peter het boek gaf
that you probably
Peter the book gave
If the subject is a clitic like je, it cannot be preceded by an adverb like in (13): (15)
*dat
waarschijnlijk je
that probably
Peter het boek gaf
you Peter the book gave
This fact can, of course, not be accounted for by postulating a clitic position after the subject. So, even with such a clitic position, we have to exclude (15) somehow. Suppose that we have the 43 following filter: (16)
"[Adv
-
CL]
Whatever the deeper explanation of (16) - if any -, it accounts for
-
16
-
both (15) and the ungrammaticality of (12c) and (12d). The second and third rule of (10) generate the following Adv-NP sequences: (17)
[vp
Adv*
NP
[-
Adv*
NP
NP
]]
If we assume furthermore a mechanism to "transport" the Advs to the right of the NPs (see 12a and b), an output filter like (16) is 44 sufficient to correctly predict all clitic positions in Dutch. The other projections (for Dutch) are not too different from English. The P-projection has been studied in detail by Van Riemsdijk (1978) . It can be summarized as follows: (18)
P-projection PP
--
Ñ
NP
Det
Ñ
(PP)
(VP)
(AP)
N
(PP)
(VP)
The specifier system (QP) is as different from English as it is 45
complicated, and beyond the scope of our present concerns.
The NP
that follows the QP in the first rule is the possessive ÑP in forms like Jan's huis (John's house). Besides possessive pronouns, this option is limited to names in Dutch. There are no forms like *de man's huis (the man's house). Dutch compensates for this gap by forms like de man zijn huis (lit. the man his house). Another major difference between Dutch and English is the rich AP-structure in prenominal position. If a participle is the head of these APs, complexity approximates that of relative clauses in many languages (cf. het gisteren tijdens de pauze door Jan gekochte boek) = (lit.) the yesterday during the intermission by John bought book). In this respect, Dutch looks like most other SOV languages. The normal relative clause is, nevertheless, postnominal like in English: de man die het huis kocht (the man that bought the house). For relative clauses we may add a standard rule like NP •*• NP
VP.
For the A-projection, I assume the following rules: (21)
A-projection AP
•+•
QP
AP
AP
->•
(Adv*) A
(PP)
(VP)
A
->-
(PP)
(PP)
(VP)
A
-
18
-
The main difference with English is the sentence-like pre-adjectival structure that is particularly manifest when the head is a participle . I have used a uniform three-bar system for this summary, following Jackendoff (1977) and Van Riemsdijk (1978). This system will be presupposed throughout this study, although I will use traditional notation most of the time. Notations can be related to the rules given here through the following schema: VP = v
n
= v3 = s
NP = N n = N 3 = NP
n_1 VP = v = v2
=
s
NP = N""1 = N 2 = NP
V
= vn"2 = v1
=
VP
N
= Nn"2 = N1 = N
V
= V n _ 3 = v°
-
V
N
= N n " 3 = N° = N
The schemata for PP and AP can be derived from the NP-schema by substituting P and A for N. The leftmost symbols refer to the symbols used in the rules of this paragraph. The rightmost symbols are the traditional symbols used in the remainder of this thesis. I will use category variables like X n ( X = V , P, N, A; n = 3) and X 1 (3 > i > 0) . The traditional symbols are used in order to make the text more readable, although one should bear in mind that the parallelism of the uniform three-bar level of the rules of this paragraph has real theoretical significance. The uniform three-bar level hypothesis is confirmed by the several simplifications it permits in the rules and conditions of the following chapters. 1.4.2.
Functional notions
Functional notions like subject, object, predicate, etc. play a
-
19
-
modest rôle in syntax. An example that comes immediately to mind is the Specified Subject Condition. The notion of a subject in this condition is completely determined by configurational information, in the sense of Chomsky (1965, ch. 2). And yet it is not generally true that the functional status of an NP is completely determined by its position, even in English: (23)a. b.
I gave Bill a book I consider Bill a fool
Both these sentences have two NPs in their VP. While in most cases the first of two VP-internal NPs is an indirect object, and the second a direct object, there are several exceptions like (23b), where Bill is functioning as a subject with respect to the predicate 46 a fool.
These differences are relevant to the Specified Subject
Condition (or rather the Locality Principle) as will be argued in chapter 3. In other languages, like Dutch, there is also some differentation with respect to the subject. Dutch has passives of the lexical variety (cf. 3.2.2.2.), and it will be shown that the subjects of these passives share properties with direct objects, in spite of their surface case, which is nominative. The same can be demonstrated for a large class of intransitives in Dutch. So, what we need is a somewhat more flexible system of assigning functional information to categories. Suppose we have the following functional labels: SU (subject), 10 (indirect object), DO (direct object), PO (prepositional object), PRED (predicate), and Adjunct. Suppose furthermore that lexical categories like verbs assign these labels to the phrases in their projection (cf. the PS-rules in the preceding paragraph). We can, of
-
20
-
course, take advantage of the fact that the function of a category is largely predictable on the basis of its position. Let us therefore suppose that in general we do not have to state in the lexicon which functional labels are assigned by an X^. If we limit ourselves to Vs, we may assume the following redundancy rules: 47 (24)a.
[+V, [ s NP
[ v p . . — ..]]] => [+V,
[ s NP SU
[vp..--..]]]
b.
[+V, [ v p —
NP ]]
=» [+V,
[vp ~
NP ]] DO
c.
[+V, [ v p ~
NP
NP ]]
=» [+V,
[vp ~
NP 10
NP ]] DO
d.
[+V, [ v p ~
[P
NP]]]
=> [+V,
[vp —
[P
NP]]] PO
The verb give, for instance, is regular. It may select two NPs in its VP, which can be represented by a subcategorization frame like 48 the lefthand side of rule (24c).
This rule applies by assigning
the functional labels so that Mary is an 10 and a book a DO in a sentence like John gave Mary a book. Functional labels are only assigned by the lexicon to argument NPs in the sense of Freidin (1978). Suppose that we distinguish argument NPs from non-argument NPs in the lexicon by representing them by NP and NP, respectively. This results in the following frames for verbs like consider and seem: (25)a. b.
consider:
[+V, [ y p
—
NP
seem:
[+V, [ g NP
[vp
X n ]] PRED --
S]]]
The redundancy rules (24) do not apply in these cases. In general,
-
21
-
they do not apply to NPs and to NPs that are assigned a functional label (like PRED) in the lexicon. Such explicit stipulations are only needed for irregular verbs like the class of intransitives in Dutch that will be discussed in 3.2.2.3. An example is bevallen (please) that selects (irregularly) a DO subject and an 10 object: (26)
bevallen:
[+V,
NP
tvp
DO
NP
—
]]]
10
For the rest, Dutch has the same redundancy rules as English (if we disregard the different position of the verb). NPs that are non-arguments with respect to Vs, receive their functional labels in a special way. In a sentence like John considers Bill a fool, Bill is a non-argument with respect to consider, but an argument with respect to the following PRED (a fool). Let us assume the following rule for this kind of secondary predication (cf. Rouveret and Vergnaud 1978): (27)
[vp
NP
PRED ] =a
[yp
NP
PRED ]
SU The functional structure of a sentence like John considers Bill a fool is now as follows: (28)
John considers Bill a fool SU
SU
PRED
The leftmost SU-label is assigned by rule (24a), the PRED-label by (25a), and the second SU-label (under Bill) by rule (27). On the basis of these rules, we can define the notion of a co-argument:
-
(29)
22
-
Two NPs, NP^ and NP ^ , are co-arguments if their functional labels are assigned by the same X^ or PRED.
Thus, in a sentence like John gave Mary the book the NPs are coarguments, because the three functional labels are all assigned by the verb give. John and Bill in (28) are not co-arguments, because the label of John is assigned by consider, while the label of Bill is assigned by PRED. A fool, being a PRED, is no argument at all. Only SU, 10, DO, and PO are arguments. NPs that belong to different clauses are usually not co-arguments: (30)
John forced Maryi SU
DO
e^ SU
to help Bill DO
The SU and DO of John and Mary are assigned by force, while the SU of e (the controlled embedded subject) and Bill are assigned by a different verb (help). In chapter 3, I will frequently use the notion prominence, defined for co-arguments (FLk refers to the functional label of NP^ for each k): (31)
NP^ is more prominent than NP^ if NP^ and NP^ are co-arguments, and FL^ > FL^ according to the following hierarchy: SU > 10 > DO(PO) (> = "is more prominent than")
I will suggest a slight extension of this hierarchy (in ch. 3) to express the fact that arguments are generally more prominent than
- 23 non-arguments. The prominence hierarchy, which is partly implicit in the Specified Subject Condition, can be added to the auxiliary hypotheses of the Locality Principle, and greatly extends its explanatory power (ch. 3). Furthermore, I would like to define the notion of a derived functional label, . NPs are assigned a derived FL by the following rule (that applies from top to bottom): (32)
NP. i
...
NP. l
NP. l
...
FL. i
l
l
NP. I FL.
Rule (32) applies to each pair of NPs that are coindexing
coindexed by the
rule to be discussed in chapter 2. In the case of con-
trol, for instance, the controller gets the derived FL from the controllee: (33)
John forces Mary i S
DO
e^
to help Bill
SU
DO
The first row has only lexically assigned FLs, the second row also has a derived FL. Mary is the DO of force, and the (derived) SU of the embedded verb help. Similarly, the verb seem, which has no lexical FL at all (see 25b), receives a derived FL: (34)
John^ seems
e^
SU
to help Mary DO
The assignment of derived FLs by (32), simply makes explicit the
- 24 fact that functional notions are carried over under
coindexing. 49
Landing sites of the classical movement rules contain only derived FLs. They miss a lexically assigned FL as a consequence of the properties of their verbs (cf. 25b). This lexical difference between verbs like seem on the one hand, and verbs like force on the other hand explains all differences between "movement" and construal. More about this issue will be said in chapter 2. A crucial assumption in chapter 3 is, that surface structures are well-formed if all NPs bear a FL or . The FLs can be seen as "deep cases". They represent the "transitivity" system of the language, and have to be distinguished from thematic relations and "surface oase".^" I also assume a system of surface cases, that is largely language-specific. English and Dutch have a defective genitive, and nominative and objective forms that only show up in pronouns (he vs. him). We can express this by the features [+nom] (= nominative) and [-nom] (= objective). Surface case is very idiosyncratic across languages (see German, for instance) and has less grammatical significance than deep cases, which are supposed to be fairly uniform across languages.
FOOTNOTES TO CHAPTER 1 1.
Cf. Chomsky (1965), chapter 1, Chomsky (1968), and especially the first paragraph of Chomsky (1973).
2. 3.
Cf. Chomsky and Lasnik (1977) and Chomsky (1978) . See the introduction of Chomsky (1977a), where the trace-PRO distinction is discussed.
4.
Cf. Chomsky and Lasnik (1977), and Van Riemsdijk (1978), chapter 7.
- 25 5.
For the role of "themes" in science, see Holton (1973).
6.
For thematic relations, see Jackendoff (1972). Notions from logic were frequently used by generative semanticists. Freidin (1978) is in part an attempt to shift the focus of explanation to Logical Form. See also Chomsky (1978) for notions like "opacity".
7.
Cf. Lasnik and Kupin (1976) .
8.
Cf. Postal (1971) and Postal (1974). Chomsky (1973) persuasively argued that clause mate principles can be dispensed with.
9.
Cf. Chomsky (1973) and Freidin (1978).
10.
Cf. Chomsky (1973).
11.
Cf. Chomsky (1973).
12.
Cf. Rosenbaum (1967) and (1970) .
13.
Cf. Grinder (1970) .
14.
Cf. Chomsky (1973).
15.
Cf. Koster (1977a).
16.
Cf. Van Riemsdijk (1978), chapter 4, and the references cited there.
17.
The term "locality principle" was first used by Evers (1975) for a principle not listed in (4). A complete survey of principles of this kind is beyond the scope of the present study.
18.
In the meta-metalanguage in which conditions on rules and anaphors are formulated, Greek letters (a, (3, y) stand for categories of type X 1 (X i-bar). In general, I will use a for antecedents, and y for consequents like anaphors. In chapter 2, I will define 3 as the top node (roughly an X n that is not immediately dominated by another X n ). The subscripts of a in (6) (i, i+1) indicate the relative distance of the as in question from y. Thus, a. is closer to v than a. 1 1 l l+l For the NP Constraint, see Horn (1977) and Bach and Horn (1976). For the Head Constraint, see note 16. Rosenbaum (1970), 28.
19. 20.
-
26
-
21.
For the term "Baconian Sciences", see Kuhn (1977), 4Iff. Baconian sciences are the non-mathematical, descriptive or experimental sciences like chemistry in the seventeenth century and biology. A number of Baconian sciences became more mathematical during the first quarter of the nineteenth century (op.cit. p. 61). Linguistics seems in a similar stage of transition.
22.
For some examples, see Koster (1977a).
23.
As for the pessimism, see Postal (1972). Pessimistic statements about the achievements and prospects of generative grammar have increased since the early seventies. This development is surprising, because it naturally takes some time to invent theories of some scope and depth. At least some of the current pessimism seems to reflect the bankruptcy of the empiricist, "Baconian" methodology of some early transformational grammarians. Another cause of pessimism is the popularity of the view that sciences are characterized by their falsificationism in the sense of Popper. In a science like linguistics, which involves little idealization, the idea of falsification is self-defeating, because explanatory depth can only be attained through idealized theories that deviate from the world of phenomena by definition. Pessimism originates where conflicts between idealized theories and phenomena are interpreted as falsifications of theories. It seems to me that "falsification" is much less important for a developing field than the key notion of "idealization".
24.
Suppe (1977), 707.
25.
For ideals of natural order, see Toulmin (1961). The example of geometrical optics is discussed in great detail in Toulmin (1953) .
26.
Toulmin (1953), chapter 3.
27.
The term "protective belt" is from Lakatos (1970). Lakatos1 article is a good introduction to the problem of falsification and "tenacity", and contains many illustrative examples. Suppe (1977) is the most comprehensive critical review of current ideas in the philosophy of science.
- 27 28.
Putnam gives a third schema that is irrelevant for present purposes. Schema I may also illustrate the so-called Duhemian thesis that implies that negative evidence (false predictions) does not necessarily falsify a theory, because predictions are not derived from the theory alone, but from a conjunction of the theory and several auxiliary statements. Any of the auxiliary hypotheses can be false. See Koster (1977a) for references and some discussion.
29.
Adoption of new principles does not necessarily depend on the demonstration that the new theory is more comprehensive. The scope of a theory usually increases in the course of time, if it is based on a fruitful program. New theories are very often adopted for their elegance, their explanatory promise, or for thematic reasons (cf. Holton 1973).
30.
The term "modular" is used in neurophysiological studies of visual perception, and in AI research. According to the modular approach, complicated phenomena are not described directly, but indirectly, as the result of the interaction of several components that are themselves relatively simple.
31.
The sequence PP VP returns at the end of all three levels. Cf. Williams (1978b).
32.
The term Wh-phrase has a broad meaning here. The class of Whphrases contains not only question words and phrases, but also relative pronouns (plus "pied piped" material), and phrases without phonological realization in the case of comparatives (in most dialects) and other constructions (cf. Chomsky (1977b)).
33.
For Middle English, see Keyser (1975). For Dutch, see Koster (1978b).
34.
Personal communication.
35.
Cf. Jackendoff (1972) and Emonds (1976).
36.
A major problem is that there are so many possible positions for adverbials. Keyser (1968) proposes a transportability convention to account for adverb distribution. There have been several attempts to generate adverbs in canonical base posi-
-
28
-
tions, followed by transformations of various kinds. All accounts I know of seem rather arbitrary. Cf. Jackendoff (1972), chapter 3 and Koster (1976) for discussion. 37.
Cf. Jackendoff (1977) and Van Riemsdijk (1978).
38.
In the following examples, literal translations have been given where little misunderstanding seems possible. All examples have non-root word order, which is basic in Dutch (see Koster (1975)). The complementizer dat is used to indicate the fact that these sentences have non-root order.
39.
According to Dutch spelling, particle + verb has to be written as one word (in non-root sentences).
40.
See note 39.
41.
Cf. Van Riemsdijk (1978), 33.
42.
Unstressed pronouns are called clitics for convenience in Dutch in grammar. There is no evidence that the forms J_t and (12) are real clitics, i.e. forms that are incorporated in other constituents like verbs or complementizers. (See also 3.2.8.).
43.
Apart from filters that concern the COMP-system, there are several constraints on the order of adjuncts that can be accounted for by (perhaps stylistic) filters like (16). See also Chomsky and Lasnik (1977), (154) (p. 479). Dik (1978, chapter 9) contains many interesting observations of the relative word order of small and "heavy" constituents.
44.
For transportability of adverbs, see Keyser (1968). Adverbs are also transported to clause-initial position.
45.
See Putter (1976) for some observations about the NP-specifier in Dutch.
46.
This is a case of "secondary predication". Cf. Rouveret and Vergnaud (1978).
47.
For this kind of redundancy rule, see Ruwet (1972). The position of the V holds for English; in Dutch, it is to the right of the complements.
48.
I assume that double-NP constructions are base-generated. Cf. Oehrle (1976). Dutch has several double-NP constructions with
-
29
-
no corresponding paraphrase with NP PP. 49.
The
coindexing rule will be discussed in chapter 2.
50.
For "transitivity", "ergativity", etc., see Lyons (1968) and references cited there. Dik (1978) also distinguishes semantic functions (thematic relations) from syntactic functions (sub- ' ject and object). The theory presented in this thesis differs from functional and relational grammar in that phrase structure notions form the primary explanatory mode. Thematic relations and functional notions enter in auxiliary theories which describe structures that interact with the primary, completely formal syntactic structure.
-
30
-
CHAPTER 2 THE BOUNDING
2.1.
CONDITION
Introduction Chomsky's epoch-making Conditions on Transformations (1973;
henceforth Conditions) has led to a three-fold unification of linguistic theory. First, it was shown there that movement transformations do not differ with respect to boundedness. An apparently unbounded transformation like Wh-movement could be reinterpreted as an iteration of a more local rule, constrained by conditions like Subjacency.1 The second unifying step - implicit in the Conditionsframework - concerns all other apparently unbounded transformations, insofar as they obey island conditions like the Complex NP Constraint (CNPC) and the Wh-island condition (cf. Ross (1967)). This class of rules could be eliminated by reducing them to the bounded 2
version of Wh-movement. There is a third revolutionary idea in Conditions - trace theory - that has worked in the direction of yet another kind of unification, but that has not yet been pushed to its utmost consequences. The idea of traces is unifying because it brings the theory of movement rules closer to the theory of bound anaphora. The relation between a moved category and its trace is seen as an antecedent-anaphor relation. In this way, the class of possible movements is considerably reduced: only those outputs are allowed that conform to the class of possible antecedent-anaphor configura-
-
31
-
tions. Were it for no other reason, traces would already be wellmotivated because of this result. Another unifying effect of traces is connected with semantics. All interpretation can be done at the level of surface structure, because the deep structure aspects of meaning, like thematic relations, are still accessible in surface structure. This might have radical consequences, to which I will return directly. In general, trace theory has considerably increased the sophistication of generative grammar as a scientific theory by making it more "modular". Complex aspects of particular constructions need no longer be stipulated in construction-specific transformations. Trace theory has made it possible to formulate extremely simple and general rules, like "move a". Such rules massively overgenerate, but this is without unwanted consequences because the outputs are filtered by independently needed conditions on anaphora. It is almost a standard feature of theoretical progress that apparently complex phenomena are not described directly, by stipulation, but indirectly, as the result of the interaction of relatively simple devices.^ In my opinion, the modular approach can be carried one step further, by eliminating the distinction between "move a" and "coindex" (in the sense of Chomsky (1978); cf. (1) and (2) of chapter 1). The effect of movement rules like "move NP" and Wh-movement is to create a partial coindexing, namely of a moved category with its trace (cf. Chomsky 1978). After this first round of coindexing, there is a second round, because construal rules like "control" (or more generally "coindex") do the rest of the coindexing. It is unsatisfactory to have two coindexing procedures, unless there is a
- 32 descriptive or explanatory need for it. As I will argue in the remainder of this chapter, it is possible to do with only one coindexing procedure. The alleged difference between "move a" and "coindex" can be seen as the result of an interaction of this single coindexing rule with the lexical properties of lexical items like verbs. The claim that there is a class of movement transformations (2. of (1) in chapter 1), is based on an alleged difference between trace and PRO, and on a specific property, Subjacency, that is said to distinguish the movement rules in question from rules of 4 construal.
I will make a few remarks about trace and PRO in 2.1.1.
The rest of this chapter is about Subjacency. The essence of the argument is that Subjacency is superfluous. Part of its content can be reduced to a very general Locality Principle, the topic of chapter 3. The remaining content of Subjacency follows from the Bounding Condition. This is a condition on null-anaphora. The CNPC, the Subject Condition, and several facts that are beyond the scope of Subjacency, follow from the Bounding Condition. Since the Bounding Condition also explains certain properties of control, and since it applies to non-movement rules like Gapping, it cannot be considered a distinctive property of movement rules. Because there are no other known distinctive properties of movement rules, there is no reason to see them as a separate class.^ 2.1.1.
A difference between trace and PRO?
Trace theory makes it possible to assign thematic relations in surface structure, because moved categories are linked to their original positions; the latter have a residue in the form of a
- 33 trace.® If such relations between a verb and its NP (like thematic relations) are available in surface structure, the same is true for other relational information, like selectional restrictions and idiomatic connections. The consequences of this were first realized by Otero (1976) and Den Besten (1976) . They independently came to the conclusion that trace theory allows lexical insertion in surface structure. Chomsky and Lasnik (1977) recognize this, but do not pursue the matter.^ Classical arguments for transformations were connected with properties of lexical items and idioms. Similarly, the distinction between trace and PRO has been based on contrasts like the following: ® (1)
There^ seems [ e^
(2)
The icej was likely
to be a reward] [e^
(3)
*There^ tries [ e^
(4)
*The ice^ was persuaded
to be broken
to be a reward] ej [ ej
(e^ = trace) ej ] (e^ = trace) (e^ = PRO)
to be broken ej ] (ej = PRO)
It is hard to see how these data justify a distinction between trace and PRO, or a distinction between movement (1,2) and construal (3,4). Suppose that we have a uniform coindexing procedure that assigns to all embedded subjects (e) the index of the base-generated subject NP of the matrix. All properties of these sentences would follow immediately, because try (3) selects an argument NP that has to be [+animate]. The verb seem does not select an argument subject (see chapter 1) and therefore has no selectional relations with it. As a consequence, any lexical material may be inserted in this position.
- 34 A similar argument applies in the case of idiom chunks. The idiomatic reading of to break the ice depends on the unique connection between break and its object, the ice. In (2), the condition of "unique connection" is fulfilled, because the ice is linked to exactly one argument position, namely the object position of break. This triggers the idiomatic reading. In (4), however, the connection is not unique. The ice is linked to two argument positions; the first is the object position of persuade, which does not trigger an idiomatic reading, and the second is the object position of break, which does trigger the idiomatic reading. This conflict is the source of the ungrammaticality. Moreover, persuade selects a [+human] object, so that the ice does not qualify as such. In general, "movement" is a linking from a non-argument antecedent to an e, while control is a linking from an argument position to an e. But the status of the antecedent is completely determined by uncontroversial properties of the verbs that are involved. The lexicon has to specify anyhow that the subject of seem is a nonargument, while the subject of try is an argument. To call the first subject position a landing site for movement, and the second a controller is just a matter of terminology. In no way does it follow that different kinds of rules are involved, or different kinds of anaphors (trace and PRO, respectively). One might object that landing sites are always subjects (in the case of "move NP"), while controllers are sometimes subjects (try, promise), sometimes objects, depending on the properties of the verbs. But, first, this asymmetry does not follow from the conjunction "move NP" + conditions on possible anaphoric configurations, and, second, there is empirical evidence against it. In chapter 3, I will argue that there are also
- 35 cases where the object position of a verb is a landing site. In other words, whether a given NP qualifies as a landing site or as a controller, depends on lexical information. Another argument for PRO (as distinct from trace) is the folq lowing contrast, supposed to be unique for control verbs: (5)a. b.
Bill was persuaded to leave *Bill was promised to leave
But also in this case it is difficult to see what follows from this contrast with respect to the rules (control vs. movement) or the anaphors (trace vs. PRO) that are involved. It seems hardly controversial that the difference follows from the distinct properties of the verbs persuade and promise. In chapter 3, I will propose a rather uncontroversial stipulation for promise, to the effect that its object does not qualify as a controller: (6)
promise:
[+V, —
NP
S]
[- antecedent] Under the general assumption that features are carried over under coindexing, the ungrammaticality of (5b) is explained. Since there is no antecedent in this sentence, the subject of leave remains uninterpreted.1® Sentences like Bill was promised a job are possible because they do not contain free variables. Bill can be the antecedent of himself in I promised Bill a picture of himself, because the feature [-antecedent] is only assigned when a (possibly tenseless) S follows (see 6).11 In other languages than English, stipulations like (6) have to be made for "movement" verbs. The Dutch verb
lijken (appear), for instance, may select an object that cannot be a "landing site" (the literal translation of the English phrase it appears to me is: it appears me). Thus, there are sentences like the following:

(7)  Peter_i leek hem e_i vertrokken te zijn
     Peter appeared him e disappeared to be
     (It appeared to him that Peter had disappeared)

This case is similar to promise in that we have to stipulate in the lexical entry for lijken that its object position does not qualify as an antecedent position. It is neither a possible controller, nor a possible landing site. The fact that it cannot be a landing site follows from its argument status. The fact that it cannot be a controller follows from the stipulation that it is [-antecedent]. That non-argument positions are never [-antecedent] follows from the requirement mentioned in chapter 1, and further discussed in chapter 3, that all NP-positions have either argument status (FL) or derived argument status. If a non-argument is [-antecedent] there is no way left for it to receive a (derived) FL. All in all, it seems to me that there is no compelling argument based on idioms or sentences like (5) that justifies a distinction between trace and PRO. This conclusion is in agreement with Jenkins (1976) and Lightfoot (1977) who have maintained that both (8a) and (8b) can be base-generated:

(8)a. Mary_i seems e_i to read
   b. Mary_i tries e_i to read

Under the assumption of trace theory, there is no reason to have a different derivation for these two sentences.

2.1.2.  Subjacency as a distinctive property
Since the introduction of trace theory has undermined the classical arguments for movement transformations, the motivation for them has to be found elsewhere. Base-generation of Wh-phrases in COMP and of NPs in surface position has in fact been considered by Chomsky since Conditions (Chomsky 1973, §17).¹² This alternative to a transformational core grammar, which involves interpretive rules only, is according to Chomsky (1977b, note 16) entirely possible, but not necessarily seen as meaningful, because of the specific properties of Wh-movement and "move NP". Thus, it is possible to see both these movements and rules of construal as belonging to the same class of rules, but the movement rules are said to have different properties, i.e. they apply cyclically and obey Subjacency. According to Chomsky, these differences can "[...] be explained (rather than stipulated) if we take the NP-movement and the wh-movement rules to be movement transformations."¹³ This is the general line of argument: a classification of rule systems like (1) of chapter 1 is based on different clusterings of properties. If it can be shown that alleged differences in properties between classes of rules do not exist, we may tentatively conclude that the classification of rules can be simplified by postulating only one class of rules instead of two. This happens to be the situation with (cyclic) movement rules: both the cyclic property and Subjacency can be dispensed with without loss of descriptive or explanatory adequacy.
2.2.  Conditions and the redundancy problem

The most common form of criticism of a theory in linguistics is the discussion of counterexamples. This is important but seldom leads to the elimination of a theory. It is even irrational to give up a theory when an attractive alternative is lacking.¹⁴ It is also possible, and in the absence of substantial alternatives often more fruitful, to criticize the internal structure of a theory. This kind of criticism of the Conditions framework has proven to be of great heuristic power and has led to some remarkable conclusions and theoretical modifications. The first result has been Freidin's conclusions about the strict cycle (Freidin 1978). His argument can best be illustrated with the following, most typical example:

(9)
*Who_1 did the men ask [S̄ what_2 [S e_1 saw e_2]]
The conventional explanation for the ungrammaticality of this sentence involves the strict cycle. According to this explanation, who_1 moves to COMP-position in S_2; what_2 has to stay in its original position (marked by e_2), because a doubly filled COMP is impossible. On the next cycle, who_1 is moved to the COMP of S_1. The principle of the strict cycle determines that we cannot go back to S_2 (a proper subdomain of S_1) in order to move what_2 to the COMP of S_2; hence, the ungrammaticality of (9). But as observed by Freidin, the strict cycle is not needed to block such sentences, given the Propositional Island Condition (PIC) and trace theory.¹⁶ As Freidin has pointed out, we may assume that (9) is the result of completely free application of Wh-movement. The surface result is just not interpretable, because who_1 cannot be linked to its trace, e_1, because of the PIC (e_1 is in a tensed S). The route through the COMP of S_2 is blocked, because this position is filled by what_2. Freidin has shown that all known cases covered by the strict cycle follow as a consequence from independently motivated concepts like obligatory control, the SSC, the PIC, and trace theory.¹⁷ Note that this important result has undermined one of the arguments for a separate class of movement rules. Cyclicity can no longer be seen as a distinctive property of this class, because the elements that produce the same empirical effects are not unique for movement. Another redundancy has been observed and eliminated by Chomsky (1978). He has observed that in sentences like (10), the PIC ("the tensed-S condition") and the SSC overlap:

(10)
*They told me [S̄ what I gave each other]

Linking of each other to they is blocked by both the PIC and the SSC. Chomsky proposes to eliminate this redundancy by restricting the PIC to the subject of a tensed clause. Non-subject anaphors like each other in (10) are in the domain of the subject, and therefore already made inaccessible by the SSC. Chomsky suggests assigning nominative case to all subjects of tensed clauses, and replacing the PIC by (11):¹⁸
(11)
The Nominative Island Condition (NIC)
      A nominative anaphor cannot be free in S

That the anaphor "cannot be free in S" means that it has to be bound in S. Compare, for instance, (12a) and (12b):

(12)a. *They said [S̄ that each other were happy]
    b.  Who_i did they say [S̄ e_i [S e_i were happy]]
In (12a), the anaphor each other is free in S, i.e. there is no antecedent in this clause. In (12b), the rightmost e_i is bound in S, because there is an antecedent in the same clause (e_i in COMP). Chomsky argues that the NIC not only eliminates the redundancy of the PIC, but that it also brings some gains in terms of empirical adequacy for cases like:

(13)  John said [S̄ that a picture of himself hung in the office]
The acceptability of this sentence was always a problem for the PIC, because himself is in
a tensed clause that does not also contain
the antecedent John. For the NIC (13) is no longer a problem because himself does not receive nominative case, which is only assigned to the whole subject (a picture of himself). Thus, a theoretical simplification also improves descriptive adequacy, which is a satisfactory result.¹⁹ In Koster (1977b) it has been argued that the SSC largely overlaps with another major condition: Subjacency. The two conditions have been stated as follows:²⁰

(14)
The Specified Subject Condition
      No rule can involve X, Y in the structure
      ...X...[α ...Z...-WYV...]...
      where Z is the specified subject of WYV in α (= NP, S)

(15)  Subjacency
      A cyclic rule cannot move a phrase from position Y to position X (or conversely) in:
      ...X...[α ...[β ...Y...]...]...X...
      where α and β are cyclic nodes (= S̄ or S, and NP)

The redundancy is obvious when we look at the four possibilities for Subjacency (if we take S as a cyclic node):

(16)a. ...X...[S2 ...[S1 ...Y...]...]
    b. ...X...[S1 ...[NP ...Y...]...]
    c. ...X...[NP ...[S ...Y...]...]
    d. ...X...[NP ...[NP ...Y...]...]

Both the SSC and Subjacency exclude movement from Y to X in (16a) and (16b) (if we disregard the case where NP is the subject of S1 in (16b)). Both cases involve two cyclic nodes between X and Y, and in both cases S1 immediately dominates its subject. Just as in the case of the PIC and the SSC, it seems desirable to attempt to eliminate this redundancy. The other two cases of Subjacency, (16c) and (16d), fall within a wider range of island phenomena involving phrase nodes other than S. As has been argued by Van Riemsdijk (1978), this kind of phenomenon - to be discussed in the next paragraphs - requires yet another principle, the Head Constraint:²¹
(17)  The Head Constraint
      No rule may involve X_i/X_j and Y_i/Y_j in the structure
      ...X_i...[H^n ...[H' ...Y_i...H...Y_j...]H' ...]H^n ...X_j...
      (where H is the phonologically specified (i.e. non-null) head and H^n is the maximal projection of H)

Van Riemsdijk rightly concludes that "[t]he addition of the head constraint in [the system of conditions] can hardly be claimed to increase the overall elegance of the system of constraints as a whole."²² If we look at (16) again, it appears that the Head Constraint overlaps with both the SSC and Subjacency (if S1 has a V head) in the case of (16a) and (16b), and with Subjacency in the case of (16c) and (16d) (if Y is in the domain of the head of the topmost NP). It is very unlikely that both the Head Constraint and Subjacency are valid, because they almost completely overlap. The problem, however, is that the Head Constraint has some real content beyond Subjacency. It is, of course, possible that empirical reasons force us to adopt an extremely redundant system of conditions. But it seems more fruitful to assume, as a working hypothesis, that the aesthetics of the structure of language is not that disappointing. Our primary aim will therefore be the construction of a system of conditions with fewer redundancies.

2.3.  The evidence for Subjacency

Before discussing the Bounding Condition, I would like to give
a brief review of the major evidence for Subjacency. This will serve as a background for the discussions to follow. The major evidence
for Subjacency consists of the following facts and empirical generalizations:

(18)  2.3.1.  The Complex NP Constraint
      2.3.2.  The Subject Condition
      2.3.3.  Extraposition phenomena
      2.3.4.  Wh-island phenomena

Let us discuss these classes of facts one by one.

2.3.1.  The Complex NP Constraint (CNPC)²³
One of the virtues of Subjacency has been that a number of empirical generalizations like Ross' CNPC could be derived with it. As for the general status of the CNPC, a few preliminary observations are in order here. The whole idea of a CNPC only makes sense when we look at languages like English, and a few others. In a broader perspective, it appears to be based on the wrong assumptions. Ross1 starting point was the existence of a class of unbounded transformations, for which there seemed to be ample evidence in English. So, a logical question was: when are these unbounded processes impossible? What are the exceptions to unboundedness? The CNPC was one of the answers. But if one looks at other languages, the more appropriate starting point appears to be just the opposite. Several languages have only bounded rules, and unbounded processes are very limited in all languages. Therefore, a better question seems to be: given the fact that rules are bounded, what are the apparent exceptions in languages like English? The answer is that extraction is only possible from complements of some verbs and adjectives (cf. Erteschik 1973). In other
words, clauses are islands except complements of a subclass of categories of type [+V]. In general, extraction is impossible from complements of categories of type [-V], i.e. prepositions and nouns. The Vs that allow extraction are called "bridges" by Erteschik. Sentential complements of Ps are often islands, just like N-complements:

(19)a.  John left after finishing his book
    b. *What did John leave after finishing?

The fact that there is also a Complex PP Constraint is not covered by Subjacency, unless we assume that PP is also a cyclic node.²⁴ Although the island character of N- and P-complements might follow from the definition of a "bridge", it seems preferable to try to find an answer to the question why there is a difference between V on the one hand, and N and P on the other hand. There appears to be a principled answer to this question.²⁵ In spite of the fact that the notion "bridge" has to be defined anyhow, and might exclude N and P, it seems to me that the CNPC is one of the strongest arguments for Subjacency. Any alternative has to come to grips with the CNPC and related facts (like the Complex PP Constraint).

2.3.2.  The Subject Condition
A second class of facts that has been discussed in connection with Subjacency concerns the impossibility of extraction from subject phrases: (20)
*[S̄ Who_i [S was [NP a picture of e_i] stolen by Bill]]

Chomsky (1973) accounted for this fact with a special ad hoc provision, the Subject Condition (not to be confused with the Specified Subject Condition). Chomsky (1977b) drops this special condition in favor of the more general principle of Subjacency. This can only be done by assuming that S (rather than S̄) is the cyclic node for Subjacency. Chomsky (1977b) explores some consequences of this assumption, which will be briefly discussed here. Under the assumption that S is the cyclic node for Subjacency, the ungrammaticality of (20) is explained because who_i is separated from its trace by two cyclic nodes: NP and S. Chomsky (op.cit.) argues that this explanation is not straightforward because of the possibility of extracting a Wh-word from an object phrase:

(21)
[S̄ Who_i [S did you see [NP a picture of e_i]]]

This sentence is acceptable. Why is it not blocked by Subjacency? After discussing a proposal by Bach and Horn (1976), Chomsky assumes that there is an extraposition rule (or a readjustment rule) that converts (22a) into (22b):

(22)a. [COMP [S you saw [NP a picture of someone]]]
    b. [COMP [S you saw [NP a picture] [PP of someone]]]
Wh-movement now applies to (22b), so that Subjacency is no longer violated. The extraposition (or readjustment) rule is supposed to be sensitive to specific lexical items. It does apply to see, find, etc., but not to destroy. This would explain why (23) is less acceptable to many speakers than (21):
(23)  Who_i [S did you destroy [NP a picture of e_i]]
This solution is not very satisfactory, because the problem (i.e. the difference between subject and object under extraction) remains. Why does the extraposition rule apply to objects, and not to subjects? It cannot be ordinary extraposition, which applies to both objects and subjects. What is worse, the very extraposition rules that are supposed to make extraction possible have in general just the opposite effect in that they make extraction impossible:²⁶

(24)a.  A book about farming appeared
    b.  A book appeared about farming
    c. *What_i did a book appear about e_i ?

Wh-movement is also impossible after extraposition of object complements:

(25)a.  He saw a book about farming, yesterday
    b.  He saw a book, yesterday, about farming
    c. *What_i did he see a book, yesterday, about e_i ?
-
47
-
(26)a. There were as many women as there were —
men
b. There were as many women as[g[^p+wh][gthere were [ N p Q p men]]] 1
*
If the +wh-QP is moved out of the NP to COMP position, it violates Subjacency under the new proposal, since two cyclic nodes are passed (NP and S). Sentence (26) does not seem bad enough to justify the conclusion that Subjacency has indeed been violated. All in all, I am not convinced that the class of facts described by the Subject Condition is explained by Subjacency. Research in the domain of X-bar theory has made it more and more plausible that S is in fact the maximal projection of V (i.e. V n , V^, or V1 ' ') . ^
An X n seems the natural boundary of a phrase, and the most
plausible candidate for a role in a system of conditions on rules. From this point of view, S (an X n
is an arbitrary node.
The readjustment rule looks too much like a deus ex machina to be trusted. The rule has to apply before Wh-movement to be of any relevance for the problem at issue. It can be shown that ordinary extraposition has to apply after Wh-movement (see the next paragraph). From this, and from facts like (24) and (25), it follows that the readjustment rule cannot be identified as extraposition. It is also unlikely that the readjustment rule leaves a trace, because there is a very general constraint that blocks extraction 28
from phrases that bind a trace, as shown by Riny Huybregts.
From
the fact that the readjustment rule applies before Wh-movement and from the fact that interpretation of the PP (after Wh-movement, see (1) of chapter 1) as a complement of the preceding NP cannot plausibly involve a trace, we can conclude that nothing would be different if we generated forms like (22b) directly by the phrase
- 48 structure rules: the PP (of someone) could be associated with the preceding NP (a picture) by exactly the same rules as would be needed after application of the readjustment rule. But this is, in fact, good news for Subjacency. A more serious problem is the following sentence, mentioned in Chomsky (1978): (27)
*It is a nuisance (for us) [g for [ g [ Np pictures of
e ]
to be on sale]] This is a crucial sentence because it shows that an e within a subject phrase is not only inaccessible in the case of movement (cf. (20)), but also in the case of control. The ungrammaticality of (27) cannot be explained by Subjacency, because Subjacency only applies to movement rules. The fact that (27) is as bad as (20) undermines the explanation of (20) in terms of Subjacency (as a distinctive property of movement rules). A more satisfactory theory would explain (20) and (27) in the same way, without reference to non-maximal phrase nodes like S. 2.3.3.
Extraposition phenomena
Extraposition phenomena form a third class of evidence for Subjacency. Chomsky (1975, 85) discusses the following data: (28)a.
The only one that I like of Tolstoy's novels is out of print
b.
The only one of Tolstoy's novels that I like is out of print
c.
*The only one of Tolstoy's novels is out of print that I like
According to Chomsky, the rule of Extraposition (from NP)
29
may move
- 49
-
the underlined sentence to derive (28b) from (28a), but not to derive (28c). In the latter case, Subjacency is violated, under the assumption that the underlying structure is something like:^ (28)d.
[np^NP
t h e onl
one
V
that I like] of T's novels] is out
of print Extraposition in this structure (resulting in (28c)) would be a movement over two cyclic boundaries (NPs in this case), which is a violation of Subjacency. This piece of evidence is problematic for several reasons. First, it is not clear why (28c) cannot be derived with the structure underlying (28b) as an intermediate step: (28) e.
^np^np
o n l
y
one
®
1
T
' s novels that I like] is
out of print Sentence (28c) can be derived from (28e) without violating Subjacency. A second problem is that Extraposition from NP only applies under very limited conditions, as shown by Gu§ron (1976). GuSron argues that only the complements of those NPs may be extraposed, that are new information in a discourse. Usually, these are indefinite (or rather, unspecific) NPs that are the focus of the sentence. As for predicates, only very special ones qualify as context for extraposition. Extraposition from the subject NP, for instance, is only possible with what GuSron calls "verbs of appearance". As a consequence, there are independent reasons to rule out (28c). The predicate be out of print is just not a proper context for extraposition rules. Cf.:
-
(28)f.
*[the novels
e ]
50
-
are out of print that I like most
Under standard assumptions about relative clause constructions, extraposition in (28f) involves movement over one NP boundary. So, there is no violation of Subjacency. And yet, this sentence is bad. It is therefore unclear how these extraposition phenomena bear on Subjacency. Akmajian (1975) gives arguments for Subjacency that are similar to the ones just discussed. He offers, for instance, examples like: (29)a.
A review of a book by three authors appeared last year
b.
A review of a book appeared last year by three authors
Sentence (29a) is ambiguous: by three authors can be interpreted as modifying review or book. Sentence (29b) does not display this ambiguity: by three authors can only be interpreted here as a complement of review. Subjacency can be responsible for the difference, according to Akmajian, if we assume the following derived structure for the excluded reading of (29b): (29)c.
[ N p a review of [ N p a book
e ]] appeared by three authors
The phrase by three authors cannot be moved from the position marked by e, because it would involve movement over two cyclic (NP) boundaries, which is again a violation of Subjacency. This type of evidence cannot be accepted without contradiction. There appear to be two other kinds of extraposition phenomena that lead to violations of Subjacency. The first was discovered by Klein
- 51 (1977). Klein observed that appositive NPs could be extraposed over two NP boundaries.^1 Consider first the following sentences: (30)a.
We hebben Pollini, de We have
beste pianist ter
Pollini, the best
wereld, gezien
pianist of the world,
seen
(We saw Pollini, the best pianist of the world) b.
We hebben Pollini gezien, de We have
Pollini seen,
beste pianist ter
the best
wereld
pianist of the world
(We saw Pollini, the best pianist of the world) Extraposition from object NPs can easily be observed in Dutch because of the SOV-structure: extraposed phrases always pass the verb (in sentences with more than one V; see (30b)) . Klein has shown that appositive NPs can be extraposed, no matter the number of NPboundaries: (31)a.
We hebben [ N p de We have de
van [ N p Pollini
the father of
beste pianist ter
the best b.
vader
e^ ]] gezien
Pollini
seen
wereld^
pianist of the world
We hebben [ N p de We have [ N p Pollini
zuster van [ N p de
the sister of
seen
van
the father of
e^]]] gezien, de
Pollini
vader
beste pianist ter
the best
wereld^
pianist of the world
The second violation of Subjacency can be observed when relative clauses are extraposed:
- 52 (32)
Ik heb I
[ N p de
have
zuster van [ N p de
the sister of
man
e^ ]] gezien
the man
e
seen
die het gedaan heeft^ who it
done
has
(I saw the sister of the man who did it) Just as in the case of appositives, the number of NPs can be extended
without limit:
(33)
Ik heb I
[ N p het kind
have
van [ N p de
the child of
zuster van [ N p de
the sister of
man
the man
e^]]] e
gezien die het gedaan heeftj seen
who it
done
has
(I saw the child of the sister of the man who did it) Van Riemsdijk (1978) observes similar facts with respect to
(34)a.
Ik heb I
have
[ p p met
de
meeste mensen die er
with the most
waren] gesproken
people who there were
spoken
(I spoke with most of the people who were there) b.
Ik heb [ p p met I
have
de meeste mensen
with the most
people
e ] gesproken die er spoken
waren
who there were
(I spoke with most of the people who were there) Van Riemsdijk concludes that the violation of the Head Constraint (and of Subjacency if PP is cyclic) is only apparent, because there is an escape route through the S positions (within the PP) that are not in the domain of the head:"*^
-
5 3
-
I I
(35)
[pp[p
m e t
tNP
de
with
w ]
mensen
[g die • • • ] s W P
the people
S
]
^ PP
V
S
that ..
Van Riemsdijk shows that step I is independently motivated, because there is evidence for PP-internal movement of S. Step II is no longer a problem, because the intermediate S position is not in the domain of the head, and only one cyclic node
(PP) is passed. A
similar escape route can be constructed for cases like
(32) and
(33), where more than one NP-boundary is passed:
(36) [ N p het kind van
[ N p de
the child of
zuster van
[ N p de man e^] S^]S^] V S^
the sister of
the man
j
But the availability of such escape routes undermines Akmajian's evidence, because a similar escape route has to be assumed for PPs. Van Riemsdijk's evidence for PP-internal S-movement applies also to PPs, in fact has to apply because PPs can also escape from the Head Constraint and Subjacency
(37)
Ik heb I
[ p p met
have
(with cyclic PP):
de
broer
e^ ] gesproken van die
with the brother e
spoken
of
jongen^
that boy
(I spoke with the brother of that boy)
If higher PP complements
(within PP or NP) can be used to escape
through, Akmajian's argument loses its force:
(38) [ Np [jj a review
[ p p t p of
[Np[^
a
book
l
by three
authors]]PP]]PP]PP I ¡M 31 t~
By passing through the intermediate PP-positions, the phrase by three authors can leave this NP without violating
Subjacency.
- 54 Thus, either Akmajian's example is explained by Subjacency, and (31), (32), and (33) are counterexamples to Subjacency, or there are escape routes for PP and S, and Akmajian's example becomes an irrelevant, anomalous fact that can no longer be explained by Subjacency. Quite apart from the fact that Subjacency cannot be applied to extraposition phenomena without contradiction, one may ask whether this class of facts is relevant at all for our issue. What we want to establish is that there is no unique property (Subjacency) that distinguishes "move NP" and Wh-movement from rules of construal. Thus, even if it could be demonstrated that extraposition rules are constrained by Subjacency, it would not follow that "move NP" and Wh-movement are also constrained by Subjacency, unless it can be shown that these rules and extraposition are in the same class. This is far from clear. Suppose that extraposition rules are stylistic (6a (39)
of (1) of chapter 1): 1.
Base rules
2.
Wh-movement, "move NP"
(6a) extraposition
Since this classification is based on properties like Subjacency, it does not follow that if (6a) is characterized by Subjacency, the same must be true for 2. Thus, properties of extraposition rules only have consequences for Wh-movement and "move NP" if the following classification can be established: 34
- 55 (40)
1.
Base rules
2.
Wh-movement, "move NP", extraposition
As it stands, (39) is a far more plausible classification than 35 (40).
Contrary to (clause-bound) Wh-movement and "move NP", extra-
position rules depend on idiosyncratic phonological facts, as documented by Guëron (1976). Another major difference is that Wh- and NP-movement create proper coindexing configurations (in which the "moved" element c-commands the trace), while extraposition rules destroy such configurations. Compare, for instance, the structure before extraposition of relative clauses in Dutch with the structure after application of the rule: (41)
(the man) (who) het gedaan heeft (it done (that I know the man who did it)
has)
- 56 The antecedent de man c-commands the relative pronoun die, so that coindexing applies as indicated. After extraposition of the relative clause, the c-command configuration has been destroyed:
3 fi
(42) C
The first NP^ no longer c-commands the second. Thus, we may conclude that extraposition cannot precede coindexing, as implied by (40). This evidence is only consistent with (39). A third argument for (39) is that extraposition destroys the environment for a very general filter formulated by Chomsky and Lasnik (1977). In its simplest form this filter reads as follows: (43)
*[S̄ ±WH [NP e] ...], unless in the context: [NP NP — ...]
But the context specification fails after extraposition of relative clauses (cf. Chomsky and Lasnik 1977, 451): (44)
A book arrived
that [ N p e ] may interest you]
We may arbitrarily complicate the filter, but everything can remain as simple as it is if we assume that (39) is correct, where the extraposition rules apply after the filter has been passed (4a of (1) in chapter 1).³⁷ The most plausible conclusion at this point is that (39) is correct, and that extraposition phenomena are outside the scope of core grammar. If extraposition leaves a trace it is an inconsequen-
- 57 tial trace, because as soon as the a—rules (of (1) of chapter 1) apply, the b-rules (that interpret traces, among other things) are inaccessible. It does not make much sense to have traces at all in the case of extraposition if (39) is correct, and they are certainly not treated as anaphors in the sense of the Bounding Condition, to be discussed in 2.6. The main conclusions of this paragraph are that extraposition phenomena hardly support Subjacency, and that they tell us nothing at all about the properties of "move NP" and Wh-movement. 2.3.4.
Wh-island phenomena
Facts like the following are explained by Subjacency (S cyclic):

(45)a. *[S̄ Who_i [S do you know [S̄ what_j [S e_i saw e_j]]]]
    b. *[S̄ Where_i [S do you know [S̄ what_j [S to buy e_j e_i]]]]
from the Locality Principle, I will
postpone discussion of them till 3.2.6.). Since Wh-island phenomena also follow from other principles, they are not crucial in a comparison between Subjacency and the Bounding Condition. These facts will therefore not be further discussed in this chapter. 2.4,
Preliminary coneluaion We have concluded so far that there are no clear reasons to
distinguish between trace and PRO. The classical arguments for move-
-
58
-
ment rules are undermined by trace theory. The different properties of movement on the one hand, and control on the other hand can all be accounted for by one and the same coindexing rule in conjunction with independently given properties of verbs like seem and try. These properties are responsible for the fact that in the case of seem-type verbs we have coindexing from a non-argument position to an e, while in the case of verbs like try the coindexing is from an argument position to an e. A separate class of non-stylistic movement rules can only be justified on the basis of distinctive properties like cyclicity, or Subjacency. A theory including cyclicity, Subjacency, and also conditions like the SSC, the NIC, and the Head Constraint is highly redundant. Attempts to eliminate redundancy have led to a successful research program, in which it was first shown that the notion of the cycle could be dropped without loss of descriptive adequacy (Freidin 1978); Chomsky (1978) eliminated another redundancy by replacing the PIC by the NIC. A theory incorporating both Subjacency and the SSC (or even the Head Constraint) is still unsatisfactory from this point of view, an important motivation to look for alternatives . Since the class of (non-stylistic) movement rules can no longer be considered characterized by cyclicity, full attention naturally goes to Subjacency. Really strong evidence for Subjacency appears to be rather limited. It comes mainly from the CNPC and from the Subject Condition. The explanation of the Subject Condition has only been partly successful, because it refers to arbitrary nodes like S (= V n
and fails to explain the very similar fact that subject
phrases are also inaccessible in the case of control. The evidence
- 59 from extraposition is almost certainly irrelevant, and all Whisland phenomena can be explained by other principles. A plausible alternative to Subjacency is descriptively equivalent if it explains the CNPC and the Subject Condition. It must be preferred if it also explains the Subject Condition without reference to the node S, and the fact that subject phrases may not contain controlled NPs. It will be argued in 2.6. that the Bounding Condition not only explains these empirical generalizations, but also several other facts not covered by Subjacency. Before introducing the Bounding Condition, I would like to make a few comments on markedness as a necessary background. 2.5.
Core grammar and markedness
2.5.1.
Introductory remarks
One of the main causes of the survival of the empiricist, descriptivist tradition in generative linguistics is the use of an unspecific transformational formalism, combined with what we might 38 call naive falsificationism.
Since unconstrained transformations
can describe almost anything, transformational grammar has often looked like a cataloguing of the data of pre-scientific experience. As briefly discussed in chapter 1, the way to terminate this "dataism" is the postulation of rigorous idealizations, and to stop interpreting conflicts between idealizations and data as immediate refutations of these idealizations. Interesting theories do not avoid conflicts with data. On the contrary, if a theory is in harmony with all known data, it is probably a worthless theory that 39 cannot lead to new discoveries, and tells us only the obvious. It is therefore wrong to base theory choice solely on considerations
- 60 -
of descriptive adequacy. A theory that is promising from an explanatory point of view, and that faces many counterexamples must be preferred to a theory that is descriptively adequate but without 40 explanatory perspective.
The lesson we can learn from the physi-
cal sciences (cf. chapter 1) is that most normal forms of science can be seen as attempts to reconcile idealized theories with reality by attaining descriptive adequacy in a new way: by the invention of new concepts, by revising auxiliary hypotheses, and so on. This is not to say that the hard core of an idealization is beyond criticism. Of course, this too is subject to change, but usually at a slower rate and almost never without the help of interesting alternatives. Chomsky's recent theories are an excellent illustration. The conditions proposed in Chomsky (1973) are seen as part of an idealization: sentence grammar. When the theory incorporating these conditions is confronted with problematic evidence, a crucial question is whether the rules that account for the problematic data have to 41 fall within sentence grammar.
Sometimes this question has been
answered with "no", and one of the major results has been that we are now able to make a principled 42 distinction between sentence grammar and discourse grammar.
Phenomena described by rules of
discourse grammar obviously have no bearing on the validity of conditions like the NIC and Subjacency. Similar considerations apply to the extraposition phenomena of the preceding paragraphs. In a primitive theory with undifferentiated scope, too many (and in certain respects also too few) data are believed to be relevant for the assessment of a theory, When theories become more sophisticated, more specific and differentiated ideas emerge as to their scope. 43 Recently, Chomsky has argued that besides the distinction
-
61
-
between discourse and sentence grammar, we have to make a further differentiation between core grammar and non-core or peripheral
grammar.
Core grammar represents the unmarked part of language
that is "optimal" in terms of an evaluation metric. It includes, among other things, the general outline of the base rules and the general rule schema "move a" (coindexing in our terms). Core grammar is responsible for the most rigid part of language. Its rules and conditions are either invariant across languages, or fall within a very limited range. In the latter case one may think of universal options like question formation with or without Wh-movement, or the fixing of a parameter (like S or S as a bounding node for conditions) . Beyond core grammar, languages may have rules in different degrees of markedness. On the periphery of language, anything learnable is possible. Thus, knowledge of language is seen to be organized in different layers from the practically invariant core to the extreme periphery, where languages naturally differ a lot. Plasticity increases towards the periphery, but at no level is there unlimited choice. For the language learner, core grammar is perhaps relatively easy to acquire; it is believed to be deeply entrenched in human biology. Language learning, in this view, is the fixing of the parameters of core grammar, plus the addition of marked rules up to the periphery. The theory of core grammar is the first step in the direction of a theory of markedness in syntax, and it naturally leads to the rigorous idealizations that are the essence of an explanatory science, It is always a welcome development if we can find ways to make principled distinctions within the chaos of available facts. Not all data are created equal! Just as some facts have a bearing on
-
62
-
sentence grammar, and others on discourse grammar, it has to be determined to what extent a given body of evidence is relevant to core grammar. If we adopt this approach, the theory of islands which started with Chomsky (1964) and Ross (1967) is going to look completely different. A given phrase may be an island of core grammar, but not necessarily an island in the peripheral sense. A good illustration of this is Wh-movement. Chomsky (1977b) persuasively argues that the following sentences are quite different in character: (46)a. b.
WhCK did Mary meet
e^ ?
WhQj did you tell Mary that she should meet
e^ ?
The first case (46a), involves Wh-movement within a clause; the second case (46b), extraction from a clause. The first Wh-movement is quite free, and not subject to lexical idiosyncrasies. Extraction (46b) is not free in this sense. As briefly mentioned in 2.3.1., it is only possible over a "bridge" (in the sense of Erteschik 1973), i.e. over specific lexical configurations (cf. extraposition phenomena), with the minimal condition that the clause involved is a complement of a specific class of verbs. A second difference is that clause-bound Wh-movement is, as far as we know, much more universal than extraction. Several languages only have the former sort of rule, while extraction is impossible. A natural way to account for these facts is to hypothesize that a full clause is an island in core grammar, and that extraction is a more peripheral phenomenon. In general, we may hope that rules show their marked character by their cost, i.e. they are less general across languages, and - within the languages in which they occur -
- 63 they are subject to lexical idiosyncrasies, and responsible for variance in judgment about the sentences they produce. However, a caveat is in order here. Markedness is a theoretical concept that refers to the evaluation metric. Since grammar interacts with other cognitive systems, there is no a priori necessity for a marked rule to produce "marked" data. Global complexity of data is a fact of performance that can be caused by grammar, but also by other factors . ^ Another illustration of markedness can be found in Emonds1 distinction between root and non-root sentences. Emonds (1976) has postulated a strict difference between rules that apply only to the root (roughly the top S), and rules that apply to all causes. Hooper and Thompson (1973) and others have pointed out that root phenomena also occur in embedded clauses under certain conditions. 46. These data are usually taken as a refutation of Emonds' framework. But under the idealization of core grammar this refutation does not follow. It even seems misconceived, because the peripheral character of embedded root phenomena is immediately clear as soon as one considers their cost. While root phenomena appear freely in real root sentences, they only appear under very limited, lexically governed conditions in embedded clauses. Root phenomena are possible in the complements of certain verbs. This class of verbs overlaps with the class of verbs that allow extraction, and root phenomena are impossible in NP or PP complements. In the following examples topicalization of the object (each part) applies freely in the root (47a), and in complements of verbs of saying (47b), but not in the complements of certain factives, and complements of nouns and prepositions (47c-e):47
- 64 (47)a. b.
Each part he has examined carefully He explained that each part he had examined carefully
c.
*I resent that each part he has examined carefully
d.
*Bill's claim that each part he had examined carefully, is clearly false
e.
*After each part he had examined carefully, he left
Another piece of evidence for the peripheral nature of embedded root phenomena is the obvious variation of judgment in many cases. But the most striking evidence is that there are languages, like Dutch and German, where root phenomena never occur in embedded clauses. In these languages, the distinction between root and non-root sentences is very strict. A reasonable assumption at this point is that in core grammar, full clauses are islands and root sentences differ strictly from non-root sentences. Both extraction from a clause and embedded root phenomena are peripheral. The theory of markedness is, of course, still in its infancy. But the differentiation of data that it entails is a necessary precondition for progress. Abstract principles like the Bounding Condition and the Locality Principle are immediately falsified if they are confronted with undifferentiated data of various languages. Valid or not, these principles can only be evaluated against the background of a theory of markedness. 2.5.2.
Coindexing: the core rule of syntax
Before discussing the Bounding Condition, I would like to give a more explicit account of the coindexing rule that it presupposes. Coindexing as conceived here, is a universal rule, supposed to be
- 65 the core rule of syntax in all languages. It crucially involves the notion c-command (cf. Reinhart 1976) and the notion consequent. C-command can be defined as follows: (48)
C-command A node a c-commands a node -y,iff the first branching node dominating a, dominates y,
and a does not dominate y
We assume that coindexing affects a pair (a, y) where a is the antecedent, and y the consequent. Any node of type X 1 (3>i>0) can be an antecedent, while the class of consequents may be defined as follows: (49)
Consequent A node y is a consequent if: (1) it does not dominate lexical material, (2) it is an anaphor, (3) it is a (non-interrogative) Wh-phrase
This class presumably has to be extended, but it suffices for present purposes. We assume furthermore that all nodes dominating lexical material receive a different index j (>0) in the base, except the class of consequents. Consequents are assigned a derived index
(50)
by the following rule:^®
Coindexing X^ C1 3
...
Y1
X^ J
...
_Y^
1
(where X c-commands Y, and X and Y are both aN and 3V)
49
-
66
-
This rule assigns an index to consequents, which have no basegenerated index. Since more than one X may c-command Y, the closest possible X must generally be selected. This antecedent selection is determined by the Locality Principle, to be discussed in chapter 3. The coindexing rule includes both "move a" and "coindex" in the sense of Chomsky (1978); in all of the following cases the derived index j has been assigned by rule (50): (51)a.
Whatj did you see
e^ ?
b.
The manj who^ did it escaped
c.
The book, was written
d.
Maryj seems
e^
to write poems
e.
Maryj tries
e^
to write poems
e.
by Bill
The coindexing rule applies to all categories of type X x , thus also to Vs as in the case of "Verb Second" in Dutch, where the consequent is a V in final position (cf. Koster 1975, 1978b): (52)
Peter leestj het boek
e^
Peter reads
e
the book
Furthermore, the rule applies to anaphors like reflexives and reciprocals : (53)a. b.
Peterj saw a picture of himselfj The men. saw each other. : 2
Apart from base-coindexing and coindexing by (50), there are perhaps a few special forms of coindexing. For instance, in sentences like this is a picture of myself, we may assume a special index, s,
- 67 (speaker) for myself. Let us assume that such indices are assigned to the appropriate lexical items when no antecedent is available. Another case of non-standard indexing is the arb (arbitrary) interpretation which Chomsky (1978) assumes for the embedded subject (e) of sentences like: (54)a.
Bill knows how e
e
b.
It is easy
c.
It is a nuisance
to solve these problems
to shave oneself e
to have left it
Bill can be the antecendent of e in (54a), but there is also an interpretation possible in which Bill knows how the problems can be solved in general. In the other two sentences (54b and c) there is no suitable antecedent at all. In chapter 3, it will be argued that in these cases e has a generic interpretation, and for present purposes we may assume that in the domain of certain lexical items a special index 2 (of generic) is assigned to the closest possible e. These generic interpretations are only assigned to NPs immediately dominated by a V 1 , i.e. to subjects and objects. Thus we have the following paradigm: (55)a. b.
Jesus saves *
e
e^
saves us
c.
It is difficult
e
to smoke cigars
d.
It is difficult
for us to smoke
e
Object interpretation is more idiosyncratic in such cases, as in the case of most idioms (pay heed, break the ice, etc.). Sentence (55b) is excluded by the NIC, because e
is nominative in this case,
-
68
-
and free in its S. More about these constructions in chapter 3. Thus, indices can be assigned in a number of ways: by the base in the case of most lexical items, by rule (50) for consequents that have an antecedent, and by special assignment elsewhere. If no assignment rule applies, a category remains without an index. This results in an ill-formed structure, to be excluded by the following filter: *[X 1 ]
(56)
This filter stipulates that only those X 1 s are well-formed that have an index j or j, or s (speaker), h, (hearer: e.g. this is a picture of yourself^), or 2 (generic). One of the effects of (56) is that sentences containing anaphors but no antecedents are starred: (57)
*This is a picture of himself
Filter (56) considerably reduces the unwanted effects of overgeneration in a system of optional base rules: unexpanded nodes are rejected unless they have been assigned an index by a rule. But overgeneration is still massive, because rule (50) often assign an index without grammatical results. Almost all unwanted forms of overgeneration that are not rejected by (56) are eliminated by the Bounding Condition. 2.6.
The Bounding Condition
2.6.1.
Introduction
One of the many fortunate consequences of trace theory is that it makes it possible to formulate conditions on null-anaphors in-
- 69 stead of conditions on rules. This has led to more adequate conditions in some cases, as in the replacement of the tensed S condition (PIC) by the NIC. The approach of conditions on anaphors has been explored in Chomsky (1978). A similar approach was introduced in Koster (1977b) with respect to the distribution of null anaphors in general.^ The basic idea is that the difference between (nonstylistic) movement rules and rules of construal is not a difference in the conditions on the rules that are involved (Subjacency), but a difference in the distribution of unexpanded nodes (e) on the one hand, and fully lexicalized anaphors on the other hand. Null anaphors have a more limited distribution, not only as traces, but also as PROs and unexpanded Vs in the case of Gapping. With a few marked exceptions, null anaphors have to have their antecedent within the same phrase (Xn), which is expressed by the Bounding Condition: (58)
THE BOUNDING CONDITION
      γ cannot be free in β (= X^n) in:
      [β ...[γ e]...]

The Bounding Condition is meant as an alternative for those instances of Subjacency that do not (like the Wh-island phenomena) follow from the Locality Principle. It is a condition on all null anaphors (trace, PRO, etc.) and has to be interpreted like the NIC: a null anaphor, γ, is free in β if it is not bound in β, i.e. if there is no proper antecedent in β; β does not have to be minimal, and γ is bound in β if it is bound in a proper subphrase, β' (= X^n), of β (cf. (101) below). A significant difference between the Bounding
- 70 Condition and Subjacency is that the Bounding Condition does not refer to an arbitrary selection of nodes like S and NP, but to all major phrase nodes of type X n . In this sense, it seems more elegant. It also differs from Subjacency in its empirical predictions. But let us first consider the cases where Subjacency and the Bounding Condition are descriptively equivalent. 2.6.2.
The Complex NP Constraint
The following are standard instances of the Complex NP Constraint (CNPC):

(59)a. *Who_i do you know [NP the man_j [S̄ e_j that [S e_j saw e_i]]]
    b. *What_i [S did Bill make [NP the claim [S̄ e_i that [S Mary saw e_i]]]]
Both sentences are explained by Subjacency, because both have two cyclic nodes between the sentence-initial Wh-phrase and its closest trace. Both cases are explained by the Bounding Condition, because in (59a) e ± is free in S (= V n ) and in NP (= N n ). In (59b), e ± is bound in S (it has an antecedent in COMP), but free in NP, which is forbidden by the Bounding Condition. That (59a) is far worse might be caused by the fact that it involves two violations of the Bounding Condition.^1 As for the CNPC, there is no empirical difference between Subjacency and the Bounding Condition. 2.6.3.
The Subject Condition
Again, there is no difference with respect to the standard cases:
-
(60)a. b.
71
-
aWhcK [ g did [ N p a picture of
e^ ] disturb you]
*WhcK did he say [g e^ that [ g [ N p a picture of e^ ] disappeared]]
Subjacency explains the ungrammaticality of these sentences by the fact that the Wh-phrases cannot pass two bounding nodes, NP and S. The Bounding Condition explains these cases by the fact that e_i is free in an NP in both cases. It is in constructions like these that we see the first advantage of the Bounding Condition, because subject phrases are also islands with respect to control. Recall the following case (cf. (27)):
*It is a nuisance for us^ [g for [ g [ N p pictures of
e^]
to be on sale]] The ungrammaticality of this sentence cannot be explained by Subjacency because there is no movement in this case. The Bounding Condition explains the ungrammaticality of (61) in exactly the same way as the ungrammaticality of the sentences in (60) : e^ is free in its NP. 2.6.4.
Phrases as islands
As it stands the Bounding Condition is descriptively superior to Subjacency. The major evidence for Subjacency, the CNPC and the Subject Condition, also follows from the Bounding Condition, and the island character of subject phrases with respect to control (cf.(61)) is explained by the Bounding Condition, but not by Subjacency. The greatest advantages of the Bounding Condition follow from the general island character of phrases of type X n . Bach and Horn (1976) and
- 72 Horn (1977) have made a case for the island character of NP, but the full generality of the phenomenon has been missed for a number of reasons. The first reason is that VP was often seen as the maximal projection of V. Recent developments of X-bar theory have made it feasible that S is the V n .^ 2 A second important reason is that early transformational theories naturally reflected some of the idiosyncrasies of English. Van Riemsdijk (1978) has persuasively argued that English shows a very exceptional behavior of the category PP. PPs are islands in most languages, and preposition stranding is a highly marked phenomenon. This brings us to the third reason for missing the island character of phrases in general: the negligence of the theory of markedness. Where extraction phenomena occur at all, their marked character can hardly be overlooked. The relevant facts, together with some remaining problems, can best be assessed by considering the various phrase nodes one by one. 2.6.4.1.
NP as a bounding node
Bach and Horn (1976) observed that extraction from NPs is generally impossible: (62)a.
*Who did a book of disappear?
(Subject Condition)
b.
*What did they discuss the claim that Mary saw?
c.
*Who did they destroy a book about?
d.
*About whom did they destroy a book?
e.
*What did Einstein attack a theory about?
f.
*About what did Einstein attack a theory?
g.
*Which city did Jack search for a road into?
h.
*Into which city did Jack search for a road?
(CNPC)
- 73 Bach and Horn sought to explain these facts by the NP Constraint: (63)
The NP Constraint No constituent that is dominated by NP can be moved or deleted from that NP by a transformational rule
Chomsky (1977b) rightly observes that these facts are also explained by Subjacency (with S cyclic), because in all cases the Wh-phrase has been moved out of an S and an NP. Furthermore, Chomsky argues that the NP Constraint is falsified by examples like: (64)a. b.
[A review
e^] was published of Bill's book^
Of the students in the class, [several 3
e.l failed the exam
~3
*
Note that the NP Constraint is explained by the Bounding Condition, as far as it concerns NPs containing an e (which is the case in all sentences of (62)). Sentence (64a) does not falsify the Bounding Condition if our earlier conclusions about extraposition phenomena are correct. Thus, we assume the following model, which is in accordance with Chomsky (1978) on the place of extraposition rules and conditions on anaphors like the Bounding Condition: (65)
extraposition
1.
Base rules
2.
Coindexing conditions on anaphors
Thus, (64a) falsifies the NP Constraint as a limited empirical generalization in a theory with undifferentiated scope. It does not falsify the Bounding Condition, if (65) is correct. At the moment
- 74 when extraposition rules apply, the Bounding Condition (on the right side of (65)) is "invisible". Sentence (64b) is a typical root structure, perhaps comparable to sentences like: As for the students, several passed the exam. It is also possible that (64b) is comparable to certain constructions in Dutch, to be discussed below (cf. (77b)). Bach and Horn (op.cit.) assume that Wh-movement of PP complements is only possible if they are base-generated under VP (66a), rather than inside an NP (65b): (66)a.
They [ v p wrote [ N p a book] [ p p about Nixon]]
b.
They [ v p wrote t N p a book about Nixon]]
Their argumentation for this difference in structure is correct, but it does not apply for another class of examples, as pointed out by Chomsky (1977b). Bach and Horn justified (66a) with examples like: (67)
He wrote it about Nixon
Since it cannot have a'PP complement, (66a) seems correct. The independent existence of structures like (66a) makes it possible to apply Wh-movement to the about-phrase without violating the NP Constraint. Chomsky (1977b) pointed out that a similar argument does not hold for sentences like (68): (68)
Who did you see a picture of?
In such cases (with verbs like see, find, etc.) there is no indepen-
- 75 dent evidence for a structure like (66a) - or so it seems at least because there is no analogue of (67): (69)
*He saw it of Nixon
Chomsky therefore assumes a source inside the NP for the PP complement in cases like (68). But, as we briefly mentioned in 2.3.2., this hypothesis undermines the explanation of the Subject Condition in terms of Subjacency, unless we further assume the rather obscure readjustment rule that dissociates PP complements from object NPs. It seems to me that the ungrammaticality of (69) is not a compelling argument against structures like (66a) for sentences like (68) or (69) . The base rules are such that both the following structures are possible: (70)a. b.
They [ v p wrote [ N p a book] [ p p about Nixon]] They [
saw
[ N p a picture] [ p p of Nixon]]
The difference between these two sentences is not necessarily a difference in base structure. There is another way to account for the difference, that also explains why (67) is grammatical in contrast with (69) . There is clearly a difference in the way the PP is related to the rest of the sentence in (70a) and (70b), respectively. The about-phrase in (70a) seems related to the verb, or the combination V + NP, while the of-phrase in (70b) seems exclusively related to the object NP. The object in (70a) can be omitted, in (70b) it is obligatorily present:
- 76 (71)a. b.
They wrote about Nixon *They saw of Nixon
Although such contrasts can be captured by different base structures, I would like to explore an alternative that accepts both (70a) and (70b) as deep structures, and that avoids recourse to undesirable readjustment rules. This alternative makes it possible to relate the PP of Nixon in (70b) to the object NP in a non-transformational way. The alternative hypothesis is based on an independently motivated discourse rule, first discussed by GuSron (1976), that can be stated as follows (72)
Linking to focus Interpret PP^ as the complement of NP.^ ( [ N p NP i PP^]) in: ...PP....NP....PP . 1 D 3 [FOCUS] where NP^ is immediately dominated by V v
(3>k>l)
A possible objection is that the readjustment rule that dissociates of Nixon in sentences like they saw a picture of Nixon, makes (72) superfluous. But, again, the readjustment rule is rather obscure, and cannot be identified with extraposition, so that is has no obvious advantages over (72). What is more important is that (72) is independently motivated as a discourse rule. Consider the following dialogue: (73)
A:
A book appeared:
B:
About relational grammar?
A:
No, about Rasing again
- 77 The about-phrases in this discourse have to be related to the focus in the first sentence (a book). There is no reason to block application of (72) within sentences like: a book appeared about relational grammar. Similar dialogues can be constructed for of-phrases and objects: (74)
A:
He saw a picture!
B:
Of Bill?
A:
No, of Nixon again
Again, these facts can be accounted for by (72) , while there is no obvious way to relate both of-phrases to a picture by an extraposition or readjustment rule. Similarly, rule (72) relates the PP to the object in sentences like they saw a picture of Nixon. We can now explain why (69) is ungrammatical: (69)
*He saw it of Nixon
The PP of Nixon is uninterpretable in this case. Rule (72) does not apply, because it, being an unstressed pronoun, is not a proper focus. More generally, PPs like of Nixon cannot be interpreted as complements of pronouns. Rule (72) also accounts for many properties of extraposition rules. Recall, for instance, Akmajian's data, discussed in 2.3.3.: (75)a.
A review of a book by three authors appeared last year
b.
A review of a book appeared last year by three authors
Akmajian observed that (75b) differs from (75a) in that by three authors can only be related to the topmost NP (review). This is pre-
- 78 dieted by (72) , because only the topmost NP is a focus dominated by k n-1 a V
(S = V
in this case). Most - if not all - peculiarities of
extraposition that were attributed to Subjacency can be explained in this way. That the discourse rule (72) is involved can be illustrated again by a dialogue: (76)
A:
A review of a book appeared
B:
By Bill again?
A:
No, by Jill this time
As in (75b), the by-phrases can only be linked to review of a book, and not to a book. So far, the issue between Subjacency and the Bounding Condition (and the NP Constraint) is undecided. All three explain the ungrammatically of the sentences in (62), and in cases where extraction from PP is possible, it can plausibly be argued that the PP is freely generated under VP, as first discussed by Bach and Horn (1976). For many cases they already gave the correct argument, and the problematic cases fall into place if we assume the rule of Linking to Focus. For crucial evidence, distinguishing between Subjacency and the Bounding Condition, we have to go to other languages, like Dutch. Dutch has NP-internal PP complements in the same places as English, but for the cases for which Chomsky postulated the readjustment rule, there is also the possibility that the PPs precede the object NP:
- 79 (77)a.
Hij heeft een foto He
has
a
van Bill gezien
picture of
Bill seen
(He saw a picture of Bill) b.
Hij heeft van Bill een foto He
has
of
Bill a
gezien
picture seen
(Of Bill he saw a picture) The PP van Bill in (77b) is not topicalized as in certain English constructions of this type, but it is in a VP-internal position where all kinds of PPs can be freely generated. As in English, rule (72) can relate the PP to the object. The focus NP has to be k directly dominated by a V
(VP in this case). Thus, linking to NPs
contained in other NPs or PPs is impossible (78a ungrammatical under the reading which relates the PP to the NP-internal NP een boek): (78)a.
*Hij heeft van Bill [een recensie van een boek] gelezen He
has
of
Bill
a
review
of
a
book
read
(He read a review of a book by Bill) b.
*Hij heeft van Bill [met He
has
of
Bill
een broer]
with a
brother
gesproken spoken
(He spoke with a brother of Bill's) Subjacency and the Bounding Condition explain that van Bill cannot be extracted from the PP complement position within the NP een boek in (78a), because this would imply an unbound e in an NP (Bounding Condition), or extraction over two NP boundaries (Subjacency). The Bounding Condition gives a similar explanation for the case that van Bill is extracted from the PP position in the object NP (een broer) in (78b). Subjacency can only explain (78b) by assuming that
-
80
-
PP is also a cyclic node: (79)
*Hij heeft van Bill.^ [ p p met [ N p een broer He
has
of
Bill
with
a
e^]] gesproken
brother e
spoken
(He spoke with a brother of Bill's) In general, only the few PPs that can be linked to a focus by rule (72) can, as complements, precede their NP in Dutch. This analysis presupposes free base-generation of the PPs in question. This is also the class of PPs that can be fronted by Wh-movement. An alternative analysis, assuming VP-internal extraction of PP complements of NPs, would be inconsistent with the Bounding Condition, but not with Subjacency, for the cases that only one NP is involved: (80)
...[vp ...PP ± ..[ Np N
e.]...]... I
t
This is clearly a test case, because Subjacency does not block movement out of one NP, while the Bounding Condition rejects (80) (free e in an NP). It appears that the Bounding Condition makes the correct prediction here, because PP complements can never be extracted in the way indicated in (80): (81)a.
Wij hebben [de We
have
reis naar Hawaii] geannuleerd
the trip to
Hawaii
cancelled
(We have cancelled the trip to Hawaii) b.
*Wij hebben naar Hawaii^ [de We
have
to
Hawaii
reis
the trip
e k ] geannuleerd cancelled
-
(82)a.
81
-
Zij heeft [een gesprek met She has
a
talk
Peter] afgeluisterd
with Peter
eavesdropped
(She eavesdropped on a talk with Peter) b.
*Zij heeft met She has
(83)a.
Peter^ [een gesprek
with Peter
a
talk
Nixon heeft [een boek over Nixon has
a
e^] afgeluisterd e
eavesdropped
Kissinger] gestolen
book about Kissinger
stolen
(Nixon stole a book about Kissinger) b.
*Nixon heeft over Nixon has
(84)a.
Kissinger^ [een boek
about Kissinger
Bill heeft [aardewerk uit Bill has
pottery
a
e^] gestolen
book
stolen
Azie] vernietigd
from Asia
destroyed
(Bill destroyed pottery from Asia) b.
*Bill heeft uit Bill has
(85)a.
from Asia
Mary heeft [de Mary has
Azie^ [aardewerk
brief
pottery
e^] vernietigd e
destroyed
aan John] weggegooid
the letter to
John
thrown away
(Mary threw away the letter to John) b.
Mary heeft aan John^ [de Mary has
to
John
brief
e^] weggegooid
the letter
thrown away
There is no obvious independent reason for the ungrammaticality of the b-sentences in (81) — (85) . The "landing site" is a normal PP position that c-commands the trace (see (80)). So, there is no violation of structure preservingness or proper binding conditions. In conclusion, we may say that Subjacency makes the wrong predictions because it requires one bounding node too many. Apart from some very marginal cases, NPs are islands. 54
-
2.6.4.2.
82
-
AP as a bounding node
AP is the least well-studied category at present. I will therefore limit myself to a few sketchy remarks, that, no doubt, will have to be revised when more detailed studies about the AP become available. As for attributive APs (in prenominal position), Dutch has a richness of structure that has no counterpart in English, and that makes it possible to observe some island phenomena that involve the node AP. Consider the following example (86)
Wij hebben medelijden met t N pt A p op Mary verliefde] [jongens]] We
have
sympathy
with
with Mary in love
boys
(We feel sorry for boys that are in love with Mary) The PP op Mary cannot be extracted in any way: (87)
*0p Mary^ hebben wij medelijden met
^nP^AP
—k
ver
li e fde]
[jongens]] This fact is, of course, already explained by the bounding nature of the NP. Subjacency can explain (87) by including S
in the class of
cyclic nodes. If participles are adjectives, very similar facts can be observed in English, where participles are post-nominal:
(88)a. b.
All books written by Nixon are sold out *By Nixon^, all books written
e^
are sold out
The general prohibition against extraction from attributive APs is explained by the Bounding Condition.
- 83 More crucial examples involve extractions from single APs, not contained by NPs. Predicative APs provide the relevant evidence, in spite of the following complication: (89)a.
Bill is fond of Mary
b.
Who.1 is Bill fond of
e. —l ?
Superficially, (89b) seems to be a case of extraction from the AP [ A p fond of who]. But it seems to me that sentences like (89a) can have two different structures: (90)a.
Bill [ v p is [ A p fond [ p p of Mary]]]
b.
Bill [ v p is [ A p fond] [ p p of Mary]]
The second structure (90b) is not necessarily derived from the first (90a), because both are possible base structures. A copula and a following adjective form a tight semantic unit, and the PP in (90b) can be interpreted as a modifier of this unit. Under the assumption that (90b) is a plausible base structure, we can hypothesize that Wh-movements like (89b) are only possible from structures like (90b), so that the Wh-phrase is extracted from the PP, and not from the AP. Extraction from AP is impossible in structures where verb and adjective do not form the almost verb-like unit of (90b): (91)a.
Bill looked at her,
full of excitement
b.
*Whatj did Bill look at her,
full of
c.
*Of what i did Bill look at her, full
e^ ? ^
?
One might argue that these examples are explained, if AP is a cyclic node; just as in the case of NPs, examples like (91b) and (91c)
- 84 would involve two cyclic nodes, AP and S: (92)
[g Whatj [ s did Bill look at her, [ftp full of
e ± ]]]
But as in the case of NPs, there is evidence from Dutch that extraction of PPs from APs is also impossible within a VP. Thus, we have the following situation again (cf. (80)): (93)
...[vp ...PP^^ ...tAp A
e.]...]...
PPs can precede the AP to which they are related, if the AP forms a unit with the verb in the sense of (90b); both the following are possible: (94)a.
Bill is verliefd op Bill is in love
haar geweest
with her
been
(Bill has been in love with her) b.
Bill is op
haar verliefd geweest
Bill is with her
in love
been
But in Dutch counterparts of sentences like (91a), the PP cannot precede the AP: (95)a.
Bill heeft [ziek van opwinding]
naar haar gekeken
Bill has
to
sick of
excitement
her
looked
(Bill looked at her, sick of excitement) b.
*Bill heeft van opwinding^ [ziek Bill has
of
excitement
sick
e^] naar haar gekeken e
to
her
looked
- 85 The Bounding Condition predicts the impossibility of extractions of this type, because it prohibits a free e in an AP (= A n ) (see (93)). Subjacency does not block movements like the one indicated in (93) , and has therefore to be supplemented with unknown principles . As fas as we can tell, AP is a bounding node like all other phrases of type X n . 2.6.4.3.
VP as a bounding node
Up until recently, the node VP (= V, or V) was seen as the maximal projection of the category V. This view led to a curious asymmetry in linguistic theory. While other nodes like the NP in English, or the PP in most languages, showed an abundance of island-like phenomena, the VP seemed a completely open category, which never had a clear role in any system of conditions. Recent studies in X-bar theory have effectively put an end to the unexplained exceptional status of the category VP. By making explicit certain features of the sketch of X-bar theory in Chomsky (1970) , it could plausibly be argued that not VP, but S is the maximal projection of V.
This valuable insight makes it possible to construct
a simpler and more elegant system of conditions on rules. The fact that this appears to be possible, can be seen as independent evidence for the insight of the X-bar theoreticians. Thus, in the following pages, we will consider S the X n for the category V. There are two apparent violations of the Bounding Condition, one involving Wh-movement, and the other involving "move NP" and control.
-
2.6.4.3.1.
86
-
Wh-movement
Although I will occasionally use the term Wh-movement for ease of exposition, I will in fact assume that the base rules provide a sentence-initial position where only designated elements of type [+wh] can be generated. In this chapter, I will take the following rule for granted (S = V n , S = V n _ 1 , C = complementizer):57 (96)
S
Xn C [+wh]
S
What is called "Wh-movement" simply is an instance of the coindexing rule discussed in 2.5.2. As far as I know, this way of generating Wh-phrases is empirically indistinguishable from the seemingly more complicated procedure according to which Wh-phrases are first generated elsewhere, and subsequently fronted by the rule of Whmovement. Under the assumptions of trace theory and the general schema "move a" the outputs of movement have to conform to the possible configurations for antecedent-anaphor relations (c-command). Since the mechanism that checks the outputs of movement is characterized by the same conditions (c-command, Locality Principle) as the coindexing rule for antecedent-anaphor pairs, there seems to be no reason to make the distinction between (non-stylistic) movement and coindexing in the first place. Rule (96) stipulates that each clause is obligatorily introduced by a Wh-phrase (which may be left unlexicalized). This Whphrase is assigned an index by the base in the case of interrogatives, and by the coindexing rule (cf. 2.5.2.) elsewhere. Clauses introduced by Wh-phrases without an index are filtered out by our general convention against unindexed categories (see (56)) of
- 87 2.5.2.). According to Chomsky and Lasnik (1977), filters apply after rules like the general free deletion in COMP, which may also delete Wh-phrases without an index in sentence-initial position. In this way, the grammar accepts clauses that do not require a Wh-phrase for the purpose of binding. Thus, the general filter against categories without an index does not apply when an irrelevant Wh-phrase has 58 been freely deleted in COMP. As I will show in chapter 3, the coindexing from a Wh-phrase to an e is successive cyclic as a consequence of the obligatory sentence-initial Wh-position in conjunction with the Locality Principle. Neither successive cyclicity, nor a great deal of the Whisland phenomena follow from the Bounding Condition. This will be clear when we consider Wh-movement in somewhat more detail. Extraction of Wh-phrases is inconsistent with the unqualified Bounding Condition: (97)
O
Whatj1 do you think tOF 2
The rightmost e^ is bound in
e.1 that [o c Bill saw [+wh]
e. —1 ]]]
because it has an antecendent in
the COMP position of S 2 . But this e^ in COMP is free in S 2 , which is a violation of the Bounding Condition. We therefore have to qualify the Bounding Condition as follows ((58) is the Bounding Condition): (98)
(58)
... unless y is [+wh]
where we assume by convention that the domain (6) of this stipulation is S, because [+wh] is part of the base expansion of S (see (96)). Thus, (98) implies that [+wh]-null elements can only be free
-
88
-
in S (in COMP position), and not in NP (cf. the CNPC) or any other category. With (98) added to the Bounding Condition, sentence (97) is accepted, because the e^ that introduces S2 is assigned the feature [+wh] by base rule (96) . Qualification (98) is a language-specific auxiliary hypothesis to the universal Bounding Condition. It has to be added to the grammar of those languages that allow extraction of Wh-phrases. Since it is language-specific, it is not part of core grammar, and can only be incorporated in the grammar of specific languages at a certain cost. This seems to be a correct conclusion, because there are languages that do not permit extraction of Wh-phrases at all, and others, like English and Dutch, only allow it under idiosyncratic "bridge" conditions (see 2.3.1.). In general, extraction of Wh-phrases from an S produces less acceptable sentences, with uncertainties in judgment that never occur in the case of clause-internal Wh-movement. The incorporation of (98) in the grammar of certain languages causes that many Wh-island phenomena in these languages do not follow from the Bounding Condition. This can be illustrated with examples like the following: (99)
*
What. do you know 1 that Peter saw
^
t>2
who. [ c e. said [ 3 ¡3 J 3
e.1
]]]]
This ungrammatical sentence is accepted if we qualify the Bounding Condition with (98) : the rightmost e^ is bound by the e^ in the COMP position of S^; this e^ itself is [+wh], and can therefore be free in S- according to (98); e. is bound in S_, so, all e's are
-
89
-
accepted by the qualified Bounding Condition. We need the Locality Principle of chapter 3 to block sentences like (99) (what^ cannot be coindexed with the e^ in the COMP of S ^, because another Whphrase intervenes). For similar reasons, successive cyclicity does not follow from the Bounding Condition: (100)
[g
Wh-phrasei [ g ... [g
e [ g ... [g
e±
...e±...]]]]]]
**
t
^
We can skip S 2 in this structure, when we coindex the Wh-phrase in S^ with e^ in the COMP of S^. The resulting structure would be acceptable according to the (qualified) Bounding Condition: the rightmost
e^ is bound by the e^ in the COMP of S^, which is itself
accepted by (98); the e in S^ can be deleted by the rule of free deletion in COMP in the sense of Chomsky and Lasnik (1977). Again we need another principle, the Locality Principle, to guarantee 59 "successive cyclic" application of the coindexing rule.
It is not
possible to define "free in g" in our formulation of the Bounding Condition in such a way that e^ is bound in S^, but not in S 2 of (100) . In general, an e is bound in S if it is bound in a proper sub-S (see 2.6.1.): (101)
[k O ^ I wonder
c Bill saw O 2 what,1 [ > 3
e.1 ]]]
This sentence is acceptable, because e^ is bound in S2. It is also bound in S ^ because S 2 is a proper sub-S of S ^ Since the case of extraction of Wh-phrases from S is the only case where an e in a tensed S is coindexed with a phrase external to that tensed S, (98) is sufficient to reconcile apparently un-
- 90 bounded forms of binding with the Bounding Condition. 2.6.4.3.2.
Infinitives
Infinitival constructions that involve "move NP" or control form a second apparent violation of the Bounding Condition: (102)a. b.
Mary^ seems [g
e^
to read]
Maryjl tries [k s> e. —3
to read]
These structures are in conflict with the Bounding Condition, because in both cases we have a free e in S (= V n ). There are some proposals in the literature about infinitives that are consistent with the Bounding Condition. These so-called VP-analyses imply that infinitives are not full clauses, but subjectless VPs.**1 But these VP-analyses cannot be correct
for three reasons. First of all, it
would be a considerable extension of the class of possible grammars, if it were possible to generate complements of type X n
1
besides
the usual complements of type X n . It has sometimes been observed that grammars show a gap by not having VP complements next to NP, a2 AP, and PP complements.
But this is exactly what we expect if S
rather than VP is the maximal projection of V. if only categories of type X n are possible as complements, there is no longer any reason to be surprised about the fact that there are S complements and no VP complements. A second argument against a VP analysis of infinitives is the existence of infinitival relatives and questions: (103)a. b.
I found a topic on which to write my term paper It is not clear what to do
- 91 Chomsky (1978) has persuasively argued that it is an unnecessary complication of the grammar of English to expand S as COMP+VP where we also have to expand it as COMP+S. A third argument against VP-analyses is that in many languages infinitival clauses are introduced by a complementizer. Ozark Eng63 lish has sentences like (104): (104)
I want for to go
In Dutch, infinitives are often introduced by an optional complementizer om: (105)
Zij probeerde (om) president te worden She tried
C
president to become
(She tried to become president) These arguments make it very unlikely that infinitival clauses are in fact VP complements. An S with optional subject, as suggested by Jackendoff (1977), is at first sight less problematic. But Quicoli (1976) and Kayne (1975) have shown that the distribution of clitics in Romance languages can be explained by the SSC if we assume that infinitives have a subject. All in all, there seem to be very good reasons to assume the following structure for infinitival complements : (106)
..,NPi...[g
[ N p e] i
to
VP ]
Since the e-subject of the infinitive is not bound in S, but eventually coindexed with an external controller, (106) violates the Bounding Condition. Apparently, 6 in the formulation of the Bounding
- 92 Condition has to be interpreted as "full clause" or "tensed S", where infinitives do not qualify as such. We can express this by adding a second auxiliary hypothesis to the Bounding Condition (58): (107)
(58) unless:
(i)
(98)
(ii)
3 is [-tense]
With this qualification, all cases of control and "move NP" over clause boundaries are accepted (cf. 102), including cases like (108), where the subject can have a generic interpretation (arb in the sense of Chomsky (1978)) which is not assigned by a controller: (108)
It is impossible [e^
to walk a mile in five minutes]
This sentence is not excluded by the Bounding Condition if it incorporates (107 (ii)). It also shows that the Bounding Condition cannot be a condition on rules - more specifically a condition on the coindexing rule - because (108) does not involve a rule that links the e-subject of the infinitive to an external controller. I will return to cases like (108) in chapter 3. For all cases that involve an external NP as the antecedent, the Locality Principle determines that only the subject of an infinitival clause is available as a consequent. The selection of the antecedent is in general governed by the Minimal Distance Principle, which is again an instance of the Locality Principle discussed in chapter 3. With (107 ii)) added to the Bounding Condition, it overlaps to a certain extent with the NIC in cases like the following:
- 93 (109)
*Johni was believed [(that)
e^
saw Peter]
This sentence is blocked by the Bounding Condition, because e^ is free in the embedded clause. But the NIC also blocks (109), because — 64 e^ is nominative and free in the embedded S.
We can solve this
redundancy, rather trivially I believe, by dropping the NIC altogether and by stipulating in the lexicon that anaphors like each other and himself etc. are [-nominative], so that they cannot be inserted in nominative positions (cf. Brame 1977). In this way, (109) is only blocked by the Bounding Condition (see also note 74). The incorporation of (107 ii)) in the grammar does not seem to lead to the typically marked phenomena that have been observed in the case of extraction of Wh-phrases. I therefore assume that this second qualification to the Bounding Condition is not languagespecific, but universal. It means that in all languages that make the distinction between tensed and non-tensed sentences, only the former are interpreted as islands (f5s in the sense of the Bounding Condition) while the latter are freely accessible to the coindexing rule. Apart from the two exceptions covered by the language-specific qualification concerning Wh-phrases (98) and the universal (107 (ii)), clauses seem to be islands, in accordance with the Bounding Condition. 2.6.4.4.
PP as a bounding node
As for its island character, PP is the least problematic category of all. Apart from English and the Scandinavian languages, very few cases of extraction from PP (preposition stranding) have been reported for any of the world's languages. Van Riemsdijk (1978)
- 94 has convincingly argued that any reasonable theory of universal grammar excludes preposition stranding from its core, and only allows it as a peripheral, highly marked phenomenon. The Bounding Condition, being part of core grammar, explains this state of affairs. Nothing has to be added to the grammar of Dutch to account for the ungrammaticality of the following sentences: (110)a.
*Welke man^ heb Which man
je
[ p p over
have you
e^ ] gesproken?
about e
talked
(Which man did you talk about?) b.
*Welke partijj behoort hij [ p p tot Which party
belongs he
to
e^ ]? e
(To which party does he belong?) Extra provisions only have to be made for the marked cases, which can be observed quite systematically in English, and only for a single class of lexical items in Dutch. 2.6.4.4.1.
Preposition stranding in English
Since the topic of preposition stranding in English has extensively been discussed by Van Riemsdijk (1978), the only thing I would like to do in this section is to explain how preposition stranding relates to the Bounding Condition, and to provide some additional evidence for the marked character of extraction from PP in English. In 2.6.4.3.1., we accounted for the fact that clauses can be introduced by a Wh-phrase by a PS-rule (cf. (96)) repeated here as (111) :
- 95 (111)
S
->- X n [+wh]
C
S
Together with the stipulation (98) that [+wh]-null anaphors in COMP can be free in 6 (= X n ), this rule accounted for the possibility of extraction of Wh-phrases from S. Thus, the following sentence is acceptable, because the rightmost e.^ is bound in S 2 , while the e^ in COMP of S 2 is ruled in by (98): (112)
to 1
Who, do you think [ F 111
1
e.
O ^
1
2
— 1
that [„ Bill saw O
e. ]]] — 1
[+wh]
Suppose now that the PP has essentially the same expansion as the S in (111) in the grammar of English (cf. note 70): (113)
PP
->• (Xn) [+wh]
QP
PP
Van Riemsdijk (1978) has shown that there is prima facie evidence for Wh-positions in the PP of English in forms like: what about? where from?
who with? etc. A language like Dutch, with hardly any
preposition stranding at all, does not have these forms. Van Riemsdijk (op.cit.) has furthermore shown that the principles that determine the distribution of fronted Wh-phrases in PPs are similar to those that account for Wh-phrase distribution in clause-initial 65 position.
What is important in the present context is that we have
nothing to add to the grammar of English to account for extraction from PP if (113) is accepted. The independently motivated qualification (98), which allows free e's that are marked [+wh], suffices to account for the grammaticality of structures like the following:
- 96 (114)
What i did you talk [ p p
^
[about
e ± ]]
[+wh] Such examples are very similar to (112); the rightmost e^ is no longer free in PP, because it is bound by the PP-initial e i that has the feature [+wh]. Null anaphors with the feature [+wh] can be free in their X n , as stipulated by (98). The fact that the incorporation of (113) in the grammar of English makes it possible to account for all forms of preposition stranding is highly suggestive, and a major argument in favor of Van Riemsdijk's analysis. Rule (113), the analogue of (111), is a very simple way to extend the marked option of Wh-phrase extraction to PPs. Just as in the case of extraction from S, there is some evidence for the marked character of preposition stranding in English. PPs that contain an e have a more limited distribution than other PPs. They have to be peripheral within V, and they cannot be extraposed.^^ As for the first fact, compare (115a) with (115b) and (115c) : (115)a.
Who^ did you
[y see
a picture of
give a picture of
e^
a finishing touch]
c. ??Who, did John [— hand a picture of
e,
to Mary]
b. ??Whoj did you
Kuno (1973) has observed that stranded prepositions must be peripheral as in (115a). If the stranded preposition (i.e. the PP containing e's) is nested within V, the result is rather unacceptable (115b, c). Many phenomena of this kind can be observed, but Kuno's empirical generalization, implying that prepositions can only be
- 97 stranded in final positions, where they are not followed by material of the same phrase has to be amended because of the following facts: (116)a. b.
He saw a picture, yesterday, of Bill *Who^ did he see a picture, yesterday, of
e^
In (116b), a Wh-phrase cannot be extracted from the PP, in spite of the fact that it is in clause-final position. In general, extraction from a PP is not possible if the PP itself is in a marked position as the result of extraposition (cf. Huybregts 1976). Facts like (115b, c) and (116b) suggest that marked phenomena like e's in PPs
are only possible in positions that are "easy"
from a perceptual point of view, such as peripheral positions (without nesting) and positions that are not linked to the rest of the sentence by complicated non-core rules like extraposition. 2.6.4.4.2.
Preposition stranding in Dutch
In Dutch, preposition stranding is a much more limited phenomenon than in English. It is essentially limited to a largely morphologically definable class called R-words (cf. Van Riemsdijk 1978): (117)a.
*Welk
boek^ praatten zij [over
Which book
talked
e^ ]
they about e
(Which book did they talk about?) b.
Waarj praatten zij
[e^ over]
t+R] Where talked
they
e
about
(What did they talk about?)
- 98 R-words include items like er, daar (there), overal (everywhere), ergens (somewhere), nergens (nowhere), that are also used as local adverbs. I consider these words adverbs in all cases, also in sentences like (117b). There is no reason to assign different categorial status to different usages of the same words. For ease of exposition, we will refer to this class of adverbs by the feature [+R], following Van Riemsdijk (1978). Contrary to Van Riemsdijk (op.cit.), I assume that R-words are base-generated in the position in which they occur, on the left of the P: (118)
P
+
( [+R])
P
(NP)
(Xn)
There are no good reasons to generate R-words on the right of the P, and to have a PP-internal rule of R-movement. R-words simply do not occur on the right of the P, since they are in complementary distribution with the category NP. Of course there can be reasons to have a difference between deep and surface word order, but these differences must be motivated. In the present case the motivation is lacking. The original motivation behind the rule of R-movement was that there are PPs of the form erna (thereafter), but no corresponding PPs of the form *na het (after it) in Dutch (cf. Koster 1975). This fact could be accounted for by obligatorily transforming forms like na het to erna. But first of all, this would be an implausible transformation (as recognized by Van Riemsdijk (1978)) because of the morphological change that it would bring about. And second, these correspondences are not complete, because words like het (it) are also impossible in PPs that cannot contain R-words. Thus, with the preposition gedurende (for the duration of), we have neither gedurende het, nor er gedurende. What it all comes down to
- 99 is that the distribution of R-words is partially predictable, and partially idiosyncratic lexical fact. As usual in such cases, we can capture the regularities by a redundancy rule for subcategorization frames: (119)
[+P,
—
NP
(Prt)] => [+P, [+R]
—
(Prt)]
This rule captures the regularity that prepositions followed by an NP (and eventually by a particle) usually can also be preceded by an R-word. A preposition like na (after), for instance, has the following subcategorization in the lexicon: 6 7 (120)
na:
[+P, —
[+P, [+R] —
]
The second subcategorization frame has been added without cost, because of the redundancy rule (119). A preposition like gedurende (for the duration of), that cannot select an R-word, simply misses the second frame: (121)
gedurende:
[+P, —
{||} ]
Rule (119) accounts for the distribution of R-words insofar as it is predictable. Theoretically, it is possible to have R-words in other PPs than those containing Ps with the frame [ — NP (Prt)] in their lexicon. But PPs containing Ps with the frame [ — PP] never have an R-word preceding the P: (122)a.
Ik heb I
de
cognac
have the cognac
68
[voor [er^ for
na]]
there after
(I have bought the cognac for after it)
gekocht bought
-
b.
*Ik heb I
de
100
-
cognac [er\
have the cognac
voor [e^ na]]
there for
e
gekocht
after bought
(I have bought the cognac for after it) (123)a.
Dit
boek dateeert misschien [van
This book dates
perhaps
from
[eTj
voor ]]
there before
(This book is perhaps from before it) b.
*Dit
boek dateert misschien [er. — 3
This book dates
perhaps
van
there from
[e. voor 11 -J
e
1
before
(This book is perhaps from before it) Similarly, most Ps with the frame [ — NP
X n ] like met (with) cannot
have an R-word preceding the P : ^ (124)a.
Met
die
zaak
in orde ...
With that matter in order .., b.
*Daar
mee
in orde ...
There with in order ... If we assume a general R-position that introduces all PPs, facts like the b-sentences of (122), (123), and (124) become problematic. If we consider the distribution of R-words a matter of subcategorization (as expressed by 119), these facts are as expected. Of course, there might be a deeper explanation for the fact that Rwords are generally limited to the context specified in (119). But this eventual explanation is irrelevant for the question whether or not R-positions are a matter of subcategorization, or a part of all PPs. Crucial evidence distinguishing between the two conceptions of R-positions comes from the extraction possibilities for R-words. To account for the fact that only R-words can be extracted from
-
101
-
PPs in Dutch, we have to make a special stipulation. R-word extraction is an idiosyncratic fact of Dutch, which simply has to be learned as an exception to the Bounding Condition: (125)
Waar^ praatten zij [+R] Where talked
[ p p e^ over]
they
e
about
(What did they talk about?) In order to account for this exception to the Bounding Condition (125) has a free e in a PP), we may add a third qualification to the Bounding Condition: (126)
(58) unless:
(i)
(98)
(ii)
(107)
(iii)
y is [+R] (in Dutch)
In accordance with our convention, briefly discussed for Wh-phrases (see (98) and following comments), (126 (iii)) stipulates that Rwords may be null and free in those categories that have [+R] in their base-rule expansions (see (118)). In this case, the implication is that R-word traces may only be free in PPs that contain a P that may select an R-word as a complement. Thus, (126 (iii) correctly predicts that R-words may not be free in NPs, because NPs never contain R-positions: (127)
-Waar^ is [ N p een argument [ p p e^ tegen]] ongeldig? [+R] Where is
[+R ] an
argument
against invalid
-
102
-
The e i in the PP is [+R], and therefore acceptable according to (126 (iii)). But this e^ cannot be free in the NP, because NPs have no possible R-positions in their base-rule expansion. At this point, we see a clear advantage of the subcategorizational view of R-positions over the hypothesis of general PP-internal R-positions. The subcategorizational hypothesis predicts that R-words can only be extracted from PPs that have an R-position in their expansion. This happens to be the case (cf. 122-124): (128)a.
Dit
boek dateert misschien [van [er
This book dates
perhaps
voor ]]
from there before
(This book is perhaps from before it) b.
*Waar^ dateert dit Where dates
boek misschien [ p p van [ p p e^ voor ]]
this book perhaps
from
e
before
If all PPs are introduced by R-positions, the ungrammaticality of (128b) is unexplained. If only those PPs with a P that is subcategorized for R-words contain R-positions, there is no problem. The e^ can be free in the deepest PP in (128b), because this voor selects an R-position (see 128a). But e^ cannot be free in the PP introduced by van, because Ps that are subcategorized as [ — PP] never contain an R-position; see (123b), repeated here as (129): (129)
*Dit
boek dateert misschien [er
This book dates
perhaps
van [ e voor ]]
there from
e before
Thus, R-positions are very different from sentence-initial Wh-positions. Wh-positions are possible "escape hatches" that introduce all clauses (VPs), and that are completely independent from the subcate-
- 103 gorization characteristics of the category V. R-positions are not general in the same sense, and only occur in those PPs that have a P that is subcategorized for R-positions. This is why extraction of Wh-phrases (that are [+R]) is always impossible from PPs with a P that is subcategorized as [— PP]. The impossibility of extraction in sentences like (128b) is a general fact about the PPs in question, in no way comparable to the idiosyncratic "bridge" phenomena in the case of extraction (through COMP) from V P . ^ Since the exceptional behavior of R-words has to be stipulated as a language-particular fact of Dutch grammar (cf. (126 iii))) , we expect that its marked character can be demonstrated somehow. The evidence is similar to what we observed in the case of preposition stranding in English. PPs that contain an unlexicalized R-phrase have a more limited distribution than PPs with lexical R-phrases. In the following sentences, three different positions for PPs with (lexicalized) R-words are shown: (130)a.
Hij heef t daar mee He
has
een prijs gewonnen
there with a
prize won
(He won a prize with that) b.
Hij heeft een prijs daar mee He
c.
has
a
gewonnen
prize there with won
Hij heeft een prijs gewonnen daar mee He
has
a
prize won
there with
If the R-word is a fronted Wh-phrase, and the PP contains a bound e (which is [+R]), only one position is possible. This is the least marked position between direct object and verb, corresponding to (130b) :
- 104 (131)a.
*Waar^ heeft hij [e^ mee] Where has
he
e
with
een prijs gewonnen? a
prize won
(With what did he win a prize?) b.
c.
Waar^ heeft hij een prijs [e^
mee] gewonnen?
Where has
with won
he
a
prize
e
-Waar^ heeft hij een prijs gewonnen [e^ mee]? Where has
he
a
prize won
e
with
In conclusion we may say that the island character of PPs in Dutch is an established fact, explained by the Bounding Condition. The possibility of R-word extraction is clearly exceptional, a marked phenomenon, that does not fall within the scope of core grammar. Since we already concluded that NP, AP, and VP are islands, a further conclusion at this point is that all major phrases (of type X n ) are islands. This conclusion confirms the
major prediction of the
Bounding Condition, and therefore strongly supports it. 2.6.5.
Gapping
In 2.6.3., we concluded that there is at least one case that shows that a condition on null anaphora like the Bounding Condition is superior to Subjacency as a condition on movement rules. This is the fact that subject phrases are not only islands with respect to movement, but also with respect to control ((61), repeated for convenience) : (61)
Ait is a nuisance for us^ for pictures of e^ to be on sale
There is another obvious non-movement rule, Gapping, that appears to be constrained by the Bounding Condition. In order to make the
-
105
-
relevant distinctions, we have to slightly extend the definition of 6 in our formulation of the Bounding Condition, so that it not only includes S (= V n ) but also the Ss that immediately dominate S in coordinate structures, like S. in:
In our earlier statement of the Bounding Condition, we took 0 as the maximal projection, X n . In the case of Gapping we have to define 3 in such a way that it refers to S^, but not to the other Ss in (132) . Let us therefore define 8 as the top node: (133)
A maximal projection, X n , is a top node iff it is not immediately dominated by a node X m (of the same projection type) such that m > n
According to this definition, S^ is the top node in (132), and not the other Ss
because these are immediately dominated by S^, which
has the same number of bars. X n and X m have to be of the same projection type. Thus, NPj is the top node in (134a), while both NP ? and PP are top nodes in (134b): (134)a.
NP X NP 3
PP
(134)b. NP 4
PP
NP 2
NPJ and NP^ (in (134a)) are not top nodes, since they are immediately dominated by another category, NP., with the same number of bars.
-
106
-
Under the improved version of the Bounding Condition, not only all the earlier cases, but also the local properties of Gapping are explained: (135)
THE BOUNDING CONDITION y cannot be free in g in:
where 3 is a top node At the end of chapter 3, I will give an independent argument that Gapping (of V) is not a deletion rule, but a rule interpreting tve]. With this interpretive view of Gapping as background, we can explain the following contrast: (136)a. b.
i *[g
[g- John hit Mary] and [g Bill [ v e] Sue]] [g John hit Mary] and
I don't believe
[g Bill [ v e] Sue]]] In (136a), we have an empty V in an S (= V n ), but (135) does not apply, because this S
is not a top node because it is immediately
dominated by a node (S^) of the same projection type with the same number of bars. Within S^, [ v e] has an antecedent, namely hit. The other sentence, (136b), is predicted to be ungrammatical, because here the [^e] is free in an S that is not immediately dominated by another S, so that it is a top node. One might use Subjacency to explain (136b). But in that case 71 Subjacency is no longer a unique property of movement rules.
We
already came to this conclusion for control cases like (61) . But
- 107 given the fact that (136b) is also explained by the Bounding Condition, we can use the local properties of Gapping to give an argument that crucially favors the Bounding Condition over Subjacency. The Bounding Condition predicts that Gapping not only does not go into embedded VPs (Ss) as in the case of (136b), but that it is also impossible to have a free [ v e] in other X n s. This conclusion is confirmed by the facts. Let us first consider a case involving VP (S) that is explained by the Bounding Condition, but not by Subjacency (e. — "gap"): (137)a. b.
John said
[- (that) Mary said that he would come]
"John saidj^ [— (that) Mary
e^
that he would come]
The ungrammaticality of (137b) is not explained by any of the older conditions. The "gap", e^ is c-commanded by the antecedent said, and there has never been a satisfactory explanation of the ungram72 maticality of (137b).
Why can e^ not be interpreted as said in
(137b)? The Bounding Condition provides the explanation, because it forbids a free e in a (top) S. Note that Subjacency does not explain (137b), because this condition requires at least two Ss (or Ss) to block a linking of antecedent and consequent. Similarly, Subjacency fails to explain the fact that Gapping does not go into AP, or PP. As for the category NP, consider the following example: (138)a.
The claim that Mary wrote the book and the fact that John wrote the paper prove nothing
b.
*The claim that Mary wrote^ the book and [jjp the fact that John
e.
the paper] prove nothing
-
108
-
Sentence (138b) is ungrammatical, because it has a free e in an NP, which is prohibited by the Bounding Condition. It is only explained by Subjacency taken as a condition on non-movement rules like Gapping. Even if the scope of Subjacency is extended along these lines, it can only attain descriptive adequacy if AP and PP are included among the cyclic nodes, because Gapping does not go into APs (139a), or PPs (139b): (139)a. b.
*She is eager to please^ Bill and [^p unwilling *John wrotej the book [pp before Mary
e^
e^
John]
the paper]
These facts are easily explained by the Bounding Condition, because they involve a free e in an A n and a P n , respectively. The local properties of Gapping lead to the following conclusion. Contrary to the Bounding Condition, Subjacency cannot attain descriptive adequacy in this domain of facts, in view of cases like (137b). Even if (137b) would turn out to be irrelevant, the facts can only be explained by Subjacency by extending the class of cyclic nodes with AP and PP. But the most important conclusion is that the local properties of Gapping can only be explained by Subjacency if it is no longer considered a condition on movement rules. This establishes the conclusion that we aimed at from the outset: that there is no unique property (like Subjacency) that distinguishes (non-stylistic) "movement rules" from other rules of sentence grammar, like control, or Gapping. 2.6.6. 2.6.6.1.
Is there a Head Constraint? The evidence for the Head Constraint
Van Riemsdijk (1978) argues that the system of conditions that
-
109
-
includes the SSC and Subjacency, has to be supplemented with yet another principle (cf. 2.2.): (140)
The Head Constraint No rule may 1 involve X./X. and Y./Y . in the structure i : i : ...X....[Hn...[H1
...Yi...H...Y....]H,...]Hn...Xj
(where H is the phonologically specified (i.e. non-null) head and H n is the maximal projection of H) In our brief discussion of the Head Constraint in 2.2., we concluded that the Head Constraint almost completely overlaps with Subjacency. It is, in other words, extremely unlikely that a theory including both the Head Constraint and Subjacency is correct. The same can be said about the Bounding Condition and the Head Constraint. Both predict the CNPC facts, for instance. Before comparing the Bounding Condition and the Head Constraint with respect to the major facts discussed in previous sections, I would like to give a brief review of the evidence for the Head Constraint in other domains. The Head Constraint was first discussed in Fiengo (1974) , in connection with the following contrast:^ (141)a. b.
John^ sold his car and
e^
bought a boat
*John bought the boatj and Bill ruined
e^
According to Fiengo (1974), sentences like (141a) allow gapping (interpretation) of the subject (e^) of the second conjunct, while 74 (141b) shows that gapping of the object (e^) is impossible.
-
110
-
Fiengo sought to explain (141b) in terms of the Head Constraint: (141b) is ungrammatical because it involves interpretation of the non-head (e^) of a VP without involving its head (ruined). This explanation seems to be based on a misobservation, however. There is a reasonable alternative analysis for (141a) that does not involve gapping at all, but VP coordination: (142)
75
John [ v p [ v p sold his car] and [ v p bought a boat]]
For cases that unambiguously involve S (or S) coordination, there is abundant evidence that neither subject nor object can be gapped as long as there is an overt finite verb in the relevant clause. Consider, for instance, the following Dutch sentence: (143)
*Kocht
John i een auto, of kocht
Bought John
a
car
or bought
e^
een boot?
e
a
boat
(Did John buy a car, or did (he) buy a boat?) In (143), the Verb Second rule of Dutch Grammar has been applied, 76 or rather, the finite verb has been generated in COMP position. Since the finite verb (kocht) in the second conjunct of (143) is in COMP, dominated by S, VP coordination (like in (142)) is impossible. This appears to be the general situation: if it can be shown that the second conjunct is an S or an S (so that VP coordination is excluded), gapping of non-verbs is only possible after gapping of the finite verb. There is no difference between subject and object in this respect. The ungrammaticality of (143) is neither explained by Fiengo's version of the Head Constraint, nor by Van Riemsdijk's original
- Ill formulation (140). At the end of chapter 4, Van Riemsdijk (1978) presents a more sophisticated version, in which the head (H) is no longer necessarily dominated by H' (cf. (140)). In the new version, the only requirement is that the Y position (of (140)) is c-commanded by the head (kocht), which causes sentence (143) to be ungrammatical. The new version of the Head Constraint is only an improvement for cases like (143), because there are several other examples that show that subjects cannot be gapped, even when they are not c-commanded by the finite verb. Note first of all that there are English analogues of (143) like: (144)
*Has John^ eaten, or has
e^ drunk?
This example shows again that a subject cannot be gapped when there is an overt finite verb form. In this case, the new version of the Head Constraint is no improvement, because e^ is not c-commanded by the head of the VP (drunk), but by the finite form has, which is an Aux. There are also cases in Dutch that obviously involve S coordination, and in which the gapped subject is neither c-commanded by the head of a VP nor by any Aux. Note first that Gapping goes into indirect questions: (145)
Ik weet wat I
John kocht^ en
wat
Piet
know what John bought and what Peter
e^ e
This is again an unambiguous case of S coordination, because both coordinated phrases are introduced by a Wh-word, which is in COMP under S. Subjects cannot be gapped in indirect questions as long as
-
112
-
there is a finite verb: (146)
*Ik weet wat I
Johrij kocht, en
know what John
wat
bought and what
e^
verkocht
e
sold
(I know what John bought and what (he) sold) In this case, there is no version of the Head Constraint that explains the ungrammaticality. Dutch has verb-final word order in subordinate clauses, so that
is not c-commanded by the V (ver-
kocht) in (146). The evidence shows that Gapping of both subject and object is only possible when there is no overt expression of tense. If this conclusion is correct, there is no evidence for the Head Constraint in constructions that involve Gapping. Van Riemsdijk (1978, 4.5.2.) gives two additional arguments in support of the Head Constraint, one based on directional PPs in Dutch, the other based on case attraction in German. The case attraction facts are difficult to evaluate,
because the relevant
rules have not been formulated. Since the case attraction facts also seem to follow from (an appropriate extension of) the Bounding Condition, I will concentrate on the argument based on directional PPs. The basic observation is that adverbial elements like niet (not) have to precede directional PPs in Dutch: (147) a.
John is niet naar Amsterdam gegaan John is not
to
Amsterdam gone
(John has not gone to Amsterdam) b.
*John is naar Amsterdam niet gegaan
- 113 According to Van Riemsdijk (1978, 4.5.2.), there are also postpositional directional PPs, that show a different behavior: (148)a.
John is niet de John is not
boom in
the tree into
geklommen climbed
(John has not climbed the tree) b.
John is de boom
niet in
geklommen
Under the assumption that de boom in is a postpositional directional PP, we have to explain the fact that the negation element niet can follow the NP de boom. Van Riemsdijk's explanation involves the Head Constraint, in conjunction with the hypothesis that the head of the PP, the postposition in, is raised from the PP and incorporated in the V: (149)
John is [ p p [ N p de boom] niet [ p e] ] [ v t p in] tv geklommen]] 1 I
Since the head of the PP (in) has been removed and incorporated in the V, the PP would be accessible for adverbs like niet. This explanation is suspect for a number of reasons. First of all, it is inconsistent with the Bounding Condition, because (149) has a free e within a PP. The rule of P-movement makes little sense, because its sole function is to account for the fact that directional particles (like in in (149)) behave like particles, and not like other postpositions. The most plausible conclusion is that (148b) is just another instance of the NP-Prt-V-construction, which is not like (149) but like:
- 114 (150)
John is [ N p de John is
boom] niet [ p r t in] geklommen
the tree
not
in
climbed
(John has not climbed the tree) •If we assume that this is the correct structure for (148b), there is nothing mysterious about the fact that the negation element niet (not) can be placed between the NP (de boom) and the Prt (in). The position between NP and Prt is quite normal for adverbs (cf. (151b)): (151)a.
John heeft niet de John has
not
dokter
op gebeld
the doctor
up called
(John has not called up the doctor) b.
John heeft de John has
dokter niet op gebeld
the doctor not
up called
(John has not called up the doctor) The hypothesis that de boom and in do not necessarily form a constituent is consistent with the fact that the NP can be fronted if it is a Wh-phrase, while it is impossible to move both NP and directional particle together: (152)a.
Welke boom is hij in geklommen? Which tree is he
in climbed
(Which tree has he climbed?) b.
*Welke boom in is hij geklommen? Which tree in is he
climbed
(Which tree has he climbed?) This constellation of facts does not favor the postpositional PP-
- 115 analysis, because normally "Pied Piping" is obligatory for PPs that do not contain R-words. Another argument for (150) is that in can move over a V after V-raising (klimmen has been raised from the embedded clause to a position after the matrix verb willen; cf. Evers 1975): (153)a.
Hij had de He
boom in willen klimmen
had the tree in want
climb
(He had wanted to climb the tree) b.
Hij had de He
boom willen in klimmen
had the tree want
in climb
(He had wanted to climb the tree) The word in shares these two word order possibilities with ordinary particles: (154)a.
Hij had Mary op willen bellen He
had Mary up want
call
(He had wanted to call Mary up) b.
Hij had Mary willen op bellen He
had Mary want
up call
(He had wanted to call Mary up) It is important to note that as soon as the PP complement contains an R-word, which is obligatorily followed by a postposition, and not by a particle, it is not possible to move the P over the first V like in (153) and (154):77
-
(155)a.
Hij heeft er He
has
116
-
in willen klimmen
there in want
climb
(He had wanted to climb it) b.
*Hij heeft er He
has
willen in klimmen
there want
in climb
(He has wanted to climb it) (156)a.
Zij heeft er She has
uit
willen rennen
there from wanted run
(She has wanted to run out of it) b.
*Zij heeft er She has
willen uit
there want
rennen
from run
(She has wanted to run out of it) In (155) and (156), a structure like (150) is impossible, because R-words can be complements of postpositions, but not of particles. Thus, as soon as we have bona fide postpositions, like the ones in (155) and (156), there is only one word order possibility, where particles (like the ones in (153) and (154)) show two possibilities. Since the evidence in favor of a structure like (150) is so straightforward, it is difficult to see what counts against it. Van Riemsdijk (1978, 3.7.2.) gives some convincing arguments to the effect that NPs followed by a directional particle do form a constituent sometimes. But these constituents are all adverbial in nature and differ from the verb complements considered so far. None of the arguments for one-constituent structure applies to the verb complements. The other arguments for underlying directional PPs (from which the P can be raised) are rather weak. The first argument concerns the following ambiguous sentence (Van Riemsdijk 1978, 92):
-
(157)
Omdat
John de
117
-
vrachtwagen in
Because John the truck
reed
into drove
Van Riemsdijk observes that this sentence has the following two readings: (158)a. b.
Because John broke the truck in Because John drove into the truck
According to Van Riemsdijk, the two readings can be structurally distinguished by assuming an NP-Prt-V-structure for the first reading, and a (postpositional) PP-V-structure for the second reading. Although the two readings can be kept apart in this way, it is far from clear that it is necessary to do so. There is no reason to assume that particles cannot be lexically ambiguous, like all other lexical items. But apart from this possibility, there is a way to keep the two readings structurally apart, without giving up a basic NP-Prt-V structure for both. This solution is based on the fact that the two readings (158) differ in the way the particle is related to the verb. It has often been observed that in ordinary verb + particle constructions, there is a very close semantic relation between particle and verb. This relation is so close that verb and particle often have been considered one word. In Dutch grammar, this view has led to some technical problems, because in root sentences verb and particle are separated by the rule of Verb Second (see Koster 1975). These problems can be solved by assuming that verb + particle are generated as two separate items by the base component, and that they can be lexically reanalysed in certain cases. This possibility of lexical reanalysis was first proposed by
-
118
-
Chomsky (1973) for cases like to take advantage of, which is reanalysed as a unit (indicated by the curly brackets) to account for the following passive: (159)
His book^ was {taken advantage of}
e^
Van Riemsdijk (op.cit.) also assumes the possibility of lexical reanalysis in his account of preposition stranding in pseudo passives (op.cit., 218ff.). Lexical reanalysis can also structurally distinguish the two readings of (157). In the first reading (158a), the particle and the verb form a close semantic unit with idiomatic character. We can express this by placing verb and particle between curly brackets (lexical reanalysis): (160)
Omdat
John [ N p de
Because John
vrachtwagen
the truck
{[ p r t in
1
into
[reed v ]} drove
(Because John broke the truck in) In the second reading (158b), we have a directional particle, which does not form a similar close unit with the verb. We can account for the difference by not applying lexical reanalysis in this case. Formally, this can be expressed by giving the same structure as (160) for this reading, but without the curly brackets. The solution just sketched, accounts for the differences between the two readings of (157), but also for the fact that in behaves as a particle in both cases under certain conditions. Van Riemsdijk's second argument for underlying PP-V-structure (as opposed to NP-Prt-V-structure) for (148b) concerns subcategorization. The idea is that motional verbs are usually subcategorized
-
119
-
for motional PPs, and that it is a complication if they also have to be subcategorized for NP-Prt. This is true, but from the fact that we wish to keep subcategorization as simple as possible, it does not follow that we can make it simpler than it really is. Motional verbs can be combined with any kind of directional expression, including directional adverbs in sentences like John liep huiswaarts (John walked homeward). Thus, motional verbs are not only subcategorized for directional PPs, but also for directional adverbs. Similarly, it seems necessary to extend the subcategorization frame for motional verbs to the string NP + directional Prt. There is no a priori argument against such diversity, which is an 78 empirical matter. In conclusion we may say that there are no arguments against structures like (150), with NP-Prt-V instead of (postpositional) PP-V. There are, on the contrary, several arguments in favor of a structure like (150), including the original problematic fact, namely the possibility for adverbs like niet to occur on both sides of the NP (cf. (148)). Since the NP-Prt-V-structure is well-motivated, a sentence like (148b) cannot be considered evidence for the Head Constraint. Sentence (148b) exhibits a normal word order for the structures in question, as we saw in (151b). Problems only arise under the less plausible assumption that de boom (in (148b)) is not only an NP, but also a PP from which the postposition79has been removed by the unmotivated rule of Postposition Shift. There is no other sound evidence for the Head Constraint that does not also follow from the Bounding Condition. There is, on the other hand, much evidence for the Bounding Condition that is not covered by the Head Constraint. Let us therefore compare the two conditions.
2.6.6.2. The Bounding Condition and the Head Constraint
The major evidence for the Bounding Condition can be summarized in the following four classes of facts: (161)
(i)   The Complex NP Constraint
(ii)  The Subject Condition (movement and control)
(iii) The island character of phrases (X^n)
(iv)  The local properties of Gapping
It appears that the Head Constraint is only descriptively equivalent to the Bounding Condition with respect to the Complex NP Constraint. In the three other cases, the Bounding Condition makes better predictions. Let us consider these cases one by one.
As for the Subject Condition, it is important to bear in mind that I assume, together with Blom (1977) and Van Riemsdijk (1978), that PP complements have two possible positions within an NP:

(162)a. [NP a nice story [PP about John]]    (the PP attached low, within the projection of the head N)
     b. [NP [a nice story] [PP about John]]  (the PP attached at a higher level within the NP)
(162a) has what Blom (1977) calls an object PP; in (162b) there is less cohesion between PP and N, which is expressed by the higher level at which the PP is generated (for arguments, see Blom 1977). The Head Constraint predicts that only the higher PP (in (162b)) can be extracted. If it is assumed that the higher PP position in (162b) can serve as an escape position for the lower PP in (162a) (in the sense of Van Riemsdijk (1978), chapter 5), the Head Constraint predicts that both kinds of PPs can be extracted. This is not the case if the NP is a subject, as predicted by the Bounding Condition:
(163)a. *To whom_i did [NP a long letter e_i] disappear?
     b. *About whom_i did [NP a nice story e_i] disturb you?
These sentences are blocked by the Bounding Condition, because there is a free e in an NP in both cases. The Head Constraint is superior to Subjacency in that it predicts many island phenomena for single phrases. For instance, Horn's NP Constraint follows partially from the Head Constraint, but not from Subjacency. Nevertheless, the Head Constraint incorrectly predicts that PPs can be extracted from an NP (or an AP) inside a VP. This is impossible as we saw in 2.6.4.1. for sentences like: (164)
*Mary heeft aan John_i [NP de  lange brief e_i] weggegooid
 Mary  has   to  John      the long  letter     thrown away
(Mary threw away the long letter to John)

The ungrammaticality of this sentence cannot follow from the fact that aan John was originally in the domain of the head of the NP (de brief), since there is an "escape" position for PPs that is not in the domain of N (cf. (162b)). In general, the idea of an "escape" route through positions not c-commanded by the head is fatal for the Head Constraint in cases like (164). The Bounding Condition correctly predicts the ungrammaticality of (164). It contains a free e in an NP, which is excluded, no matter the level of the e with respect to the head of the NP.
The Head Constraint also fails to predict the marked character
of Wh-movement (extraction) from a VP (S) (see 2.6.4.3.1.). The COMP position is not in the domain of the head (V) of the sentence, and is therefore predicted to be freely accessible. This is not the case, because several languages do not have extraction at all, while it clearly is a marked phenomenon in languages like English and Dutch.
As for the local properties of Gapping, the Head Constraint makes the same predictions as the Bounding Condition, except for one crucial sentence (137b), repeated here as (165):
(165) *John said_i [S̄ (that) Mary e_i that he would come]
The ungrammaticality of this sentence, predicted by the Bounding Condition, does not follow from the Head Constraint, because Gapping in this case involves the head (V) itself. Even if the Head Constraint had exactly the same empirical content as the Bounding Condition, we would prefer the latter to the former, because the difference (reference to the notion "head") seems superfluous. But the Bounding Condition is not only simpler in this sense, it is definitely more adequate from an empirical point of view.

2.7. Conclusion

Before giving a final formulation of the Bounding Condition
with its auxiliary hypotheses, I would like to propose a notation to distinguish between language-specific and universal auxiliary hypotheses.80 The auxiliary hypotheses that we have considered in this chapter are simple qualifications of the values of β and γ. Let us draw a line between specifications that are part of Universal
Grammar (UG), and specifications that are part of English Grammar (EG), or Dutch Grammar (DG):

(166)  UG:  (1) β = ...
            (i) γ = ...
            (m) ...
       EG:  .. = ...
            .. = ...
       DG:  .. = ...
We can use this notation for our final statement of the Bounding Condition: (167)
THE BOUNDING CONDITION
γ cannot be free in [β ... [γ e] ... ], where β is a top node (cf. (133)),
unless:

AUXILIARY HYPOTHESES
UG:  β = [-tense]
EG:  (1) γ = [+wh]   (107)
DG:  (1) γ = [+wh]   (98)
     (2) γ = [+R]    (126)
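Stated very informally, (167) describes a check over labelled bracketings: find each top node β, look for an empty element that is free (unbound) within it, and exclude the structure unless β is [-tense] or the empty element carries one of the language-specific escape features. The following sketch (in Python) is offered only to make this division of labour between the core condition and the auxiliary hypotheses visible; the dictionary encoding of nodes, the treatment of "top node" status and of features as plain strings, and all function names are assumptions of the illustration, with no theoretical status.

def indices_of_antecedents(node):
    # Collect indices of non-empty (lexical) nodes inside this subtree.
    found = set()
    stack = [node]
    while stack:
        n = stack.pop()
        if n.get("index") is not None and n.get("cat") != "e":
            found.add(n["index"])
        stack.extend(n.get("children", []))
    return found

def free_empties(node):
    # Empty elements inside the subtree whose index is not bound within it.
    bound = indices_of_antecedents(node)
    out = []
    stack = [node]
    while stack:
        n = stack.pop()
        if n.get("cat") == "e" and n.get("index") not in bound:
            out.append(n)
        stack.extend(n.get("children", []))
    return out

def violates_bounding_condition(root, escape_features):
    # escape_features plays the role of the language-specific auxiliary
    # hypotheses of (167), e.g. {"+wh"} for EG, {"+wh", "+R"} for DG.
    stack = [root]
    while stack:
        beta = stack.pop()
        if beta.get("top") and "-tense" not in beta.get("features", set()):
            for gamma in free_empties(beta):
                if not (gamma.get("features", set()) & escape_features):
                    return True
        stack.extend(beta.get("children", []))
    return False

# Illustrative use: a top-node NP containing an unbound empty element.
subject_np = {
    "cat": "NP", "top": True, "features": set(),
    "children": [
        {"cat": "Det", "children": []},
        {"cat": "N", "children": []},
        {"cat": "e", "index": 1, "features": set(), "children": []},
    ],
}
print(violates_bounding_condition(subject_np, escape_features={"+wh"}))  # -> True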
The Bounding Condition is part of core grammar, and only makes predictions about specific languages in conjunction with auxiliary hypotheses. The Bounding Condition is intended as a paradigmatic idealization. If it appears to be inconsistent with certain facts in a language X, it can be modified of course. But what is always
a useful research strategy is to look for auxiliary hypotheses that reconcile the anomalous facts with the Bounding Condition. Most "normal" science is of this type, as argued in chapter 1. As in other sciences, an idealized principle like the Bounding Condition cannot be refuted by a list of anomalous facts. It can only be replaced by a more comprehensive principle.

Along these lines, I have been arguing that the Bounding Condition itself is more comprehensive than Subjacency. The major motivation for a class of non-stylistic movement rules like Wh-movement and "move NP" was a specific set of properties, consisting of cyclicity and Subjacency. Since cyclicity was already shown to follow from other concepts (that were not unique to movement; cf. Freidin 1978), it was only natural to focus on the related notion of Subjacency. A comparison between Subjacency and the Bounding Condition has led to the following result (+ = "explained", - = "not explained"):
(168)
                                    Subjacency    Bounding Condition

      Complex NP Constraint             +                 +
      Subject Condition
         a. movement                    +                 +
         b. control                     -                 +
      Island character of X^n's         -                 +
      Local nature of Gapping           -                 +
Apart from some Wh-island phenomena, which will be discussed in the next chapter, all major evidence for Subjacency, like the Subject Condition and the CNPC, also follows from the Bounding Condition.
The Bounding Condition is superior in that it explains the local character of a non-movement rule like Gapping, and the fact that control does not go into subject phrases. The most striking failure of Subjacency is that it cannot account for the island character of single phrases of type X^n. Extraction from VP (S) is allowed by Subjacency, which makes it impossible to account for the marked character of extraction, and for the fact that extraction is often impossible in languages other than English. The Bounding Condition solves this problem by stipulating that VPs are islands in core grammar. Extraction is only possible with a marked, language-specific accretion to core grammar (EG (1) in (167)).

Subjacency also fails to explain why PPs are islands in almost all languages. The problem has been clearly stated by Van Riemsdijk (1978): how can we account for the exceptional status of PPs in English? Again, the solution seems to be a further development of markedness theory. This development is in its infancy, and the sketchy remarks made in Van Riemsdijk (1978), Koster (1977b), and the present study have, no doubt, to be revised as research in these areas proceeds.

Some of the evidence for Subjacency, like the extraposition phenomena, appeared to be spurious. Subjacency could not be applied to these phenomena without contradiction. The idea of core grammar has led to a more differentiated view of the scope of linguistic theory. As a result, extraposition phenomena could plausibly be banned from the domain of core grammar.

The Bounding Condition is more comprehensive than Subjacency in that it applies to all null anaphora (trace, PRO, and [V e] in the case of Gapping). Since neither cyclicity nor Subjacency can be
maintained as primitive concepts, and since these principles can be reduced to notions that apply to both movements and rules of construal, the distinction between these two classes of rules appears to be illusory.
FOOTNOTES TO CHAPTER 2

1.
Cf. Chomsky (1973), (1975), (1976). See also Koster (1977b).
2.
Cf. Chomsky (1977b).
3.
In classical astronomy (up until the seventeenth century), for instance, the near circular paths of the stars were described directly, by regarding circular motion as basic. Newtonian physics decomposes this circular (or rather elliptical) movement by reanalysing it as the result of the interaction of two forces: rectilinear inertia and gravity. This interactionism is quite typical for the advanced natural sciences, and finds its formal expression in vector algebra.
4.
See the concluding section of Chomsky (1976) .
5.
Another distinguishing property of movement rules, cyclicity, can be reduced to independently needed concepts like traces, PIC, and the SSC. Cf. Freidin (1978). I will return to this matter in 2.2.
6.
Cf. Chomsky (1975).
7.
Chomsky and Lasnik (1977), note 18.
8.
Cf. Chomsky (1977a), Introduction.
9.
loc.cit., p. 14.
10.
It cannot have the so-called arb interpretation that is sometimes possible for subjects of infinitives (cf. Chomsky 1978). An (empty) NP that is not assigned an index can be considered a free variable, which is unacceptable in Logical Form (cf. Chomsky and Lasnik 1977, 449).
11.
The further qualification that the S has to be tenseless is
perhaps necessary because of examples like: I promised Bill that a picture of himself would be appreciated, where Bill can be the antecedent of himself.
12.
See also Chomsky (1977b), note 16, and the concluding remarks of Chomsky (1976).
13.
Chomsky (1977b), note 16.
14.
See Koster (1977a) for a discussion of such issues.
15.
Cf. Chomsky (1973).
16.
The PIC can perhaps be replaced by the NIC (cf. Chomsky 1978). Although I assume only one kind of empty node (so that there is no difference between trace and PRO) I will often use the word "trace" and other movement terminology, for ease of exposition .
17.
Cf. Freidin (1978).
18.
Chomsky (1978), (103).
19.
For Dutch, the replacement of the PIC by the NIC is not without problems because the equivalent of (13) is ungrammatical. Cf. Koster (1977b), 3.3.
20.
For the SSC, see Chomsky (1973). The formulation of Subjacency is from Chomsky (1977b).
21.
Van Riemsdijk (1978), 160. For a more refined version, see op.cit., 169.
22.
op.cit., 170.
23.
There is some other evidence for Subjacency in the literature. See for instance Rizzi (1978), which is difficult to evaluate (see also note 70 of chapter 3). Van Riemsdijk (1978, 209ff.) gives some evidence that is based on the ungrammaticality of the sentence: *waar heeft zij er vaak over gesproken? (op.cit., 211). (lit.: where has she there often about spoken = what has she often spoken about there?). This sentence is ungrammatical, but if we replace er by the closely related word daar, the sentence becomes grammatical: waar heeft zij daar vaak over gesproken? It seems to me that the data do not support Subjacency for this reason.
24.
This is the position presented in Van Riemsdijk (1978). I will argue below that NPs and PPs are islands in the unmarked case. As for PPs, English is exceptional. It not only allows extraction from many simple PPs, but occasionally also from sentential PP complements (e.g.: who did you approve of my seeing?). In Dutch, it is never possible to extract Wh-phrases from such complements.
25.
The answer, to be given below, is connected with the Bounding Condition, and differs in certain respects from Koster (1977b).
26.
This was observed by Huybregts (1976).
27.
See Jackendoff (1977), Sturm and Pollmann (1977), Van Riemsdijk (1978).
28.
Huybregts (1976) .
29.
This rule has to be distinguished from the extraposition rule discussed by Rosenbaum (1967), Emonds (1976), Koster (1978a), and others.
30.
Chomsky (1975), 86.
31.
Klein (1977), 96.
32.
Van Riemsdijk (1978), 176.
33.
loc.cit.
34.
It is of course possible that rule systems share properties. It has not been demonstrated so far that the property of Subjacency is shared by the rules in (2) (of (39)) and extraposition .
35.
Chomsky also assumes the classification of (39) (lecture Amsterdam, December 1977, and personal communication).
36.
One might object that extraposition does not adjoin the S to S (as in (42)) but to VP, where it is still c-commanded by the NP_i that precedes V. This objection would be irrelevant for the issue, because the head of the relative clause (NP_i) may be embedded in one or more other NPs, so that the c-command configuration is destroyed even when the S is attached under the VP. For examples of extraposition from NPs that are embedded in other NPs, see (32) and (33).
37.
Chomsky and Lasnik (1977, 451) qualify (43) by allowing the trace of S to be in the relevant context.
38.
See Koster (1977a) for discussion and references.
39.
Idealization involves counterfactual representation, by definition. In general, theoretical improvements of theories may lead to a (temporary) loss of descriptive adequacy. Of course, one hopes to restore descriptive adequacy in the long run, but usually in a new way, e.g. by the discovery of disturbing entities, or by redefinition of the domain of a theory.
40.
It is important to note that major conceptual changes in the exact sciences were almost never without an (initial) cost in descriptive adequacy. See for instance Toulmin (1972, 228): Copernican astronomy was "at points marginally less exact" than the Ptolemaic account. "Similarly with the wave theory of light, as advanced by Young and Fresnel in the early 1800s: the corpuscular theory of the orthodox Newtonian physicists had some real advantages, which had to be sacrificed if one were to adopt the new wave theory."
41.
Unfortunately, this question is not always asked in this way, i.e. theories are often confronted with just problematic data (and not with rules). Formulation of rules is a necessary condition for effective criticism.
42.
See for instance Williams (1977).
43.
One of the major problems of linguistics (in fact of every theoretical development) is that the domain of a theory of language (of the kind we are interested in) is unknown. It would be wrong to assume that the standard data that are currently used form the relevant domain. These data are likely to be the result of the interaction between linguistic structure and several other cognitive structures. The problem is to find out how language structure is embedded in these other structures, and which of the standard data (judgments about sentences) fall within the domain of the theory of language. Similarly, it has to be discovered how the more limited struc-
ture of core grammar is embedded in linguistic structure at large.
44.
Class lectures, fall 1976. See also Chomsky and Lasnik (1977), 1.1. and Chomsky (1978).
45.
See also Van Riemsdijk (1978), chapter 7.
46.
A remarkable exception is Den Besten (1977). See also Koster (1978a), note 17.
47.
Cf. Hooper and Thompson (1973).
48.
This rule applies from top to bottom. The derived index is underlined here in order to distinguish it from lexically assigned indices. In general, I will omit underlining of indices .
49.
α and β are variables with values + and -. Categories that are coindexed by rule (50) have to have, in other words, the same categorial status. There are some exceptions to this generalization; NPs, for instance, can refer to Ss, and to APs (cf. Koster 1977a, note 24).
50.
The Main Projection Rule (Koster 1977b, (78)) is a condition on null anaphora. The Bounding Condition is a further development of this idea. Essential improvements are due to the more elegant formulations of conditions on anaphora in Chomsky (1978) and to the insight of the X-bar theoreticians that S is the maximal projection of V.
51.
This sentence also violates the Locality Principle (see chapter 3) .
52.
See Jackendoff (1977), Sturm and Pollmann (1977), Van Riemsdijk (1978) .
53.
This formulation slightly differs from earlier versions (Guéron 1976; Koster 1977b). In the present version, NP_i has to be immediately dominated by V^n. Furthermore, only PPs are linked to the focus. Closely related extensions seem possible in which S is linked to the focus. For the marginal cases, see Koster (1977b), (51).
54.
55.
Note that PPs can both precede and follow the head (A) of an NP.
56.
For references, see note 52.
57.
Cf. 1.4.1. and the references mentioned there.
58.
The deletion itself is optional, but the filter has the effect that clause-initial Wh-phrases are only accepted when they have an index. Note also that Wh-phrases can only bind an e when they are assigned an index by the base (interrogatives) or by the coindexing rule, which applies from top to bottom.
59.
See 3.2.6. below.
60.
Apart from free deletions, all apparently unbounded forms of binding can be reduced to successive cyclic application of clause-bound Wh-movement (or coindexing). See Chomsky (1977b) and Koster (1977a).
61.
For VP analyses, see Bresnan (1971), Bresnan (1976b). For some further explorations, see Brame (1975), (1976), (1977).
62.
Cf. Brame (1975).
63.
Cf. Chomsky and Lasnik (1977), 454.
64.
Sentences like (109) were discussed by Chomsky (1973, 237), in connection with the Tensed-S Condition.
65.
Van Riemsdijk (1978), 228ff.
66.
For more elaborated discussion, see Koster (1977b).
67.
Na selects an NP, but also a VP, which can be infinitival or tensed.
68.
There are perhaps postpositions in Dutch, like vandaan, that have a PP complement, and that allow extraction of R-words. See Van Riemsdijk (1978), 199. The status of vandaan is somewhat controversial (cf. Huybregts 1976). The sentences (122) and (123) are from Van Riemsdijk (1978), 200.
69.
Cf. Van Riemsdijk (1978), 77, 78.
70.
For the "bridge conditions", see Van Riemsdijk (1978), 200. I assume that in English, only a subclass of PPs has a Wh-
position. Contrary to the clause-initial Wh-position, the PP-internal Wh-position is supposed to be optional, like the R-position in Dutch.
71.
Chomsky (1978, (10d)), assumes that certain deletions under identity are governed by Subjacency.
72.
Note that the notion "parallelism" that normally governs Gapping, cannot play a role here, because parallelism constraints are only relevant in the case of coordinate structures .
73.
Cf. Fiengo (1974), chapter 4. The examples are from Van Riemsdijk (1978), 161.
74.
Note that (141a) is inconsistent with the NIC (free e in nominative position); such examples confirm our earlier proposal to the effect that null anaphors in nominative position are governed by the Bounding Condition, while the NIC is reduced to the stipulation that certain lexical anaphors like himself, etc., are [-nominative]. (141a) is allowed by the Bounding Condition, because the minimal S containing e^ is not a top node.
75.
Henk van Riemsdijk has pointed out to me that sentences like John ruined Bill and was ruined by Mary are problematic under this analysis, because the second, but not the first, conjunct contains a trace. This would be a violation of the Coordinate Structure Constraint. As far as I know, this constraint has only been motivated for Wh-movement. There is no clear evidence that "move NP" applies across-the-board, because the trace positions for "move NP" all have to be unlexicalized for independent reasons. Subjects of infinitives, for instance, cannot be lexical, if the infinitival clauses are complements of verbs like seem (cf. Chomsky 1978). Thus, in a sentence like John_i seems e_i to hate Bill and e_i to like Mary, both trace positions (indicated by e_i) have to be empty for independent reasons (the coindexing rule applies without problems because both e_i's are c-commanded by the antecedent John).
76.
Cf. Koster (1978b).
77.
These facts are subject to some dialect variation. In my dialect, at least, sentences like (155b) and (156b) are completely impossible.
78.
Van Riemsdijk's argument (op.cit., 92ff.) is based on the subcategorization of the verb lopen (walk). The argument is without force, because the observations are inaccurate. Contrary to what is said, the verb lopen can easily occur with a single NP in sentences like hij liep een mijl (he walked a mile). There are even sentences with an NP followed by an ordinary particle: hij liep een ziekte op (he caught a disease)
79.
For this rule, see Van Riemsdijk (1978), 106, 108.
80.
Cf. Koster (1977b), 3.3.
CHAPTER 3

THE LOCALITY PRINCIPLE

3.1. Introductory remarks

The Bounding Condition has been introduced as an alternative
to Subjacency, and also to the NP Constraint and the Head Constraint, insofar as these principles have empirical content beyond Subjacency. The Bounding Condition has not only made it possible to reduce several principles to one common
source, it has also led to new
predictions concerning the general island character of phrases and the boundedness of Gapping. Neither the NP Constraint, Subjacency, nor the Head Constraint has comparable explanatory scope in these new empirical domains. Although the Bounding Condition has simplified the system of constraints, the overall picture is still far from satisfactory. This will be clear when we compare the form of some current conditions on rules, like the Specified Subject Condition (SSC) and the Superiority Condition. As an illustration of the problem that faces us, consider the following data (where the men is intended to be the antecedent of each other, and Mary of herself):
(1)a. *The men expected [S̄ the soldiers to shoot each other]
   b. *Mary saw [NP Bill's picture of herself]
Chomsky (1973) sought to explain these facts by the SSC:1
(2) No rule can involve X, Y in the structure
        ...X... [α ...Z... -WYV ...] ...
    where Z is the specified subject of WYV in α
A condition of this kind does not cover data like the following:

(3) *John knows what who saw2
To account for such facts, another condition was proposed that became known as the Superiority Condition:

(4) No rule can involve X, Y in the structure
        ...X... [α ...Z... -WYV ...] ...
    where the rule applies ambiguously to Z and Y and Z is superior to Y.3
The striking similarity of this condition to (2) suggests that the system is not optimal. In other words, a generalization seems to be missed. Both conditions have an X-Z-Y pattern, where linking of X to Y is blocked if Z is of a certain type. In (2), Z has to be a subject and in (4) Z has to be superior. Even these conditions on Z are almost alike, because a subject is superior to the other NPs in the sentence. An important difference is that (2) applies over clause boundaries (if we disregard α = NP for a moment), while (4) also applies clause-internally, as shown by (3). The common core of the two conditions (i.e. the X-Z-Y pattern,
no linking of X, Y when Z is of a certain type) appears to return in many different configurations. Take for instance the Cojacency facts discussed in Koster (1977a):

(5) Peter reads novels, Mary wrote books, and John — papers
The gap (—) has to be interpreted as wrote, not as read. If we represent the gap by [V e], we have the following pattern:

(6) ...V_1...V_2...[V e]...
V_1 cannot be linked to [V e] if there is an intermediate V_2. This is again an X-Z-Y pattern as in the other conditions. Again there is a condition on Z, because V_2 may not be in the domain of (one of) the other Vs:
(7) John said that Mary promised it and Bill — that she denied it
Here the gap has to be interpreted as said in spite of the intermediate verb promised. We cannot account for these facts by the SSC or the Superiority Condition. So, suppose we postulate a third condition, Cojacency.4
It will be clear now that there is a suspect proliferation of very similar principles in this way. All three conditions have the following pattern:

(8) No rule involves X, Y in: ...X...Z...Y..., unless Z = ...
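The shared schema can be caricatured as a single parameterized check, with the SSC, the Superiority Condition, and Cojacency differing only in the predicate they impose on Z. The sketch below (in Python) is purely illustrative; the encodings of "subject", "superior", and "in the domain of another V" are crude stand-ins assumed for the illustration, not part of the analysis.

def blocked(x, z, y, z_is_relevant):
    # Schema (8): no rule involves X, Y across an intervening Z
    # when Z is "of the relevant type" for the condition at hand.
    return z is not None and z_is_relevant(z, x, y)

# Three instantiations of the predicate on Z (toy versions):
ssc         = lambda z, x, y: z.get("is_subject", False)                      # (2)
superiority = lambda z, x, y: z.get("is_superior_to_y", False)                # (4)
cojacency   = lambda z, x, y: (z.get("cat") == "V"
                               and not z.get("in_domain_of_other_V", False))  # (5)-(7)

# Example corresponding to (7): 'promised' is in the domain of 'said',
# so it does not block linking of 'said' to the gap.
promised = {"cat": "V", "in_domain_of_other_V": True}
said = {"cat": "V"}
gap = {"cat": "V", "empty": True}
print(blocked(said, promised, gap, cojacency))   # -> False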
The fact that this pattern returns in several different conditions
suggests that a more general principle can be formulated. This general principle, the Locality Principle, will be the topic of this chapter. Before discussing the scope of the Locality Principle, I would like to give a preliminary statement of its form: (9)
THE LOCALITY PRINCIPLE
No rule involves α_{i+1}, γ (where α c-commands or is parallel to γ) in:

    ... α_{i+1} ..., γ, ..., α_i, ..., α_{i+1}, ...        (i ≥ 1)
unless: ...

The Locality Principle is a condition on rules like the SSC and the Superiority Condition. It applies to all rules that involve an antecedent (α) and a consequent (γ), like the coindexing rule discussed in chapter 2.5 I suppose that not only the general two-term character of linguistic rules is universal, but also the configurations in which α and γ can be linked by a rule. The typical rule configuration for subordinate structures is characterized by the notion "c-command", while coordinate structures are governed by a not yet fully understood notion of "parallelism".6 The Locality Principle presupposes this universal and narrow range of possible α-γ configurations. The subscripts in (9) (i, i+1, etc.) indicate relative distance of the α's from γ. Thus, for non-coordinate constructions, α_1 is the first (closest) antecedent c-commanding γ; α_2, which c-commands α_1 in turn, is the second closest possible antecedent for γ, and so on. If the Locality Principle applies to coordinate structures, the notion "parallelism" is presupposed instead of "c-command". Note that the Locality Principle is not an intervention con-
straint, i.e. α_i does not necessarily intervene between α_{i+1} and γ.

IO > DO > Adjuncts
Sentence (146) is now accepted without new auxiliary hypotheses to the Locality Principle (where is less prominent than who), while (143a) and (143b) are rejected (who is more prominent than where and when, respectively). This solution appears to be applicable to a number of construals without Wh-phrases:
(148)a.  John_i seems e_i to have left the day before
      b. *The day before_j seems John to have left e_j
The day before is an NP, and it is at the same level as the embedded subject in (148a), but since it is an adjunct it cannot block linking of John to its trace. In (148b), however, the day before cannot be linked to its trace because the intervening subject John is more prominent.74
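The role of relative prominence can be rendered schematically as well. In the sketch below (Python, purely illustrative), the ranking places subjects above the IO > DO > Adjunct fragment cited above, in line with the discussion of (148b); the numerical values and the names used are assumptions of the illustration only.

# Toy prominence ranking (higher number = more prominent); the placement of
# "Subject" above the IO > DO > Adjunct fragment follows the discussion of
# (148b), where the intervening subject blocks linking of an adjunct.
PROMINENCE = {"Subject": 3, "IO": 2, "DO": 1, "Adjunct": 0}

def blocks_linking(intervener_role, antecedent_role):
    # An intervener blocks linking of an antecedent to its gap only if the
    # intervener is at least as prominent as the antecedent.
    return PROMINENCE[intervener_role] >= PROMINENCE[antecedent_role]

# (148a): the adjunct "the day before" does not block linking of the subject John
print(blocks_linking("Adjunct", "Subject"))   # -> False
# (148b): the subject John blocks linking of the adjunct "the day before"
print(blocks_linking("Subject", "Adjunct"))   # -> True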
A similar explanation can be given for the acceptability of (149) (if predicative constituents like a fool are adjuncts):

(149) John_i was considered e_i a fool
A fool is at the same level as e_i, but it does not block linking of John to its trace because it is less prominent as an adjunct than the subject John.75 From now on, I will consider the notion "prominence" in (42) to be defined by (147). If we slightly extend (42) for English, we can account for the problematic data of Tanya Reinhart's (135). This language-specific example goes beyond core grammar and therefore represents a marked option, which we can state in the notation familiar from the end of chapter 2:
(150)a. UG: (42) unless