Local Modelling of Non-Local Dependencies in Syntax 9783110294774, 9783110294712


English Pages 532 Year 2012



Artemis Alexiadou (editor)
Tibor Kiss (editor)
Gereon Müller (editor)


Linguistische Arbeiten

547

Edited by Klaus von Heusinger, Gereon Müller, Ingo Plag, Beatrice Primus, Elisabeth Stark and Richard Wiese

Artemis Alexiadou, Tibor Kiss, Gereon Müller (Eds.)

Local Modelling of Non-Local Dependencies in Syntax

De Gruyter

ISBN 978-3-11-029471-2
e-ISBN 978-3-11-029477-4
ISSN 0344-6727

Library of Congress Cataloging-in-Publication Data
A CIP catalog record for this book has been applied for at the Library of Congress.

Bibliographic information published by the Deutsche Nationalbibliothek
The Deutsche Nationalbibliothek lists this publication in the Deutsche Nationalbibliografie; detailed bibliographic data are available on the Internet at http://dnb.d-nb.de.

© 2012 Walter de Gruyter GmbH, Berlin/Boston
Production: Hubert & Co. GmbH & Co. KG, Göttingen
∞ Printed on acid-free paper
Printed in Germany
www.degruyter.com

Contents

Local Modelling of Non-Local Dependencies in Syntax: An Introduction
Artemis Alexiadou, Tibor Kiss & Gereon Müller

Long Distance Agreement in Relative Clauses
Fabian Heck & Juan Cuartero

In Support of Long Distance Agree
Artemis Alexiadou, Elena Anagnostopoulou, Gianina Iordăchioaia & Mihaela Marchis

Agree, Move, Selection, and Set-Merge
Petr Biskup

Probing the Past: On Reconciling Long-Distance Agreement with the PIC
Marc Richards

Reflexivity and Dependency
Tibor Kiss

Derivational Binding and the Elimination of Uninterpretable Features
Joachim Sabel

German Free Datives and Knight Move Binding
Daniel Hole

Restricted Syntax – Unrestricted Semantics?
Udo Klein

Local Case, Cyclic Agree and the Syntax of Truly Ergative Verbs
Florian Schäfer

A Local Derivation of Global Case Splits
Doreen Georgi

Function Composition and the Linear Local Modeling of Extended NEG-Scope
Hans-Martin Gärtner

Ellipsis and Phases: Evidence from Antecedent Contained Sluicing
Masaya Yoshida & Ángel J. Gallego

Restructuring and Clitic Climbing in Romance: A Categorial Grammar Analysis
Chiyo Nishida

A Derivational View on Movement Constraints
Christina Unger

Are Movement Paths Punctuated or Uniform?
Klaus Abels & Kristine Bentzen

A Hypothetical Proof Account of Chamorro Wh-Agreement
Chris Worth

Deriving Reconstruction Asymmetries
Gregory M. Kobele

Local Modelling of Allegedly Local but Really Non-Local Phenomena: Lack of Superiority Effects Revisited
Dalina Kallulli

Index

Artemis Alexiadou, Tibor Kiss & Gereon Müller

Local Modelling of Non-Local Dependencies in Syntax: An Introduction*

Abstract

This introduction first presents various types of non-local dependencies in syntax (among them instances of movement, reflexivization, case assignment, agreement, consecutio temporum, deletion, switch reference, extended scope of negation, and (semantic) binding). In the second part, we identify three classes of approaches to non-local dependencies: (i) spurious non-locality, (ii) non-local modelling, and (iii) local modelling. Finally, we mention some of the core issues that arise under a local modelling perspective.

1. Non-local dependencies: setting the stage

1.1. Local vs. non-local dependencies

Many syntactic dependencies are strictly local. Often they involve the most local structural relation that is available, viz., sisterhood in phrase structures. This holds, e.g., for the assignment of lexical (or inherent) case, which can generally only affect the lowest argument of a verb – i.e., the sister of V (see Fanselow (2001)). Relevant cases are illustrated in (1): The assignment of inherent genitive to a nominal argument DP by a simple transitive verb in (1-a), and by a ditransitive verb in (1-b), takes place under sisterhood in German. This view is supported by constituency tests: Strict locality is indicated by joint VP topicalization of DPgen and V (with the option of leaving other – higher – arguments in situ) in (1-cd).

(1) a. dass man der Opfer gedenken sollte
       that one.NOM the victims.GEN commemorate should
    b. dass die Staatsanwaltschaft die Lehrer der Bestechlichkeit überführt hat
       that the prosecution.NOM the teachers.ACC the corruptibility.GEN found guilty of has
    c. Der Opfer gedenken sollte man schon
       the victims.GEN commemorate should one.NOM PRT
    d. Der Bestechlichkeit überführt hat die Staatsanwaltschaft die Lehrer
       the corruptibility.GEN found guilty of has the prosecution.NOM the teachers.ACC

* We would like to thank Anke Assmann, Doreen Georgi, Timo Klein and Philipp Weisser for editorial assistance, discussions of non-local dependencies, and various kinds of help with the present volume. We are also grateful to Jakob Hamann and Patrick Schulz for help with typesetting, and to Lisa Morgenroth and Daniela Thomas for their work on the index. And we are particularly indebted to Fabian Heck for his substantial input to the present text.

Local Modelling of Non-Local Dependencies in Syntax, 1-48
Artemis Alexiadou, Tibor Kiss & Gereon Müller (eds.)
Linguistische Arbeiten 547, de Gruyter 2012

In other cases, the dependency may range beyond strict sisterhood but can still be covered by relying on a notion like that of (the minimal) predicate/argument structure (i.e., a predicate together with all its arguments as they are required by its subcategorization frame – itself a syntactic dependency that is very local), or by invoking the clause-mate relation. Thus, the dependency formed by combining a reflexive with its antecedent (‘reflexivization’) is typically confined to a single predicate/argument structure, in the sense that it affects co-arguments. See (2-ab) (where co-occurrence in a reflexivization dependency is indicated by coindexing), and Pollard & Sag (1992), Reinhart & Reuland (1993), and Büring (2005) for approaches to reflexivization that more or less directly incorporate this argument structure-based concept of locality.

(2) a. John₁ likes himself₁
    b. *John₁ thinks that Mary likes himself₁

In many (but certainly not all) languages, the scrambling operation (cf. Ross (1967)) producing variable (or free) word order is also a highly local process, and similar considerations hold for movement to subject position (or raising to subject) in languages that exhibit a requirement to fill this position (the ‘EPP’ property; see Chomsky (1982)). Thus, scrambling in German and raising to subject in English are both dependencies that cannot span a (finite) clause; the dependencies, conceived of as movement, require the base position and the target position to be clause-mates. Local dependencies are shown for scrambling in German in (3), and for raising in English in (4), with t₁ representing the trace left in the base position of the moved item.

(3) a. dass den Fritz₁ keiner t₁ gesehen hat
       that the Fritz.ACC no-one.NOM seen has
    b. dass darüber₁ keiner [DP ein Buch t₁] gelesen hat
       that about this no-one.NOM a book.ACC read has

(4) a. A book₁ was given t₁ to Mary
    b. Mary₁ was talked [PP about t₁]

Note that (4-b), and in particular (3-b), call into question the adequacy of the minimal predicate/argument structure as the local domain relevant for these movement types; the PP darüber (‘about this’) in (3-b) is not an argument of the verb lesen (‘read’). However, the movement operations are still highly local. As shown by the ill-formed examples in (5) (instantiating long-distance scrambling in German and long-distance raising to subject in English, respectively), they are confined by a clause-mateness requirement (where “CP” stands for a clause).

(5) a. *dass wir den Fritz₁ glauben [CP dass keiner t₁ sah]
        that we.NOM the Fritz.ACC believe that no-one.NOM saw
    b. *John₁ seems [CP that t₁ gave a book to Mary]

However, even though many core dependencies in syntax are local, syntactic dependencies may also be non-local, in the sense that they involve two positions in a phrase structure whose correspondence cannot be captured by invoking concepts like sisterhood, (minimal) predicate/argument structure or the clause-mate relation. First and foremost among these potentially non-local dependencies are various types of movement (or displacement) in the world’s languages. In addition (and in contrast to what we have seen above), some kinds of reflexivization may be non-local: Reflexivization is often confined to minimal predicate/argument structures, but may also apply long-distance in certain contexts in certain languages (without necessarily being amenable to an account in terms of logophoricity, see below). Similarly, it looks as though certain kinds of non-local case assignment (which are not necessarily confined to minimal predicate/argument structures) can also occur.
Indeed, it turns out that there are non-local instances of many other dependencies (that may often be classifiable as local in their core occurrences), among them long-distance agreement (in languages like Tsez, Itelmen, Hindi, perhaps also Icelandic), consecutio temporum (which involves a non-local relationship between an embedded tense and a matrix tense), extended scope of negation, extended mood selection (cf., e.g., the relation between a matrix predicate like demand and an embedded mood marking as subjunctive), control of the subject of an infinitive by an argument belonging to a matrix clause, the related phenomenon of switch reference systems indicating identity of reference or disjointness of matrix and embedded subjects, and last but certainly not least the (semantic) binding of variables (as, e.g., in the case of bound-variable pronouns), which is potentially non-local almost by definition.¹ Let us go through some relevant examples.

¹ In order to show that these dependencies are indeed instances of non-local, or long-distance, dependencies, it must of course be established in each case (a) that there is a clause boundary (in a relevant, technical sense, e.g., as a CP phrase headed by C, or as an S node in earlier work), or some other independently established locality domain, separating the two items participating in the dependencies, and (b) that the clause boundary (in this technical sense; or some other locality domain) blocks the local relationship. Otherwise, one could argue that, e.g., a C projection is a mere theoretical construct or artefact. We will return to this issue below.

1.2. Long-distance movement

The most well-known and best established cases of non-local dependencies are displacement constructions like wh-movement, topicalization, relativization, etc., where the moved item and its base position can in principle be separated by arbitrarily many intervening clause boundaries. Examples from English are given in (6-a) (wh-movement), (6-b) (topicalization, from Ross (1967)), and (6-c) (relativization, from Gazdar et al. (1985)).

(6) a. What₁ do you think [CP that John believes [CP that Mary bought t₁]]?
    b. Beans₁ I don’t think [CP you’ll be able to convince me [CP Harry has ever tasted t₁ in his life]]
    c. The man [CP who₁ I think [CP t₁ chased Fido]] returned

The relation between the position in which the argument status of the displaced item is assessed, or its θ-role is assigned (here indicated by a trace t₁, but at this point this should not be taken to imply any theoretical analysis), and the position that it eventually shows up in can span arbitrarily many CPs (clauses), provided that no constraints on movement (like Ross’s (1967) island constraints, Chomsky’s (1973) Subjacency Condition, Rizzi’s (1990) Relativized Minimality condition, etc.) are violated.

What is more, displacement operations like scrambling and raising to subject position, which as we have seen are strictly local in languages like German and English (respectively), can in fact apply non-locally in other languages. Thus, long-distance scrambling from CP is an option in languages like Russian (see, e.g., Müller & Sternefeld (1993) and Bailyn (2001)) and Japanese (see Saito (1985) and Grewendorf & Sabel (1999), among many others; Korean and Persian also belong in this group). The following example taken from Zemskaja (1973) instantiates a well-formed case of long-distance scrambling in (colloquial) Russian.

(7) Ty [DP doktor]₁ videl [CP kogda t₁ pod"ezžal]?
    you doctor.NOM saw when came

Similarly, raising to subject position across an intervening CP boundary seems to be available in a number of languages, among them Greek (see Perlmutter & Soames (1979, ch. 43) and Alexiadou & Anagnostopoulou (1999; 2002), among others) and Kilega and other Bantu languages (see Obata & Epstein (2011) and references cited there). A Greek example that illustrates such a legitimate case of super-raising is given in (8):

(8) [DP I kopeles]₁ fenonde [CP na t₁ fevgun]
    the girls.NOM seem-3.PL SUBJ leave-3.PL

1.3. Long-distance reflexivization

Whereas many instances of reflexivization are clause-bound, and often confined to minimal predicate/argument structures, in some contexts, in some languages, long-distance reflexivization is possible. Some classical examples are given in (9), with (9-a) from Latin (see Kuno (1987), Büring (2005)), (9-b) from Icelandic (see Anderson (1983), Koster (1987), Fischer (2004), Büring (2005), and references cited in the latter two works), (9-c) from Chinese, and (9-d) from Korean (both taken from Cole et al. (1990)).

(9) a. Iccius₁ nūntium mittit [CP nisi subsidium sibi₁ submittātur]
       Iccius.NOM message.ACC sends unless relief.NOM REFL is furnished
       ‘Iccius sends a message that unless relief be given to him (Iccius), ...’
    b. (i) Hann₁ sagði [CP að sig₁ vantaði hæfileika]
           he said that REFL lacked ability
           ‘He said that he lacked ability.’
       (ii) Jón₁ segir [CP að Pétur raki sig₁ á hverjum degi]
            John says that Peter shaves.SUBJ REFL on every day
            ‘John says that Peter shaves him (John) every day.’
       (iii) Jón₁ segir [CP að Maria viti [CP að Haraldur vilji [CP að Bill meiði sig₁]]]
             John says that Mary knows.SUBJ that Harold wants.SUBJ that Bill hurts.SUBJ REFL
             ‘John says that Mary knows that Harold wants Bill to hurt him (John).’
    c. Zhangsan₁ renwei [CP Lisi₂ zhidao [CP Wangwu₃ xihuan ziji₁/₂/₃]]
       Zhangsan thinks Lisi knows Wangwu likes REFL
       ‘Zhangsan thinks that Lisi knows that Wangwu likes him/self.’
    d. Chelswu-nun₁ [CP Inho-ka caki-casin₁-ul sarangha-nta-ko] sayngkakha-n-ta
       Chelswu.TOP Inho.NOM he-REFL-ACC love-PRES-DECL-C think-PRES-DECL
       ‘Chelswu thinks that Inho likes him (Chelswu).’

In all the cases in (9), a reflexive pronoun (that must be bound by some antecedent) is (or at least can be) bound from outside of the minimal CP that it shows up in (typically by a subject).


1.4. Long-distance case assignment

A first potentially relevant instance of non-local case assignment involves exceptional case marking (ECM) constructions, where an embedded infinitival subject receives case from the matrix verb, as in (10) in German.

(10) Maria₁ lässt [α Fritz das Geschirr waschen]
     Maria.NOM lets Fritz.ACC the dishes.ACC wash

However, it is not quite clear that cases like (10) do indeed instantiate non-local case assignment. First, it has often been argued that α does not qualify as a clause boundary (so that case assignment in (10) would still comply with a clause-mate requirement); second, it has sometimes been argued that the embedded subject has in fact undergone raising to the object position of the matrix clause, in which case the dependency would clearly be local; and third, it might be that α does not exist in the first place, with the two predicates (lassen ‘let’ and waschen ‘wash’ in (10)) having been combined into a simple complex predicate, which would also get rid of any potential deviation from strict locality. The second option (i.e., movement of the case assignee to the matrix object position) has also often been pursued for apparent cases of exceptional case marking into finite clauses (see Massam (1985) for general issues, and, e.g., Chung (1976) on Indonesian, Alboiu & Hill (2011) on Romanian, Seiter (1983) on Niuean, and Kotzoglou (2002) on Greek).

Still, it seems that there are some unequivocal cases of non-local case assignment. A particularly striking example is the phenomenon of accusative case assignment by a matrix verb to the internal DP argument of an embedded verb in the Kansai variety of Japanese, as it has been investigated by Ura (2007). In (11-a), the embedded predicate cannot assign accusative case, but the embedded object bears accusative. Ill-formedness results when the matrix predicate cannot assign accusative case (due to passivization), as illustrated in (11-b). This provides a good argument for accusative case assignment by the matrix predicate to the embedded object DP in (11-a).

(11) a. Boku-wa [CP John-ni sono koto-o deki-soo-ya (te)] omow-u
        I-TOP John-DAT the task-ACC able-likely-be(PRES) COMP think-PRES
        ‘I think that John is likely to be able to do the task.’
     b. *[CP John-ni sono koto-o deki-soo-ya (te)] omow-are-te ru
         John-DAT the task-ACC able-likely-be(PRES) COMP think-PASS-PROG PRES
        ‘It is believed that John is likely to be able to do the task.’

Another relevant example involves long-distance assignment of accusative case in Finnish, as it has been described by Vainikka & Brattico (2011). Finnish has four different morphological exponents for structural object case that can be viewed as accusative allomorphs (with the choice basically governed by the well-established principles of differential object marking in simple contexts, see Aissen (2003) and Keine & Müller (2008)). Interestingly, with a particular class of infinitival complements, the choice of accusative marker depends on whether φ-agreement takes place with the nominative subject in the matrix clause. If so, the case allomorph n shows up on the embedded object; if no φ-agreement takes place in the matrix clause (or if the infinitive is not c-commanded by a matrix verb in the first place), the zero accusative allomorph is chosen. This is illustrated by the examples in (12).

(12) a. Yritimme [CP löytää sisko-n/*-ø pihalta]
        try.PAST/1PL find.A sister-ACC(n)/ACC(ø) yard-ABL
        ‘We tried to find the sister in the backyard.’
     b. Yritä [CP löytää sisko-ø/*-n pihalta]!
        try.IMP find.A sister-ACC(ø)/ACC(n) yard.ABL
        ‘Try to find the sister in the backyard!’
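The allomorph choice described for (12) can be restated as a small decision procedure. The following Python sketch merely illustrates the descriptive generalization given above; it is not the analysis of Vainikka & Brattico (2011), and the function name `accusative_allomorph` and its two boolean parameters are hypothetical.

```python
def accusative_allomorph(matrix_phi_agreement: bool,
                         c_commanded_by_matrix_verb: bool = True) -> str:
    """Pick the accusative marker on the embedded object (hypothetical helper).

    Encodes only the generalization stated in the text for one class of
    infinitival complements: -n if the matrix clause shows phi-agreement
    with a nominative subject, the zero allomorph otherwise.
    """
    if c_commanded_by_matrix_verb and matrix_phi_agreement:
        return "-n"
    return "-ø"


# (12-a): finite matrix verb agreeing with its subject
print(accusative_allomorph(matrix_phi_agreement=True))
# (12-b): imperative matrix verb, no phi-agreement with a nominative subject
print(accusative_allomorph(matrix_phi_agreement=False))
```

The value of `matrix_phi_agreement` is a property of the matrix clause, so on this description the marker on the embedded object cannot be computed from the infinitival clause alone.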

A third relevant phenomenon that shows up in argument encoding systems with argument type-based splits is what Silverstein (1976) aptly called global case marking (as opposed to standard instances of local case marking, with the terminology – local vs. global – taken from Chomsky (1965)). For instance, in Yurok (see Robins (1958)), accusative case is realized on the internal argument of a predicate only if the external argument is lower on the referential hierarchy (here governed by person choice) than the internal argument; see (13-a) (no accusative marking on the object because both argument DPs are local – 1. or 2. – person) vs. (13-b) (accusative marking on the object because the internal argument DP is 1. person and the external argument DP is 3. person).

(13) a. Keʔl nek ki newoh-paʔ
        2SG.NOM 1SG.NOM FUT see-2>1SG
        ‘You will see me.’
     b. Yoʔ nek-ac ki newoh-peʔn
        3SG.NOM 1SG-ACC FUT see-3SG>1SG
        ‘He will see me.’

Accusative case assignment in (13-b) does not cross a clause boundary; still, it is non-local in the sense that, in contrast to standard instances of accusative assignment, information available within the VP that contains the case-assigning verb and the object does not suffice to determine whether case is actually assigned; in order to decide this, properties of the VP-external subject argument also have to be taken into consideration.
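The global character of the Yurok pattern can be made concrete with a minimal sketch. The Python code below restates the descriptive rule from the text under a simplifying assumption: the hierarchy is reduced to two ranks, with 1st and 2nd person tied at the top and 3rd person below them. The names `RANK`, `Argument` and `object_gets_accusative` are hypothetical and serve illustration only.

```python
from dataclasses import dataclass

# Assumed, simplified referential hierarchy: local persons (1st, 2nd)
# share the top rank; 3rd person is ranked lower.
RANK = {1: 2, 2: 2, 3: 1}


@dataclass(frozen=True)
class Argument:
    person: int  # 1, 2, or 3


def object_gets_accusative(external: Argument, internal: Argument) -> bool:
    """Accusative is realized on the internal argument only if the
    external argument is lower on the hierarchy than the internal one."""
    return RANK[external.person] < RANK[internal.person]


# (13-a): 2nd person subject, 1st person object
print(object_gets_accusative(Argument(2), Argument(1)))
# (13-b): 3rd person subject, 1st person object
print(object_gets_accusative(Argument(3), Argument(1)))
```

The function needs both arguments as input; a rule that only inspects the verb and its object has no way of distinguishing (13-a) from (13-b).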


1.5. Long-distance agreement

Typically, agreement is a clause-bound dependency: Abstracting away from DP-internal concord (see Alexiadou et al. (2007)), feature sharing involved in binding (see below) and various other, more marginal, phenomena (such as agreement spreading in Archi, see Chumakina & Corbett (2008)), the core cases of agreement are such that a predicate agrees with some DP(s) with respect to φ-features like person, number, and gender. Usually, the agreement controller is an argument of the target predicate, but in some cases, a minimal extension of the local domain for agreement beyond the minimal predicate/argument structure may be necessary, as, e.g., with agreement in possessor raising constructions; an example from Mohawk is given in (14) (see Baker (1988)). Here, the predicate agrees with the possessor (‘John’) of the incorporated N (‘house’), as evidenced by the marker hrao (3M); agreement with an unincorporated N nuhs would have triggered the marker ka (3N) instead.

(14) Hrao-nuhs-rakv ne sawatis
     3M-house-white John
     ‘John’s house is white.’

However, notwithstanding the question of whether this analysis in terms of genuine possessor raising can be maintained after all (see Baker (1996) for qualifications), it is clear that the agreement here would still qualify as fairly local – it would still be a clause-bound process.

The case is different with another class of non-local agreement phenomena which have figured prominently in the more recent literature. In cases of so-called long-distance agreement (LDA), a matrix verb agrees with the argument of an embedded clause with respect to φ-features. Some relevant examples are given in (15), from Hindi (in (15-a)), Kashmiri ((15-b); both examples are taken from Bhatt (2005)), Tsez ((15-c), from Polinsky & Potsdam (2001)), Kutchi Gujarati ((15-d), from Grosz & Patel (2006)), Khwarshi ((15-e), from Khalilova (2007)), and Chukchee ((15-f), from Bošković (2007)).

(15) a. Vivek-ne [CP kitaab paṛh-nii] chaah-ii
        Vivek-ERG book.F read-INF.F want-PFV.F.SG
        ‘Vivek wanted to read the book.’
     b. Raam-an che hameeSI yatshImatsI [CP panInis necivis khAAtrl koori vuchini]
        Ram-ERG be.PRS.F always wanted.F.PL self.DAT son.DAT for girls see-INF.F.PL
        ‘Ram has always wanted to see girls for his son.’
     c. Eni-r [CP už-ā magalu b-āc’-ru-łi] b-iy-xo
        mother-DAT boy-ERG bread.III.ABS III-eat-PSTPRT-NMLZ III-know-PRS
        ‘The mother knows the boy ate the bread.’
     d. Valji-ne [CP chopri vanch-vi] par-i
        Valji.M-DAT book.F read-INF.F have.to-PFV.F
        ‘Valji had to read the book.’
     e. Išet’u-l y-iq’-še [CP goli uža bataxu y-acc-u]
        mother/OBL-LAT G5-know-PRS COP boy/ERG bread(G5) G5-eat-PTCP:PST
        ‘Mother knows that the boy ate bread.’
     f. ənan qəlɣilu ləŋərkə-nin-et [CP iŋqun ∅-rətəm’ŋəv-nen-at qora-t]
        he-INST regrets-3-PL that 3SG-lost-3-PL reindeer.PL.NOM
        ‘He regrets that he lost the reindeers.’

As indicated by underlining in the glosses, in all the LDA cases in (15), the matrix predicate agrees with respect to φ-features (person, number, and gender – note that in the Nakh-Daghestanian examples in (15-c) and (15-e), the numbers signal genders rather than inflection classes). An interesting additional observation is that in all these cases, the embedded V also has to agree with whatever the matrix V agrees with.

1.6. Other phenomena

Sequence of tense restrictions are also prototypical cases of a non-local dependency: Here the interpretation of an embedded tense depends on the tense specification of the matrix clause (in addition to other properties of the embedded clause, like aspect). Consider the following data from Korean and English (taken from Kang (1996)).

(16) a. Yuna-nǔn [CP Minsu-ka ap-ass-ta-ko] malhæ-ss-ta
        Yuna-TOP Minsu-NOM be.sick-PAST-DECL-COMP say-PAST-DECL
     b. Mary said [CP that John was sick]

The Korean example in (16-a) can only have the (expected) reading where Minsu’s being sick precedes Yuna’s saying so, i.e., both PAST exponents are interpreted regularly. In contrast, the English example in (16-b) can be understood in such a way that the time of John’s being sick and the time of Mary’s utterance are identical. In this reading, the embedded tense information must be ignored. Various proposals have been advanced to account for consecutio temporum effects like the one at hand (see, e.g., Ogihara (1989), Stechow (1995; 2003), Kratzer (1998)), but all existing analyses converge on treating the effect as a non-local phenomenon (e.g., by postulating an appropriate non-local tense-deletion rule, or by postulating a special type of non-local binding).

Next, VP ellipsis, conceived of as PF deletion of a VP (see Merchant (2001)), may need to be viewed as non-local in certain contexts. As argued by Aelbrecht (2010) (though see Bošković (2012) for a different approach), examples such as (17-a) in English must be analyzed in such a way that the lexical item licensing the deletion is not the locally embedding non-finite auxiliary (here: been), but the non-local finite head higher up in the structure (here: should); the argument is based on the premise that if the local non-finite verb could license ellipsis, (17-b) should also be possible, which it is not.

(17) a. I hadn’t been thinking about that. Well, you should have been thinking about that
     b. *I hadn’t been thinking about it, but I recall Morgan having been thinking about it

Another phenomenon whose non-locality may be initially unexpected concerns the scope of sentential negation. Typically, the scope of sentential negation is restricted to the minimal clause that it occurs in. Some initial doubt may be cast on the correctness of this generalization in the case of infinitival constructions such as the one in (18) in German (see Grewendorf (1988), Kiss (1995), and Haider (2010), among many others). The sentence is ambiguous, with the more natural interpretation assigning the negative item that is part of the object DP of the embedded verb (niemanden (‘no-one’)) wide scope: The natural reading is one where it is not the case that he intended to disturb anyone (and not one where he actually intends that no-one will be disturbed).

(18) dass er [α niemanden zu stören] beabsichtigt hat
     that he-NOM no-one.ACC to disturb intended has

However, there would seem to be a general consensus that (18) is not to be analyzed as a genuinely bi-clausal structure; either sentential negation is already placed in the matrix clause (as part of the object DP in (18), in which case the phenomenon is at best an instance of the non-locality of movement), or clause union (or complex predicate formation) has applied, and the structure is monoclausal to begin with. More interesting in the present context is the status of English constructions like those in (19-ab) (see Klima (1964), Kayne (1998)), where negation can take wide scope in the presence of clearly bi-clausal structures, thereby giving rise to non-local, extended scope of negation.

(19) a. I will force you [CP to marry no-one]
     b. She has requested [CP that they read not a single book]

Next, obligatory control structures as in (20) are inherently non-local if one assumes that they qualify as biclausal, with an empty category (like, perhaps, PRO) in the embedded subject position.

(20) John₁ tries [CP PRO₁ to win]

Similarly (and perhaps even more strikingly), in languages with switch reference systems, there is a special marker on the verb of some clause CP₂ if the subject of CP₂ is coreferent with the subject of an immediately adjacent clause CP₁ that is part of the same syntactic structure. In addition, in cases of disjoint reference of the two subjects, there often is another type of marker (a ‘different subject marker’) on CP₂. Thus, it seems that in order to determine marker choice in CP₂ (same subject marker or different subject marker), the referential value of the two subjects must be compared. Some relevant examples with same subject (SS) marking and different subject (DS) marking in Choctaw are given in (21-a) and (21-b), respectively (see Broadwell (1997)).

(21) a. [CP₁ [CP₂ John-at abiika-haatokoo-sh] ik-iiy-o-tok]
        John-NM sick-because-SS III-go-NEG-PT
        ‘Because John₁ was sick, he₁ didn’t go.’
     b. [CP₁ [CP₂ John-at abiika-haatokoo-n] ik-iiy-o-tok]
        John-NM sick-because-DS III-go-NEG-PT
        ‘Because John₁ was sick, he₂ didn’t go.’

As noted in the introduction of Weisser (2012), this is clearly a non-local dependency, and it is typically modelled as such in the literature, usually by invoking principles of binding theory (see Finer (1985), Watanabe (2000)).[2] Finally, (semantic) binding of items that are interpreted as variables is often non-local, and sometimes radically so. Consider the case of bound variable pronouns, as in the German example in (22).

(22) Jeder Student1 denkt [CP dass die Prüfung klappen wird [CP wenn er1 sich bemüht ]]
     every student thinks that the exam work out will if he REFL tries

Assuming an approach like the one in Heim & Kratzer (1998), the λ-operator associated with a quantified DP like every student in (22) can be arbitrarily far away (provided that c-command is available) from the variable that it binds (he in (22)).

[2] Still, there is evidence that switch reference obeys some locality restrictions; e.g., same subject marking cannot skip an intervening CP.

2. Strategies for analysis

Three general types of approach to non-local dependencies in syntax can be distinguished. First, one can pursue the hypothesis that dependencies that look as though they are non-local can in fact be shown to be local on closer inspection. Let us call this kind of approach the spurious non-locality approach. Second, one may bite the bullet and simply postulate that syntactic dependencies can in fact be non-local, and that there is no reason not to assume that syntactic theory can handle non-local dependencies directly. Let us call this strategy a non-local modelling approach. And third, one may argue that instances of non-local dependencies should be decomposed into sequences of smaller, local dependencies. This is the local modelling approach that is the primary topic of the present volume. Let us go through the three kinds of approaches in a bit more detail.

2.1. Spurious non-locality

For many syntactic dependencies that would seem to qualify as non-local at first sight, it has been argued that closer scrutiny might reveal them to be local after all. Still, it seems fair to conclude that spurious non-locality approaches have not all been equally successful in reanalyzing the different kinds of (seemingly) non-local dependencies as local dependencies.

2.1.1. Spurious non-locality: reflexivization

With respect to non-local reflexivization (see subsection 1.3), a spurious non-locality approach would seem to qualify as the standard approach, given the state of the art.
On this view (see Pollard & Sag (1992), Reinhart & Reuland (1993), Büring (2005)), typical instances of reflexivization are inherently local (confined to minimal predicate/argument structures, or at least to minimal clauses); and what looks like non-local reflexivization does not actually involve reflexive pronouns that must find a binder in some local domain, but rather ‘exempt’ anaphors that are governed by concepts like logophoricity (see Sells (1987)): A logophoric pronoun refers to the source of an embedded proposition; it can always show up as a first person pronoun if the embedded proposition is transformed into a separate quotation (with John said that he.LOG will leave becoming John said: “I will leave”); and it may in principle also occur without an (overt) antecedent. Although this kind of approach has proven successful in certain areas, it still gives rise to a number of problems. First, a uniform concept of reflexive (‘anaphor’, in Chomsky’s (1981) terminology) becomes unavailable. (Plus, the notion of reflexivity as it is used in Reinhart & Reuland (1993) cannot be defined without recourse to reflexives.) Second, it becomes more difficult to capture cross-linguistic variation. Third, the generalization that long-distance reflexives are always morphologically simplex remains unaccounted for. And finally, there seem to be cases of long-distance reflexivization where a concept like logophoricity is not involved.

2.1.2. Spurious non-locality: case assignment

Next, as regards non-local case assignment (see subsection 1.4), the first thing to note is that clear cases involving a genuine long-distance dependency seem to be few and far between. As remarked above, instances of ECM with infinitives can be given a local analysis by assuming the absence of a clause boundary and/or complex predicate formation, and instances of ECM with finite clauses often suggest that the case-marked item has undergone movement to the matrix clause. Based on an analysis of VP-internal (and apparently non-locally assigned) nominative DPs in Icelandic, McFadden (2009) explicitly advances the generalization that the only case that can show up in non-local configurations is the nominative (which he takes to be assigned by default, i.e., without a case assigner being present in the structure). Still, as we have seen, there are a couple of phenomena suggesting that non-local case assignment might sometimes be an option.

2.1.3. Spurious non-locality: agreement

In the area of non-local agreement (see subsection 1.5), the spurious non-locality approach is actually one of the most widely adopted strategies of analysis. Analyses of LDA adhering to this general pattern come in two varieties.
First, it is sometimes argued that the matrix verb and the DP that it agrees with form part of the same local domain from the beginning. Against the background of Chomsky’s (2001) theory of phases (where the predicate phrase vP and the clause CP qualify as locality domains), Boeckx (2004) and Bhatt (2005) suggest that the verb and the DP that undergo LDA are part of the same phase; this may be so either because there is very little phrase structure involved (see Boeckx (2004)), or because phases can indeed be somewhat bigger than is normally assumed (see Bhatt (2005)): On this view, LDA only affects restructuring (‘coherent’, ‘clause union’) infinitives. According to a second type of spurious non-locality approach, LDA arises as a consequence of the embedded DP moving to the matrix clause; i.e., on this view, non-local movement may feed local agreement.


Thus, it has been suggested (see Polinsky & Potsdam (2001), Polinsky (2003), Chandra (2005)) that DP moves to the left edge of the embedded phase (possibly higher) in LDA constructions, and thereby reaches the matrix V’s local domain. Case requirements or semantic/information-structure related reasons (e.g., a topic interpretation) may then be identified as possible triggers for these movement operations. A schematic derivation for this kind of analysis is given in (23): DP moves from the embedded local domain YP (which may be identified as the complement of a phase head, or as a phase) to the matrix domain in (23-a), and as a consequence, it can locally agree with a matrix verb in (23-b).

(23) a. [WP W [ZP [Z′ Z [YP Y DP ]]]]
     b. [WP W[φ] [ZP DP[φ] [Z′ Z [YP Y tDP ]]]]

Of course, this analysis does not get rid of long-distance dependencies per se: The movement operation preceding agreement in the matrix clause is not strictly local. What is more, such an approach clearly depends on the assumption that movement of DP needs to occur for LDA to take place; however, a brief glance at the examples in (15) already makes it clear that movement to the edge of the embedded YP will often have to be assumed to be covert: The DP in question is typically not pronounced in the position in which it needs to show up to effect local agreement with the matrix predicate.

2.1.4. Spurious non-locality: control and switch reference

In the same way, some of the other (apparently) non-local phenomena mentioned above may be reanalyzed as involving only a local dependency. For instance, if control constructions do not actually involve two separate clauses with two separate subjects (one of them remaining without phonological realization), but rather a monoclausal structure, the non-locality issue disappears entirely; cf., among many others, Bresnan’s (1982) Lexical Functional Grammar (LFG) analysis, where a control predicate embeds a non-clausal XCOMP category whose subject is identified with the control predicate’s own subject, and a constraint on ‘functional locality’ ensures that such identification cannot be recursive (such that the subject of an XCOMP of an XCOMP could be identified with the control predicate’s subject). Similarly, under Keine’s (2011) reanalysis of switch reference markers as coordinating conjunctions (with SS markers realizing conjunctions of VPs, i.e., of categories that do not include an external argument yet, and DS markers realizing conjunctions of vPs, i.e., of categories with an external argument in them), the non-locality of switch reference marking emerges as spurious.

2.1.5. Spurious non-locality: movement

However, severe problems arise for this view in the domain of non-local movement phenomena (on the existence of which some analyses in terms of spurious non-locality may be parasitic, as we have just seen). In some cases, instances of seemingly non-local movement may indeed be reanalyzed as local. This holds, e.g., for the displacement of (unstressed or clitic) pronouns from infinitives embedded under certain verbs (‘restructuring’ verbs) in languages like Spanish or German, as in (24-a) vs. (24-b) (Spanish) and (25-a) vs. (25-b) (German).

(24) a. Luis las1 quiere ([α) comer t1 (])
        Luis them wants to eat
     b. *Luis las1 insistió [α en comer t1 ]
        Luis them insisted on to eat

(25) a. Maria hat sie1 heute ([α) t1 zu holen (]) versprochen
        Maria.NOM has them.ACC today to fetch promised
     b. *Maria hat sie1 heute [α t1 zu holen ] abgelehnt
        Maria.NOM has them.ACC today to fetch rejected

Aissen & Perlmutter (1983) show that there is strong evidence against analyzing the construction in (24-a) as involving genuine movement of a pronominal element from a clausal complement (so-called ‘clitic climbing’); rather, a clause-union operation may have been triggered by the matrix verb, and the whole construction is mono-clausal (with the clitic pronoun showing up in a perfectly regular position). Similarly, Haider (1993; 2010) and Kiss (1995) (among others) argue that the German construction in (25-a) does not involve movement from a clausal complement (either scrambling, or some special pronoun fronting); again, the assumption is that a clausal boundary α does not have to show up in the presence of a certain type of matrix verb (which permits formation of a complex predicate as a lexical property).

More generally, though, a systematic local reanalysis of more recalcitrant data involving non-local movement like those in (6) does not suggest itself in any obvious way. However, it is worth noting that partial attempts in this direction have been made in the literature. For instance, Reis (1996) suggests that what initially looks like a case of non-local extraction from a verb-second complement clause in German (as in (26-a)) should be reanalyzed as involving only a local, clause-bound movement operation accompanied by a special type of ‘integrated’ parenthetical expression, as indicated in (26-b) (also see Kiziak (2007)).

(26) a. [CP Wen1 denkst du [CP meint Maria [CP sollten wir t1 einladen ]]] ?
        whom.ACC think you.NOM believes Maria.NOM should we.NOM invite
     b. [CP Wen1 – [ denkst du meint Maria ] – sollten wir t1 einladen ] ?
        whom.ACC think you.NOM believes Maria.NOM should we.NOM invite

Interestingly, given that a standard way of producing long-distance dependencies in German is by wh-scope marking (accompanied by local movement), as in (27-a), and given further that movement from CPs headed by the complementizer dass (‘that’) (as in (27-b)) is highly marked, or indeed fully unavailable, for some speakers of Standard German, one could then come up with the radical hypothesis that for some variety of German, there is no long-distance wh-movement at all.

(27) a. [CP Was1 denkst du [CP wen1 wir t1 einladen sollten ]] ?
        what.ACC think you.NOM whom.ACC we.NOM invite should
     b. #[CP Wen1 denkst du [CP dass wir t1 einladen sollten ]] ?
        whom.ACC think you.NOM that we.NOM invite should

Finally, the modelling of long-distance movement dependencies carried out in analyses developed within Tree-Adjoining Grammar (TAG; see Kroch (1989), Frank (2002), and references cited there) can arguably be viewed as coming close to a local reanalysis. The basic assumption is that all long-distance dependencies must be brought about by (counter-cyclic) insertion (‘adjunction’) of so-called auxiliary trees that ‘pump up’ the local phrase structure generated thus far (so-called ‘elementary trees’). Thus a sentence like (28-a) is derived by inserting (28-b) (where think, which will eventually become the matrix predicate, subcategorizes for a C′ category) into the C′ node of (28-c). Crucially, (28-c) only has local, clause-bound movement to the minimal SpecC position.

(28) a. What1 do you think that Mary bought t1 ?
     b. [C′ do you think C′ ]                    (auxiliary tree)
     c. [CP what1 [C′ C she bought t1 ]]         (elementary tree)
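The adjunction step behind (28) can be pictured with a short Python sketch. It is only an illustration under simplifying assumptions: trees are plain labelled nodes, the intermediate projection is written `C'`, the foot node of the auxiliary tree is flagged explicitly, and the names `Node`, `adjoin` and `terminal_yield` are invented here rather than taken from the TAG literature.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    label: str
    children: List["Node"] = field(default_factory=list)
    is_foot: bool = False  # marks the foot node of an auxiliary tree


def replace_foot(aux: Node, subtree: Node) -> Node:
    """Copy the auxiliary tree, putting `subtree` in place of its foot node."""
    if aux.is_foot:
        return subtree
    return Node(aux.label, [replace_foot(c, subtree) for c in aux.children])


def adjoin(tree: Node, aux: Node, target_label: str) -> Node:
    """Adjoin `aux` at the first non-terminal node labelled `target_label`."""
    done = False

    def walk(node: Node) -> Node:
        nonlocal done
        if not done and node.children and node.label == target_label:
            done = True
            return replace_foot(aux, node)
        return Node(node.label, [walk(c) for c in node.children], node.is_foot)

    return walk(tree)


def terminal_yield(node: Node) -> List[str]:
    """Read off the terminal labels from left to right."""
    if not node.children:
        return [node.label]
    return [w for c in node.children for w in terminal_yield(c)]


# Elementary tree modelled on (28-c): only clause-bound movement of 'what'.
elementary = Node("CP", [
    Node("what"),
    Node("C'", [Node("C"), Node("she"), Node("bought"), Node("t")]),
])
# Auxiliary tree modelled on (28-b): root and foot share the label C'.
auxiliary = Node("C'", [
    Node("do"), Node("you"), Node("think"), Node("C'", is_foot=True),
])

derived = adjoin(elementary, auxiliary, "C'")
print(" ".join(terminal_yield(derived)))
```

In the derived tree, the wh-phrase and its trace are separated by the material of the auxiliary tree, although the elementary tree itself contains only a clause-bound dependency.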

Extending earlier work by Brosziewski (2003), Unger (2010) develops a related, but even more radical, analysis in a minimalist approach: A wh-phrase that is to undergo displacement merges with V by first carrying out a split operation, whereby the wh-item itself, together with its feature wh that drives the operation, ends up as one part, and an empty element ε that bears the categorial information ends up as another part, of a complex category. The first part is next moved to the edge domain of V, and the second part is concatenated with V by a regular merge operation. Crucially, this extremely small movement step is the only instance of movement that there is in the theory: The effects of long-distance displacement are brought about by successively merging other material with the non-edge (nucleus) domain of the linguistic expression created thus far, which pushes the wh-phrase up the tree one step after the other, until an interrogative C head is merged that then remerges the wh-item by removing it from the edge domain and concatenating it with the expression created so far, thereby eventually producing a non-complex linguistic expression. Still, in Unger’s analysis as in the original TAG analyses, whereas the rules of grammar envisage only local movement operations, the resulting structures give rise to non-local dependencies, with the displaced item separated from its base position by arbitrarily many clause boundaries.

Finally, a dependency that may prove even more recalcitrant for a local reanalysis than long-distance movement is the binding of variables, as in (22). Here it seems that a spurious non-locality approach would have to dispense with the very concept of variable binding, and resort to a variable-free semantics (cf. Jacobson (1999), Büring (2005)).

2.2. Non-local modelling

2.2.1. Types of dependencies

There is not a lot to be said about analyses that treat non-local dependencies in syntax by non-local means.
In the early days of transformational grammar, this used to be the only approach that was available, with non-local phenomena covered by transformations mapping one phrase structure tree, or P-marker (called SD, structural description), to another one (SC, structural change) (see Chomsky (1965; 1975)), and restrictions on the dependencies stated by constraints on variables in the structural descriptions (see Ross (1967), Bresnan (1976a;b)). In a few current theories of grammar, this is still a standard kind of analysis, at least for non-local movement. This holds, e.g., for Lexical Functional Grammar (LFG); cf. Dalrymple (2001). Here, non-local movement dependencies can be stated as identity relations between two grammatical functions; what qualifies as a legitimate identity relation in a non-local dependency is then encoded as a regular expression in the phrase-structure component.

Non-local analyses of non-local dependencies have also often been proposed for other phenomena. Thus, in the case of reflexivization, it has sometimes been argued (based on the hypothesis that spurious non-locality approaches do not suffice for all the relevant data) that the dependency between an antecedent and the long-distance reflexive bound by it does indeed require a non-local approach that directly correlates the two positions in order to determine whether the dependency is legitimate or not; see, e.g., Koster (1987), Manzini & Wexler (1987), and Progovac (1992). Similarly, as regards non-local agreement, it has been proposed that LDA may involve a genuinely non-local dependency that should be modelled as such; see Stjepanović & Takahashi (2001), Sells (2006), and Bošković (2007), who argue that non-local agreement may selectively circumvent locality domains in a way that other dependencies may not. For instance, Bošković (2007) identifies the phase as the relevant locality domain and concludes (in contrast to Boeckx (2004) and Bhatt (2005)) that LDA crosses phase boundaries. However, agreement dependencies are assumed to simply be insensitive to intervening phases, by stipulation; so the phenomenon emerges as truly non-local under this analysis.

As for non-local case assignment, most of the existing analyses are inherently non-local (e.g., Ura’s (2007) analysis of ECM in Kansai Japanese (see (11)) simply implies case assignment across a CP boundary, albeit one which is classified as “not a strong phase”).
With respect to the subcase of what Silverstein (1976) called ‘global case marking’ (see (13)), Aissen (1999) and de Hoop & Malchukov (2008) develop non-local accounts according to which φ- and definiteness-related properties of the external argument DP and φ- and definiteness-related properties of the internal argument DP can simultaneously be taken into account in order to determine whether the verb assigns case. Similar conclusions can be drawn for all the other cases of non-local dependencies discussed above: An analysis in terms of non-local modelling would always seem to qualify as the most straightforward approach, and has regularly been pursued; often, it qualifies as the standard approach, too.

2.2.2. Potential arguments against non-local modelling

However, non-local approaches to non-local dependencies require scanning large amounts of structure, which is sometimes considered dubious from a conceptual point of view (see McCloskey (1988) for a sketch of relevant considerations underlying the general abandonment of non-locality in syntactic theory). Sometimes it is argued (particularly in analyses of minimalist provenance) that a local modelling of non-local dependencies brought about by a reduction of syntactic domains (and concurrent postulation of a means to pass on the required pieces of information in a local fashion, thereby ultimately connecting the two items taking part in the long-distance dependency) may contribute to “efficient computation” by reducing “computational complexity” (see, e.g., Chomsky (2001; 2005; 2007)). This would then imply a conceptual argument in favour of a local modelling of non-local dependencies. However, such work typically does not provide a formal theory of complexity against which such claims could be checked. Therefore, it seems fair to conclude that one should treat these kinds of arguments with caution, at least for the time being.[3]

Another conceptual argument for a local (as opposed to a non-local) modelling of non-local dependencies that is perhaps more straightforwardly relevant comes from learning theory (see Heck & Müller (2010)): In a local approach, the set of possible grammars that the language learner needs to consider is reduced (see, e.g., Chomsky (1972), Sternefeld (2000)). The argument goes as follows. Let T1 be a theory according to which every grammar of a natural language obeys the constraint that a dependency may not cross more than one clause boundary. Next, let T2 be a theory according to which arbitrarily many clause boundaries may be crossed by syntactic dependencies. If one compares T1 and T2, it turns out that, ceteris paribus, the set of possible grammars of T2 is a superset of the set of possible grammars of T1.
The reason is that T2 also (but, crucially, not exclusively) contains grammars that generate only dependencies which are more local in the sense that they cross at most one clause boundary. This consideration may suggest that a local reanalysis of non-local dependencies in syntax may push theory formation further in the direction of explanatory adequacy.[4]

Furthermore (and perhaps even more importantly), there are empirical challenges for a non-local approach to non-local dependencies. Most obviously, these challenges arise in the area of long-distance movement: The syntactic structure between a displaced item and its base position may show certain morphological exponents and/or alternations that cannot be present outside this area, i.e., in domains that are not affected by movement (see Lahne (2009) for comprehensive discussion). This would seem to raise problems for a non-local approach, and argue for a segmentation of longer movement dependencies into smaller steps (see below). Thus, in some languages, wh-movement may be partial in the sense that the movement operation does not overtly reach the target position in the interrogative clause from which the wh-phrase takes scope, but stops in some lower position in the left periphery of a clause; cf., e.g., the phenomenon of partial wh-movement in Ancash Quechua (see Cole (1982)), Iraqi Arabic (see Wahba (1992)), and German (see Cheng (2000), Sabel (2000)).[5] In other languages, the wh-phrase does show up in its scope position, but there are partial or total reduplication copies in intermediate positions; see Plessis (1977) on Afrikaans and Fanselow & Mahajan (2000), Nunes (2004) on German; in Dutch, wh-movement may strand part of the wh-phrase along the movement path (see Barbiers (2002)); also see McCloskey (2000) on a similar phenomenon in Irish English. In yet other languages, the reflex of successive-cyclic movement shows up on some other, movement chain-external, element along the movement path. Relevant cases include the choice of complementizer in Modern Irish (see McCloskey (1979; 2002), Sells (1984), Noonan (2002), Lahne (2009), among many others); obligatory verb raising to C in Spanish (see Torrego (1984), Baković (1998)), in Basque (see Ortiz de Urbina (1989)), and in Belfast English (see Henry (1995)); the selection of subject pronouns in Ewe (see Collins (1993; 1994)); special verbal morphology (‘wh-agreement’) in Chamorro (see Chung (1994; 1998), Lahne (2009)); tonal downstep in Kikuyu (see Clements et al. (1983)); occurrence of the morphological exponent no in Duala (see Epée (1976), Sabel (2000)); meN deletion in colloquial Singapore Malay (see Cole & Hermon (2000), Fanselow & Ćavar (2001)); and participial agreement in Passamaquoddy (see Bruening (2001)).

[3] This is not to say that we take it to be impossible, or even unlikely, that breaking down single non-local dependencies into multiple local ones may lead to a reduction of complexity, once the relevant notions are properly defined; see Gärtner & Michaelis (2007) and Graf (2009) for relevant discussion.

[4] In this context, also compare Lightfoot (1994) on the hypothesis of ‘degree-0 learnability’ that restricts parameter learning to matrix clauses, hence, to local dependencies.
[5] However, note also that the German partial wh-construction, unlike its Ancash Quechua and Iraqi Arabic relatives, but like its Hungarian counterpart, goes hand in hand with the presence of an overt scope marker (was in German) that may plausibly be reanalyzed as a genuine wh-object quantifying over propositions, with the lower wh-clause then acting as a restriction of this quantifier; see Dayal (1994) and the contributions in Lutz et al. (2000).

Let us look a bit more closely at two such reflexes of long-distance movement, beginning with the variation in complementizer shape in Modern Irish. Here, complementizers vary in form, depending on whether or not movement has taken place from the clause. The regular form of declarative C is go; see (29-a). However, if the left periphery (CP domain) of a clause is targeted by movement, C takes the form aL; see (29-bc); this is an instance of displacement-related morphology, i.e., a reflex of movement. In addition, if a displacement dependency is expressed without movement (which McCloskey (2002) argues to be an option), by a resumptive pronoun in situ, C takes the form aN; see (29-d).

(29) a. Creidim gu-r inis sé bréag
        I-believe C:go-PAST tell he lie
        ‘I believe that he told a lie.’
     b. Céacu ceann1 a dhíol tú t1 ?
        which one C:aL sold you
        ‘Which one did you sell?’
     c. an t-ainm OP1 a hinnseadh dúinn a bhí t1 ar an áit
        the name C:aL was told to us C:aL was on the place
        ‘the name that we were told was on the place’
     d. Céacu ceann a bhfuil dúil agat ann ?
        which one C:aN is liking at you in it
        ‘Which one do you like?’
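The three-way alternation described for (29) amounts to a mapping from the kind of dependency that involves a clause's CP domain to the form of C. A minimal Python sketch, assuming a hypothetical three-valued classification of clauses (the enum and function names are illustrative and are not part of McCloskey's analysis):

```python
from enum import Enum
from typing import List


class Dependency(Enum):
    NONE = "no displacement dependency involves this CP"
    MOVEMENT = "the CP domain is targeted by movement"
    RESUMPTION = "the dependency is expressed by a resumptive pronoun"


# Complementizer forms as described in the text for (29).
COMPLEMENTIZER_FORM = {
    Dependency.NONE: "go",
    Dependency.MOVEMENT: "aL",
    Dependency.RESUMPTION: "aN",
}


def complementizer(dependency: Dependency) -> str:
    """Return the form of C for a single clause."""
    return COMPLEMENTIZER_FORM[dependency]


def complementizers_along_path(clauses: List[Dependency]) -> List[str]:
    """Return the form of C for each clause, listed from highest to lowest."""
    return [complementizer(d) for d in clauses]


# Pattern of (29-c): both clauses on the path are targeted by movement.
print(complementizers_along_path([Dependency.MOVEMENT, Dependency.MOVEMENT]))
# Pattern of (29-a): an embedded declarative without displacement.
print(complementizer(Dependency.NONE))
```

Because the form of C is computed clause by clause, each clause must have access to the information whether it is affected by the dependency, which is the point at issue in the surrounding discussion.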

As McCloskey (2002) shows, these strategies can be mixed, giving rise to intricate patterns of morphological reflexes on the movement path (the phenomenon of ‘chain hybridization’; also see Asudeh (2004), Huybregts (2009), and Assmann et al. (2010)). Next, consider the reflexes of long-distance movement in Ewe, as described by Collins (1993; 1994). As shown in (30-a), in embedded subject positions that have not been crossed by movement, the form of the masculine subject pronoun is always é, never wo. However, if movement takes place across it (focus movement in the case at hand), the embedded subject pronoun can take either form; i.e., optional wo is a reflex of movement into the matrix clause.

(30) a. Kofi1 e me gble na t1 [CP be é/*wo fo Kosi ]
        Kofi FOC I said to that he hit Kosi
        ‘It was Kofi that I told that he hit Kosi.’
     b. Kofi1 e me gble [CP t1 be é/wo fo t1 ]
        Kofi FOC I said that he hit
        ‘It was Kofi that I said that he hit.’

Next to the formal (morphological or syntactic) reflexes of movement that may show up in parts of syntactic structure between a base position and the target position in the languages of the world, there can also be semantic reflexes; in particular, positions that are included in a movement dependency may act as positions into which reconstruction can take place (see Fox (2000)). It should be uncontroversial that the existence of these reflexes of movement initially favours a local modelling of non-local dependencies, in the sense that the reflexes suggest a partition of the structure affected by movement into subparts, and the availability of the relevant information (viz., that some domain has been affected by movement) for other (e.g., morphological) operations.

Against this background, the question arises of how reflexes of movement can be captured in non-local approaches to non-local movement. Dalrymple (2001, ch. 14) develops a non-local LFG approach that addresses this issue. As mentioned in subsection 2.2.1, in this approach, a non-local movement dependency is treated non-locally, as an identity relation involving a moved item and its base position (more specifically, involving a function characterizing the target position of movement, like TOPIC or WH, and the grammatical function characterizing the base position); what qualifies as a permissible movement relation is stated as a regular expression in the phrase-structure component. This analysis does not involve smaller, intermediate movement steps; and, as such, it does not imply any record, or track-keeping device, of a non-local movement dependency in the syntactic structure that shows up between the displaced item and its base position (no feature, no trace, etc.; see below). In view of the existence of morphological reflexes of movement, Dalrymple (2001) proposes that a track-keeping device can be added to phrase structures after all, so as to provide a point of reference for the morphological reflex of movement. For concreteness, Dalrymple proposes a principle demanding that the mothers of all COMPs and GFs that satisfy the regular expression linking a topic function to a grammatical function (i.e., all material that is part of the movement path) must bear an [LDD] (‘long-distance dependency’) feature with the value [+]. In addition, a minimal solution constraint is needed that ensures that the feature [LDD] can only show up in a syntactic structure if it is required by some principle; this has the effect of blocking the feature [LDD] in all environments that are not part of a movement dependency. This way, morphological reflexes of movement can be handled: Special movement-related morphology realizes a [+LDD] feature.[6] This solution may be viewed as satisfactory from a purely technical point of view; but it seems clear that it is inferior to local modellings of the phenomenon (see below): The sole purpose of the [LDD] feature is to make possible accounts of morphological reflexes of movement; the feature does not play any role in bringing about, or restricting, movement dependencies per se. At least from an optimal design perspective as advanced by Chomsky (2001), such an account may therefore be considered dubious.
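The combined effect of the [LDD] principle and the minimal solution constraint can be pictured with a small Python sketch. It is only an illustration under assumptions that are not found in Dalrymple (2001): functional structures are modelled as nested dictionaries, the path expression `(COMP )*(SUBJ|OBJ)` is a hypothetical stand-in for the actual regular expression, and the function name `mark_ldd` is invented here. Leaving the feature absent everywhere off the path stands in for the minimal solution constraint.

```python
import re
from typing import Dict, List

# Hypothetical path constraint: any number of COMPs, then one grammatical function.
PATH_PATTERN = re.compile(r"(COMP )*(SUBJ|OBJ)")


def mark_ldd(structure: Dict, path: List[str]) -> Dict:
    """Add LDD='+' to every structure that is the mother of an attribute on the path."""
    if not PATH_PATTERN.fullmatch(" ".join(path)):
        raise ValueError("path does not satisfy the assumed regular expression")
    current = structure
    for attribute in path:
        current["LDD"] = "+"          # the mother of this COMP or GF is on the path
        current = current[attribute]  # descend one step along the path
    return structure


# Toy structure for a long-distance dependency into the object of an embedded clause.
sentence = {
    "PRED": "think",
    "SUBJ": {"PRED": "you"},
    "COMP": {
        "PRED": "buy",
        "SUBJ": {"PRED": "Mary"},
        "OBJ": {"PRED": "what"},
    },
}

mark_ldd(sentence, ["COMP", "OBJ"])
print(sentence["LDD"], sentence["COMP"]["LDD"])
print("LDD" in sentence["SUBJ"], "LDD" in sentence["COMP"]["SUBJ"])
```

In this toy setting the feature does no work in licensing the dependency itself; it is only written onto the path after the path has been checked, which mirrors the objection raised in the text.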
[6] Technically, this can be brought about by appropriate lexical constraints; also see Assmann et al. (2010) on an extension of this approach to the intricate patterns involving chain hybridization in Modern Irish discussed in McCloskey (2002).

A second kind of potential, empirically rooted argument distinguishing between local and non-local approaches to non-local dependencies is related to the generality and plausibility of constraints on non-local dependencies. To see how this might work, consider two classic constraints on syntactic movement, viz., the Complex Noun Phrase Constraint (CNPC; see Ross (1967)) in (31-a) and the Subjacency Condition (see Chomsky (1977; 1986a), Rizzi (1982)) in (31-b) (both constraints are slightly updated to reflect current terminology).

(31) a. Complex NP Constraint (CNPC):
        No element contained in a CP dominated by a DP may be moved out of that DP.
     b. Subjacency Condition:
        In a structure α ... [β ... [γ ... δ ... ] ... ] ..., movement of δ to α cannot apply if β and γ are bounding nodes. (DP and TP are bounding nodes in English, DP and CP are bounding nodes in Italian.)
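The condition in (31-b) can be read as a check on individual movement steps: A step is blocked if it crosses two bounding nodes at once. The following Python sketch makes this reading explicit under simplifying assumptions that are not part of the original formulation: each step is represented as the list of category labels it crosses, the English bounding nodes are taken from (31-b), and the function names and the particular step lists (modelled loosely on a wh-phrase leaving a CP inside a DP object) are hypothetical illustrations.

```python
from typing import Iterable, List, Set

ENGLISH_BOUNDING_NODES: Set[str] = {"DP", "TP"}


def step_obeys_subjacency(crossed: Iterable[str],
                          bounding_nodes: Set[str] = ENGLISH_BOUNDING_NODES) -> bool:
    """A single movement step is licit if it crosses fewer than two bounding nodes."""
    return sum(1 for label in crossed if label in bounding_nodes) < 2


def derivation_obeys_subjacency(steps: List[List[str]],
                                bounding_nodes: Set[str] = ENGLISH_BOUNDING_NODES) -> bool:
    """A derivation is licit if every one of its steps is licit."""
    return all(step_obeys_subjacency(step, bounding_nodes) for step in steps)


# Extraction from a CP inside a DP, with one intermediate stop in the embedded SpecC.
via_spec_c = [
    ["vP", "TP"],              # base position -> embedded SpecC
    ["CP", "DP", "vP", "TP"],  # embedded SpecC -> matrix SpecC
]

# The same extraction with an additional stop at the edge of the matrix vP.
via_spec_c_and_spec_v = [
    ["vP", "TP"],  # base position -> embedded SpecC
    ["CP", "DP"],  # embedded SpecC -> edge of matrix vP
    ["TP"],        # edge of matrix vP -> matrix SpecC
]

print(derivation_obeys_subjacency(via_spec_c))
print(derivation_obeys_subjacency(via_spec_c_and_spec_v))
```

On this representation, the derivation with a single intermediate stop is rejected, while the one with an additional stop at the vP edge passes the check, so the outcome depends on how finely the dependency is segmented.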

Crucially, the CNPC in (31-a) is compatible with a non-local approach to movement dependencies (and was indeed originally formulated as such, as a constraint on variables in syntax), whereas the Subjacency Condition is explicitly designed as a constraint that presupposes a local modelling of non-local movement dependencies, such that long-distance movement operations are split up into sequences of more local movement operations targeting left-peripheral (‘SpecC’) positions of intervening CPs, one after the other (cf. the next section). Both constraints succeed in ruling out sentences like (32), where wh-movement illegitimately takes place from a CP that is embedded in a DP.

(32) *[DP1 Which book ] did [TP John [vP hear [DP2 a rumour [CP that you had read t1 ]]]] ?

However, the Subjacency Condition also covers several other restrictions on movement that the CNPC is silent about (among other things, it derives the effects of the Wh-Island Condition (see Chomsky (1973)), the Left Branch Condition (see Ross (1967)), the Sentential Subject Constraint (see Ross (1967)), the Subject Condition (see Chomsky (1973), Huang (1982)), and some of the effects attributable to the Coordinate Structure Constraint (see Ross (1967))). It is thus more general; therefore, ceteris paribus, it arguably qualifies as a ‘better’ constraint. Crucially, the two constraints can make different predictions if the wh-movement dependency in (32) is split up into a sequence of smaller dependencies. Suppose first that (32) is made up of two dependencies, such that movement to the left edge of the embedded CP is followed by a second movement step to the target position in the matrix clause (as originally assumed by Chomsky (1977)).
Then, the CNPC and the Subjacency Condition make identical predictions: Under the CNPC, the sentence is still predicted to be ungrammatical because the second movement step crosses a DP from within CP; under the Subjacency Condition, the sentence is ruled out because DP and matrix TP continue to be crossed in one swoop by the second movement step. Suppose next that non-local dependencies are composed of even smaller parts, with intermediate steps to the predicate phrase (vP) also being required (see Chomsky (1986a; 2001; 2008)). Under this assumption, the CNPC still excludes the example, as intended (CP and DP are crossed by a single movement step, even if that step ends up in a lower position than before), whereas the Subjacency Condition in fact no longer excludes (32) (vP intervenes between the two bounding nodes TP and DP). – Then again, suppose that the CNPC were to be minimally modified, such that, e.g., CP is replaced with C′, or “contained” is understood in such a way that the specifier/edge of an XP does not actually count as (properly) “contained” in XP (see Baker (1988), Sportiche (1989), Chomsky (2001) for suggestions in this direction), and suppose further that intermediate steps only affect CP edges (not vP edges). In that case, the CNPC would make wrong predictions for (32) (since it should not be violated anymore), whereas the Subjacency Condition would correctly predict (32) to be impossible.

All these considerations show that arguments distinguishing between local and non-local approaches to non-local movement dependencies can be constructed on the basis of constraints that are independently given. However, we would like to emphasize that the examples just mentioned are given here only to illustrate the general schema of the argument. Both the Subjacency Condition and the CNPC have been convincingly argued to be inadequate (see Riemsdijk (1978), Koster (1978), and the overview in Müller (2011)), and one of the central problems that have been identified – viz., the fact that they are two-node rather than one-node locality constraints – turns out to be the one that makes them particularly interesting as a potential means to distinguish local from non-local approaches to long-distance dependencies. Still, the overall conclusion remains valid: The two types of theories can be distinguished by their behaviour vis-à-vis well-established syntactic constraints (see Heck & Müller (2003; 2007) for several applications of this general logic in the slightly different domain of optimization procedures).

2.3. Local modelling

2.3.1. Local modelling: movement

Third and finally, non-local syntactic dependencies can be modelled in a local way, by partitioning the longer dependencies into combinations of smaller subdependencies. Such an approach has been pursued for all the dependencies mentioned so far, but first and foremost for movement. Consider an example like (33).
(33) What do you think that Mary bought ? (33) would involve a single non-local (‘unbounded’) movement operation in earlier transformational approaches, as shown in (34) (with anachronistic notation, including the presence of a trace in the base position). (34) [ CP What1 do you think [ CP that Mary bought t1 ]] ? In contrast to this, Chomsky (1973) proposes that long-distance (wh-) movement as in (35) applies successive-cyclically, from one clausal edge position to the next one (‘COMP-to-COMP movement’, in the then contemporary conception of

phrase structure; movement from SpecC to SpecC in more current terminology). This is shown in (35).

(35) [ CP What1 do you think [ CP t1 that Mary bought t1 ]] ?

Given the Subjacency Condition in (31-b) (cf. Chomsky (1977)), breaking down the non-local movement dependency into smaller parts is indeed unavoidable: If movement did not first target the embedded SpecC position but went directly to the scope position in the matrix clause, the Subjacency Condition would be violated, with both the embedded TP and the matrix TP crossed by a single movement operation; and ungrammaticality should result in the same way that it results in cases of wh-islands (see (36)), where the use of the intermediate SpecC position is blocked (because this position is already filled, and assuming that specifiers are unique rather than multiple).

(36) *[ CP What1 do you know [ CP who2 C t2 bought t1 ]] ?

Such a moderately local approach to long-distance movement was prevalent for a while in the Principles-and-Parameters framework (see, e.g., Chomsky (1981)), but it is abandoned in Chomsky (1986a) in favour of an analysis that envisages even more local movement steps, by postulating movement to the left edge of the predicate domain (VP) in addition. With the advent of phase theory as an integral part of the minimalist program, this general idea has been systematized as movement to phase edges, where CP and vP are identified as special derivational units, viz., phases; see Chomsky (2000; 2001; 2008), Fox (2000), Nissenbaum (2000), Bruening (2001), Barbiers (2002), and many others. Consequently, a derivation of a sentence like (33) is assumed to involve four separate movement operations, each leaving a trace (or, in most versions of the minimalist program, a copy of the moved item – but these differences are irrelevant for the issues currently under consideration); see (37).

(37) [ CP What1 do you [ vP t1 think [ CP t1 that Mary [ vP t1 [ v′ tMary [ VP bought t1 ]]]]]] ?
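The Subjacency reasoning applied to (32) and (33) can be made concrete with a small sketch. It is only an illustration under simplifying assumptions of ours: a movement path is flattened into the list of node labels dominating the base position (innermost first), a landing site is the edge of one of these nodes, and a step that leaves the edge of a node counts as crossing that node. The function names and the simplified spines are hypothetical and not part of any of the cited proposals.

```python
# Bounding nodes for English, following the parenthetical in (31-b).
BOUNDING_ENGLISH = {"DP", "TP"}


def crossed_per_step(path, landings):
    """Split a movement path into the node lists crossed by each step.

    path: labels of the nodes dominating the base position, innermost first.
    landings: increasing indices into path; index k means "edge of path[k]".
    A step from the edge of path[j] to the edge of path[k] crosses path[j:k];
    the first step starts in the base position and crosses path[0:k].
    """
    steps, start = [], 0
    for k in landings:
        steps.append(path[start:k])
        start = k
    return steps


def obeys_subjacency(path, landings, bounding=BOUNDING_ENGLISH):
    """True if no single step crosses two (or more) bounding nodes."""
    return all(
        sum(label in bounding for label in step) < 2
        for step in crossed_per_step(path, landings)
    )


# (33): base position in the embedded VP, final landing site in matrix SpecC.
PATH_33 = ["VP", "vP", "TP", "CP", "VP", "vP", "TP", "CP"]
# (32): extraction from a CP embedded in DP2 (simplified spine, an assumption).
PATH_32 = ["TP", "CP", "DP", "vP", "TP", "CP"]

one_swoop = obeys_subjacency(PATH_33, [7])            # as in (34)
comp_to_comp = obeys_subjacency(PATH_33, [3, 7])      # as in (35)
cnp_two_steps = obeys_subjacency(PATH_32, [1, 5])     # stop in embedded SpecC only
cnp_via_vp = obeys_subjacency(PATH_32, [1, 3, 5])     # additional stop at the vP edge
```

On this encoding, one-swoop movement in (33) crosses two TPs and fails, whereas COMP-to-COMP movement passes; for (32), the two-step derivation fails, but adding a stop at the matrix vP edge separates DP and TP into different steps, which is the reason why the Subjacency Condition no longer excludes the example under that assumption.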

The fact that movement steps must be local, successively targetting the next available phase edge on the way to the ultimate landing site, does not have to be stipulated in this kind of approach. Rather, it can be derived from the Phase Impenetrability Condition (PIC) in (38) (see Chomsky (2000; 2001)). (38) Phase Impenetrability Condition (PIC): The domain of a head X of a phase XP is not accessible to operations outside XP; only X and its edge are accessible to such operations. The PIC explains why successive-cyclic movement is required; but assuming that all syntactic operations must be triggered by designated features (as it has standardly been assumed in minimalist approaches, but see Chomsky (2008)

for a different view), this means that there must also be some device that guarantees that the intermediate movement steps in (37) are permitted in the first place. There are various possibilities to ensure this. First, it has been postulated that there are features triggering the local movements to phase edges on the phase heads (‘edge’ features) that are available, either freely or under certain conditions (see Chomsky (2000; 2001), Fanselow & Mahajan (2000), Sabel (2000), McCloskey (2002), Müller (2011) for some suggestions). Second, it has been claimed that intermediate movement steps can minimally violate the prohibition against non-feature driven movement (Last Resort) so as to satisfy a higher-ranked constraint (which is identified as Phase Balance in Heck & Müller (2003)). And third, it might be that intermediate movement steps are not the result of genuine movement operations; rather, intermediate traces are inserted (counter-cyclically, i.e., after movement to the final target position has taken place) into appropriate positions (see Chomsky’s (1995) concept of Form Chain, and also Takahashi (1994), Fox (2000), and Boeckx (2003), among others). Sometimes it has been argued that DP also qualifies as a phase (see Svenonius (2004), Heck & Zimmermann (2004), Matushansky (2005), Kramer (2007)); if so, the PIC also requires local movement steps to SpecD in cases of movement from DP (also see Cinque (1980), Shlonsky (1988) for earlier approaches of this kind). Abels (2003; 2012) argues for a phase status of PP, with the same consequence for successive-cyclic movement. Furthermore, it has been suggested that TP may also qualify as a phase (at least in some languages); see Richards (2004; 2011). Given the PIC, this then requires movement to take place successive-cyclically via SpecT.
Finally, in some approaches to movement phases are viewed as more flexible objects that may vary across, or even within, languages; see Grohmann (2000), Bobaljik & Wurmbrand (2003; 2005), Marušič (2005), Gallego & Uriagereka (2006), den Dikken (2007), Gallego (2007), and Bošković (2012). On such an approach, non-local movement dependencies have to be decomposed into smaller steps (of varying degrees of locality) in a non-homogeneous way. In contrast to all these approaches based on selective phase status of XPs, it has also been argued that all XPs qualify as locality domains for movement (see Koster (1978), Riemsdijk (1978)). In line with this, it has been suggested that non-local movement must take place via all intermediate XP edges. This may either follow from the PIC (if all phrases qualify as phases), or may need to be stated separately; see, inter alia, Sportiche (1989), Takahashi (1994), Agbayani (1998), Chomsky (1995; 2005; 2008), Bošković (2002), Boeckx (2003), Boeckx & Grohmann (2007), Müller (2011). In such an approach, an example like (33) has the derivation in (39), with the non-local movement dependency split up into a sequence of six (or more, if more functional categories in the clausal spine are

postulated) separate, extremely local intermediate movement steps before the final movement step to the target position takes place.7

(39) [ CP What1 do [ TP t1 you [ vP t1 [ VP t1 think [ CP t1 that [ TP t1 Mary [ vP t1 [ VP bought t1 ]]]]]]]] ?
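The way in which the choice of phases determines the number of movement steps can be sketched as follows. This is a deliberately simplified illustration, not an implementation of the PIC itself: we only model its consequence that a moved item must stop at the edge of every phase it leaves. The function name, the predicate argument, and the flattened spine for (33) are assumptions introduced for the example; the spine starts above the embedded VP, which contains the base position.

```python
def landing_sites(spine, is_phase):
    """Return the XPs whose edge hosts a movement step, lowest first.

    spine: labels of the XPs above the phrase containing the base position,
           up to and including the XP that hosts the final landing site.
    is_phase: predicate deciding which labels count as phases.
    The topmost XP is always included because movement ends at its edge.
    """
    intermediate = [label for label in spine[:-1] if is_phase(label)]
    return intermediate + [spine[-1]]


# Clausal spine of (33) above the embedded VP (an illustrative simplification).
SPINE_33 = ["vP", "TP", "CP", "VP", "vP", "TP", "CP"]

comp_to_comp = landing_sites(SPINE_33, lambda xp: xp == "CP")        # cf. (35)
classic_phases = landing_sites(SPINE_33, lambda xp: xp in {"CP", "vP"})  # cf. (37)
every_xp = landing_sites(SPINE_33, lambda xp: True)                  # cf. (39)
```

With only CP as a locality domain the sketch yields two steps, with vP and CP as phases it yields the four movement operations of (37), and with every XP as a phase it yields six intermediate steps plus the final one, as in (39).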

An even more local (and even more radical) modelling of non-local movement dependencies involves partitionings where not just every intervening phrase, but every intervening node of the movement path encodes the information that movement has taken place across it. The basic idea goes back to Gazdar (1981; 1982); the resulting technique is usually subsumed under the label of ‘SLASH feature percolation’. In Gazdar’s work, the initial motivation for this mechanism is based on complexity considerations: Given (i) that the computational complexity of classical transformational grammars (as in Chomsky (1965)) is due not to the base component (which consists of context-free phrase structure rules), but rather to the transformational component (with transformations being powerful tools that map phrase markers to phrase markers), and given (ii) furthermore that transformations seem nevertheless required to model displacement, the task is to capture displacement phenomena without transformations. To this end, Gazdar (1981) introduces SLASH features. On this view, with movement transformations gone, ‘movement’ emerges as a mere metaphor. More specifically, Gazdar (1981) distinguishes between three domains of a movement dependency. First, there is the top, the landing site of movement. Second, there is the middle: the movement path. And finally, there is the bottom: the base position of the moved item; see (40).

(40) [What . . . [do you think that Mary bought [t]]]
      top         middle                         bottom

The bottom and top parts of a movement construction can be addressed without further ado in a context-free phrase structure grammar; the crucial innovation that Gazdar introduces concerns the passing on of information in the middle of the dependency. The central concepts put forward in Gazdar (1981) are those of a derived category and of a derived rule. Given a set $V_N$ of basic category symbols, the set of derived categories $D(V_N)$ can be defined as in (41).

(41) $D(V_N) = \{\alpha/\beta : \alpha, \beta \in V_N\}$

Thus, if, say, S (CP) and NP were the only kinds of categories available, then there would be four derived categories, viz., NP/NP, NP/S, S/NP, and S/S. What follows the basic category has become known as the SLASH feature. The SLASH feature signals that something is missing (and what). Next, given the set G of base rules of the grammar, derived rules can be produced on the basis of derived categories: For each syntactic category β, there is a subset of the set of nonterminal symbols $V_N$ whose members can dominate β according to the rules in G. This set is called $V_\beta$ ($V_\beta \subseteq V_N$). Then, for each category β ($\beta \in V_N$), a finite set of derived rules $D(\beta, G)$ can be defined, as in (42).8

(42) Derived Rule Schema:
$D(\beta, G) = \{\alpha/\beta \to \sigma_1 \dots \sigma_i/\beta \dots \sigma_n : \alpha \to \sigma_1 \dots \sigma_i \dots \sigma_n \in G \;\&\; 1 \le i \le n \;\&\; \alpha, \sigma_i \in V_\beta\}$

According to (42), for every basic (context-free) phrase structure rule in the grammar, derived rules are generated in which the symbol on the left-hand side of the rewrite arrow and exactly one symbol on the right-hand side (i.e., a symbol in the replacing string) are derived categories, enriched by identical information about what is missing (unless, that is, the right-hand symbol can never dominate the missing category according to the basic rules G, as is always the case with $X^0$-categories). Thus, if, e.g., the context-free phrase structure rules in (43) are part of G, then the derived context-free phrase structure rules in (44) will be part of D(NP,G), and thus also available in the grammar.

(43) a. S → NP VP
     b. VP → V NP
     c. VP → V S

(44) a. S/NP → NP/NP VP, S/NP → NP VP/NP
     b. VP/NP → V NP/NP
     c. VP/NP → V S/NP

7 This is roughly in compliance with Barrett’s (1967) assumption that all movement is accomplished in six stages.
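The step from the base rules in (43) to the derived rules in (44) can be reproduced mechanically. The following sketch is one possible reading of the schema in (42), under two assumptions of ours: rules are encoded as pairs of a left-hand symbol and a tuple of right-hand symbols, and $V_\beta$ is taken to include β itself (reflexive dominance), which is needed to obtain NP/NP on the right-hand side of (44-a). The function names are hypothetical.

```python
def dominators(beta, rules):
    """V_beta: the categories that can dominate beta, beta itself included."""
    result = {beta}
    changed = True
    while changed:
        changed = False
        for lhs, rhs in rules:
            # lhs can dominate beta if one of its daughters already can.
            if lhs not in result and any(sym in result for sym in rhs):
                result.add(lhs)
                changed = True
    return result


def derived_rules(beta, rules):
    """D(beta, G): slash the mother and exactly one daughter, as in (42)."""
    v_beta = dominators(beta, rules)
    out = set()
    for lhs, rhs in rules:
        if lhs not in v_beta:
            continue
        for i, sym in enumerate(rhs):
            if sym in v_beta:
                slashed_rhs = rhs[:i] + (f"{sym}/{beta}",) + rhs[i + 1:]
                out.add((f"{lhs}/{beta}", slashed_rhs))
    return out


# The base rules in (43).
G = [("S", ("NP", "VP")), ("VP", ("V", "NP")), ("VP", ("V", "S"))]

D_NP = derived_rules("NP", G)
```

For the grammar in (43), `D_NP` contains exactly the four rules listed in (44); V never receives a slash because it cannot dominate NP according to G.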

Derived rules regulate the percolation of SLASH features in the middle; they pass on the information about what is missing in an extremely local way throughout syntactic structures. In addition, rules are needed for the top and for the bottom of displacement constructions. These rules are non-derived rules. The rule for the bottom is basically just a rule schema that introduces traces into the structure; cf. (45-a) (where α can be any category, e.g., NP). Finally, there are various rules for the top, depending on the kind of movement dependency (wh-movement, topicalization, etc.) that is to be captured. Gazdar’s (1981) rule for (NP) relativization in English is given in (45-b). Here, R is the category for a relative clause, NP[±wh,+pro] is the moved relative pronoun (which may be absent in the case of objects), and S/NP is a slashed S category as it occurs in (44-a). The asymmetry in (45-b) (a slashed category on the right-hand side of the rule, a non-slashed, basic category on the left-hand side) ensures that the movement dependency is not propagated further up the tree once the target position of the displaced item is reached.

(45) a. α/α → t
     b. R → (NP[±wh,+pro]) S/NP

8 Gazdar (1981) actually has node admissibility conditions of the type [S NP VP ] instead of phrase structure rules of the type S → NP VP, but this difference can be neglected in the present context.
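The interplay of bottom, middle and top can be illustrated with a small tree-annotation sketch. It does not implement Gazdar's grammar formalism; it merely marks, in a given tree, every node that dominates the trace, which is the effect that the rules in (44) and (45) jointly produce. The tree encoding (pairs of a label and a list of children, with plain strings as terminals), the function names, and the restriction to a single NP gap are assumptions made for the example.

```python
def percolate(tree, missing="NP", trace="t"):
    """Return (annotated tree, True if the tree contains the trace)."""
    if isinstance(tree, str):  # terminal symbol
        return tree, tree == trace
    label, children = tree
    annotated, flags = [], []
    for child in children:
        node, has_gap = percolate(child, missing, trace)
        annotated.append(node)
        flags.append(has_gap)
    if any(flags):
        # Middle and bottom: every node dominating the trace is slashed.
        label = f"{label}/{missing}"
    return (label, annotated), any(flags)


def relative_clause(filler, s_tree):
    """Top of the dependency, cf. (45-b): R itself remains unslashed."""
    slashed_s, has_gap = percolate(s_tree)
    if not has_gap:
        raise ValueError("no trace available for the filler")
    return ("R", [filler, slashed_s])


S_TREE = ("S", [("NP", ["Mary"]), ("VP", [("V", ["bought"]), ("NP", ["t"])])])
R_TREE = relative_clause(("NP", ["which"]), S_TREE)
```

In `R_TREE`, the labels on the path from the trace to the sister of the relative pronoun are NP/NP, VP/NP and S/NP, while R and the subject NP carry no slash, mirroring the asymmetry of (45-b).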

In the standard Generalized Phrase Structure Grammar (GPSG) approach subsequently developed in Gazdar et al. (1985), the essentials of this approach have been maintained. However, there are some differences concerning all three domains of a movement dependency; most importantly, SLASH is explicitly viewed as a (category-valued) feature of categories. As for the bottom, (45-a) is replaced with the SLASH Termination Metarule in (46-a); given that there is a feature cooccurrence restriction according to which the presence of [+NULL] implies the simultaneous presence of [SLASH], this provides a starting point of SLASH feature percolation, i.e., it initiates the movement dependency. The top of the dependency is accounted for basically as in (45-b), by assuming a general filler-gap rule schema as in (46-b) (where H stands for whatever is the head of S in a given context, with options including VP and S again). Most importantly, the middle of the dependency – i.e., local SLASH propagation through syntactic structures, ultimately connecting the base position with the displaced item – is handled by assuming that SLASH is not just a head feature (that is passed on along the projection line), but also a foot feature, which implies that it is shared between daughter and mother not only along the projection line of the head, but also between a non-head daughter and its mother. This is ensured by an independently motivated constraint, the so-called Foot Feature Principle.

(46) a. SLASH Termination Metarule: X → W, XP ⇒ X → W, XP[+NULL]
     b. S → XP, H/XP

On this view, a SLASH feature percolation analysis of (33) looks as in (47). As before, the assignment of category labels and assumptions about fine-grained aspects of clause structure are anachronistic, with orthogonal assumptions between analysis types minimized, so as to ensure maximal comparability; also note that “s” in (47) is a shorthand for “[ SLASH:[DP1 what ]]”.

(47) [ CP What1 [ C′:s do [ TP:s you [ T′:s T [ vP:s tyou [ v′:s v [ VP:s think [ CP:s that [ TP:s Mary [ T′:s T [ vP:s tMary [ v′:s v [ VP:s bought t1 ]]]]]]]]]]]]] ?

This extremely local SLASH feature-based modelling of non-local movement dependencies has also been adopted in Head-Driven Phrase Structure Grammar (HPSG) (see Pollard & Sag (1994), Sag & Wasow (1999)), with very few changes. For one thing, SLASH does not take a category as its feature value anymore, but rather a list of categories, so as to permit multiple extraction from a given category, as it is in fact required for examples like (47) anyway if one assumes that external argument DPs are base-generated in Specv and then moved to SpecT in English; see already Maling & Zaenen (1982), and Pollard & Sag

(1994).9 For another, there is a controversial discussion in HPSG as to whether traces can (or should) be dispensed with in the modelling of the bottom of a dependency; see Sag & Wasow (1999), Levine & Sag (2003a;b), and Müller (2007) for the two different options. Mechanisms very similar to Gazdar et al.’s (1985) SLASH feature percolation have also been developed in Principles-and-Parameters-based work; cf. in particular the related concepts of gap phrase and operator feature percolation in Koster (2000) and Neeleman & van de Koot (2010), respectively.10

9 This is a simplification, though. Technically speaking, SLASH in HPSG must take sets of local complements of categories as values. (Ultimately, this complication is due to the necessity to avoid what has become known as the ‘node vortex problem’ (with SLASHes inside SLASHes); see Pullum (1989).)

10 In his taxonomy of approaches to movement dependencies, McCloskey (1988) also groups path-based approaches as they have been developed by Kayne (1982), Pesetsky (1985), Koster (1987), and Longobardi (1985), together with SLASH-feature percolation based approaches, and considers them both as fundamentally distinct from approaches that envisage successive-cyclic movement. While there are indeed some similarities (most notably, SLASH-based and path-based approaches are both inherently representational rather than derivational, with ‘movement’ reduced to a metaphor), it would seem that these are mostly orthogonal to the issues currently under consideration. More important in the present context are the fundamental differences, which McCloskey (1988, 30) also notes: First, in path-based approaches to displacement, the movement path is “not formally marked in any way”. And second, “one inspects the geometry of the entire path between an empty position and its binder, to determine whether or not a given structure is well-formed.” From the present perspective, this means that path-based approaches qualify as instances of non-local approaches to movement, of roughly the same kind as the standard LFG approach developed by Dalrymple (2001) (see above). Consequently, they are also subject to the criticism raised above with respect to morphological (and other) reflexes of movement for the approach pursued in Dalrymple (2001).

Unlike non-local approaches, local approaches to movement dependencies are in principle well designed to capture morphological (and other) reflexes of displacement because they presuppose that the syntactic domain where the reflex shows up is materially affected – either by an actual intermediate movement operation, or by a trace (or both), or by a SLASH feature that can be used for special morphological realization (or be held responsible for other reflexes); see in particular Sag & Wasow (1999) and Bouma et al. (2001) (also cf. Assmann et al. (2010) for the more intricate patterns of Modern Irish; and see above). For standard, run-of-the-mill movement dependencies (i.e., ignoring complications like parasitic gaps and across-the-board extraction; see Chomsky (1982), Ross (1967)), current minimalist approaches that envisage movement to all intervening XP edges (producing structures like (39)) turn out to be very similar to SLASH feature percolation approaches (which produce structures like (47)).11 However, local approaches that envisage designated intermediate landing sites, like the COMP-to-COMP movement approach yielding structures like (35), or the classic phase-based approach yielding structures like (37), differ from these approaches in their empirical predictions for reflexes of displacement because, given independently motivated assumptions about the locality of certain syntactic or morphological operations, the relevant information may not be present. As for approaches that assume genuinely unbounded movement (see (34)), they either cannot handle reflexes of displacement easily in the first place, or they behave like extremely local minimalist and SLASH-based approaches in this respect (recall the role played by [LDD] features in Dalrymple (2001)). The two groups of approaches to movement emerging from this perspective have been labelled uniform movement path approaches and punctuated movement path approaches in Abels (2003; 2012); see (48).

(48) Uniform vs. punctuated movement paths

                                          uniform path   punctuated path
unbounded movement                             ±               –
COMP-to-COMP movement                          –               +
movement to designated phase edges             –               +
movement to all XP edges                       +               –
movement by SLASH feature percolation          +               –

11 In fact, it is hard to see how fundamental differences between these two approaches with respect to empirical predictions for reflexes of movement could arise: Even though SLASH is present on every projection of an XP on the movement path whereas intermediate traces only show up in specifiers of XP, the latter items are presumably still close enough to all relevant items in XP to trigger any kind of reflex (morphological or other) in the XP domain, in particular on the head X. See Lahne (2009); but also Lechner (2009) for a qualification.

With respect to morphological reflexes of movement, evidence was initially taken to support a particular version of the punctuated path approach, viz., COMP-to-COMP movement. However, many of the relevant phenomena seem to involve verbal markers (e.g., wh-agreement in Chamorro, meN deletion in Malay), which would then seem to minimally support the standard phase version of the punctuated path approach (with vP and CP as phases). In this context, it is also worth noting that the ‘complementizer alternation’ facts of Modern Irish (recall (29)) may also plausibly be reanalyzed as involving verbal particles (see Sells (1984), Noonan (2002), Lahne (2009)); so the reflex may perhaps in fact not occur on C, or in the CP domain (depending partly on the analysis of VSO order in Irish). The displacement reflex in Ewe (see (30)) involves subject pronouns and may therefore be indicative of a TP (rather than vP or CP) domain affected by movement. If these tentative conclusions can be substantiated and generalized, they might then support a uniform path approach; but at present the issues are far from being resolved.12

12 For instance, even if the pronoun alternation in Ewe involves SpecT, this might nonetheless be due to different properties of an adjacent C head affected by intermediate movement to SpecC playing a role in morphological realization/insertion.

Abels (2003) advances an argument for punctuated paths centering around a syntactic (rather than morphological) reflex of displacement. It is based on what

has sometimes been called “pit-stop reflexives”, a phenomenon that involves a potential feeding relation between movement and reflexivization (see Barss (1986), Epstein et al. (1998)). As shown in (49), reflexives that are not permitted as such because they lack a local antecedent (see (49-ac)) can extend their binding domains and find a new antecedent if they are part of a wh-phrase that moves to a higher clause (see (49-bd)). It suffices that the reflexive is locally bound at some intermediate point of the derivation (here, at the left edge of the bracketed CP); in the final representation, the reflexive does not have to be c-commanded by its antecedent anymore (this is an instance of opaque rule interaction, viz., counter-bleeding).13

(49) a. *Jane believes (that) John1 thinks (that) she likes some pictures of himself1
     b. Which pictures of himself1 does Jane believe (that) John1 thinks [ CP (that) she likes ]?
     c. *Mary told John1 that she liked these pictures of himself1
     d. Which pictures of himself1 did Mary tell John1 [ CP that she liked ]?

Thus, (49) shows that reflexivization must be possible in intermediate positions of movement paths. However, the examples in (49) cannot yet decide between a punctuated and a uniform approach – the intermediate position can plausibly be assumed to be SpecC, and both kinds of approaches can make the relevant information available in this position. Still, Abels (2003) argues that there is an argument for punctuated paths on the basis of raising constructions, as in (50-b), where wh-movement of the DP containing the reflexive takes place across an experiencer argument of the raising predicate seem that may in principle license a pit-stop reflexive; see (50-a).

(50) a. [ DP2 Which pictures of himself1 ] did it seem to John1 [ CP that Mary liked t1 ] ?
     b. *[ DP2 Which pictures of himself1 ] did Mary3 seem to John1 [ TP t3 to like t1 ] ?

Given the standard assumption that raising infinitives are TPs (not CPs), the argument goes as follows: Under a uniform paths approach, reflexivization should be possible via the intermediate position (SpecT, or TP[SLASH:DP]) in (50-b) (to does not block binding here; see (50-a), as well as Pesetsky (1995), Sternefeld (1997) and references cited there). Under a punctuated paths approach, reflexivization should be impossible in (50-b) if SpecT is not a landing site for successive-cyclic movement (e.g., if TP is not a phase). Since (50-b) is ungrammatical, this supports a punctuated paths approach.

13 The terminology here is derivational, but this is just for exposition. Barss (1986) develops a fully representational account of the relevant phenomena, in terms of chain-accessibility sequences.

Arguments of this general type are exactly what is needed to distinguish between different types of local modelling of non-local movement dependencies, but it is not clear that this particular argument is compelling. As noted by Boeckx & Grohmann (2007) and Boeckx (2008), sentences like (51) also lack the enrichment of binding options by movement to intermediate positions although the most deeply embedded clause is a CP, and movement to the edge position of this CP domain should suffice for creating the new binding option. This suggests that the correct generalization might be that an intervening experiencer blocks the enrichment of binding options, quite independently of the nature of the landing site involved.

(51) *Which pictures of himself1 did Mary2 seem to Jane3 [ TP t2 to have told John1 [ CP that she likes t ]] ?

Furthermore, assuming that reflexivization is not merely domain-based but also sensitive to intervention effects exerted by other potential antecedents, (50-b) may also be straightforwardly excluded in a uniform path approach according to which the embedded TP domain is directly affected by displacement, e.g., by intermediate movement to SpecT. Here is why: In a uniform paths approach, the raised subject Mary also has to move through all intervening XP domains, just like DP2 (which pictures of himself) does. Since the eventual landing site of Mary is higher than that of the matrix experiencer to John, there is no step of the derivation where DP2 is in the vicinity of John (so that the reflexive in DP2 can pick up John as its local antecedent) without Mary also being in the same minimal domain. Thus, Mary may never cease to be an intervener for a reflexivization dependency between John and himself in DP2, which will then account for the ill-formedness of (50-b).
To conclude, it is unclear whether English pit-stop reflexives can be taken to argue for punctuated paths versus uniform paths; nevertheless, this type of argument strikes us as fairly important in order to determine exactly how local a local modelling of non-local movement dependencies should be taken to be.

2.3.2. Local modelling: other dependencies

Two kinds of approaches can be distinguished with respect to the options of local modelling of non-local dependencies other than movement – i.e., reflexivization, case assignment, agreement, control, switch reference, consecutio temporum, etc. On the one hand, the idea has been pursued that such non-local dependencies can in fact be treated as instances of movement (albeit abstract instances, in many cases).14 On the other hand, attempts have been made to directly model such non-local dependencies in a local way that is independent from movement. (Needless to say, the latter kind of approach may sometimes resemble the former one, and the boundaries may be blurry in individual cases.)

14 This is not to be confused with spurious non-locality approaches to some given dependency where it is postulated that the dependency is parasitic on an independently existing (although also often abstract) movement operation, as in the approach to LDA developed by Polinsky & Potsdam (2001) and others. In the approaches to be considered momentarily, the non-local dependency is not fed by a movement dependency; it either is, or is an intrinsic part of, a movement dependency.

2.3.2.1. Local modelling: reflexivization

There is a long tradition in Principles-and-Parameters-based work to treat long-distance reflexivization as an instance of abstract (covert, LF) movement. As noted by Büring (2005), these analyses come in two varieties: First, the displacement operation is often viewed as an instance of (successive-cyclic) head movement (see Pica (1987), Cole et al. (1990), Cole & Sung (1994), among many others); second, it may be considered an instance of phrasal movement (see, e.g., Huang & Tang (1992)). A well-known problem with the head movement approach is that head movement seems to be strictly local otherwise (see Travis (1984), Baker (1988)). A potential problem with the phrasal movement approach is that where reflexive pronouns must move overtly in the languages of the world, they typically do so via head movement (or cliticization).

In contrast to movement analyses, Kiss (2004) introduces an HPSG-style analysis that treats non-local reflexivization dependencies similarly to SLASH feature percolation approaches to movement dependencies (see also Kiss (this volume)). Once a reflexive dependency calling for resolution is introduced into a syntactic structure (analogously to rules like (45-a) and (46-a) that introduce traces), the relevant information is passed on as a feature (D(1)) (analogously to SLASH feature percolation as a foot or head feature), and projected from daughter to mother; and the dependency is ultimately resolved once an antecedent with the same index is locally found (analogously to filler-gap rules like (45-b) and (46-b)).

The minimalist approach developed by Fischer (2004; 2006) is a hybrid one, combining aspects of movement and feature percolation. The basic premise is that reflexivization involves an Agree operation involving antecedent and reflexive pronoun (also see Reuland (2001), Heinat (2006), Schäfer (2008; this volume)). However, Agree, by assumption, is only possible in extremely local domains because every XP qualifies as a phase. To make Agree (and thereby, reflexivization) possible, an abstract pronominal feature matrix generated in an argument position is moved locally, from phrase to phrase, until an appropriate antecedent is found. An interesting aspect of this proposal is that the more the pronominal matrix is frustrated by intermediate movement steps that do not yet

Local Modelling of Non-Local Dependencies

35

find an antecedent, the more likely it is that reflexive features of the matrix are deleted, which will then lead to a non-reflexive (i.e., purely pronominal) realization of the pronoun. Thus, as in Polinsky & Potsdam’s (2001) analysis of LDA, movement precedes and enables agreement in Fischer’s approach to reflexivization. However, one cannot say that reflexivization is parasitic on movement in this approach because the movement of the pronominal feature matrix is an intrinsic part of reflexivization, together with the final Agree operation (see footnote 14); the movement operation is not assumed to be independently motivated. To conclude, in both Fischer’s (2006) and Kiss’s (2004) analyses, reflexivization may involve the passing on of relevant binding information in syntactic trees. A local modelling of (potentially) non-local anaphoric dependencies is involved. 2.3.2.2. Local modelling: case assignment Phenomena involving long-distance ECM do not seem to have successfully been tackled on the basis of strictly local approaches. Phenomena involving global case marking (cf. (13)) have been locally modelled in the minimalist program in ˇ acˇ (2009), Keine (2010), and Georgi (2009) (also see Georgi (this B´ejar & Rez´ volume)). Recall that the problem here is that case assignment of some verb to a DP depends not only on the φ - and definiteness-related properties of the DP itself (as in standard, local, cases of differential argument encoding), but also on the properties of another (typically co-argument) DP. In a local approach to case assignment, a classic dilemma will arise. First, there is the issue of look-ahead: The case of an internal argument may depend on properties of the external argument. However, given basic minimalist assumptions about structure-building, the external argument is not yet part of the structure when case needs to be assigned to the internal argument. 
If there is no look-ahead, case assignment to the internal argument therefore cannot take place before the external argument is merged. Second, there is the issue of backtracking: According to the Strict Cycle Condition (Chomsky (1973)), which is a fundamental principle of virtually all derivational approaches to syntax, an operation cannot solely affect a proper substructure of the currently existing syntactic structure. Therefore, case assignment to the internal argument also cannot take place after the external argument has been merged.

The main idea underlying the analysis in Béjar & Řezáč (2009) (also cf. Anagnostopoulou (2005)) is to postulate that v has to carry out Agree with both an internal and an external argument but may not be sufficiently specified with person features for both arguments; an atypical (first or second person) internal argument may require a special feature P on v which is responsible for a special case assignment to the internal argument. The analysis in Béjar & Řezáč (2009) manages to avoid both problems, but only for a subset of the relevant phenomena; in a nutshell, the problem is that variation in the properties of the external argument cannot systematically be accounted for. In contrast, Keine’s (2010) analysis, although basically local, turns out to exhibit remnants of non-locality on closer inspection. Here, v (or T) first carries out agreement with both arguments; then, impoverishment applies depending on the properties of both arguments; and finally, case assignment takes place, with differential argument encoding emerging as a side effect of impoverishment (see Keine & Müller (2008)). In this approach, the ultimate case assignment operation is local. However, it presupposes an earlier step where the φ-properties of both arguments are recorded on the agreeing head (v or T), and this would seem to qualify as a clear non-local residue. Finally, Georgi (2009) addresses global case marking in a local way by assuming that whether or not case is assigned to an internal argument by v (which may depend on φ-features of the object, and may also be optional in some languages) can determine what kinds of external arguments v can take.

2.3.2.3. Local modelling: agreement

As for instances of LDA (see (15)), next to various kinds of spurious non-locality analyses (see section 2.1.3) and genuine non-local modellings (see section 2.2.1), it has sometimes been argued that they should be modelled in a strictly local way, by what has become known as cyclic Agree. An early (non-minimalist) approach of this type is developed by Butt (1995). More recent minimalist approaches include Legate (2005), Keine (2008), Lahne (2008), and Preminger (2009) (Preminger additionally makes use of the movement strategy discussed above as an instance of a spurious non-locality approach).
Taking Legate (2005) as a representative example of a local modelling of LDA, it is interesting to note that at no stage of the derivation is there an Agree relation between the matrix verb and the embedded DP in this kind of approach. Rather, the DP’s φ-features first value a [uφ] probe feature of a phase head, which by definition (cf. the PIC in (38)) is also part of the higher phase. The matrix verb then probes the embedded phase head’s φ-features. Thus, the embedded phase head acts as a hinge between the matrix and embedded domains. Such an approach straightforwardly accounts for the observation that LDA presupposes the existence of local agreement in the embedded clause. However, it is not entirely unproblematic from a theoretical point of view, given standard minimalist assumptions about probe features, goal features, and the Agree operation: It looks as though one and the same set of φ-features (on the phase head in the middle) must act as a probe in one case, and as a goal in another. It might also be worth noting that an alternative local analysis that mimics SLASH feature percolation for movement dependencies might in principle be an option; but to the best of our knowledge, such an analysis has not yet been proposed. See Richards (this volume) for extensive discussion.

2.3.2.4. Local modelling: control and switch reference

Similarly, local approaches to other non-local dependencies can be found in the literature. As far as control is concerned, it has been argued that control is but an instance of movement (see Hornstein (2001) and Boeckx & Hornstein (2006), among many others). On this view, to the extent that movement can be treated in a strictly local way, so can control, and there is virtually nothing more to say. With respect to switch reference (see (21)), unlike the majority of work on these phenomena, Camacho (2010) does not employ a binding-theoretic approach. Rather, he suggests that agreement is involved. He reanalyzes the apparent non-locality of switch reference marking in terms of local cyclic agreement operations between the subjects of the two clauses on the one hand, and the case and φ-features on the C head of the clause in which the switch reference marker shows up, on the other hand. A same subject marker then indicates the presence of the case and φ-features, and a different subject marker signals the absence of these features. Another agreement-based approach to switch reference systems (based on tense agreement) is developed in Assmann (2012). In contrast, Georgi (2012) proposes that switch reference marking is an instance of (successive-cyclic) movement, and presents an analysis that treats the phenomenon on a par with the control-as-movement approaches just mentioned.
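The cyclic Agree idea from section 2.3.2.3, on which agreement-based accounts of switch reference also build, can be illustrated with a small sketch. It is a toy model under our own assumptions, not an implementation of Legate's (2005) or Camacho's (2010) analysis: the names `Item`, `Phase`, `agree` and `cyclic_agree` are hypothetical, φ-features are simplified to a dictionary, and the PIC is reduced to the statement that only the phase head is visible from outside the phase.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional

Phi = Dict[str, str]  # simplified phi-feature bundle, e.g. {"person": "3"}


@dataclass
class Item:
    """A syntactic item; phi is None while its phi-features are unvalued."""
    label: str
    phi: Optional[Phi] = None


@dataclass
class Phase:
    """A phase: its head stays visible from outside, its complement does not."""
    head: Item
    complement: List[Item] = field(default_factory=list)

    def visible_from_outside(self) -> List[Item]:
        # Simplified PIC: only the phase head is accessible to higher probes.
        return [self.head]


def agree(probe: Item, goal: Item) -> bool:
    """Copy valued phi-features from goal to probe; fail if goal is unvalued."""
    if goal.phi is None:
        return False
    probe.phi = dict(goal.phi)
    return True


def cyclic_agree(matrix_verb: Item, phase: Phase,
                 local_agreement: bool = True) -> bool:
    """Two strictly local steps; the verb never inspects the embedded DP."""
    if local_agreement:
        # Step 1: inside the phase, the phase head acts as a probe.
        for goal in phase.complement:
            if agree(phase.head, goal):
                break
    # Step 2: the same phase head now acts as a goal for the matrix verb.
    return any(agree(matrix_verb, goal)
               for goal in phase.visible_from_outside())
```

With an embedded DP bearing third person plural features, calling `cyclic_agree` values the matrix verb with those features although the verb only ever accesses the phase head. With `local_agreement=False` the call fails, which mirrors the observation that LDA presupposes local agreement in the embedded clause. The sketch also makes the theoretical worry visible: the phase head is a probe in step 1 and a goal in step 2.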

3. Issues

Given that it does not seem likely that approaches in terms of spurious non-locality will plausibly be extendable to capture all relevant kinds of non-local dependencies, and given that genuinely non-local approaches to non-local phenomena in syntax face certain conceptual and empirical problems, it seems unavoidable to postulate that at least some instances of non-local dependencies will have to be addressed by local modelling. Assuming this to be the case, a number of central questions arise concerning the scope of local modelling of non-local dependencies in syntax. First, given that a uniform theory of syntactic dependencies may be viewed as a desideratum, and given that some dependencies are to be viewed as strictly local, could it be that there are no non-local dependencies in syntax at all, and all dependencies are modelled strictly locally? If this question is answered in the affirmative, several further questions need to be addressed. An obvious next question then is whether all the different types of non-local dependencies are to be captured in essentially the same way (e.g., by postulating local movement – possibly of abstract items – or local feature percolation throughout). Third, it is obvious that there tend to be asymmetries between different kinds of (non-local) dependencies (e.g., displacement may often be non-local to a higher degree than reflexivization; different types of displacement may be non-local to a different degree from other types of displacement; and so on). How can such asymmetries be accounted for (both under an approach that treats all non-local dependencies in exactly the same local manner, and under an approach that treats them in different ways, albeit locally throughout)? Fourth, how can asymmetries between different languages with respect to the same kinds of (basically non-local) dependencies be accounted for? Fifth, what size should the syntactic domains be taken to have that provide the space for local suboperations (which in turn are combined to yield non-local dependencies)? Should they be as small as possible (such that even dependencies that may not look non-local from a pre-theoretic point of view then emerge as non-local; see, e.g., Richards (this volume) on agreement; and also cf. Chomsky’s (2007) remark that “phases should be as small as possible, to maximize the effects of [...] computational efficiency” (p. 17)); should they be as large as possible; or should the size be taken to vary, perhaps arbitrarily so? In addition to these considerations, it is worth noting that different syntactic theories favour (or, indeed, require) local approaches to non-local dependencies to different degrees. Interestingly, this issue is independent of other, fundamental differences between syntactic theories, spanning, e.g., the generative-derivational/declarative-representational dichotomy. Thus, local modelling of non-local dependencies is an intrinsic feature of both GPSG and HPSG; e.g., none of the theoretical building blocks in Gazdar et al.
(1985) involve non-locality (this holds for immediate dominance rules; principles regulating the distribution of syntactic features, like the Foot Feature Principle and the Control Agreement Principle; feature specification defaults; feature co-occurrence restrictions; linear precedence statements; and, last but not least, metarules, notwithstanding the computational complexity they have been shown to introduce in Uszkoreit & Peters (1986)).¹⁵ Similarly, in categorial grammar (see Moortgat (1988), Steedman (2001), Jäger (2005) for some versions) all syntactic restrictions are captured by (i) the complex properties of linguistic expressions, and (ii) a fairly small set of rules for combinations of the linguistic expressions, with no possibility to refer to widely separated linguistic expressions so as to model non-local dependencies in a non-local way. Finally, in more recent versions of the Principles-and-Parameters approach that have been developed within the minimalist program, the syntactic phase is a central concept that effectively forces local modellings of non-local dependencies. This becomes even more obvious if one assumes that the PIC (see (38)) does not have to be stipulated as such, but is in fact derivable from assumptions about cyclic spell-out (see Uriagereka (1999) for the original idea, which however differs substantially from the form it takes in Chomsky’s more recent work): On this view, once a phase is completed, the complement of the phase head is sent off (non-metaphorically) to the PF and LF interfaces, and material included in these spelled-out domains is simply not accessible anymore to subsequent syntactic operations (and that means, in higher parts of the syntactic structure). So, under this conception of phases, the only way to model a dependency correlating some item α in the complement domain of a phase and some other item β higher in the structure is to locally pass on the relevant information associated with α via phase edges, until it becomes a phase-mate with β. Thus, there is some convergence among several more recent syntactic theories (GPSG/HPSG, categorial grammar, minimalist program) to the effect that non-local dependencies are to be modelled locally.¹⁶ And indeed, we would like to contend that closer inspection often reveals that local analyses of non-local phenomena developed in different kinds of syntactic theories can be shown to not only share similar research questions, but also, to a large extent, similar research strategies (among them most prominently those that center around the issues concerning the scope of local modelling mentioned above). This, we believe, holds some promise for the further development of syntactic theory as a collaborative enterprise in the next couple of years, irrespective of (and, hopefully, largely orthogonal to) other differences pertaining to conceptual issues that separate the frameworks – such as (i) the question of whether an abstract or a surface-oriented approach should be pursued, (ii) questions related to the nature/nurture debate, (iii) questions concerning the degree of formalization required for theory construction, and (iv) the issue of how minimalist syntactic theory should be taken to be (and how big the role of ‘third-factor’ explanations should be assumed to be; see Chomsky (2005)).

The contributions to the present volume advance and discuss various kinds of local analyses of non-local dependencies in syntax from different theoretical points of view (minimalist program, HPSG, categorial grammar, and related – sometimes hybrid – approaches).¹⁷ Empirically, the focus is on those phenomena that have featured prominently in the present introduction. First, non-local agreement is tackled in the articles by Fabian Heck & Juan Cuartero; Artemis Alexiadou, Elena Anagnostopoulou, Gianina Iordăchioaia & Mihaela Marchis; Petr Biskup; and Marc Richards. Next, non-local reflexivization and binding are addressed in the papers by Tibor Kiss; Joachim Sabel; Daniel Hole; and Udo Klein, with the former two focussing on reflexivization and the latter two focussing on (semantic) binding. Third, the papers by Florian Schäfer and Doreen Georgi are concerned with non-local case assignment. After this third block, there are two papers on other, less widely addressed types of non-local dependencies: Non-local scope of negation is tackled in Hans-Martin Gärtner’s article, and non-local (cyclic) deletion is at the core of Masaya Yoshida & Ángel Gallego’s contribution. Finally, the remaining six papers are all about what is arguably the core instance of non-local dependencies in syntax: movement. They are (in that order) by Chiyo Nishida; Christina Unger; Klaus Abels & Kristine Bentzen; Chris Worth; Gregory Kobele; and Dalina Kallulli. In addition, several of the papers are not confined to a single non-local dependency but also address other dependencies; see, e.g., Schäfer on reflexivization, Worth on agreement, Abels & Bentzen on reflexivization and (semantic) binding, Kobele on binding, Biskup on movement, and Alexiadou, Anagnostopoulou, Iordăchioaia & Marchis on movement and control. This is just what one would expect, given that capturing similarities and differences among the various types of non-local dependencies forms an important part of current research in this area.

15 McCloskey (1988, 28, fn. 13) notes that “in early unpublished work, Gerald Gazdar discusses the possibility of using rules of the form A → [B [C D E ]] while remaining within the context-free languages as far as weak generative capacity is concerned.” This way, a (non-local) dependency involving A and D could be modelled non-locally in (early) GPSG. However, McCloskey then goes on to say that “such rules [...] have played no role in analytic practise as GPSG developed.”

16 This can be contrasted with other syntactic theories that permit (and, in many cases, systematically envisage) a non-local modelling of non-local dependencies (e.g., LFG, earlier versions of the Principles-and-Parameters paradigm like Government-Binding theory (see Chomsky (1981; 1982; 1986b)), and most versions of Optimality Theory (see, e.g., Grimshaw (1997), Legendre et al. (1998), Legendre et al. (2001), and Samek-Lodovici (2006) – but also cf. Heck & Müller (2003; 2007) for a strictly local version of optimality-theoretic syntax)).

17 Many of the articles collected here ultimately go back to a workshop at the DGfS (German Linguistics Society) meeting at Bamberg University in 2008.

Bibliography

Abels, Klaus (2003): Successive Cyclicity, Anti-Locality, and Adposition Stranding. PhD thesis, University of Connecticut, Storrs, Connecticut.
Abels, Klaus (2012): Phases. Linguistische Arbeiten, de Gruyter, Tübingen.
Aelbrecht, Lobke (2010): The Syntactic Licensing of Ellipsis. Benjamins, Amsterdam.
Agbayani, Brian (1998): Feature Attraction and Category Movement. PhD thesis, UC Irvine.
Aissen, Judith (1999): ‘Markedness and Subject Choice in Optimality Theory’, Natural Language and Linguistic Theory 17, 673–711.
Aissen, Judith (2003): ‘Differential Object Marking: Iconicity vs. Economy’, Natural Language and Linguistic Theory 21, 435–483.
Aissen, Judith & David Perlmutter (1983): Clause Reduction in Spanish. In: D. Perlmutter, ed., Studies in Relational Grammar 1. University of Chicago Press, Chicago, pp. 360–403.
Alboiu, Gabriela & Virginia Hill (2011): The Case of A-bar ECM: Evidence from Romanian. In: Proceedings of NELS. Vol. 42, Toronto University.
Alexiadou, Artemis & Elena Anagnostopoulou (1999): Raising Without Infinitives and the Nature of Agreement. In: S. Bird, A. Carnie, J. Haugen & C. Norquest, eds, Proceedings of the 18th WCCFL. Cascadilla Press, pp. 14–26.
Alexiadou, Artemis & Elena Anagnostopoulou (2002): Raising Without Infinitives and the Nature of Agreement. In: Dimensions of Movement: From Remnants to Features. Benjamins, Amsterdam, pp. 17–30.
Alexiadou, Artemis, Liliane Haegeman & Melita Stavrou (2007): Noun Phrase in the Generative Perspective. De Gruyter, Berlin, New York.


Anagnostopoulou, Elena (2005): Strong and Weak Person Restrictions. A Feature Checking Analysis. In: L. Heggie & F. Ordoñez, eds, Clitics and Affix Combinations. Benjamins, Amsterdam, pp. 199–235.
Anderson, Stephen (1983): ‘Types of Dependency in Anaphors’, Journal of Linguistic Research 2, 1–23.
Assmann, Anke (2012): Switch-Reference as Inter-Clausal Tense Agreement: Evidence from Quechua. In: P. Weisser, ed., Papers on Switch-Reference. Vol. 89 of Linguistische Arbeits Berichte, Institut für Linguistik, Universität Leipzig, pp. 41–81.
Assmann, Anke, Fabian Heck, Johannes Hein, Stefan Keine & Gereon Müller (2010): Does Chain Hybridization in Irish Support Movement-Based Approaches to Long-Distance Dependencies? In: S. Müller, ed., The Proceedings of the 17th International Conference on Head-Driven Phrase Structure Grammar. CSLI Publications, Stanford, pp. 27–46.
Asudeh, Ash (2004): Resumption as Resource Management. PhD thesis, Stanford University.
Bailyn, John Frederick (2001): ‘On Scrambling: A Reply to Bošković and Takahashi’, Linguistic Inquiry 32, 635–658.
Baker, Mark (1988): Incorporation. A Theory of Grammatical Function Changing. University of Chicago Press, Chicago.
Baker, Mark (1996): The Polysynthesis Parameter. Oxford University Press, New York and Oxford.
Baković, Eric (1998): Optimality and Inversion in Spanish. In: P. Barbosa, D. Fox, P. Hagstrom, M. McGinnis & D. Pesetsky, eds, Is the Best Good Enough?. MIT Press and MITWPL, Cambridge, Mass., pp. 35–58.
Barbiers, Sjef (2002): Remnant Stranding and the Theory of Movement. In: Dimensions of Movement. Benjamins, Amsterdam, pp. 47–67.
Barrett, Syd (1967): Chapter 24. Ms., Cambridge.
Barss, Andrew (1986): Chains and Anaphoric Dependence. PhD thesis, MIT, Cambridge, Mass.
Béjar, Susana & Milan Řezáč (2009): ‘Cyclic Agree’, Linguistic Inquiry 40, 35–73.
Bhatt, Rajesh (2005): ‘Long Distance Agreement in Hindi-Urdu’, Natural Language and Linguistic Theory 23, 757–807.
Bobaljik, Jonathan & Susanne Wurmbrand (2003): Relativized Phases. Ms., University of Connecticut, Storrs.
Bobaljik, Jonathan & Susanne Wurmbrand (2005): ‘The Domain of Agreement’, Natural Language and Linguistic Theory 23, 809–865.
Boeckx, Cedric (2003): Islands and Chains. Resumption as Stranding. Benjamins, Amsterdam.
Boeckx, Cedric (2004): ‘Long-distance Agreement in Hindi: Some Theoretical Implications’, Studia Linguistica 58, 23–36.
Boeckx, Cedric (2008): Understanding Minimalist Syntax. Blackwell, Oxford.
Boeckx, Cedric & Kleanthes K. Grohmann (2007): ‘Putting Phases in Perspective’, Syntax 10, 204–222.
Boeckx, Cedric & Norbert Hornstein (2006): ‘The Virtues of Control as Movement’, Syntax 9, 118–130.
Bouma, Gosse, Robert Malouf & Ivan Sag (2001): ‘Satisfying Constraints on Extraction and Adjunction’, Natural Language and Linguistic Theory 19, 1–65.
Bošković, Željko (2002): ‘A-Movement and the EPP’, Syntax 5, 167–218.
Bošković, Željko (2007): ‘Agree, Phases, and Intervention Effects’, Linguistic Analysis 33, 54–96.
Bošković, Željko (2012): Now I’m a Phase, Now I’m Not a Phase. On the Variability of Phases with Extraction and Ellipsis. Ms., University of Connecticut, Storrs.
Bresnan, Joan (1976a): ‘Evidence for a Theory of Unbounded Transformations’, Linguistic Analysis 2, 353–399.
Bresnan, Joan (1976b): ‘On the Form and Functioning of Transformations’, Linguistic Inquiry 7, 3–40.
Bresnan, Joan (1982): ‘Control and Complementation’, Linguistic Inquiry 13, 343–434.
Broadwell, George Aaron (1997): Binding Theory and Switch-Reference. In: H. Bennis, P. Pica & J. Rooryck, eds, Atomism and Binding. Foris, Dordrecht, pp. 31–49.


Brosziewski, Ulf (2003): Syntactic Derivations. A Nontransformational View. Number 470 in ‘Linguistische Arbeiten’, Niemeyer Verlag, Tübingen.
Bruening, Benjamin (2001): Syntax at the Edge: Cross-Clausal Phenomena and the Syntax of Passamaquoddy. PhD thesis, MIT, Cambridge, Mass.
Büring, Daniel (2005): Binding Theory. Cambridge University Press, Cambridge.
Butt, Miriam (1995): The Structure of Complex Predicates in Urdu. CSLI Publications, Stanford, California.
Camacho, José (2010): ‘On Case Concord: The Syntax of Switch-Reference Clauses’, Natural Language and Linguistic Theory 28, 239–274.
Chandra, Pritha (2005): Hindi-Urdu Long Distance Agreement: Agree, AGREE or Spec-Head? Ms., University of Maryland, College Park.
Cheng, Lisa (2000): Moving Just the Feature. In: U. Lutz, G. Müller & A. von Stechow, eds, Wh-Scope Marking. Benjamins, Amsterdam, pp. 77–99.
Chomsky, Noam (1965): Aspects of the Theory of Syntax. MIT Press, Cambridge, MA.
Chomsky, Noam (1972): Some Empirical Issues in the Theory of Transformational Grammar. In: S. Peters, ed., Goals of Linguistic Theory. Prentice-Hall, Englewood Cliffs.
Chomsky, Noam (1973): Conditions on Transformations. In: S. Anderson & P. Kiparsky, eds, A Festschrift for Morris Halle. Academic Press, New York, pp. 232–286.
Chomsky, Noam (1975): The Logical Structure of Linguistic Theory. Plenum Press, New York.
Chomsky, Noam (1977): On Wh-Movement. In: P. Culicover, T. Wasow & A. Akmajian, eds, Formal Syntax. Academic Press, New York, pp. 71–132.
Chomsky, Noam (1981): Lectures on Government and Binding. Foris, Dordrecht.
Chomsky, Noam (1982): Some Concepts and Consequences of the Theory of Government and Binding. MIT Press, Cambridge, Mass.
Chomsky, Noam (1986a): Barriers. MIT Press, Cambridge, Mass.
Chomsky, Noam (1986b): Knowledge of Language. Praeger, New York.
Chomsky, Noam (1995): The Minimalist Program. MIT Press, Cambridge, Mass.
Chomsky, Noam (2000): Minimalist Inquiries: The Framework. In: R. Martin, D. Michaels & J. Uriagereka, eds, Step by Step. MIT Press, Cambridge, Mass., pp. 89–155.
Chomsky, Noam (2001): Derivation by Phase. In: M. Kenstowicz, ed., Ken Hale. A Life in Language. MIT Press, Cambridge, Mass., pp. 1–52.
Chomsky, Noam (2005): ‘Three Factors in Language Design’, Linguistic Inquiry 36, 1–22.
Chomsky, Noam (2007): Approaching UG from Below. In: U. Sauerland & H.-M. Gärtner, eds, Interfaces + Recursion = Language?. Mouton de Gruyter, Berlin, pp. 1–31.
Chomsky, Noam (2008): On Phases. In: R. Freidin, C. Otero & M. L. Zubizarreta, eds, Foundational Issues in Linguistic Theory. MIT Press, Cambridge, Mass., pp. 133–166.
Chumakina, Marina & Greville Corbett (2008): ‘Archi: The Challenge of an Extreme Agreement System’, Jazyki Slavjanskix Kul’tur.
Chung, Sandra (1976): On the Subject of Two Passives in Indonesian. In: C. Li, ed., Subject and Topic. Academic Press, New York, pp. 57–98.
Chung, Sandra (1994): ‘Wh-Agreement and ‘Referentiality’ in Chamorro’, Linguistic Inquiry 25, 1–44.
Chung, Sandra (1998): The Design of Agreement: Evidence from Chamorro. Chicago University Press, Chicago.
Cinque, Guglielmo (1980): ‘On Extraction from NP in Italian’, Journal of Italian Linguistics 1/2, 47–99.
Clements, George, James McCloskey, Joan Maling & Annie Zaenen (1983): ‘String-Vacuous Rule Application’, Linguistic Inquiry 14, 1–17.
Cole, Peter (1982): ‘Subjacency and Successive Cyclicity: Evidence from Ancash Quechua’, Journal of Linguistic Research 2, 35–58.
Cole, Peter & Gabriella Hermon (2000): Partial Wh-Movement: Evidence from Malay. In: U. Lutz, G. Müller & A. von Stechow, eds, Wh-Scope Marking. Benjamins, Amsterdam, pp. 101–130.


Cole, Peter & Li-May Sung (1994): ‘Head Movement and Long Distance Reflexives’, Linguistic Inquiry 25, 355–406.
Cole, Peter, Gabriella Hermon & Li-May Sung (1990): ‘Principles and Parameters of Long-Distance Reflexives’, Linguistic Inquiry 21, 1–22.
Collins, Chris (1993): Topics in Ewe Syntax. PhD thesis, MIT, Cambridge, Mass.
Collins, Chris (1994): ‘Economy of Derivation and the Generalized Proper Binding Condition’, Linguistic Inquiry 25, 45–61.
Dalrymple, Mary (2001): Lexical Functional Grammar. Vol. 34 of Syntax and Semantics, Academic Press, San Diego.
Dayal, Veneeta (1994): ‘Scope Marking as Indirect Wh-Dependency’, Natural Language Semantics 2, 137–170.
de Hoop, Helen & Andrej Malchukov (2008): ‘Case-Marking Strategies’, Linguistic Inquiry 39, 565–587.
den Dikken, Marcel (2007): ‘Phase Extension: Contours of a Theory of the Role of Head Movement in Phrasal Extraction’, Theoretical Linguistics 33, 1–41.
Epée, Roger (1976): ‘On Some Rules that are Not Successive-Cyclic in Duala’, Linguistic Inquiry 7, 193–198.
Epstein, Sam, Erich Groat, Ruriko Kawashima & Hisatsugu Kitahara (1998): A Derivational Approach to Syntactic Relations. Oxford University Press, Oxford and New York.
Fanselow, Gisbert (2001): Optimal Exceptions. In: B. Stiebels & D. Wunderlich, eds, The Lexicon in Focus. Akademie Verlag, Berlin, pp. 173–209.
Fanselow, Gisbert & Anoop Mahajan (2000): Towards a Minimalist Theory of Wh-Expletives, Wh-Copying, and Successive Cyclicity. In: U. Lutz, G. Müller & A. von Stechow, eds, Wh-Scope Marking. Benjamins, Amsterdam, pp. 195–230.
Fanselow, Gisbert & Damir Ćavar (2001): Remarks on the Economy of Pronunciation. In: G. Müller & W. Sternefeld, eds, Competition in Syntax. Mouton de Gruyter, Berlin, pp. 107–150.
Finer, Daniel (1985): ‘The Syntax of Switch-Reference’, Linguistic Inquiry 16, 35–55.
Fischer, Silke (2004): Towards an Optimal Theory of Reflexivization. PhD thesis, Universität Tübingen.
Fischer, Silke (2006): ‘Matrix Unloaded: Binding in a Local Derivational Approach’, Linguistics 44, 913–935.
Fox, Danny (2000): Economy and Semantic Interpretation. MIT Press, Cambridge, Mass.
Frank, Robert (2002): Phrase Structure Composition and Syntactic Dependencies. MIT Press, Cambridge, Mass.
Gallego, Ángel (2007): Phase Theory and Parametric Variation. PhD thesis, Universitat Autónoma de Barcelona, Barcelona.
Gallego, Ángel & Juan Uriagereka (2006): Sub-Extraction from Subjects. Ms., Universitat Autónoma de Barcelona.
Gärtner, Hans-Martin & Jens Michaelis (2007): Approaching UG from Below. In: U. Sauerland & H.-M. Gärtner, eds, Interfaces + Recursion = Language?. Mouton de Gruyter, Berlin, pp. 161–195.
Gazdar, Gerald (1981): ‘Unbounded Dependencies and Coordinate Structure’, Linguistic Inquiry 12, 155–184.
Gazdar, Gerald (1982): Phrase Structure Grammar. In: The Nature of Syntactic Representation. Reidel, Dordrecht, pp. 131–186.
Gazdar, Gerald, Ewan Klein, Geoffrey Pullum & Ivan Sag (1985): Generalized Phrase Structure Grammar. Blackwell, Oxford.
Georgi, Doreen (2009): Local Modelling of Global Case Splits. Master’s thesis, Universität Leipzig.
Georgi, Doreen (2012): Switch-Reference by Movement. In: P. Weisser, ed., Papers on Switch-Reference. Vol. 89 of Linguistische Arbeits Berichte, Institut für Linguistik, Universität Leipzig, pp. 1–40.
Graf, Thomas (2009): Some Interdefinability Results for Syntactic Constraint Classes. In: C. Ebert, G. Jäger & J. Michaelis, eds, The Mathematics of Language. Springer, Heidelberg, pp. 72–87.


Grewendorf, Günther (1988): Aspekte der deutschen Syntax. Narr.
Grewendorf, Günther & Joachim Sabel (1999): ‘Scrambling in German and Japanese’, Natural Language and Linguistic Theory 17, 1–65.
Grimshaw, Jane (1997): ‘Projection, Heads, and Optimality’, Linguistic Inquiry 28, 373–422.
Grohmann, Kleanthes K. (2000): Prolific Peripheries: A Radical View From the Left. PhD thesis, University of Maryland.
Grosz, Patrick & Pritty Patel (2006): Long Distance Agreement and Restructuring Predicates in Kutchi Gujarati. Ms., MIT, Cambridge, Mass.
Haider, Hubert (1993): Deutsche Syntax – generativ. Narr, Tübingen.
Haider, Hubert (2010): The Syntax of German. Cambridge University Press, Cambridge.
Heck, Fabian & Gereon Müller (2003): ‘Derivational Optimization of Wh-Movement’, Linguistic Analysis 33, 97–148. (Volume appeared 2007).
Heck, Fabian & Gereon Müller (2007): Extremely Local Optimization. Proceedings of WECOL 2006. California State University, Fresno.
Heck, Fabian & Gereon Müller (2010): Lokale Modellierung nicht-lokaler Abhängigkeiten in der Syntax. Ms., Universität Leipzig.
Heck, Fabian & Malte Zimmermann (2004): DPs as Phases. Ms., Universität Leipzig and HU Berlin.
Heim, Irene & Angelika Kratzer (1998): Semantics in Generative Grammar. Blackwell, Oxford.
Heinat, Fredrik (2006): Probes, Pronouns, and Binding in the Minimalist Program. PhD thesis, Lund University.
Henry, Alison (1995): Belfast English and Standard English. Oxford University Press, Oxford.
Hornstein, Norbert (2001): Move. A Minimalist Theory of Construal. Blackwell, Oxford.
Huang, Cheng-Teh James (1982): Logical Relations in Chinese and the Theory of Grammar. PhD thesis, MIT, Cambridge, Mass.
Huang, Cheng-Teh James & C.-C. Jane Tang (1992): The Local Nature of Long-Distance Reflexives in Chinese. In: J. Koster & E. Reuland, eds, Long-Distance Anaphora. Cambridge University Press, Cambridge, pp. 263–282.
Huybregts, Riny (2009): The Minimalist Program: Not a Bad Idea. Ms., Universiteit Utrecht.
Jacobson, Pauline (1999): ‘Towards a Variable-Free Semantics’, Linguistics and Philosophy 22, 117–185.
Jäger, Gerhard (2005): Anaphora and Type Logical Grammar. Springer, Heidelberg.
Kang, Jung-Goo (1996): Consecutio Temporum, Aspekt und Transparente LF. PhD thesis, Universität Tübingen.
Kayne, Richard (1982): Connectedness and Binary Branching. Foris, Dordrecht.
Kayne, Richard (1998): ‘Overt vs. Covert Movement’, Syntax 1, 128–191.
Keine, Stefan (2008): Long-Distance Agreement und Zyklisches Agree. Ms., Universität Leipzig.
Keine, Stefan (2010): Case and Agreement from Fringe to Core. Impoverishment Effects on Agree. Linguistische Arbeiten, Mouton de Gruyter, Berlin.
Keine, Stefan (2011): On Deconstructing Switch Reference. Ms., University of Massachusetts, Amherst. To appear in Natural Language and Linguistic Theory.
Keine, Stefan & Gereon Müller (2008): Differential Argument Encoding by Impoverishment. In: M. Richards & A. Malchukov, eds, Scales. Vol. 86 of Linguistische Arbeitsberichte, Universität Leipzig, pp. 83–136.
Khalilova, Zaira (2007): Clause Linkage: Coordination, Subordination and Cosubordination in Khwarshi. Ms., Max-Planck-Institut für evolutionäre Anthropologie, Leipzig.
Kiss, Tibor (1995): Infinite Komplementation. Niemeyer, Tübingen.
Kiss, Tibor (2004): Psychologische Prädikate und reflexive Bindung. Ms., Universität Bochum.
Kiziak, Tanja (2007): Long Extraction or Parenthetical Insertion? Evidence from Judgement Studies. In: N. Dehé & Y. Kavalova, eds, Parentheticals. Benjamins, Amsterdam, pp. 121–144.
Klima, Edward (1964): Negation in English. In: J. Fodor & J. Katz, eds, The Structure of Language. Prentice Hall, Englewood Cliffs, New Jersey, pp. 246–323.
Koster, Jan (1978): Locality Principles in Syntax. Foris, Dordrecht.
Koster, Jan (1987): Domains and Dynasties. Foris, Dordrecht.


Koster, Jan (2000): Variable-Free Grammar. Ms., University of Groningen.
Kotzoglou, George (2002): ‘Greek ‘ECM’ and How to Control It’, Reading Working Papers in Linguistics 6, 39–56.
Kramer, Ruth (2007): The Amharic Definite Marker and the Syntax/PF Interface. Ms., University of California at Santa Cruz.
Kratzer, Angelika (1998): More Structural Analogies Between Pronouns and Tenses. In: D. Strolovitch & A. Lawson, eds, Semantics and Linguistic Theory. Vol. 8, Cornell University, Ithaca, New York, pp. 92–110.
Kroch, Anthony (1989): Asymmetries in Long Distance Extraction in a Tree Adjoining Grammar. In: M. Baltin & A. Kroch, eds, Alternative Conceptions of Phrase Structure. University of Chicago Press, Chicago, pp. 66–98.
Kuno, Susumu (1987): Functional Syntax. Chicago University Press, Chicago.
Lahne, Antje (2008): Specificity-driven Syntactic Derivation: A New View on Long-distance Agreement. Ms., Universität Leipzig.
Lahne, Antje (2009): Where There is Fire There is Smoke. Local Modelling of Successive-Cyclic Movement. PhD thesis, Universität Leipzig.
Lechner, Winfried (2009): Evidence for Survive from Covert Movement. In: M.T. Putnam, ed., The Survive Principle in a Crash Proof Syntax. John Benjamins, Amsterdam, pp. 231–256.
Legate, Julie Anne (2005): ‘Phases and Cyclic Agreement’, MITWPL 49, 147–156. Perspectives on Phases.
Legendre, Géraldine, Jane Grimshaw & Sten Vikner, eds (2001): Optimality-Theoretic Syntax. MIT Press, Cambridge, Mass.
Legendre, Géraldine, Paul Smolensky & Colin Wilson (1998): When is Less More? Faithfulness and Minimal Links in Wh-Chains. In: P. Barbosa, D. Fox, P. Hagstrom, M. McGinnis & D. Pesetsky, eds, Is the Best Good Enough?. MIT Press and MITWPL, Cambridge, Mass., pp. 249–289.
Levine, Robert & Ivan Sag (2003a): Some Empirical Issues in the Grammar of Extraction. In: S. Müller, ed., Proceedings of the HPSG03 Conference. CSLI Publications, Michigan State University, East Lansing.
Levine, Robert & Ivan Sag (2003b): Wh-Nonmovement. Ms., Stanford University. To appear in Gengo Kenkyu.
Lightfoot, David (1994): Degree-0 Learnability. In: B. Lust, G. Hermon & J. Kornfilt, eds, Syntactic Theory and First Language Acquisition: Crosslinguistic Perspectives. Vol. 2: Binding Dependency and Learnability. Erlbaum, Hillsdale, NJ, pp. 453–472.
Longobardi, Giuseppe (1985): Connectedness and Island Constraints. In: J. Guéron, H.-G. Obenauer & J.-Y. Pollock, eds, Grammatical Representation. Foris, Dordrecht, pp. 169–185.
Lutz, Uli, Gereon Müller & Arnim von Stechow, eds (2000): Wh-Scope Marking. Benjamins, Amsterdam.
Maling, Joan & Annie Zaenen (1982): A Phrase Structure Account of Scandinavian Extraction Phenomena. In: P. Jacobson & G. Pullum, eds, The Nature of Syntactic Representation. Reidel, Dordrecht, pp. 229–282.
Manzini, Rita & Kenneth Wexler (1987): ‘Parameters, Binding Theory, and Learnability’, Linguistic Inquiry 18, 413–444.
Marušič, Franc (2005): On Non-Simultaneous Phases. PhD thesis, Stony Brook University.
Massam, Diane (1985): Case Theory and the Projection Principle. PhD thesis, MIT, Cambridge, Mass.
Matushansky, Ora (2005): ‘Going Through a Phase’, MITWPL 49, 157–181. Perspectives on Phases.
McCloskey, James (1979): Transformational Syntax and Model Theoretic Semantics. A Case Study in Modern Irish. Reidel, Dordrecht.
McCloskey, James (1988): Syntactic Theory. In: F. Newmeyer, ed., Linguistics: The Cambridge Survey. Vol. I. Linguistic Theory: Foundations. Cambridge University Press, Cambridge, pp. 18–59.
McCloskey, James (2000): ‘Quantifier Float and Wh-Movement in Irish English’, Linguistic Inquiry 31, 57–84.


McCloskey, James (2002): Resumptives, Successive Cyclicity, and the Locality of Operations. In: S. D. Epstein & T. D. Seely, eds, Derivation and Explanation in the Minimalist Program. Blackwell, Oxford, pp. 184–226.
McFadden, Thomas (2009): Structural Case, Locality and Cyclicity. In: K. K. Grohmann, ed., Explorations of Phase Theory. Features and Arguments. Mouton de Gruyter, Berlin, pp. 107–130.
Merchant, Jason (2001): The Syntax of Silence - Sluicing, Islands, and the Theory of Ellipsis. Oxford University Press, Oxford.
Moortgat, Michael (1988): Categorial Investigations. Logical and Linguistic Aspects of the Lambek Calculus. Foris, Dordrecht.
Müller, Gereon (2011): Constraints on Displacement. A Phase-Based Approach. Vol. 7 of Language Faculty and Beyond, Benjamins, Amsterdam.
Müller, Gereon & Wolfgang Sternefeld (1993): ‘Improper Movement and Unambiguous Binding’, Linguistic Inquiry 24, 461–507.
Müller, Stefan (2007): Head-Driven Phrase Structure Grammar: Eine Einführung. Stauffenburg, Tübingen.
Neeleman, Ad & Hans van de Koot (2010): ‘A Local Encoding of Syntactic Dependencies and its Consequences for the Theory of Movement’, Syntax 13, 331–372.
Nissenbaum, Jon (2000): Covert Movement and Parasitic Gaps. In: M. Hirotani, A. Coetzee, N. Hall & J.-Y. Kim, eds, Proceedings of NELS 30. GLSA, Amherst, Mass, pp. 542–555.
Noonan, Máire (2002): CP-Pied-Piping and Remnant IP Movement in Long Distance Wh-Movement. In: A. Alexiadou, E. Anagnostopoulou, S. Barbiers & H.-M. Gärtner, eds, Dimensions of Movement. John Benjamins, Amsterdam, pp. 269–295.
Nunes, Jairo (2004): Linearization of Chains and Sideward Movement. MIT Press, Cambridge, Mass.
Obata, Miki & Samuel David Epstein (2011): ‘Feature-Splitting Internal Merge: Improper Movement, Intervention, and the A/A′ Distinction’, Syntax 14, 122–147.
Ogihara, Toshiyuki (1989): Temporal Reference in English and Japanese. PhD thesis, University of Texas, Austin.
Ortiz de Urbina, Jon (1989): Parameters in the Grammar of Basque: A GB Approach to Basque Syntax. Foris, Dordrecht.
Perlmutter, David & Scott Soames (1979): Syntactic Argumentation and the Structure of English. The University of California Press, Berkeley.
Pesetsky, David (1985): ‘Morphology and Logical Form’, Linguistic Inquiry 16, 193–246.
Pesetsky, David (1995): Zero Syntax. MIT Press, Cambridge, Mass.
Pica, Pierre (1987): On the Nature of the Reflexivization Cycle. In: Proceedings of NELS. Vol. 17, Amherst: GSLA, pp. 483–500.
Plessis, Hans du (1977): ‘Wh-Movement in Afrikaans’, Linguistic Inquiry 8, 723–726.
Polinsky, Maria (2003): ‘Non-Canonical Agreement is Canonical’, Transactions of the Philological Society 101.
Polinsky, Maria & Eric Potsdam (2001): ‘Long-Distance Agreement and Topic in Tsez’, Natural Language and Linguistic Theory 19, 583–646.
Pollard, Carl & Ivan Sag (1992): ‘Anaphors in English and the Scope of Binding Theory’, Linguistic Inquiry 23, 261–303.
Pollard, Carl J. & Ivan A. Sag (1994): Head-Driven Phrase Structure Grammar. University of Chicago Press, Chicago.
Preminger, Omer (2009): ‘Breaking Agreements: Distinguishing Agreement and Clitic Doubling by Their Failures’, Linguistic Inquiry 40, 619–666.
Progovac, Ljiljana (1992): ‘Relativized SUBJECT: Long-Distance Reflexives without Movement’, Linguistic Inquiry 23, 671–680.
Pullum, Geoffrey (1989): ‘The Incident of the Node Vortex Problem’, Natural Language and Linguistic Theory 7, 473–479.
Reinhart, Tanya & Eric Reuland (1993): ‘Reflexivity’, Linguistic Inquiry 24, 657–720.
Reis, Marga (1996): Extractions from Verb-Second Clauses in German? In: U. Lutz & J. Pafel, eds, On Extraction and Extraposition in German. Benjamins, Amsterdam, pp. 45–88.
Reuland, Eric (2001): ‘Primitives of Binding’, Linguistic Inquiry 32, 439–492.
Richards, Marc (2004): Object Shift and Scrambling in North and West Germanic: A Case Study in Symmetrical Syntax. PhD thesis, University of Cambridge, Cambridge, UK.
Richards, Marc (2011): ‘Deriving the Edge: What’s in a Phase?’, Syntax 14, 74–96.
Riemsdijk, Henk van (1978): A Case Study in Syntactic Markedness: The Binding Nature of Prepositional Phrases. Foris, Dordrecht.
Rizzi, Luigi (1982): Issues in Italian Syntax. Foris, Dordrecht.
Rizzi, Luigi (1990): Relativized Minimality. MIT Press, Cambridge, Mass.
Robins, Robert H. (1958): The Yurok Language: Grammar, Texts, Lexicon. University of California Press, Berkeley.
Ross, John (1967): Constraints on Variables in Syntax. PhD thesis, MIT, Cambridge, Mass.
Sabel, Joachim (2000): Partial Wh-Movement and the Typology of Wh-Questions. In: U. Lutz, G. Müller & A. von Stechow, eds, Wh-Scope Marking. Benjamins, Amsterdam, pp. 409–446.
Sag, Ivan & Thomas Wasow (1999): Syntactic Theory. A Formal Introduction. CSLI Publications, Stanford University.
Saito, Mamoru (1985): Some Asymmetries in Japanese and Their Theoretical Implications. PhD thesis, MIT, Cambridge, Mass.
Samek-Lodovici, Vieri (2006): Studies in OT Syntax and Semantics. Elsevier, Amsterdam. Lingua Special Issue 9, vol. 117.
Schäfer, Florian (2008): The Syntax of (Anti-)Causatives. Benjamins, Amsterdam.
Seiter, William J. (1983): Advancements and Verb Agreement in Southern Tiwa. In: D. Perlmutter, ed., Studies in Relational Grammar 1. University of Chicago Press, Chicago, pp. 317–359.
Sells, Peter (1984): Syntax and Semantics of Resumptive Pronouns. PhD thesis, University of Massachusetts, Amherst.
Sells, Peter (1987): ‘Aspects of Logophoricity’, Linguistic Inquiry 18, 445–479.
Sells, Peter (2006): Using Subsumption Rather than Equality in Functional Control. In: M. Butt & T. King, eds, Proceedings of LFG-06. CSLI Publications, Universität Konstanz.
Shlonsky, Ur (1988): ‘Government and Binding in Hebrew Nominals’, Linguistics 26, 951–976.
Silverstein, Michael (1976): Hierarchy of Features and Ergativity. In: R. Dixon, ed., Grammatical Categories in Australian Languages. Australian Institute of Aboriginal Studies, Canberra, pp. 112–171.
Sportiche, Dominique (1989): ‘Le Mouvement Syntaxique: Contraintes et Paramètres’, Langages 95, 35–80.
Stechow, Arnim von (1995): On the Proper Treatment of Tense, Proceedings of Semantics and Linguistic Theory V, pp. 362–386.
Stechow, Arnim von (2003): Binding by Verbs: Tense, Person and Mood under Attitudes. In: H. Lohnstein & S. Trissler, eds, The Syntax and Semantics of the Left Periphery. Mouton de Gruyter, Berlin, pp. 431–488.
Steedman, Mark (2001): The Syntactic Process. MIT Press, Cambridge, Mass.
Sternefeld, Wolfgang (1997): The Semantics of Reconstruction and Connectivity. SfS-Report 97-97, Universität Tübingen.
Sternefeld, Wolfgang (2000): Grammatikalität und Sprachvermögen – Anmerkungen zum Induktionsproblem in der Syntax. In: J. Bayer & Chr. Römer, eds, Von der Philologie zur Grammatiktheorie. Niemeyer, Tübingen, pp. 15–42.
Stjepanović, Sandra & Shoichi Takahashi (2001): Eliminating the Phase Impenetrability Condition. Ms., Kanda University of International Studies.
Svenonius, Peter (2004): On the Edge. In: D. Adger, C. de Cat & G. Tsoulas, eds, Peripheries. Syntactic Edges and their Effects. Kluwer, Dordrecht, pp. 261–287.
Takahashi, Daiko (1994): Minimality of Movement. PhD thesis, University of Connecticut.
Torrego, Esther (1984): ‘On Inversion in Spanish and Some of Its Effects’, Linguistic Inquiry 15, 103–129.


Travis, Lisa (1984): Parameters and Effects of Word Order Variation. PhD thesis, MIT, Cambridge, Mass.
Unger, Christina (2010): A Computational Approach to the Syntax of Displacement and the Semantics of Scope. PhD thesis, Universiteit Utrecht, LOT.
Ura, Hiroyuki (2007): ‘Long-Distance Case Assignment in Japanese and Its Dialectal Variation’, Gengo Kenkyu 131, 1–43.
Uriagereka, Juan (1999): Multiple Spell-Out. In: S. Epstein & N. Hornstein, eds, Working Minimalism. MIT Press, Cambridge, Mass., pp. 251–282.
Uszkoreit, Hans & Stanley Peters (1986): ‘On Some Formal Properties of Metarules’, Linguistics and Philosophy 9, 477–494.
Vainikka, Anne & Pauli Brattico (2011): ‘The Finnish Accusative’, Biolinguistica Fennica Working Papers 2, 33–58.
Wahba, Wafaa Abdel-Faheem Batran (1992): LF Movement in Iraqi Arabic. In: C.-T. J. Huang & R. May, eds, Logical Structure and Linguistic Structure. Kluwer, Dordrecht, pp. 253–276.
Watanabe, Akira (2000): ‘Feature Copying and Binding: Evidence from Complementizer Agreement and Switch Reference’, Syntax 3, 159–181.
Weisser, Philipp, ed. (2012): Papers on Switch-Reference. Vol. 89 of Linguistische Arbeits Berichte, Institut für Linguistik, Universität Leipzig.
Zemskaja, E.A. (1973): Russkaja Razgovornaja Reč'. Nauka, Moskva.

(Alexiadou) Institut für Linguistik: Anglistik, Universität Stuttgart
(Kiss) Sprachwissenschaftliches Institut, Ruhr-Universität Bochum
(Müller) Institut für Linguistik, Universität Leipzig

Fabian Heck & Juan Cuartero

Long Distance Agreement in Relative Clauses*

Abstract

We argue that person and number agreement between the verb of a relative clause and its head noun is, at first sight, incompatible with the PIC in its most restrictive form. We then propose a theory that allows us to maintain the strict version of the PIC while still accounting for these instances of cross-clausal agreement. The approach makes use of the idea that agreement applies cyclically and involves feature sharing. Since relative pronouns in English and German differ with respect to their featural specification, they create different agreement relations. Crucially, a difference in agreement relations on early cycles results in distinct feature structures, which serve as the input for later cycles. In this way, some variation between English and German with respect to agreement into relative clauses is derived.

1. An observation

In English, the verb of a relative clause agrees in number and person with the “head noun” (henceforth HN) that the relative clause modifies. (1) and (2) illustrate this for appositive relative clauses (see Akmajian (1970, 154)) and cleft constructions (see Ross (1970, 251), Akmajian (1970, 153)), respectively. To make sure that person agreement has applied, the examples involve a first person pronoun as HN (third person agreement is arguably the default in English). Similar facts can be observed for French (cf. Jespersen (1927, 90)). For reasons of space, we confine ourselves to English here.¹

* Different versions of this paper were presented at the DGfS-Workshop “Local Modelling of Non-Local Dependencies in Syntax” (Universität Bamberg, February 2008), Generative Grammatik im Süden (ZAS Berlin, May 2008), and the Workshop “Perspektiven minimalistischer Syntax” (Universität Leipzig, October 2008). We would like to thank the audiences on these occasions, and, especially, the participants of the Colloquium on Syntax and Morphology at the Universität Leipzig. Particular thanks go to Petr Biskup, Werner Frey, Katharina Hartmann, Andrew McIntyre, Gereon Müller, Stefan Müller, Marga Reis, Marc Richards, Peter Sells, Volker Struckmeier, and Ralf Vogel.

¹ Three remarks about the examples in (2) are in order. First, there is considerable variation as to person (but not number) agreement in English clefts. Akmajian (1970) discusses three dialects, only one of which shows person agreement (and which is also the dialect mentioned by Ross (1970)). We are exclusively concerned with this dialect here. Second, according to Akmajian (1970, 151, footnote 3), who and that are interchangeable in clefts with human antecedents; we found that some speakers have a slight preference for who in this case. Third, the examples in (2) all involve a nominative marked HN. Objective marked HNs are also possible but show additional restrictions on agreement (see section 5.1).

Local Modelling of Non-Local Dependencies in Syntax, 49–83. Artemis Alexiadou, Tibor Kiss & Gereon Müller (eds.). Linguistische Arbeiten 547, de Gruyter 2012.


(1) a. I, who am tall, was forced to squeeze into that VW.
    b. *I, who is tall, was forced to squeeze into that VW.
    c. We, who are tall, were forced to squeeze into that VW.
    d. *We, who is/am tall, were forced to squeeze into that VW.

(2) a. It is I who/that am responsible.
    b. *It is I who/that is responsible.
    c. It is we who/that are responsible.
    d. *It is we who/that is/am responsible.

In what follows, we argue that the facts in (1) and (2) pose a theoretical challenge if one assumes (a) that agreement is subject to the strict version of the Phase Impenetrability Condition (PIC, see Chomsky (2000)) and (b) that relative pronouns in English are underspecified for both person and number. We then propose that the challenge can be addressed appropriately if agreement applies cyclically and involves feature sharing.

Before we turn to the challenge in section 2, we would like to briefly present our assumptions about the structure of relative clauses. First, we take it that relative clause constructions (RCCs) and cleft constructions (CCs) are structurally very similar (see Schachter (1973), Chomsky (1977)). Both involve a CP that (a) is introduced by a C-element or a relative pronoun (RLP), (b) contains a gap (sometimes filled by a resumptive pronoun, RSP), and (c) modifies a HN:

(3) ... HN [CP {RLP/C} ... {gap/RSP} ... ] ...

We refer to the CP in (3) as the relative clause (RC), irrespective of whether it is part of a RCC or a CC. Note that we are concerned with appositive RCs only (as opposed to restrictive ones). The reason is that restrictive RCs hardly combine with first or second person pronouns for independent reasons, which makes it hard to investigate person agreement in this context. Second, we assume that a RC that modifies a (pronominal) HN is always merged as the complement of the HN (see Smith (1964), Chomsky (1965)). Finally, we follow Chomsky (1965; 1977), Ross (1967), and many others in assuming that the gap within the RC is the result of wh-movement of a (possibly phonetically empty) RLP to SpecC.

2. The challenge

We now briefly illustrate why and under which assumptions the agreement facts in (1) and (2) are a challenge. At this point, we avoid the discussion of technical details concerning the operation Agree, which, according to Chomsky (2000; 2001), is at the heart of syntactic agreement. These issues are addressed in subsequent sections.


It is often assumed that agreement is an asymmetrical relation. Some node β has the potential to adopt different values for a feature [F] it bears. β thus seeks another node γ from which it can receive a value for [F]; this leads to agreement between β and γ with respect to [F]. Another common assumption is that agreement is subject to locality conditions. Suppose that β and γ can only agree if γ is accessible for β. A particular notion of accessibility is proposed by Chomsky (2000; 2001). The idea is that CP and vP are special categories, which are called phases. Phases are subject to the (strict version of the) Phase Impenetrability Condition (PIC, see Chomsky (2000, 108), Chomsky (2008, 141, footnote 24)). The PIC defines the domain in which γ is accessible for β, see (4).²

(4) Phase Impenetrability Condition
    If γ is dominated by a phase P, then γ is inaccessible from outside P (for some β) unless γ is in the edge domain of P.

(5) Edge domain
    γ is in the edge domain of P iff a. or b. hold.
    a. γ is a specifier of P.
    b. γ is the head of P.

² Chomsky (2001, 14) proposes a more liberal version of the PIC.

Suppose that β is the head within the RC that hosts the features for verbal subject agreement. Following Chomsky (1957) (and much subsequent literature), we assume this head to be T. By assumption, T enters the derivation with unvalued agreement features. Now, a plausible candidate for γ in (1) and (2) to value the features on T is the subject RLP who, or, for that matter, the phonetically empty RLP in the case of that-RCs. Who, being the external argument, is merged in Specv. Although T is separated from who by the vP-phase, who is still accessible for T because who occupies the edge domain of v (see (4) and (5-a)). Agree between T and who (and thus valuation) can apply (see (6)). Hence, in this scenario there is no challenge.

(6) HN [CP ... T ... [vP RLP ... ]] ...
    [arrow linking T and RLP: AGREEMENT]

However, there is reason to doubt that who actually bears the appropriate features to value T. Let us confine ourselves to number agreement for the moment. To begin with, English who also leads the life of an interrogative pronoun. Crucially, in its interrogative use, it cannot trigger number agreement; rather, a verb whose subject is interrogative who is always in the singular (which, presumably, is the default), see (7).

(7) a. Who is asleep?
    b. *Who are asleep?

This is unexpected if relative who and interrogative who are the same lexical element and if relative who is assumed to trigger plural agreement in RCs. To account for the asymmetry in number agreement between interrogative and relative constructions, one could assume, in principle, that there are two homophonous instances of who, one being used in relative clauses, the other in interrogative clauses. Relative who would be specified for number while interrogative who would lack this specification. We think, however, that this assumption is not correct for the following reason. First, it can be observed that in German CCs and RCCs, just as in English, there is obligatory number and person agreement (the latter being confined to plural contexts, see section 5.3). This is illustrated by the contrasts in (8) and (9), respectively, which involve RCs that are introduced by the d-RLP die. Elements that introduce RCs are glossed as REL, comprising both RLPs and RC-complementizers.³

(8) a. weil ihr es seid, die die ganze Arbeit macht.
       since you.2.PL it are, REL the whole work do.2.PL
       ‘since it is you who do all the work.’
    b. ??weil ihr es seid, die die ganze Arbeit machen.
       since you.2.PL it are, REL the whole work do.3.PL

(9) a. Ihr, die immer Ärger macht, habt mir gerade noch gefehlt.
       you.2.PL REL always trouble make.2.PL have me PART yet lacked
       ‘You, who always cause trouble, are the last thing I need.’
    b. *Ihr, die immer Ärger machen, habt mir gerade noch gefehlt.
       you.2.PL REL always trouble make.3.PL have me PART yet lacked

³ For some reason, the contrast is much clearer in RCCs such as (9) than in CCs (see (8)). Some speakers are even indecisive about which form to choose in a CC. We abstract away from this here.

Second, similar to English who, German d-pronouns also fulfill another function: they act as demonstrative pronouns. As such, they do not trigger person agreement; rather, the verb is always marked third person (again, plausibly the default), see (10-a,b).

(10) a. Die da unten haben ein Problem.
        DEM there down have.3.PL a problem
        ‘They have a problem down there.’

     b. *Die da unten habt ein Problem.
        DEM there down have.2.PL a problem
        ‘You have a problem down there.’

In order to account for the facts in (8), (9), and (10), one would have to assume the existence of two homophonous instances of die, too, with only the RLP die being specifiable for first or second person. The problem is that this misses the generalization that in both English and German it is the relative variant of the homophonous pair that is fully specified for number (or person) while the interrogative (or demonstrative) variant is underspecified. It would be preferable to have a theory of number and person agreement in RCs that captures this generalization.

Our hunch is that the above mentioned homophonies are not accidental. Rather, we would like to hypothesize that in both cases the same lexical item is involved, which is underspecified for number or person. (To be precise, we assume that English who is underspecified for both of these features while German die is underspecified for person but is specified for number, see section 5.3.) Of course, if who lacks number and person to begin with, then it cannot provide any values for these features on T. Consequently, the question arises as to where these values come from. Chomsky (1995, 228) introduces the condition in (11), which prevents features (or their values) from being “conjured up” out of thin air.

(11) Inclusiveness Condition (IC)
     The derivation can only make use of elements that have been taken from the lexicon.

Given the IC, the source of the feature values on T of the RC must be present in the structure. We can think of two scenarios. Either there is an empty RSP that occupies the subject gap in the RC and that is equipped with the appropriate features. Or, alternatively, the feature values come from the HN. Let us leave open for the moment which of these two scenarios is to be preferred (but see section 4, where the resumption-based theory is rejected) and rather turn to the question as to whether they can account for the agreement facts in (1) and (2).

Consider first the hypothesis that it is the HN that provides the values for the Φ-features on T. While the HN is outside of the CP-phase (the RC), T is inside. Moreover, T is not in the edge domain of the CP-phase. Thus, the PIC in (4) prevents a direct Agree-relation between T and the HN, see (12).

(12) HN [CP {RLP/C} ... T ... ] ...
     [arrow linking HN and T, crossed out: AGREEMENT]


We conclude that if the HN is the source of the agreement at hand, then a theoretical problem arises. If the values of T come from an empty subject RSP, there is no such problem. Like the RLP (recall (6)), the (hypothesized) RSP is merged in Specv, where it is PIC-accessible for T inside the RC. However, this does not yet establish agreement between the HN and T. Since the feature values of the RSP and those of the HN are chosen independently of each other, it must be ensured that they coincide. And although both T and the RSP are within the RC, the HN is not. Again, agreement between the HN and the RSP (and, by transitivity, the HN and T) appears to run against the PIC (as was the case with (6)).

One might argue that the relation between the HN and the RSP of a RC is one between an anaphor and its antecedent. Arguably, such anaphoric agreement differs from syntactic agreement (see also section 5.4 on this issue) such as the one that holds between the subject and T. Now, if antecedent-anaphor agreement is not subject to the PIC, then, of course, no PIC-problem arises for the resumption-based theory. However, we argue in section 4.2 that one cannot generally assume the presence of an (appropriate) RSP in the context of long distance agreement of the type in (1) and (2).

To conclude, given the strong version of the PIC in (4), and under the assumption that English RLPs lack person and number, the agreement facts in (1) and (2) pose a theoretical challenge.
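The accessibility reasoning behind (4), (5), (6) and (12) can be made concrete with a small sketch. The following Python fragment is only an illustration under simplifying assumptions: the tree encoding (`Node`, with `head`, `specifiers`, and `rest`) and the function names are hypothetical and not part of the original proposal, and wh-movement of the RLP to SpecC is not modelled.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Optional


@dataclass(eq=False)  # eq=False: nodes are compared by identity
class Node:
    label: str
    is_phase: bool = False                 # CP and vP are phases
    head: Optional["Node"] = None
    specifiers: List["Node"] = field(default_factory=list)
    rest: List["Node"] = field(default_factory=list)   # non-edge material

    def children(self) -> List["Node"]:
        head = [self.head] if self.head is not None else []
        return head + self.specifiers + self.rest

    def dominated(self) -> Iterator["Node"]:
        """Yield every node properly dominated by this node."""
        for child in self.children():
            yield child
            yield from child.dominated()


def dominates(p: Node, x: Node) -> bool:
    return any(n is x for n in p.dominated())


def in_edge_domain(gamma: Node, phase: Node) -> bool:
    # (5): gamma is a specifier of P or the head of P
    return gamma is phase.head or any(gamma is s for s in phase.specifiers)


def accessible(gamma: Node, beta: Node, root: Node) -> bool:
    # (4): gamma is inaccessible for beta if some phase P dominates gamma,
    # beta is outside P, and gamma is not in the edge domain of P
    for p in [root, *root.dominated()]:
        if (p.is_phase and dominates(p, gamma) and not dominates(p, beta)
                and not in_edge_domain(gamma, p)):
            return False
    return True


# Simplified structure: HN [CP C [TP T [vP RLP v ]]]
rlp, v = Node("RLP"), Node("v")
vP = Node("vP", is_phase=True, head=v, specifiers=[rlp])
T = Node("T")
TP = Node("TP", head=T, rest=[vP])
C = Node("C")
CP = Node("CP", is_phase=True, head=C, rest=[TP])
HN = Node("HN")
DP = Node("DP", head=HN, rest=[CP])

print(accessible(rlp, T, DP))   # RLP in Specv is visible to T, as in (6)
print(accessible(T, HN, DP))    # T is not visible to the HN, as in (12)
print(accessible(C, HN, DP))    # C, the phase head, is visible to the HN
```

In this toy model only the head C (and any specifier of C) remains visible to the HN, which is why the discussion below concentrates on getting T's features to the edge of the RC.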

3. More theoretical background

Before we present our proposal as to how the agreement facts can be accounted for while at the same time maintaining the PIC, we would like to introduce some theoretical background. As the discussion proceeds, some of the assumptions will be altered. Others will be introduced on the fly when needed.

Chomsky (2000; 2001) assumes that agreement applies between two features (or sets of features), which are called probe and goal, respectively. Crucially, the probe enters the derivation unvalued; it receives its value by establishing Agree with a valued goal. We write an unvalued feature [F] as [F:] and a feature [F] with value ω as [F:ω]. In what follows, we refer to the categories that bear probe and goal as the probe- and goal-category, respectively. Agree is then defined as in (13). Usually, Φ is a shorthand for the set that comprises person, number, and gender, see Chomsky (1981). In the present context, we only consider person and number as there is no (overt verbal) gender agreement in English.

(13) Agree
     A probe-category β establishes Agree with a goal-category γ iff a.-e. hold.
     a. γ bears a set of interpretable valued Φ-features ([Φ:ω]).
     b. β bears a matching (possibly improper) subset of uninterpretable unvalued Φ-features ([Φ:]).
     c. β c-commands γ.
     d. γ bears unvalued case ([CASE:]; that is, γ is still “active”).
     e. There is no alternative goal α that intervenes between β and γ.

As a consequence of Agree, the hitherto unvalued probe(s) receive(s) a value from the lexically valued goal(s); the unvalued case-feature on the goal-category becomes valued, too. All unvalued features must become valued if the derivation is to succeed. Finally, we take it that structure building operations (Move, Merge, Agree) apply cyclically, that is, they obey the Strict Cycle Condition (see (14); Chomsky (1973); Perlmutter and Soames (1979)), and derivationally from bottom to top (see Chomsky (1995; 2000; 2001)).

(14) Strict Cycle Condition (SCC)
     If Σ is the current root of the phrase marker, then no operation can take place exclusively within Ω, where Ω is properly dominated by Σ.

This said, we turn to our proposal as to how the facts in (1) and (2) should be accounted for, that is, how we think that the apparent gap in agreement locality should be bridged.
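As an informal illustration of the definition of Agree in (13), the following Python sketch checks conditions (13-a) to (13-e) and performs valuation. It rests on several simplifying assumptions that are not part of the text: c-command is approximated by position in a list (`spine`, where earlier items c-command later ones), the interpretable/uninterpretable distinction is not modelled, and the names `Category`, `is_goal`, `agree`, and `case_value` are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass(eq=False)  # compared by identity
class Category:
    label: str
    # Phi-features; a value of None stands for an unvalued feature, e.g. [person:]
    phi: Dict[str, Optional[str]] = field(default_factory=dict)
    has_case: bool = False          # does the category bear a case feature at all?
    case: Optional[str] = None      # None = unvalued [CASE:]


def is_goal(c: Category, probe_features: List[str]) -> bool:
    valued = all(c.phi.get(f) is not None for f in probe_features)  # (13-a)
    active = c.has_case and c.case is None                          # (13-d)
    return valued and active


def agree(beta: Category, gamma: Category,
          spine: List[Category], case_value: str) -> bool:
    """Try to establish Agree between probe-category beta and goal-category gamma.

    `spine` lists categories so that earlier ones c-command later ones
    (a simplifying assumption of this sketch).
    """
    probe = [f for f, v in beta.phi.items() if v is None]           # (13-b)
    if not probe:
        return False
    i, j = spine.index(beta), spine.index(gamma)
    if i >= j:                                                      # (13-c)
        return False
    if not is_goal(gamma, probe):
        return False
    if any(is_goal(alpha, probe) for alpha in spine[i + 1:j]):      # (13-e)
        return False
    for f in probe:                     # valuation of the probe
        beta.phi[f] = gamma.phi[f]
    gamma.case = case_value             # case on the goal-category is valued, too
    return True


# A fully specified subject values T ...
T1 = Category("T", phi={"person": None, "number": None})
we = Category("we", phi={"person": "1", "number": "pl"}, has_case=True)
print(agree(T1, we, [T1, we], "nom"), T1.phi, we.case)

# ... whereas an RLP that lacks person and number cannot.
T2 = Category("T", phi={"person": None, "number": None})
who = Category("who", phi={}, has_case=True)
print(agree(T2, who, [T2, who], "nom"), T2.phi)
```

The second call fails on (13-a): under the hypothesis that who is underspecified for person and number, T remains unvalued, which is the core of the challenge described in section 2.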

4. Bridging the gap

4.1. Cyclic agreement

Suppose for the moment that the Φ-features of T of the RC receive their values from the HN via Agree. Since Agree between T and the HN cannot be established in a direct way (see section 2), T’s Φ-features have to reach a position where they are accessible from outside the CP-phase: the edge domain of the RC. To this end, suppose, following Legate (2005), that agreement can apply cyclically. The idea is that in a first step T agrees with C; in a second step, C agrees with the HN. As agreement is transitive, it then follows that T agrees with the HN.

This requires some modifications of the standard definition of Agree in (13). Consider the second step of cyclic agreement between the HN and C. Suppose that the C-head of a RC bears the same set of agreement features as T does (see Platzack (1987), Carstens (2003), Chomsky (2008); cf. also Haider (1993)). Now, (13-c) states that the probe must c-command the goal. This implies that the HN must be the probe-category, while C is the goal-category. According to (13-b), [Φ] on the probe-category is unvalued and uninterpretable while, according to (13-a), [Φ] on the goal-category is valued and interpretable. But under the present assumptions, it is the other way round: the Φ-features on the HN, the probe-category, are valued and interpretable, while the Φ-features of C, the goal-category, are unvalued and uninterpretable. We thus take it that the case feature on the HN, which under standard assumptions is unvalued and uninterpretable, can act as a probe, too. It follows that there must be a matching goal [CASE:ω] on C (with ω = nominative in English). Similar to the assumption (mentioned below (13)) that Agree between Φ-features automatically triggers valuation of [CASE:], we now assume that Agree between case features leads to automatic valuation of [Φ:] (provided that there is an appropriate [Φ:ω]).

Yet, even with these modifications in place there still remains a problem. Assume that the derivation reaches the stage where the C-head of the RC has just been merged. Since T enters the derivation with unvalued Φ-features, it cannot pass any Φ-values onto C. As a consequence, C does not bear any feature values that could be matched against the values of the HN’s Φ-features, leading to (indirect) agreement between the HN and T. There is, in fact, an alternative derivation that involves downward cyclic agreement: first the HN values the (hitherto unvalued) Φ-features on C, which is possible as the HN enters the derivation with valued Φ-features; second, the now valued Φ-features on C value those on T. However, the latter step of this derivation is blocked by the SCC; again, (indirect) agreement cannot be derived.

4.2. Resumption

At this point, it appears that the scenario that involves a RSP within the RC instead of a gap (see section 2) has the advantage: namely, if the RSP has valued Φ-features, then it is able to pass these values on to T. T in turn can then value the Φ-features on C; finally, C can agree with the HN.
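The three derivations considered so far (upward cyclic agreement with an unvalued T, downward cyclic agreement, and the resumption scenario) can be compared in a toy simulation. This is a rough sketch under strong simplifications: heads are ordered by a hypothetical `HEIGHT` table, each step passes a Φ-value between two adjacent heads, the SCC in (14) is approximated by requiring that steps never return to a cycle below the current root, and the PIC is not modelled separately. None of these names come from the original text.

```python
from typing import Dict, List, Optional, Tuple

# Bottom-up merge order of the relevant heads (hypothetical encoding)
HEIGHT = {"SUBJ": 0, "T": 1, "C": 2, "HN": 3}


def derive(phi: Dict[str, Optional[str]],
           steps: List[Tuple[str, str]]) -> str:
    """Run a sequence of (giver, receiver) agreement steps.

    phi maps each head to its Phi-value (None = unvalued).
    """
    phi = dict(phi)
    root = 0
    for giver, receiver in steps:
        cycle = max(HEIGHT[giver], HEIGHT[receiver])
        if cycle < root:
            # the step would apply exclusively within a domain
            # properly dominated by the current root, cf. (14)
            return "blocked by SCC"
        root = cycle
        if phi[giver] is None:
            return "no value to pass on"
        if phi[receiver] is None:
            phi[receiver] = phi[giver]          # valuation
        elif phi[receiver] != phi[giver]:
            return "feature mismatch"
    if phi["T"] is not None and phi["T"] == phi["HN"]:
        return "agreement derived"
    return "T unvalued"


gap = {"SUBJ": None, "T": None, "C": None, "HN": "1sg"}   # underspecified RLP
rsp = {"SUBJ": "1sg", "T": None, "C": None, "HN": "1sg"}  # valued RSP

print(derive(gap, [("T", "C"), ("C", "HN")]))                  # upward
print(derive(gap, [("HN", "C"), ("C", "T")]))                  # downward
print(derive(rsp, [("SUBJ", "T"), ("T", "C"), ("C", "HN")]))   # resumption
```

Only the resumption scenario converges in this model, which matches the informal reasoning above; whether an appropriate RSP can actually be assumed is a separate empirical question taken up in the following paragraphs.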
Interestingly, CCs and appositive RCs in English do not exhibit weak crossover (WCO) effects (see Lasnik and Stowell (1991, 715-716); see also Postal (1993, 550-554) and Adesola (2006) on CCs in French and Yoruba, respectively). Moreover, it has been observed that WCO effects do not arise if (overt) RSPs are present (see Safir (1984) on English; Sells (1984, 253), Shlonsky (1992, 460) on Hebrew; Postal (1993, 553) on French; see also Safir (2004, 114-121)). This independently suggests that CCs and appositive RCCs involve (empty) RSPs.

There is a complication, though. Adger (2011) argues for the existence of two different types of RSPs. One motivation for the distinction is that in some languages RSPs repair island violations while in others they do not. The latter type of RSPs are called “bare” RSPs by Adger. Moreover, Adger (2011) argues that the inability to repair island violations goes hand in hand with the RSP’s inability to trigger Φ-agreement. As a consequence, Adger’s theory, which relates

Long Distance Agreement in Relative Clauses


these two facts, is based on the assumption that bare RSPs lack Φ-features (the absence of WCO effects in CCs and appositive RCCs is not affected by this). Against this background, consider the CCs in (15).

(15) a. *It is I who Mary made the claim that am responsible.
     b. *It is I who Mary knows the person that claims that am responsible.
     c. *It is I who Mary wonders why am responsible.
     d. *It is I who that am responsible is highly probable.

All examples in (15) are strongly ungrammatical. (15-a,b) involve violations of a CNPC-island (an argument clause in (15-a), a relative clause in (15-b)); (15-c) violates a wh-island; and (15-d) violates a subject island. Apparently, none of these violations can be repaired by a (hypothesized, empty) non-bare RSP (see also Perlmutter (1972, 90) and Postal (1993, 554, footnote 19) on unrepairable island violations in French RCs). Note in passing that (15-a,b,d) also involve that-trace effects. However, their ungrammaticality is stronger than one would expect if only that-trace effects were at stake.

The upshot of all this is that if CCs in English involve RSPs at all, then these must be bare RSPs. But if so, then one cannot expect them to value the Φ-features of T: according to Adger (2011), bare RSPs lack Φ-features to begin with.⁴ It thus seems as if we were back to square one: on the one hand, there is evidence that there is no source of Φ-feature values within the RC (such as a non-bare RSP) that could pass them onto C via cyclic agreement, where they become accessible from outside the RC; on the other hand, the SCC prevents an analysis that involves downward cyclic agreement, passing the Φ-values from the HN via C onto the T-head of the RC.

⁴ In section 7, some facts are presented that weaken the argument given above; see in particular footnote 20. In what follows, we ignore this caveat and instead concentrate on an alternative to the resumption-based analysis.

4.3. Feature sharing

We can make some headway on the problem if we assume that agreement involves feature sharing (see Pollard and Sag (1994), Frampton and Gutman (2000), Legate (2005), Pesetsky and Torrego (2007)). The idea is that a probe β and a goal γ coalesce into one single feature (matrix) if they enter into an Agree relation. To fully exploit this idea, let us assume that Agree can be established between β and γ even if γ does not provide any value for β (that is, effectively, no valuation takes place). All that is required for Agree to apply is that an unvalued probe finds a matching goal (fully valued or not).

To illustrate the idea, consider again a stage of the derivation where the C-head of the RC has just been merged. Suppose that the feature matrix of T equals the feature matrix of C (modulo their being valued or not); then Agree leads to coalescence of the two matrices into a single one, associated with both categories. This is shown by the representation in (16). (Note in passing that Agree values [CASE:] on C in (16).)

(16)

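\[
\begin{array}{cc}
\mathrm{C} & \mathrm{T} \\[2pt]
\begin{bmatrix} \text{CASE}: \\ \text{PERS}: \\ \text{NUM}: \end{bmatrix} &
\begin{bmatrix} \text{CASE}:x \\ \text{PERS}: \\ \text{NUM}: \end{bmatrix}
\end{array}
\;\overset{\text{Agree}}{\Longrightarrow}\;
\begin{array}{c}
\mathrm{C} \qquad \mathrm{T} \\[2pt]
\begin{bmatrix} \text{CASE}:x \\ \text{PERS}: \\ \text{NUM}: \end{bmatrix}
\end{array}
\]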

In principle, every feature in C’s matrix could have served as the probe that triggers Agree and coalescence of the matrices in (16). We take it, however, that it is sufficient for there to be one probe (i.e., an unvalued feature) on the probe-category to find a goal on the goal-category in order for all features of the probe-category to coalesce with the features of the goal-category. Note that the case feature on the goal-category in (16) is valued before Agree takes place. We therefore abandon Chomsky’s (2000) “activation condition” in (13-d) (cf. Carstens (2001; 2003), Nevins (2004); see also section 6).

Next suppose the derivation reaches the stage where the RC is complete and merges with the HN. The relevant configuration is shown in (17), where D represents the HN (for the time being, we ignore the RLP, but see section 5.2).

(17)

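\[
\begin{array}{cc}
\mathrm{D} & \mathrm{C} \qquad \mathrm{T} \\[2pt]
\begin{bmatrix} \text{CASE}: \\ \text{PERS}:y \\ \text{NUM}:z \end{bmatrix} &
\begin{bmatrix} \text{CASE}:x \\ \text{PERS}: \\ \text{NUM}: \end{bmatrix}
\end{array}
\;\overset{\text{Agree}}{\Longrightarrow}\;
\begin{array}{c}
\mathrm{D} \qquad \mathrm{C} \qquad \mathrm{T} \\[2pt]
\begin{bmatrix} \text{CASE}:x \\ \text{PERS}:y \\ \text{NUM}:z \end{bmatrix}
\end{array}
\]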

As C is in the edge domain of the RC, it is accessible for the HN. The unvalued case feature of the HN triggers Agree with the valued case feature of C. This leads to coalescence, thereby valuing [CASE:] on the HN, the Φ-features on C, and, crucially, also the Φ-features on T. As a result, the gap in locality is bridged without violating the PIC. The problem presented in section 2 is thus solved. Also note that valuation of the Φ-features on T does not violate the SCC: the operation in question does not exclusively affect CP or TP; the probe is on the D-head, and thus DP is affected as well. This solves the problem that came up in section 4.1.

The proposal also derives without further ado why there is no number agreement in English interrogatives with who while there is such agreement in RCs with who (see section 2): who lacks number, and it is only in RCs, not in interrogatives, that the C/T-complex can inherit a number value from an antecedent, namely the HN.
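The two Agree steps in (16) and (17) can be made concrete with a small sketch. The Python below is only an illustration under simplifying assumptions: feature matrices are dictionaries, an unvalued feature is `None`, and coalescence makes all heads involved refer to one shared object. The names `Matrix`, `Head`, and `agree` are hypothetical, the value `"nom"` stands in for x, and neither c-command, the PIC, nor the SCC is modelled.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional

Features = Dict[str, Optional[str]]  # None marks an unvalued feature


@dataclass
class Matrix:
    """A feature matrix; several heads may refer to the same instance."""
    feats: Features


@dataclass
class Head:
    """A syntactic head that is associated with (points to) a feature matrix."""
    label: str
    matrix: Matrix


def agree(probe: Head, goal: Head, heads: List[Head]) -> bool:
    """Coalesce the matrices of probe and goal into one shared matrix.

    Agree applies if the probe has at least one unvalued feature with a
    matching feature on the goal, whether or not that feature is valued.
    """
    p, g = probe.matrix, goal.matrix
    if p is g:
        return False  # the two heads already share a matrix
    if not any(v is None and k in g.feats for k, v in p.feats.items()):
        return False  # no unvalued feature on the probe finds a match
    merged: Features = {}
    for key in list(p.feats) + [k for k in g.feats if k not in p.feats]:
        a, b = p.feats.get(key), g.feats.get(key)
        if a is not None and b is not None and a != b:
            return False  # simplification: differing values block coalescence
        merged[key] = a if a is not None else b
    shared = Matrix(merged)
    for head in heads:  # every head associated with p or g now shares
        if head.matrix is p or head.matrix is g:
            head.matrix = shared
    return True


# Toy derivation for "It is I who am responsible".
T = Head("T", Matrix({"CASE": "nom", "PERS": None, "NUM": None}))
C = Head("C", Matrix({"CASE": None, "PERS": None, "NUM": None}))
D = Head("D", Matrix({"CASE": None, "PERS": "1", "NUM": "sg"}))  # the HN
heads = [D, C, T]

step_16 = agree(C, T, heads)  # C and T coalesce; [CASE:] on C is valued
step_17 = agree(D, C, heads)  # the HN's unvalued [CASE:] probes C at the edge
print(T.matrix.feats)
```

Because T and C already refer to the same matrix after the first step, the second step values the Φ-features on T although the probe only ever inspects C; this is the sense in which the locality gap is bridged in the sketch.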


The analysis comes at a certain price, though. Namely, it is incompatible with the theory of cyclic spell-out (see Bresnan (1971), Uriagereka (1999), Chomsky (2001)), at least in its most radical form. In this theory, T, being the complement of the phase head C, has already undergone spell-out at the point where, according to the present theory, it receives its Φ-values. Consequently, there should be no overt agreement on T, contrary to fact. It is possible to avoid this conclusion by assuming a post-syntactic morphology (see Halle and Marantz (1993)). Of course, no such issue arises in a resumption-based theory.⁵

⁵ One can maintain cyclic spell-out by assuming that spell-out of the complement of a phase head P does not apply unless the next higher phase head is introduced. Unfortunately, this undermines a core motivation for cyclic spell-out, namely that it derives the strict version of the PIC (which we assume here); the idea is that a domain that has been spelled out is no longer accessible for the syntax.

We would like to conclude this section with some general remarks on feature sharing. Representations similar to those in (16) and (17) have been proposed in autosegmental phonology (see Goldsmith (1979)) but also for syntactic phenomena such as movement, coordination, and right node raising (see Gärtner (2002), de Vries (2008), and Bachrach and Katzir (2006), respectively, and references therein). For instance, the account of movement in terms of structure sharing makes use of a property that feature sharing apparently also possesses: two nodes in a tree are associated with the same daughter. With respect to movement, this property solves a dilemma that arises in the theory of Chomsky (1995): on the one hand, Chomsky (1995) assumes that case features must be valued; on the other hand, he argues that movement of some element α involves creating a copy of α; the copy is then merged in a higher position while the original remains in the base position, accounting for, among other things, reconstruction effects and distributed spell-out. The dilemma is that, technically, only the copy gets its case feature valued. [CASE:] of the original remains unvalued and should thus lead to a crash of the derivation, contrary to fact.

But if α is shared by both the base position and the landing site, then there is no copy and thus, there is only one instance of [CASE:]. In this way, the base position can remain associated with α while, at the same time, α can get its [CASE:] valued by being associated with the landing site. In a similar vein, the features on T of a RC receive a value by an Agree relation that takes place between the HN and C: the HN, C, and T share one feature matrix. There is, however, at least one difference between the above mentioned theory of movement and the account of agreement (in terms of feature sharing) envisaged here. In the latter, the structure shared is not a lexical item or a phrase, as it is in the garden variety cases of movement. Rather, it is part of a lexical item: a feature structure. At least the categorial features of the agreeing nodes are not shared, although they are arguably part of the formal features of these nodes.


For the sake of concreteness, we therefore assume that the shared feature matrices in (16) and (17) are not part of any lexical item. Rather, what Agree does is to dissociate (part of) the feature matrices of two nodes, to coalesce them, and to place the result on an autonomous level of representation (for instance, in the “workspace” of the derivation, see Frampton and Gutman (1999)). The two nodes that the original feature matrices belonged to then receive a “pointer” that refers to the place in the workspace where the shared information is stored. However, spell-out of these abstract features applies at the positions of the agreeing nodes themselves, and it does so in different ways, depending on the categorial feature present at these positions.

The question arises as to whether feature sharing is also appropriate for analyzing other instances of non-local dependencies. As discussed by Legate (2005) (see also Richards (this volume), Schäfer (this volume)), cyclic Agree can account for the (more orthodox) cases of long distance agreement familiar from the literature: these involve a functional head agreeing with an argument that is embedded within another clause c-commanded by that head (see Butt (1993; 2008), Polinsky and Potsdam (2001), Boeckx (2004), Bhatt (2005), Bobaljik and Wurmbrand (2005), Bošković (2007)). In some of these cases, feature sharing – or some equivalent mechanism – must ensure that the argument receives the case determined by the functional head, without violating the SCC. Since feature sharing is a powerful tool, it would be interesting to see whether it can be dispensed with. Richards (this volume) argues that almost all of the above mentioned cases of long distance agreement can be reanalyzed so that they fall within the realm of well-behaved agreement, respecting the PIC.

Moreover, according to Richards, the reanalysis accounts for the optionality of long-distance agreement and its semantic effects, and solves a problem that involves the activation condition. Basically, the only case that remains problematic involves there-constructions in English, under the assumption that passive and unaccusative vPs are phases, too (again, see also Legate (2005) and Schäfer (this volume)). But technically, these cases do not even require feature sharing as there is no morphological case in English. As far as we can tell, however, no such reanalysis is possible for agreement in relative clauses of the type discussed here.

Moreover, it seems possible to reconstruct Fischer’s (2006) binding theory in terms of feature sharing. In this theory, the decision whether the base position of a binding relation is spelled out as a pronoun, an anaphor, or a self-anaphor is not made before the ultimate landing site of the binder is reached (binding involves movement in this theory). At this point, the featural make-up of the base position must be changed accordingly. To this end, Fischer stipulates an exception to the SCC that makes reference to the movement chain. If, instead, binding involves Agree (or if all movement involves Agree), then the exception can be accounted for in terms of feature sharing.
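The workspace-and-pointer construal of feature sharing described earlier in this section can be pictured as follows. This is a minimal sketch, not part of the analysis itself: the `workspace` dictionary, the `Node` class, and the two toy vocabularies (nominative pronouns for D, present-tense forms of be for T) are assumptions introduced purely for illustration.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple

Matrix = Dict[str, Optional[str]]

# Hypothetical workspace: shared matrices are stored here, outside any node.
workspace: Dict[int, Matrix] = {}


def store(matrix: Matrix) -> int:
    """Place a coalesced matrix in the workspace and return a pointer to it."""
    pointer = len(workspace)
    workspace[pointer] = matrix
    return pointer


@dataclass
class Node:
    cat: str  # the categorial feature stays on the node and is not shared
    ptr: int  # pointer to the shared matrix in the workspace


# Toy vocabularies (assumptions for illustration only).
PRONOUNS: Dict[Tuple[str, str, str], str] = {
    ("1", "sg", "nom"): "I",
    ("1", "pl", "nom"): "we",
}
BE_PRESENT: Dict[Tuple[str, str], str] = {
    ("1", "sg"): "am",
    ("1", "pl"): "are",
}


def spell_out(node: Node) -> str:
    """Realize the shared features according to the node's own category."""
    feats = workspace[node.ptr]
    if node.cat == "D":
        return PRONOUNS[(feats["PERS"], feats["NUM"], feats["CASE"])]
    if node.cat == "T":
        return BE_PRESENT[(feats["PERS"], feats["NUM"])]
    raise ValueError(f"no spell-out rule for category {node.cat}")


shared = store({"CASE": "nom", "PERS": "1", "NUM": "sg"})
hn, t = Node("D", shared), Node("T", shared)
print(spell_out(hn), spell_out(t))
```

Both nodes read the same stored matrix, but each realizes it through its own category-specific rule, which mirrors the claim that spell-out applies at the positions of the agreeing nodes rather than in the workspace.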


4.4. Additional evidence

Note that the HN in (17) (indirectly) receives the value for its case feature from the T-head of the RC (via coalescence with the C-head). As case agreement with T results in nominative in English, the HN should be marked nominative. The prediction is borne out for (1) and (2). Two relevant examples with the nominative marked HN I are repeated in (18).

(18) a. I, who am tall, was forced to squeeze into that VW
     b. It is I who/that am responsible

As for (18-a), it is not surprising that the HN bears nominative since it figures as the subject of the matrix clause. As such, it would receive nominative from the matrix T-head anyway. However, this is not the case for (18-b). In fact, it has been observed that post-copular DPs in English must usually appear in the objective, arguably the default case in English (see Schütze (2002, 235)); this is illustrated by the contrast in (19).

(19) a. It was us
     b. *It was we

As (18-b) shows, a structure like (19-b), which is ungrammatical in isolation, becomes grammatical if it is part of a CC. This suggests that CCs have a possibility to assign nominative that is not available for other constructions that involve post-copular DPs. Under the present analysis, this source is the C/T-head of the RC.

5. Extending the analysis

5.1. Person agreement and case

The HN of a RC whose RLP is a subject is not necessarily marked nominative. This is obvious for RCCs because it is not exclusively nominative marked subjects that can be modified by RCs (whose subject is relativized); it is perhaps less obvious but also true for CCs in English (see, for instance, (20-d)). Interestingly, as observed by Akmajian (1970, 154) and by Ross (1970, 251), person agreement breaks down in contexts where the HN is not marked nominative, see (20-a,c). Rather, the person feature of T within the RC must bear the value third person, see (20-b,d).⁶ (Number agreement is addressed in section 5.2.)

(20) a. *He had the nerve to say that to me, who have made him what he is today.
     b. He had the nerve to say that to me, who has made him what he is today.
     c. *It is me who am responsible.
     d. It is me who is responsible.

⁶ See also de Vries (2002, 228-229) on this effect in Dutch. In French, if a RC modifies a personal pronoun, then the strong form of this pronoun must be chosen: moi ‘I’, toi ‘you’, etc. (cf. the weak forms je ‘I’, tu ‘you’, etc.). These strong forms look like oblique case forms. Yet, as briefly mentioned in section 1, French RCs also exhibit long distance agreement with respect to person and number of the type familiar from English. Thus, French does not seem to be subject to the nominative restriction observable in English. Note, however, that one can argue that the forms moi, toi, etc. are actually the strong versions of nominative forms in present day French: real oblique forms must be accompanied by the preposition à ‘to’.

Me is both the objective and the default case form in English. Considering (20-d) first, suppose that me in (20-d) realizes the default case that is spelled out on a nominal that has not undergone any case agreement in the syntax (see Schütze (2002)). This presupposes that nominals need not undergo case agreement. Let us assume therefore, following Schütze (2002), that nominals may enter the derivation with or without [CASE:]. Only if they bear [CASE:] must they establish Agree with a case valuing head; otherwise, they receive a default marking in the morphology. Such nominals must then be identified by some other mechanism in order to escape the case filter. We leave open here what this mechanism is in the present context.

The configuration in (20-d) is the same as the one in (17), except that the HN receives default case in the morphology, that is, D lacks [CASE:]. (21) shows the situation after the HN has merged with the RC (again, we ignore the RLP).

(21)
\[
\begin{array}{cc}
\mathrm{D} & \mathrm{C} \qquad \mathrm{T} \\[2pt]
\begin{bmatrix} \text{PERS}:y \\ \text{NUM}:z \end{bmatrix} &
\begin{bmatrix} \text{CASE}:x \\ \text{PERS}: \\ \text{NUM}: \end{bmatrix}
\end{array}
\]

There is no probe on the HN. Agree cannot be established and the HN cannot transfer its Φ-feature values onto C. When the next higher phase head v is merged, it becomes evident that [PERSON:] on C (and, due to coalescence, T) cannot be valued via Agree, because C has become inaccessible, due to the PIC. We therefore assume that, at this point and as a last resort, [PERSON:] on the C/T-complex receives the default value third person. (We address the fate of [NUMBER:] on C/T in section 5.2.)

Turning to (20-b), it follows from the assumptions made in section 1 that the HN combines with the RC before its case feature has had a chance to be valued by the preposition. There are three possible scenarios: (a) the HN lacks [CASE:] and receives the morphological default marker; or (b) [CASE:] of the HN remains unvalued until P is merged and is then valued objective;⁷ or (c) [CASE:] establishes Agree with C, thereby receiving nominative from and transferring person and number values to it.

Consider the first two scenarios. In both, there is no Agree between the HN and C: in (a), the HN lacks [CASE:] altogether; in (b), the case-probe is retained. As a consequence, D cannot transfer the value of its person feature onto C (analogous to what was the case in (20-d)). In both scenarios, [PERSON:] on C/T receives, as a last resort, the default value when the next higher phase head is merged (as in (21)).

Next consider scenario (c). Ultimately, the corresponding derivation results in (22).

(22) *He had the nerve to say that to I, who have made him what he is today.

Since (22) is ungrammatical, it has to be shown that its derivation is blocked. (23) is a partial representation of the phrase marker after P has merged with the HN.

(23)

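\[
\begin{array}{cc}
\mathrm{P} & \mathrm{D} \qquad \mathrm{C} \qquad \mathrm{T} \\[2pt]
\begin{bmatrix} \text{CASE}:w \\ \text{PERS}: \\ \text{NUM}: \end{bmatrix} &
\begin{bmatrix} \text{CASE}:x \\ \text{PERS}:y \\ \text{NUM}:z \end{bmatrix}
\end{array}
\]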

Under the assumption that unvalued features are not tolerated (modulo the remark in footnote 7), the ungrammaticality of (22) is derived if it turns out that (at least one of) the Φ-probes of P in (23) cannot be valued. This presupposes that P bears Φ-features to begin with. We assume that this is the case, the Φ-features on P being abstract in English. The lack of overt Φ-agreement on P (as opposed to certain Celtic languages) would then be a matter of spell-out. Now, it is clear that the valued case feature on P cannot act as a probe, thus triggering the necessary transfer of Φ-values from D to P. But the question remains why [PERSON:] (or [NUMBER:]) on P cannot establish Agree with D. At this point, we resort to an extra assumption that blocks valuation of [PERSON:] on P. It is given in (24). (In section 5.3 we present independent motivation for (24).) In a sense, (24) is the mirror image of the idea that case valuation requires a person feature (see Chomsky (2001) on the inability of participles to value case).

⁷ This requires that a probe need not be valued immediately; otherwise, the HN’s [CASE:] would always establish Agree with C. This is a departure from the view that probes must be valued as early as possible (see Pesetsky (1989), Chomsky (1995, 233)); it is, however, still compatible with the idea that spell-out applies phase-wise.


(24) Restriction on person agreement
Valuation of [PERSON:] requires coalescence of [CASE] within the same feature matrix.

The features [CASE:w] and [CASE:x] in (23) cannot coalesce because they are valued differently. It follows from (24) that [PERSON:] on P in (22) cannot be valued. If there is no other way to value [PERSON:], this causes the derivation to crash. This reasoning raises the question as to why [PERSON:] on P cannot receive a default value. Recall that this was assumed to be possible for [PERSON:] on C in (20-d). What distinguishes [PERSON:] on C in (20-d) from [PERSON:] on P in (22) is that for the latter there is, in principle, an accessible goal in the c-command domain of the probe (namely [PERSON] on D), which cannot be made use of because of (24). In contrast, the former is a probe for which no goal whatsoever is available in its c-command domain. For lack of better understanding, we therefore stipulate that default valuation is an option for unvalued Φ-features if and only if no goal is available. The intuition behind this stipulation is that the mechanism that provides a default value for a hitherto unvalued Φ-feature is “unaware” of (24). This is stated in (25).

(25) Default valuation
Unvalued Φ-features can only receive a default value if there is no accessible goal within their c-command domain.

Obviously, the same fate awaits [PERSON:] on P in scenario (a). Thus, the only scenario that converges is scenario (b). This accounts for the lack of person agreement in (20-b).

5.2. Number agreement and Λ

We have not yet explained what happens to [NUMBER:] on C in non-nominative contexts like (20-b) and (20-d). Recall that it cannot be valued by an Agree operation triggered by [CASE:] on the HN because the HN either lacks [CASE:] or must retain it for Agree with a higher case-assigner.
Now, Akmajian (1970) observes that [NUMBER:] on the T-head of the RC does not receive a default value in this context, in contrast to [PERSON:]. (26) illustrates this for CCs and RCCs.⁸ The verb of the relative clause must exhibit number agreement with the plural HN:

(26) a. He had the nerve to say that to them, who have made him what he is today.
     b. *He had the nerve to say that to them, who has made him what he is today.
     c. It is them who have made him what he is today.
     d. *It is them who has made him what he is today.

⁸ For some reason, number agreement is optional with the copula be, see (i). We ignore this here.

(i) a. It is us who are responsible.
    b. ?It is us who is responsible.
    c. It is them who are responsible.
    d. ?It is them who is responsible.

We can draw two conclusions from this. First, there must be another probe present on the HN (which lacks [CASE:]) in (26-c) (cf. (21), representing (20-d)). This probe must establish Agree with C, thereby transferring the number value of the HN onto C (and, due to coalescence, T). Second, number agreement in (26-a,c) is not subject to the restriction in (24), i.e., it is not dependent on case (cf. the derivations of (20-b) and (20-d)). In this way, (24) (partially) accounts for the generalization that person agreement is more fragile than number agreement (see Bhatt (2005), Boeckx (2006), Baker (2008)): if no case feature is available or if the case features on the probe- and goal-category bear different values, person agreement breaks down while number agreement, as we will show now, remains unaffected.

But first things first. The probe on the HN that enables valuation of C’s [NUMBER:] still needs to be identified. To this end, recall that the agreement facts discussed here arise in RCs. A hallmark of RCs is that they denote a property. This is usually represented by a λ-operator that has scope over the RC and binds a variable inside it. We follow Adger and Ramchand (2005) in assuming that this λ-operator is the denotation of an interpretable Λ-feature on C.⁹ Suppose now that Λ has an index as its value. This index is interpreted as the binding index. If Λ on C enters the derivation unvalued (i.e., as [Λ:]), it must acquire a value. This is done by establishing Agree with the RLP, which bears an uninterpretable but valued variant of Λ. If the RLP is in Specv, then its Λ-feature is accessible to [Λ:] on C and valuation applies before movement of the RLP takes place. If the RLP is inside VP, then it must first move to an outer Specv; from there, it can value [Λ:] on C; only then does movement of the RLP to SpecC apply.

⁹ Thus, the λ-operator is not the interpretation of the wh-movement that takes place in relative clauses, as proposed in Heim and Kratzer (1998).

Now suppose that the HN comes equipped with an uninterpretable but unvalued variant of Λ. In order to receive a value, [Λ:] on the HN must probe the RC. We would like to suggest that this [Λ:] is the probe that triggers valuation of C’s number feature. One may think of it as a selectional feature that signals that the HN is supposed to combine with a RC.


The partial representation of (26-c) at the point where the HN has just merged with the RC thus looks as in (27). As before, the HN, represented by the leftmost D in (27), lacks [CASE:]. The second D from the left in (27) represents the RLP.¹⁰

(27)

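\[
\begin{array}{c}
\mathrm{D} \qquad \mathrm{D} \qquad \mathrm{C} \qquad \mathrm{T} \\[4pt]
\begin{bmatrix} \text{PERS}:y \\ \text{NUM}:z \\ \Lambda: \end{bmatrix}
\qquad \Lambda:v \qquad \text{CASE}:x \qquad \text{PERS}: \qquad \text{NUM}:
\end{array}
\;\overset{\text{Agree}}{\Longrightarrow}\; \dots
\]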

Recall that it was argued in section 2 that the English RLP does not bear any Φ-features. The only relevant feature it bears is the valued Λ-feature (and perhaps case, but see section 5.3), which it shares with C through Agree. Thus, the RLP cannot value [PERSON:] on C. Furthermore, we assume that [PERSON:] remains unvalued for the moment: C, which is associated with [PERSON:], is accessible from outside the RC. Thus, there is still a chance that [PERSON:] receives a value via Agree; as a consequence, no default value is assigned yet.

In the next step, [Λ:] on the HN probes for [Λ:v]. As [Λ:v] is associated with C and C is associated with [NUMBER:], the value of [NUMBER] on the HN can value [NUMBER:] on C. The resulting structure is shown in (28).

(28)

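\[
\begin{array}{c}
\mathrm{D} \qquad \mathrm{D} \qquad \mathrm{C} \qquad \mathrm{T} \\[4pt]
\text{PERS}:y \qquad \Lambda:v \qquad \text{CASE}:x \qquad \text{NUM}:z \qquad \text{PERS}:
\end{array}
\]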

[PERSON:] on C/T in (28) cannot be valued by the HN due to lack of [CASE:] on the HN (see (24)). Thus, when the next higher phase head is merged, [PERSON:] on C/T receives the default value. To summarize, there is no person agreement between T and the HN in (28): they do not associate with the same person feature; but there is number agreement: both the HN and T associate with [NUMBER:z].

¹⁰ Note that [CASE] is separated from [PERSON] and [NUMBER] in (27) (as opposed to (21)). The reason is that the D that represents the RLP associates with [CASE] but not with [PERSON] and [NUMBER]. This feature split is without consequence here, but it will become important in section 5.3.
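The outcomes derived in sections 5.1 and 5.2 for the C/T-complex of the RC can be summarized in a short sketch. It is a deliberately coarse illustration: the class and function names are hypothetical, the derivation is collapsed into a single function, and the flag `case_agrees_with_c` simply records whether the HN's unvalued [CASE:] establishes Agree with C (as in (17)) or not (as in (20-b,d) and (26)).

```python
from dataclasses import dataclass
from typing import Optional

DEFAULT_PERSON = "3"  # last-resort value assumed in section 5.1


@dataclass
class HeadNoun:
    person: str
    number: str
    case_agrees_with_c: bool  # True only if the HN's [CASE:] probes C


@dataclass
class CTComplex:
    case: str = "nom"
    person: Optional[str] = None
    number: Optional[str] = None


def derive(hn: HeadNoun) -> CTComplex:
    """Return the feature values that end up on the C/T-complex of the RC."""
    ct = CTComplex()
    # Number: shared either through the case probe or through the HN's
    # Λ-probe, so it does not depend on case.
    ct.number = hn.number
    # (24): person is valued only if [CASE] coalesces in the same matrix.
    if hn.case_agrees_with_c:
        ct.person = hn.person
    # (25): C/T has no accessible goal in its c-command domain, so a default
    # is assigned once the next higher phase head is merged.
    if ct.person is None:
        ct.person = DEFAULT_PERSON
    return ct


examples = {
    "It is I who am responsible": HeadNoun("1", "sg", True),
    "It is me who is responsible": HeadNoun("1", "sg", False),
    "It is them who have made him what he is today": HeadNoun("3", "pl", False),
}
for sentence, hn in examples.items():
    ct = derive(hn)
    print(f"{sentence}: PERS={ct.person}, NUM={ct.number}")
```

The sketch does not cover the crash of scenario (c) in (22), where the probe is on P and default valuation is unavailable because an accessible goal exists.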


5.3. Person agreement and number

As mentioned in section 2, German also exhibits person and number agreement in RCs. Examples that involve CCs and RCCs are given in (29-a,b) and (29-c,d), respectively. As first and third person plural are syncretic in the German verb inflection, person agreement in the plural can only be observed with second person.¹¹

(29) a. weil ihr es seid, die die ganze Arbeit macht.
        since you.2.PL it are, REL the whole work do.2.PL
        ‘since it is you who do all the work.’
     b. ??weil ihr es seid, die die ganze Arbeit machen.
        since you.2.PL it are, REL the whole work do.3.PL
     c. Ihr, die immer Ärger macht, habt mir gerade noch gefehlt.
        you.2.PL REL always trouble make.2.PL have me PART yet lacked
        ‘You, who always cause trouble, are the last thing I need.’
     d. *Ihr, die immer Ärger machen, habt mir gerade noch gefehlt.
        you.2.PL REL always trouble make.3.PL have me PART yet lacked

¹¹ In fact, German speakers often prefer RCs that contain an overt RSP in these contexts, see (i-a), which is (often) barred from third person contexts and from CCs, see (i-b) and (i-c), respectively.

(i) a. ich, der ich hier die ganze Arbeit mache
       I REL I here the whole work do.1.SG
       ‘I, who do all the work here’
    b. *er, der er hier die ganze Arbeit macht
       he REL he here the whole work do.3.SG
       ‘he, who does all the work here’
    c. *weil ich es bin, der ich hier die ganze Arbeit mache.
       since I it am REL I here the whole work do.1.SG
       ‘Since it is me, who does all the work here.’

This RSP is fully specified for Φ-features and thus gives a trivial answer to the question as to where the T of the RC receives its Φ-values from. In what follows, we ignore this variant and its restrictions (but see Ito and Mester (2000)).

As is the case for person agreement in English, person agreement in German is subject to restriction (24). This is illustrated by (30). As the post-copular DP in German CCs is always nominative and since nominative is also the default in German, restriction (24) can only be illustrated with RCCs.


(30) a. *Ich will euch, die Ärger macht, nicht mehr sehen.
        I want you.2.PL.ACC REL trouble make.2.PL not more see
        ‘I don’t want to see you, who cause trouble, anymore.’
     b. ?Ich will euch, die Ärger machen, nicht mehr sehen.
        I want you.2.PL.ACC REL trouble make.3.PL not more see

However, in contrast to English, person agreement in German is confined to plural contexts. To our knowledge, this has gone unnoticed in the literature. (31) gives relevant examples.¹²

(31) a. *weil du es bist, der die ganze Arbeit machst.
        since you.2.SG it are REL the whole work do.2.SG
        ‘since it is you who do all the work’
     b. weil du es bist, der die ganze Arbeit macht.
        since you.2.SG it are REL the whole work do.3.SG
     c. *weil ich es bin, der die ganze Arbeit mache.
        since I it am REL the whole work do.1.SG
        ‘since it is I who do all the work.’
     d. weil ich es bin, der die ganze Arbeit macht.
        since I it am REL the whole work do.3.SG

¹² The dependency of person agreement on plural also emerges in Spanish CCs with the RLP quien. Moreover, person agreement in Spanish is more pervasive than in German due to the lack of a syncretism in this domain. We refrain from presenting relevant Spanish examples for reasons of space.

The account of these facts that we would like to propose is ultimately based on the observation that German d-RLPs inflect for number (and, irrelevant in the present context, gender), in contrast to English RLPs. Thus, we have (in the nominative) der, die, das for masculine, feminine, and neuter in the singular, and die for all three genders in the plural. It is our hunch that this difference between English and German is responsible for the difference in person agreement.¹³ As will become clear shortly, the account requires some re-thinking of the role of case on RLPs.

¹³ The Spanish RLP quien ‘who.SG’ also has a plural form: quienes ‘who.PL’. And, as mentioned in footnote 12, Spanish also shows the restriction on person agreement to plural contexts. By contrast, French, which lacks the plural restriction on person agreement, also lacks an (overt) plural specification on the RLP.

To begin with English, suppose that the who that appears as the subject in English RCs is not a RLP but rather the C-head of the RC (see Pesetsky and Torrego (2008)). The difference between who and that would be that who agrees with the HN in animacy.

Let us stop here for a moment and ask whether this assumption undermines the argument (given in section 2) for the claim that RLPs in English lack number, which was based on the assumption that who is a pronoun in both relative and interrogative contexts. It does not. Here is why. If who in RCs is actually a C-head, then there is presumably an empty RLP in this context. Admittedly, this empty RLP could be argued to be specified for number (in contrast to interrogative who). Note, however, that even then one has to account for the generalization that RLPs (no matter whether empty or not) are more fully specified than interrogative pronouns.¹⁴

To conclude, we assume that the subject RLP in English is empty. Moreover, and more importantly, we also assume that this empty RLP always lacks [CASE:]. In contrast, whom, which is employed in relativizing objects, is a RLP and also appears in case marked form. Under these assumptions, the appropriate representation of an English RC that has just combined with its HN is not the one given in (27) (where it was still assumed that the RLP bears [CASE:]); rather, it looks like (32).

(32)

[Diagram: the HN (D), the RLP (D), C, and T before Agree. [CASE] shares a feature matrix with [PERS:] and [NUM:]; the features shown are CASE, PERS, NUM, and Λ.]

Crucially, since the RLP in (32) lacks [CASE:], this feature can show up in the same feature matrix as [PERSON] and [NUMBER]; this would be impossible if the RLP had [CASE] because at the same time it lacks [PERSON] and [NUMBER]. Applying Agree to (32) results in (33), which reflects the usual person and number agreement between HN and C (and, due to coalescence, T).

(33)

[Diagram: the structure in (32) after Agree. The values shown are [Λ: v], [CASE: x], [PERS: y], and [NUM: z].]

Note as an aside that (34) represents the same state of affairs as (32). However, [PERSON] and [CASE] are separated in (34). It is therefore crucial that (34) is not generated; otherwise, the restriction on person agreement in (24) would lead us to expect that English RCs at least optionally lack the type of person agreement discussed here, contrary to fact.

14 A similar situation arises in French, where interrogative qui ‘who’ does not trigger number agreement, while relative qui (which Kayne (1976) actually argues to be a C-head) does.

Fabian Heck & Juan Cuartero

(34)

[Diagram: an alternative to (32) in which [PERS:] and [CASE] sit in separate feature matrices.]

We therefore assume that there is some principle to the effect that Agree economizes on the number of feature splits:

(35) Minimization of feature splits
     The integrity of feature matrices should be preserved by Agree as fully as possible.

Sure enough, (35) prevents Agree from generating (34) (because (32) preserves more structure than (34) does). We leave open here whether (35) can be derived from more general properties of Agree. But note that it might follow from two assumptions. First, probe-categories must get their probes valued as early and as completely as possible at each application of Agree; this recalls Pesetsky’s (1989) Earliness Principle and Chomsky’s (2001) Maximization Principle (see also van Koppen (2005) and Lahne (2008) for a related maximization requirement that determines the choice between different goal-categories). Second, this can be optimally achieved if all the probe-features of a probe-category are valued en bloc (see also Chomsky (2008, 161, footnote 48)).

Returning to the main plot, namely the difference between English and German, note that the default case in German is nominative. As already mentioned, we follow Schütze (2002) in assuming that default case spells out the case ending of a nominal that lacks a case feature in the syntax. Now, RLPs in German are marked for case, just as other pronouns are. What we would like to claim now is that, in certain contexts, this is merely a morphological reflex and that in the syntax, RLPs in German often (but not always) lack [CASE:]. In particular, we claim that this applies to plural d-RLPs. Thus suppose that d-RLPs in German get their case feature assigned by the lexical redundancy rule in (36).

(36) Lexical redundancy rule for German
     A d-RLP α bears [CASE:] if and only if α bears [NUMBER: SG].

It follows from (36) that all d-RLPs that lack the specification [NUMBER: SG] also lack [CASE:].
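The interaction of rule (36) with default case can be made concrete in a small sketch. The Python below is an illustration only: the function names and the string encoding of feature values ("SG", "PL", "NOM", "ACC") are assumptions of the sketch, not part of the analysis. It encodes (36) as a biconditional on the number value and lets a d-RLP that lacks [CASE:] fall back on the default case at spell-out.

```python
from typing import Optional

# The default case in German is nominative (as stated in the text).
DEFAULT_CASE = "NOM"


def bears_case_feature(number: str) -> bool:
    """Rule (36): a d-RLP bears [CASE:] iff it bears [NUMBER: SG]."""
    return number == "SG"


def case_at_spell_out(number: str, syntactic_case: Optional[str]) -> str:
    """Return the case whose ending is spelled out on a d-RLP.

    If the d-RLP bears [CASE:] and the feature was valued in the syntax,
    that value is used. Otherwise default case supplies the ending.
    """
    if bears_case_feature(number) and syntactic_case is not None:
        return syntactic_case
    return DEFAULT_CASE


# Hypothetical usage: an object d-RLP in the singular vs. the plural.
print(case_at_spell_out("SG", "ACC"))  # ACC
print(case_at_spell_out("PL", "ACC"))  # NOM
```

On this encoding, a plural d-RLP in an accusative environment is spelled out with the default ending because no case feature is present in the syntax to begin with.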
Thus, plural d-RLPs lack [CASE:] and receive their form die as a default at spell-out. Note that (36) does not distinguish between accusative and nominative. Consequently, die is the only form in the plural for both.15

15 Morphological theories of the pronominal declension in German often assume massively underspecified marker entries for the nominative and accusative plural, which derives that the markers are syncretic for these cases in all three genders (see Bierwisch (1967), Blevins (1995), Wunderlich (1997), and Wiese (1999)). The same theories assume that markers for dative and genitive are more fully specified and thus do not take part in this syncretism. The theory proposed here transfers part of this underspecification from the morphology to the syntax: a DP that enters the derivation may be underspecified for case, but only in the plural. Since (36) does not mention gender, it applies to all genders alike. Without specifying the details, we assume that genitive and dative have different markers – even in the plural – because they are lexical cases in German.

Against this background, consider the case of a singular HN (where singular = z) that combines with a RC whose RLP is in the singular, too (see section 5.4 on number agreement between the HN and the RLP). The relevant configuration before application of Agree is given in (37).

(37)

[Diagram: the HN (D), the RLP (D), C, and T before Agree in a singular context. [CASE] shares a matrix with [NUM: z], while [PERS:] sits in a separate matrix.]

Importantly, [PERSON:] in (37) is separated from [CASE], as opposed to what was the case in English, cf. (32). The reason for this is that the RLP (represented by the second D-node from the left in (37)) is associated with [CASE] and [NUMBER]: it is singular, by assumption, and thus, by (36), also bears a case feature. But the RLP cannot be associated with [PERSON] because RLPs generally lack [PERSON]. It is this split of [PERSON] and [CASE] that gives us a handle on accounting for the plural effect in person agreement. Namely, the lack of person agreement now follows without further ado from the restriction in (24): [PERSON:] in (37) is separated from [CASE]. As a consequence, valuation of [PERSON:] would not result in coalescence of [CASE] in the same matrix; and this would contradict (24). [PERSON:] on T therefore receives the default value, third person. Note that if the RLP in (37) were not associated with [CASE], then [CASE] would group with [PERSON], to the exclusion of [NUMBER]; this is why we need the lexical redundancy rule in (36), which introduces [CASE] in precisely this context.

Next consider a context where the HN and the RLP are in the plural (where plural = u). Recall that, due to (36), this means that the RLP lacks [CASE:]. As usual, we enter the derivation after the HN and the RC have merged, yet before Agree has applied:

(38)

[Diagram: the HN (D), the RLP (D), C, and T before Agree in a plural context. The RLP bears only [NUM: u]; [CASE] shares a matrix with [PERS:].]

As in English (cf. (32)), [PERSON:] and [CASE] share the same matrix in plural contexts in German. The reason is that in this context the RLP lacks [CASE:] (due to (36)). Thus, comparing (38) with (37) we can see that [CASE:] has “changed sides” from [NUMBER] to [PERSON]. It follows that (24) does not block person agreement from applying in (38).

To sum up, person agreement in English in general and person agreement in singular contexts in German in particular differ because German RLPs bear [CASE:] in singular contexts but not in plural contexts; in contrast, empty RLPs in English lack [CASE:] altogether. Thus, Agree within the RC potentially creates different feature structures for English and German, depending on the number value of the German RLP. If [PERSON:] and [CASE] in plural contexts in German share the same matrix (as in English in general), then (24) is respected and Agree can value [PERSON:] on C/T on a later Agree-cycle; if they do not (as in singular contexts in German), then person agreement is blocked. Due to the lack of [CASE:] on the d-RLP in plural contexts in German, the feature structures in this context are sufficiently similar to those in English to allow for person agreement.

5.4. Anaphoric agreement

We have not yet addressed the question of how number (and gender) agreement between the RLP and the HN in German comes about. In the representations (37) and (38), the values for number of the HN and the RLP are uniformly z or u. So far, this does not follow from anything. In principle, since these values are chosen independently from one another, they can differ. However, as (39-a,c) show, there is obligatory number agreement between the HN and the RLP in German. (39-b) shows that the problem with (39-a) is not only the clash in number agreement between the RLP and T of the RC.

(39) a. *weil wir es sind, der die ganze Arbeit machen.
        since we.1.PL it are REL.SG the whole work do.1.PL
        ‘since it is we who do all the work.’
     b. *weil wir es sind, der die ganze Arbeit macht.
        since we.1.PL it are REL.SG the whole work do.3.SG
     c. weil wir es sind, die die ganze Arbeit machen.
        since we.1.PL it are REL.PL the whole work do.1.PL

The brute force solution to this problem is to assume that the RLP bears [NUMBER:], which receives its value by Agree with the HN, just as the C/T-complex of the RC. However, adopting this solution would make it impossible to resort to the lexical redundancy rule in (36) in order to determine whether the RLP bears [CASE:] or not. Recall that (36) makes reference to the number value of the RLP. At the point where the hypothesized [NUMBER:] of the RLP is valued by the HN, the RLP has already been introduced into the derivation. That is, insertion of [CASE:] would have to apply after the derivation has started, in violation of the IC in (11). But if it is impossible to make use of the redundancy rule in (36), then either the hypothesized correlation between [NUMBER] and [CASE:] on RLPs must remain completely accidental or the lack of person agreement in singular contexts in German remains unaccounted for. In other words, our account of the lack of person agreement in singular contexts in German (see (31-a) and (31-c)) now forces us to come up with an alternative account of (39-a,b).

Fortunately, there is reason to believe that number agreement between the HN and the RLP has a source that is different from the one responsible for agreement between the HN and the C/T-complex of the RC. If so, then this means that the present theory need not account for the former type of agreement (and, in fact, should not be expected to do so). The precise mechanics of this independent agreement can thus be ignored for the purpose of this article. The following reasoning for this view is taken from Sternefeld (2006, 382-384).16

16 In contrast to the domain of verbal agreement in German, there is also overt gender agreement between the HN and the RLP in German. Note that this is not indicative of the hypothesized difference between the agreement relations under discussion: although there is no (overt) gender agreement on C or T in German, such agreement does take place in certain Bantu RCs (see, for instance, Zeller (2004)).

To begin with, it is clear that there must be an independent mechanism that ensures number and gender agreement between an anaphoric pronoun and its antecedent. The question is whether RLPs are anaphoric and, hence, subject to this mechanism (which could then be held responsible for agreement between the HN and the RLP). If RLPs are semantically empty, then they cannot be anaphors because anaphors receive as their denotation a reference from their antecedent. In fact, it is often assumed for restrictive RCs (see, for instance, Heim and Kratzer (1998)) that RLPs have no denotation, except for, perhaps, the identity function. In the present context, however, we are concerned with appositive RCs and CCs. For the former, it is quite plausible to assume that their RLPs are indeed anaphoric. As such, they are subject to the agreement rule that operates on anaphors. For CCs this is perhaps less obvious. We assume, without further argument, that they share this property with appositive RCs; suffice it to say that the fact that CCs involve RCs that combine with first or second person pronouns suggests that they pattern together with appositive RCs rather than with restrictive ones. We conclude that even without specifying how exactly gender and number agreement between the HN and the RLP proceeds, it is justifiable to assume that this agreement can be ignored for the purpose of the present discussion.

6. Copula agreement

CCs in English and German also differ with respect to agreement with the copula of the cleft: in English, the copula agrees with the expletive it (i.e., it is valued [PERSON: 3] and [NUMBER: SG]); in German, the copula agrees with the HN. This is illustrated in (40) and (41), respectively. (Again, French patterns with English in this respect while Spanish CCs behave like German CCs.)

(40) a. It is you who are responsible.
     b. *It are you who are responsible.

(41) a. weil du es bist, der mich versteht.
        because you.2.SG it be.2.SG REL me understands
        ‘because it is you who understands me.’
     b. *weil du es ist, der mich versteht.
        because you.2.SG it be.3.SG REL me understands

At first sight, this suggests a correlation between copula agreement and agreement with the T-head of the RC: if the HN agrees with the RC T-head, then it does not agree with the matrix T (the copula). The correlation makes sense if one adopts the activation condition (see (13-d)): once the HN’s case feature has been valued by an Agree relation with the embedded T, it cannot establish Agree with the matrix T. Although this view is certainly attractive in that it seeks to correlate the difference in copula agreement with the difference in person agreement with T of the RC, it is not compatible with the fact that there is agreement with the RC T-head in German (and Spanish), too, namely number agreement (and, in plural contexts, even person agreement, see section 5.3). Notably, in these cases there is also agreement with the copula, see (41-a) and (42-a) for number and person agreement, respectively, in German.

(42) a. weil ihr es seid, die die ganze Arbeit macht.
        since you.2.PL it be.2.PL REL the whole work do.2.PL
        ‘since it is you who do all the work.’
     b. *weil ihr es ist/bist/sind, die die ganze Arbeit macht.
        since you.2.PL it be.3.SG/2.SG/3.PL REL the whole work do.2.PL

In addition, in English RCCs, the HN also agrees with the matrix T-head if it is the subject of the matrix clause, see (43).

(43) a. I, who am tall, am forced to squeeze into that VW.
     b. *I, who am tall, is forced to squeeze into that VW.

This is unexpected if (as we argued above) the HN has already spent its [CASE:] on the embedded C/T-complex and if the ability of the HN to enter into Agree depends on its having an unvalued case feature. For this reason, and also because we already rejected the activation condition in section 4.3 above, we must now offer an alternative account of the copula facts. To this end, suppose that English CCs have the partial structure and derivation in (44). (For other analyses of English clefts see, among others, Akmajian (1970), Chomsky (1971), Schachter (1973), Pinkham and Hankamer (1975), Meinunger (1998).)

(44) a. [vP it BE HN [CP REL ... ]] →   (Merge T, Agree (it, T))
     b. [TP T [vP it is HN [CP REL ... ]]] →   (Move it, Move copula)
     c. [TP it2 is3 [vP t2 t3 HN [CP REL ... ]]] →   ...

The expletive it is merged in Specv (see Richards (2007)) while the HN is (within) the complement of v. As a consequence, the goal-category it is closer to the probes on T than the HN and thus triggers agreement with T (at the same time blocking agreement between the HN and T; see (13-e)). Later, the expletive raises to SpecT. In contrast, we assume that in German CCs the expletive es is not merged as the external argument; rather the HN is. The RC is merged as the complement of an empty D (akin to a free RC) whose projection is (within) the complement of v. The RC then undergoes extraposition and the empty D is spelled-out as es.17 The structure and derivation of German CCs thus look like (45) (subject raising being optional in German; see Grewendorf (1989), Diesing (1992), Müller (2000)).

(45) a. [vP HN [DP D [CP REL ... ]] BE ] →   (Merge T, Agree (HN, T), Move copula)
     b. [TP [vP HN [DP D [CP REL ... ]] t3 ] BE3 ] →   (Move CP)
     c. [TP [vP HN [DP D t4 ] t3 ] BE3 ] [CP REL ... ]4 →   (spell D out as es)
     d. [TP [vP HN [DP es t4 ] t3 ] BE3 ] [CP REL ... ]4 →   ...

Note that the HN in (45) is closer to the matrix T than the empty D-head. Therefore, T agrees with the HN, not with es.
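The contrast between (44) and (45) reduces to which goal is closest to the matrix T. The following Python sketch illustrates just that step. The class and function names are hypothetical, the goals are listed by hand in order of closeness rather than computed from a tree, and the Φ-values given to the empty D (third person singular) and the number value of English you are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Goal:
    label: str
    person: int
    number: str


def value_matrix_t(goals_by_closeness: List[Goal]) -> Tuple[int, str]:
    """Value person and number on the matrix T from the closest goal.

    The list is ordered from the goal closest to T to the most distant one;
    the closest goal blocks agreement with every goal below it.
    """
    closest = goals_by_closeness[0]
    return closest.person, closest.number


# English cleft, cf. (44): the expletive in Specv is closer to T than the HN.
english = [Goal("it", 3, "SG"), Goal("you (HN)", 2, "SG")]
# German cleft, cf. (45): the HN is the external argument, the empty D is lower.
german = [Goal("du (HN)", 2, "SG"), Goal("D (es)", 3, "SG")]

print(value_matrix_t(english))  # (3, 'SG')
print(value_matrix_t(german))   # (2, 'SG')
```

The two outputs correspond to the copula forms in (40-a) (is) and (41-a) (bist), respectively.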

17 See also Jespersen (1937) for the idea (applied to English) that the RC of a CC actually modifies the expletive; cf. also Gundel (1977) for a related idea. We have to leave open why, generally, extraposition of a free RC in object position does not trigger es-insertion in German.

We offer no explanation for why the structures of English and German CCs differ.18 But there is some independent evidence for the structure we hypothesize for German CCs. Weak pronouns in German show up in the Wackernagel domain in a strict order. Müller (2001) argues that this order reflects the order in which they are merged. Now, it turns out that the HN and the expletive pronoun es in a CC are subject to a similar order restriction, see (46).

(46) a. weil ich es bin, der hier die ganze Arbeit macht.
        since I it am REL here the whole work do.3.SG
        ‘since it is me who does all the work here.’
     b. *weil es ich bin, der hier die ganze Arbeit macht.
        since it I am REL here the whole work do.3.SG

The contrast in (46-a,b) suggests that the HN ich ‘I’ is merged higher than (and thus to the left of) the D spelled out as es ‘it’, thus supporting our assumptions.19

18 French patterns with English in that it shows copula agreement with the expletive (ce in French) while Spanish patterns with German (but there is no overt expletive in Spanish CCs). Note that there is indeed evidence that Spanish CCs with quien must be analyzed as involving a free RC (basically because non-free RCs cannot involve quien, except for contexts that involve pied-piping).

19 Note that a full DP subject (like Fritz in (i)) may appear on either side of es in a CC:

(i) a. weil es Fritz ist, der die ganze Arbeit macht.
       since it Fritz is REL the whole work does
       ‘since it is Fritz who does all the work.’
    b. weil Fritz es ist, der die ganze Arbeit macht.
       since Fritz it is REL the whole work does

The reason for this is that subject raising is optional in German. In (i-a), Fritz is in Specv, which is below (and thus to the right of) the Wackernagel domain. In (i-b), Fritz is raised to SpecT, to the left of the Wackernagel domain. Weak subject pronouns cannot remain in situ but must undergo Wackernagel movement.

7. An alternative: head raising

There is an alternative analysis of the agreement phenomena presented here: the head raising analysis of RCs (see Brame (1968), Schachter (1973), Vergnaud (1974), Kayne (1994), and many more on RCCs; see Schachter (1973), Pinkham and Hankamer (1975), and Meinunger (1998) on CCs). Under this analysis, CCs and RCCs involve raising of the HN out of the RC to a position immediately preceding it. Since a subject HN is merged within the RC, it stands in a local relation to the T-head of the RC and can thus value the Φ-features of T, thereby respecting the PIC. To our knowledge, the agreement facts discussed here have not been put forward as an argument in favor of the head raising analysis. This is surprising because, as pointed out, head raising provides a straightforward account for the facts.

Yet, for now we favor the present analysis over the head raising account for the following reasons. First, note that other connectivity effects that have usually been taken to motivate head raising (idioms, principle A effects, and variable binding; see (47-a-c), respectively) also emerge in the context of long relativization. This is shown in (48) for English (see also Salzmann (2006, 338-339) on RCs in Zurich German).

(47) a. The headway that John made was remarkable.
     b. The pictures of himself2 that John2 put on sale are unflattering.
     c. The relative of his2 that no-one2 should forget to invite is his mother.

(48) a. The headway that Mary said that John made was remarkable.
     b. The pictures of himself2 that Mary believes that John2 put on sale are unflattering.
     c. The relative of his2 that Mary said that no-one2 should forget to invite is his mother.

However, as observed by Morgan (1972, 284), person agreement in English RCCs breaks down in contexts of long relativization, see (49).20

(49) a. *I, who John says the FBI thinks am an anarchist, will always be incoherent.
     b. *I, who John says Martha believes the FBI thinks am an anarchist, may be losing my grip on banality.

This is unexpected under a head raising approach: if the head is able to raise out of one CP, it should be able to raise out of the next higher CP, too. And in fact, this is what the connectivity effects in (48) suggest under the head raising analysis. A PIC-based approach can account for the facts in (49) by assuming that the chain of cyclic agreement is broken at some point, which prevents long agreement from being established. Thus suppose that although C can enter into cyclic Agree with the T-head it embeds, it cannot enter into cyclic Agree with the next higher v. If so, then the Φ-values of the HN cannot be transferred onto the most deeply embedded C/T-complex of the RC in (49) because the higher C-head (which has received the relevant Φ-values of the HN) cannot agree with this C/T-complex across the phase boundary that is induced by the intervening vP.

20 In fact, Morgan notes that if relativization crosses only one phrase boundary, then both the variant with and the variant without agreement are bad, see (i). We have no explanation for this to offer here.

(i) I, who the FBI thinks {*am,*is} an anarchist, will doubtlessly be arrested.

Moreover, note that Morgan’s observation weakens the argument against the resumption based analysis put forward in section 4.2. Recall that the argument was based on the claim that CCs that involve person agreement are island sensitive and thus, by hypothesis, cannot involve non-bare RSPs. But crucially, all the island contexts discussed in (15) involve clefting across a sentence boundary. If clefting behaves similarly to relativization, then Morgan’s observation suggests that there may be an independent reason for the ungrammaticality of the examples in (15), namely the failure of long person agreement.

Interestingly, agreeing infinitives in Portuguese show a similar pattern. To begin with, infinitives in Portuguese that are embedded under verbs of perception, such as ‘to see’, obligatorily agree with their thematic subject, see (50) (from Perlmutter (1972, 88)).

(50) a. *Vi os cavalos correr.
        saw.1.SG the horses run
        ‘I saw the horses run.’
     b. Vi os cavalos correrem.
        saw.1.SG the horses run.3.PL

As Perlmutter (1972) observes, this agreement breaks down if the agreement controller is supposed to probe into an infinitival RC, see (51).

(51) a. os cavalos que vi correr
        the horses REL saw.1.SG run
        ‘the horses that I saw run.’
     b. *os cavalos que vi correrem
        the horses REL saw.1.SG run.3.PL

This follows in a PIC-based theory if the intervening vP boundary in (51) breaks the chain of cyclic agreement between the HN and the T-head of the ECM-infinitive. No such vP-boundary intervenes in (50). But then again, it is unclear why head-raising (and thus long distance agreement) should be barred from applying in (51-b). In this context, consider the examples in (52).21

(52) a. It is me who John says is sick.
     b. It is I who John says is sick.

According to our analysis, I in (52-b) receives the value for its case feature from the embedded T-head (i.e., the T-head that is the clause mate of John).22 The agreement on the embedded copula in (52) is third person (i.e., there is no person agreement), as expected in a context of long relativization. However, unlike person agreement, number agreement does not break down, see (53).

21 Akmajian (1970) reports that (52-b) is not an option in his dialect II. In the dialect that is under investigation here (presumably Akmajian’s dialect III), both variants are grammatical.

22 Apparently, it does not matter that John has already valued its case feature against this T.

(53) It is us who John says are sick.

This is a surprise under the assumption that the lack of person agreement in the context of long relativization is a PIC-effect. The question is why number and person agreement should behave differently. Unlike what was assumed above (see section 5.1 vs. section 5.2), the asymmetry between number and person agreement cannot be attributed to the dependency of person on case. We can think of two possibilities here. First, one may assume that RLPs in English bear a number feature after all (but no person), as opposed to what has been claimed in section 2. This move would leave unaccounted for the lack of number agreement in interrogatives with who. More importantly, though, it is incompatible with the derivation of the number effects presented in section 5.3. We therefore reject this possibility here. Second, it is possible to assume that although v (in English) does not bear [PERSON:] it still bears [NUMBER:]. That is, cyclic person agreement breaks down because there is a link missing in the person agreement chain at the vP-boundary. However, the agreement chain is complete with respect to number.

The second reason why we voted against analyzing agreement in RCs in terms of head raising is that we find it rather hard to imagine how head raising can account for the nominative restriction (see section 5.1) and the number effect (see section 5.3). As for the nominative restriction, it would be straightforward to assume that a nominative marked subject can only raise to become the head of the RC if it can preserve its case, i.e., if it targets a position that also receives nominative. However, this is exactly what proponents of the head raising analysis generally deny because they assume that head raising also applies in contexts where the case assigned to the HN from outside the RC is not identical with the case determined within the RC. Concerning the number effect, a naive approach would suggest that head raising only applies in plural contexts. But why this should be the case remains completely unclear.

To summarize, although we have not shown that there are principled reasons why the head raising analysis should not be able to account for the agreement facts discussed here, it still seems to us that the complications that they involve can be approached more naturally by a theory that is based on Agree than by a movement-based theory.
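The second possibility considered above for (53), namely that v bears [NUMBER:] but not [PERSON:], can be pictured as a chain in which a value is passed on only as long as every link bears a probe for the feature in question. The Python sketch below is an illustration under simplifying assumptions: the function name, the encoding of heads as (label, probe set) pairs, and the exact inventory of heads in the long chain are all hypothetical and chosen only to show the asymmetry.

```python
from typing import Dict, List, Optional, Set, Tuple

Head = Tuple[str, Set[str]]  # (label, probe features borne by the head)

# Third person is the default person value; no number default is assumed here.
DEFAULTS: Dict[str, str] = {"PERS": "3"}


def transmit(feature: str, hn_value: str, chain: List[Head]) -> Optional[str]:
    """Pass the HN's value for `feature` down a chain of heads.

    The value reaches the last head only if every head in the chain bears
    a probe for that feature. Otherwise the default value is returned
    (None if no default is defined for the feature).
    """
    if all(feature in probes for _, probes in chain):
        return hn_value
    return DEFAULTS.get(feature)


# Local relativization: the HN values C, which has coalesced with T.
local_chain = [("C", {"PERS", "NUM"}), ("T", {"PERS", "NUM"})]
# Long relativization: an intervening v bears [NUM:] but not [PERS:] (assumption).
long_chain = [("C", {"PERS", "NUM"}), ("v", {"NUM"}),
              ("C", {"PERS", "NUM"}), ("T", {"PERS", "NUM"})]

print(transmit("PERS", "1", local_chain))  # 1
print(transmit("PERS", "1", long_chain))   # 3
print(transmit("NUM", "PL", long_chain))   # PL
```

With a first person plural HN, the long chain yields third person but plural number on the most deeply embedded T, which is the pattern in (53).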

8. Conclusion

At first sight, the type of agreement in RCs shown in (1) and (2) can be analyzed as local agreement of the garden variety type: the subject RLP, which is merged inside the RC, bears a local relation to the RC’s T-head. As T is the locus of the (unvalued) Φ-features, the RLP can provide a value for these features: agreement. In this article, we have argued against this view, claiming that RLPs in languages such as English and German (but also French and Spanish) lack person, and that RLPs in English (and French), as opposed to German (and Spanish), even lack number. We concluded that it must be the HN that provides the Φ-values in question. We then showed that this conclusion is incompatible with the strict version of the PIC. Finally, we made a proposal as to how one can maintain the strict version of the PIC and still account for the facts. To this end, we assumed that agreement applies cyclically and involves feature sharing. The basic idea is that in a first step T and C establish Agree within the RC. This leads to coalescence of their Φ-features. In a second step, the HN values the features of C, which, being at the edge of the CP-phase, is accessible to the HN. Due to the coalescence on the previous Agree-cycle, this also values the Φ-features on T and therefore avoids a violation of the SCC. The approach requires that the morphology applies post-syntactically.

We further proposed that the nominative restriction on person agreement owes to a constraint that requires valuation of [PERSON:] to go hand in hand with coalescence of [CASE]. Because of the derivational nature of the approach, a feature structure that is the result of an Agree operation on an earlier cycle of the derivation serves as the input for later Agree-cycles. Since RLPs in German and English differ with respect to their featural make-up, the feature structures that result if they enter into Agree differ, too. Ultimately, this has an impact on person agreement: in German, person agreement is only possible in plural contexts while in English it is always possible. We argued that this can be derived by the same restriction on person agreement assumed to be responsible for the nominative restriction, provided that case on RLPs in German is a purely morphological phenomenon in plural contexts, as opposed to singular contexts.
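The predictions summarized here for local subject relativization can be stated compactly. The following Python sketch is only an illustration: it abstracts away from the feature matrices themselves and records just whether the RLP bears [CASE:], on the reasoning that [CASE] shares a matrix with [PERSON:] exactly when the RLP lacks [CASE:]. The function names and the encoding of languages and values are assumptions of the sketch.

```python
DEFAULT_PERSON = 3  # default value when [PERSON:] cannot be valued by the HN


def rlp_bears_case(language: str, number: str) -> bool:
    """Does the subject RLP bear [CASE:] in the syntax?"""
    if language == "English":
        return False            # the empty subject RLP always lacks [CASE:]
    if language == "German":
        return number == "SG"   # lexical redundancy rule (36)
    raise ValueError(f"no assumptions encoded for {language!r}")


def person_on_t(language: str, hn_person: int, number: str) -> int:
    """Person value realized on T of the RC (local relativization only)."""
    # If the RLP bears [CASE:], [CASE] is split from [PERSON:] on C/T,
    # and the restriction in (24) blocks valuation of [PERSON:] by the HN.
    case_shares_matrix_with_person = not rlp_bears_case(language, number)
    return hn_person if case_shares_matrix_with_person else DEFAULT_PERSON


print(person_on_t("German", 2, "SG"))   # 3
print(person_on_t("German", 2, "PL"))   # 2
print(person_on_t("English", 1, "SG"))  # 1
```

The three calls correspond to (31-b), (42-a), and (43-a), respectively.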

Bibliography Adesola, Oluseye (2006): ‘On the Absence of Superiority and Weak Crossover Effects in Yoruba’, Linguistic Inquiry 37, 309–318. Adger, David (2011): Bare Resumptives. In: A. Rouveret, ed., Resumptives at the Interfaces. John Benjamins, Amsterdam, pp. 343–366. Adger, David and Gillian Ramchand (2005): ‘Merge and Move: Wh-Dependencies Revisited’, Linguistic Inquiry 36, 161–193. Akmajian, Adrian (1970): ‘On Deriving Cleft Sentences from Pseudo-Cleft Sentences’, Linguistic Inquiry 1, 149–168. Bachrach, Asaf and Roni Katzir (2006): Right node raising and delayed Spell-Out. Ms., MIT. Baker, Mark (2008): The Syntax of Agreement and Concord. Cambridge University Press, Cambridge.

Long Distance Agreement in Relative Clauses

81

Bhatt, Rajesh (2005): ‘Long Distance Agreement in Hindi-Urdu’, Natural Language and Linguistic Theory 23, 809–865. Bierwisch, Manfred (1967): Syntactic Features in Morphology: General Problems of so-called Pronominal Inflection in German. In: To Honour Roman Jakobson. Mouton, The Hague/Paris, pp. 239–270. Blevins, James (1995): ‘Syncretism and Paradigmatic Opposition’, Linguistics and Philosophy 18, 113–152. Bobaljik, Jonathan and Susanne Wurmbrand (2005): ‘The Domain of Agreement’, Natural Language and Linguistic Theory 23, 809–865. Boeckx, Cedric (2004): ‘Long-distance Agreement in Hindi: Some Theoretical Implications’, Studia Linguistica 58, 23–36. Boeckx, Cedric (2006): The Syntax of Argument Dependencies. Ms. Harvard University. ˇ Boˇskovi´c, Zeljko (2007): ‘Agree, Phases, and Intervention Effects’, Linguistic Analysis 93, 54–96. Brame, Michael (1968): A new analysis of the relative clause: Evidence for an interpretive theory. Ms., MIT. Bresnan, Joan (1971): ‘Sentence Stress and Syntactic Transformations’, Language 47, 257–281. Butt, Miriam (1993): The Structure of Complex Predicates in Urdu. PhD thesis, Stanford University. Butt, Miriam (2008): Revisiting Long-Distance Agreement in Urdu. Ms., Universit¨at Konstanz. Carstens, Vicky (2001): ‘Multiple Agreement and Case Deletion: Against φ -(In)Completeness’, Syntax 4, 147–163. Carstens, Vicky (2003): ‘Rethinking Complementizer Agreement: Agree with a Case-Checked Goal’, Linguistic Inquiry 34, 393–412. Chomsky, Noam (1957): Syntactic Structures. Mouton, The Hague. Chomsky, Noam (1965): Aspects of the Theory of Syntax. MIT Press, Cambridge, Massachusetts. Chomsky, Noam (1971): Deep Structure, Surface Structure, and Semantic Interpretation. In: D. Steinberg and L. Jakobovits, eds, Semantics: An Interdisciplinary Reader in Philosophy, Linguistics, and Psychology. Cambridge University Press, Cambridge. Chomsky, Noam (1973): Conditions on Transformations. In: S. Anderson and P. 
Kiparsky, eds, A Festschrift for Morris Halle. Holt, Reinhart and Winston, New York, pp. 232–286. Chomsky, Noam (1977): On Wh-Movement. In: P. Culicover, T. Wasow and A. Akmajian, eds, Formal Syntax. Academic Press, New York, pp. 71–132. Chomsky, Noam (1981): Lectures on Government and Binding. Foris, Dordrecht. Chomsky, Noam (1995): The Minimalist Program. MIT Press, Cambridge, Massachusetts. Chomsky, Noam (2000): Minimalist Inquiries: The Framework. In: R. Martin, D. Michaels and J. Uriagereka, eds, Step by Step: Essays on Minimalist Syntax in Honor of Howard Lasnik. MIT Press, Cambridge, Massachusetts, pp. 89–155. Chomsky, Noam (2001): Derivation by Phase. In: M. Kenstowicz, ed., Ken Hale. A Life in Language. MIT Press, Cambridge, Massachusetts, pp. 1–52. Chomsky, Noam (2008): On Phases. In: R. Freidin, C. P. Otero and M. L. Zubizarreta, eds, Foundational Issues in Linguistic Theory. MIT Press, Cambridge, Massachusetts, pp. 133–166. de Vries, Mark (2002): The Syntax of Relativization. PhD thesis, Universiteit van Amsterdam. de Vries, Mark (2008): ‘Asymmetric Merge and Parataxis’, Canadian Journal of Linguistics 53, 355–385. Diesing, Molly (1992): Indefinites. MIT Press, Cambridge, Massachusetts. Fischer, Silke (2006): ‘Matrix unloaded: Binding in a local derivational approach’, Linguistics 44, 913–935. Frampton, John and Sam Gutman (1999): ‘Cyclic Computation’, Syntax 2, 1–27. Frampton, John and Sam Gutman (2000): Agreement is Feature Sharing. Ms. Northeastern University. G¨artner, Hans-Martin (2002): Generalized Transformations and Beyond – Reflections on Minimalist Syntax. Akademie-Verlag, Berlin. Goldsmith, John (1979): Autosegmental Phonology. Garland, New York. Grewendorf, G¨unther (1989): Ergativity in German. Foris, Dordrecht.


Gundel, Jeanette (1977): ‘Where Do Cleft Sentences Come From?’, Language 53, 543–559.
Haider, Hubert (1993): Deutsche Syntax – Generativ. Narr, Tübingen.
Halle, Morris and Alec Marantz (1993): Distributed Morphology and the Pieces of Inflection. In: K. Hale and S. J. Keyser, eds, The View from Building 20. MIT, Cambridge, Massachusetts, pp. 111–176.
Heim, Irene and Angelika Kratzer (1998): Semantics in Generative Grammar. Blackwell, Oxford.
Ito, Junko and Armin Mester (2000): “Ich, der ich sechzig bin”: An Agreement Puzzle. In: S. Chung, J. McCloskey and N. Sanders, eds, The Jorge Hankamer WebFest. http://ling.ucsc.edu/Jorge/.
Jespersen, Otto (1927): A Modern Grammar of English. Vol. 2, Allen and Unwin, London.
Jespersen, Otto (1937): Analytic Syntax. Allen and Unwin, London.
Kayne, Richard (1976): French Relative que. In: M. Luján and F. Hensey, eds, Current studies in Romance Linguistics. Georgetown University Press, Washington, pp. 255–299.
Kayne, Richard (1994): The Antisymmetry of Syntax. MIT Press, Cambridge, Massachusetts.
Lahne, Antje (2008): Specificity-driven Syntactic Derivation: A New View on Long-distance Agreement. Ms., Universität Leipzig.
Lasnik, Howard and Tim Stowell (1991): ‘Weakest Crossover’, Linguistic Inquiry 22, 687–720.
Legate, Julie Anne (2005): Phases and Cyclic Agreement. In: M. McGinnis and N. Richards, eds, Perspectives on Phases. MIT Working Papers in Linguistics, MIT Press, Cambridge, Massachusetts, pp. 147–156.
Meinunger, André (1998): A Monoclausal Structure for (Pseudo-)Cleft Sentences. In: P. N. Tamanji and K. Kusumoto, eds, Proceedings of NELS 28. GLSA, University of Toronto, pp. 283–298.
Morgan, Jerry A. (1972): Verb Agreement as a Rule of English. In: P. M. Peranteau, J. N. Levi and G. C. Phares, eds, Papers from the 8th Regional Meeting of the Chicago Linguistic Society. University of Chicago, pp. 278–286.
Müller, Gereon (2000): ‘Optimality, Markedness, and Word Order in German’, Linguistics 37, 777–818.
Müller, Gereon (2001): Order Preservation, Parallel Movement, and the Emergence of the Unmarked. In: G. Legendre, J. Grimshaw and S. Vikner, eds, Optimality-Theoretic Syntax. MIT Press, Cambridge, Massachusetts, pp. 279–314.
Nevins, Andrew (2004): Derivations without the Activity Condition. In: M. McGinnis and N. Richards, eds, Perspectives on Phases. Vol. 49 of MIT Working Papers in Linguistics, MIT Press, pp. 287–310.
Perlmutter, David (1972): Evidence for Shadow Pronouns in French Relativization. In: P. M. Peranteau, J. N. Levi and G. C. Phares, eds, The Chicago Which Hunt. Chicago Linguistic Society, pp. 73–105.
Perlmutter, David and Scott Soames (1979): Syntactic Argumentation and the Structure of English. The University of California Press, Berkeley.
Pesetsky, David (1989): Language Particular Processes and the Earliness Principle. Ms., MIT.
Pesetsky, David and Esther Torrego (2007): The syntax of valuation and the interpretability of features. In: S. Karimi, V. Samiian and W. K. Wilkins, eds, Phrasal and clausal architecture: Syntactic derivation and interpretation. Benjamins, Amsterdam, pp. 262–294.
Pesetsky, David and Esther Torrego (2008): Probes, Goals and Syntactic Categories. Ms., MIT, University of Massachusetts, Boston.
Pinkham, Jessie and Jorge Hankamer (1975): Deep and Shallow Clefts. In: Papers from the 11th Regional Meeting of the Chicago Linguistic Society. University of Chicago, pp. 429–450.
Platzack, Christer (1987): ‘The Scandinavian Languages and the Null Subject Parameter’, Natural Language and Linguistic Theory 5, 377–402.
Polinsky, Maria and Eric Potsdam (2001): ‘Long Distance Agreement and Topic in Tsez’, Natural Language and Linguistic Theory 19, 583–646.
Pollard, Carl and Ivan A. Sag (1994): Head-Driven Phrase Structure Grammar. University of Chicago Press, Chicago.
Postal, Paul (1993): ‘Remarks on Weak Crossover Effects’, Linguistic Inquiry 24, 359–556.
Richards, Marc (2007): Object Shift, Phases, and Transitive Expletive Constructions in Germanic. In: P. Pica, J. Rooryck and J. van Craenenbroeck, eds, Linguistic Variation Yearbook. Vol. 7, John Benjamins, Amsterdam, pp. 139–159.
Richards, Marc (this volume): Probing the Past: On Reconciling Long Distance Agreement with the PIC. This volume.
Ross, John Robert (1967): Constraints on Variables in Syntax. PhD thesis, MIT, Cambridge, Massachusetts.
Ross, John Robert (1970): On Declarative Sentences. In: R. Jacobs and P. Rosenbaum, eds, Readings in English Transformational Grammar. Ginn and Company, Waltham, Massachusetts, pp. 222–272.
Safir, Ken (1984): ‘Multiple Variable Binding’, Linguistic Inquiry 15, 603–638.
Safir, Ken (2004): The Syntax of (In)dependence. MIT Press, Cambridge, Massachusetts.
Salzmann, Martin (2006): Resumptive Prolepsis: A Study in Indirect A′-Dependencies. PhD thesis, Universiteit Leiden, Leiden.
Schachter, Paul (1973): ‘Focus and Relativization’, Language 49, 19–46.
Schäfer, Florian (this volume): Local Case, Cyclic Agree, and the Syntax of Truly Ergative Verbs. This volume.
Schütze, Carson (2002): ‘On the Nature of Default Case’, Syntax 4, 205–238.
Sells, Peter (1984): Syntax and Semantics of Resumptive Pronouns. PhD thesis, University of Massachusetts, Amherst.
Shlonsky, Ur (1992): ‘Resumptive Pronouns As a Last Resort’, Linguistic Inquiry 23, 443–468.
Smith, Carlota (1964): ‘Determiners and relative clauses in a generative grammar of English’, Language 40, 37–52.
Sternefeld, Wolfgang (2006): Syntax – Eine morphologisch motivierte generative Beschreibung des Deutschen. Stauffenburg Verlag, Tübingen.
Uriagereka, Juan (1999): Multiple Spell-Out. In: S. D. Epstein and N. Hornstein, eds, Working Minimalism. MIT Press, Cambridge, Massachusetts, pp. 251–282.
van Koppen, Mario (2005): One probe – two goals: Aspects of agreement in Dutch dialects. PhD thesis, Universiteit Leiden.
Vergnaud, Jean Roger (1974): French Relative Clauses. PhD thesis, MIT, Cambridge, Massachusetts.
Wiese, Bernd (1999): ‘Unterspezifizierte Paradigmen. Form und Funktion in der pronominalen Deklination’, Linguistik Online 4.
Wunderlich, Dieter (1997): Der unterspezifizierte Artikel. In: C. Dürscheid, K. H. Ramers and M. Schwarz, eds, Sprache im Fokus. Niemeyer, Tübingen, pp. 47–55.
Zeller, Jochen (2004): ‘Relative Clause Formation in the Bantu Languages of South Africa’, Southern African Linguistics and Applied Language Studies 22, 75–92.

(Heck) Institut für Linguistik, Universität Leipzig
(Cuartero) Departamento de Filología y Traducción, Universidad Pablo de Olavide

Artemis Alexiadou, Elena Anagnostopoulou, Gianina Iordăchioaia & Mihaela Marchis

In Support of Long Distance Agree*

Abstract In the recent literature the phenomenon of long distance agreement has become the focus of several studies as it seems to violate certain locality conditions which require that agreeing elements in general stand in clause-mate relationships. In particular, it involves a verb agreeing with a constituent which is located in the verb’s clausal complement and hence poses a challenge for theories that assume a strictly local relationship for agreement. In this paper we present empirical evidence from Greek and Romanian for the reality of long distance agreement. Specifically, we focus on raising constructions in these two languages and we show that they do not involve movement but rather instantiate long distance agreement. We further argue that subjunctives allowing long distance agreement lack both a CP layer and semantic Tense. However, since the embedded verb also bears phi-features, these constructions pose a further problem for assumptions that view the presence of phi-features as evidence for the presence of a C layer. Finally, we raise the question of the common properties that these languages have that lead to the presence of long distance agreement.

1. (Backward) raising and long distance agreement

In a recent paper, Polinsky and Potsdam (2007) (P&P) point out that under the Copy and Delete theory of movement, a raising construction such as (1) should be analysed as involving copying of the moved constituent with subsequent deletion of one of the two copies. In general, either the higher or the lower copy can be deleted, or both can be pronounced (2). This leads to the typology in (3).

(1) [TP Bill [vP (Bill) seem [IP Bill to [vP Bill cut the line]]]]    Subject raising

(2) a. [higher copy <lower copy>]    anaphora
    b. [<higher copy> lower copy]    cataphora
    c. [higher copy lower copy]      resumption
    (angle brackets mark the deleted copy)

* An earlier version of this paper was presented at the GGS meeting in May 2008 in Berlin. We would like to thank Alex Grosu, Masha Polinsky, Eric Potsdam and Winfried Lechner for comments and suggestions. Special thanks to Gereon Müller for helpful discussions of this paper.

Local Modelling of Non-Local Dependencies in Syntax, 85–109. Artemis Alexiadou, Tibor Kiss & Gereon Müller (eds.). Linguistische Arbeiten 547, de Gruyter 2012


(3) Typology of raising in Polinsky and Potsdam (2007) (P&P):1

    Structure           Higher copy pronounced    Lower copy pronounced
    Forward Raising     ✓                         *
    Backward Raising    *                         ✓
    Resumption          ✓                         ✓
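
The typology in (3) can be read as a mapping from the pronunciation status of the two copies to a construction type. The following Python sketch merely illustrates that mapping; the function name `classify_chain` and the label returned when neither copy is pronounced are illustrative assumptions and not part of P&P's proposal.

```python
def classify_chain(higher_pronounced: bool, lower_pronounced: bool) -> str:
    """Map the pronunciation of the two copies to the labels used in (3)."""
    if higher_pronounced and lower_pronounced:
        return "Resumption"
    if higher_pronounced:
        return "Forward Raising"
    if lower_pronounced:
        return "Backward Raising"
    # Assumption: (3) does not list a chain with no pronounced copy,
    # so this label is hypothetical.
    return "not listed in (3)"


# Enumerate all four combinations of pronounced/unpronounced copies.
for higher in (True, False):
    for lower in (True, False):
        print(higher, lower, classify_chain(higher, lower))
```

Only the first three outcomes correspond to rows of (3); the fourth is included so that the function is total.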

Polinsky and Potsdam (2007; 2008) furthermore argue that backward raising does not always involve actual subject-to-subject raising, i.e., movement followed by deletion of the higher copy, as in (2-b). Adyghe, a Caucasian language spoken in the south of Russia and Turkey, has real backward raising. On the other hand, Greek, which has been analysed by Alexiadou and Anagnostopoulou (1999; 2002) (A&A) as having overt or covert raising out of subjunctive complements, actually has Long Distance Agreement (LDA), which requires Agree (Chomsky (2000; 2004)). Our contribution to this discussion is as follows. First, we revisit Greek in the light of P&P’s findings and conclude that the instances analysed by Alexiadou and Anagnostopoulou (1999; 2002) as covert raising indeed involve LDA rather than actual movement. Second, we present evidence that Romanian also has LDA across subjunctive complements, similarly to Greek. And third, we propose an analysis for LDA focusing on the conditions under which LDA patterns obtain in the languages under discussion.

2. P&P’s criteria for backward raising

Evidence for Backward Raising (BR) seems scant.2 Polinsky and Potsdam (2007; 2008) discuss Adyghe as a language with real BR. Adyghe is a Northwest Caucasian language with ergative case marking and relatively free word order. Raising verbs in this language are ‘become, turn to’, ‘happen to’, ‘be likely to’ and the aspectual verbs ‘begin’ and ‘stop’. Constructions involving such verbs are biclausal, as shown e.g. by the fact that the event in each clause can be modified independently, as in (4) (Polinsky and Potsdam (2008, ex. 9); further arguments include the possibility of two negations and NPI licensing in both clauses):

(4) a. Xw K e J@ńes@m @čwec [šhw enč’@m-če twe s@-we-new]
       twice 1SG-shoot-SUP turned out this year gun-INSTR
       ‘This year I turned out to shoot my gun twice (in a row).’
    b. twe Xw Ke J@ńes@m @čwec [šhw enč’@m-če s@-we-new]
       1SG-shoot-SUP twice turned out this year gun-INSTR
       ‘This year there were two times that I turned out to shoot my gun.’

1 P&P point out that the same patterns can be found in the case of control structures, under the analysis of control as movement (see Hornstein (1999) and subsequent work). We further refer the reader to Alboiu (2007), Alexiadou et al. (2011), where it is argued that Greek and Romanian have extensive backward control across Obligatory Control (OC) complements.

2 Evidence from more languages is given for Backward Control (BC). B(subject)C can be observed in several Nahk-Dagestanian languages, in Northwest Caucasian, in Malagasy, and in Korean. According to Polinsky and Potsdam (2007), Tsez offers the most compelling case of obligatory subject control. In Alexiadou et al. (2011), we argue that Greek and Romanian present a stronger argument for BC.

Initial evidence for raising in Adyghe is provided by the preservation of idiomatic meaning in (5-c), which shows that the matrix predicate has a nonthematic subject:

(5) a. [Axe-me p@sme-r a-tx@-new] feža-R-ex
       3-PL-ERG letter-ABS 3-PL-ERG-write-SUP begin-past-PL
       ‘They began to write a letter.’
    b. @-pe hoz@-r q@rex@
       3-SG.POSS nose smoke-ABS blows
       ‘S/he is furious.’ (lit. smoke is coming out of his/her nose)
    c. [@-pe hoz@-r q@rexj@-new] q@č@č @R
       3-SG.POSS nose smoke-ABS blow-SUP happened
       ‘S/he happened to be furious.’

In contexts like (5-c) the subject is in the lower clause, as its case is determined by the lower predicate (ERG if transitive, ABS if intransitive). But there must also be a silent copy in the main subject position, as the subject is able to bind a reflexive in the matrix clause:

(6) [Axe-me p@sme-r a-tx@-new] ze-feža-R-ex
    3-PL-ERG letter-ABS 3-PL-ERG-write-SUP REFL.begin-past-PL
    ‘They began to write a letter for themselves.’

Moreover, a quantified DP subject in the lower clause can have wide scope with respect to the negation in the higher clause (see Polinsky and Potsdam (2007; 2008) for examples). P&P’s conclusion is that Adyghe has BR. They propose an analysis as in (7), where the higher copy is deleted (marked here by angle brackets).

(7) <axe-r> [axe-me p@sme-r a-tx@-new] feža-R-ex    BR
    3-PL-ABS 3PL-ERG letter-ABS 3PL-ERG-write-SUP begin-past-PL
    ‘They began to write a letter.’

P&P distinguish between fake BR and real BR and suggest that Greek is a language with fake BR, in spite of the evidence from the agreement patterns (the higher verb obligatorily agrees with the lower subject, even when the subject is thematically dependent on the lower verb as in the case of idioms; see Alexiadou and Anagnostopoulou (1999) and below for discussion); Greek examples as in (8) are analysed as an instance of long distance agreement (LDA):

(8) a. Stamatisan na diavazun ta pedia vivlia tu Kazandzaki
       stopped-3PL SUBJ read-3PL the children books Kazandzakis
       ‘The children stopped reading books by Kazandzakis.’
    b. [TP stamatisan [TP na diavazun [DP ta pedia]]]    LDA (Agree)
       stopped SUBJ read.3PL the children

In the following sections, we systematically go over the arguments in support of this conclusion, while at the same time adding Romanian to the discussion.

3. Control and raising constructions in Greek and Romanian

3.1. Control subjunctives

In both Greek and Romanian, control is instantiated in a subset of subjunctive complement clauses, as these languages generally lack infinitives.3 The debate so far has concentrated on whether the null subject of the subjunctive clause should be pro or PRO or, more recently, an A-trace (Iatridou (1993); Varlokosta (1994); Terzi (1992); Tsoulas (1993); Philippaki and Catsimali (1999); Spyropoulos (2007); Kapetangianni and Seely (2007); Roussou (2009) among others for Greek; Dobrovie-Sorin (1994; 2000); Motapanyane (1995); Terzi (1992); Alboiu (2007) among others for Romanian).4

3 As is well known, Greek has lost infinitives entirely. Romanian does actually have infinitives, which may appear in raising structures but not in control environments:

(i) a. Maria pare a citi (??Maria) o carte (Maria).
       Mary seems to read a book
    b. *Maria încearcă a citi o carte.
       Mary tries to read a book

4 For Greek, it has been shown that in principle nominative features are available in the complement clause, see e.g. Philippaki and Catsimali (1999); Spyropoulos (2007) (but see Alboiu (2007) for a different view with respect to Romanian). The argument is based on the availability of NP-modifiers/intensifiers licensed in the lower clause by the higher subject, and this has been seen as evidence that the lower subject is pro. Landau (2004; 2007) argues that if PRO can be assigned case, these examples are straightforwardly accounted for.

(i) (O Janis) kseri na kolimbai (o Janis) monos tu
    John know-3SG SUBJ swim-3SG (John) alone-NOM
    ‘John knows how to swim by himself’

In Greek, subjunctive complement clauses are introduced by the subjunctive marker na (9).5 In Romanian, the subjunctive marker is să (10). In both languages, the embedded verb, similarly to the matrix verb, shows agreement in number and person with the matrix subject.

(9) O Petros/ego kser-i/-o na koliba-i/-o    Greek
    Peter-NOM/I knows/know-1SG SUBJ swim-3SG/-1SG
    ‘Peter knows how to swim/I know how to swim.’
(10) Ion a uitat să limpezească cămașa.    Romanian
     Ion forgot-3SG SUBJ rinse-3SG shirt-the
     ‘Ion forgot to rinse the shirt.’

In addition, Romanian has a second type of subjunctive complements, introduced by ca (the subjunctive complementizer) plus the subjunctive marker să:

(11) Ion vrea ca azi să cânte la violoncel
     Ion wants that today SUBJ play at cello

Ca is absent in both Obligatory Control and raising complements (cf. Grosu and Horvath (1987)). Greek subjunctives and Romanian să (but not ca să) subjunctives lack obviation effects. In this respect, they behave like infinitives (see Terzi (1992) for a detailed discussion).

(12) a. Juan_i quiere que EC_j/*i venga    Spanish
        John wants that comes-SUBJ
        ‘John_i wants that he_j/*i comes’
     b. O Janis_i theli na EC_i/j erthi    Greek
        John-NOM wants SUBJ come-3SG
        ‘John_i wants that he_i/j comes’
     c. Ion_i vrea să EC_i/j cânte la violoncel    Romanian
        Ion wants SUBJ play at cello
     d. Ion_i vrea ca să EC_*i/j cânte la violoncel    Romanian
        Ion wants that SUBJ play at cello

For both languages, it has been shown that not all subjunctive clauses involve control. Two main types of subjunctive complements have been recognized: Obligatory Control (OC) ones and non-OC ones (NOC) (or C(ontrolled)-subjunctives and F(ree)-subjunctives in Landau’s (2004) terminology), but see Spyropoulos (2007) and Roussou (2009) for certain refinements.

5 Na has been analysed as a subjunctive mood marker (cf. Philippaki-Warburton and Veloudis (1984); Philippaki-Warburton (1990); Rivero (1994)) or a subjunctive complementizer (cf. Tsoulas (1993); Agouraki (1991)) or a device to check the EPP (cf. Roussou (2009)). Here we side with the first view.

(i) OC/C-subjunctives are found as complements of verbs such as ksero ‘know how’, tolmo ‘dare’, herome ‘be happy’, ksehno ‘forget’, thimame ‘remember’, matheno ‘learn’, dokimazo ‘try’, and of aspectual verbs such as arhizo ‘start/begin’, sinehizo ‘continue’.6 The ungrammaticality of a DP subject in the embedded clause that is different from the matrix subject in (13)–(14) indicates that these verbs are OC:

(13) a. *O Petros kseri na kolimbao pro    Greek
        Peter-NOM knows SUBJ swim-1SG
     b. *O Petros kseri na kolimbai i Maria
        Peter-NOM knows SUBJ swim-3SG Mary-NOM
(14) a. *Ion știe să cântăm la chitară pro    Romanian
        Ion knows SUBJ play-1PL at guitar
     b. *Ion știe să cânte Victor la chitară
        Ion knows SUBJ play-3SG Victor at guitar

(ii) NOC/F-subjunctives are found with e.g. volitional predicates:

(15) a. O Petros perimeni na erthun    Greek
        Peter-NOM expects SUBJ come-3PL
        ‘Peter expects that they come.’
     b. O Petros elpizi na figi i Maria
        Peter-NOM hopes SUBJ go-3SG Mary-NOM
        ‘Peter hopes that Mary goes.’
(16) a. Petru se așteaptă să venim    Romanian
        Peter REFL expects SUBJ come-1PL
        ‘Peter expects that we come.’
     b. Petru speră să plece Maria
        Peter hopes SUBJ go-3SG Mary-NOM
        ‘Peter hopes that Mary goes.’

In both languages, OC disallows partial control or split antecedents:

(17) a. *I Zoi emathe na kolibane [EC_i/+]    Greek
        Zoe-NOM learnt SUBJ swim-3PL
     b. *O Janis ipe oti i Zoi emathe na kolibane [EC_*i+j]
        John-NOM said that Zoe learned-3SG SUBJ swim-3PL
(18) a. *Eu am învățat să înotăm    Romanian
        I have learnt SUBJ swim-1PL
     b. *Ion a zis ca tu ai învățat să înotați.
        John has said that you-SG have learnt SUBJ swim-2PL

6 Note that many predicates that are optional control in Greek correspond to predicates that are obligatory control in English (cf. Joseph (1992); Terzi (1992); Varlokosta (1994); Martin (1996)).

3.2. Raising subjunctives

Two raising environments have been identified in the literature (see Alexiadou and Anagnostopoulou (1999); Anagnostopoulou (2003) for Greek; Dobrovie-Sorin (1994; 2000); Alboiu (2007) and references therein for Romanian): (i) complements of aspectual verbs7 such as stop, continue, begin and (ii) complements of the verbs seem, happen. The status of the latter environment is controversial in Greek, but not in Romanian. For Greek we will limit the discussion to aspectuals; for Romanian we will also include seem. We first demonstrate that the constructions in question are biclausal. This can be shown on the basis of event modification and the presence of separate negations (compare the data below to the Adyghe examples (4)):

Greek
(19) a. Afti tin xronia arxisa [na pirovolo dio fores me to oplo mu]
        this the year started-1SG SUBJ shoot-1SG two times with the gun my
        ‘This year I started to shoot my gun two times (in a row).’
     b. Aftin tin xronia arxisa dio fores [na pirovolo me to oplo mu]
        this the year started-1SG two times [SUBJ shoot with the gun my]
        ‘This year there were two times that I started shooting with my gun.’
(20) a. Den sinexisa [na magirevo]
        NEG continued-1SG [SUBJ cook-1SG]
        ‘I didn’t continue to cook’
     b. Sinexisa [na min magirevo]
        continued-1SG [SUBJ NEG cook-1SG]
        ‘I continued not to cook.’
     c. Den sinexisa [na min magirevo]
        NEG continued-1SG [SUBJ NEG cook-1SG]
        ‘I didn’t continue not to cook (i.e., I started cooking).’

7 Alexiadou and Anagnostopoulou (1999) showed that Greek aspectual verbs are ambiguous between control and raising structures and that similar effects hold for Romanian; in all the examples discussed here we show that these verbs qualify as raising and not as OC ones.

Romanian
(21) a. Anul acesta am început [să trag de două ori cu pistolul].
        year-the this have-1SG started SUBJ shoot-1SG of two times with gun-the
        ‘This year I started to shoot the gun two times (in a row).’
     b. Anul acesta am început de două ori [să trag cu pistolul].
        year-the this have-1SG started of two times SUBJ shoot-1SG with gun-the
        ‘This year there were two times that I started shooting with the gun.’
(22) a. Nu am continuat [să gătesc].
        NEG have-1SG continued SUBJ cook-1SG
        ‘I didn’t continue to cook. (i.e., I stopped cooking.)’
     b. Am continuat [să nu gătesc].
        have-1SG continued SUBJ NEG cook-1SG
        ‘I continued not to cook.’
     c. Nu am continuat [să nu gătesc].
        NEG have-1SG continued SUBJ NEG cook-1SG
        ‘I didn’t continue not to cook. (i.e., I started cooking.)’

We now proceed to the raising/LDA properties of these constructions (see Alexiadou and Anagnostopoulou (1999); Anagnostopoulou (2003) for more raising tests).

3.2.1. Weak Crossover (WCO) and clitic doubling

An initial argument for raising comes from the interaction between clitic doubling of the object and obviation of WCO effects. In Greek and Romanian, WCO effects arise when the quantificational object is not clitic-doubled (23b/24b). When the quantificational object undergoes clitic doubling, the WCO effects are obviated (23c/24c) (cf. Alexiadou (1997)):

(23) a. Kathe mitera sinodepse to pedhi tis sto sxolio.    Greek
        every mother accompanied the child hers at school
        ‘Every mother accompanied her child to school.’
     b. ?*I mitera tu sinodepse to kathe pedhi sto sxolio
        the mother his accompanied the every child at school
        ‘?*His mother accompanied every child to school’
     c. I mitera tu to sinodepse to kathe pedhi sto sxolio.
        the mother his CL-ACC accompanied the every child at school
        ‘His mother accompanied each child to school.’
(24) a. Fiecare mamă a însoțit copilul ei la școală.    Romanian
        every mother has accompanied child-the her at school
        ‘Every mother accompanied her child to school.’
     b. *Mama lui_i a însoțit fiecare copil_i la școală.
        mother his has accompanied every child at school
     c. Mama lui_i l-a însoțit pe fiecare copil_i la școală.
        mother his him-has accompanied PE every child at school
        ‘His mother accompanied every child to school.’

Alexiadou and Anagnostopoulou (1999) argue that this effect can be analysed as the result of object raising to the position of the clitic in combination with subject reconstruction to a position lower than the clitic position. Backward binding as in (23c), (24c) follows from the assumption that binding is computed on the basis of the derived position of the quantificational object, i.e., the clitic position, and the vP-internal position of the subject which contains the pronominal variable. The same effect is found with aspectual verbs. WCO effects arise when the quantificational object of the embedded verb (25b/26b) is not doubled and the matrix subject contains a pronominal variable. On the other hand, when the embedded object undergoes clitic doubling, the WCO effects disappear (25c/26c).

(25) a. Kathe mitera arxise na sinodevi to pedhi tis sto sxolio.
        every mother started SUBJ accompany the child hers at school
        ‘Every mother started to accompany her child to school.’
     b. ?*I mitera tu arxise na sinodevi to kathe pedhi sto sxolio
        the mother his started SUBJ accompany the every child at school
        ‘?*His mother started to accompany every child to school’
     c. I mitera tu arxise na to sinodevi to kathe pedhi sto sxolio
        the mother his started SUBJ CL-ACC accompany the every child at school
        ‘His mother started to accompany each child to school.’
(26) a. Fiecare mamă a început să însoțească copilul ei la școală.
        every mother has started SUBJ accompany child her at school
        ‘Every mother started to accompany her child to school.’
     b. *Mama lui_i a început să însoțească fiecare copil_i la școală.
        mother his started SUBJ accompany every child to school
        ‘His mother started to accompany every child to school.’
     c. Mama lui_i a început să-l însoțească pe fiecare copil_i la școală.
        mother his started SUBJ-CL.ACC accompany PE every child to school
        ‘His mother started to accompany every child to school.’

This argues for raising, since the matrix subject containing the pronoun may reconstruct to the embedded vP-internal position below the derived position of the doubled quantificational object. Obviation of WCO effects under doubling is impossible with OC verbs:

(27) ?*I mitera tu kseri na to sinodevi to kathe pedhi sto sxolio
     the mother his knows SUBJ CL-ACC accompany the every child at school
(28) ??Mama lui știe să-l însoțească pe fiecare copil la școală.
     mother his knows SUBJ-CL.ACC accompany PE each child at school

This is expected, because reconstruction of the matrix subject to an embedded position is impossible under control.8

3.2.2. Idioms

A further argument, also illustrating the LDA pattern, comes from idioms. Fixed nominatives as part of idiomatic expressions in Greek and Romanian tend to occur in postverbal position.

(29) a. Mu bikan psili st’aftia.    Greek
        CL-1:SG:GEN entered-3PL fleas-NOM in the ears
        ‘I became suspicious.’
     b. #Psili mu bikan st’aftia
     c. I s-au înecat corăbiile.    Romanian
        him-DAT REFL-have drowned ships-the
        ‘He is very sad.’
     d. #Corăbiile i s-au înecat.

Examples like (29a/c) can be embedded under aspectual verbs. The subject in the embedded clause agrees with the embedded and the matrix verb:

8 This contrast between raising and control in Greek is reminiscent of the following contrast in English:

(i) a. [His_i father]_j seems to every boy_i [t_j to be a genius]
    b. *?[His_i father]_j promised every boy_i [PRO_j to be a genius]

In (i-a) the matrix subject reconstructs to the trace position where it can be bound by the quantificational object, and therefore variable binding is possible. In (i-b), however, this is not possible, resulting in a WCO violation.

(30) Stamatisan/arxisan na mu benun psili st’aftia.
     stopped-3PL/started-3PL SUBJ CL-1:SG:GEN enter-3PL fleas-NOM:PL in the ears
     ‘I stopped being/started becoming suspicious.’
(31) Au început să i se înece corăbiile.
     have-3PL started SUBJ him-DAT REFL drown ships
     ‘He started being very sad.’

Idioms are impossible with OC verbs, which have a thematic subject position.

(32) *Kserun na mu benun psili st’aftia
     know-3PL SUBJ CL-1:SG:GEN enter-3PL fleas-NOM in the ears
(33) *Știu să i se înece corăbiile.
     know-3PL SUBJ him-DAT REFL drown ships

In (30)/(31), the nominative depends on the lower verb for its interpretation and yet it agrees with both verbs obligatorily, as shown by the ungrammaticality of examples lacking matrix verb agreement.9

(34) *Stamatise/arxise na mu benun psili st’aftia
     stopped-3SG/started-3SG SUBJ CL-1:SG:GEN enter-3PL fleas-NOM in the ears
(35) *A început să i se înece corăbiile.
     has-3SG started SUBJ him-DAT REFL drown ships

As already discussed, there are in principle two possible analyses for obligatory agreement between the matrix verb and the embedded subject. (i) Raising of the subject to the matrix subject position either covertly (at LF) or overtly with subsequent deletion of the higher copy, as in P&P’s analysis of Adyghe.10 (ii) Alternatively, the subject remains in situ and agreement with the matrix verb is a genuine case of LDA as a result of Agree. The raising analysis (i) has been adopted by Alexiadou and Anagnostopoulou (1999) in a framework lacking Agree. We will see, however, that the LDA analysis (ii) is the correct one for Greek and Romanian.

9 Other examples of idioms containing plural subjects which show the pattern in (30), (31) are:

(i) Arxisan/*arxise [na mu anavun ta lambakia]
    Started-3PL/*started-3SG [SUBJ me-DAT turn on-3PL the lamps]
    ‘I started being furious’
(ii) Arxisan/*arxise [na mu vgainun kapni apo ti miti]
     Started-3PL/*started-3SG [SUBJ me-DAT come out-3PL smoke-PL from the nose]
     ‘I started being furious’

10 Alexiadou and Anagnostopoulou (1999) also consider the possibility of overt or covert feature movement, a possibility we disregard here.
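
The agreement pattern in (30)–(35) can be stated as a simple check: an example of this type is well-formed only if both the matrix and the embedded verb carry the person and number of the embedded nominative. The Python sketch below encodes only this descriptive generalization, not the Agree operation itself; the class `Phi` and the function `lda_pattern_ok` are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Phi:
    """Person and number features (an illustrative simplification)."""
    person: int
    number: str  # "SG" or "PL"


def lda_pattern_ok(matrix_verb: Phi, embedded_verb: Phi, embedded_subject: Phi) -> bool:
    """True if both verbs match the features of the embedded nominative."""
    return matrix_verb == embedded_subject and embedded_verb == embedded_subject


psili = Phi(3, "PL")       # embedded nominative 'fleas' in (30) and (34)
benun = Phi(3, "PL")       # embedded verb 'enter-3PL'
stamatisan = Phi(3, "PL")  # matrix verb in (30)
stamatise = Phi(3, "SG")   # matrix verb in (34)

print(lda_pattern_ok(stamatisan, benun, psili))  # True, as in (30)
print(lda_pattern_ok(stamatise, benun, psili))   # False, as in (34)
```

The check is deliberately neutral between analyses (i) and (ii): it states the pattern that either analysis has to derive.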

4. Backward raising?

In both Greek and Romanian, nominatives can freely occur in situ in the embedded clause, as in (36), or raise into the matrix clause, as illustrated by the WCO cases in (25) and (26) above. The in situ DP subject obligatorily agrees with both the matrix and the lower verb in person and number, just as in the idiom cases discussed in the preceding section:

(36) a. Stamatisan/*Stamatise na malonun i daskali tus mathites    Greek
        stopped-3PL/stopped-3SG SUBJ scold-3PL the teachers the students
        ‘The teachers stopped scolding the students.’
     b. Au încetat/*A încetat să-i certe profesorii pe elevi.    Romanian
        stopped-3PL/stopped-3SG SUBJ-CL-3PL.ACC scold-3PL the teachers the students
        ‘The teachers stopped scolding the students.’

In (36) the agreeing subject resides in the embedded clause and has not undergone scrambling to the matrix clause (an option systematically instantiated in Russian, as extensively argued for by Polinsky and Potsdam (2008)). The subject is truly embedded (i.e., not situated in the higher clause), as it precedes objects (note the VSO order in (36)) and VP-modifiers of the lower verb. In (37) the event adverbial modifies either the matrix or the embedded verb.

(37) a. Stamatise na ksevgazi o Janis to pukamiso tesseris fores    Greek
        stopped-3SG SUBJ rinse-3SG the Janis-NOM the shirt-ACC four times
     b. A încetat [să clătească Ion cămașa de patru ori.]    Romanian
        has-3SG stopped SUBJ rinse-3SG John shirt-the of four times
        ‘John stopped rinsing the shirt four times (in a row).’
     ‘Low interpretation: John stopped rinsing the shirt four times (in a row).’
     ‘High interpretation: It was four times the case that John stopped rinsing the shirt.’

This difference in interpretation depends on the adjunction site of the adverb. In the high reading where it modifies the matrix verb it (right-) adjoins to the matrix vP/TP:

97

In Support of Long Distance Agree

(38) High reading:
     [TP V-v-T stopped [vP [vP V-v stopped [VP V stopped [Subjunctive Complement to rinse the shirt]]] four times]]

When it modifies the embedded verb, it adjoins to the embedded vP/TP:

(39) Low reading:
     [TP V-v-T stopped [vP V-v stopped [VP V stopped [Subjunctive Complement MoodP na [TP V-v-T rinse [vP [vP o Janis-NOM V-v rinse [VP rinse the shirt]] four times]]]]]]


As illustrated in (39), the subject necessarily resides in the embedded clause when the adverb modifies the predicate of the embedded clause.11 Although the fact that the subject follows the subjunctive marker would be enough to show that the subject is in the embedded clause, one could argue that the embedded subject has been leftward moved to the higher clause (Idan Landau (p.c.)). But if the subject were part of the higher clause, the adverbial would be higher as well, adjoined to the higher clause, obligatorily resulting in the high reading. Note that the adverb only has matrix scope in (40), where it clearly modifies the matrix verb:

(40) a. Stamatise tesseris fores [na ksevgazi o Janis to pukamiso]
        stopped-3SG four times [SUBJ rinse the Janis-NOM the shirt-ACC]
        ‘It was four times the case that Janis stopped rinsing the shirt.’
     b. A încetat de patru ori [să clăteasca Ion cămașa].
        has-3SG stopped of four times SUBJ rinse John shirt-the
        ‘It was four times the case that John stopped rinsing the shirt.’

Having presented evidence that the subject is truly embedded, let us now turn to the raising vs. LDA question. Polinsky and Potsdam (2008) provide evidence that in Greek there is no copy in the matrix clause in support of the latter option; we show here that similar facts hold in Romanian (cf. Rivero and Geber 2008). P&P’s main arguments come from scope: while the matrix subject DP takes wide scope with respect to the raising verb ((41-a)/(42-a)), the unraised one only has narrow scope ((41-b)/(42-b)):

(41) a. Mono i Maria stamatise na perni kakus vathmus
        only Mary stopped SUBJ get-3SG bad grades
        ONLY > STOP
        ‘It is only Maria who stopped getting bad grades.’
     b. Stamatise na perni mono i Maria kakus vathmus
        stopped SUBJ get-3SG only Maria bad grades
        STOP > ONLY
        ‘It stopped being the case that only Maria got bad grades.’

(42) a. Numai Maria a încetat să ia note slabe.
        only Mary stopped SUBJ get grades weak
        ONLY > STOP (see footnote 12)
        ‘It is only Maria who stopped getting bad grades.’
     b. A încetat să ia numai Maria note slabe.
        stopped SUBJ get only Mary grades weak
        STOP > ONLY
        ‘It stopped being the case that only Mary got bad grades.’

11 As is standardly assumed, the verb raises to T in Greek and Romanian (see Alexiadou (1997); Alexiadou and Anagnostopoulou (1998; 2001); Cornilescu (2000); Dobrovie-Sorin (1994), among many others). Alexiadou and Anagnostopoulou (1998; 2001) extensively argue that postverbal subjects in these languages are vP internal. The trees in (38) and (39) follow these analyses for ease of exposition. The main point of the argument presented in the main text does not crucially depend on this particular analysis of VSO orders.

In this respect, Greek and Romanian differ from Adyghe where downstairs subjects may take wide scope, as discussed in Polinsky and Potsdam (2007; 2008). A further environment where Greek and Romanian differ from Adyghe concerns the scope interaction between the subject DP and the matrix negation. The matrix quantified subject allows wide scope with respect to clause-mate negation, while the unraised one takes only narrow scope with respect to matrix negation. In Adyghe, on the other hand, the downstairs quantified DP has wide scope over the matrix negation, regardless of its linear position.

(43) a. Oli i fitites den arhisan na diavazun afto to vivlio
        all the students neg began.3PL SUBJ read.3PL this the book
        ALL > NEG
        ‘All the students did not begin to read this book.’
     b. Den arhisan na diavazun oli i fitites afto to vivlio
        neg began.3PL SUBJ read.3PL all the students this the book
        NEG > ALL
        ‘Not all the students began to read this book.’

(44) a. Toți studenții nu au început să citeasca această carte.
        all students-the not have began.3PL SUBJ read this book
        ALL > NEG
        ‘All students did not begin to read this book.’
     b. Nu au început să citeasca toți studenții această carte.
        NEG have.3PL begin SUBJ read.3PL all students-the this book
        NEG > ALL
        ‘Not all the students began to read this book.’

We would like to add a further argument for the LDA analysis, relating to the licensing of predicative modifiers. In Greek and Romanian, nominal secondary predicates and predicative modifiers like “alone” agree in gender and number with the c-commanding DP they modify:

(45) a. O Janis efige panikovlitos/*i
        Janis-NOM left panicking-MS/FEM
        lit. ‘Janis left in panic.’
     b. O Janis irthe monos tu/*moni tis
        John-NOM came alone-MS/alone-FEM
        ‘Janis came alone.’
        Greek

(46) a. Ion a plecat panicat/*ă.
        Ion left panicking-MS/FEM
        ‘Ion left in panic.’
     b. Ion a venit singur/*ă.
        Ion came alone-MS/alone-FEM
        ‘Ion came alone.’
        Romanian

12 Note that the same judgements hold in Romanian for the infinitival raising constructions. We would like to point out here that with ‘seem’ Romanian only has the SEEM > ONLY reading, irrespective of the surface position of the subject, i.e., before the raising verb or in the embedded clause.

If this were a BR construction, we would expect such modifiers to be licensed in the matrix clause, while the DP they modify resides in the embedded clause. This is impossible, however, providing evidence against BR and in favor of LDA.

It has been mentioned in footnote 7 that aspectual verbs in Greek and Romanian are ambiguous between a raising and a control construal. There is one environment where raising aspectuals behave differently from their control counterparts with respect to their Case/agreement properties (see Alboiu (2007), and Alexiadou et al. (2011) for a detailed discussion). When a quirky subject construction is embedded under aspectuals, OC aspectuals agree in person and number with the embedded quirky dative subject, as shown in (47-a). On the other hand, raising aspectuals agree in person and number with the embedded nominative theme argument regardless of the surface position of the quirky subject, i.e., whether it remains in the embedded clause (as in (47-b)) or it raises to the matrix clause (as in (47-c)):

(47) a. ?Arxise na min tis ksefevgun tis Marias polla lathi
        started-3SG SUBJ not CL-GEN escape-3PL the Mary-GEN many mistakes-PL
        ‘Mary started not to miss so many mistakes’
     b. Arxisan na min tis ksefevgun tis Marias polla lathi
        started-3PL SUBJ not CL-GEN escape-3PL the Mary-GEN many mistakes-PL
     c. Tis Marias arxisan na min tis ksefevgun polla lathi
        the Mary-GEN started-3PL SUBJ not CL-GEN escape-3PL many mistakes-PL

That (47-a) contains a thematic subject position while (47-c) doesn’t is evidenced by the fact that agent-oriented adverbs are licensed in (47-a) but not in (47-b)/(47-c), as shown in (48):

(48) a. Epitides arxise na min tis ksefevgun tis Marias polla lathi
        on purpose started-3SG SUBJ not CL-GEN escape-3PL the Mary-GEN many mistakes-PL
        ‘Mary deliberately started not to miss so many mistakes’
     b. *Epitides arxisan na min tis ksefevgun tis Marias polla lathi
        on purpose started-3PL SUBJ not CL-GEN escape-3PL the Mary-GEN many mistakes-PL
     c. *Tis Marias arxisan epitides na min tis ksefevgun polla lathi
        the Mary-GEN started-3PL deliberately SUBJ not CL-GEN escape-3PL many mistakes-PL

The predicative modification diagnostic can now be applied to the unambiguously raising construction exemplified in (48b,c). Adding a predicative modifier to the nominative argument is grammatical only when the modifier occurs in the embedded clause, as in (49-a), and not when the modifier occurs in the matrix clause, as in (49-b):13

(49) a. Arhisan apo fetos na mu aresun ta kreata oma
        started-3PL from this year SUBJ me-GEN like the meat-PL raw
     b. *Arhisan oma apo fetos na mu aresun ta kreata
        started-3PL raw from this year SUBJ me-GEN like the meat-PL

The ungrammaticality of (49-b) entails that there is no silent copy of the nominative in the matrix clause in Greek, unlike Adyghe. Similar observations hold for Romanian. We thus conclude – in agreement with P&P – that Greek and Romanian (see Alboiu (2007); Rivero and Geber (2008)) have fake BR, as there is no higher copy in the matrix clause. The apparent backward raising phenomenon attested in these languages is actually an instance of LDA between the matrix T and the embedded in situ nominative argument.

5. Accounting for the properties of the in situ patterns

In both Greek and Romanian, the absence of a copy in the raising verb’s clause accounts for the low characteristics of the subject. Agreement between the matrix verb and the embedded subject must be determined non-locally, across a clause boundary, as in (50), from Polinsky and Potsdam (2008):

13 In order to control for the adjective being interpreted as focussed in the left periphery of the embedded clause, we include an adverbial clearly modifying the matrix verb in our examples.


(50) [TP arhisan [CP/TP na trehun [ta pedia]]]    (Agree/LDA between matrix arhisan and embedded ta pedia)
     started-3PL SUBJ run-3PL the children
     ‘The children started to run’

But how is this possible? We assume, following Chomsky (2004) and Baker (2008, 65), that the central principles governing Agree are as in (51):

(51) Agree occurs between F and XP, XP a maximal projection, only if:
     a. F c-commands XP
     b. There is no YP such that YP comes between F and XP and YP has phi-features
     c. F and XP are in the same phase (locality condition)
     d. XP is made active by having an unchecked case feature (activation condition)
     e. α and β become valued for the matched features

In the system of Chomsky (2000; 2004), the conditions in (51) can hold if the lower clause is not a phase, i.e., if it lacks a CP layer. Otherwise the Phase Impenetrability Condition would be violated and the embedded subject would be inaccessible for the operation Agree with matrix T. In addition, the subject must be active: it must have an unchecked Case feature. This means that the embedded T lacks Case. Is there evidence that the locality and the activation condition are met? The answer appears to be positive. Straightforward evidence for the absence of C comes from Romanian where the subjunctive complementizer ca is always absent in LDA constructions (and see Alboiu (2007) for further arguments that the lower clause is not a phase).

(52) [TP1 T° [TP2 anaphoric na/să NOM ]]

(53) A început (*ca) să cânte Maria la pian.14
     has-3SG started that SUBJ sing Maria at piano

Since the two languages behave identically in every other respect, we assume that a CP layer is also lacking from Greek LDA constructions. Proceeding to Case and the activation condition, we assume (following Iatridou (1993), Varlokosta (1994); Alexiadou and Anagnostopoulou (1999); Chomsky (2004); Landau (2004) and others) that Case is a property of complete, i.e., non-deficient Tense. In both Greek and Romanian, LDA subjunctives are characterized by the absence of morphological and semantic Tense, i.e., absence of independent temporal reference in the embedded clause. As (54-b) and (55-b) show, it is not possible to modify the embedded verb by a temporal adverb with independent reference:15

(54) a. *O Janis arhizi na kolibise.
        John begins SUBJ swam-3SG
     b. *O Janis arhizi na kolibai avrio.
        John begins SUBJ swim-3SG tomorrow
        Greek

(55) a. *Ion începe să a înotat.
        John begins SUBJ swam-3SG
     b. *Ion a început să înoate mâine.
        John began SUBJ swim tomorrow
        Romanian

14 As already pointed out in section 4.1, OC is also ungrammatical in the presence of the subjunctive complementizer ca. This could be taken to point to the conclusion that OC and LDA are identical with respect to locality, which would be expected in an A-movement analysis of OC. This issue requires further investigation, though, since the claim that control complements lack a CP layer would be highly controversial. In addition, there is a clear difference between OC complements and raising complements with respect to Case, revealed in contexts like (47) and (48) above to be discussed below.

This is in contrast with NOC/F-subjunctives:16

(56) a. O Janis theli na figi avrio
        John-NOM wants SUBJ leave-3SG tomorrow
        ‘John wants to leave tomorrow.’
     b. O Janis theli na figi i Maria
        John-NOM wants SUBJ leave-3SG Mary-NOM
        ‘John wants Mary to leave.’
        Greek

15 The relationship between Case and Semantic Tense across languages is systematic in raising constructions. For English, Martin (1996) extensively argues that raising is contingent on the absence of semantic Tense.

16 Note here that in Romanian the raising verbs ‘seem’ and ‘happen’ behave slightly differently (at least for the two native speakers involved in this paper), their complement patterning like F-subjunctives. Specifically, the subjunctive clause has semantic tense, as shown in (i):

(i) (Maria) s-a nimerit (Maria) să plece (Maria) mîine.
    Mary REFL-has happened (Mary) SUBJ leave Mary tomorrow
    ‘It so happened that Mary would leave tomorrow.’ (It happened at that time that Mary would leave later, and w.r.t. the time when we speak (now) Mary’s leaving will take place tomorrow.)

This, in connection with the fact that with these two verbs the DP seems to be in the embedded clause, irrespective of its surface position (scope interaction mentioned in section 4.2; footnote 11), suggests that complements of ‘seem’ and ‘happen’ in Romanian can license nominative and hence the DP is not active for valuation through the matrix verb. If it appears preceding the matrix verb, it is interpreted as a focus or topic (Alboiu (2007)). The agreement on the matrix verb is an instance of phi-feature chain formation. See also Rivero and Geber (2008).

(57) Ion a vrut să plece Maria mâine.
     John-NOM wanted SUBJ leave-3SG Maria tomorrow
     ‘John wanted Mary to leave tomorrow.’

Romanian
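The conditions in (51), together with the assumptions about phases and Case made above, can be illustrated with a minimal sketch. It is not part of the analysis itself: the names `Projection`, `Goal` and `can_agree` and the two boolean flags are hypothetical, c-command (51a) is simply assumed to hold, valuation (51e) is not modelled, and only the clausal layers between matrix T and the embedded subject are listed (the vP layer is ignored).

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Projection:
    """A projection lying between the probe F and the goal XP (illustrative)."""
    label: str
    is_phase: bool = False  # assumption: True only for a CP layer
    has_phi: bool = False   # assumption: True for a phi-bearing intervener YP


@dataclass
class Goal:
    """The goal XP; case_checked=False means its Case feature is still unchecked."""
    label: str
    case_checked: bool


def can_agree(path: List[Projection], goal: Goal) -> bool:
    """Check (51b)-(51d) for a probe that c-commands the goal (51a assumed)."""
    no_intervener = not any(p.has_phi for p in path)  # (51b)
    same_phase = not any(p.is_phase for p in path)    # (51c) locality
    active = not goal.case_checked                    # (51d) activation
    return no_intervener and same_phase and active


# LDA subjunctive as in (50)/(52): no CP layer, embedded T lacks Case.
lda = can_agree([Projection("TP2")], Goal("ta pedia", case_checked=False))

# Same configuration with a CP layer, cf. the complementizer ca in (53).
with_cp = can_agree(
    [Projection("CP", is_phase=True), Projection("TP2")],
    Goal("Maria", case_checked=False),
)

# Tensed complement as in (56-b): embedded T checks Case, so the goal is inactive.
tensed = can_agree([Projection("TP2")], Goal("i Maria", case_checked=True))

print(lda, with_cp, tensed)  # True False False
```

On these assumptions, Agree with matrix T goes through only in the first configuration, which is the one argued for Greek and Romanian LDA subjunctives.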

Lacking Tense, transparent subjunctives also lack Case, which explains why the nominative argument obligatorily agrees with the matrix T in raising subjunctives. Note in this context that OC subjunctives differ from raising subjunctives with respect to Case transparency. As has been shown in the previous section, embedded quirky subject constructions provide evidence for the existence of two independent Case/agreement chains with OC aspectuals, unlike raising aspectuals. With OC aspectuals, matrix T agrees in person and number with the embedded quirky subject (more accurately, it agrees with a silent matrix nominative copy entering control with the embedded quirky subject; see Alexiadou et al. (2011) for details), while the embedded T agrees in person and number with the embedded nominative argument (47-a)/(48-a). By contrast, matrix T obligatorily agrees with the embedded nominative argument in the case of raising aspectuals embedding quirky subjects, as was shown in (47b,c) and (48b,c).

A straightforward account for the fact that subjunctives allowing LDA lack both a CP layer and semantic Tense can be given in Chomsky’s (2007) system where Tense features are a property of C inherited by T. Since C is missing, Tense and Case are also missing. A consequence of this analysis is that phi-features are not (necessarily) a property of C since they are present in Greek and Romanian embedded subjunctives allowing LDA. Moreover, the presence of complete phi-feature agreement on the embedded T shows that phi-feature valuation of the probe does not always result in Case checking of the goal (see Alexiadou and Anagnostopoulou 1999; cf. Bhatt 2005). Obviously this analysis is incompatible with Chomsky’s (2004) claim that T has phi-features only as a result of Transfer from C (and see Richards (2011) for further arguments in favor of this view). Can we reconcile this proposal with the situation we find in Greek and Romanian? It seems to us that this is possible either if we follow Alexiadou and Anagnostopoulou (1999) who treat agreement in Greek and Romanian as being EPP-related or, alternatively, if we assume that this agreement is a surface morphological agreement of the concord type. More specifically, in A & A (1999) the agreement between the embedded verb and the subject DP is analysed as being EPP-driven and not Case-driven; as a result, agreement on the embedded verb is fully specified. There are at least two different ways of further implementing this. One possibility is that agreement may in principle morphologically spell out EPP relations or Case relations, and that languages differ with respect to this. On this view, there are two features associated with T: an EPP feature and a Case feature. Both are formal features of the same type, i.e., [–interpretable] nominal features on functional heads (this is the view adopted in Alexiadou and Anagnostopoulou (1999)).


(58) [TP2 T [TP1 T [vP DP]]]

Cross-linguistically then, there are at least two types of agreement-Case, agreement-EPP relations (see also Baker (2008): 209 ff. for a similar proposal):
(i) Agreement is a reflex of Case-checking.
(ii) Agreement is the reflex of EPP checking.
The latter pattern is found in Greek and Romanian. On this view, the DP establishes an EPP chain with the embedded T, and both a Case and EPP chain with the higher T. Since the lower T lacks semantic Tense, it also lacks Case as argued above. An alternative possibility is that agreement in Greek is the result of phi-feature movement satisfying EPP (see Alexiadou and Anagnostopoulou 1998; 2001), i.e., the type of agreement we find in Greek and Romanian is of the clitic doubling type rather than the result of phi-feature valuation via Agree (see Anagnostopoulou (2003) and Preminger (2009) on the distinction between the two types of agreement).

An altogether different approach to pursue would be that the agreement on the embedded verb is somehow ‘parasitic’ and the only real Agree relationship established is the one between the matrix T and the embedded DP. This would entail that the phi-features observed on the embedded verb are present due to a well-formedness requirement on all Greek and Romanian verbs, as in (59):

(59) *T-V, when T-V bears no phi-features

(59) suggests that in languages which lack infinitives, verbs cannot appear uninflected, under the assumption that there is no default agreement form to use. Since the embedded T-V complex must bear phi-features, we could imagine that an Agr node is inserted at PF on the lower T-V complex, which copies the features of the embedded subject on it. On this view, agreement on the lower verb is more like agreement within the DP, i.e., it is a case of concord that involves copying of features, as is discussed in Embick (2000) within the framework of Distributed Morphology.
The final question we would like to address in the present context concerns the properties Greek and Romanian have in common which potentially explain the observed LDA pattern, i.e., the fact that the subject does not raise to matrix T.17 Comparing Greek to Romanian, we observe that they are both alike in that:
1. they have subjunctives in raising (and no infinitives)
2. they are pro-drop languages
3. they have VSO orders with VP-internal subjects (Alexiadou and Anagnostopoulou (2001))
4. they have clitic doubling
5. they have been argued to have EPP checking via V-movement (Alexiadou and Anagnostopoulou (1998))

The fact that both languages have subjunctives cannot be the reason for the presence of LDA, as: (i) Romanian does have infinitival complements of raising verbs and they behave similarly to subjunctives with respect to LDA;18 (ii) other languages, e.g. Bulgarian, lack infinitives but also lack fake BR (Adrian Krastev (p.c.)); (iii) Spanish has infinitives, but exhibits fake BR. Importantly, Spanish shares with Greek and Romanian all other properties. That Spanish has LDA is shown by applying the LDA tests to the infinitival constructions of the language. First, like in Greek and Romanian, unraised DPs can be argued to reside truly in the embedded clause. As was the case in Greek and Romanian, the subject resides in the embedded clause and has not undergone (rightward) scrambling to the matrix clause. The subject is truly embedded as it precedes objects (again note the VSO order in (60)) and VP-modifiers of the lower verb. In (60-a) the event adverbial modifies either the matrix or the embedded verb. In (60-b) it clearly modifies the matrix verb only:

(60) a. Acabó de enjuagar Juan la camiseta cuatro veces.
        stopped-3SG PREP to rinse-INF John the shirt four times
        Low interpretation: ‘John stopped rinsing the shirt four times.’
        High interpretation: ‘It was four times the case that John stopped rinsing the shirt.’
     b. Acabó cuatro veces de enjuagar Juan la camiseta.
        stopped-3SG four times PREP to rinse-INF John the shirt
        High interpretation: ‘It was four times the case that John stopped rinsing the shirt.’

17 A question arises here: do these languages disallow raising altogether or is raising optional? If the former, then examples with the subject in the matrix clause do not involve A-movement to matrix T but some other operation. It would be natural to pursue this option extending to these cases Alexiadou and Anagnostopoulou’s (1998) analysis of SVO orders in Greek and Romanian in terms of Clitic Left Dislocation. See Alboiu (2007) for an explicit such proposal. This requires further research on the A/A’ status of the raised subject. See also footnote 16.

Second, in situ subjects take narrow scope with respect to the raising verb and matrix negation:

(61) a. Solamente María acabó de tomar notas malas
        only Mary stopped DE get grades bad
        ONLY > STOP
        ‘It is only Maria who stopped getting bad grades.’
     b. Acabó de tomar solamente María notas malas.
        stopped DE get only Mary grades bad
        STOP > ONLY
        ‘It stopped being the case that only Maria got bad grades.’

(62) a. Todos los estudiantes no empezaron a leer este libro.
        all the students not began.3PL to read this book
        ALL > NEG
        ‘All students did not begin to read this book.’
     b. No empezaron a leer todos los estudiantes este libro.
        NEG begin to read all the students this book
        NEG > ALL
        ‘Not all the students began to read this book.’

18 Note also here that the languages discussed by P&P have infinitives in BR.

Third, licensing of modifiers in the matrix clause is ruled out, as in Greek and Romanian:

(63) a. Parece venir solamente María mal preparada a la escuela.
        seem come only Mary bad prepared-FEM at the school
     b. ??Parece mal preparada venir solamente María a la escuela.
        seem bad prepared-FEM come only Mary at the school

(64) a. Ha empezado al final del año a venir Juan solo a la escuela.
        have.3SG begun towards the end of the year to come John alone to the school
        ‘John began coming alone to school towards the end of the year.’
     b. *Ha empezado solo al final del año a venir Juan a la escuela.
        have.3SG begun alone towards the end of the year to come John to the school

It can thus be concluded that the existence of productive LDA patterns derives from properties 2–5, i.e., pro-drop, VSO orders with vP-internal subjects, clitic doubling and EPP checking via V-raising. Alexiadou and Anagnostopoulou (1998; 2001) have proposed that these properties are a reflex of a single one: the extensive availability of clitic/agreement-associate relationships in a language which permit DPs to remain in situ.19

19 As already mentioned, in those papers, it was argued that agreement (with clitic properties) satisfies the (EPP, Case) requirements of the higher clause so that the lower copy can be spelled out in a lower domain. This was seen as a case of feature movement. If we stick to this view, then the contrast between Greek/Romanian and Adyghe is not lack of movement (LDA) vs. movement, but rather X (feature) vs. XP movement in terms of Copy and Delete.

Bibliography

Agouraki, Georgia (1991): ‘A Modern Greek Complementizer and its Significance for Universal Grammar’, UCL Working Papers in Linguistics 3, 1–24.
Alboiu, Gabriela (2007): Moving Forward with Romanian Backward Control and Raising. In: W. Davies and S. Dubinsky, eds, New Horizons in the Analysis of Control and Raising. Springer, Dordrecht, pp. 187–213.
Alexiadou, Artemis (1997): Adverb Placement: A Case Study in Antisymmetric Syntax. John Benjamins, Amsterdam.
Alexiadou, Artemis and Elena Anagnostopoulou (1998): ‘Parametrizing Agr: Word Order, V-Movement and EPP-Checking’, Natural Language and Linguistic Theory 16, 491–539.
Alexiadou, Artemis and Elena Anagnostopoulou (1999): Raising without Infinitives and the Nature of Agreement. In: S. Bird, A. Carnie, J. Haugen and P. Norquest, eds, Proceedings of WCCFL 18. Cascadilla Press, Somerville, Mass., pp. 14–26.
Alexiadou, Artemis and Elena Anagnostopoulou (2001): ‘The Subject in situ Generalization and the Role of Case in Driving Computations’, Linguistic Inquiry 32, 193–231.
Alexiadou, Artemis and Elena Anagnostopoulou (2002): Raising Without Infinitives and the Nature of Agreement. In: Dimensions of Movement: From Remnants to Features. Benjamins, Amsterdam, pp. 17–30.
Alexiadou, Artemis, Elena Anagnostopoulou, Gianina Iordăchioaia and Mihaela Marchis (2011): A Stronger Argument for Backward Control. Proceedings of NELS 39, 1–14.
Anagnostopoulou, Elena (2003): The Syntax of Ditransitives: Evidence from Clitics. Mouton de Gruyter.
Baker, Mark (2008): The Syntax of Agreement and Concord. Cambridge University Press, Cambridge.
Bhatt, Rajesh (2005): ‘Long Distance Agreement in Hindi-Urdu’, Natural Language and Linguistic Theory 23, 757–807.
Chomsky, Noam (1973): Conditions on Transformations. In: S. R. Anderson and P. Kiparsky, eds, A Festschrift for Morris Halle. Holt, Rinehart and Winston, New York, pp. 232–286.
Chomsky, Noam (2000): Minimalist Inquiries: The Framework. In: R. Martin, D. Michaels and J. Uriagereka, eds, Step by Step: Essays on Minimalist Syntax in Honor of Howard Lasnik. MIT Press, Cambridge, Massachusetts, pp. 89–155.
Chomsky, Noam (2004): Beyond Explanatory Adequacy. In: A. Belletti, ed., Structures and Beyond. The Cartography of Syntactic Structures (vol. 3). Oxford University Press, Oxford, pp. 104–131.
Chomsky, Noam (2007): Approaching UG from Below. In: H.-M. Gärtner and U. Sauerland, eds, Interfaces + Recursion = Language? Chomsky’s Minimalism and the View from Syntax-Semantics. Mouton de Gruyter, Berlin, pp. 1–30.
Cornilescu, Alexandra (2000): The Double Subject Construction in Romanian. In: V. Motapanyane, ed., Comparative Studies in Romanian Syntax. Elsevier, Dordrecht, pp. 83–134.
Dobrovie-Sorin, Carmen (1994): The Syntax of Romanian. Mouton de Gruyter, Berlin.
Dobrovie-Sorin, Carmen (2000): Head-to-Head Merge in Balkan Subjunctives and Locality. In: A. Ralli and M-L. Rivero, eds, Comparative Syntax of Balkan Languages. Oxford University Press.
Embick, David (2000): ‘Syntax and Categories: Verbs and Participles in the Latin Perfect’, Linguistic Inquiry 31, 185–230.
Grosu, Alexander and J. Horvath (1987): ‘On Non-Finiteness in Extraction Constructions’, Natural Language and Linguistic Theory 5, 181–196.
Hornstein, Norbert (1999): ‘Movement and Control’, Linguistic Inquiry 30, 69–96.
Iatridou, Sabine (1993): ‘On Nominative Case Assignment and a Few Related Things’, MIT Working Papers in Linguistics 19, 175–198.
Joseph, Brian (1992): Diachronic Perspectives on Control. In: R. Larson, S. Iatridou, U. Lahiri and J. Higginbotham, eds, Control and Grammar. Kluwer, Dordrecht, pp. 105–234.
Kapetangianni, Konstantia and T.D. Seely (2007): Control in Modern Greek: It’s Another Good Move. In: W. Davies and S. Dubinsky, eds, New Horizons in the Analysis of Control and Raising. Springer.
Landau, Idan (2004): ‘The Scale of Finiteness and the Calculus of Control’, Natural Language and Linguistic Theory 22, 811–877.
Landau, Idan (2007): Movement Resistant Aspects of Control. In: W. Davies and S. Dubinsky, eds, New Horizons in the Analysis of Control and Raising. Springer.
Martin, Roger Andrew (1996): A Minimalist Theory of PRO and Control. Doctoral dissertation, University of Connecticut.
Motapanyane, Virginia (1995): Theoretical Implications of Complementation in Romanian. Unipress, Padova.


Philippaki-Warburton, Irene (1990): Subjects in English and in Greek. Proceedings of the 3d Symposium on the Description and/or Comparison of English and Greek. Aristotle University of Thessaloniki, 12–32.
Philippaki-Warburton, Irene and Jannis Veloudis (1984): The Subjunctive in Complement Clauses, Studies in Greek Linguistics 5.
Philippaki, Irene and Georgia Catsimali (1999): On Control in Greek. In: A. Alexiadou, G. Horrocks, and M. Stavrou, eds, Studies in Greek Syntax. Kluwer, Dordrecht.
Polinsky, Maria and Eric Potsdam (2007): Expanding the Scope of Control and Raising. In: W. Davies and S. Dubinsky, eds, New Horizons in the Analysis of Control and Raising. Springer.
Polinsky, Maria and Eric Potsdam (2008): Real and Apparent Long-Distance Agreement in Subject-to-Subject Raising Constructions. Paper presented at the workshop on Local Modelling of Non-Local Dependencies, DGfS 30, Bamberg.
Preminger, Omer (2009): ‘Breaking Agreements: Distinguishing Agreement and Clitic Doubling by Their Failures’, Linguistic Inquiry 40, 619–666.
Richards, Marc (2011): ‘Deriving the Edge: What’s in a Phase’, Syntax 14, 74–95.
Rivero, Maria Luisa (1994): ‘The Structure of the Clause and V-movement in the Languages of the Balkans’, Natural Language and Linguistic Theory 12, 63–120.
Rivero, Maria Luisa and Dana Geber (2008): Experiencer Islands and Raising in Romanian. Ms. University of Ottawa.
Roussou, Anna (2009): ‘In the Mood for Control’, Lingua 119, 1811–1836.
Spyropoulos, Vassilios (2007): Finiteness and Control in Greek. In: W. Davies and S. Dubinsky, eds, New Horizons in the Analysis of Control and Raising. Springer.
Terzi, Arhonto (1992): PRO in Finite Clauses: a Study of the Inflectional Heads of the Balkan Languages. Doctoral dissertation, CUNY.
Tsoulas, George (1993): ‘Remarks on the Structure and the Interpretation of Na-Clauses’, Studies in Greek Linguistics 14.
Varlokosta, Spyridoula (1994): Issues on Modern Greek Sentential Complementation. Doctoral dissertation, University of Maryland.

(Alexiadou) Institut für Linguistik: Anglistik, Universität Stuttgart
(Anagnostopoulou) Department of Philology, University of Crete
(Iordăchioaia) Institut für Linguistik: Anglistik, Universität Stuttgart
(Marchis) Institut für Romanistik, Universität Hamburg

Petr Biskup

Agree, Move, Selection, and Set-Merge

Abstract

In this paper, I show that there are many different relations in derivations that pose a problem for Chomsky’s (2000, et seq.) phase model, the Phase Impenetrability Condition and ‘forgotten’ phases. These relations are too non-local for the Phase Impenetrability Condition. I argue that the problem lies in the assumption that only labels of syntactic objects are visible to syntactic operations. Therefore, I propose a representational-derivational model that does not make this assumption. Specifically, the whole set information resulting from the operation Set-Merge is visible to syntactic operations. This allows deriving the non-local relations in several local steps. To derive the difference in locality behaviour between the operation Agree and Move, I propose that for the operation Agree only the set information on the sister syntactic object is relevant, and that for the operation Move also the tree information with the Phase Impenetrability Condition is relevant because it is a composed operation. I also show that c-selection behaves differently from s-selection, Agree and other long-distance relations with respect to the information given by the operation Set-Merge.

1. Introduction
It is a well-known fact that there are two versions of the Phase Impenetrability Condition. According to the first version in Chomsky (2000, 108) – which is the stronger one – no operation outside the phase can affect the complement of the phase head; consider (1):

(1) Strong version of PIC
    In phase α with head H, the domain of H is not accessible to operations outside α; only H and its edge are accessible to such operations.

However, Chomsky later recognizes that this condition is too strong because in certain cases, as in the Icelandic dative-nominative construction (2), taken from Sigurðsson (2004, 147), a probe in the next higher phase can access a goal in the complement of the lower phase head. More concretely, in (2), the probing T höfðu ‘had’ Agrees with the object hestarnir ‘horses’, which occurs in the complement of the phase head v.1

1 The following abbreviations are used in this paper: ABS = absolutive, ACC = accusative, C.OBL = complementizing oblique, COP = copula, DAT = dative, ERG = ergative, G = gender, GEN = genitive, INS = instrumental, LAT = lative, M.ABL = modal ablative, NEG = negation, NMLZ = nominalizer, NOM = nominative, OBL = oblique, PL = plural, PRS = present, PST = past, PTCP = participle.

Local Modelling of Non-Local Dependencies in Syntax, 111–133. Artemis Alexiadou, Tibor Kiss & Gereon Müller (eds.). Linguistische Arbeiten 547, de Gruyter 2012


(2) Henni höfðu ekki líkað hestarnir
    her.DAT had.3PL not liked horses.the.NOM
    ‘She had not liked the horses.’

This means that the Agree relation must cross one phase boundary in cases like (2). Therefore, Chomsky (2001, 14) proposes the weaker version of the Phase Impenetrability Condition, which allows the head T to access the object in the vP phase. More generally, according to this version, operations outside the phase can access the complement of the phase head, but only until the next higher phase head is merged into the structure; consider (3). It is crucial here that this version of the Phase Impenetrability Condition allows operations to cross at most one phase boundary.

(3) Weak version of PIC
    [In the structure [ZP Z ... [HP α [H YP]]], with H and Z the heads of phases], the domain of H is not accessible to operations at ZP; only H and its edge are accessible to such operations.
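The difference between the two versions of the condition can be made concrete with a small sketch. The Python fragment below is only an illustration under simplifying assumptions that are not part of the definitions themselves: a clause is flattened to a top-down list of heads, C and v are taken to be the only phase heads, phase edges are ignored, and the names PHASE_HEADS, crossed_phase_heads and accessible are hypothetical.

```python
# Illustrative sketch only: a clause is flattened to a top-down list of heads,
# and C and v are assumed to be the only phase heads (phase edges are ignored).
PHASE_HEADS = {"C", "v"}


def crossed_phase_heads(spine, probe, goal):
    """Return the phase heads strictly between probe and goal on the spine."""
    return [head for head in spine[probe + 1:goal] if head in PHASE_HEADS]


def accessible(spine, probe, goal, version):
    """Check whether the probe may access the goal under one version of the PIC.

    spine   -- list of head labels ordered from highest to lowest
    probe   -- index of the probing head
    goal    -- index of the goal (lower than the probe)
    version -- "strong" for the strong PIC, "weak" for the weak PIC
    """
    crossed = crossed_phase_heads(spine, probe, goal)
    if version == "strong":
        # Strong PIC: the domain of any lower phase head is closed off.
        return not crossed
    if version == "weak":
        # Weak PIC: one phase boundary may be crossed, but only as long as
        # the probe is not itself the next higher phase head.
        if not crossed:
            return True
        return len(crossed) == 1 and spine[probe] not in PHASE_HEADS
    raise ValueError(f"unknown PIC version: {version!r}")


# Icelandic dative-nominative construction: T probes the object across v.
icelandic = ["C", "T", "v", "V", "GOAL"]
print(accessible(icelandic, 1, 4, "strong"))  # False
print(accessible(icelandic, 1, 4, "weak"))    # True
```

On this simplified encoding, the Icelandic agreement pattern is excluded by the strong version and admitted by the weak one, which is the contrast that motivates the weaker formulation.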

2. The Phase Impenetrability Condition and locality
In this section, I show that there are non-local relations in derivations that also pose a problem for the weak version of the Phase Impenetrability Condition because either they cross more than one phase boundary or they cross just one phase boundary while the next higher phase head is already present in the derivation. There are two types of problematic non-local relations. In the first type – the so-called bottom-up problems – a piece of information about an element from a lower phase must be present in a higher phase. In the second type – the so-called top-down problems – a piece of information about an element from a higher phase must appear in a lower phase.

2.1. Bottom-up problems
Let us begin with binding principles and Condition A. Building on Reuland’s (2001) analysis, an Agree-based analysis of anaphors is proposed in Chomsky (2008, 142) and Chomsky (2007, 18). However, if the operation Agree indeed is to replace the earlier feature movement or covert movement, then a problem arises in the case of long-distance anaphors. Since control infinitives are standardly analyzed as CPs (e.g., Chomsky (2000, 105) or Chomsky (2001, 8)), in example (4), which is taken from Bailyn (2007, 29), there are three phase boundaries – matrix vP, embedded CP and embedded vP – between the matrix T probe and the anaphor, which goes against the one-phase-boundary requirement of the weak Phase Impenetrability Condition.2

(4) General1 poprosil polkovnika2 [PRO narisovat’ sebja1,2].    Russian
    general requested colonel to draw self
    ‘The general asked the colonel to draw himself.’

According to Chomsky (2008, 141, 145), Condition C can be treated as a probe-goal relation between the appropriate pronoun and the R-expression. This, however, means that in the following example – which is ungrammatical because the pronoun c-commands the coindexed R-expression – there are four phase boundaries intervening between the two elements,3 which again poses a problem for the weak Phase Impenetrability Condition, which allows at most one phase boundary. Note that, theoretically, the pronoun and the R-expression can be indefinitely remote.

(5) *Er1 sagte, dass Hans behauptete, dass Andreas1 klug ist.    German
    he said that Hans claimed that Andreas clever is
    ‘He said that Hans had claimed that Andreas was clever.’

2 The distance (the number of crossed phase boundaries) can be even higher because long-distance binding can cross more than one infinitive boundary in Russian.
3 For details, see the discussion in section 3.4.

The third problem concerns Condition C and coreference. Biskup (2011) shows that coreference between an R-expression within an adjunct clause and a pronoun in the matrix clause is possible only if the R-expression is scrambled in Czech. He argues that adjunct clauses like the one in (6) are merged cyclically. Example (6-a) shows that the c-commanded R-expression violates Condition C regardless of the position in which it appears in the adjunct clause. Example (6-b) demonstrates that if the R-expression occurs in situ in the adjunct clause contained in the moved presuppositional wh-phrase, the sentence is ungrammatical. In contrast, if the R-expression is scrambled in the adjunct clause, the sentence is grammatical, as illustrated by example (6-c). This means – under the assumption that sentences are sent to the interfaces in a phase-by-phase fashion – that the CP phase of the matrix clause with the appropriate pronoun must ‘remember’ not only that there is a coindexed R-expression, which was sent to the interfaces in a phase of the adjunct clause, but also its (scrambling) feature.

(6) a. *pro1 zuřivě bránil některý argument, který (Pavel1) přednesl včera (Pavel1).
        furiously defended some argument which Pavel gave yesterday Pavel

    b. *Který argument, který přednesl včera Pavel1, pro1 zuřivě bránil t?
        which argument which gave yesterday Pavel furiously defended
    c. ?Který argument, který Pavel1 přednesl včera, pro1 zuřivě bránil t?
        which argument which Pavel gave yesterday furiously defended
        ‘Which argument that Pavel gave yesterday did he defend like a fury?’

Another bottom-up problem is related to morphological agreement. In Khwarshi – which is an SOV language spoken in southern Dagestan – the matrix verb can either Agree with its sentential complement, as illustrated by (7-a) from Khalilova (2007, 4), where it bears the gender-four marker (G4), or it can Agree with the absolutive argument in the finite complement clause, as shown in (7-b), where the verb is marked by the gender-five marker. Thus, (7-b) is problematic for the Phase Impenetrability Condition because by the time the matrix v is merged, the absolutive object in the embedded clause should already have been spelled out and ‘forgotten’.

(7) a. Išet’u-l l-iq’-še goli uža bataxu y-acc-u.
       mother.OBL-LAT G4-know-PRS COP boy.ERG bread(G5) G5-eat-PST.PTCP
       ‘Mother knows that the boy ate bread.’
    b. Išet’u-l y-iq’-še goli uža bataxu y-acc-u.
       mother.OBL-LAT G5-know-PRS COP boy.ERG bread(G5) G5-eat-PST.PTCP
       ‘Mother knows that the boy ate bread.’

The last problematic case in this section is related to long-distance scrambling and Relativized Minimality. Shields (2007) shows that short adverb scrambling can cross another adverb in Russian, Japanese and Korean, as demonstrated by the Russian example (8-a), taken from Shields (2007, 162). Example (8-b) shows that the adverb can also be long-distance scrambled. However, if the adverbial long-distance scrambling crosses another adverb (8-c) – note that the adverb is the same as in (8-a) – a Relativized-Minimality effect arises. Shields (2007) argues that these data pose a problem for derivational approaches because they evaluate each derivational step independently, and he argues for a representational approach because it has simultaneous access to information created during different steps of the derivation.

(8) a. Ona bystro1 často t1 zavodilas’.
       she quickly often started
       ‘It often started quickly.’
    b. Ja bystro1 xoču [čtoby ona t1 zavodilas’].
       I quickly want that she started
       ‘I want it to start quickly.’
    c. *Ja bystro1 xoču [čtoby ona často t1 zavodilas’].
       I quickly want that she often started
       ‘I want it to often start quickly.’

2.2. Top-down problems
As mentioned above, in the top-down problems, we find a reflection of the non-local relation either on a goal that is more than one phase boundary lower than the probe or on a goal in cases where two relevant phase heads are present in the derivation at the time of the appropriate operation. Let us begin with Japanese Exceptional Case Marking (ECM) constructions. Hiraiwa (2001) argues that Japanese allows optional ECM across a CP clause boundary. This is illustrated by example (9) from Hiraiwa (2001, 72), where v of the matrix ECM verb can value the case of the argument within the embedded clause. According to Hiraiwa, the placement of the dative argument before the goal argument Mary indicates that Mary does not raise into the matrix clause. Although there is only one intervening phase head (i.e., C) between the appropriate probe and goal in (9), this derivation also violates the weak version of the Phase Impenetrability Condition because the presence of the probing phase head v at the time of case assignment is also relevant; see definition (3) again.

(9) John-ga [CP sono sigoto-ni1 Mary-ga/wo t1 muite-na-i to] omo-ta.
    John-NOM the job-DAT Mary-NOM/ACC suitable-NEG-PRS C think-PST
    ‘John felt that Mary is not suitable for the job.’

There are languages – e.g., Australian languages, Japanese or Korean – that allow more than one case on a single element. In the Kayardild example (10) from Evans (1995, 5), we find a typical case of multiple adnominal case marking, also known as ‘Suffixaufnahme’. Concretely, ‘brother’ bears four different cases that are ordered in accordance with the syntactic structure: the innermost genitive, which is assigned to ‘brother’ as a possessor of ‘net’; the instrumental, inherited from ‘net’; the modal ablative, which codes a certain type of tense; and the complementizing oblique case, which is assigned to all elements in the clausal complement of ‘know’.
Depending on the phase status of the particular XPs in (10), there can be a varying number of phase boundaries between the case-assigning matrix v and ‘brother’ (in the maximal case: DP ‘net’, PP ‘with net’, embedded vP, embedded CP). Even if we consider only vP and CP as phases, there is still at least one phase boundary (CP)4 between the case-assigning matrix verb and ‘brother’ plus the presence of the probing matrix phase head v itself, as in the problematic Japanese example above. To avoid the problem with non-local Agree and case assignment in languages where cases can ‘fall’ through the whole sentence, one would have to assume that there are no phases or that all appropriate elements always move from the complement of the phase head to the phase edge to escape spellout.

(10) Ngada mungurru, [maku-ntha yalawu-jarra-ntha yakuri-naa-ntha thabuju-karra-nguni-naa-ntha mijil-nguni-naa-nth].
     I know woman-C.OBL catch-PST-C.OBL fish-M.ABL-C.OBL brother-GEN-INS-M.ABL-C.OBL net-INS-M.ABL-C.OBL
     ‘I know that the woman caught the fish with brother’s net.’

The last top-down problem concerns Latin control constructions and depictive predicates. As demonstrated by example (11), taken from Cecchetto and Oniga (2004, 143), the depictive predicate in the embedded clause is marked with the same case as the controller in the matrix clause. Independently of whether or not the depictive predicate bonum ‘good’ is spelled out in the embedded vP phase, this example also poses a problem for the weak version of the Phase Impenetrability Condition because there is still one intervening phase boundary and the matrix phase head v, which assigns the accusative case.

(11) Ego iubeo te [PRO esse bonum].
     I order you.ACC to-be good.ACC
     ‘I order you to be good.’

Anaphors are referentially defective; therefore they should Agree in ϕ-features with their antecedent. Thus, in the Condition A example (4), not only must the piece of information about the presence of the anaphor appear in the matrix CP phase, but the piece of information about the ϕ-features of the antecedent must also get to the anaphor in the embedded vP phase. This means that the example also represents a top-down problem.5

4 Whether or not the embedded vP intervenes depends on the position of the adverbial instrumental phrase containing ‘brother’.
5 To be accurate, there are also approaches according to which long-distance anaphors are subject to discourse principles and not syntactic binding conditions.


2.3. Agree vs. Move
In this section, I show that the operations Agree and Move differ in their locality conditions. Specifically, Agree, in contrast to Move, is not subject to the Phase Impenetrability Condition. This is demonstrated by three typologically different languages. Let us begin with Tsez, a Northeastern Caucasian language spoken in Southern and Western Dagestan. In Tsez, the embedded absolutive argument can trigger long-distance agreement on the matrix verb (marked as class III), as shown by example (12) taken from Chandra (2007, 48). However, as argued by Polinsky and Potsdam (2001, 590) or Chandra (2007, 56), the absolutive argument cannot raise to the matrix clause.

(12) Eni-r [uz-a magalu b-ac-ru-li] b-iy-xo.
     mother-DAT [boy-ERG bread.III.ABS III-eat-PST.PRT.NMLZ] III-know.PRS
     ‘The mother knows that the boy ate the bread.’

The second example comes from English (see Bošković (2007, 15)). According to Bošković, coordination phrases are phases; therefore in example (13-a) movement of the first conjunct out of the coordination phase is ungrammatical. In contrast, when the first conjunct just Agrees with the verb, the sentence is grammatical, as shown in (13-b).6

(13) a. *A woman is and five men in the garden.
     b. There is a woman and five men in the garden.

Czech also shows differences between the operations Agree and Move. As illustrated by example (14-a), Agreement between the head T and the anaphor svého ‘self’ contained in the prepositional phrase is possible.7 Example (14-b) shows that Agreement between the head C and the wh-phrase kterého ‘which’ contained in the prepositional phrase is also possible. However, if kterého moves out of the prepositional phrase, the sentence becomes ungrammatical, as shown in example (14-c).

(14) a. Marie1 vyprávěla legrační historky o životě svého1 přítele.
        Marie told funny stories about life self friend
        ‘Marie told funny stories about her friend’s life.’
     b. Marie vyprávěla legrační historky o životě kterého přítele?
        Marie told funny stories about life which friend
        ‘About which friend’s life did Marie tell funny stories?’
     c. *Kterého1 Marie vyprávěla legrační historky o životě t1 přítele?
        which Marie told funny stories about life friend
        ‘About which friend’s life did Marie tell funny stories?’

6 See Bošković (ibid.) and references therein for other examples. According to Bošković (2005, 2007), Agree, in contrast to successive cyclic movement, is not subject to the Phase Impenetrability Condition because the Phase Impenetrability Condition effects follow from phonological considerations (similarly to Fox and Pesetsky (2005)). Thus, since the Phase Impenetrability Condition effect of phases is achieved via PF, phases and the Phase Impenetrability Condition as a syntactic locality condition can be eliminated. Even approaches that are strongly derivational and take every phrase to be a phase exempt Agree from the Phase Impenetrability Condition; see, e.g., Müller (2010).
7 As we saw above in connection with example (4), according to Reuland (2001) and Chomsky (2007, 2008) the head T mediates between the binder (subject) and the bindee (the anaphor).

To conclude section 2, we have seen that there are non-local relations in the syntactic computation that go against the weak version of the Phase Impenetrability Condition and that morphological reflections of the non-local relations can appear in both directions. We have also seen that there is a difference in locality behaviour between the operations Agree and Move.

3. Proposal
This section is concerned with the syntactic operations Select(ion), Agree and Move. I will show that the problematic non-local relations from the preceding section can be derived in a local fashion if the whole set information resulting from the operation Set-Merge is visible for syntactic operations. I will also show that c-selection behaves differently from s-selection, Agree and other long-distance relations with respect to the information resulting from the operation Set-Merge. To derive the difference in locality behaviour between the operations Agree and Move, I propose that for Agree only the set information on the sister syntactic object is relevant and that for Move – because it is a composed operation – the tree information with the Phase Impenetrability Condition is also relevant.

3.1. Set-Merge
In my analysis, I follow Chomsky’s Set-Merge proposal (Chomsky (1995a, 396–397) and Chomsky (2000, 133)).8 Chomsky proposes that the operation Set-Merge combines two elements and forms a new element with a label, which is identical to one of the original elements; see Chomsky’s formulation (1995a, 396–397) in (15-a).9 According to Chomsky (1995a, 397), the new object γ can be represented as the tree in (15-b). This, however, is only informal notation according to Chomsky.

(15) a. Set-Merge
        ‘Applied to two objects α and β, Merge forms the new object γ. [. . . ] γ must therefore at least (and we assume at most) be of the form {δ, {α, β}}, where δ identifies the relevant properties of γ; call δ the label of γ.’
        ‘. . . the label δ is either α or β; one or the other projects and is the head of γ. If α projects, then γ = {α, {α, β}}.’
     b. The new object (Chomsky (1995a, 397)):
        ‘Thus we might represent γ informally as:’
        α1
        ├─ α2
        └─ β

8 Chomsky (1995a, 396–397) uses just the term ‘Merge’; he does not differentiate between ‘Set-Merge’ and ‘Pair-Merge’ as Chomsky (2000, 133) does. For reasons of clarity, I use the term ‘Set-Merge’.

In this respect, I differ from Chomsky because I assume that trees belong to the syntactic derivation. Consequently, for the new syntactic object γ, I propose the following representation:

(16) {α, {α, β}}
     ├─ α
     └─ β

I have two reasons for this proposal. The first reason is that derivations are standardly treated as trees with sets of features. The second reason, which is more important, is that this proposal can account for the problematic non-local relations and the difference between the operations Agree and Move. Given this proposal, a phase – with the phase head γ and its complement {α, {α, β}} – is represented as (17).

(17) {γ, {γ, {α, {α, β}}}}
     ├─ γ
     └─ {α, {α, β}}
        ├─ α
        └─ β

Concerning the label of the resulting syntactic object, Chomsky (2005, 14; 2008, 141) assumes that the label of a syntactic object contains all the information relevant for further computations and that for syntactic operations only the label of the syntactic object is visible. However, this seems to be correct only for c-selection, as I will show in the following section.

9 This means that the operation Set-Merge is, in fact, composed of two operations: the set-constructing operation and the labelling operation (see also the discussion in Gärtner (2002, 64)).
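The set notation used for Set-Merge lends itself to a direct encoding. The following Python sketch is offered only as an illustration: the class Merged and the functions label_of, set_merge and set_info are hypothetical names, lexical items are reduced to plain strings, and nothing here is claimed to be part of the formal proposal itself.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Merged:
    """A syntactic object of the form {label, {left, right}}."""
    label: str
    left: "Node"
    right: "Node"


# A node is either a lexical item (a string) or a merged object.
Node = Union[str, Merged]


def label_of(node):
    """A lexical item is its own label; a merged object carries its label."""
    return node if isinstance(node, str) else node.label


def set_merge(a, b, projector):
    """Combine a and b; the label of the result is the label of the projector."""
    if projector not in (a, b):
        raise ValueError("the projecting element must be one of the merged objects")
    return Merged(label_of(projector), a, b)


def set_info(node):
    """Spell out the whole set information carried by a node."""
    if isinstance(node, str):
        return node
    return "{" + node.label + ", {" + set_info(node.left) + ", " + set_info(node.right) + "}}"


complement = set_merge("α", "β", projector="α")
phase = set_merge("γ", complement, projector="γ")
print(set_info(complement))  # {α, {α, β}}
print(set_info(phase))       # {γ, {γ, {α, {α, β}}}}
```

The sketch keeps the two ingredients of Set-Merge apart: set_merge builds the object and fixes its label, while set_info exposes the full nested information that a label-only view would hide.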


3.2. Selection
In this section, I show that c-selection behaves differently from s-selection. More concretely, the only-label visibility is not correct for semantic selection because the s-selecting element can ‘see’ more than just the label of its sister.

3.2.1. C-selection
First, let us look at c-selection. Example (18-a) shows that the preposition na ‘on’ selects a noun and (18-b) shows that it does not select a verbal category. Example (18-c) demonstrates that the preposition can be combined with the event of ‘wiping’ when it is categorially a noun, which means that the ungrammaticality of (18-b) does not lie in the semantic properties of the syntactic object utřít ‘to-wipe’ but in its categorial status. Then, (18-d) demonstrates that there is no problem when the verb utřít is combined with the noun stůl ‘table’. The crucial datum is (18-e), which shows that the preposition cannot select a noun non-locally, across the verb. Thus, c-selection behaves in accordance with Chomsky’s only-label visibility.

(18) a. [PP na [DP stůl]]    Czech
        on table
     b. *[PP na [VP utřít]]
        on to-wipe
     c. [PP na [DP utření]]
        on wiping
     d. [VP utřít [DP stůl]]
        to-wipe table
     e. *[PP na [VP utřít [DP stůl]]]
        on to-wipe table

In terms of my analysis, this means that the preposition na only cares about the label of the verbal phrase utřít stůl ‘to wipe table’ – i.e., only about α, as marked by the gray colour in (19) – and not about the whole set information (i.e., about the label of stůl).

(19) ·
     ├─ na γ
     └─ {α, {α, β}}      (gray: the label α)
        ├─ utřít α
        └─ stůl β

As demonstrated in (20-a), the verbal phrase utřít stůl can further project. Given the difference between the operations Set-Merge and Pair-Merge (see, e.g., Chomsky (2000)), when the adverb rychle ‘quickly’ is adjoined to it, the result is the following ordered pair: <γ, {α, {α, β}}>. Now, the question arises whether the label of the new syntactic object is the whole set {α, {α, β}}, as shown in (20-b), or just α, as shown in (20-c).

(20) a. rychle utřít stůl
        quickly to-wipe table
     b. ·
        ├─ δ
        └─ {{α, {α, β}}, <γ, {α, {α, β}}>}
           ├─ rychle γ
           └─ {α, {α, β}}
              ├─ utřít α
              └─ stůl β
     c. ·
        ├─ δ
        └─ {α, <γ, {α, {α, β}}>}
           ├─ rychle γ
           └─ {α, {α, β}}
              ├─ utřít α
              └─ stůl β

According to Chomsky’s (1995a) formulation in (15-a), the label of the new syntactic object is identical to one of the original objects, which could be interpreted as support for the choice of tree (20-b). However, according to Chomsky (2000, 133), the label of the new object should be the label of one of the original objects; consider Chomsky’s formulation in (21). Thus, (21) supports the choice of tree (20-c).

(21) ‘The constructed objects K, then, are of the form {γ, {α, β}} (substitution) or {γ, <α, β>} (adjunction), where γ is the label of K. [. . . ] On minimal assumption, the label γ should be the label of either α or β.’

Further support for the choice of tree (20-c) comes from the example below, which is a modified version of example (20-a). The example is ungrammatical; hence the label of the new syntactic object should be like (20-c), and not like (20-b). If (20-b) were the right choice, the preposition na (i.e., δ in (20-b)) could see and select β in the label of its sister.10

(22) *na rychle utřít stůl
     on quickly to-wipe table

10 Unless we specify how exactly the selection works in such cases, e.g., that the presence of α in the label blocks the selection of β.

3.2.2. S-selection
Let us now look at how s-selection behaves with respect to the only-label visibility. Select(ion) is standardly taken to be an operation under sisterhood. And we have seen that according to Chomsky (2005, 14) and Chomsky (2008, 141), only the label of the syntactic object is visible for syntactic operations. The following examples, however, demonstrate that the s-selecting element, in fact, sees more than just the label of its sister. Collins (2002) shows that there are cases with long-distance subcategorization, e.g., subjunctive constructions in English. In example (23), which is taken from Collins (2002, 53), the matrix predicate demand with the subcategorization frame [− M] requires a subjunctive mood, which is lower than the complementizer in the embedded clause. Thus, the s-selection operation must somehow be able to see into the CP projected by that.

(23) a. Bill demanded that John leave.
     b. demand   that   M
        [− M]

As shown in the Czech example below, the matrix verb přikázal ‘ordered’ selects a non-past tense in the embedded clause, which under the standard analysis is located in the head T. This means that in both (24-a) and (24-b), s-selection crosses the complementizer projection and that the s-selecting verb sees more than just the label of its sister.

(24) a. Pavel přikázal Jirkovi, že musí zazpívat písničku.
        Pavel.NOM ordered Jirka.DAT that must.PRS sing song.ACC
        ‘Pavel ordered Jirka to sing a song.’
     b. Pavel přikázal Jirkovi, že bude muset zazpívat písničku.
        Pavel.NOM ordered Jirka.DAT that will must sing song.ACC
        ‘Pavel ordered Jirka to sing a song.’
     c. *Pavel přikázal Jirkovi, že musel zazpívat písničku.
        Pavel.NOM ordered Jirka.DAT that must.PST sing song.ACC

To conclude this discussion, s-selection, in contrast to c-selection, goes against Chomsky’s assumption that only labels of syntactic objects are visible for syntactic operations.
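The contrast between the two kinds of selection can be sketched in a few lines of Python. This is a deliberately crude illustration, not an implementation of the analysis: syntactic objects are encoded as tuples (label, left, right), lexical items are reduced to category strings such as "D" or "M", and the function names label_of, set_information, c_selects and s_selects are hypothetical.

```python
def label_of(node):
    """A lexical item is its own label; a merged object stores it first."""
    return node if isinstance(node, str) else node[0]


def set_information(node):
    """Collect every label recorded anywhere in the set information of node."""
    if isinstance(node, str):
        return {node}
    label, left, right = node
    return {label} | set_information(left) | set_information(right)


def c_selects(required_category, sister):
    """C-selection: only the label of the sister is visible."""
    return label_of(sister) == required_category


def s_selects(required_feature, sister):
    """S-selection: the whole set information of the sister is visible."""
    return required_feature in set_information(sister)


DP = "D"                          # a noun phrase such as 'table'
VP = ("V", "V", DP)               # 'to wipe table': V projects
CP = ("C", "C", ("M", "M", "V"))  # 'that ... M ... leave': C projects

print(c_selects("D", DP))  # True:  the preposition finds a nominal label
print(c_selects("D", VP))  # False: the noun inside VP is not the label
print(s_selects("M", CP))  # True:  the mood head is visible below C
```

Note that s_selects("D", VP) would also return True on this encoding, which is exactly why c-selection has to be restricted to the label if the ungrammatical preposition-verb-noun sequence is to be excluded.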


3.3. Agree
In the preceding section, we saw that s-selecting elements see more than just the label of the sister node. And in section 2, we saw that there are non-local relations in derivations that go against the weak version of the Phase Impenetrability Condition and that in certain cases, a probe can see into spelled-out phases. Given these facts, I do not make the assumption that only labels are visible for syntactic operations. This is the crucial difference between the analysis proposed here and the one developed by Chomsky (2005; 2008). This means that generally the whole set information of syntactic objects is visible for syntactic operations. More concretely, in (25), probe δ can see the whole set information of its sister, as marked with the gray colour, which means that it sees the whole derivation. What is important is that given the two types of information – the tree information and the set information – we get a difference between syntactic objects themselves (e.g., node β in tree (25)) and the information about them, which is part of other syntactic objects ({α, {α, β}} or {γ, {γ, {α, {α, β}}}}).

(25) ·
     ├─ δ
     └─ {γ, {γ, {α, {α, β}}}}      (gray: the whole set information)
        ├─ γ
        └─ {α, {α, β}}
           ├─ α
           └─ β

Given the Phase Impenetrability Condition (no matter which version), when a phase (suppose that γ is the phase head in (26)) is spelled out, the complement of the phase head (i.e., {α, {α, β}}) becomes inaccessible to syntactic operations, as illustrated by the ellipse in (26). Consequently, we get a difference between the set information about syntactic objects on particular nodes and the presence of the syntactic objects in the derivation. Although syntactic objects in the complement of the phase head are inaccessible to syntactic operations (they have been spelled out), the information about them is present on the dominating node, as illustrated with the gray colour in (26).
Since non-complement nodes always stay in the derivation after spellout, the respective information can move up in the tree. Hence, probing elements (e.g., δ in (26) or some higher probe) merged later can see the derivation with relevant goals and can be valued.

(26) ·
     ├─ δ
     └─ {γ, {γ, {α, {α, β}}}}      (gray: the contained {α, {α, β}})
        ├─ γ
        └─ {α, {α, β}}             (ellipse: spelled-out complement)
           ├─ α
           └─ β
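The difference between the set information on a node and the presence of a syntactic object in the derivation can be sketched as follows. The Python fragment is only an illustration under stated assumptions: the class Node and the functions spell_out, agree and find_in_tree are hypothetical names, and the nested set information is flattened to a plain set of labels for brevity.

```python
class Node:
    """A tree node that also records the set information of what it dominates."""

    def __init__(self, label, left=None, right=None):
        self.label = label
        self.left = left
        self.right = right
        self.spelled_out = False
        # Set information: the node's own label plus everything recorded on
        # its daughters (flattened here; the nesting is not preserved).
        info = {label}
        for daughter in (left, right):
            if daughter is not None:
                info |= daughter.info
        self.info = frozenset(info)


def spell_out(node):
    """Mark a phase-head complement and everything inside it as spelled out."""
    node.spelled_out = True
    for daughter in (node.left, node.right):
        if daughter is not None:
            spell_out(daughter)


def agree(sister, feature):
    """Sister Agree: consult only the set information on the sister node."""
    return feature in sister.info


def find_in_tree(node, label):
    """Tree search of the kind Move would need: spelled-out nodes are invisible."""
    if node is None or node.spelled_out:
        return None
    if node.label == label:
        return node
    return find_in_tree(node.left, label) or find_in_tree(node.right, label)


# The configuration of the phase with head γ and complement {α, {α, β}}.
complement = Node("α", Node("α"), Node("β"))
phase = Node("γ", Node("γ"), complement)
spell_out(complement)

print(agree(phase, "β"))         # True: the information survives on the sister
print(find_in_tree(phase, "β"))  # None: the node itself is no longer accessible
```

A probe merged as the sister of the phase can thus still be valued by β through the set information, while an operation that has to locate the node β in the tree fails after spellout.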

Let us look at how the proposal works, e.g., in the Khwarshi example with long-distance Agreement (7-b), repeated here for convenience as (27-a). If the absolutive argument bataxu ‘bread’ stays in situ, then the operation Agree between it and the matrix v would have to cross two phase boundaries in Chomsky’s phase model, which is not possible because by the time the matrix v is merged, the absolutive object in the embedded clause is already spelled out and ‘forgotten’, given (both versions of) the Phase Impenetrability Condition.11 I make the standard assumption that lexical entries are triples of features {P, S, F}, where P indicates phonological features, S semantic features, and F formal features (see, e.g., Chomsky (1995a, 394)). Thus, in my analysis, the relevant parts of the derivation of (27-a) look like (27-b). Since the whole set information of syntactic objects is visible for syntactic operations, when the unvalued uninterpretable ϕ-features on the matrix v probe, they find the interpretable ϕ-features of bataxu on its sister, as illustrated in (27-b). Thus Agree can take place and the ϕ-features of the matrix verb can be valued, though bataxu (i.e., node {G5, S, P}) is already spelled out.

(27) a. Išet’u-l y-iq’-še goli uža bataxu y-acc-u.
        mother.OBL-LAT G5-know-PRS COP boy.ERG bread(G5) G5-eat-PST.PTCP
        ‘Mother knows that the boy ate bread.’
     b. Agree (feature valuation)
        ·
        ├─ G5-v
        └─ {V, {V, {C, {C, {T, {T, {v, {DP, {v, {v, {V, {V, {G5, S, P}}}}}}}}}}}}}
           ├─ know V
           └─ {C, {C, {T, {T, {v, {DP, {v, {v, {V, {V, {G5, S, P}}}}}}}}}}}
              ├─ C
              └─ {T, {T, {v, {DP, {v, {v, {V, {V, {G5, S, P}}}}}}}}}
                 ├─ T
                 └─ {v, {DP, {v, {v, {V, {V, {G5, S, P}}}}}}}
                    ├─ boy DP
                    └─ {v, {v, {V, {V, {G5, S, P}}}}}
                       ├─ v
                       └─ {V, {V, {G5, S, P}}}
                          ├─ bread {G5, S, P}
                          └─ V eat

11 Even if the object moves out of the vP-phase complement in the embedded clause, it does not help because there is still one phase boundary and the probing matrix phase head.

In the same way, through this sister Agree operation, cases are assigned to goals that are already spelled out. We already saw in section 2.2, which is concerned with the top-down problems, that reflections of the non-local Agree operation can appear on a goal that is more than one phase boundary lower than the probe. And in the Kayardild example (10), we saw that elements can get more than one case and that the particular cases can be assigned in different phases. Since the set information about the appropriate goal is visible on all dominating nodes (as in tree (27-b)), in every phase there can be a probe that Agrees with the goal’s feature(s) present in the set information on its sister.

It is usually assumed that the operation Agree replaced the earlier feature movement and covert movement; hence the question arises what the relation is between the sister Agree operation proposed here and feature movement or covert movement. In what follows, I argue that the operation Agree proposed here is neither feature movement nor covert movement.

Let us begin with feature movement. The feature movement operation adjoins the moved feature to the target head, which means that the target head changes. More specifically, if β adjoins to the target head α, then the new head is of the form {α, <β, α>}. However, such a change does not happen in the operation Agree proposed here. Secondly, according to Chomsky (1995b, 265) feature movement takes along all formal features of the appropriate element (goal), as stated in (28).

(28) Move F “carries along” FF[F].

The operation Agree proposed here (and the model generally) differs from Move F because all types of features – i.e., not only formal but also semantic and phonological features of the goal element – are carried along by the dominating nodes; consider (27-b) again. The third reason why the sister Agree operation is not like feature movement is that the latter operation, in contrast to the former, obeys restrictions on movement, e.g., the Adjunct Condition (see Takahashi (1997)).

The sister Agree operation proposed here is not like covert movement because covert movement, in contrast to the operation Agree, creates a new syntactic object. The covert movement operation is a type of the operation Move, and according to Chomsky (2004, 114) Move itself is composed of the operations Agree, Pied piping and Merge; consider the original formulation in (29).
Since the operation Merge – no matter whether Set-Merge or Pair-Merge – always creates a new syntactic object, covert movement necessarily differs from the operation Agree. (29) Therefore, Move = Agree + Pied piping + Merge. The second reason why the sister Agree operation is not like covert movement is that the covert movement operation, in contrast to the Agree operation, is also restricted by constraints on movement like Adjunct Condition (see, e.g., Pesetsky (2000)).
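The sister Agree operation discussed above can be illustrated with a small computational sketch. It is not part of the proposal itself: the encoding of set information as nested tuples and the function names `set_merge`, `visible_features` and `sister_agree` are assumptions made purely for illustration. The sketch rebuilds the set information of tree (27-b) and checks that a probe outside the embedded clause still finds the feature G5 on its sister, without any movement and without changing the probing head.

```python
from typing import Union

# Assumption of this sketch: set information is encoded as nested tuples
# (label, (left, right)); a lexical item is a frozenset of its features.
Info = Union[str, frozenset, tuple]


def set_merge(label: str, left: Info, right: Info) -> tuple:
    """Return the set information {label, {left, right}} of the new node."""
    return (label, (left, right))


def visible_features(info: Info) -> set:
    """Collect every feature found anywhere in the set information."""
    if isinstance(info, frozenset):
        return set(info)
    if isinstance(info, str):
        return set()  # a bare label contributes no features in this sketch
    found = set()
    for part in info:
        found |= visible_features(part)
    return found


def sister_agree(probe_feature: str, sister: Info) -> bool:
    """Agree succeeds if the probed feature is visible on the sister."""
    return probe_feature in visible_features(sister)


# Rebuild the set information of tree (27-b) bottom-up.
bread = frozenset({"G5", "S", "P"})
lexical_vp = set_merge("V", "V", bread)       # {V, {V, {G5, S, P}}}
v_bar = set_merge("v", "v", lexical_vp)
little_vp = set_merge("v", "DP", v_bar)       # the subject DP merges as a specifier
tp = set_merge("T", "T", little_vp)
cp = set_merge("C", "C", tp)
matrix_vp = set_merge("V", "V", cp)           # sister of the matrix probe G5-v

print(sister_agree("G5", matrix_vp))
```

Because the whole set information is searched, and not only the label, the goal feature remains visible however deeply the goal is embedded; this is the property that distinguishes the operation from feature movement and covert movement in the discussion above.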


3.4. Other long-distance relations

In section 2.1 we saw that certain Condition C relations pose a problem for Chomsky’s phase model with the Phase Impenetrability Condition because they are too non-local. I analyze the problematic data, e.g., example (5), repeated here as (30-a), in the same way as the Agreement data in the preceding section. More concretely, when the appropriate pronoun is merged into the structure, it probes and finds the information about the coindexed R-expression in the set information on its sister, even though the R-expression is already spelled out; consider the simplified tree (30-b). Given the c-command relation between them, the derivation violates Condition C and crashes. If it is the index on the pronoun that triggers the probing process (Chomsky (2008) does not say anything about which feature on the pronoun probes) and indices are assigned in the Numeration, then the pronoun already probes in its base position, as illustrated in (30-b). If (co)indexation is not allowed because of the Inclusiveness Condition (Chomsky (2001, 2-3)), it must be something else that triggers the probing process.12,13 In any case, whatever probes in Chomsky (2008, 141-145) can probe in my analysis as well, and given the total visibility here, it excludes example (30-a) as ungrammatical without coindexation as well.

(30) a. *Er1 sagte, dass Hans behauptete, dass Andreas1 klug ist.
        he said that Hans claimed that Andreas clever is
        ‘He said that Hans had claimed that Andreas was clever.’
     b. er1 {v, {v, {V, {V, {C, {C, {T, {T, {v, {Hans, {v, {v, {V, {V, {C, {C, {T, {T, {v, {v, {V, {V, {A, {Andreas1, A}}}}}}}}}}}}}}}}}}}}}}}}
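The Condition C reasoning for (30) can be made concrete in the same spirit. The following sketch is only an illustration under explicit assumptions: referential indices are taken to be available (which the text itself treats as one option among others), the index 2 on Hans is invented for the example, and the names `Nominal`, `project` and `finds_coindexed_r_expression` are hypothetical. The pronoun probes the set information on its sister and finds the coindexed R-expression even across two clause boundaries.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Nominal:
    """A nominal with a referential index (indices are an assumption here)."""
    form: str
    index: int
    r_expression: bool


# Set information: nested tuples (label, (left, right)) over labels and nominals.
Info = Union[str, Nominal, tuple]


def project(label: str, left: Info, right: Info) -> tuple:
    """Return the set information {label, {left, right}}."""
    return (label, (left, right))


def finds_coindexed_r_expression(pronoun: Nominal, sister: Info) -> bool:
    """Search the whole set information on the sister for a coindexed R-expression."""
    if isinstance(sister, Nominal):
        return sister.r_expression and sister.index == pronoun.index
    if isinstance(sister, tuple):
        return any(finds_coindexed_r_expression(pronoun, part) for part in sister)
    return False


andreas = Nominal("Andreas", 1, True)
hans = Nominal("Hans", 2, True)   # index 2 is invented for the illustration
er = Nominal("er", 1, False)

# Build the set information of (30-b) from the inside out.
info = project("A", andreas, "A")
for head in ["V", "v", "T", "C", "V", "v"]:
    info = project(head, head, info)
info = project("v", hans, info)   # Hans merges as a specifier
for head in ["T", "C", "V", "v"]:
    info = project(head, head, info)

print(finds_coindexed_r_expression(er, info))
```

A positive result corresponds to the configuration in which the derivation violates Condition C; a pronoun bearing a different index finds nothing and the derivation proceeds.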

Another interesting case is long-distance scrambling and its relation to Relativized Minimality effects. As already discussed in section 2.2, if an adverb crosses another adverb in long-distance scrambling, it induces a Relativized Minimality effect, but when the crossing happens in short scrambling, the sentence is grammatical. According to Shields (2007), the head and tail of the adverb chain must be in a minimal configuration, which means that the adverb moving across an intervening adverb cannot move beyond the projection immediately dominating the node to which it was adjoined, as illustrated in (31).

(31) [ZP *ADV1 [YP OK ADV1 [ADV2 Y [XP ADV1 X ...

12 It could be, e.g., the pronounness (the variable nature of pronouns); it cannot be ϕ-features because all ϕ-features of er are valued and interpretable.
13 But note that, e.g., in Chomsky (2001, 34), assigning an EPP feature to the phase head also violates the Inclusiveness Condition.


Then, the problematic example with long-distance scrambling (8-c), repeated for convenience as (32-a), might look like (32-b). If one treats the example derivationally in the phase-by-phase fashion and evaluates each derivational step independently, there is no problem with the derivation and the phase complements, as shown in (32-b). Thus, one needs simultaneous access to information created during different steps of the derivation.

(32) a. *Ja bystro1 xoču [čtoby ona často t1 zavodilas’].
        I quickly want that she often started
        ‘I want it to often start quickly.’
     b. [vP bystro1 [vP [VP xoču [CP bystro1 čtoby [TP ona [YP bystro1 [YP často Y [XP bystro1 X ...]]]]]]]]
        (VP and TP are phase complements; the step of bystro1 from XP to YP is short scrambling.)

The model proposed here is derivational-representational, and the history of the derivation (representation) is present on every node resulting from the operation Merge. We have also seen that the whole set information on particular nodes is visible for syntactic operations, not only the label. Thus, the piece of information about the too distant copies of the adverb bystro (in boldface) in example (32-a) is present on the matrix vP node, as illustrated in tree (33), and consequently the derivation crashes in this step.14

(33) [Tree: set information on the matrix vP node, including the copies of bystro]

14 If the condition on minimal configuration applies at the semantic interface, then the derivation crashes when the matrix CP phase is spelled out.


3.5. Move

We have already seen that, according to Chomsky (2004, 114), the operation Move is composed of the operations Agree, Pied piping and Merge; see also Chomsky (2000, 101) or Chomsky (2001, 10). In my analysis, the operation Move with its three components looks like (34). In the first step, Agree happens between formal features of the sisters. In the second step, the operation Pied piping associates the Agreeing goal feature (F in tree (34)) with other features of the syntactic object (i.e., set {F, S, P} in (34)). In the third step, the operation Merge (re)merges the appropriate element up in the tree.

(34) Tree, from top to bottom:
     {F, S, P}, remerged above δ                          (3. (Re)Merge)
     δ, sister of {γ, {γ, {α, {α, {F, S, P}}}}}           (1. Agree)
     γ, sister of {α, {α, {F, S, P}}}
     α, sister of {F, S, P}                               (2. Pied piping)

In section 2.3, I showed that the operations Move and Agree have different locality conditions. The reason for this is that the operations affect different syntactic objects, as is obvious from tree (34). The operation Agree only affects the sister syntactic object (and only its formal features). In contrast, the operation Move – as a composed operation – also affects the associated syntactic object. Suppose that in tree (35) γ is a phase head, the set {α, {α, {F, S, P}}} is the complement of the phase head and γP (i.e., the set {γ, {γ, {α, {α, {F, S, P}}}}}) is the phase. For simplicity, suppose further the strong version of the Phase Impenetrability Condition.15 The tree shows that the probe δ, which is outside the γP phase, sees features of the elements in the phase complement {α, {α, {F, S, P}}} in the set information on its sister and that it can Agree with the formal feature F. However, the elements in the phase complement cannot be moved because they are no longer present in the derivation; they were spelled out, as illustrated by the big ellipse in (35). To be more specific, the problem lies in the operation of Pied piping because the association of the Agreeing formal feature F with the syntactic object in the phase complement is not possible.

15 The same point can be made with the weak version of the Phase Impenetrability Condition as well, but the tree would have to be more complex.


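(35) Tree, from top to bottom:
     {F, S, P}, to be remerged above δ                    (3. (Re)Merge)
     δ, sister of {γ, {γ, {α, {α, {F, S, P}}}}}           (1. Agree)
     γ (phase head), sister of {α, {α, {F, S, P}}}        (phase complement, spelled out)
     α, sister of {F, S, P}                               (2. Pied piping)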
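The decomposition of Move in (34) and its failure in (35) can be sketched as a short program. This is a minimal illustration and not a definitive implementation of the model: the class `Node`, the flag `spelled_out` and the functions `agree`, `spell_out` and `move` are hypothetical names, and the only assumption built in is the one made in the text, namely that features of spelled-out material remain visible in the set information on dominating nodes while the spelled-out objects themselves can no longer be associated with the Agreeing feature.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    label: str
    features: frozenset = frozenset()
    children: List["Node"] = field(default_factory=list)
    spelled_out: bool = False

    def set_information(self) -> frozenset:
        """Features of this node and of everything it dominates.

        Spelled-out material still contributes: its features stay
        visible on the dominating nodes.
        """
        info = set(self.features)
        for child in self.children:
            info |= child.set_information()
        return frozenset(info)


def spell_out(node: Node) -> None:
    """Mark a phase complement and everything inside it as spelled out."""
    node.spelled_out = True
    for child in node.children:
        spell_out(child)


def agree(feature: str, sister: Node) -> bool:
    """Step 1: the probe finds the feature in the set information on its sister."""
    return feature in sister.set_information()


def move(probe: Node, feature: str, sister: Node, target: Node) -> Optional[Node]:
    """Move = Agree + Pied piping + (Re)Merge; returns None if a step fails."""
    if not agree(feature, sister):
        return None                      # step 1 fails
    if target.spelled_out or feature not in target.set_information():
        return None                      # step 2: Pied piping cannot associate F
    lower = Node(probe.label, children=[probe, sister])
    return Node(probe.label, children=[target, lower])   # step 3: (Re)Merge


# The configuration of tree (35): γ is a phase head, its complement is spelled out.
goal = Node("goal", frozenset({"F", "S", "P"}))
alpha_p = Node("α", children=[Node("α"), goal])
gamma_p = Node("γ", children=[Node("γ"), alpha_p])
spell_out(alpha_p)
delta = Node("δ")

print(agree("F", gamma_p))                          # Agree into the phase complement
print(move(delta, "F", gamma_p, goal) is None)      # the goal itself cannot move
print(move(delta, "F", gamma_p, gamma_p) is None)   # the whole phase γP can
```

The asymmetry between the last two calls mirrors the difference in locality between Agree and Move: Agree only consults the set information on the sister, whereas Move additionally needs a syntactic object that is still present in the derivation.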

The question arises whether in such cases movement can happen out of the sister node, i.e., whether the syntactic object {F, S, P} can move directly out of the sister of δ in (35). The answer is negative. According to Chomsky (2008, 158, note 17), the operation Move (copy theory) is Remerge. This means that movement of the syntactic object {F, S, P} should be Remerge of the same syntactic object, and not of a part of another syntactic object. I assume that there is a Syntactic Integrity Condition at work, which prohibits the operation Move from splitting syntactic objects. In other words, the Syntactic Integrity Condition states that only whole syntactic objects can be moved.16 This is parallel to Chomsky’s (2008, 138) No-Tampering Condition, according to which the operation Merge leaves the original syntactic objects unchanged. Specifically, the operation Merge cannot break up the original objects or add new features to them, which are dubbed the Extension Condition and the Inclusiveness Condition.

16 This subsumes the standard syntactic condition according to which only constituents can move.

To be more concrete and to show how the operation Move works in particular sentences, consider example (36). In (36-a)=(14-b), Agree between the head C and the wh-phrase kterého ‘which’ within the prepositional phrase is possible, analogous to the Agree operation in tree (35). However, subextraction of kterého out of the prepositional phrase is not possible, as shown by example (36-b)=(14-c). Abels (2003) argues that prepositional phrases in Russian and other Slavic languages are phases. If this also holds true for the prepositional phrase in example (36) and if the adverbial prepositional phrase is adjoined to vP, then given (both versions of) the Phase Impenetrability Condition, kterého is not accessible when the phase head C probes. Therefore the association (i.e., the operation of Pied piping) of the Agreeing wh-feature on the sister of C with the syntactic object kterého is not possible, analogous to the second step in tree (35).

(36) a. Marie vyprávěla legrační historky o životě kterého přítele?
        Marie told funny stories about life which friend
        ‘About which friend’s life did Marie tell funny stories?’


     b. *Kterého Marie vyprávěla legrační historky o životě t přítele?
        which Marie told funny stories about life friend

Now, the question arises what happens when the operation of Pied piping associates the Agreeing wh-feature on the sister of C with a node that dominates the syntactic object kterého. Given the model proposed here, this is theoretically possible because the wh-feature of kterého is visible on all dominating nodes. The following example, however, shows that subextraction of the whole DP kterého přítele ‘which friend’ from the prepositional phrase is ungrammatical. The syntactic object kterého přítele is also spelled out when the head C probes; hence the association (Pied piping) of the Agreeing wh-feature with it is also not possible.

(37) *Kterého přítele Marie vyprávěla legrační historky o životě t?
     which friend Marie told funny stories about life

The same also holds for the complement of the prepositional phase head. As shown by example (38), extraction of životě kterého přítele ‘which friend’s life’ out of the prepositional phase is ungrammatical as well. Although the probing wh-feature on the head C Agrees with the wh-feature on its sister, Pied piping of the prepositional complement again cannot happen because this syntactic object, too, was already spelled out.

(38) *Životě kterého přítele Marie vyprávěla legrační historky o t?
     life which friend Marie told funny stories about

Then, one expects that movement of a syntactic object that is not trapped in the phase complement is grammatical because the association of the Agreeing feature with the appropriate syntactic object is possible. This expectation is correct, as shown by the example below. The Agreeing wh-feature in the set information on the sister of C can be associated with the node o životě kterého přítele ‘about which friend’s life’ bearing the wh-feature, and consequently the whole prepositional phrase is moved.

(39) O životě kterého přítele Marie vyprávěla legrační historky t?
     about life which friend Marie told funny stories
     ‘About which friend’s life did Marie tell funny stories?’

To conclude this section, for movement to be possible, the operation of Pied piping must associate the Agreeing feature with a syntactic object that dominates the phase complement.

4. Conclusion

I have proposed a derivational-representational model that can account for the difference in locality behaviour between the operations Agree and Move and that can derive non-local relations that are problematic for Chomsky’s (2000, et seq.) phase model with the Phase Impenetrability Condition. The problematic non-local relations can be derived in a local fashion if the whole set information resulting from the operation Set-Merge is visible for syntactic operations. As for the difference between the operations Agree and Move, I have proposed that for Agree, only the set information on the sister syntactic object is relevant and that for Move, the tree information with the Phase Impenetrability Condition is relevant as well. The operation Move, though it is based on the operation Agree, does not affect all elements visible for Agree because some elements may have already been spelled out. I have also shown that c-selection behaves differently from s-selection, Agree and other long-distance relations with respect to the information given by Set-Merge. Whereas for c-selection, only the label in the set information on particular nodes is relevant, for the other relations, the whole set information on particular nodes is relevant.

Bibliography

Abels, Klaus (2003): Successive Cyclicity, Anti-locality, and Adposition Stranding. PhD thesis, University of Connecticut.
Bailyn, John F. (2007): A Derivational Approach to Microvariation in Slavic Binding. In: R. Compton, M. Goledzinowska and U. Savchenko, eds, Formal Approaches to Slavic Linguistics 15: The Toronto Meeting 2006. Ann Arbor: Michigan Slavic Publications, pp. 25–42.
Biskup, Petr (2011): Adverbials and the Phase Model. John Benjamins, Amsterdam/Philadelphia.
Bošković, Željko (2005): ‘On the Locality of Move and Agree’. In: University of Connecticut Occasional Papers in Linguistics 3.
Bošković, Željko (2007): Agree, Phases, and Intervention Effects. http://web.uconn.edu/boskovic/papers.html (Final version published in Linguistic Analysis 33, 54–96.)
Cecchetto, Carlo and Renato Oniga (2004): ‘A Challenge to Null Case Theory’, Linguistic Inquiry 35, 141–149.
Chandra, Pritha (2007): ‘Long-Distance Agreement in Tsez: A Reappraisal’, University of Maryland Working Papers in Linguistics 15, 47–72.
Chomsky, Noam (1995a): Bare Phrase Structure. In: G. Webelhuth, ed., Government and Binding Theory and the Minimalist Program. Blackwell, Oxford, pp. 383–439.
Chomsky, Noam (1995b): The Minimalist Program. MIT Press, Cambridge, Massachusetts.
Chomsky, Noam (2000): Minimalist Inquiries: The Framework. In: R. Martin, D. Michaels and J. Uriagereka, eds, Step by Step: Essays on Minimalist Syntax in Honor of Howard Lasnik. MIT Press, Cambridge, Massachusetts, pp. 89–155.
Chomsky, Noam (2001): Derivation by Phase. In: M. Kenstowicz, ed., Ken Hale. A Life in Language. MIT Press, Cambridge, Massachusetts, pp. 1–52.
Chomsky, Noam (2004): Beyond Explanatory Adequacy. In: A. Belletti, ed., Structures and Beyond. Oxford University Press, Oxford, pp. 104–131.
Chomsky, Noam (2005): ‘Three Factors in Language Design’, Linguistic Inquiry 36, 1–22.
Chomsky, Noam (2007): Approaching UG from Below. In: U. Sauerland and H. M. Gärtner, eds, Interfaces + Recursion = Language?. Mouton de Gruyter, Berlin, pp. 1–29.
Chomsky, Noam (2008): On Phases. In: R. Freidin, C. P. Otero and M. L. Zubizarreta, eds, Foundational Issues in Linguistic Theory. Essays in Honor of Jean-Roger Vergnaud. MIT Press, Cambridge, Massachusetts, pp. 133–166.
Collins, Chris (2002): Eliminating Labels. In: S. D. Epstein and T. D. Seely, eds, Derivation and Explanation in the Minimalist Program. Blackwell Publishers, pp. 42–64.
Evans, Nicholas (1995): A Grammar of Kayardild. With Comparative Notes on Tangkic. Mouton de Gruyter, Berlin.
Fox, Danny and David Pesetsky (2005): ‘Cyclic Linearization of Syntactic Structure’, Theoretical Linguistics 31, 1–45.
Gärtner, Hans-Martin (2002): Generalized Transformations and Beyond. Reflections on Minimalist Syntax. Akademie-Verlag, Berlin.
Hiraiwa, Ken (2001): ‘Multiple Agree and the Defective Intervention Constraint in Japanese’, MIT Working Papers in Linguistics 40, 67–80.
Khalilova, Zaira (2007): Clause Linkage: Coordination, Subordination and Cosubordination in Khwarshi. Ms., Universität Leipzig.
Müller, Gereon (2010): ‘On Deriving CED Effects from the PIC’, Linguistic Inquiry 41(1), 35–82.
Pesetsky, David (2000): Phrasal Movement and Its Kin. MIT Press, Cambridge, Massachusetts.
Polinsky, Maria and Eric Potsdam (2001): ‘Long-Distance Agreement and Topic in Tsez’, Natural Language & Linguistic Theory 19, 583–646.
Reuland, Eric J. (2001): ‘Primitives of Binding’, Linguistic Inquiry 32, 439–492.
Shields, Rebecca (2007): Derivation versus Representation: Evidence from Minimality Effects in Adverb Movement. In: LSO Working Papers in Linguistics 7: Proceedings of WIGL 2007. University of Wisconsin-Madison, pp. 161–176.
Sigurðsson, Halldór Á. (2004): Icelandic Non-Nominative Subjects: Facts and Implications. In: P. Bhaskararao and K. V. Subbarao, eds, Non-Nominative Subjects. Vol. 2. John Benjamins, Amsterdam/Philadelphia, pp. 137–161.
Takahashi, Daiko (1997): ‘Move-F and Null Operator Movement’, The Linguistic Review 14, 181–196.

Institut für Slavistik
Universität Leipzig

Marc Richards

Probing the Past: On Reconciling Long-Distance Agreement with the PIC*

Abstract

This brief paper first narrows down the definition of problematic long-distance agreement (LDA) in terms of the Phase Impenetrability Condition (PIC). It then reviews various prominent cases of apparent PIC-violating LDA in the recent literature and the kinds of local analyses that have been brought to bear on them, arguing that there are plenty of ways to reconcile these data with the PIC after all. It is finally shown that the most problematic case of LDA that remains under the current minimalist system is, ironically, that for which probe-goal Agree ‘at a distance’ was originally devised, namely English expletive-associate constructions, which involve potentially infinitely long-distance Agree across unlimited phase boundaries. Once feature inheritance is assumed, the more relaxed version of the PIC in Chomsky (2001) that would allow for these cases becomes unavailable. A tentative solution is ultimately sketched that exploits an independent distinction between the phonological (SM) and semantic (CI) interfaces in terms of different Transfer-triggers and conceptions of phases. The result is that defective phases emerge as transparent for Agree but not for movement, owing to their status as PF-only phases.

1. Preliminaries

The phenomenon of long-distance agreement (LDA) spanning clause boundaries poses a particularly acute and well-defined problem under the cyclic spell-out model of grammar put forward in Chomsky (2000; 2001; et seq.). Given the probe-goal approach to syntactic agreement, in which uninterpretable features (probes) seek a value located on a corresponding interpretable feature set (goal) within their c-command domain, the problem with LDA is no longer its ‘long-distance’ property. All Agree is now ‘LD’ in the relevant sense, with (1-b) the basic configuration, and the ‘local’ specifier-head relation in (1-a) only secondary, derived through an additional movement operation separate from Agree itself.

* For their insightful comments and questions, I would like to thank the anonymous reviewer, the audience at the DGfS workshop on Local Modelling of Non-Local Dependencies in Syntax, Universität Bamberg, February 2008, and the participants of the Colloquium on Syntax and Morphology, Universität Leipzig, November 2008.

Local Modelling of Non-Local Dependencies in Syntax, 135-154
Artemis Alexiadou, Tibor Kiss & Gereon Müller (eds.)
LINGUISTISCHE ARBEITEN 547, de Gruyter 2012

(1) a. ‘Local’: Several men seem to John [to be tDP in the garden].
    b. ‘Non-local’: There seem to John [to be several men in the garden].

Since movement no longer feeds agreement (feature-checking/valuation), the locality of agreement is defined not by checking relations (such as spec-head) but by independent constraints on probe-goal matching that are argued to increase computational efficiency in accordance with the Strong Minimalist Thesis (SMT). One such factor is minimal search, the idea that a probe seeks the closest goal within its complement domain, essentially yielding intervention and minimality effects. A further factor, and the one of primary concern to us here, is the absolute limit on the search space of a probe that derives from the workings of cyclic spell-out. The relevant locality domain here is the phase, a unit of the derivation that sets a limit on the range of syntactic operations through the periodic ‘forgetting’ of derivational information, as regulated by the Phase Impenetrability Condition (PIC):

(2) Phase Impenetrability Condition 1 (Chomsky (2000) version: PIC1)
    In phase α with head H, the domain of H is not accessible to operations outside α; only H and its edge are accessible to such operations.

Taking the phase heads to be C and (transitive) v, the PIC, first defined by Chomsky (2000) as in (2), has the effect that the complement of the phase head (e.g. v) is inaccessible from outside the phase (vP). This is illustrated in (3). Any Agree operation holding between T and V’s complement, or between v and an item inside an embedded CP, is non-local in the sense of exceeding the PIC and crossing a phase boundary. Such cases of agreement should therefore not exist. LDA, then, poses a problem in the current system precisely where it is non-local in this specific sense.

(3) PIC1 patterns of search space
    [CP C [TP T [v*P Subj [v*′ v* [VP V Comp]]]]]
    Search space available to C/T: down to and including v*
    PIC boundary (triggered by Merge-T): between v* and VP
    Search space available to v: VP (V and Comp)

Chomsky (2005, 9) claims such genuine cases of non-local (PIC-violating) LDA to be rare:


(4) “The only case I know of with agreement into a lower phase without intervention is experiencer constructions in which the subject is raised (voiding the intervention effect) and agreement holds with the nominative object in the lower phase (Icelandic).” (Chomsky (2005), 9)

An example of this configuration is given in (5), from Icelandic.

(5) Mér T þóttu tmér [þær vera duglegar]
    me.DAT thought-3PL they.NOM to-be industrious
    (Icelandic)

For such cases, Chomsky (2001, 12-14) proposes a modified version of the PIC relativized to phase heads (see (6)), so that the complement of the lower phase (here, vP) is only transferred to the interfaces upon merger of the next phase head (C), allowing T to agree with the complement of V as illustrated in (7).

(6) Phase Impenetrability Condition 2 (2001 version: PIC2)
    [Given structure [ZP Z ... [HP α [H YP]]], with H and Z the heads of phases]:
    The domain of H is not accessible to operations at ZP; only H and its edge are accessible to such operations.

(7) PIC2 patterns of search space
    [CP C [TP T [v*P Subj [v*′ v* [VP V Comp]]]]]
    Search space available to C: down to and including v*
    PIC boundary (triggered by Merge-C): between v* and VP
    Search space available to T/v*: VP (V and Comp)
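The difference between the search spaces in (3) and (7) can be stated as a small procedure. The sketch below is an illustration only, under simplifying assumptions that are not part of the original formulations: the clause is reduced to a spine of heads listed from top to bottom, phase edges (specifiers) are not modelled, and the function name `accessible` and its parameters are hypothetical.

```python
from typing import Sequence, Set


def accessible(spine: Sequence[str], phase_heads: Set[str],
               probe: str, goal: str, version: int) -> bool:
    """Can `probe` reach `goal`, lower on the spine, under PIC1 or PIC2?"""
    if version not in (1, 2):
        raise ValueError("version must be 1 or 2")
    i, j = spine.index(probe), spine.index(goal)
    if j <= i:
        raise ValueError("the goal must be lower than the probe")
    for h in range(i + 1, j):
        if spine[h] not in phase_heads:
            continue
        # The goal sits in the domain of the phase head spine[h].
        if version == 1:
            return False  # PIC1: the domain is closed to everything outside the phase
        # PIC2: the domain is closed only once a higher phase head has been
        # merged, i.e. if the probe or a head between it and spine[h] is one.
        if any(spine[z] in phase_heads for z in range(i, h)):
            return False
    return True


clause = ["C", "T", "v*", "V", "Comp"]
phases = {"C", "v*"}

for version in (1, 2):
    for probe in ("C", "T", "v*"):
        print(version, probe, accessible(clause, phases, probe, "Comp", version))
```

On this encoding, T reaches Comp under PIC2 but not under PIC1, whereas C reaches Comp under neither version, which is the contrast between (3) and (7). Adding an embedded clause to the spine shows, in the same way, that a matrix probe cannot reach an argument inside an embedded CP under either version.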

This ‘weakening’ of the PIC only allows for clause-internal LDA between T and V-comp; still excluded are any instances of Agree holding between a matrix probe and an argument contained inside an embedded clause (CP). However, contra the statement in (4), such cases are widely attested. Among the most commonly discussed examples of non-local agreement between matrix T/v and an embedded argument are the following:


(8) Itelmen (Bobaljik & Wurmbrand (2005))
    na @ntxa-Bum+nın kma jeBna-s
    he forget-1SG.OBJ=3.cl me meet-INF
    ‘He forgot to meet me.’ (Bobaljik & Wurmbrand (2005, 846 (36-b)))

(9) Chukchee (Stjepanović & Takahashi (2001), Bošković (2007))
    @nan q@lGilu l@N@rk@-nin-et iNqun ø-r@t@mN@v-nen-at qora-t
    he-inst regrets-3-PL that 3SG-lost-3-PL reindeer-PL.NOM
    ‘He regrets that he lost the reindeers.’ (Bošković (2007, 57 (2)))

(10) Blackfoot (Legate (2005), Bošković (2007))
     kits-íksstakk-a omá noxkówa [m-áxk-itáp-aapiksistaxsi kiistóyi omi pokón-i]
     2-want-3 my-son-3 3-might-toward-throw you ball-4
     ‘My son wants to throw the ball to/at you.’ (Bošković (2007, 57 (3)))

(11) Tsez (Polinsky & Potsdam (2001), Bobaljik & Wurmbrand (2005), Bošković (2007))
     Eni-r [už-ā magalu bāc ruëi] b-iyxo
     mother.DAT [boy.ERG bread.III.ABS III.ate] III-know
     ‘The mother knows the boy ate the bread.’ (Polinsky & Potsdam (2001, 584))

(12) Hindi (Boeckx (2004), Bhatt (2005))
     Vivek-ne [kitaab parh-nii] chaah-ii
     Vivek.ERG book.F read-INF.F want-PFV.F
     ‘Vivek wants to read the book.’ (Boeckx (2004, 25 (5)))

The examples in (8)-(12) certainly appear to violate the PIC, i.e., to be non-local in the sense of probing too deeply across too many phase boundaries, namely the embedded vP and, at least in the case of (9), CP. For Chomsky, these should be illegitimate cases of “searching into a phase already passed” (Chomsky (2005), 16) – the material on the complement side of these lower phase heads should be inaccessible to the matrix probe, since this material has already been transferred to the interfaces (and thus ‘forgotten’), by the PIC. The question from the minimalist perspective, then, is how to reconcile LDA with the PIC: how can a distant goal be made accessible to a probe in a higher phase? Or, from the probe’s perspective: how is it possible to probe the past?

The remainder of this brief paper is structured as follows. Firstly, section 2 sets out to show that all the cases of apparent LDA in (8)-(12) are just that – apparent, with plenty of viable local analyses being offered in the recent literature. After a glancing review of these alternative analyses in sections 2.1 and 2.2, the conclusion reached in section 2.3 is that there is no need to modify or enrich the basic minimalist machinery outlined above (probe-goal Agree and PIC) in order to accommodate such LDA, and that ‘fake’ LDA of this kind is a non-unified phenomenon, each case requiring its own analysis according to its own particular properties (cf. Potsdam & Polinsky (2012)). Section 3 then turns from these apparent cases of LDA to a case of genuine LDA. What emerges as true, cross-phasal LDA of the kind that poses real problems for the PIC and Agree is the very configuration that provided the original motivation and template for probe-goal Agree in the first place, as in (1-b) above (cf. Chomsky (2000)). I argue that recent developments in phase theory render the more ‘liberal’ PIC in (6) unformulable, thus depriving us of the search space pattern in (7) that is necessary for accommodating simple Agree between T and V-comp in English expletive constructions (cf. also Icelandic (5)). A tentative proposal is then made for how PIC1 (i.e., (2)/(3)) might be reconciled with LDA of the kind in (1-b), based on the notion of defective probes.

2. Localizing LDA

There is a general consensus in the literature that all the apparently non-local cases of Agree reviewed above (cf. (8)-(12)) are amenable to a local analysis which, in our present terms, respects the PIC; see, e.g., Butt (1995), Polinsky & Potsdam (2001), Boeckx (2004), Bhatt (2005), Bobaljik (2006), Preminger (2008; 2009). Two strategies are in principle available to us for localizing LDA. Either we take the probe and the goal to indeed be several phases apart, but localize the operation of agreement itself, so that Agree proceeds in several smaller, local, cyclic steps – call this ‘Strategy A’; or we demonstrate that the distance between the probe and the goal is not as long as it seems, with no intervening phase boundaries after all, so that the position of the goal is actually within the PIC-defined range of the matrix probe – call this ‘Strategy B’. It is fair to say that most existing analyses fall under one of these two approaches.1 Let us briefly review some representative analyses from each camp, starting first with Strategy B.2

1 There are exceptions, of course, foremost amongst which are those approaches that would simply abandon the phase-based locality of Agree in light of such evidence as (8)-(12); see, e.g., Stjepanović & Takahashi (2001), Bošković (2007). Agree may then simply apply non-locally, freely crossing phase boundaries. The PIC is still maintained as a constraint on Move, for which phonological reasons are usually given (Stjepanović & Takahashi (2001) appeal to pied-piping, which requires phonological content, removed at Spell-Out; Bošković (2007) invokes linearization demands, such that spelled-out items are frozen in place). This is arguably the least interesting approach, as it simply renounces the conceptually motivated PIC in light of the empirical evidence that Agree seems to violate it. (Furthermore, there is evidence that the PIC does constrain Agree as well as Move. Thus, the in-situ/Agree counterpart of superraising, as in (i), is no less unacceptable than the movement alternant (ii).
  (i) *There T are likely [CP that it [vP seems [to be several men in the garden]]]
  (ii) *Several men are likely that it seems to be in the garden
Since the usual analysis of superraising (in terms of improper movement) is not applicable to Agree, another explanation must be sought for (i). As argued in Richards (2008), defective intervention accounts are dubious, and the PIC is the most natural source of (ii), and therefore of (i) too, to which it readily extends.)

2 Another alternative to strategies A and B is to take the same tack as Chomsky in view of (5) above and relax the PIC yet further, such that (for example) a phase is only transferred at the next phase head but one. See Richards (2011) for an argument deflating this possibility (essentially, the second PIC, (6), is not a ‘weaker’ version of the first PIC, (2), but simply one of the two logical possibilities that emerge as a side-effect of where the nonphase head T belongs – either to C’s phase or to v’s. There are no further possibilities, and thus no further PICs).

2.1. Some Strategy (B) approaches

2.1.1. ‘Invisible’ movement to the phase edge (Polinsky & Potsdam (2001))

In their influential analysis of Tsez LDA (cf. (11)), Polinsky & Potsdam (2001) argue that this phenomenon involves topic absolutive goals which move (covertly) to a TopP position at the left edge of the finite embedded clause, from where local agreement with the matrix probe is possible (for Polinsky & Potsdam, this agreement obtains through head government; equivalently, here, Agree). The resultant LDA configuration at LF is given in (13). The complement of Top, here IP, is shaded out to indicate its inaccessibility to matrix-clause agreement (cf. PIC above).

(13) [VP [TopP DPabs [Top′ [IP ... tDP ...] Top]] V]
     (IP, the complement of Top, is shaded.)

This analysis finds compelling support in Polinsky & Potsdam’s observation that LDA is unavailable in precisely those environments where movement to the edge of the lower clause is blocked. This occurs either when a CP must be projected on top of TopP, such as in wh-clauses (14-a) and where an overt complementizer λin is present (14-b), or else when TopP is already occupied (overtly or covertly) by an overtly topic-marked item (a non-absolutive topic), marked -n(o) / -gon, as in (14-c).

(14) a. *Eni-r [nā c ohor-ā micxir bokāk ruëi] r-iyxo
        mother.DAT [where thief.ERG money.III.ABS III.stole] III.know
        ‘The mother knows where the thief stole the money.’
     b. *Eni-r [užā magalu bac’siλin] b-iyxo
        mother.DAT [boy.ERG bread.III.ABS III.ate.COMP] III.know
        ‘The mother knows that the boy ate the bread.’
     c. *Eni-r [aèā čanaqan-go-gon ziya bišr-er-xosi-łi] b-iyxo
        mother.DAT [shepherd.ERG hunter.TOP cow.III.ABS feed.CAUS] III-know
        ‘The mother knows that, as for the hunter, the shepherd made (him) feed the cow.’

In these cases, only class IV agreement is possible on the matrix verb, indicating agreement with the head of the CP complement itself. The lack of LDA in (14) is readily extensible to the phase system outlined in section 1: it is precisely in (14) that a C-head (and thus CP phase boundary) must be present, so that the covertly topicalized absolutive in spec-TopP is trapped inside the domain of this CP phase, and thus inaccessible to the matrix probe. Indeed, Bošković (2007) offers a reinterpretation of Polinsky & Potsdam’s analysis in just such phasal terms (one which, furthermore, removes the need to appeal to covert topicalization, which Bošković argues to be undesirable on several grounds). As he observes, all the cases in which LDA is blocked in Tsez are cases in which the complement clause must be analysed as a CP, on independent grounds (wh-movement to spec-CP in (14-a), the presence of an overt C head in (14-b), overt topic movement to the spec-CP left periphery zone in (14-c)).
Assuming that finite clauses do not have to be CPs but can be bare TPs (Bošković (1997), Bošković & Lasnik (2003)), Bošković points out that in all the other cases, i.e., those in which LDA may obtain, there is no independent evidence for a CP layer, and so these may be analysed as bare TPs. Under the standard assumption that CP is a phase and TP not, LDA in (11) is correctly ruled in by the PIC since no CP phase boundary is crossed, whereas LDA in (14) is correctly ruled out by the PIC because Agree is into the complement of C, a phase head. In this way, Tsez LDA in fact conforms to the PIC after all: it is not non-local in the sense of section 1. Structurally, Tsez LDA on this analysis resembles an ECM configuration (cf. know in English): agreement takes place into a TP rather than CP complement.3

3 Albeit a finite TP in this case. In this respect, Tsez LDA is perhaps rather more akin to hyperraising in Brazilian Portuguese, where Agree obtains into the finite TP complement of seem,


Marc Richards

Blackfoot (10) would also seem amenable to such an analysis (cf. the complementation possibilities of want in English).

2.1.2. LDA as a clause-union effect: restructuring (Boeckx (2004), Bhatt (2005), Bobaljik & Wurmbrand (2005))

Unlike the LDA of Tsez, Blackfoot and Chukchee in (9)-(11), LDA in Hindi (12) and Itelmen (8) involves nonfinite complement clauses only. Specifically, LDA is possible only into a certain subset of nonfinite clauses, namely those that are the complement of verbs such as want, try, forget – so-called restructuring verbs. As such, these cases of LDA are readily amenable to being analysed as clause-union effects, conditioned by the same factors that allow clitic-climbing in Italian Gianni lo ha voluto leggere vs. *Gianni lo ha deciso leggere (‘John it has wanted / decided to read’), namely a truncated structure. In this connection, Wurmbrand (2005) makes the suggestion that coherent (restructured) infinitives as in German (15-a) are functionally reduced or deficient – that is, they are bare VPs that lack the vP layer necessary for (a) introducing the external argument and (b) checking accusative on the embedded object. The embedded object therefore has to be Case-valued via Agree with a matrix probe, thus yielding long scrambling into the matrix clause. Such scrambling is barred where this functional structure is present (as with a verb like planen, (15-b)), since the object’s requirements are then taken care of downstairs.

(15) a. dass Hans [den Traktor]_i versucht hat [t_i zu reparieren]
that Hans the tractor tried has to repair
b. *dass Hans [den Traktor]_i geplant hat [t_i zu reparieren]
that Hans the tractor planned has to repair

Agree into a non-finite clause under restructuring has been claimed to be the basis of LDA in Itelmen (Bobaljik & Wurmbrand (2005)) and Hindi-Urdu (Boeckx (2004), Bhatt (2005)), an approach which has numerous advantages from the perspective of the PIC and probe-goal Agree. Firstly, restructuring solves at a single stroke two problematic properties of LDA: its apparent nonlocality (PIC-inaccessibility) and its violation of the Activity Condition (the idea that a goal becomes inactive for any further Agree operations once its Case feature has been valued; section 2.2 below). If restructuring infinitives (RIs) are bare VPs, then they lack both the CP and vP phase layers. Therefore, agreement into them from the matrix clause is not non-local (it crosses no phase boundaries, in conformance with the PIC), and the argument inside the RI cannot be licensed internally (there is no embedded probe), thus remaining active and visible for Agree with a matrix probe.

with raising to the matrix clause: O João parece [que t está doente], ‘John seems is sick’. See also Gallego & Uriagereka’s (2007a) analysis of Spanish ‘ECM’ as involving agreement into a defective CP complement clause.

A further problematic property of LDA is its optionality (cf. (16), (17)). Under the usual minimalist assumptions of economy of derivations (Last Resort), operations like Agree cannot be optional – if there are probe-goal valuation requirements to be satisfied, then they must be satisfied. However, since the selection of an RI (versus a non-RI) complement is itself optional, which in turn is a lexically specified property of the predicates in question, the optionality of LDA in (16) and (17) simply tracks the optionality of restructuring – wherever a non-RI complement is selected (i.e., full CPs/TPs/vPs), LDA with the embedded argument is barred by the PIC and the Activity Condition (see (18)). No optionality of syntactic operations (Agree, Move) is thus required.

(16) Itelmen
a. na əntxa-βum+nın kma jeβna-s (LDA)
he forget-1SG.OBJ=3.CL me meet-INF
b. na netxa-in kma jeβna-s (no LDA)
he forget-3SG.SUBJ(INTRANS) me meet.INF
‘He forgot to meet me.’ (Bobaljik & Wurmbrand (2005, 846 (36-b,c)))

(17) Hindi
a. Ram-ne [rotii khaa-nii] chaah-ii (LDA)
Ram.ERG bread.F eat-INF want-PFV.F
b. Ram-ne [rotii khaa-naa] chaah-aa (no LDA)
Ram.ERG bread.F eat-INF.M want-PFV.M
‘Ram wants to eat bread.’ (Bhatt (2005, 792 (57)))

(18) a. LDA (restructuring, RI complement; Boeckx (2004, 32))
[v [V [VP V Obj]]]
b. No LDA (no restructuring, non-RI complement)
*[v [V [vP (Subj) v [VP V Obj]]]] ← violation of PIC/Activity
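The contrast in (18) can be sketched computationally. The following Python fragment is an illustration only, under simplifying assumptions that are not part of the cited analyses: the names `Goal`, `build_complement` and `matrix_agree` are hypothetical, a clause is reduced to a list of projection labels, and the embedded v is assumed to value the object's Case whenever a vP layer is present.

```python
from dataclasses import dataclass
from typing import List, Optional

# Assumption of this sketch: these labels count as phases for the PIC.
PHASAL = {"vP", "CP"}


@dataclass
class Goal:
    phi: str                   # e.g. "F" (feminine), as on the object in Hindi (17)
    case_valued: bool = False  # a Case-valued goal is inactive (Activity Condition)


def build_complement(kind: str, obj: Goal) -> List[str]:
    """Return the projections separating the matrix probe from the object."""
    if kind == "RI":
        # Restructuring infinitive: bare VP, no embedded probe, Case stays unvalued.
        return ["VP"]
    if kind == "non-RI":
        # Full infinitive: the embedded v is present and values the object's Case.
        obj.case_valued = True
        return ["vP", "VP"]
    raise ValueError(f"unknown complement type: {kind}")


def matrix_agree(kind: str, obj: Goal) -> Optional[str]:
    """Return the phi-value copied onto the matrix probe, or None if Agree fails."""
    layers = build_complement(kind, obj)
    crosses_phase = any(label in PHASAL for label in layers)  # PIC
    if crosses_phase or obj.case_valued:                      # PIC or Activity
        return None
    return obj.phi


print(matrix_agree("RI", Goal("F")))      # LDA alternant, cf. (18-a)
print(matrix_agree("non-RI", Goal("F")))  # no LDA, cf. (18-b)
```

On this toy model the optionality of LDA is located entirely in the choice of the `kind` argument, which mirrors the claim that it tracks the lexical choice between RI and non-RI complements rather than an optional application of Agree.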

In sum, perfectly viable non-PIC-violating analyses already exist for at least three of the LDA types illustrated in (8)-(12): Itelmen (8) and Hindi-Urdu (12) are plausibly due to restructuring; and Tsez (11) and Blackfoot (10) fall into line if we postulate bare TP or Cdef complements (finite ‘ECM’, in-situ ‘hyperraising’, etc.). In all three cases – whether the complement is an RI (bare VP), a bare TP, or a defective C4 – we are dealing with non-phasal categories, and thus agreement is local in the sense of respecting the PIC. This leaves just Chukchee (9) as perhaps the only example of true, non-local (transphasal) LDA into a nondefective CP complement clause. However, even here there are convincing alternative analyses available. Thus Bobaljik (2006, 52, fn. 25) argues that the example in (9) has been misanalysed, and that it rather involves ‘proxy agreement’ with a null proleptic object located inside the matrix clause and coreferent with the embedded argument (the purported goal of LDA). This alternative gloss of (9) is given in (19).

(19) ənan qəlɣilu ləŋ-ərkə-nin-et iŋqun rətəmŋəv-nen-at qora-t
he.ERG sorry/pity/regret AUX-PAST-3SG:3PL because lose-3SG:3PL reindeer-PL
‘He feels sorry (for them), that he lost (them), the reindeer.’

In support of this parse, Bobaljik notes that (i) the choice of complementizer suggests that we are dealing not with a CP argument but rather with an adjunct clause (iŋqun is normally glossed as ‘because’/‘in order to’, not as declarative ‘that’); (ii) the light verb of emotion in the matrix clause normally takes a DP, not a CP complement; (iii) the word order of complement clauses is normally SOV, not VO as here; and (iv) the embedded subject should act as an intervener, thus blocking LDA. At the very least, then, the alternative parse in (19), in which all Agree relations are local and conform to the PIC, cannot be ruled out for Chukchee. None of the structures in (8)-(12), then, provide knockdown evidence against the PIC.

2.2. Strategy (A) approaches: Cyclic Agree

The alternative to Strategy B is to maintain the phase boundaries between probe and goal and to allow Agree itself to proceed incrementally, phase by phase, in a manner resembling successive-cyclic feature movement or percolation up the tree. Such approaches go by various names in the literature (Cyclic Agree, Agree-chaining, head-to-head Agree, indirect Agree, Bhatt’s (2005) AGREE, etc.; see Stjepanović & Takahashi (2001), Legate (2005), Boeckx (2007), Keine (2008) and others), but they all share the essential property of allowing the featural properties of the embedded goal to be transmitted up the tree via smaller, local Agree (or feature-movement) steps, whereby an embedded probe, once valued by the embedded goal, can itself then act as the goal for an Agree operation involving a higher probe (and so on).

4 See section 3 on the phasal status of defective probes/heads.


A sample derivation of an LDA structure proceeding in this way is given in (20) for Tsez (11), by way of illustration (further possible intermediate steps are omitted).

(20) Eni-r [užā magalu bac ruëi] b-iyxo
mother.DAT boy.ERG bread.III.ABS III.ate.COMP III.know
‘The mother knows that the boy ate the bread’
a. Step 1: Agree(embedded v1*, embedded Obj): v1*[uΦ] → v1*[III]
b. Step 2: Agree(matrix v2*, embedded v1*): v2*[uΦ] → v2*[III]
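The two steps in (20) amount to passing a value up a chain of heads, which can be made explicit in a few lines of Python. This is a minimal sketch under stated assumptions rather than an implementation of any of the cited proposals: `Head`, `agree` and `cyclic_agree` are hypothetical names, Φ is reduced to a single noun-class value, and the Activity Condition is deliberately ignored (as in Bhatt's (2005) AGREE).

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Head:
    name: str
    phi: Optional[str] = None  # None models an unvalued uΦ probe


def agree(probe: Head, goal: Head) -> None:
    """One local Agree step: copy the goal's phi-value onto an unvalued probe."""
    if goal.phi is None:
        raise ValueError(f"{goal.name} has no value to pass up")
    if probe.phi is None:
        probe.phi = goal.phi


def cyclic_agree(chain: List[Head]) -> Optional[str]:
    """Apply Agree bottom-up along the chain: each valued head is the next goal."""
    for lower, higher in zip(chain, chain[1:]):
        agree(higher, lower)
    return chain[-1].phi


obj = Head("embedded Obj", phi="III")  # class III absolutive, as in (20)
v1 = Head("embedded v1*")              # Step 1: v1*[uΦ] -> v1*[III]
v2 = Head("matrix v2*")                # Step 2: v2*[uΦ] -> v2*[III]
print(cyclic_agree([obj, v1, v2]))
```

Nothing in this sketch limits the length of the chain, which is one way of seeing why the mechanism needs independent constraints if it is not to overgenerate.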

Clearly, cyclic Agree of this kind is a powerful mechanism that essentially replaces all long-distance applications of probe-goal, i.e., Agree across phrases as well as across phases (indeed, Legate (2005) proposes it in the first instance as a way to allow clause-internal LDA of the English expletive-associate kind in (1-b), under the assumption that all kinds of v-head, and not just transitive ones, are phases – see section 3 below). The problem, then, is to constrain the mechanism sufficiently so that it does not overgenerate. For example, we saw in (14) that LDA in Tsez is blocked wherever a CP must be projected. Additional assumptions must therefore be made to prevent C from taking part in Cyclic Agree (e.g., C lacks the requisite Φ-probe, thus breaking the Agree-chain). Strategy-B approaches, however, can just appeal to the PIC here (which, of course, is what creates the very problem that Cyclic Agree is meant to solve). Similarly, the optionality of LDA that we saw in (16) and (17), and which is illustrated in the non-LDA alternant of (20) in (21), proves problematic under Cyclic Agree, since this optionality must lie in the optional application of (an intermediate step of) the Cyclic Agree operation, rather than in optional lexical choices as we saw for Strategy B in the previous section (i.e., selection of RI vs. non-RI complements). As noted above, optional operations are dubious from the minimalist perspective; however, any appeal to restructuring on a Strategy A approach, whilst of course possible, would render Cyclic Agree itself superfluous (i.e., there is no need for both). Thus Cyclic Agree faces an overgeneration problem here too, in that the LDA alternant should always be generated in favour of the non-LDA alternant.5

(21) Eni-r [už-ā magalu bāc ruëi] r-iyxo
Mother.DAT [boy.ERG bread.III.ABS III.ate].IV IV-know
‘The mother knows the boy ate the bread.’ (Polinsky & Potsdam (2001: (1-a)))

Cyclic Agree is also arguably over-powerful in the extent to which it flouts the Activity Condition (AC). According to the AC, once the embedded probe (e.g. v*) has been valued, it should be inactive for subsequent Agree with any matrix probe. Bhatt (2005) proposes an AGREE operation that underlies LDA and is defined precisely by this property: it is Agree without the AC. However, there are pervasive empirical reasons for wanting to maintain something like the AC. Allowing a probe to Agree more than once effectively removes the principal source of ungrammaticality from our system, namely Case Filter effects (i.e., a Case feature on a DP going unvalued and thus crashing at the interfaces – hence John is afraid *(of) Bill, *It seems John to be nice, etc.).6 Since the AC provides a fundamental account of such effects, I take it to be a desirable property of the system that we should try to maintain (contra, e.g., Nevins (2004); see Richards (2008) for fuller argumentation). Legate (2005) in fact seeks to maintain the AC, which she achieves through the copying-up of unvalued Case features from the DP goal, thus ensuring that the mediating probe stays active (by inheriting an active feature from the original goal). However, whilst this works for English passives/unaccusatives (since accusative Case is not valued by defective v, and therefore the Case feature on the DP remains active), it is unclear how this carries over to transitive clauses, since the Case of the object then is valued by v and so should be copied up as a valued, and thus inactive, feature.7 Further, the ability of defective v to act as a mediating probe for T is questionable since it cannot specify, and thus pass up, all of T’s features (it is defective, for example lacking Person). Thus a problem arises whether the mediating probe is defective or nondefective. If it is defective, then the AC can be observed by passing up an active Case feature, but the Φ-incompleteness of the probe will render full valuation of the higher probe’s features impossible, so that the latter should induce a crash. If, on the other hand, the mediating probe is nondefective, then the opposite scenario obtains: complete valuation of the higher probe is possible, but the AC will be violated since the copied-up Case feature is inactive.

In sum, compelling as they are in their attempt to provide a unified account for all the LDA phenomena reviewed in section 1, Strategy A approaches would seem to face a greater number of difficulties than do Strategy B ones. Whilst Strategy A accounts can no doubt be adapted to meet these concerns, it is perhaps telling that such add-ons are required at all in order to achieve what the various Strategy B accounts deliver for free, albeit at the cost of forgoing a unified account.

5 A further advantage of Strategy B over Strategy A approaches lies in a further property of LDA, namely that LDA alternants with restructuring predicates are associated with additional interpretive effects (wide-scope and/or specific readings of the LDA goal – see Bhatt (2005), Bobaljik & Wurmbrand (2005)). Whilst a full discussion of this property will not be attempted here, it would seem that Strategy B is better placed to give us a handle on these effects than Strategy A is, since Cyclic Agree approaches leave the embedded DP in situ, passing its features up via multiple agreement steps. Something extra must therefore be said to account for the wide-scope interpretation of these DPs, whereas this follows for free under those Strategy B approaches that move the embedded DP up to within range of the matrix probe (allowing it to be interpreted in the higher, matrix position). (Strategy A cannot appeal to the interpretation of the passed-up features (values) themselves in the matrix clause, since the features valued on probes are, by definition, uninterpretable.)

6 To be sure, the problem of violating the AC is not unique to Cyclic Agree. Rather, it is an inherent property of the LDA phenomenon itself, since it involves a single argument DP agreeing in two different clauses (cf. (8)-(12)). According to the AC, the embedded DP should be rendered inactive by agreement and case-valuation within the embedded clause (at least where no restructuring is involved; cf. section 2.1), and thus not be available for further Agree with a matrix probe. Such AC violations are a property of Multiple Agree structures generally, of which LDA is just one particular kind, and so this is a problem faced by strategies A and B alike. One way around this particular AC problem (i.e., endlessly active goals) exists wherever the embedded probe can be shown to be defective, i.e., not to involve a complete match between probe and goal. If Agree requires complete match (cf. Chomsky (2001)), then the goal can remain active for the matrix probe after Agree with the defective embedded probe (see Chomsky’s (2001, 14ff.) original analysis of Icelandic participial agreement, and Frampton et al. (2000), Carstens (2001) for related discussion). Embedded agreement in at least Hindi and Tsez is incomplete in the manner required (it involves only gender and/or number agreement on the lower probe, not person), thus potentially obviating the AC problem here. No matter how the problem of multiple-agreeing goals shared by strategies A and B is solved, there remains a key difference between the two approaches: namely, the AC problem is exacerbated by Strategy A, since Strategy A involves not only multiple-agreeing goals (DPs) but also multiple-agreeing probes. Since it is the latter (endlessly active probes) that undermines Case Filter effects, only the latter approach (Strategy A / Cyclic Agree) faces the problem discussed here in the main text.

7 As argued by Bobaljik & Wurmbrand (2005, 856), Case on the embedded object is already valued in the finite embedded clause in Tsez. See also Keine (2008) for arguments that Case is assigned to the embedded argument from within the embedded clause in LDA.

2.3. Section summary

The LDA configurations in (8)-(12), which might at first sight look like nonlocal instances of Agree (PIC-violations), are readily amenable to alternative, local analyses depending on the particular properties of the LDA phenomenon in question, as defined by the type of embedded clause involved. At least three different kinds of cross-clausal LDA can be identified in this way, each with its own analysis: (i) LDA into nonfinite TPs/VPs (as in Hindi-Urdu, Itelmen), which plausibly involves restructuring; (ii) LDA into finite TPs (e.g. Tsez, Algonquian), which plausibly instantiates a kind of ‘finite ECM’ configuration (or ‘Exceptional Agreement Marking’, as Boeckx (2007) puts it); and (iii) LDA into finite CPs, as in Chukchee, which is plausibly reanalysed in terms of proxy matrix agreement. LDA thus emerges as a non-unified phenomenon (cf. Polinsky & Potsdam (2001, fn. 9; 2012)). One might instead attempt a unified analysis in terms of Cyclic Agree (section 2.2), but there seems to be little justification for this – the aforementioned analyses in (i)-(iii) are independently available, and require none of the additional assumptions of Cyclic Agree. Instead, the basic patterns can be accommodated using just the existing phase-based system of probe-goal Agree, Activity and the PIC. Evidence for true, unambiguous transphasal Agree of the kind that would force us to question (much less abandon) the PIC is thus lacking. Chomsky’s claim in (4) may then be accurate after all, a position which is forcefully echoed in a similar quote of Bobaljik’s (2006, 29):

(22) “There are no clear cases in the literature of agreement reaching deeper into a finite clause than to the primary topic of that clause.” (Bobaljik (2006, 29))

3. Feature-inheritance, PIC, and ‘true’ LDA: irreconcilable differences?

In the previous section we argued that no reconciliation between LDA and the PIC is required, as LDA is already ‘local enough’ (in the sense of conforming to the PIC), and none of the apparent cases of LDA into embedded clauses are incompatible with a PIC-respecting analysis. The PIC is not out of the woods yet, however. Once we bring some of the more recent developments in phase theory into the picture, specifically feature-inheritance (Chomsky (2005; 2007), Richards (2007)), the simplest cases of in-situ Agree in English expletive-associate constructions suddenly become deeply problematic, and would seem to pose an even greater challenge for the PIC than the cases reviewed in section 2.

Feature-inheritance, in brief, is a consequence of Chomsky’s (2005; 2007) proposal that all uninterpretable features (probes) belong to the phase heads (the latter can then be uniformly defined as the locus of probes). In order to derive traditional subject-agreement effects, i.e., the association of agreement with nonphase heads such as T, Chomsky claims that the relevant Agree-features are inherited from the phase head (here, C) onto its complement (T). Richards (2007) provides a conceptual argument for this inheritance operation, namely that it follows from the PIC in conjunction with the idea that unvalued features must be transferred to the interface (and deleted) as soon as they are valued (for reasons of computational efficiency – see Epstein & Seely (2002), Chomsky (2007), and footnote 10 below). The PIC acts against the latter requirement, since it forces the features on a phase head to remain accessible into the following phase (and thus prevents their immediate transfer). These conflicting demands receive a simple resolution through precisely the mechanism that Chomsky independently proposed: inheritance of the relevant features from the phase head to its complement, the latter being what is transferred. This mechanism is schematized in (23).

(23) [CP C[Φ] [TP T[Φ] ... ]], with the PIC boundary falling between C and its complement TP

As noted in section 1, the original cases of agreement ‘at a distance’ for which the probe-goal model of agreement was first proposed in Chomsky (2000) include structures like (1-b) and (24-a). In (24-a), subject-verb agreement arises through an Agree relation holding between T and the in-situ DP inside VP, as a result of which the unvalued Φ-set on T and Case feature on DP are valued (cf. (24-b)).

(24) a. There T [vP arrived [VP tV [DP a man]]]
b. Agree(T, a man) → [ T[Φ,(EPP)] ... DP[Φ,Case] ]

The ‘long-distance’ Agree operation in (24-b) is perfectly legitimate if either of the following two assumptions is made: (i) passive/unaccusative v (defective v) is not a phase (or is just a ‘weak’ phase in the sense of Chomsky (2001)); or (ii) the ‘weaker’ version of the PIC in (6)-(7) above is adopted, such that transfer of phase PH1 is triggered only upon merger of the next phase head, PH2. However, neither of these assumptions is tenable. Assumption (i) fails on a view of successive-cyclic movement as proceeding only through phase edges (i.e., the punctuated paths of Abels (2003)). In light of evidence from reconstruction and variable binding presented by Legate (2003), Sauerland (2003) and others, exemplified in (25), such a view implies that defective phase heads like passive/unaccusative v must be ‘strong’ phase heads no less than nondefective phase heads like transitive v. Only by interpreting the copies left in the boldface trace positions at the left edge of vP can the relevant interpretations be licensed (such that he is bound by every organizer in (25-a) whilst simultaneously avoiding a Principle C violation between her and invited speaker, and such that matrix negation scopes over every child whilst the latter simultaneously binds his father in (25-b)). It follows that defective v must be a phase, so that its edge (spec-vP) is available as an intermediate landing site. (See also Richards (2004) for an argument from linearization to this same effect.)

(25) a. [At which conference where he_i mispronounced the invited speaker_j’s name]_k did every organizer_i’s embarrassment [vP t_k escape her_j t_k]?
b. Every child_i doesn’t [vP t_i seem to his_i father [t_i to be smart]] (¬ > ∀)


It is assumption (ii) that feature-inheritance renders impossible. In particular, the distinction between the two versions of the PIC in (2) and (6) becomes unformulable if T is a probe only derivatively, by inheritance (cf. (23)): T is now unable to act as a probe until C is merged (from which T inherits its Φ-probe), at which point v’s complement, VP, is transferred, thus preventing Agree between T and DP inside VP. In effect, only the PIC1 pattern of search space in (3) is compatible with feature inheritance. Consequently, (24-a) now becomes a case of genuine, PIC-violating LDA, a problem that is compounded by a further implication of feature-inheritance: every nonphase functional head requires the existence of a phase head to select it (Richards (2007; 2011)). Therefore, the defective T head that is present in raising and ECM infinitivals must itself be selected by a (defective) C head (see also Gallego & Uriagereka (2007a;b) on defective C and its role in ECM structures). This Cdef head, in turn, must also be a ‘strong’ phase (on a punctuated paths view), since the left edge of raising/ECM infinitivals provides an intermediate landing site, implying a phase edge (example based on Chomsky (2004, fn. 56)):

(26) John_i seems to her_j [CP1 t_i to appear to himself_i [CP2 t_i to like Mary_*j]]

Here, John must be interpreted in the boldfaced trace position in order to bind himself, since otherwise her (which must c-command Mary at the relevant level, hence the coreference restriction on Mary) would intervene. Again, this indicates the presence of a phase head at the edge of the non-finite clause – defective C.8 The unbounded nature of T-associate, expletive-raising constructions in English, as in (27), now implies that LDA must be possible across a potentially infinite number of Cdef and vdef phase heads, in stark defiance of the PIC.

(27) [There seem to me [to appear to John [to be believed by Bill [...] [to be several dogs in the garden]]]]

To resolve this tension between the PIC and infinitely long-distance agreement, we might adopt an approach to such ‘covert Agree’ relations in terms of Cyclic Agree, following Legate (2005) (cf. section 2.2). However, such an approach runs equally afoul of feature-inheritance. Cyclic Agree is mediated through phase heads (probes), whose valued uninterpretable features remain accessible to matrix probes by virtue of occupying the phase edge (consisting of the head plus specifiers of the phasal XP). Yet under feature inheritance, precisely those features are removed (‘inherited’) from the phase head to the transferred complement domain upon merger of the next head. That is, inheritance shifts the crucial features from the edge to the non-edge of the phase, so that they are no longer PIC-accessible to higher probes. Agree-chains with intermediate phase-edge links are thus no more compatible with feature-inheritance than PIC2 is.

8 Note, then, that there is no need to assume uniform (non-punctuated) paths in order to accommodate data like (26) – once we recognize the existence of defective C, then the relevant interpreted copy at the left edge of CP1 in (26) is actually in a phase edge after all, namely spec-Cdef, allowing us to maintain a ‘phase-edge-only’ punctuated view.

A possible solution to the problem of reconciling the PIC1 pattern of search space imposed by feature inheritance and the English expletive-LDA facts may become available if we recognize that two distinct conceptions of phases, each with a distinct contribution to make to the efficient satisfaction of interface conditions, are required under the SMT. As argued in Richards (2011), the lexical subarrays of Chomsky (2000) define the units that are actually transferred to the interfaces (the phasal domains), whilst Chomsky’s (2005; 2007) view of phases as the locus of uninterpretable features (uFs) provides the trigger for (and thus regulates the timing of) that Transfer operation, due to the aforesaid assumption that valued uFs must be immediately transferred and deleted. With these two conceptions of phases in place, a reasonable proposal is that wherever uFs are present on a phase head, that phase head will trigger Transfer to the semantic interface (‘LF’) upon their valuation; and whenever a lexical subarray is exhausted, Transfer to the phonological interface (‘PF’) is triggered. For the most part, these two triggers will coincide (and thus Transfer will occur simultaneously to both interfaces). However, in one specific case, a dissociation is predicted to occur, namely wherever a phase head lacks uFs (i.e., wherever we have a defective phase). Since here the LF-Transfer trigger is absent, only PF-Transfer is predicted to occur, on principled grounds (all operations must be triggered, and uF-less phase heads lack the LF-Transfer trigger).
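The dissociation between the two Transfer triggers can be stated very compactly. The Python sketch below is illustrative only and rests on assumptions of its own: `PhaseHead`, `transfer_targets` and `agree_accessible` are hypothetical names, each phase is reduced to a label plus a flag for whether its head bears uFs, and accessibility to Agree is modelled as depending solely on whether any crossed phase has undergone LF-Transfer (the consequence the text goes on to draw).

```python
from dataclasses import dataclass
from typing import Iterable, Set


@dataclass(frozen=True)
class PhaseHead:
    label: str
    has_uF: bool  # nondefective phase heads bear uninterpretable features


def transfer_targets(head: PhaseHead) -> Set[str]:
    """Every exhausted subarray goes to PF; only valued uFs trigger LF-Transfer."""
    targets = {"PF"}
    if head.has_uF:
        targets.add("LF")
    return targets


def agree_accessible(crossed: Iterable[PhaseHead]) -> bool:
    """Assumption of this sketch: only LF-transferred phases are opaque to Agree."""
    return all("LF" not in transfer_targets(h) for h in crossed)


c_def = PhaseHead("Cdef", has_uF=False)
v_def = PhaseHead("vdef", has_uF=False)
v_star = PhaseHead("v*", has_uF=True)

print(transfer_targets(v_def))                 # PF only
print(agree_accessible([c_def, v_def] * 5))    # a long chain of defective phases, cf. (27)
print(agree_accessible([c_def, v_star]))       # one nondefective phase blocks Agree
```

The design choice here is to derive opacity from the set of Transfer targets instead of stipulating it per phase type, which is the sense in which the strong/weak distinction is meant to follow from the presence or absence of uFs.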
Thus, whereas all lexical subarrays trigger PF-Transfer, irrespective of whether their phase head is defective, and thus define separate PF-units, only those containing a nondefective phase head will trigger LF-Transfer too. The consequence of such a system is that Φ-less phases (defective v and defective C, which lack Agree-probes) transfer only to PF.9 Thus defective phases still act as separate linearization domains – see Richards (2004) – and constrain overt movement, yielding successive-cyclic movement through their edges, as in (25)-(26). At the same time, however, material contained inside these phases remains accessible to higher phase heads/probes (due to the lack of LF-Transfer), thus yielding the requisite extended pattern of search space (PIC2/(7)), at least for defective phases.10 In this way, defective phases act as ‘strong’ phases for the purposes of triggering successive-cyclic movement, but as ‘weak’ phases in not counting for the purposes of the PIC (i.e., in not constraining Agree). If this reasoning is correct, then we derive a principled distinction between phase types (PF-only phases versus combined PF+LF-phases) that lends some substance to Chomsky’s (2001) seemingly superfluous distinction between strong and weak phases (the latter not triggering spell-out at all for Chomsky (2001), whereas under the present proposal they trigger spell-out just to PF). It also derives something akin to the ‘non-simultaneous’ phases of Marušić (2005) and others, whilst adhering closely to Chomsky’s (2000; 2001; 2005; 2007) conception of phases. Finally, it offers a connection with those approaches to LDA that suspend the PIC as a constraint on Agree and retain it as a condition on movement alone (Stjepanović & Takahashi (2001), Bošković (2007) – see footnote 1). This is precisely the effect that we have now derived for defective phases. However, in the case of nondefective phases, the PIC remains a constraint on Agree and Move alike on the present approach (and contra the aforesaid works), as seems desirable.

In sum, the phase-cyclic behaviour of the defective v and C heads in (25)-(26) is reconciled with their ‘transparency’ for non-local agreement across phase boundaries in (27). True LDA is possible across endless defective phases, despite their status as separate PF-units (or rather, perhaps, because of it – defective phases are PF-only phases, hence their transparency for Agree). This would appear to be the correct empirical result.11

9 A possible complication here is that defective phases in Chomsky’s system (cf. Chomsky (2001)), at least defective v, are not assumed to lack all Φ-features but just to lack Person. In our current terms, this would translate to Number and Person both being PF-Transfer (and inheritance) triggers, but only Person being an LF-Transfer (and inheritance) trigger. See Boeckx (2006) for related observations on the absence of Person-agreement from the kind of LDA structures at issue here, an effect which might now be derived in the manner just suggested if only Person valuation triggers LF-Transfer (in addition to PF-Transfer), so that only Person-Agree would be subject to the PIC. I leave exploration and motivation of this possibility to future research, and abstract away from it in the remaining few paragraphs.

10 Or conversely, the PIC2 pattern can now be viewed as the basic, default pattern, with PIC1 only forced where uFs are present (i.e., on the phase head).

11 A possible objection to the proposal put forward here is that the lack of LF-Transfer of defective phases, as in (27), constitutes a severe weakening of the major conceptual argument for phases, namely their reduction of computational complexity through the periodic curtailing of search space. However, the system proposed above is still premised on a central computational consideration, namely the need to immediately transfer valued uninterpretable features in order to avoid the ‘distinguishability problem’ of Chomsky (2001), Epstein & Seely (2002) – namely, the problem of how to distinguish uninterpretable features which acquire a value in the course of the derivation and which need deleting (for Full Interpretation) from interpretable ones, which enter the derivation already valued and must not be deleted. Cyclic Transfer still eases computation in the proposed system, then, since without immediate, cyclic transfer the system would have to ‘lookback’ (to see which features entered the derivation unvalued) and/or delay transfer, both of which increase the amount of material that the system has to keep track of. Where no uFs are present on a phase head in the first place (i.e., defective phases), no such increase in operative memory is at stake. Therefore, it seems perfectly reasonable for an optimally designed system to operate in the manner proposed above, transferring a phase to LF as soon as the uFs on its head are valued (in order to facilitate the computation as just described) but not transferring a phase to LF if there are no such uFs present. Thus, as long as no uFs are encountered, no Transfer to the interface is triggered, in accordance with SMT, since there are no computational gains of the relevant kind to be had. (Reduction of search space, on this approach, is then at most a secondary

Probing the Past: On Reconciling Long-Distance Agreement with the PIC

153

Bibliography Abels, Klaus (2003): Successive Cyclicity, Anti-locality, and Adposition Stranding. PhD dissertation, University of Connecticut. Bhatt, Rajesh (2005): ‘Long Distance Agreement in Hindi-Urdu’, Natural Language and Linguistic Theory 23, 757-807. Bobaljik, Jonathan (2006): Where’s Φ? Agreement as a Post-Syntactic Phenomenon. In: D. Harbour, D. Adger and S. B´ejar, eds, (2008) Phi-Theory: Phi Features Across Interfaces and Modules. Oxford University Press, Oxford, pp. 295-328. Bobaljik, Jonathan and Susanne Wurmbrand (2005): ‘The Domain of Agreement’, Natural Language and Linguistic Theory 23, 809-865. Boeckx, Cedric (2004): ‘Long-Distance Agreement in Hindi: Some Theoretical Implications’, Studia Linguistica 58, 23-36. Boeckx, Cedric (2006): The Syntax of Argument Dependencies. Paper presented at the Linguistics Colloquium. University of Leipzig, November 2006. Boeckx, Cedric (2007): Isolating Agree. Paper presented at Workshop on Morphology and Argument Encoding. Harvard University, September 2007. ˇ Boˇskovi´c, Zel’ko (2007): ‘Agree, Phases, and Intervention Effects’, Linguistic Analysis 33, 54-96. ˇ Boˇskovi´c, Zel’ko and Howard Lasnik (2003): ‘On the Distribution of Null Complementizers’, Linguistic Inquiry 34, 527-46. Butt, Miriam (1995 ): The Structure of Complex Predicates in Urdu. Dissertations in Linguistics, CSLI Publications, Stanford. Carstens, Vicki (2001): ‘Multiple Agreement and Case Deletion: Against Phi-Completeness’, Syntax 4, 147-63. Chomsky, Noam (2000): Minimalist Inquiries: The Framework. In: R. Martin, D. Michaels and J. Uriagereka, eds, Step by Step: Essays on Minimalist Syntax in Honor of Howard Lasnik. MIT Press, Cambridge, Massachusetts, pp. 89–155. Chomsky, Noam (2001): Derivation by Phase. In: M. Kenstowicz, ed., Ken Hale. A Life in Language. MIT Press, Cambridge, Massachusetts, pp. 1–52. Chomsky, Noam (2005): On Phases. Ms., MIT, Cambridge, Mass. Published as Chomsky (2008). 
Chomsky, Noam (2007): Approaching UG from Below In: H.-M. G¨artner and U. Sauerland, eds, (2007) Interfaces + Recursion = Language? Chomsky’s Minimalism and the View from SyntaxSemantics. Mouton de Gruyter, Berlin, pp. 1-30. Chomsky, Noam (2008): On Phases. In: R. Freidin, C. Otero and M.-L. Zubizarreta, eds, Foundational Issues in Linguistic Theory. Essays in Honor of Jean-Roger Vergnaud. MIT Press, Cambridge, Massachusetts, pp. 133–166. Epstein, Samuel David and T. Daniel Seely (2002): Rule Applications as Cycles in a Level-free Syntax. In: S. D. Epstein and T. D. Seely, eds, Derivation and Explanation in the Minimalist Program. Blackwell, Oxford, pp. 65-89. Frampton, John, Sam Gutmann, Julie Legate and Charles Yang (2000): Remarks on “Derivation by Phase”: Feature Valuation, Agreement, and Intervention. Ms., Northeastern University and MIT. ´ Gallego, Angel and Juan Uriagereka (2007a): Defective C. Paper presented at Alternatives to Cartography, Brussels, June 2007. ´ Gallego, Angel and Juan Uriagereka (2007b): Successive Cyclicity, Phases, and CED Effects. Ms. Universitat Aut`onoma de Barcelona. Keine, Stefan (2008): Long-Distance Agreement und zyklisches Agree. Colloquium talk, University of Leipzig, January 2008. Lahne, Antje (2008): Local Modelling of Long-Distance Agreement. Ms., Universit¨at Leipzig. Legate, Julie (2003): Some Interface Properties of the Phase. Linguistic Inquiry 34, 506-16. concern, a useful side-effect of cyclic spell-out but not its primary computational or conceptual motivation.)


Marc Richards

Legate, Julie (2005): Phases and Cyclic Agreement. In: M. McGinnis and N. Richards, eds, Perspectives on Phases. MITWPL 49, pp. 147-156.
Marušič, Franc (2005): On Non-Simultaneous Phases. PhD dissertation, Stony Brook University.
Nevins, Andrew (2004): Derivations without the Activity Condition. In: M. McGinnis and N. Richards, eds, Perspectives on Phases. MITWPL 49, pp. 287-310.
Polinsky, Maria and Eric Potsdam (2001): ‘Long-Distance Agreement and Topic in Tsez’, Natural Language and Linguistic Theory 19, 583-646.
Potsdam, Eric and Maria Polinsky (2012): ‘Backward Raising’, Syntax 15, 75-108.
Preminger, Omer (2008): Agreement, its Reach, and its Failures: The Case of Dialectal Basque. Ms., MIT.
Preminger, Omer (2009): ‘Breaking Agreements: Distinguishing Agreement and Clitic Doubling by Their Failures’, Linguistic Inquiry 40, 619-666.
Richards, Marc (2004): Object Shift and Scrambling in North and West Germanic: A Case Study in Symmetrical Syntax. PhD dissertation, University of Cambridge.
Richards, Marc (2007): ‘On Feature-Inheritance: An Argument from the Phase Impenetrability Condition’, Linguistic Inquiry 38, 563-572.
Richards, Marc (2008): Quirky Expletives. In: R. d’Alessandro, G.H. Hrafnbjargarson and S. Fischer, eds, Agreement Restrictions. Mouton de Gruyter, Berlin, pp. 181-213.
Richards, Marc (2011): ‘Deriving the Edge: What’s in a Phase?’, Syntax 14, 74–96.
Stjepanović, Sandra and Shoichi Takahashi (2001): Eliminating the Phase Impenetrability Condition. Ms., Kanda University of International Studies.
Wurmbrand, Susanne (2001): Infinitives. Restructuring and Clause Structure. de Gruyter, Berlin.

Institut für Linguistik, Goethe Universität Frankfurt am Main

Tibor Kiss

Reflexivity and Dependency*

Abstract

The introduction of exempt reflexives in Pollard and Sag (1992; 1994) and Reinhart and Reuland (1993) has led to a new characterization of ‘picture-NP-reflexives’, which are no longer considered anaphoric. These analyses, however, no longer provide a clear concept of the term ‘anaphor’, and cannot account for the existence of picture-NP-reflexives in languages without exempt reflexives. Focusing on object experiencer psych verbs in English, German, and Portuguese, we will propose a syntactic theory of anaphors. By ‘syntactic’ we mean that anaphors are not only resolved, but also introduced in syntax. Showing a reflexive form does not per se guarantee that a pronoun will become anaphoric; in addition, a local (or not too local) syntactic context is required. If such a context is not provided, the reflexive may become exempt, but once an anaphoric dependency has been introduced, it has to be resolved in syntax as well. The analysis will be applied to medium- and long-distance reflexives and to reflexive binding in double object and impersonal constructions.

1. Introduction

The analyses of Pollard and Sag (1992; 1994) and Reinhart and Reuland (1993) explain why certain reflexive pronouns – so-called picture-NP-reflexives – behave anomalously in allowing co-reference with non-binding antecedents. These reflexives are exempt from Binding Theory’s Principle A because Principle A only requires binding if a potential antecedent is available in a given local domain. The reflexive himself in (1) does not find a local antecedent in the domain of the predicate picture and hence becomes exempt from an application of Binding Theory. Just like the co-indexation of the pronoun him with the subject John, the co-indexation of the reflexive is not an instance of binding, but an indication of co-reference.

* This paper goes back to a series of talks I presented during early summer and winter 2004 in Mannheim (GGS 2004), Köln (Universität zu Köln), Seoul (LSK 2004), Leuven (HPSG 2004), and Leipzig (Universität Leipzig). For largely irrational reasons, I was unable to turn the talks into a paper, but Gereon Müller insisted on its production on a regular basis. So without Gereon, the paper would never have been written. I am deeply grateful that he proved his obstinacy on me, and also for his comments on an earlier version of this paper. I would also like to thank the audiences at the talks in 2004 for their comments and suggestions, Silke Fischer for helping me to understand her analysis, and in particular Ana Luis for discussing the Portuguese data with me.

Local Modelling of Non-Local Dependencies in Syntax, 155-185. Artemis Alexiadou, Tibor Kiss & Gereon Müller (eds.), Linguistische Arbeiten 547, de Gruyter 2012


(1) John1 believed that pictures of himself1/him1 were on sale.

With the problem of non-complementarity of reflexives and pronouns eliminated, it even looked as if Binding Theory as a research topic was coming to an end. On closer scrutiny, however, it turns out that innovative answers lead to new problems. To begin with, picture-NP-reflexives occur in languages that do not show independent justification for the existence of exempt reflexives. A case in point is German, as can be illustrated by the ungrammaticality of the translation of (1) in (2).

(2) Hans1 glaubte, dass Bilder von *sich1/ihm1 zum Verkauf standen.

There is no ‘non-complementary distribution of anaphors and pronouns’ in examples like (2). Yet German allows intrasentential binding with picture-NP-reflexives, as is illustrated in (3).

(3) a. Warum hat Claude Cahun1 die Bilder von sich1 zurückgehalten?
       why has Claude Cahun the pictures of herself withheld
       ‘Why has Claude Cahun withheld the pictures of herself?’
    b. Wenn Sie1 im Munzinger-Archiv einen Artikel über sich1 finden, dann ist Ihnen1 dieser vor dem Erscheinen zur Kontrolle vorgelegt worden.
       if you in Munzinger archive an article about yourself find then is you this before the appearance for examination propounded was
       ‘If you find an article about yourself in the Munzinger archive, it will have been submitted to you for examination before publication.’
    c. Verständlich, dass er1 keine konfusen Berichte über sich1 lesen mag.
       it-stands-to-reason that he no confused reports about himself read like
       ‘It stands to reason that he does not like to read confused articles about himself.’

As will be illustrated below, German fails every test for exempt reflexives, and yet allows picture-NP-reflexives. If picture-NP-reflexives exist in certain languages where exempt reflexives do not, severe doubt is cast on an analysis of the former in terms of the latter.1

1 This criticism does not only apply to the analysis of Pollard and Sag (1992; 1994), where the concept of an exempt reflexive follows directly from their Principle A (cf. definition (24) below), but also to the analysis of Reinhart and Reuland (1993). Reinhart and Reuland introduce the concept of a syntactic predicate, and exclude nominal heads that are not deverbal from this concept. For further discussion of Reinhart and Reuland’s analysis, cf. section 3.


What is more, we are faced with a conceptual problem. The co-indexation in (1) is not an instance of binding, but of co-reference. In this respect, the proposals differ sharply from earlier analyses, in which anaphors were classified as referentially deficient and thus in need of a binding antecedent to receive an interpretation. These earlier proposals tacitly assume that being an anaphor (or not being an anaphor) is basically a lexical property. With exemptness pertinent in (1), it can no longer be maintained that anaphors are referentially deficient. The present paper tries to solve this problem by proposing a syntactic theory of anaphoric dependency.2 It assumes that being a reflexive pronoun is indeed a lexical property. Anaphoric dependencies, however, are not only resolved but also introduced in syntax. In a nutshell, the syntactic context may turn certain pronouns into dependent elements. Most syntactic theories assume – implicitly or explicitly – a closure on certain local domains, so that the local domain must not contain any open dependencies. Grammars do not derive sentences with missing arguments (unless they can be inferred and hence syntactically derived from the context) and by the same line of reasoning, they do not derive sentences that contain open referential dependencies, which will explain why unbound anaphors are excluded, while exempt reflexives are not.3 The present analysis will focus on reflexive pronouns in subjects of object experiencer psych verbs such as to worry, to annoy, or to make one’s day. Reflexive binding in object experiencer psych verbs (OE psych verbs for short) has been a benchmark for theories of OE psych verbs since their inception in Belletti and Rizzi (1988). We will assume that what looks like anaphoric binding into the subject of an OE psych verb is in fact another case of an exempt reflexive being co-indexed, yet not bound.
If reflexive binding into the subject of an OE psych verb is in fact a case of exemptness, we expect that the phenomenon can only be observed in languages that allow exempt reflexives in general. In section 2, we will discuss properties of reflexive binding into the subject of OE psych verbs. It will be shown that the syntactic distribution of reflexive binding into OE psych verbs is not uniform across languages. While the initial discussion will focus on the differences between English and German, we will turn to the anaphoric system of Portuguese to further illustrate the divergent properties of picture-NP-reflexives. In section 3, we will turn to the concept anaphor itself and introduce the idea that reflexivity is a property of lexical classes, while anaphoricity is a dependency, which is not only resolved in syntax, but also introduced by syntactic contexts. Sections 4 and 5 will present the analysis of the data presented in section 2, while double object constructions and binding patterns inside NPs are discussed in section 6.

2 What we want to express by using the term syntactic analysis is that syntax does not only play a role in resolving anaphoric dependencies, but crucially introduces anaphoric dependencies (as opposed to a model where lexical reflexives are already marked as deficient in the lexicon). The present analysis thus rejects a lexicalist analysis of anaphoric binding, following the spirit of Borer (2005), but also the leading ideas of Gazdar et al. (1985), where syntactic dependencies are introduced through syntactic means.

3 Pollard and Sag (1994, 266ff.) call reflexive pronouns that are exempt from Principle A exempt anaphors. As will become clear shortly, the term exempt anaphor is not only a misnomer, but strictly speaking contradictory. For this reason, I will use the term exempt reflexive throughout.

2. Reflexives, picture-NP-reflexives and psych verbs

2.1. Variation in the syntactic distribution of picture-NP-reflexives

The analyses of Pollard and Sag (1992; 1994) and Reinhart and Reuland (1993) illustrate that English picture-NP-reflexives appear in syntactic contexts where ordinary binding cannot apply.4 In addition to the case already illustrated in (3), picture-NP-reflexives allow intersentential antecedents (4-a), non-commanding antecedents (4-b), and split antecedents (4-c).

(4) a. John1 was upset. A picture of himself1 in the museum had been mutilated.
    b. [John1’s campaign] required that pictures of himself1 be placed all over town.5
    c. John1 told Mary2 that pictures of themselves1+2 were on sale.

None of the phenomena illustrated in (4) are grammatical in German. This is in line with our observation that examples like (1) are not acceptable in German (cf. example (2)). The following examples show that German does not allow intersentential, non-commanding, or split antecedents:

(5) a. *Ulrich1 war sauer. Ein Bild von sich1 war beschädigt worden.
       Ulrich was upset a picture of himself had mutilated been
    b. *[Schumachers1 Reklamevertrag] verlangte eine Nacktaufnahme von sich1.
       Schumacher’s promotion contract required a nude-picture of himself
    c. *Ulrich1 zeigte Klaus2 einige Bilder von sich1+2.
       Ulrich showed Klaus some pictures of themselves

4 The pertinent data have in fact been observed in many publications since the 1970s, but have mostly been taken to be exceptional in nature. Zribi-Hertz (1989) led to a re-evaluation of the data.

5 As an amusing aside, it should be noted that the grammar checker of my text processor suggests that himself be replaced by him in (4-b).


Despite the obvious resistance to syntactic contexts that would suggest a treatment of picture-NP-reflexives as exempt, German picture-NP-reflexives require medium-distance binding (cf. Büring (2005, 243)). By ‘medium-distance binding’, we mean that a reflexive contained in an NP requires a commanding antecedent within the same clause. It is possible for a picture-NP-reflexive to be realized inside a stack of NPs, yielding structures like [[NP ... N [NP ... N P refl1]] ... V], as illustrated in (6).

(6) Der geschnappte Einbrecher1 in einem HL-Supermarkt in Großaiting bei Augsburg zog [zwei “Krone”-Ausschnitte [PP mit Berichten über sich1]] aus der Tasche.
    the snapped burglar in a HL-supermarket in Großaiting close-to Augsburg pulled two “Krone” clippings with reports about himself out the pocket
    ‘The burglar who was caught in a HL supermarket in Großaiting close to Augsburg pulled two clippings from the newspaper “Krone” out of his pocket, which contained reports about him.’

Given the contrasts between (4) and (5), one could give up the idea that reflexive binding can be defined across languages. Hence, two independent Principles A would be the result, one of which would turn picture-NP-reflexives into exempt reflexives, while the other renders these reflexives as anaphors. Such a move would allow a description of the basic facts in the languages in question, but it would be necessary to extend the analysis with every new language being analyzed. Focussing on the individual formulations of Principle A, a disjunctive analysis of picture-NP-reflexives would be prone to miss structural similarities across languages.

2.2. Picture-NP-reflexives and OE psych verbs

Languages typically include two different types of psych verbs. So-called subject experiencer psych verbs (SE psych verbs for short), as illustrated in (7), form a kind of norm, while OE psych verbs, as illustrated in (8), behave exceptionally. This exception is due to the assumption that the role experiencer generally occupies a higher rank than the role theme. The rank is respected in SE psych verbs, where the experiencer is realized as the higher-ranking subject, while the theme occupies the position of the object. In the case of OE psych verbs, we find the opposite situation: on the surface, the higher ranked thematic role is associated with the lower ranked grammatical function.

(7) John1 fears [these pictures of himself1].
(8) [S [These pictures of himself1] [VP frighten John1]].


Picture-NP-reflexives have been a benchmark for every theory of OE psych verbs. The problematic case is (8). If the linear appearance of theme and experiencer is mirrored in the configurational structure, the reflexive is not bound by its antecedent, as can be witnessed from the structure in (8). Starting with the analysis in Belletti and Rizzi (1988), this problem has been addressed by various means; in particular by assuming that the position of the subject in (8) is a derived one, and that the object experiencer at some syntactic level ordinarily binds the reflexive (cf. also Sabel (this volume)). This idea has been justified by assuming that the relevant predicates are unaccusative. Pesetsky (1995, 21ff.) has already argued against this view by showing that OE psych verbs can be passivized, which should not be possible if they were unaccusative verbs.

(9) a. Ghosts frighten Bill.
    b. Bill is frightened by ghosts.

What is more, Pollard and Sag (1992, 278) provide examples of type (10) showing that even a reconstruction of the subject theme would not provide a configuration in which the antecedent were able to bind the reflexive contained in the theme for the simple lack of c-command.6 Pollard and Sag (1994, 271) conclude that reflexives in OE psych verbs could be treated as exempt reflexives.

(10) [S [Nude pictures of himself1 in various newspapers] made [NP John1’s day]].

It is a basic tenet of both Pollard and Sag’s and Reinhart and Reuland’s proposals that local domains are responsible for determining whether a given reflexive has to be analyzed as exempt or not. The pertinent local configuration in (7), (8) and (10) is the same: a reflexive embedded into an NP without referential specifier. Hence, the analysis applied to (7) and (8) carries over to (10). Further evidence for treating picture-NP-reflexives in OE psych verb subjects as exempt reflexives comes from embedding psych verbs, as is illustrated in (11).

(11) John1 said that pictures of himself1/2 annoyed Peter2.

If the reflexive in (11) were bound along the lines suggested in Belletti and Rizzi (1988), a co-indexation of the reflexive with the matrix subject should become impossible, counter to our observations. Such a co-indexation becomes available if the reflexive is analyzed as exempt.

6 It should be noted that examples like (10) become unacceptable if the antecedent is substituted by a quantified expression:
  (i) *Pictures of himself1 made every1 man’s day.
  The unacceptability of this example is expected, since the quantifier cannot bind the reflexive.


As picture-NP-reflexives in OE psych verb subjects are classified as exempt, we can derive the prediction that picture-NP-reflexives should only appear as subjects of OE psych verbs in languages allowing exempt reflexives. With regard to Italian – the language which first showcased picture-NP-reflexives as OE psych verb subjects – the existence of exempt reflexives has already been confirmed by Napoli (1979), as is illustrated in (12-a) below.7

(12) a. Giorgio1 raccontò a Maria che la fotografia di se1 stesso erano in vendita.
        Giorgio told to Maria that the picture of REFLEXIVE was on sale
        ‘Giorgio told Maria that the picture of himself was on sale.’
     b. Questi pettegolezzi su di sé1 preoccupano Gianni1 più meglio di ogni altra cosa.
        these rumours about REFLEXIVE concern Gianni much more than any other case
        ‘These rumours about himself concern Gianni much more than anything else.’ (Belletti and Rizzi (1988, 312))

The opposite situation is given in German. German neither allows exempt reflexives nor picture-NP-reflexives as subjects of OE psych verbs. The ban on exempt reflexives was already illustrated in (5); the unacceptability of picture-NP-reflexives in OE psych verb subjects can be witnessed in (13).8

(13) a. *Die Bilder von sich1 gefielen den Kindern1.
        the pictures of themselves pleased the children
     b. *Den Kindern1 gefielen die Bilder von sich1.
        the children pleased the pictures of themselves
     c. *Ich glaube, dass die Bilder von sich1 den Kindern1 gefielen.
        I believe that the pictures of themselves the children pleased
     d. *Ich glaube, dass den Kindern1 die Bilder von sich1 gefielen.
        I believe that the children the pictures of themselves pleased

In example (13-a), the subject has been topicalized. In (13-b), the object experiencer has been topicalized. Both examples are equally unacceptable. To reduce the possible influence of topicalization, examples (13-c,d) employ subordinate clause structures. The examples remain unacceptable, irrespective of a possible scrambling of the object experiencer, which distinguishes (13-c) from (13-d). Example (14) further illustrates that OE psych verbs allow scrambling of the object experiencer over a theme that contains a co-indexed pronoun.

7 It should be noted however that the reflexive used in (12-a) is se stesso, while the morphologically simple se is used in (12-b).

8 Cf. Frey (1993, 131, ex. 62-b).


(14) Da ihm1 die Berichte über ihn1 in der Presse nicht gefallen, wendet sich Popinga1 schriftlich an die Zeitungen.
     because him the reports about him in the press not please appealed REFL Popinga in-writing to the newspapers
     ‘Popinga wrote an appeal to the newspapers, because he did not like the reports about himself in the press.’

In summary, it is highly plausible to assume that OE psych verb subjects may contain reflexives just in case the language in question allows exempt reflexives. English and Italian allow exempt reflexives together with reflexives contained in OE psych verb subjects, German allows neither exempt reflexives nor reflexives contained in OE psych verb subjects. Further evidence comes from Serbo-Croatian, as illustrated in Büring (2005, 242). Büring shows that Serbo-Croatian does not employ exemption (as illustrated by (15-a)), and just as expected, the use of a reflexive pronoun in an OE psych verb construction leads to ungrammaticality (15-b):

(15) a. *Ljutilo ga1 je da je ona pokusala napasti covjeka kao sebe1.
        anger him did that did she try attack man like self
        ‘It angered him that she tried to attack a man like himself.’
     b. *Ona slika sebe1 u Glasu Slavonije je mucila Petra1 cijeli dan.
        that picture self in Voice Slavonia did torture Peter whole day
        ‘That picture of himself in the Voice of Slavonia tortured Peter the whole day.’

A treatment of reflexives in OE psych verbs in terms of exempt reflexivization is also corroborated by data from Dutch. Everaert (n.d.) points out that there is a strong tendency to use the logophoric reflexive hemzelf in Dutch OE psych verb constructions, as is illustrated in (16).

(16) De beschrijving van hemzelf1/*zichzelf1 als communist ergerde de Gaulle1.
     the characterization of himself as communist annoyed de Gaulle
     ‘It annoyed de Gaulle that he was characterized as a communist.’

In addition to the pattern observed in (8), (11), and (13), the syntactic distribution of Portuguese ele próprio illustrates a further instantiation of reflexive binding in OE psych verbs, as will be illustrated in section 2.3. The examples in (13) illustrate that a co-indexation is impossible in German. The examples in (8) and (11) show that reflexives contained in an OE psych verb subject can be co-indexed with the object in the lower clause and with a subject in a higher clause. It would be a natural extension of this pattern to find a language where a reflexive in an OE psych verb subject can be co-indexed with the verb’s object


in simple clauses, but is required to be co-indexed with a higher subject, if one is present. This language is Portuguese with the reflexive pronoun ele próprio.

2.3. External reflexive binding in Portuguese

The Portuguese non-clitic reflexives si próprio and ele próprio are derived from dative and nominative pronouns, combined with próprio. They may not occur freely, if a commanding antecedent is available, as is illustrated in (17) (cf. Branco and Marrafa (1999, 171)).

(17) A Rita1 destruio o retrato *dele2 próprio / dela1/*2 própria / de si1/*2 própria.
     the Rita(fem) destroyed the picture of-he self / of-she self / of her self

In (17), ele próprio cannot occur freely since a Rita is a commanding antecedent, yet cannot bind ele próprio since the gender values of both phrases differ. A coindexation of both ela própria and si própria is not only fine, but also required. The latter reflexive differs from the former, in that ele próprio allows intrasentential non-local binding, while si próprio requires a local binding domain, as is further illustrated in (18). Following standard terminology, ele próprio is a long-distance anaphor.

(18) O João1 disse que a Rita2 destruio o retrato dele1 / de si*1/*2 próprio.
     the João said that the Rita destroyed the picture of-he / of him self

Since ele próprio will only require a binder if a commanding antecedent is available, it is free to occur as a matrix subject, and as part of the matrix subject.

(19) a. Ele próprio1 pagou a conta.
        he self paid the bill
     b. O retrato dele1 próprio foi pintado pela Maria2.
        the picture of-he self was painted by-the Maria
     c. *O retrato de si1 próprio foi pintado pela Maria2.
        the picture of him self was painted by-the Maria

As has been pointed out by Branco and Marrafa (1999, 171), ele próprio cannot be coindexed with a non-commanding antecedent, if a commanding antecedent is present (20-a), nor with split antecedents (20-b).


(20) a. *[NP O journalista [RelS que viu a Ana1]] disse ao Carlos que ela1 própria dançou na festa.
        the journalist who saw the Ana said to Carlos that she self danced at party
     b. *O João1 disse à Maria2 que viu fotografias deles1+2 próprios à venda.
        the João said to Maria that saw pictures of-they selves at sale

Only ele próprio may be realized in an OE psych verb construction:

(21) a. Estas fotos dele1 próprio assustaram o Luís.
        these pictures of-he self frightened the Luís
     b. *Estas fotos de si1 próprio assustaram o Luís.
        these pictures of him self frightened the Luís

If, however, the OE psych verb construction is realized in embedded structures, only external antecedents become acceptable and a coindexation of ele próprio with the lower object experiencer is blocked:

(22) a. O João1 disse que estas fotos dele1/*2 próprio assustaram o Luís2.
        the João said that these pictures of-he self frightened the Luís
        ‘João said that these pictures of João frightened Luís.’
     b. *A Ana1 disse que estas fotos dele2 próprio assustaram o Luís.
        the Ana said that these pictures of-him self frightened the Luís
        ‘Ana said that these pictures of Luís frightened Luís.’

Summing up, the following picture emerges: We have to distinguish between internal and external co-indexation of reflexives in OE psych verb constructions. German allows neither internal (13) nor external co-indexation, as is further illustrated in (23).

(23) *Die Kommentatoren2 meinten, dass dieses Bild von sich1/2 den Kanzler1 beeindruckte.
     the commentators uttered that this picture of self the chancellor impressed

English allows both internal and external co-indexation, while Portuguese ele próprio allows internal co-indexation if no external antecedent is available, but requires external co-indexation if an external antecedent is available.9 In the following section, we will reconcile the distribution of exempt reflexives and of reflexives in OE psych verb constructions with the concept of anaphor itself. In particular, we will raise the question whether a concept of anaphor can be identified behind the reflexive variation just offered.
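The three-way distribution just summarized can be restated compactly. What follows is only a minimal sketch in Python and not part of the analysis itself: the names ReflexivePattern and allowed_coindexations are hypothetical, and the function does nothing more than encode the descriptive pattern observed in (8), (11), (13) and (21)–(23), under the assumption that an external antecedent is available exactly when the OE psych verb clause is embedded.

```python
from enum import Enum


class ReflexivePattern(Enum):
    """Hypothetical labels for the three distributions described in the text."""
    NEITHER = "German sich"                         # no co-indexation at all
    BOTH = "English himself"                        # internal and external
    EXTERNAL_WHEN_AVAILABLE = "Portuguese ele próprio"


def allowed_coindexations(pattern, embedded):
    """Co-indexations open to a reflexive inside an OE psych verb subject.

    'internal': co-indexed with the object experiencer of the psych verb.
    'external': co-indexed with the subject of a higher clause.
    `embedded` is True if a higher clause (an external antecedent) exists.
    """
    if pattern is ReflexivePattern.NEITHER:
        return set()
    if pattern is ReflexivePattern.BOTH:
        return {"internal", "external"} if embedded else {"internal"}
    # EXTERNAL_WHEN_AVAILABLE: an available external antecedent
    # blocks internal co-indexation.
    return {"external"} if embedded else {"internal"}
```

Calling allowed_coindexations(ReflexivePattern.EXTERNAL_WHEN_AVAILABLE, True) returns only the external option, mirroring (22-a), whereas the same pattern in a simple clause returns only the internal option, mirroring (21-a).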

9 In fact, the distribution is not complete, as we could conceive of a language where an external co-indexation is always required, i.e., where reflexives contained in an OE psych verb were only acceptable if the predicate is embedded under another verb. The present analysis clearly predicts the existence of such a pattern (which would otherwise be quite surprising, as it seems that for the well-formedness of an element in a clause, the clause is actually required to be embedded).

3. Reflexivity and anaphoric dependencies

As was already mentioned in the introduction, various concepts of anaphor come to mind. In a definition of Principle A, as e.g. provided in Pollard and Sag (1992; 1994), and given below in (24), the concept seems to be a hypernym for reflexives and reciprocal pronouns. A similar decision is made in Büring (2005). The definition in (24) thus turns Binding Theory into a categorical theory.

(24) Principle A (Pollard and Sag (1994)): Locally a-commanded anaphors must be locally a-bound.

In the model of Reinhart and Reuland (1993), reflexive pronouns do not bear the feature +R, which stands for ‘being fully referential’ (Büring (2005, 236)). This feature is employed in Reinhart and Reuland’s General Condition on A-Chains, as given in (25).

(25) General Condition on A-Chains (Reinhart and Reuland (1993, 696)): A maximal A-chain (α1, ..., αn) contains exactly one link – α1 – that is both +R and Case-marked.

As a consequence of (25), A-chains that solely consist of reflexive pronouns are prohibited. Although the concept anaphor is not directly employed in Reinhart and Reuland’s analysis, the feature +R is crucial for the distinction between reflexive pronouns and non-reflexive NPs. Exemption is not covered by this feature itself but by the concept of a predicate – which turns Reinhart and Reuland’s analysis into a categorical one as well, where the pertinent category is a verbal one.

(26) Principle A (Reinhart and Reuland (1993)): A reflexive-marked syntactic predicate is reflexive.

According to this analysis, nominal heads are not treated as reflexive-marked syntactic predicates,10 and hence do not require that a maximal A-chain is established which would lead to local reflexive binding. It should be noted that the condition in (25) excludes the occurrence of Portuguese ele próprio as subject of a matrix predicate, unless Reinhart and Reuland were to assume that it bears the feature +R. The very idea that anaphors are referentially deficient entities that require a binder as an amendment is also problematic from the perspective of compositionality. Phrases like likes himself have a clear compositional interpretation, and this interpretation does not include a concept of deficiency. So, anaphors should more plausibly be seen as entities whose reference is syntactically forced, and not as entities without reference. This perspective also conforms to the behaviour of exempt reflexives. It seems much more plausible to assume that anaphors are elements that are turned into dependent entities by Binding Theory itself. The dependency does not emerge because a reflexive pronoun bears certain properties, but because it is embedded in a local syntactic structure with a given set of properties. Hence, we will assume that the syntactic distribution of anaphoric pronouns is driven by syntactic contexts and not by lexical specifications of the pronouns involved. In this view, an anaphor is a strictly syntactic entity (with repercussions in the interpretative component), while the concept reflexive is restricted to a designated form. Much confusion has arisen in Binding Theory because a designated form (reflexivity) has been confounded with a syntactic dependency (anaphoricity). Reflexivity is obviously related to anaphoricity, and anaphoricity can only emerge if the language offers a designated form that can be employed to signal an anaphoric dependency.
10 Syntactic predicates in the strict sense of Reinhart and Reuland (1993, 678) are predicates that realize an external argument, which is part of their lexical conceptual structure. The concept of a predicate employed in Reinhart and Reuland (1993) thus bears close resemblance to the trigger feature [±ARG-S], which will be introduced in section 4.

As Dimitriadis and Everaert (2004) have pointed out, a designated form can be a lexical reflexive pronoun, but it can also be a designated noun (as in Albanian), a derivational affix (as in Kannada), an inflectional affix (as in Russian), a clitic (as in French) or even a designated verbal auxiliary (as in Tamil). Again, talking about referentially deficient entities does not make sense if anaphoric dependencies are introduced morphologically, or even syntactically, as in Kannada, Russian, French, or Tamil. The present analysis thus assumes that reflexivity is not a property of predicates, but a property of designated forms. If reflexivity is viewed as a property of predicates, as in the analysis of Reinhart and Reuland (1993), a distinction has to be drawn between predicates that can be reflexive and predicates that cannot be. This distinction is not only empirically problematic but also superfluous. As will become clear below, predicates of all kinds, be their heads verbal, nominal, or prepositional, may have complements (and specifiers) that introduce dependencies. Reflexive predicates in the terminology of Reinhart and Reuland (1993) emerge if a dependency is introduced and resolved in the local domain of the same lexical head. As a corollary, the present analysis argues strictly against the idea that anaphoric dependencies should be dealt with in the lexicon. Anaphoricity is a syntactic concept and is dependent on syntactic contexts. Hence, the present analysis rejects implementations of Principle A that rely on lexical argument structure as an explicandum, such as Pollard and Sag (1992; 1994). The comparison of closely related languages like English and German in section 2.1 has already revealed that an analysis that relies on lexical argument structure must admit that binding can be confined to this lexical domain in English, but not in German. The languages under investigation (English, German, Portuguese) are similar insofar as the designated form can be called a reflexive pronoun. As we have seen in section 2.3, Portuguese reflexive pronouns can be analyzed as combinations of a pronoun with the adjective próprio, and likewise, one could argue that English reflexives are derived from a combination of a pronoun with the adverb self, but we will ignore issues of derivation presently and simply assume that the designated forms in these languages are indeed reflexive pronouns. The designated forms do not differ from other elements employed in the lexicon (or as the output of a pre-syntactic derivational component). Anaphoric dependencies, however, emerge if such a designated form is realized in the domain of a syntactic trigger. Once the trigger is met, a syntactic dependency is indicated and thus in need of resolution. It is thus not the reflexive pronoun that leads to ungrammaticality; it is the syntactic context containing the reflexive pronoun. The resulting dependency requires a resolution, like other syntactic dependencies.
A resolution takes place by identifying a proper antecedent in a given domain – this being a step which is not only familiar from other theories of anaphoric binding, but from models that deal with syntactic dependencies in general, as e.g. SLASH termination in GPSG (Gazdar et al. (1985)), or HPSG (Pollard and Sag (1994)), or in various minimalist models. Ideally, the introduction of an anaphoric dependency is dependent on the conjoined presence of a designated form and a trigger. If either is missing, an anaphoric dependency will not emerge. While this is the picture familiar from English, it does not carry over to other languages. Let us illustrate the problem by again addressing medium-distance anaphoric binding in German, as already presented in (2) – in contrast to (1) – and (6). From the comparison of (1) and (2) we learn that German reflexive pronouns require a binder, while English reflexive pronouns do not. In the following, we will assume that the relevant feature for designated forms will be indicated through R, the value of which will be an index n with φ-features PERSON, NUMBER, and GENDER, represented as R(n). This feature is either lexically assigned or determined in a pre-syntactic component. It is present in syntax only in the position where the designated element is syntactically inserted and will never project. Let us further assume that a syntactic trigger for the introduction of an anaphoric dependency will be any predicate that can have an articulated argument structure. Predicates showing the required argument structure will be marked as [+ARG-S], predicates not showing the necessary structure as [−ARG-S]. Obviously, verbs are always marked as [+ARG-S] and hence are prime candidates for the introduction of anaphoric dependencies. Thus any designated element bearing R(n) will immediately meet the trigger condition if it is realized as a syntactic object of a verb. In this case, the syntactic object will be marked as introducing a syntactic dependency. Anaphoric dependencies are indicated through the feature D, the value of which will be the index already introduced by R. Hence, in the present case, D(n) will be instantiated. The syntactic projection of D differs from the projection of R; while R never projects, D will project unless it is identified with another index. The local domain, in which identification, i.e., resolution, must take place, is again determined by the argument structure of the respective predicate. For the moment, let us assume that a resolution is required once all the syntactic arguments that have to be realized actually are realized by the predicate.11 The general scheme for the resolution of anaphoric dependencies is given in (27).

(27) If a daughter of a phrase introduces an anaphoric dependency, then the index of the dependent can be identified with the index of the other daughter of the phrase.12

Now consider the structures for English and German in (28) and (29).

(28) [S NP1 [VP[D(n)] V[+ARG-S] NP[R(n), D(n)] ]]
(29) [S NP1 [VP[D(n)] NP[R(n), D(n)] V[+ARG-S] ]]

In both cases, NP[R(n)] is syntactically realized in the context of V[+ARG-S] and hence receives the additional specification D(n).
As D(n) cannot be resolved at this point, it has to project to the VP level. After the verb has discharged each argument that is syntactically required, the value n of D at VP must be identified with the index of the subject, i.e., n = 1. As resolution has taken place, S does not bear a D feature. Let us next consider the structure of picture-NP-reflexives in (30).

(30) [VP V[+ARG-S] [NP . . . N[−ARG-S] of/von NP[R(n)] ]]

11 We assume that the unexpressed subject of non-finite verbs is not present on the COMPS value, and hence that non-finite VPs define local binding domains. In the following analysis, the treatment of non-finite VPs is not included for the sake of perspicuity, but the analysis proposed can be easily extended to deal with binding through the subject of a non-finite verb.

12 This condition embodies c-command, as it is the index of the other daughter, and not an index contained in the other daughter, which can be identified with the dependent index. As for governed PP, cf. section 6.


The feature R(n) is not realized in the context of a trigger (the verb is just too far away), and consequently, D will not be instantiated and hence not project. While this result would immediately account for exempt reflexives in English, it fails to account for the pattern observed for German in (2) and (6). To distinguish between medium-distance bound anaphors on the one hand, and exempt reflexives on the other hand, we have to break up the conjunction of designated form and trigger. We assume that languages may choose between a strict interpretation of this conjunction, and a weak interpretation. In the strict interpretation, anaphoric dependencies are only introduced if a trigger is present in addition to a designated form. English is a language that obeys the strict interpretation. In the weak interpretation, the absence of a trigger does not lead to entirely ignoring the designated form, but to the introduction of a dependency into the syntax that has not been activated. Hence, the common representation of picture-NP-reflexives in (30) will have to be split up in the following representations for German and English:

(31) English: [VP V[+ARG-S] [NP . . . N[−ARG-S] of NP[R(n)] ]]
(32) German: [VP[D(n)] V[+ARG-S] [NP[D̄(n)] . . . N[−ARG-S] von NP[R(n), D̄(n)] ]]

The representation in (31) is identical to the initial representation in (30). An R feature will not lead to a D feature unless a trigger is present. The representation in (32) for German introduces a new feature: D̄ stands for an inactive dependency. An inactive dependency can be turned into an active dependency in the appropriate trigger context. Hence the inactive dependency present on the NP is turned into an active dependency on the VP level. Having become an active dependency, its resolution is required, and hence the ungrammaticality of (2) can be derived. Summing up, the analysis of anaphoric dependencies rests on the following assumptions:13

1. An anaphoric dependency is the result of an R feature present in a designated form in combination with an appropriate trigger. Categories that show an articulated argument structure typically introduce an appropriate trigger while nominals and other categories do not. Additional conditions for triggers may be required in individual languages.
2. Two alternatives ensue if an appropriate trigger is not present. Either the introduction of an anaphoric dependency will not take place at all, or an inactive anaphoric dependency is introduced.
3. R features never project in syntax.
4. Inactive anaphoric dependencies project as long as they have not been turned into active anaphoric dependencies.
5. Active anaphoric dependencies project unless they have been resolved.
6. A resolution of an anaphoric dependency requires an identification of the index of the dependency with an index of another phrase.
7. An upper bound may be required for resolution. If an upper bound is imposed, the resulting dependency is a short- to medium-distance dependency. If no upper bound is imposed, the resulting dependency is a long-distance dependency.

In the following section, we will spell out the workings of the aforementioned conditions in more detail for English and German.

13 Fischer (2006) presents a model of reflexive binding that is very similar to the one presented here, although she assumes derivational rules and ordered constraints (in the sense of Optimality Theory). The present model is expressed in terms of local representational constraints in the sense of Gazdar et al. (1985), and does in fact bear some resemblance to the treatment of missing constituents through the SLASH feature in the latter model. The empirical coverage of Fischer’s model in comparison to the present model is not easily gauged, but the analysis presented in section 4 can handle cases of co-referential reflexives that do not c-command each other (cf. (i)) without further amendment.

(i) Peter und Maria gaben [den Eltern von sich] [ein Bild von einander].
    Peter and Mary gave the parents of self a picture of each other

Neither the reflexive nor the reciprocal c-commands the other in (i), and yet they receive the same index, because the subject binds them simultaneously. The present proposal allows such one-to-many relationships (cf. section 6), while Fischer’s analysis proposes one-to-one relationships between binder and anaphor, but can be extended to deal with one-to-many relationships as well.

4. Anaphoric dependencies in English and German

In the following, we will assume without further discussion that pronouns bearing the feature R are lexically reflexive. A phrase bearing the feature R with index n (n a natural number) is represented as XP[R(n)]; a phrase bearing the active dependency feature D with index n is represented as XP[D(n)]; a phrase bearing the inactive dependency feature D̄ with index n is represented as XP[D̄(n)]. Several cases of anaphoric binding may take place in parallel in a syntactic structure, and hence different numbers indicate different indices unless a dependency D(n) has been resolved by identifying it with index m, according to which n is set to m. Further to the features R, D, and D̄ already introduced in section 3, we assume mostly theory-neutral (or, as one could say, theory-compliant) features.


The feature [+ARG-S] is assigned to predicates that contain articulated argument structures, i.e., verbs but also event nominals (cf. section 6). Typically a fully articulated argument structure includes an external argument. Elements that bear [−ARG-S] do not show an external argument. The feature [+ARG-S] bears some resemblance to the concept of a COMPLETE FUNCTIONAL COMPLEX introduced in Chomsky (1986). As a HEAD feature, [±ARG-S] follows the projection of a lexical head. The feature [±LEX] indicates whether the head of a phrase is lexical or phrasal. LEX could be derived from the syntactic context in various ways, and we employ it as an abbreviation to indicate whether a given head is syntactically complex or not. Following a long tradition in binding theory starting with Chomsky (1981), we assume that the realization of a syntactic subject of a predicate plays a major role in determining binding domains. We will represent the syntactic realization of arguments through the feature COMPS, which is derived from the representation of valency in HPSG (Pollard and Sag (1994)). This feature is list-valued, and its value can either be an empty list, represented as ⟨⟩, or a list containing specifications for syntactic arguments, as e.g. [COMPS ⟨NP⟩]. We will assume that predicates bearing the specification [COMPS ⟨NP⟩] are in need of a subject and predicates bearing the specification [COMPS ⟨⟩] have saturated their argument structure. This set of features allows us to define the formal conditions for the introduction of active and inactive anaphoric dependencies as given in (33), (34), and (35), where D marks an active and D̄ an inactive dependency.

(33) Active Dependency: Given a phrase Y with daughters X and ZP, where ZP bears the value R(n): ZP bears the value D(n) if and only if X is [+ARG-S].

(34) Dependency: Given a phrase Y with daughters X and ZP, where ZP bears the value R(n): ZP bears the value D(n) if X is [+ARG-S], and the value D̄(n) if X is [−ARG-S].

(35) Activation: Given a phrase Y with daughters X and ZP, where ZP bears the value D̄(n): ZP bears the value D(n) if X is [+ARG-S].

The projection of active and inactive anaphoric dependencies is governed by the condition in (36).

(36) Dependency Projection:
a. A D value present on a daughter is also present on the mother, unless the index of the daughter not bearing D is identical to the D value.
b. A D̄ value present on a daughter is present on the mother unless the daughter’s D̄ value is identical to the daughter’s D value.
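The introduction and activation conditions in (33), (34), and (35) can be illustrated with a minimal Python sketch. It is an illustration under stated assumptions, not part of the analysis itself: the class `Reflexive`, its field names, the functions `introduce` and `activate`, and the Boolean `weak` (True for a language that follows (34), False for one that follows (33)) are all hypothetical names chosen here.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class Reflexive:
    """Feature bundle of the phrase ZP (field names are illustrative)."""
    r: Optional[str] = None       # R(n): index of the designated form
    d: Optional[str] = None       # D(n): active dependency
    d_bar: Optional[str] = None   # inactive dependency


def introduce(zp: Reflexive, sister_arg_s: bool, weak: bool) -> Reflexive:
    """Apply (33) if weak is False, (34) if weak is True."""
    if zp.r is None:
        return zp                        # no designated form, nothing happens
    if sister_arg_s:
        return replace(zp, d=zp.r)       # trigger present: active dependency
    if weak:
        return replace(zp, d_bar=zp.r)   # (34) only: inactive dependency
    return zp                            # (33): no dependency at all


def activate(zp: Reflexive, sister_arg_s: bool) -> Reflexive:
    """Apply (35): copy an inactive value to D next to a [+ARG-S] sister."""
    if zp.d_bar is not None and sister_arg_s:
        return replace(zp, d=zp.d_bar)
    return zp


himself = Reflexive(r="n")
print(introduce(himself, sister_arg_s=True, weak=False))   # object of a verb
print(introduce(himself, sister_arg_s=False, weak=False))  # English picture NP
print(introduce(himself, sister_arg_s=False, weak=True))   # German picture NP
```

After activation the sketch keeps both values on the phrase, which mirrors the statement that identical active and inactive values signal an activated dependency.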


Let us briefly discuss the aforementioned conditions. Active Dependency requires that not only R, but also a trigger be immediately present in the context of a reflexive pronoun for an anaphoric dependency to ensue. Dependency is a generalization of Active Dependency, not only covering the introduction of active, but also of inactive dependencies. Activation deals with the activation of inactive dependencies. It should be noted that activation amounts to copying the value of the inactive feature D̄ to D, so that a phrase bearing the features D̄ and D with identical values signals an activated dependency. Activated D̄ values do not project, and the identity clause in condition (36-b) accounts for this. The gist of Dependency Projection can be summarized as follows: inactive dependencies project as long as they have not been activated, and active dependencies project as long as they are not resolved. We have now presented conditions for the introduction and projection of D (and D̄) values, and condition (36-a) already implicitly addresses the resolution of D values. In most general terms, the resolution of a D value takes place if the value is identified with the index of another daughter, as formulated in (27). The condition in (27), however, can only claim to be a necessary condition, one that implicitly refers to a condition dealing with unresolved D values at the highest node in a syntactic derivation. This condition is made explicit in (37), but it should be noted that (37) is not strictly speaking part of the present proposal, since a general ban on open dependencies in complete structures must be imposed by any theory of grammar.

(37) Open Dependencies: The maximal projection of a clause must not bear a value for D.

Taken together, the conditions in (27) and (37) are still too tolerant to deal with short- and medium-distance anaphors. To capture these, condition (27) must be further constrained by making the resolution of D dependent on the saturation of the argument structure of the predicate that triggered the introduction of the dependency, as given below in (38).14

(38) Local Resolution: If a daughter of a phrase Y bears D(n) and Y is specified as [COMPS ⟨⟩], then the other daughter of the phrase must bear index n; if Y is specified with a non-empty value for COMPS, then the other daughter can bear index n.

14 As will be discussed in section 5, condition (38) also accounts for the observation that German impersonal constructions must not introduce a reflexive (cf. (49)). It should be noted that local resolution in (38) does not require any modification to deal with cases of scrambling if we assume that scrambling, i.e., syntactic realization of more prominent arguments although less prominent arguments have to be realized, leads to a recording of the index of the more prominent argument. For a treatment of scrambling along these lines, cf. Kiss (2001).
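The interplay of projection (36-a) and Local Resolution (38) at a single phrase can be sketched as follows. The function `combine`, its parameters, and the exception `OpenDependency` are hypothetical names introduced for illustration; the optional identification under a non-empty COMPS value is modelled by the flag `identify`, which is an assumption about how to encode "can bear index n".

```python
from typing import Optional, Tuple


class OpenDependency(Exception):
    """Raised when D(n) reaches a saturated phrase without a binder."""


def combine(d_index: Optional[str],
            sister_index: Optional[int],
            comps_empty: bool,
            identify: bool = False) -> Tuple[Optional[str], Optional[int]]:
    """Return (D value on the mother, index that n is identified with)."""
    if d_index is None:
        return None, None                      # no dependency on the daughter
    if comps_empty:                            # Y is [COMPS <>]: resolution is forced
        if sister_index is None:
            raise OpenDependency(d_index)      # the sister offers no index
        return None, sister_index
    if identify and sister_index is not None:  # non-empty COMPS: resolution is optional
        return None, sister_index
    return d_index, None                       # (36-a): D projects to the mother


# Reflexive object of a verb: first step inside VP, second step at S.
print(combine("n", None, comps_empty=False))
print(combine("n", 1, comps_empty=True))
```

The first call corresponds to a reflexive object whose sister, the verb, offers no index, so that D(n) projects; the second corresponds to the step at which the subject with index 1 combines with the VP.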


Let us illustrate the workings of the conditions in (33), (34), (35), (36), and (38) for three different patterns in English and German.

(39) a. Peter1 likes himself1/*2.
     b. Peter1 mag sich1/*2.

(40) a. Peter1 believed that pictures of himself1 were on sale.
     b. *Peter1 glaubte, dass Bilder von sich1 zum Verkauf standen.

(41) a. Peter1 likes a picture of himself1/2.
     b. Peter1 bevorzugt ein Bild von sich1/*2.

With regard to simple transitive structures, as given in (39), English and German show the same pattern, which follows from the requirement that [+ARG-S] heads form a trigger to immediately introduce D from R. Neither verb second nor the base order of the verbs plays a role here; hence we use the schematic structure in (42) for English and German.

(42) Relevant structure of (39-a,b):
[S[COMPS ⟨⟩]
  [NP1 Peter]
  [VP[COMPS ⟨NP⟩, +ARG-S, D(n = 1)]
    [V[COMPS ⟨NP, NP⟩, +ARG-S] likes/mag]
    [NP[R(n), D(n)] himself/sich]]]

In (42), himself and sich, respectively, introduce R(n). The presence of R(n) together with [+ARG-S] on the verb leads to the reflexive NP being marked as D(n). According to (36), D(n) is projected to the VP level, where n is identified with the index of the subject NP. Identification is represented through equations of the form dependent index = binding index in (42) and the following structures. Exemption is blocked even in the English case for the following reasons: First, R(n) and [+ARG-S] conspire to introduce D(n). While R(n) will never project, the projection of D(n) can only be stopped by the identification of n. Secondly, Local Resolution (38) requires that the dependency introduced by D(n) has to be resolved below S[COMPS ⟨⟩]. The situation differs if picture-NP-reflexives are considered. The larger structures actually do not play a role in the English examples given in (40-a) and (41-a), as the introduction of a possible dependency is already barred inside the NP.


According to Active Dependency in (33), the presence of R(n) in itself is insufficient to trigger D(n) in English. While it does not matter whether the preposition of is treated as a syntactic head or a case marker (for governed prepositions, cf. section 6), we assume that of actually heads a PP. The preposition is marked as [−ARG-S], and consequently, no dependency is introduced in (40-a) and (41-a). The pertinent local structure is given in (43).

(43) Relevant structure of (40-a) and (41-a):
[N[−ARG-S]
  [N[−ARG-S] pictures]
  [PP[−ARG-S]
    [P[−ARG-S] of]
    [NP[R(n)] himself]]]

Picture-NP-reflexives are treated differently in German, since German allows the introduction of inactive dependencies (marked D̄) according to (34). Consequently, a picture-NP-reflexive will introduce an inactive dependency in (40-b) and (41-b), but only the latter can be bound, as required by Local Resolution in (38). The structure of a picture-NP-reflexive in German is depicted in (44), and the analysis of the lower clause in (40-b) is given below in (45).

(44) Bilder von sich
[N[−ARG-S, D̄(n)]
  [N[−ARG-S] Bilder]
  [PP[−ARG-S, D̄(n)]
    [P[−ARG-S] von]
    [NP[R(n), D̄(n)] sich]]]


(45) *Peter1 glaubte, dass [S Bilder von sich1 zum Verkauf standen].
[S[COMPS ⟨⟩]
  [NP[−ARG-S, D̄(n), D(n)]
    [N[−ARG-S] Bilder]
    [PP[−ARG-S, D̄(n)]
      [P[−ARG-S] von]
      [NP[R(n), D̄(n)] sich]]]
  [VP[+ARG-S, COMPS ⟨NP⟩]
    [PP zum Verkauf]
    [V[+ARG-S, COMPS ⟨NP, PP⟩] standen]]]

The presence of the feature [+ARG-S] on the VP activates the inactive dependency D̄(n) on the subject. The resulting active D(n), however, cannot be bound below S[COMPS ⟨⟩], as would be required by Local Resolution. The dependent index n cannot be identified with any other index, as the VP does not offer one. If the structure of the English picture-NP-reflexive in (43) were plugged into (45), a different picture would arise. As a dependency has never been introduced, there is no reason to resolve it either. Finally, let us turn to the ordinarily bound picture-NP-reflexive in the German example (41-b).


(46) Peter1 bevorzugt ein Bild von sich1/*2.
[S[COMPS ⟨⟩]
  [NP1 Peter]
  [VP[+ARG-S, COMPS ⟨NP⟩, D(n = 1)]
    [V[+ARG-S, COMPS ⟨NP, NP⟩] bevorzugt]
    [NP[−ARG-S, D̄(n), D(n)]
      [ART ein]
      [N[−ARG-S, D̄(n)]
        [N[−ARG-S] Bild]
        [PP[−ARG-S, D̄(n)]
          [P[−ARG-S] von]
          [NP[R(n), D̄(n)] sich]]]]]]

In the analysis in (46), R(n) introduced by sich leads to the introduction and projection of the inactive dependency D̄(n) inside ein Bild von sich. In the context of V[+ARG-S], D̄(n) is activated and consequently projected as D(n) until it is identified with the index of the subject. The index of the subject is the only index that would allow identification, i.e., even if NPs higher up in the structure provided indices, they could not be used for identification, as Local Resolution requires S[COMPS ⟨⟩] to not bear a D feature. Although the examples presented in (39), (40), and (41) appear to be similar superficially, their respective grammaticality is determined by the different workings of the conditions (33), (34), (35), and (36). In the presence of the relevant trigger, German and English reflexives are turned into dependencies. But in the absence of a trigger, English and German behave rather differently, as English does not employ inactive dependencies. Hence the English examples (40-a) and (41-a) are not instances of binding. There is no dependency, while in the German cases (40-b) and (41-b), the reflexive leads to the introduction of an inactive dependency that is eventually activated. Once activated, the resolution of the reflexive is required within the next sentential projection, which accounts for the difference in grammaticality in (40-b) and (41-b).
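The three patterns in (39), (40), and (41) can be replayed with a small bottom-up checker. This is a simplified sketch under several assumptions that are not in the text: trees are strictly binary, each tree contains at most one reflexive (so Boolean flags stand in for indexed D values), optional identification under a non-empty COMPS value is ignored, and the names `Node`, `derive`, `check`, `picture_np`, and `clause` are hypothetical. The parameter `weak` selects the weak interpretation (34) as assumed for German, while `weak=False` selects the strict interpretation (33) as assumed for English.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Node:
    label: str
    arg_s: bool = False           # True for [+ARG-S]
    comps_empty: bool = False     # True for [COMPS <>]
    index: Optional[int] = None   # index of a referential NP
    reflexive: bool = False       # True if the node bears R(n)
    kids: Tuple["Node", ...] = ()


class Unresolved(Exception):
    pass


def derive(node: Node, weak: bool, binders: List[int]) -> Tuple[bool, bool]:
    """Return (active, inactive): the dependencies projected onto `node`."""
    active = inactive = False
    if not node.kids:
        return active, inactive
    left, right = node.kids
    for z, x in ((left, right), (right, left)):
        z_act, z_inact = derive(z, weak, binders)
        if z.reflexive:                    # introduction, (33) or (34)
            if x.arg_s:
                z_act = True
            elif weak:
                z_inact = True
        if z_inact and x.arg_s:            # activation, (35)
            z_act, z_inact = True, False   # an activated value no longer projects
        if z_act:
            if node.comps_empty:           # Local Resolution, (38)
                if x.index is None:
                    raise Unresolved(node.label)
                binders.append(x.index)
            else:
                active = True              # projection, (36-a)
        inactive = inactive or z_inact     # projection, (36-b)
    return active, inactive


def check(root: Node, weak: bool) -> Tuple[bool, List[int]]:
    """Return (well-formed?, indices that bind the reflexive)."""
    binders: List[int] = []
    try:
        active, _ = derive(root, weak, binders)
    except Unresolved:
        return False, []
    return (not active), binders           # (37): no open D at the root


def picture_np() -> Node:
    refl = Node("NP", reflexive=True)
    pp = Node("PP", kids=(Node("P"), refl))
    nbar = Node("N", kids=(Node("N"), pp))
    return Node("NP", kids=(Node("ART"), nbar))


def clause(subject: Node, obj: Node) -> Node:
    vp = Node("VP", arg_s=True, kids=(Node("V", arg_s=True), obj))
    return Node("S", comps_empty=True, kids=(subject, vp))


peter = Node("NP", index=1)
simple = clause(peter, Node("NP", reflexive=True))   # pattern (39)
pic_object = clause(peter, picture_np())             # pattern (41)
pic_subject = clause(picture_np(), Node("PP"))       # lower clause of (40)

for name, tree in [("(39)", simple), ("(41)", pic_object), ("(40)", pic_subject)]:
    print(name, "strict:", check(tree, weak=False), "weak:", check(tree, weak=True))
```

An empty binder list for a well-formed structure corresponds to exemption: no dependency was introduced, so the reflexive is not an anaphor in the sense of the analysis and its co-indexation is not regulated by these conditions.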

5. Exemption and reflexives in object experiencer psych verbs

Let us now return to reflexives in OE psych verbs. We have introduced the relevant data in (8), (11) for English, (13) for German, and (21) and (22) for Portuguese. A summary of the data is presented in (47) for easier reference.

(47) a. These pictures of himself1 frighten John1.
     b. John1 said that pictures of himself1/2 annoyed Peter2.
     c. *Ich glaube, dass die Bilder von sich1 den Kindern1 gefielen.
     d. Estas fotos dele1 próprio assustaram o Luís1.
     e. *Estas fotos di si1 próprio assustaram o Luís1.
     f. *A Ana1 disse que estas fotos dele2 próprio assustaram o Luís2.
     g. A Ana1 disse que estas fotos dela1 própria assustaram o Luís2.

Nothing more has to be said about the grammaticality of the English examples in (47-a,b). The reflexive does not count as an anaphor in either case, and the co-indexations in (47-a,b) are direct consequences of the reflexive’s status as a syntactically non-dependent pronoun. The grammaticality distribution in (47-a,b) should thus mirror the one of a personal pronoun, as illustrated in (48). Minimal differences might be due to factors like logophoricity (cf. Sells (1987)).

(48) a. These pictures of him1 frighten John1.
     b. John1 said that pictures of him1/2 frighten Peter2.

The ungrammaticality of the German example in (47-c) follows immediately from the discussion in section 4: The reflexive introduces an inactive dependency, which again leads to an active dependency that cannot be resolved in the syntactic domain of the lower clause. As the lower clause in (47-c) provides a domain in which the reflexive cannot be bound, the ungrammaticality of (47-c) can be compared to the ungrammaticality of reflexives in German impersonal constructions, as illustrated in (49). The reflexives in (49-c,f) introduce a dependency that again cannot be resolved in the relevant local domain.15

(49) a. Er(nom) friert.
        he(nom) feels-cold
     b. Ihn(acc) friert.
        him(acc) feels-cold
     c. *Sich(acc) friert.
        REFL(acc) feels-cold
     d. Er1 sagte, dass er1 friert.
        he said that he feels-cold
     e. Er1 sagte, dass ihn1 friert.
        he said that him feels-cold
     f. *Er1 sagte, dass sich1 friert.
        he said that REFL feels-cold

15 It should be noted that the examples (49-c,f) would be treated as an instance of exempt reflexivization in Pollard and Sag (1994).

This leaves us with the grammaticality distribution of the Portuguese reflexives si próprio and ele próprio in examples (47-d,e,f,g). As Portuguese employs more than one reflexivization strategy, we may expect that the different reflexives introduce dependencies according to the application of conditions (33) and (34). Hence, we assume that the introduction of anaphoric dependencies may indeed be dependent on the form of the reflexive in addition to the other conditions already introduced. What is more, we must also account for the observation that the two reflexives differ with respect to their binding domain: While si próprio clearly leads to short- to medium-distance anaphoric relations, ele próprio has been analyzed as a long-distance reflexive, as is already illustrated in (47-g). To analyze (47-d,f,g), we assume that the trigger for turning ele próprio into a dependency is the feature [+LEX], and that in the absence of [+LEX], i.e., in the presence of [−LEX], the R feature present in ele próprio leads to an inactive dependency only. With respect to the long-distance capabilities of ele próprio, we will assume that resolution will not be required to be local for this reflexive. Hence, we assume the following language-specific instantiation of (34) and (35) for Portuguese ele próprio, and we also assume that Local Resolution does not apply to the reflexive.

(50) Dependency (Portuguese, ele próprio): Given a phrase Y with daughters X and ZP, where ZP bears the value R(n): ZP bears the value D(n) if X is [+ARG-S, +LEX], and the inactive value D̄(n) if X is [−ARG-S] or [−LEX].

(51) Activation (Portuguese, ele próprio): Given a phrase Y with daughters X and ZP, where ZP bears the inactive value D̄(n): ZP bears the value D(n) if X is [+ARG-S, +LEX].16

16 It should be clear by now that activation always repeats the initial statement of dependency projection.
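The difference between the general conditions (34)/(35) and their instantiation for ele próprio in (50)/(51) lies only in the trigger, which can be made explicit by passing the trigger as a parameter. The sketch below is illustrative: `Sister`, `general_trigger`, `ele_proprio_trigger`, `introduce`, and `activate` are hypothetical names, the string states are an encoding chosen here, and the LEX values assigned to the preposition and to the phrasal VP are assumptions based on the description of [±LEX] as marking lexical versus phrasal heads.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Sister:
    arg_s: bool   # True for [+ARG-S]
    lex: bool     # True for [+LEX]


Trigger = Callable[[Sister], bool]


def general_trigger(x: Sister) -> bool:
    return x.arg_s                 # trigger of (34)/(35)


def ele_proprio_trigger(x: Sister) -> bool:
    return x.arg_s and x.lex       # trigger of (50)/(51)


def introduce(x: Sister, trigger: Trigger) -> str:
    return "active" if trigger(x) else "inactive"


def activate(state: str, x: Sister, trigger: Trigger) -> str:
    return "active" if state == "inactive" and trigger(x) else state


# Selected sisters met on the way up from a picture-NP reflexive in the
# subject of an embedded clause; the LEX values are assumptions.
path = [Sister(arg_s=False, lex=True),   # preposition inside the picture NP
        Sister(arg_s=True, lex=False),   # phrasal VP of the embedded clause
        Sister(arg_s=True, lex=True)]    # lexical matrix verb

for trigger in (general_trigger, ele_proprio_trigger):
    state = introduce(path[0], trigger)
    history = [state]
    for x in path[1:]:
        state = activate(state, x, trigger)
        history.append(state)
    print(trigger.__name__, history)
```

Under the general trigger the dependency becomes active as soon as the embedded VP is met, whereas under the ele próprio trigger it stays inactive until a lexical [+ARG-S] head is reached.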


(52) A Ana1 disse que estas fotos dela1 própria assustaram o Luís2.
[S[COMPS ⟨⟩]
  [NP1 a Ana]
  [VP[+ARG-S, COMPS ⟨NP⟩, D(n = 1)]
    [V[+ARG-S, +LEX, COMPS ⟨NP, S⟩] disse]
    [S[COMPS ⟨⟩, D̄(n), D(n)]
      [C que]
      [S[COMPS ⟨⟩, D̄(n)]
        [NP[−ARG-S, D̄(n)]
          [ART estas]
          [N[−ARG-S, D̄(n)]
            [N[−ARG-S] fotos]
            [PP[−ARG-S, D̄(n)]
              [P[−ARG-S] d]
              [NP[R(n), D̄(n)] ela própria]]]]
        [VP[+ARG-S, COMPS ⟨NP⟩]
          [V[+ARG-S, +LEX, COMPS ⟨NP, NP⟩] assustaram]
          [NP o Luís]]]]]]

It should be noted that D(n) in (52) must be bound by the subject. If it were not, the clause would be marked with an open dependency and hence violate (37). In contrast to the ungrammaticality of (22-b), an instance of a violation of (37), examples like (19-a,b) are grammatical, because no active dependency has been introduced in the first place. We thus assume that matrix clauses must not bear open active dependencies, while inactive dependencies do not count as open. Let us now turn to (47-e). The syntactic distribution of si pr´oprio must not be handled by (50), for in this case, its ungrammaticality would remain without explanation. We have to repeat, however, that si pr´oprio already differs from ele pr´oprio in its lexical form: while the latter is derived from a fossilized nominative personal pronoun and the intensifier pr´oprio, the former consists of the same intensifier, but a fossilized dative personal pronoun. As the forms are different, we may very well assume that the syntactic conditions for introducing a dependency are different as well. Hence we propose that the syntactic distribution of si pr´oprio is not handled by (50) and (51), but by (33) and (35). In general, we may conclude that if a language employs more than one reflexivization strategy, it should also employ more than one resolution strategy. If the reader doubts this


conclusion, I would like to point out two well-known strategies to deal with reflexive pronouns, which build on the same insight: First, Chomsky (1981) did not only introduce Principle A of Binding Theory for reflexives and reciprocals, but also Principle B for other pronouns. The principles are justified by the observation that the forms of the pronouns differ from one another, and also that their syntactic distribution is not identical. The same considerations apply to si próprio and ele próprio. Second, languages that employ long-distance and short-distance reflexives are typically dealt with by introducing different conditions on their distribution. The pronouns ele próprio and si próprio clearly differ in that the one can be a long-distance anaphor, while the other can only be a medium-distance anaphor, and once again it seems appropriate to employ two different conditions. To fully implement this idea, it becomes necessary to relativize the features R, D, and D to the different forms present in Portuguese, i.e., we do not only employ R(n), D(n), and D(n), but also R(si, n), D(si, n), and D(si, n) alongside R(ele, n), D(ele, n), and D(ele, n). In fact we can assume that for the languages discussed in the present paper, the features R, D, and D are always relativized to the form of the reflexive. As there is only one reflexive form in English and German, the relativization does not change the conditions given above.17 With (33) and (35) applying to (47-e), its analysis corresponds to the analysis of the ungrammatical German example (47-c), as analyzed in (45).

17 Linguists sometimes assume that the syntactic distribution of the German reflexive pronoun sich differs from the combination of sich with the adverbial selbst (self). For a comprehensive treatment of sich and sich selbst cf. Fischer (2006).

6. Double object constructions, governed prepositions and binding inside NP

Double object constructions present an interesting application of the model developed so far. First, they are interesting in that we can show that a dependency can be bound by an element that introduces a dependency itself. Secondly, we can explicitly address the role of governed prepositions. Now consider the examples in (53).

(53) a. In his schizophrenic phase, John1 introduced himself1 to himself1.
     b. *Lola2 sold himself1 to himself1.

We will assume for the present purposes an analysis of English double object constructions in terms of verbal shell projections (Larson (1988)). In particular, we will assume that the to-PPs in (53) are subordinate to the reflexive NP-objects. To integrate governed prepositions into the present analysis, we apply the treatment of reflexive index projection in Pollard and Sag (1994) to the feature R. Pollard and Sag (1994) assume that nominal indices can be borne by PPs if the head is a governed preposition. Similarly, we allow an exception to the general rule that R will never project syntactically and assume that PPs headed by governed prepositions bear the R feature introduced by their complements. Consequently, to himself bears the specification R(n) inherited from himself. The analysis of (53-a) is given in (54).

(54) John1 introduced himself1 to himself1.

[S[COMPS ⟨⟩]
  [NP1 John]
  [VP[COMPS ⟨NP⟩, D(m = 1)]
    [V[COMPS ⟨NP⟩, +ARG-S] introduced]
    [VP[COMPS ⟨NP⟩, D(m)]
      [NP[R(m), D(m)] himself]
      [V[COMPS ⟨NP, NP⟩, +ARG-S, D(n = m)]
        [V[COMPS ⟨NP, NP, PP⟩, +ARG-S] introduced]
        [PP[R(n), D(n)] to himself]]]]]
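The index bookkeeping in (54) can be illustrated with a small sketch. It is not part of the analysis itself: it assumes, purely for illustration, that each identification such as D(n = m) can be recorded as a pair (dependency index, binder index), that a dependency counts as open at the root if it was never identified with anything, and that coindexation is the transitive closure of the recorded identifications. The names Indices and analyse are hypothetical.

```python
# Minimal union-find sketch of index identification (hypothetical names).

class Indices:
    """Tracks which indices have been identified with one another."""

    def __init__(self):
        self.parent = {}

    def find(self, x):
        # Register unseen indices as their own representative.
        self.parent.setdefault(x, x)
        while self.parent[x] != x:
            self.parent[x] = self.parent[self.parent[x]]  # path halving
            x = self.parent[x]
        return x

    def identify(self, a, b):
        # Identification is symmetric and transitive.
        self.parent[self.find(a)] = self.find(b)

    def same(self, a, b):
        return self.find(a) == self.find(b)


def analyse(dependencies, identifications):
    """Return the index store and the dependencies still open at the root.

    Assumption: a dependency is closed as soon as it is identified with
    some other index; whether that index is itself a dependency is
    checked separately for that dependency.
    """
    idx = Indices()
    bound = set()
    for dep, binder in identifications:
        idx.identify(dep, binder)
        bound.add(dep)
    open_deps = [d for d in dependencies if d not in bound]
    return idx, open_deps


# (54): D(n = m) on the lower verbal projection, D(m = 1) on the higher VP.
idx_54, open_54 = analyse(["n", "m"], [("n", "m"), ("m", "1")])
print(open_54, idx_54.same("n", "1"))

# (53-b): n is identified with m, but m is never identified with the subject.
idx_53b, open_53b = analyse(["n", "m"], [("n", "m")])
print(open_53b)
```

Under these assumptions, no dependency remains open in (54) and n ends up coindexed with the subject's index 1 by transitivity, whereas in (53-b) the upper dependency m is still open at the root.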

The PP to himself, bearing R(n), is realized as a sister of the verb introduced, whose specification [+ARG-S] leads to the introduction of D(n) on to himself. This dependency is bound by identification with the index m, which is introduced by another reflexive pronoun that is also turned into an anaphor. The index n is hence identified with the index m, but the dependency introduced by the upper reflexive is still in need of identification. Eventually, it is identified and hence bound by the subject of the clause, and it follows from the transitivity of identity that the subject and both anaphors bear the same index.18 The ungrammaticality of (53-b) follows from the same reasoning: the second anaphoric dependency is not resolved, as there is no identification of the upper anaphor’s dependency with the matrix subject. With an open dependency present at the root node of the clause, it violates condition (37).

18 It should be noted that the analysis presented in (54) is not the only one available. Both dependencies could project to the level of the highest VP, where the subject binds both simultaneously. The result would be the same in the present case, but the second analysis is required to deal with picture-NP-reflexives inside double object constructions that do not command each other, as was already mentioned in fn. 12.

Let us now turn to double object constructions with two PP objects. Jackendoff (1990) presents the following paradigm:

(55) a. I talked [PP to John and Bill]1 [PP about themselves]1.
     b. *I talked [PP to themselves]1 [PP about John and Bill]1.
     c. ??I talked [PP about John and Bill]1 [PP to themselves]1.
     d. *I talked [PP about themselves]1 [PP to John and Bill]1.

We assume that the situation with two PP objects does not differ from the one with one PP object: both PPs may inherit the R marking of the reflexive. The ungrammaticality of (55-b) is derived straightforwardly: the lower PP cannot bind the dependency, as the dependency is only introduced after the lower PP has been syntactically realized. The near-ungrammaticality of (55-c) could be derived in the same fashion if we assume that scrambling does not extend binding options. If the to-PP remains the more prominent argument, a dependency introduced by the to-PP could only be resolved by identification with the matrix subject. Hence, an example like (56) should be grammatical under this analysis.

(56) Peter and Mary talked about Bill to themselves.

It remains to account for the ungrammaticality of (55-d). Any account of the ungrammaticality must depend on an analysis of scrambling in English. It might very well be that the preposed about-PP leads to ungrammaticality in itself, given that realizing the reflexive to the left of its antecedent would only be plausible if it should be indicated that the antecedent is in fact not the antecedent. It is a well-known fact that preposing in English changes coindexation options for reflexives, as can be illustrated by the following renowned example from Barss (1986):

(57) a. John1 wonders which picture of himself1/2 Bill2 saw.
     b. John1 wonders whether Bill2 saw a picture of himself*1/2.

Issues of logophoricity might explain why a coindexation of himself with John leads to ungrammaticality in (57-b). But example (57-a) shows that preposing an exempt reflexive seems to lead to an extension of possible antecedents. Admittedly, the examples in (57) differ from the ones in (55) in that the former but not the latter are instances of exempt reflexives. But if the main function of preposing a pronoun were actually to change its coindexation options, then this function clashes with the non-availability of any antecedent in (55). Again, adding an antecedent, as in example (58), should lead to an improved judgement.

(58) Peter and Mary talked about themselves to Bill.

Interestingly, a pattern similar to (55) can be found in the nominal domain.


(59) a. [NP gifts [PP from John and Bill] [PP to themselves]]
     b. *[NP gifts [PP from themselves] [PP to John and Bill]]
     c. [NP gifts [PP to John and Bill] [PP from themselves]]
     d. *[NP gifts [PP to themselves] [PP from John and Bill]]

The grammaticality pattern in (59) suggests that the reflexives introduce dependencies within nominal projections. We propose that the noun gift differs from picture and other nouns in allowing a fully articulated argument structure, and hence in bearing the specification [+ARG-S]. We will also assume that the structure of nouns with more than one argument mirrors the structure for verbs shown in (54). Finally, we will assume that the realisation of prepositional arguments inside NP is not as strict as the realisation of prepositional arguments in the verbal domain, to account for the distribution between (55-c) and (59-c). Given these two assumptions, the grammaticality of (59-a,c), as well as the ungrammaticality of (59-b,d) is accounted for: in each case, a dependency is introduced and in need of resolution inside NP. This condition cannot be met in (59-b,d).

It should be stressed, though, that the picture is not as simple as that, as surprisingly often, we find erratic binding patterns inside NPs. As an illustration, consider the following examples from Büring (2005, 235).

(60) a. C.B.’s father1 resented his wife for [NP her2 low opinion about himself1].
     b. Even so, [NP his2 remarks about herself1] were uncalled for.
     c. Unfortunately, you have a tendency to allow [NP your2 obviously muddled, rather juvenile feelings about myself1] to cloud your judgment.

It is not implausible to assume that opinion, remark, and feelings are derived nominals. Yet, the examples show the behaviour of exempt reflexives, indicating that the analysis suggested for (59) should not be applied to (60). While we find reflexives as complements of derived nominals in (60) that show the behaviour predicted of picture-NP reflexives, Reinhart and Reuland (1993, 681f.) present the examples in (61-b,c), where apparently exempt reflexives show sensitivity to referential specifiers of N.

(61) a. The picture of himself that John saw in the post office was ugly.
     b. *Your picture of himself that John saw in the post office was ugly.
     c. *Mary’s letters to Sarah about himself obsessed him.

It would be possible to speculate about the ungrammaticality of (61-b,c) in terms of the present analysis, but I refrain from doing so. Clearly, much more empirical research is required to account for the full range of binding patterns found inside of NPs, and in addition, the syntactic analysis of nominals, and derived nominals, is still an open issue. As long as the grammaticality distribution remains unclear with regard to variation among speakers and dialects, the best we can


do is to point out the predictions of the grammar. If the analysis suggested for (59) is basically correct, we predict that a reflexive as a complement of a derived nominal can only be bound in the local domain of the nominal, if the nominal realizes all the arguments it could realize. Hence, an external antecedent like the subject in (58) would not save an otherwise ungrammatical phrase, as illustrated in (62) below.

(62) *[Peter and Mary]1 talked about gifts to themselves1 from John and Bill.

As the present analysis offers an account for the undisputed data discussed above, I take it as a good starting point for an initial classification of the disputed data presented in this section, hence applying Chomsky’s suggestion (Chomsky (1957, 14)) that “[i]n many intermediate cases we shall be prepared to let the grammar itself decide, when the grammar is set up in the simplest way so that it includes the clear sentences and excludes the clear nonsentences.”

7. Conclusion

We have presented a syntactic treatment of anaphoric dependencies that builds on the insight that a distinction has to be drawn between reflexivity and anaphoricity. Reflexivity is a lexical property of certain noun classes (or more generally, a formal property of linguistic entities). In languages like English, German, Portuguese, and Serbo-Croatian, a reflexive pronoun is introduced into the syntax to signal an anchor for a possible anaphoric dependency. It depends on the syntactic environment of the reflexive whether or not a dependency will be established. An anaphoric dependency can be established directly, if a given trigger is met, or can be postponed until a given trigger is met. The latter case typically results in medium-distance anaphoric binding, which must be sharply distinguished from exempt reflexives. Exempt reflexives result from the combination of a syntactic environment which does not provide an appropriate trigger for an anaphoric dependency and an introduction rule which forces immediate establishment of an anaphoric dependency. Reflexives in OE psych verb constructions can be treated as a case of exemption, which results in the prediction that reflexives in OE psych verbs should only appear in languages allowing for exemption.

Bibliography

Barss, Andrew (1986): Chains and Anaphoric Dependence: On Reconstruction and Its Implications. PhD thesis, MIT, Cambridge, Massachusetts.
Belletti, Adriana and Luigi Rizzi (1988): ‘Psych-Verbs and θ-Theory’, Natural Language and Linguistic Theory 6, 291–352.
Borer, Hagit (2005): Structuring Sense Volume 1: In Name Only. Oxford University Press, Oxford.
Branco, António and Palmira Marrafa (1999): Long-Distance Reflexives and the Binding Square of Opposition. In: G. Webelhuth, J.-P. Koenig and A. Kathol, eds, Lexical and Constructional Aspects of Linguistic Explanation. CSLI Publications, Stanford, pp. 163–177.
Büring, Daniel (2005): Binding Theory. Cambridge University Press, Cambridge.
Chomsky, Noam (1957): Syntactic Structures. Mouton, The Hague.
Chomsky, Noam (1981): Lectures on Government and Binding. Foris, Dordrecht.
Chomsky, Noam (1986): Knowledge of Language. Praeger, New York.
Dimitriadis, Alexis and Martin Everaert (1999): Typological Perspectives on Anaphora. In: P. Suihkonen and B. Comrie, eds, International Symposium on Deictic Systems and Quantification in Languages Spoken in Europe and North and Central Asia. Collection of Papers. Udmurt State University, Iževsk, pp. 51–67.
Everaert, Martin (no date): Reflexives in Discourse. Ms., Utrecht University/Nijmegen University.
Fischer, Silke (2006): ‘Matrix Unloaded: Binding in a Local Derivational Approach’, Linguistics 44, 913–935.
Frey, Werner (1993): Syntaktische Bedingungen für die semantische Interpretation: Über Bindung, implizite Argumente und Skopus. Akademie Verlag, Berlin.
Gazdar, Gerald, Ewan Klein, Geoffrey Pullum and Ivan Sag (1985): Generalized Phrase Structure Grammar. Harvard University Press, Cambridge, Massachusetts.
Jackendoff, Ray (1990): ‘On Larson’s Treatment of the Double Object Construction’, Linguistic Inquiry 21, 427–456.
Kiss, Tibor (2001): Configurational and Relational Scope Determination in German. In: W. D. Meurers and T. Kiss, eds, Constraint-Based Approaches to Germanic Syntax. CSLI Publications, Stanford, pp. 141–176.
Larson, Richard K. (1988): ‘On the Double Object Construction’, Linguistic Inquiry 19, 335–391.
Napoli, Donna Jo (1979): ‘Reflexivization across Clause Boundaries in Italian’, Journal of Linguistics 15, 1–28.
Pesetsky, David (1995): Zero Syntax: Experiencers and Cascades. MIT Press, Cambridge, Massachusetts.
Pollard, Carl and Ivan A. Sag (1992): ‘Anaphors in English and the Scope of Binding Theory’, Linguistic Inquiry 23, 261–303.
Pollard, Carl and Ivan A. Sag (1994): Head-Driven Phrase Structure Grammar. University of Chicago Press, Chicago.
Reinhart, Tanya and Eric Reuland (1993): ‘Reflexivity’, Linguistic Inquiry 24, 657–720.
Sabel, Joachim (this volume): Derivational Binding and the Elimination of Uninterpretable Features.
Sells, Peter (1987): ‘Aspects of Logophoricity’, Linguistic Inquiry 18, 445–479.
Zribi-Hertz, Anne (1989): ‘Anaphor Binding and Narrative Point of View: English Reflexive Pronouns in Sentence and Discourse’, Language 65, 695–727.

Sprachwissenschaftliches Institut
Ruhr-Universität Bochum

Joachim Sabel

Derivational Binding and the Elimination of Uninterpretable Features*

Abstract

Conditions A and B have been stated as ‘Anywhere/Everywhere Conditions’ (see Belletti and Rizzi (1988); Epstein et al. (1998), Hicks (2006), Lebeaux (2010), among others), i.e., an anaphor that is A-bound at any step of the derivation (within a local domain D) satisfies Condition A during the whole derivation whereas a pronoun has to be free in its local domain at every step of the derivation. However, assuming a cyclic phase-based approach to derivations (Chomsky (2000; 2001; 2007; 2008)), this analysis does not specify at what stage of the derivation Binding applies with respect to the interplay of the structure-building operations and Transfer. Furthermore, I will show that this ‘global’ analysis faces empirical problems. I argue instead that the syntactic licensing conditions for anaphors and pronouns apply at a certain local step of the derivation, after elimination of their uninterpretable feature [uF] (the uninterpretable Case feature), i.e., as soon as these elements become visible for the Semantic Interface. It follows that Conditions A and B are not active throughout every stage of a derivation. The locality restrictions for anaphors and pronouns are derived from phase theory, θ-theory and the distribution of uninterpretable and interpretable referential features. Finally, it will be discussed that certain (local and non-local) Condition C effects are subject to different constraints than Conditions A and B.

1. Introduction

Several authors have argued that Condition A has to be stated as an “Anywhere Condition” as in (1) (see Belletti and Rizzi (1988); Uriagereka (1988); Lebeaux (1991; 2009); Sabel (1996); Epstein (1998); Epstein and Seely (2006); Grewendorf and Sabel (1999); Saito (2003; 2005); Grewendorf (2003); Bailyn (2005), among others).

(1) Condition A of the Binding Theory can be fulfilled at any stage of the derivation.

The idea is that a reflexive or reciprocal expression that is A-bound at one step of the derivation (within a local domain D) satisfies Condition A of the Binding Theory during the whole derivation. Condition A in (1) is understood as “derivational” in the sense that it does not apply at a certain level of representation (i.e., D-Structure, S-Structure, LF); instead it can be satisfied at any level. Assuming a cyclic phase-based approach to derivations (see Chomsky (2000; 2001; 2007; 2008)), (1) does not specify, however, at what stage of the derivation Binding applies with respect to the interplay of the structure-building operations and Transfer. Furthermore, as will be discussed in detail below, (1) faces empirical problems. For example, an anaphor that is only locally bound before it has deleted its uninterpretable Case feature ([uCase]) causes ungrammaticality. In section 4, I will therefore argue for an alternative derivational approach with respect to anaphoric binding. The idea is that it is necessary and sufficient for an anaphor to be bound (c-commanded by a co-indexed element) in its phase as soon as it has its [uCase] erased, i.e., as soon as the anaphoric element becomes visible for the semantic interface. In (2), [uF] (uninterpretable feature) refers to [uCase] on an anaphor.1

(2) Condition A applies at the step of the derivation after the [uF] of the anaphor is erased.

Notions such as “Binding Theory”, “co-indexing” and “Binding Condition A/B” are used only as descriptive terms here, referring to adequate local relations between antecedent-anaphor and antecedent-bound-pronoun pairs. In the Minimalist framework, these notions need to be derived from elementary concepts of narrow syntax such as c-command, phases, and operations for the encoding of referential dependencies. I will make a suggestion in section 4.

As is well known, Binding Theory imposes opposing requirements on anaphors and pronouns in the same local domain. Condition B requires that pronouns be free within the very domain in which Condition A requires anaphors to be bound.2 Like Condition A, Condition B has been labelled an “Anywhere/Everywhere Condition.” Epstein (1998), Hicks (2009) and Lebeaux (2009), for example, claim that Condition B applies at every step of the derivation. In section 3, I will also develop a new approach with respect to the application of Condition B that entails as a central element the [uCase] of the pronoun.

(3) Condition B applies at the step of the derivation after the [uF] of the pronoun is erased.

Certain examples show that a pronoun can violate Condition B at early stages of a derivation but that, as soon as its [uCase] is erased (valued), it has to fulfill Condition B in its local domain, i.e., in its phase.

* For valuable discussions, I would like to thank Günther Grewendorf, Andreas Pankau, Cecilia Poletto, Marc Richards and Gert Webelhuth. Special thanks to Gereon Müller for comments on an earlier version of this article and to Erich Groat for extensive discussions.

Local Modelling of Non-Local Dependencies in Syntax, 187–212. Artemis Alexiadou, Tibor Kiss & Gereon Müller (eds.). Linguistische Arbeiten 547, de Gruyter 2012.

1 Other [uF] that can be associated with pronouns and anaphors, such as topic- or focus-features, are not part of an anaphor/pronoun itself but of corresponding functional heads. They do not affect the timing and application of the “Binding Conditions.”

2 A short discussion of counterexamples where anaphors and pronouns are not in complementary distribution can be found in section 5.
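The difference in timing between (1) and (2)–(3) can be made concrete with a small sketch. It is only an illustration under simplifying assumptions that are not part of the proposal itself: a derivation is reduced to a list of snapshots for a single anaphor or pronoun, each recording whether its [uCase] has been erased and whether it is c-commanded by a co-indexed element in its phase, and "the step after [uF] is erased" is read as the first snapshot in which [uCase] is gone. The names Step, first_visible and the three condition functions are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Step:
    """One derivational snapshot for a single anaphor or pronoun."""
    ucase_erased: bool   # has the element's [uCase] been erased by this step?
    locally_bound: bool  # is it c-commanded by a co-indexed element in its phase?


def anywhere_condition_a(steps):
    """(1): Condition A is satisfied if the anaphor is bound at some step."""
    return any(s.locally_bound for s in steps)


def first_visible(steps):
    """Return the first snapshot in which [uCase] is erased, or None."""
    return next((s for s in steps if s.ucase_erased), None)


def timed_condition_a(steps):
    """(2): the anaphor must be bound once its [uF] has been erased."""
    s = first_visible(steps)
    # Assumption: an element whose [uCase] is never erased is never licensed.
    return s is not None and s.locally_bound


def timed_condition_b(steps):
    """(3): the pronoun must be free once its [uF] has been erased."""
    s = first_visible(steps)
    return s is not None and not s.locally_bound


# Hypothetical derivation: bound only before [uCase] is erased.
bound_too_early = [Step(ucase_erased=False, locally_bound=True),
                   Step(ucase_erased=True, locally_bound=False)]

print(anywhere_condition_a(bound_too_early))  # the "Anywhere" view (1)
print(timed_condition_a(bound_too_early))     # the timed view (2)
print(timed_condition_b(bound_too_early))     # the same profile for a pronoun, (3)
```

For this hypothetical derivation, the "Anywhere" version accepts the anaphor, while the timed version rejects it; a pronoun with the same profile is accepted under the timed Condition B, since its early local binding occurs before it is visible.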


(2)–(3) are reformulations of the application of Conditions A and B as “Anywhere” or “Everywhere” Conditions, in the sense that anaphors like pronouns become visible for binding as soon as their [uCase] is deleted. An element containing [uF] does not fulfill the necessary C-I interface legibility condition. Therefore, anaphors and pronouns with [uF] are inaccessible for binding theoretic computations at early stages of a derivation. (2)–(3) reflect the fact that Conditions A and B apply at the Semantic or C(onceptual)-I(ntentional) Interface to generate a well-formed semantic representation. It follows that Conditions A and B are not “global” in the sense of being active throughout every stage of a derivation, as suggested, for example, for Condition A as an “Anywhere Condition” in (1) or for Condition B as an “Everywhere Condition”. Binding Conditions become active and scan the structural licensing conditions for anaphors and pronouns at a certain local step of the derivation, after elimination of their [uF], i.e., as soon as these elements become visible for the Semantic Interface. Assuming a cyclic “phase-based” application of Condition A/B, the phase is the natural local binding domain in which structural licensing is computed.3 This implies that potential binders outside the phase do not affect the licensing of anaphors and pronouns that have their [uCase] checked. This is different with non-local Condition C effects. I do not apply my analysis to Condition C in general; however, in sections 4 and 5, some residual issues, among them local and non-local Condition C effects, are discussed. The article is organized as follows. Section 2 contains a short discussion of the motivation for Condition A as an “Anywhere Condition” (1) and the shortcomings of this approach. In section 3, I develop the basis for a new derivational analysis of the application of Condition B. 
This analysis is refined and applied to Condition A effects in section 4 where I also raise the question of how the Binding Conditions can be derived. Section 5, as already mentioned, contains the discussion of some residual issues. Section 6 is the summary.

3 Several authors claim that phases (i.e., vP, DP and CP) are binding domains (Quicoli (2008); Lee-Schönfeld (2008); Hicks (2009), among others). Assuming vP and CP to be the local domains for binding resembles the idea of defining the binding domains in terms of the Complete Functional Complex (Chomsky (1986)). I restrict my discussion here to vP and CP but even DPs (nPs) and PPs (FPs; see Cinque (2010)) have been claimed to be phases and therefore independent binding domains.

2. Condition A as an “Anywhere Condition”

Motivation for the version of Condition A in (1) comes from examples in which a structural relation required by Condition A is obliterated by movement. The relevant examples are well known. They are found, for example, with psychological predicates and raising verbs (see Barss (1986, 108), Johnson (1985, 41ff), Johnson (1987; 1992), Pesetsky (1987), and Belletti and Rizzi (1988), among others). Even though I do not represent it in (4)–(9) because it is not important at this point in the discussion, I assume that these verbs alongside unaccusative, passive and raising verbs are associated with a v, as discussed at the end of this section.4

(4) a. ... [VP [please pictures of himself_i] Bill_i]
    b. [TP [pictures of himself_i] [VP [please t] Bill_i]].

(5) a. ... seem to [VP [bother these pictures of each other_i] them_i]
    b. [These pictures of each other_i] seem to [VP [bother t] them_i].
       (It seems that these pictures of each other bother them.)

Passive constructions provide further examples for the relevance of (1) (see Johnson (1985, 44f), Johnson (1992), Belletti and Rizzi (1988)):

(6) [Pictures of themselves_i] were [VP painted t] by the men_i.

I assume that the experiencer DP following to in raising constructions may c-command into the lower clause (see Boeckx (1999); Bošković (2002); Collins (2005); Epstein and Seely (2006) and footnote 9 for relevant discussion):5

(7) a. [Each other_i’s pictures] seem to the men_i [t to be t the most beautiful].
    b. [The stories about themselves_i] appear to the women_i [t to be t complete fabrications].
    c. [Replicants of themselves_i] were believed t to have seemed to the boys_i [t to be t ugly].

4 Belletti and Rizzi (1988) argue that object experiencer psych-verbs are unaccusative double object verbs that select a theme and experiencer argument. Other authors have argued that psych verbs may involve a hidden predicate where the theme is in fact an external argument. For the present discussion, the details of an analysis of psych-predicates are not relevant. The important point for my analysis is that, irrespective of the base generated structure of the arguments involved, the experiencer c-commands the theme before the latter raises to Spec TP (for further discussion on binding and psych verbs, see Rooryck and van den Wyngaerd (2011) and the literature cited there; see also Kiss (this volume)).

5 The same holds for the agentive DP following by in passives as in (6), cf. also (i) (Collins (2005)):
  (i) The letter was sent by John_i to himself_i.

The anaphoric expressions in (4)–(7) fulfill Condition A at some step of the derivation, i.e., not in their final destination but before movement into this position. That an anaphor has to fulfill Condition A at only one step of the derivation, as formulated in (1), is further confirmed by the examples in (8)–(9). Here, in contrast to (4)–(7), the anaphor violates Condition A before NP-movement of the antecedent to TP applies. It fulfills Condition A after NP-movement of the antecedent has taken place:

(8) The men_i seem to each other_i [t to be t nice].

(9) a. Bill thinks that the men_i were [kissed t] by each other_i’s wives.
    b. Susan_i would be [pleased t] by these pictures of herself_i.
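The structural contrast behind (8) can be checked mechanically. The following is a minimal sketch, not the author's formalization: it uses the standard first-branching-node definition of c-command, encodes trees as nested tuples (label, child, ...) with whole phrases as string leaves, and the two bracketings of (8) before and after raising are deliberately simplified, hypothetical structures chosen only to show the contrast.

```python
def find_path(tree, leaf, prefix=()):
    """Return the child-index path from the root to the first matching leaf."""
    if tree == leaf:
        return prefix
    if isinstance(tree, tuple):
        for i, child in enumerate(tree[1:]):  # tree[0] is the node label
            found = find_path(child, leaf, prefix + (i,))
            if found is not None:
                return found
    return None


def node_at(tree, path):
    """Follow a child-index path down from the root."""
    for i in path:
        tree = tree[i + 1]
    return tree


def c_commands(tree, a, b):
    """True if leaf a c-commands leaf b (first-branching-node definition)."""
    pa, pb = find_path(tree, a), find_path(tree, b)
    if pa is None or pb is None or pa == pb:
        return False
    # Neither node may dominate the other.
    if pb[:len(pa)] == pa or pa[:len(pb)] == pb:
        return False
    # Climb from a's mother to the first branching node dominating a.
    anc = pa[:-1]
    while anc and len(node_at(tree, anc)) - 1 < 2:
        anc = anc[:-1]
    return pb[:len(anc)] == anc


# Simplified, hypothetical structures for (8).
before = ("VP",
          ("V'", "seem", ("PP", "to", "each other")),
          ("TP", "the men", ("T'", "to", "be nice")))
after = ("TP", "the men",
         ("T'", "T",
          ("VP",
           ("V'", "seem", ("PP", "to", "each other")),
           ("TP", "t", ("T'", "to", "be nice")))))

print(c_commands(before, "the men", "each other"))
print(c_commands(after, "the men", "each other"))
```

In these toy structures the antecedent c-commands the reciprocal only after raising, which is the configuration the text describes for (8)–(9).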

Within certain versions of the Principles and Parameters framework, in which it was assumed that the Binding Conditions apply at a certain level of representation, examples like (4)–(9) have been problematic. Condition A is fulfilled at S-structure in (8)–(9) but violated at S-structure in (4)–(7); Condition A is fulfilled at D-structure in (4)–(7) but violated at D-structure in (8)–(9); it is fulfilled at LF in (8)–(9) under the assumption that reconstruction does not apply for A-movement (Chomsky (1995)), but violated at LF in (4)–(7). Given the global view in (1), however, the grammaticality of (4)–(9) can be explained in a unified manner. Furthermore, after the elimination of D-Structure and S-Structure as levels of representation (cf. Chomsky (1995)), (1) had the advantage of being compatible with the perspective that the only linguistically significant levels are the C-I and A(rticulatory)-P(erceptual) interface levels.

An alternative analysis is based on a representational approach as proposed, for example, in Barss (1984; 1986; 1988). Barss’ analysis covers the effects that movement can have on binding relations. The central theoretical idea is the concept of “Chain Binding” according to which Condition A, for example, can be fulfilled if an antecedent α binds a dependent element β or c-commands a member of the movement chain whose head contains β. According to this view, Condition A is fulfilled in the examples involving A-movement (4)–(7) (and (8)–(9)) as well as with A’-movement, for example in (10)–(11), because the antecedent binds a trace of the DP that contains the anaphor.

(10) Bill_i wonders [CP [which pictures of himself_i/j] Joe_j likes t].

(11) [Which pictures of himself_i/j] does Bill_i think [CP t Joe_j likes t]?

Finally, compare (10)–(11) with a more complex example involving wh-extraction from a wh-island (see Sabel (2002)):

(12) a. *Bill asked Mary_i [CP where [Paul bought some pictures of herself_i]].
     b. Bill asked Mary_i [CP which pictures of herself_i [Paul bought t]].
     c. ??Which pictures of herself_i did Bill ask Mary_i t where [Paul bought t]?

As (12-a) shows, the anaphor is not adequately bound by its antecedent. However, in the intermediate landing site in the embedded Spec CP it is accessible to binding by its antecedent, as can be seen from (12-b). The marginality of (12-c) is due to the violation of the wh-island constraint. (12-c) is much better than (12-a), indicating that the anaphor meets Condition A in (12-c) in contrast to (12-a). In (12-c) the anaphor contained in the wh-phrase is not c-commanded by the matrix object. Nevertheless it can take Mary as antecedent. Given that the anaphor is not bound in its underlying position (12-a), nor in its surface position (12-c), and given that the intermediate Spec CP is filled with a wh-phrase in (12-c), the question is how the anaphor can fulfill Condition A in this example. The long-moved wh-phrase in (12-c) is extracted via a second specifier position in the embedded CP (Rizzi (1997); Richards (2001); Sabel (2002)). In this position, the anaphor may be bound by the matrix object, assuming a VP-structure for this ditransitive construction in which the base position of Mary in Spec VP is structurally higher than the embedded CP.

The examples in (10), (11) and (12b-c) can be analyzed in terms of chain binding. However, it remains unclear why, from a Minimalist point of view, this chain condition should exist at all; for example, it does not follow from independently established operations (see also Hornstein et al. (2005, chapter 8)) and, furthermore, the presupposed notion of chain is not compatible with a cyclic phase-based approach. Assuming, however, that Condition A can be satisfied anywhere in the derivation as stated in (1), the anaphor in (10)–(11), (12b-c) satisfies Condition A at a step of the derivation when the wh-phrase is located in an intermediate Spec CP position. Another way of dealing with the examples (10)–(11) and (12b-c) is to postulate an operation that yields “reconstruction” for the purposes of the Binding Theory in combination with A’-movement (see Chomsky (1995)). However, we can uniformly account for the data in (4)–(11) and (12b-c) on the basis of (1) alone without assuming that an additional reconstruction operation takes place (see also the discussion in section 5). With respect to A’-movement, the anaphor in (10) is bound, for example, by Joe before wh-movement takes place and it is bound in (11) by Bill and in (12b-c) by Mary at the step of the derivation when the wh-phrase is moved into the embedded Spec CP, i.e., as a result of independently motivated movement steps in syntax. A similar account can be given for examples (4)–(7) and (8)–(9) involving A-movement.6,7

6 Another (potential) problem for the representational “chain condition” approach arises in (11) under co-reference between Bill and himself, if the intermediate trace/copy deletion approach is adopted (see Lasnik and Saito (1992); Chomsky (1995); Lasnik (2001)). After deletion of t the example no longer meets the representational conditions for fulfilling Condition A; however, see Chomsky (1995, 387, fn. 75) for some related suggestions. Note also that a different concept of (uniform) chains is proposed in Chomsky (2008) which makes intermediate trace deletion superfluous in cases like (11). Chomsky proposes feature splitting with internal Merge, where φ-features and the valued [uCase] of a DP do not move to a phase edge position but only [Phon] and [iQ] do.

7 (1) and likewise the revised version in (2) are compatible with the so-called predicate/argument asymmetries (see Barss (1986, sect. 3.4); Huang (1993); Takano (1995); Barss (2001)). (10)–

Derivational Binding and the Elimination of Uninterpretable Features

193

Although the “Anywhere Condition” (1) is compatible with the examples involving Condition A discussed so far, (1) is too weak. It does not exclude the examples (13)–(15): (13) a. *Himself i surprised t Johni . b. *Himself i surprised t himselfi . (14) a. *Himself i seems to Johni [t to be ugly]. b. *Himself i usually strikes Johni as [t amusing]. (15) *John expected herself i to seem to Maryi t to be t pregnant. For reasons discussed in the introduction and in section 5, I will not refer to Condition C in the analysis of these examples; and in addition, for (13-b) such an analysis would not be sufficient. However, as will be illustrated in section 4, we can account for the data on the basis of (2). The crucial difference between, for example, (13) and (4) is that the anaphor has its [uCase] erased in situ (in PP) in (4). Therefore it is visible for the binding operation in (4-a). The same explanation as for (4) can be given for (5)–(12). In all of these examples the anaphor satisfies Condition A, because it is c-commanded by its antecedent (in its local domain) after its [uF] is erased as demanded in (2). In contrast, in (13)–(15), the anaphor has erased its [uCase] in a position in which it is not c-commanded by a local antecedent. In the following, I assume that the verbs in (4)–(15) i.e., even psych verbs, passive and raising verbs are all associated with v (cf. Chomsky (2001, 12)) and that, furthermore, all types of v are phase heads (see also Legate (2003)). Hence, A-movement has to proceed successive-cyclically via vP in these cases (see also Sauerland (2003) and Richards (2011)). The examples in (4)–(6) have the more detailed structure shown in (4’)-(6’): (11) exemplify the phenomenon of multiple binding domains. The anaphor can be bound by the embedded or matrix subject. In (i), however, the anaphor is embedded within a predicate and can only be bound by the embedded subject: (i)

a. b.

[How proud of himself i/*j ] does Billj think [t Joei will be t]? [Critizice himself i/*j ] Billj thinks [t Joei will not t] .

Huang (1993); Takano (1995); Barss (2001) argue that the presence of an unbound predicateinternal subject trace is responsible for the impossibility of extending the binding domain of the anaphor: (ii) a. b.

[AP how tJoe proud of himself *j/i ] does Billj think [t Joei will be tAP ]? [vP tJoe critizice himself i/*j ] Billj thinks [t Joei will not tvP ].

The embedded predicate is the local binding domain for the anaphor in (ii) but not in (10)– (11). The anaphor is moved together with its local binding domain that contains the trace of the embedded subject, which represents the next possible antecedent at every stage of the derivation.

(4’) a. ... [vP [VP [please pictures of himself_i ] Bill_i ]]
     b. [TP [pictures of himself_i ] [vP t please [VP [ t ] Bill_i ]]].

(5’) a. ... seem to [vP [VP [bother these pictures of each other_i ] them_i ]]
     b. [These pictures of each other_i ] [vP t seem to [vP t bother t them_i ]].
        (It seems that these pictures of each other bother them.)

(6’) [Pictures of themselves_i ] were [vP t painted t] by the men_i.

Adopting the idea of feature-inheritance (Richards (2007); Chomsky (2007; 2008)), I assume that [uF]-probes start out on the phase heads (C, v) and are inherited by the heads of their complements (T, V). Feature-inheritance, probing and transfer to the interface levels apply simultaneously (cf. Richards (2011)). This has the effect that when C is merged in the examples (4’)-(6’) above, T probes and VP is transferred and is no longer accessible for core syntactic operations. The relevant locality pattern of search space for a probe is reminiscent of the formulation of the Phase Impenetrability Condition (PIC) in Chomsky (2000: 18), i.e., “In phase α with head H, the domain of H is not accessible to operations outside α; only H and its edge are accessible to such operations.” According to this notion of locality, probes on T (and C) can see as far as the edge of the vP phase (Spec-vP, vP-adjuncts) but not beyond, since v transfers its complement as soon as the next head (T) is operative (Chomsky (2000)) or as soon as v’s inherited φ-feature set is valued by the object (Richards (2011)). Given that every v is a phase head and given that T cannot probe into VP, successive cyclic movement of the nominative NP is necessary in (4’)-(6’). In the following section, I discuss a phase-based derivational analysis of the application of Condition B. This analysis is then applied to Condition A in section 4.

3. A derivational analysis of Condition B

Let us turn to the anti-locality binding properties of pronouns and determine precisely at what point(s) of a derivation Condition B applies. At what stage(s) of a derivation does a pronoun have to be free in its local binding domain (i.e., in its phase)? In this section, I will try to show that Condition B applies as soon as the [uCase] of the pronoun is erased (but not at earlier stages of the derivation), and that from this point in the derivation on, the pronoun has to be free in its phase. Let us first consider examples with pronouns that have their Case checked in their base position.

(16) a. It seems to him_i [that it is likely [that he_i will lose]].
     b. *He_i [vP t seems to him_i [t to be likely [t to lose]]].
     c. Bill_i’s mother [vP t seems to him_i [t to be likely [t to lose]]].

In (16-b) the pronoun him has its [uCase] checked in its base position when it is merged with to. At this point of the derivation (in its base position, cf. (16-a)), it fulfills Condition B, but violates Condition B after NP-movement of he applies, as shown in (16-b). The sentence is ungrammatical because him, after it has had its [uF] checked, does not fulfill Condition B in its local domain (phase). Example (16-c) suggests that the ungrammaticality of (16-b) is not due to reconstruction or to an intermediate movement step of the pronoun he into the position t; see also (19) below. A similar situation to (16) arises in (17)–(18). The pronoun is free and has its [uCase] checked before T is merged and NP-movement to Spec TP applies, as illustrated in (17-a), (18-a); after NP-movement it is bound and violates Condition B, cf. (17-b), (18-b).

(17) a. [pleased Bill_i ] him_i.
     b. *Bill_i [pleased t] him_i.
(18) a. [appeared a ghost_i ] in front of it_i.
     b. *A ghost_i [appeared t] in front of it_i.

In non-phase-based analyses, it has been argued on the basis of such examples that Condition B applies at S-Structure (see Barss (1986); Belletti and Rizzi (1988)). Other authors have argued that Condition B applies only at LF (see Uriagereka (1988); Chomsky (1995, 211)). We also find the view that Condition B applies either at S-Structure or at LF (cf. Hestvik (1990)). Within our derivational analysis, the licensing of a pronoun can be seen as constrained by local steps in the derivation. In this sense, the examples in (16-b), (17-b), and (18-b) show that it is not sufficient for a pronoun to fulfill the Binding Theory at one stage of the derivation, i.e., it is not sufficient for a pronoun to be free in its base position at some stage of the derivation.
Note that the examples above are also compatible with a “global” analysis as mentioned in section 1, according to which disjoint interpretative procedures for pronouns occur at every point of the derivation (see Lebeaux (1995); Epstein (1998); Epstein and Seely (2006); Lebeaux (2009), among others). Given that the pronoun is bound in its domain at one step of the derivation in (16-b), (17-b), and (18-b), the derivation is cancelled. However, the following examples show that it is inadequate to conclude from (16-b), (17-b), and (18-b) that a pronoun necessarily violates Binding Theory if it is bound in the relevant binding domain at one stage of the derivation:

(19) He_i seems to himself_i [t to be t smart].
     (compare: *It seems to Bill_i he_i is intelligent)
(20) a. [pleased he_i ] himself_i.
     b. He_i [pleased t] himself_i.
(21) a. were considered by each other_i [they_i (to be t) intelligent].
     b. They_i were considered by each other_i [(t to be) t intelligent].
        (compare: They_i consider *them_i/each other_i (to be) intelligent)

Before the pronoun moves to the subject position in (19)–(21), it violates Condition B at one step of the derivation, namely in its base position in (20) and in the embedded Spec TP in (19). Example (21-a) offers both possibilities. These examples show that the global Condition B is too strong. A pronoun may be bound in its local binding domain at one stage of the derivation without causing a violation of Condition B. Making certain assumptions about feature checking, to which I turn in a moment, we can describe the difference between (16)–(18) and (19)–(21) as follows. In (19)–(21), at the step of the derivation when the pronoun is bound in the relevant local domain, it does not yet have its [uF] checked. Bearing [uCase], the pronoun is not visible to interpretative procedures (Condition B). It becomes visible at the step of the derivation when it occupies the (matrix) Spec TP position. At this step of the derivation it does not violate Condition B. Therefore, (19), (20), and (21) are grammatical. By contrast, in (16)–(18), the pronoun’s [uF] is eliminated at the step of the derivation before the pronoun is bound in its local domain. The pronoun becomes visible to interpretative procedures and violates Condition B after movement of the antecedent to the root Spec TP takes place. Assuming a probe-goal analysis, at the step of the derivation when the root C/T heads are merged in (19)–(21), T bears (unvalued) [uφ] and the pronoun (DP) has (valued) [iφ]. In addition, the pronoun has entered the derivation with [uCase], hence T and DP are active. The nominative pronoun is moved to vP, Agree between probe (T) and goal (DP) applies, and the [uφ] of T are deleted (valued). The [uCase] of DP deletes as a side effect (assuming that structural Case is a reflex of a [uφ]-set). Movement of DP to Spec TP applies for an independent reason, namely, to check the uninterpretable [EPP] of T that cannot be checked in situ.
Given that the EPP-feature is not a matching feature, we can think of pronoun movement to Spec TP in (19)–(21) as being induced by agreement between T and DP, i.e., by an operation that values the [uφ] of T and satisfies the [EPP] at the same point of the derivation. Then the pronoun in (19)–(21) has its uninterpretable features deleted at the step of the derivation when it occupies the Spec TP of the finite clause and at this point (being located in the matrix Spec TP) it does not violate Condition B. Alternatively, we could assume that movement to the matrix Spec TP applies for Case checking reasons; i.e., that T and the DP that raises to Spec TP bear [uCase]. These features on T and DP are checked as a result of movement to Spec TP (see Chomsky (1995), Epstein and Seely (2006, 195ff)). Note that under a Case-checking/Move-F approach, in which the pronoun checks its own and T’s [uCase] via movement into a specifier position, as well as under the probe-goal analysis, the position of the pronoun in which it is sensitive to Condition B can be characterized uniformly for (16)–(21) as the position where the pronoun has checked its [uCase]. As mentioned in section 1, I assume that movement to Spec TP proceeds via a vP specifier position in these examples for independent reasons. The role of this intermediate movement step will be discussed in the next section. (19)–(21) have shown that a pronoun that violates the structural licensing condition for Condition B in its θ- (base) or in an intermediate position can still satisfy Condition B if it is moved into a position where its [uCase] erases and the pronoun is outside of the local domain that contains its binder. Now compare (19)–(21) with the following examples. In contrast to (19)–(21) the pronoun’s [uCase] (accusative) is deleted in the VP in which the pronoun was externally merged. The pronoun, now being accessible to interpretative procedures, violates Condition B, because it is bound by he in its vP phase. Subsequently, he is moved to Spec TP, him is A’-moved and the complement of v is transferred.

(22) a. *He_i washes him_i.
     b. *Him_i, he_i washes t.
     c. *Him_i, Mary thinks that t he_i washes t.

In (22-a), him is bound in its local domain (vP) after its [uCase] is deleted. A Condition B violation cannot be avoided by A’-movement of the pronoun that takes it out of its phase as in (22b-c). As already pointed out (see footnote 1), uninterpretable A’-features are not part of the pronoun, but of the corresponding functional heads. They are therefore irrelevant for the timing of the application of Condition B in examples such as (22). It is correctly predicted that the following examples are likewise impossible:

(23) a. *Bill_i took [many pictures of him_i ].
     b. *How many pictures of him_i did Bill_i take t?
(24) a. *Bill_i never talked [with him_i ].
     b. *With him_i Bill_i never talked t.
The discussion so far gives rise to the following descriptive generalization: If a pronoun that has its [uF] checked/valued violates Condition B in its local domain (phase), then the violation cannot be overcome at further stages of the derivation, and the derivation crashes, see also (3). In other words, a pronoun that has its [uF] checked has to fulfill Condition B in its phase. After transfer, the pronoun is no longer subject to local licensing conditions.

(25) Derivational Condition B
     Only a pronoun with its [uF] checked is visible to interpretative procedures and must be free in its local domain (phase).

(25) correctly predicts that (19)–(21) are well-formed. In (19)–(21), the pronoun has its [uF] deleted in a (derived) position where it is free. The examples (16-b), (17-b) and (18-b) are excluded by (25). The pronoun in each of these examples has its [uF] checked in its base position and is bound in this position within its local domain at one step of the derivation. Therefore it violates Condition B and the derivation is cancelled. The same holds for (22)–(24). In these examples, the pronoun violates Condition B at the step of the derivation after its [uCase] is checked in its phase. It does not matter that it is A’-moved afterwards. (25) correctly predicts that such a derivation may not neutralize the Condition B violation. Consider likewise (26). The pronoun violates Condition B at an earlier step of the derivation taking Bill as its antecedent in its vP phase. If this option is used, the derivation is already terminated before wh-movement to Spec CP applies. However, if John is the antecedent the pronoun fulfills (25) and the derivation converges (see, however, also the discussion of this example in (46) below):

(26) John_i wondered which picture of him_i/*j Bill_j took t.

Examples such as (27-a) and (28-a), in which predicates are A’-moved, are of the same type (Huang (1993); Takano (1995); Barss (2001)):

(27) a. *[t_i criticize her_i ] John thinks Mary_i will not.
     b. [t_i criticize her_j ] Mary_j thinks John_i will not.
(28) a. *[How t_i proud of him_i ] do you think John_i should be?
     b. [How t_i proud of him_j ] does John_j think I_i should be?

The pronouns her and him in (27-a) and (28-a) are not free in their local binding domain after they have valued their [uCase]. They therefore violate (25). They are bound by the embedded subject; its presence is represented here as the unbound predicate-internal subject trace (see also footnote 7 for similar cases with respect to Condition A). In contrast, in (27-b) and (28-b) the pronouns fulfill (25). The predicate movement facts also confirm the derivational analysis of Condition B proposed in this paper.
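The contrast between the derivational formulation in (25) and the “global” view can be made concrete with a small sketch. It is only an illustration under stated assumptions: a derivation is reduced to a hand-annotated list of stages, and the names `Step`, `violates_derivational_b` and `violates_global_b` are hypothetical labels introduced here, not part of the analysis itself. The annotations for (16-b), (19) and (22-b) follow the prose description of these examples; they are not computed from syntactic structures.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Step:
    """One stage of a derivation, seen from the pronoun's point of view."""
    note: str
    case_checked: bool   # has the pronoun's [uCase] been checked/erased?
    locally_bound: bool  # is it bound by a co-indexed DP inside its phase?


def violates_derivational_b(steps: List[Step]) -> bool:
    """(25): only a Case-checked pronoun is visible; it must then be free."""
    return any(s.case_checked and s.locally_bound for s in steps)


def violates_global_b(steps: List[Step]) -> bool:
    """'Global' variant: being locally bound at any stage is fatal."""
    return any(s.locally_bound for s in steps)


# Hand-annotated stage lists (assumptions based on the prose, not parser output).
EX_16B = [  # *He_i seems to him_i [t to be likely [t to lose]]
    Step("him merged with 'to'; [uCase] checked", True, False),
    Step("he raises past him within the matrix phase", True, True),
]
EX_19 = [  # He_i seems to himself_i [t to be t smart]
    Step("he in embedded Spec TP; [uCase] still unchecked", False, True),
    Step("he in matrix Spec TP; [uCase] checked", True, False),
]
EX_22B = [  # *Him_i, he_i washes t
    Step("him in VP, accusative checked; he in Spec vP", True, True),
    Step("him A'-moved out of the vP phase", True, False),
]

for name, steps in [("(16-b)", EX_16B), ("(19)", EX_19), ("(22-b)", EX_22B)]:
    print(name,
          "derivational violation:", violates_derivational_b(steps),
          "| global violation:", violates_global_b(steps))
```

On these annotations the global check wrongly excludes (19), whereas the check modelled on (25) excludes only (16-b) and (22-b); the later A’-movement stage in (22-b) does not undo the earlier violation.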
In (29-a), the antecedent has left the local binding domain of the pronoun him (vP) in which the latter has had its [uCase] deleted. (25) correctly predicts that this derivation is impossible. In (29-b), when the lowest VP is transferred the pronoun is not in a local binding domain with Mary located in the matrix clause, so we can conclude that the ungrammaticality of pronoun binding in (29-a) is not caused by John in its derived Case position but that it is caused at the step of the derivation when John and the pronoun are located within the first vP phase, as predicted by our analysis.

(29) a. [CP [TP John_i [vP t_i seems to Mary [TP t_i to [vP t_i be expected [TP t_i to [vP t_i like *him_i/himself_i ]]]]]]].
     b. John seems to Mary_i to be expected [TP t to [vP t like her_i/*herself_i ]].

Note that the antecedent John has not yet deleted its [uCase] when it acts as a binder in (29-a). In the next section, I will discuss this asymmetry between binder and bindee in more detail. In addition, I will discuss the complementary distribution of pronouns and anaphors, as illustrated, for example, in (29). As already mentioned in the introduction, my analysis is based on the assumption that anaphors, like pronouns, are inaccessible for the computation of binding relations unless they have checked their [uCase], and that an anaphor that has its [uCase] erased needs to be bound in its local domain, i.e., in its phase. The anaphor binding facts in (29) follow directly from this analysis.

4. Condition A reconsidered

(25) correctly predicts that a Condition B violation results for a pronoun that is moved to an EPP- (Case-) checking-position in which it is bound in its phase as in (30-a). After the pronoun is moved into the matrix vP phase that contains the antecedent, accompanied by checking its [uCase], it violates (25). As illustrated with (30-b), if the pronoun is replaced with an anaphor, the derivation leads to a grammatical result.8

(30) a. *Mary_i expected [her_i to seem to Bill [t to be t pregnant]].
     b. Mary_i expected [herself_i to seem to Bill [t to be t pregnant]].

8 However, this is also expected according to the global view of Condition A in (1) as an “Anywhere Condition.” The cases of anaphoric binding that are problematic for analysis in terms of the Anywhere Condition (1), already mentioned in the introduction, will be discussed later in this section.

Let us next consider an example with a pronoun that has had its [uCase] checked and violates (25) as a consequence of an intermediate movement step of its binder (see also Chomsky (2004, fn. 56) and Nevins (2005) for relevant discussion):

(31) a. Bill_i/He_i seems [TP t_i to t_i appear to *him_i (/himself_i) [TP t_i to be t_i intelligent]].
     b. Mary_i/She_i seems to Bill [t_i to t_i appear to *her_i (/herself_i) [t_i to be t_i pregnant]].

(31) is interesting in at least two respects. Firstly, it is potentially problematic for analyses which assume that intermediate A-movement steps are impossible (as, for example, Epstein and Seely (2006)). According to such an analysis Bill/he and Mary/she occupy only the base (θ-) position and the final destination of movement at different points of the derivation in (31). Then the question is why (31) should violate Condition B or fulfill Condition A. Even if it is assumed that pronoun/antecedent or anaphor/antecedent pairs do appear together in the same phase. As shown in the following examples, co-reference between the pronoun in the intermediate clause and an antecedent in the matrix clause is possible whereas anaphoric binding is degraded (perhaps more difficult to interpret due to multiple raising in (32) vs. (33)). This provides additional evidence for the fact that binding in (31) is mediated from the position of t_i:

(32) Mary seems to Bill_i to appear to him_i/*himself_i to be intelligent.
(33) It seems to Bill_i to appear to him_i/?*himself_i that Mary is pregnant.

Note also that we cannot ascribe the ungrammaticality of (31) with pronominal binding to the fact that a Condition C violation arises at one step of the derivation. This would incorrectly rule out the derivation of (31) with an anaphor. (This, however, does not hold under the analysis of [i/uRef] discussed below.) Within the analysis presented here, pronominal binding in (31) is ruled out due to an intermediate movement step and (25). At the step of the derivation when the antecedent is located in the position of t, the pronoun (him/her) has already checked its [uCase], being visible for Condition B at the point when it is too close to its binder. The examples in (31) suggest that intermediate A-movement steps exist. The examples in (31) are interesting for a second reason. (31) shows that in contrast to a bindee, an element may qualify as a binder for Condition A/B without having checked its [uCase]. This phenomenon has already been mentioned in connection with example (29). In order to be able to understand the asymmetry between binder and bindee, we have to determine whether the binder can bear its [uCase] or whether it needs to bear its [uCase].
In order to find this out, we first have to discuss which featural relationships are constitutive properties of binding relations and the reason for the different locality restrictions of anaphors and pronouns.9 Which Case the antecedent and the dependent element bear is not important for the licensing of pronouns and anaphors. We know that the Case of pronouns and anaphors might be different from the Case of their co-referent NP. However, anaphors and pronouns agree in φ-features with their co-referent DPs, as verbs agree in φ-features with their subjects (and objects). It is natural to search for a unified account of this similarity that relates Binding to Agree.

9 I assume that experiencer phrases as in (31)–(33) are not PPs but DPs with lexical Case and with to adjoined to DP (see also Epstein and Seely (2006) for more extensive discussion). In contrast to the examples in (7), the experiencer argument in (32)–(33) qualifies at no step of the derivation as a binder. The examples in (32)–(33) with anaphoric binding are therefore ruled out for independent reasons, see also the discussion below.

One “binding as agreement” analysis is based on a feature-valuation (Agree) operation (as in, for example, Reuland (2005); Heinat (2006)), in which an unvalued feature of the anaphor triggers Agree. The featural value of the anaphor depends upon the featural value of the antecedent via T that is in a local syntactic relation with the anaphor. Given that anaphors are often morphologically underspecified with respect to certain features (such as gender/number), such an analysis is based on the idea that names, definite descriptions, pronouns and quantifiers are already equipped with valued φ-features in the numeration whereas anaphors enter the derivation with unvalued φ-features. This difference is seen as the source of the different locality constraints for anaphors and pronouns, i.e., the fact that an anaphor has to be bound and a pronoun has to be free in its local domain. However, I do not adopt this view, because examples also exist in which pronouns are underspecified with respect to φ-features (for example, you, underspecified with respect to number and gender) but behave nevertheless like pronouns rather than anaphors, and, finally, we find examples with anaphors like him-/herself that are fully specified but nevertheless behave like anaphors rather than pronouns with respect to their binding domains (see Hicks (2009) for discussion).

I assume that different kinds of locality restrictions for anaphors and pronouns derive from different kinds of referential features, i.e., interpretable referential features [iRef] (on non-anaphors) and uninterpretable referential features [uRef] (on anaphors). Their interpretation is determined by the context. Referential features are associated with an index that is assigned a specific referent at the semantic interface. These features are already selected as parts of (non-quantificational and non-wh) NPs in the numeration. An element with [uRef] (an anaphor) needs a local binder with an [iRef]. An unvalued referential feature on an anaphor is valued by a matching valued referential feature of the antecedent.
Under this view, what counts as a local domain for anaphors and pronouns follows from phase theory, θ-theory (as will be illustrated with (34)-(35) below), and the different referential features of pronouns and anaphors. In example (34), already mentioned in section 2, both anaphors have valued their [uCase]. (34) is ungrammatical because the anaphors bear unvalued [uRef]:

(34) *Himself surprised t himself.
(35) *[vP John_i likes him_i ].

Pronouns are inherently equipped with [iRef], which is the reason for their anti-locality behaviour (Condition B). In (35), the featural specification does not distinguish John[3Ps/SG/Masc]/[iRef] from him[3Ps/SG/Masc]/[iRef] (except, irrelevantly, with respect to their PF-features). Hence the vP structure is interpreted as involving an ill-formed A-chain, i.e., as one (moved) NP bearing two θ-roles. The violation of the θ-criterion is overcome when either a featurally distinct element, such as, for example, a φ-agreeing anaphor himself[3Ps/SG/Masc]/[uRef] is selected (or an NP such as Bill bearing another index), or when the pronoun him is used in another phase. As already mentioned, I assume that the probe-goal relation with respect to [uRef] valuation is realized under an extended notion of a probe, i.e., between an (antecedent) XP and the anaphoric element.10 Condition A reduces to the fact that an anaphor needs its [uRef] valued in its phase, probably because a transferred phrase may not contain elements bearing [uRef], i.e., a well-formedness condition at the semantic interface. Condition B reduces to the fact that two elements with identical [φ]- and [iRef]-features cannot appear in the same phase. (Note that by the same reasoning local Condition C effects as in *He_i likes John_i are ruled out. However, Condition C effects are subject to additional (non-syntactic) constraints (see section 5).) Hence, syntactic Binding is a complex operation, in which an [iRef]-bearing antecedent enters into a relationship with φ-agreeing pronouns and anaphors. The latter bear [uCase] as well as [uRef] (anaphors) or [uCase]/[iRef] (pronouns). Based on the preceding discussion of derivational steps at which Conditions A/B apply, we can conclude that [uRef] on the anaphor is valued after its [uCase] is eliminated and that [iRef] on a pronoun becomes visible after its [uCase] is erased.

Let us now return to the question of whether an antecedent needs to bear its [uCase] to be able to act as a binder, i.e., as a [Ref]-probe. It is plausible to assume that the antecedent XP needs to bear its [uCase] as a consequence of the Activity Condition (i.e., a structural Case feature renders DPs ‘active’; see Chomsky (2000; 2001)). A probe is active by virtue of bearing [uF]. In fact, the examples discussed in the preceding sections are all compatible with this conclusion concerning the binder. Consider, for example, (16-b), (17-b), (19), (20-b), repeated here in (36) and (37), where the antecedent moves successive cyclically via vP. It is active at the step of the derivation where it binds the pronoun/anaphor before C/T have been merged, i.e., being located in t in (36-a) and (37-a) and in t in (36-b) and (37-b).

(36) a. *He_i [vP t seems to him_i [t to be likely [t to lose]]].
     b. *Bill_i [vP t [[pleased t] him_i ]].
(37) a. He_i [vP t seems to himself_i [t to be t smart]].
     b. He_i [vP t [[pleased t] himself_i ]].

10 See also Hicks (2009, 125) and the literature cited there for the same view of binders as XPs. In Sabel (1998; 2001; 2003) it is argued that XPs can be probes or “feature-checkers” in multiple fronting constructions because the relevant feature in the functional head (D^0) may project up to the X^max level and attract other XPs. Furthermore, the analysis raises the question of how the structural Case of an anaphoric object is checked. I assume that the φ-features on v are insufficiently specified for the case of object anaphors with structural Case. A subject in Spec vP, however, can provide the necessary φ-feature values so that v and the subject DP can do the job together (see also Reuland (2005)). The same explanation can be given for ditransitives such as [vP I showed [vP them [VP t_showed each other]]] where an additional functional head can be assumed (a low applicative v head, cf. Pylkkänen (2008) and McGinnis (2001)).

Another important effect of the Activity Condition for the distribution of pronouns and anaphors can be illustrated with the examples in (38) and (39). It is well-known that elements that are attracted by P(eripheral)-features in the C-system do not qualify as binders. This holds for Condition B as well as for Condition A:

(38) a. He_i thinks Mary likes him_i.
     b. Him_i, he_i thinks Mary likes t.
(39) a. *The girls_i, [each other_i’s dance partners] criticized t.
     b. *How many students of arts_i did [paintings of each other_i ]/[each other_i’s paintings] convince the professor that he should support t?

The descriptive generalization that binding from an A’-position is impossible can now be derived without stipulating inherent properties of A- and A’-positions. Given that A’-moved elements have already erased their [uCase] in their derived position, they do not qualify as binders. Hence, he fulfills Condition B in accordance with (25) in (38-b). In addition, an anaphor is barred in (39) because the potential antecedent has already erased its [uF]. To sum up, parallel activity conditions for the binder are observed with respect to Condition A and Condition B. Coming back to the question of whether the binder can have its [uCase] erased or needs to keep its [uCase], we can conclude that the binder needs to keep its [uF] in order to remain an active probe. The asymmetry between binder and bindee with respect to [uCase] results from the fact that transferred units may contain only (pro)nominal elements with [iRef] or valued [uRef]. Visibility of [iRef] and valued [uRef] presupposes deletion of [uCase]. The binder is transferred as part of a later phase than the bindee and therefore need not delete its [uCase] within the same phase as the bindee, but only at a later step of the derivation, i.e., when it undergoes transfer. Let us now turn again to the discussion of anaphors as bindees within a derivational formulation of Condition A. Based on the preceding discussion, we can conclude that Condition A applies at the step of the derivation after the [uF] of the anaphor is erased:

(40) Derivational Condition A
     Only an anaphor with its [uCase] checked is visible to interpretative procedures and must be bound in its local domain (phase).
Both the global formulation in (1) and the derivational formulation in (40) are compatible with the examples involving Condition A discussed so far. But consider the following examples, already mentioned in section 2:

(41) a. *Himself_i surprised t John_i.
     b. *Himself_i seems to John_i [t to be ugly].


Joachim Sabel

     c. *Himself_i usually strikes John_i as [t amusing].

(1) does not exclude (41). However, we can account for the data on the basis of (40). The anaphor is locally bound at one step of the derivation, but at this step of the derivation, it has not yet checked its [uCase]. Therefore it is invisible for the binding operation. At the step of the derivation when it becomes visible, it lacks a local binder. Hence, the “Anywhere Condition” in (1) is too weak, whereas the derivational condition in (40) makes the right predictions.¹¹ According to Rizzi (1990), the examples in (41) are excluded due to the so-called “anaphor-agreement effect,” which results from an incompatibility between the property of being an anaphor and being construed with agreement (see also Woolford (1999)), but this leaves unanswered the issue of why (42-b) is possible where object agreement is involved. (See Kiparsky (2008) for a discussion of the (im)possibility of nominative anaphors in different languages.)

(42) a. *John expected herself_i to seem to Mary_i t to be t pregnant.
     b. Mary_i expected herself_i to seem to John t to be t pregnant.

As with (41), we do not need to rely on Condition C in order to account for the contrast in (42). (40) correctly predicts the ungrammaticality of (42-a) (similarly to (41)). Assuming a cyclic, phase-based application of Condition A/B, the Binding Conditions are operative after deletion of the [uCase] of the anaphoric/pronominal element takes place. In ECM constructions, the anaphor gets its [uCase] deleted in the matrix vP, which represents its local binding domain. When the ECM subject is a pronoun, ungrammaticality results because the pronoun has deleted its [uCase], and according to (25), it is visible for the locality restriction relevant to Condition B:¹²

(43) *Bill_i believes him_i to like John.

I conclude that anaphors and pronouns are visible to Condition A and Condition B only once they have had their [uCase] checked.
This gives rise to the following generalization:

¹¹ Compare (41-c) to the grammatical Old pictures of himself usually strike John as amusing (cf. Pesetsky (1987); Barss (2001); Baltin (2003)). In this example, the anaphor has deleted its [uCase] in its base position where it is bound, as predicted. The same holds for examples such as (7-a), i.e., Each other’s pictures seem to the men to be the most beautiful. Note, however, that reciprocals in subjects as well as reflexives in picture object DPs often behave as “logophors” which are subject to specific principles of interpretation (see Hornstein (2001, 186), among others).

¹² The ECM complement is not a phase, but the matrix vP is. The situation is different with “for-to” complements, cf. John_i wants (*for) himself_i to win. Here the complement clause is probably a phase, but see Chomsky (2008: 149) and the literature cited there for further discussion.

Derivational Binding and the Elimination of Uninterpretable Features


(44) Binding Conditions A/B apply at the step of the derivation when an anaphor/a pronoun has had its [uCase] deleted.

To sum up, Condition A and Condition B become active during a derivation and check structural licensing conditions after an anaphor or pronoun gets rid of its [uCase]. The reason might be that some featural content of these elements, such as the semantic specifications of their referential features, is read off and, if necessary, manipulated (for example in cases of valuing [uRef] of anaphors by an antecedent) in the narrow syntax only after they have deleted their [uCase]. This implies that the semantic component can operate within a core syntactic derivation, or that the semantic component utilizes syntactic operations such as Agree.
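The timing claim in (44) can be made concrete with a small computational sketch. The Python fragment below is only a toy model under strong simplifying assumptions and is not part of the analysis itself: a phase is represented as a flat list of nominals, linear precedence in that list stands in for c-command, and the names Nominal, active_binders and binding_status are hypothetical labels introduced for illustration. It encodes two ideas from the text: a bindee is evaluated only once its [uCase] has been deleted, and a binder counts only as long as it still bears its [uCase].

```python
from dataclasses import dataclass


@dataclass(eq=False)
class Nominal:
    form: str
    kind: str            # "anaphor", "pronoun" or "r-expression"
    index: str           # referential index, e.g. "i"
    ucase_deleted: bool  # True once [uCase] has been deleted


def active_binders(target, phase):
    """Coindexed nominals that precede the target and still bear [uCase].

    Assumption: linear precedence in the list stands in for c-command.
    """
    position = next(i for i, n in enumerate(phase) if n is target)
    return [n for n in phase[:position]
            if n.index == target.index and not n.ucase_deleted]


def binding_status(target, phase):
    """Evaluate Condition A/B for the target at one derivational step."""
    if target.kind not in ("anaphor", "pronoun"):
        return "not applicable"
    if not target.ucase_deleted:
        return "invisible"  # (44): the conditions do not apply yet
    bound = bool(active_binders(target, phase))
    if target.kind == "anaphor":
        return "ok" if bound else "violation"  # Condition A
    return "violation" if bound else "ok"      # Condition B


# (43) *Bill_i believes him_i to like John: him has deleted its [uCase]
# in the matrix vP, where Bill is still an active binder.
bill = Nominal("Bill", "r-expression", "i", ucase_deleted=False)
him = Nominal("him", "pronoun", "i", ucase_deleted=True)
print(binding_status(him, [bill, him]))

# (41-a) *Himself_i surprised t John_i, modelled as two derivational steps.
john = Nominal("John", "r-expression", "i", ucase_deleted=False)
low = Nominal("himself", "anaphor", "i", ucase_deleted=False)
high = Nominal("himself", "anaphor", "i", ucase_deleted=True)
print(binding_status(low, [john, low]))    # base position, [uCase] unchecked
print(binding_status(high, [high, john]))  # derived position, no binder
```

Under these assumptions the three calls print violation, invisible and violation. This mirrors the reasoning given for (43) and (41): the anaphor in (41-a) is locally bound only at a step where it is still invisible, and it lacks a binder at the step where it becomes visible.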

5. Residual issues

A residual issue is the analysis of so-called “reconstruction” effects in which the bindee is a subpart of a moved constituent as in (45) (=(10)):

(45) Bill_i wonders [CP [which pictures of himself_i/j] Joe_j likes t].

In these cases himself need not be bound in the first phase in which it is merged. I would like to suggest the following analysis for these cases, in which the anaphor has lexical Case. The anaphor’s [uRef] can be valued by a commanding antecedent in a phase either before or after valuation of the anaphor’s [uCase]. This gives rise to multiple possible binding domains. If the [uRef] is valued before [uCase] deletion, we get binding in the first phase, in which the anaphor was externally merged. If the anaphor values its [uRef] after Case valuation, it can wait until its [uRef] is valued in any higher phase.

Concerning the relevant examples with pronouns, there is first a decision to make with respect to the grammaticality judgements. The following sentence is not ungrammatical for every speaker with him_j and Bill_j ((46)=(26)):

(46) John_i wondered which picture of him_i/(*)j Bill_j took t.

Speakers who find this sentence grammatical have the same option for pronouns as for anaphors with lexical Case. The pronoun’s [iRef] relation with another DP in the same phase is computed before or after deletion of the [uCase] of the pronoun. If it is computed after deletion of the pronoun’s [uCase], the sentence is well formed because the pronoun can be A’-moved out of the phase without being bound. For speakers who do not accept this sentence, the pronoun’s [iRef] relation with another DP in its phase is always computed before [uCase] deletion of the pronoun. The difference between anaphors (where only one option for lexical Case marked elements exists) and pronouns (where two options exist) can be argued to be a consequence of the different nature of [uRef]- and [iRef]-features.

Now we still have sentences such as (47) (=(22-b)). They are rejected by every speaker:

(47) *Him_i, he_i washes t.

As already mentioned, the option of “procrastinating” the computation of the [iRef] relation depends on the kind of structural or lexical [uCase]. Only for pronouns with lexical [uCase] can the computation of the [iRef] relationship in a phase be delayed until after [uCase] deletion of the pronoun has taken place.

A further residual issue of the analysis developed in this paper concerns the non-complementary distribution of anaphors and pronouns in certain contexts. Consider, for example, cases in which anaphors and pronouns are A’-moved. In a topic position, as shown in (48-b), an anaphor is located in the local domain that also contains the matrix subject. Why does the pronoun in (49-b) not violate Condition B?

(48) a. *Bill_i thought that Mary talks about himself_i.
     b. Bill_i thought that about himself_i Mary talks t.

(49) a. Bill_i thought that Mary talks about him_i.
     b. Bill_i thought that about him_i Mary talks t.

In (49-b), the pronoun is located in a left-peripheral position. One promising approach would be to find a unified analysis for (48-b), (49-b) and examples (50)–(51), in which anaphors and pronouns are not in complementary distribution in their base position (see Bresnan (1982), Chomsky (1982, 99), Huang (1983), Freidin (1986), Lasnik (1989, chapter 1), and Chomsky (1986, 171ff)), but such an aim is beyond the scope of this paper.¹³

(50) They_i saw each other_i’s (/their_i) pictures.
(51) Joe_i likes this picture of him_i (/himself_i).

¹³ Variation among languages with respect to binding phenomena might affect the proposed analysis as well. Parameterization can, for example, result from different lexical properties of anaphors and pronouns in languages (see Wexler and Manzini (1987), Aoun and Hornstein (1991), Koster and Reuland (1991), Reinhart and Reuland (1993), Cole and Wang (1997), and Lidz (2001), among others).

Thirdly, several aspects concerning variable pronouns need further investigation. The first aspect concerns the status of binding by a quantificational antecedent from different positions. Although pronoun binding is impossible from an A’-position that is created by internal merge of an argument, as illustrated with the weak crossover effect in (52), example (53) shows that binding by an adjunct operator, i.e., binding from a non-argument position, is possible. Probably, binding in (53) applies before topicalization takes place, i.e., when the adjunct is located in its base position:

(52) *Who_i does his_i mother love?
(53) Every day_i, Mary thought that it_i was the happiest day of her life.

Another important question is how the c-command condition is fulfilled in examples such as (54)–(55). Note also that (55) allows for an inversely linked reading (different letters for different students) and for an internal-scope reading (one and the same letter):

(54) In every city_i you find someone who hates it_i.
(55) A letter about every student_i was sent to his_i parents.

A discussion of these aspects would require an article of its own.

Fourthly, the question arises as to whether Condition C can likewise be stated in derivational terms. A derivational version of Condition C according to which an R-expression must be free at every stage of the derivation is suggested in Lebeaux (1991); Heycock (1995); Epstein (1998), and Epstein and Seely (2006). This formulation, however, raises the question of why examples such as (56)–(57) are acceptable.

(56) Bill_i seems to himself_i [t to be t intelligent].
(57) Bill_i’s mother seems to him_i [t to be likely [t to t win]].

In connection with Condition C, additional factors that cannot be captured with the instruments of Binding Theory are often relevant. For example, one additional interfering factor consists in the semantic nature of wh-phrases that contain R-expressions, as discussed at length in Heycock (1995):

(58) a. Which stories about Diana_i did she_i most object to?
     b. *How many stories about Diana_i is she_i likely to invent?

(59) a. *Which pictures of Bill_i does he_i like?
     b. How many of the stories about Diana_i was she_i really upset by?

The ‘depth of embedding’ of the pronoun (60) as well as of the R-expression (61) affects the possibility of co-reference and the appearance of Principle C effects in the following examples (see Guéron (1984, 145) and Huang (1993)).
When the R-expression is not in the same sentence as the pronoun, the sentence is improved, as illustrated in (60). In (61), the depth of embedding of the R-expression varies.

(60) a. ??How many pictures of John_i does he_i think that I like t?
     b. How many pictures of John_i do you think that he_i likes t?


(61) a. *In Bill_i’s apartment, he_i spends a lot of time.
     b. In the apartment Bill_i just rented, he_i spends a lot of time.

“Opacity” induced by genitive phrases is also a case in point (examples from Speas (1991, 248) and Lebeaux (1991, 212, 237)). Compare (61) with (63) and also (61-a) with (62):

(62) a. Mary_i’s cat, she_i likes t.
     b. Which of Mary_i’s cats does she_i like t?

(63) a. ??Mary’s pictures of Bill_i, he_i really likes t.
     b. *Whose examination of Bill_i did he_i fear t?

Consider next the following contrast between a pronoun and an R-expression as a binder (see Emonds (1995)):

(64) a. We had to introduce Mary_i to Mary_i’s guest at the station.
     b. *We had to introduce her_i to Mary_i’s guest at the station.

R-expressions behave differently from pronouns and anaphors because the former are often subject to additional pragmatic constraints. However, as can be seen from (65), in certain contexts pronouns and R-expressions behave in a similar way.

(65) Clinton_i/He_i voted for Clinton_i.

Furthermore, agreement on grammaticality judgements is often difficult to get with Condition C effects (cf. Guéron (1984); Lebeaux (1991); Speas (1991); Huang (1993); Reinhart and Reuland (1993); Epstein and Seely (2006) for discussion). As pointed out in Chomsky (1995: 323), with respect to Condition C, “we enter here into a morass of difficult and partially unsolved questions.”

Finally, consider the following examples that have been used to show that covert movement may not feed Condition A (see Barss (1986, 70), Barss (1988), Lasnik and Saito (1992), and Lasnik (1997)).

(66) a. Many men_i seem to themselves_i to be t_i smart.
     b. *There seem to themselves_i to be [DP t many men_i] smart.

(67) a. Bill asked Mary_i [CP which pictures of herself_i [Paul bought t]].
     b. *Bill asked Mary_i [CP who [t bought which pictures of herself_i]].

Following the analysis of there-expletives in Sabel (2000), (66-b) is derived by movement of the expletive out of “big DP,” i.e., without assuming covert movement (“expletive replacement”). The sentence is ruled out because the [iRef] of the antecedent is associated with the part many men of the DP. Many men does not c-command the reflexive. Therefore it cannot value the [uRef] of the anaphor. There cannot value the [uRef] of the anaphor in (66-b) because it does not contain [iRef]. (67-b) is ruled out under the assumption that covert movement of a complete wh-phrase does not exist with wh-in situ (in English).

6. Summary

In this paper, I have discussed derivationally bound pronouns and anaphors. I have argued against Conditions B and A as “Everywhere/Anywhere Conditions.” A pronoun can violate Condition B at early stages of a derivation, but as soon as its [uCase] is erased (valued) it has to fulfill Condition B in its phase. It was argued that Condition B can be derived from θ-theoretic constraints. A pronoun with a phase-internal co-referential antecedent is excluded because it is interpreted as an A-movement chain bearing two θ-roles. The fact that Condition B operates after the pronoun has eliminated its [uCase] has been argued to be a consequence of probing of referential, i.e., semantic, features. The discussion of Condition A has shown that Condition A as an “Anywhere Condition” is likewise not adequate. Anaphors likewise become visible to Binding (Condition A) only after they have had their [uCase] eliminated, i.e., as soon as they are visible for semantic operations such as the valuation of their [uRef]. Condition A and Condition B become active during a derivation and start scanning the structural licensing condition for an anaphor or pronoun at the same local step of the derivation, i.e., after their [uCase] is erased. The binder, however, needs to bear its [uF] in order to be an active probe. This asymmetry between bindee and binder results from the fact that the binder is transferred as part of a later phase.

Bibliography

Abe, Jun (1993): Binding Conditions and Scrambling without A/A’ Distinction. Doctoral dissertation, University of Connecticut.
Aoun, Joseph and N. Hornstein (1991): Bound and Referential Pronouns. In: C.-T. Huang and R. May, eds, Logical Structure and Linguistic Theory. Kluwer, Dordrecht, pp. 1–23.
Bailyn, John Frederick (2005): A Derivational Approach to Microvariation in Slavic Binding, FASL 15.
Baker, Mark, Kyle Johnson and Ian Roberts (1989): ‘Passive Arguments Raised’, Linguistic Inquiry 20, 219–251.
Baltin, Mark (2003): ‘The Interaction of Ellipsis and Binding: Implications for the Sequencing of Principle A’, Natural Language and Linguistic Theory 21, 215–246.
Barss, Andrew (1984): Chain Binding. Ms., MIT, Cambridge Mass.
Barss, Andrew (1986): Chains and Anaphoric Dependence, MIT-Dissertation, Cambridge, Mass.
Barss, Andrew (1988): Paths, Connectivity and Featureless Empty Categories. In: A. Cardinaletti, G. Cinque and G. Giusti, eds, Constituent Structure. Foris, Dordrecht, pp. 9–34.
Barss, Andrew (2001): Syntactic Reconstruction Effects, In: M. Baltin and C. Collins, eds, The Handbook of Contemporary Syntactic Theory. Blackwell, pp. 670–696.


Belletti, Adriana and Luigi Rizzi (1988): ‘Psych-Verbs and θ-Theory’, Natural Language and Linguistic Theory 6, 291–352.
Boeckx, Cedric (1999): ‘Conflicting C-Command Requirements’, Studia Linguistica 53, 227–250.
Bošković, Željko (2002): ‘A-movement and the EPP’, Syntax 5, 167–218.
Bresnan, Joan (1982): ‘Control and Complementation’, Linguistic Inquiry 13, 343–434.
Canac-Marquis, Réjean (2005): Phases and Binding of Reflexives and Pronouns in English. Ms., Simon Fraser University.
Chomsky, Noam (1981): Lectures on Government and Binding. Kluwer, Dordrecht.
Chomsky, Noam (1982): Some Concepts and Consequences of the Theory of Government and Binding. MIT Press, Cambridge, Mass.
Chomsky, Noam (1986): Knowledge of Language. Its Nature, Origin and Use. Praeger, New York.
Chomsky, Noam (1995): The Minimalist Program. MIT Press, Cambridge, Mass.
Chomsky, Noam (2000): Minimalist Inquiries: the Framework. In: R. Martin, D. Michaels, and J. Uriagereka, eds, Step by step. MIT Press, Cambridge, Mass, pp. 89–156.
Chomsky, Noam (2001): Derivation by Phase. In: M. Kenstowicz, ed., Ken Hale: A Life in Language. MIT Press, Cambridge, MA, pp. 1–52.
Chomsky, Noam (2004): Beyond Explanatory Adequacy. In: A. Belletti, ed., Structures and Beyond. The Cartography of Syntactic Structures, Vol. 3. Oxford University Press, Oxford, pp. 104–131.
Chomsky, Noam (2005): ‘Three Factors in Language Design’, Linguistic Inquiry 36, 1–22.
Chomsky, Noam (2007): Approaching UG from Below. In: U. Sauerland & H.-M. Gärtner, eds, Interfaces + Recursion = Language? Chomsky’s Minimalism and the View from Syntax-Semantics. Mouton de Gruyter, Berlin, pp. 1–30.
Chomsky, Noam (2008): On Phases. In: R. Freidin, C.P. Otero and M.-L. Zubizarreta, eds, Foundational Issues in Linguistic Theory. MIT Press, Cambridge, MA, pp. 133–166.
Cinque, Guglielmo (2010): Mapping Spatial PPs: An Introduction. In: G. Cinque and L. Rizzi, eds, Mapping Spatial PPs. The Cartography of Syntactic Structures, Vol. 6. Oxford University Press, New York, pp. 4–21.
Cole, Peter and Chengchi Wang (1997): ‘Antecedents and Blockers of Long-Distance Reflexives’, Linguistic Inquiry 27, 357–390.
Collins, Chris (2005): ‘A Smuggling Approach to the Passive in English’, Syntax 8:2, 81–120.
Emonds, Joseph (1995): Deep, Free, and Surface Bound Pronouns. In: H. Campos and P. Kempchinsky, eds, Evolution and Revolution in Linguistic Theory. Georgetown University Press, Washington, D.C., pp. 110–137.
Engdahl, Elisabet (1986): Constituent Questions: The Syntax and Semantics of Questions with Special Reference to Swedish. D. Reidel Publishing Company, Dordrecht.
Epstein, Samuel et al. (1998): A Derivational Approach to Syntactic Relations. Oxford University Press.
Epstein, Samuel and T. Daniel Seely (2006): Derivations in Minimalism. Cambridge University Press, Cambridge.
Fox, Danny (1999): Reconstruction, Binding Theory, and the Interpretation of Chains. Linguistic Inquiry 30, 157–196.
Fischer, Silke (2004): Towards an Optimal Theory of Reflexivization. Ph. Diss., Universität Tübingen.
Freidin, Robert (1986): Fundamental Issues in the Theory of Binding. In: B. Lust, ed., Studies in the Acquisition of Anaphora, Vol. 1. Reidel, Dordrecht, pp. 151–188.
Giorgi, Alessandra (1987): The Notion of Complete Functional Complex: Some Evidence from Italian. Linguistic Inquiry 18, 511–518.
Grewendorf, Günther (2003): Dynamic Binding and the Problem of Object-related Anaphors. In: L. Gunkel, G. Müller, and G. Zifonun, eds, Arbeiten zur Reflexivierung. Niemeyer, Tübingen, pp. 91–114.
Grewendorf, Günther and J. Sabel (1999): Scrambling in German and Japanese: Adjunction Versus Multiple Specifiers. Natural Language and Linguistic Theory 16, 1–65.
Guéron, Jacqueline (1984): ‘Topicalisation Structures and Constraints on Coreference’, Lingua 63, 139–174.


Heinat, Fredrik (2006): Probing Phrases, Pronouns, and Binding. Ms. Malmö University.
Hestvik, Arild (1990): LF-Movement of Pronouns and the Computation of Binding Domains. Doctoral dissertation, Brandeis University, Waltham, Mass.
Heycock, Caroline (1995): ‘Asymmetries in Reconstruction’, Linguistic Inquiry 26, 547–570.
Hicks, Glyn (2009): The Derivation of Anaphoric Relations. John Benjamins, Amsterdam.
Huang, Cheng-Teh James (1983): ‘A Note on the Binding Theory’, Linguistic Inquiry 14, 554–560.
Huang, Cheng-Teh James (1993): ‘Reconstruction and the Structure of VP: Some Theoretical Consequences’, Linguistic Inquiry 24, 103–138.
Hornstein, Norbert (2001): Move! A Minimalist Theory of Construal. Blackwell, Oxford.
Hornstein, Norbert, Jairo Nunes, and Kleanthes K. Grohmann (2005): Understanding Minimalism. Cambridge University Press, Cambridge.
Johnson, Kyle (1985): A Case for Movement. Doctoral dissertation, MIT.
Johnson, Kyle (1987): Against the Notion ‘SUBJECT’, Linguistic Inquiry 18, 354–361.
Johnson, Kyle (1992): Scope and Binding Theory: Comments on Zubizarreta. In: Stowell, T. and E. Wehrli, eds, Syntax and the Lexicon. Syntax and Semantics 26, pp. 259–275.
Kayne, Richard (2002): Pronouns and Their Antecedents. In: S. Epstein & D. Seely, eds, Derivation and Explanation in the Minimalist Program. Blackwell, Oxford.
Kiparsky, Paul (2008): Universals Constrain Change; Change Results in Typological Generalizations. In: J. Good, ed., Linguistic Universals and Language Change. Oxford University Press, Oxford, pp. 23–53.
Kiss, Tibor (this volume): Reflexivity and Dependency.
Koizumi, Masatoshi (1992): Copy α and Reconstruction Effects. Ms., MIT.
Koster, Jan and Eric Reuland (eds.) (1991): Long-Distance Anaphora. CUP, Cambridge.
Kural, Murat and George Tsoulas (2005): Indices and the Theory of Grammar, Ms., University of California, Irvine and University of York.
Lasnik, Howard (1989): Essays on Anaphora. Kluwer, Dordrecht.
Lasnik, Howard (1997): Levels of Representation and the Elements of Anaphora. In: Hans Bennis, Pierre Pica, and Johan Rooryck, eds, Perspectives on Binding and Atomism. Foris, Dordrecht, pp. 251–268.
Lasnik, Howard (2001): Derivation and Representation in Modern Transformational Grammar. In: M. Baltin and C. Collins, eds, The Handbook of Contemporary Syntactic Theory. Blackwell, pp. 62–88.
Lasnik, Howard and M. Saito (1992): Move α. MIT Press, Cambridge, Mass.
Lebeaux, David (1991): Relative Clauses, Licensing, and the Nature of the Derivation. In: S.D. Rothstein, ed., Perspectives on Phrase Structure. Syntax and Semantics 25, pp. 209–239.
Lebeaux, David (1995): Where Does the Binding Theory Apply? In: Maryland Working Papers in Linguistics, Vol. 3. University of Maryland, College Park, MD, pp. 63–88.
Lebeaux, David (2009): Where Does the Binding Theory apply? MIT Press, Cambridge, Mass.
Lee-Schönfeld, Vera (2008): ‘Binding, Phases, and Locality’, Syntax 11, 281–298.
Legate, Julie Anne (2003): ‘Some Interface Properties of the Phase’, Linguistic Inquiry 34, 506–516.
Lidz, Jeffrey (2001): ‘Condition R’, Linguistic Inquiry 32, 123–140.
McGinnis, Martha (2001): Phases and the Syntax of Applicatives. In: M. Kim and U. Strauss, eds, Proceedings of NELS 31. GLSA, University of Massachusetts, Amherst, pp. 333–349.
Nevins, Andrew (2005): Derivations without the Activity Condition. MIT Working Papers in Linguistics 49, 283–306.
Pesetsky, David (1987): ‘Binding Problems with Experiencer Verbs’, Linguistic Inquiry 18, 126–140.
Pylkkänen, Liina (2008): Introducing Arguments. MIT Press, Cambridge, MA.
Quicoli, A. Carlos (2008): Anaphora by Phase. Syntax 11, 299–329.
Reinhart, Tanja (1983): Anaphora and Semantic Interpretation. University of Chicago Press.
Reinhart, Tanja and E. Reuland (1993): ‘Reflexivity’, Linguistic Inquiry 24, 657–720.
Reuland, Eric (2005): Agreeing to Bind. In: N. Corver, H. Broekhuis, R. Huybregts, U. Kleinhentz and J. Koster, eds, Organizing Grammar. Linguistic Studies in Honor of Henk van Riemsdijk. Mouton de Gruyter, Berlin/New York, pp. 505–513.


Richards, Norvin (2001): Movement in Language: Interactions and Architectures. Oxford University Press, New York.
Richards, Marc (2007): ‘On Feature Inheritance: An argument from the Phase Impenetrability Condition’, Linguistic Inquiry 38, 563–572.
Richards, Marc (2011): On Feature Inheritance, Defective Phases, and the Movement–Morphology Connection, Ms., University Leipzig.
Riemsdijk, Henk van and Edwin Williams (1986): Introduction to the Theory of Grammar. MIT Press, Cambridge Mass.
Rizzi, Luigi (1990): ‘On the Anaphor-Agreement Effect’, Rivista di linguistica 2, 27–42.
Rizzi, Luigi (1997): The Fine Structure of the Left Periphery. In: L. Haegeman, ed., Elements of Grammar. Kluwer, Dordrecht, pp. 281–337.
Rooryck, Johan and Guido van den Wyngaerd (2011): Dissolving Binding Theory. Oxford University Press.
Ruys, Eddy G. (2000): ‘Weak Crossover as a Scope Phenomenon’, Linguistic Inquiry 31, 513–539.
Sabel, Joachim (1996): Restrukturierung und Lokalität. Universelle Beschränkungen für Wortstellungsvarianten. Akademie-Verlag, Berlin.
Sabel, Joachim (1998): Principles and Parameters of Wh-Movement. Unpublished Habilitation Thesis, University of Frankfurt am Main.
Sabel, Joachim (2000): Expletives as Features. In: R. Billerey et al., eds, WCCFL 19 Proceedings. Cascadilla Press, Somerville, MA, pp. 411–424.
Sabel, Joachim (2001): ‘Deriving Multiple Head and Phrasal Movement: The Cluster Hypothesis’, Linguistic Inquiry 32, 532–547.
Sabel, Joachim (2002): ‘A Minimalist Analysis of Syntactic Islands’, The Linguistic Review 19, 271–315.
Sabel, Joachim (2003): Malagasy as an Optional Multiple Wh-Fronting Language. In: C. Boeckx and K. Grohmann, eds, Multiple Wh-Fronting. John Benjamins, Amsterdam, pp. 229–254.
Sabel, Joachim (2005): ‘A Derivational Analysis of Condition B’, Linguistic Analysis 35, 255–274.
Safir, Ken (2004): The Syntax of Anaphora. Oxford University Press, Oxford.
Saito, Mamoru (2003): ‘A Derivational Approach to the Interpretation of Scrambling Chains’, Lingua 113, 481–518.
Saito, Mamoru (2005): Further Notes on the Interpretation of Scrambling Chains. In: J. Sabel and M. Saito, eds, The Free Word Order Phenomenon. Mouton de Gruyter, Berlin.
Sauerland, Uli (2003): ‘Intermediate Adjunction with A-Movement’, Linguistic Inquiry 34, 308–314.
Speas, Margaret (1991): Generalized Transformations and the S-Structure Position of Adjuncts. In: T. Stowell and E. Wehrli, eds, Syntax and the Lexicon. Syntax and Semantics, Vol. 26, pp. 241–257.
Takano, Yuji (1995): ‘Predicate Fronting and Internal Subjects’, Linguistic Inquiry 26, 327–340.
Uriagereka, Juan (1988): On Government. Doctoral dissertation, University of Connecticut.
Wexler, Ken and R. Manzini (1987): Parameter and Learnability in Binding Theory. In: T. Roeper and E. Williams, eds, Parameter Setting. Reidel Publishing Company, pp. 41–76.
Woolford, Ellen (1999): ‘More on the Anaphor Agreement Effect’, Linguistic Inquiry 30, 257–287.

Faculté de philosophie et lettres, Université catholique de Louvain

Daniel Hole

German Free Datives and Knight Move Binding*

Abstract

This paper proposes a binding analysis for German free datives which makes use of a peculiar tree-geometrical requirement. Free datives are shown to be antecedents in a binding voice which is very similar to reflexivity. The analysis covers “possessor” datives, “beneficiary” datives, and the dativus iudicantis. The unusual tree-geometry of the construction requires the variable which gets bound by the dative antecedent to occupy the left edge of a coargument.

1. Introduction

This paper is concerned with German free datives and their peculiar binding behavior. I argue that free datives are best described in terms of voice. The free dative voice turns out to be very similar to run-of-the-mill cases of reflexivity, which must likewise be modeled as a kind of voice under the theoretical assumptions of Kratzer’s (1996) agent severance. The free dative, just like a reflexive antecedent in German, binds a variable in the local tense domain. What is highly peculiar about the free dative voice is the tree-geometrical requirement that goes along with it. The variable that free datives bind must be at the left edge of a clause-mate coargumental possessum phrase or purpose phrase (‘Knight Move Binding’). Standard implementations of binding don’t include requirements of this kind. The argumentation strives to show that the requirement of Knight Move Binding really exists, and that this kind of binding is a privileged configuration in the grammaticalization of reflexive pronouns crosslinguistically.

The paper delimits the empirical domain of free datives in sections 2 and 3. Section 4 establishes the parallel locality restrictions of dative binding for “possessor” and “beneficiary” datives. Section 5 establishes the Knight Move Binding requirement of free datives. Section 6 develops the semantic implementation of free dative binding with a large detour via semantic theories of reflexivization. Competing proposals are briefly discussed in section 7. Section 8 concludes the paper.

* I would like to thank the editors for creating a highly stimulating atmosphere during the Bamberg workshop. I benefitted a lot from comments made by the audience, especially Hans-Martin Gärtner, Dalina Kallulli, Gereon Müller and Florian Schäfer. Thanks are also due to Rajesh Bhatt, Joanna Błaszczak, Daniel Büring, Gisbert Fanselow, Gerson Klumpp, Manfred Krifka, Ewald Lang and Peter-Arnold Mumm for discussing, or commenting on, various details of the thoughts presented in this paper. The insightful comments made by an anonymous reviewer (‘Friendly Voice’) have likewise had an important impact on the way the views in the final version of this paper are presented. Remaining mistakes are mine.

Local Modelling of Non-Local Dependencies in Syntax, 213-246. Artemis Alexiadou, Tibor Kiss & Gereon Müller (eds.). LINGUISTISCHE ARBEITEN 547, de Gruyter 2012.

2. The empirical domain

Free datives in German are those dative arguments of German tensed clauses that may be dropped without any syntactic or semantic residue (see section 3 for elaboration). Free datives contribute to sentence meanings in fully predictable ways. I will present my view of the thematic content of structures that license free datives in the context of section 6.3. The predictability of the thematic content of free datives forms a sharp contrast with dative arguments that are subcategorized for by verbs or adjectives. With verbs like geben ‘give’, schicken ‘send’, zeigen ‘show’, gratulieren ‘congratulate’, to name just a few verbs with datives that are subcategorized for, the absence of a dative argument leads to highly marked structures, and the thematic contribution of the dative arguments is often hard to pin down, or generalize over (Blume (2000), Maling (2001)). The subclassification of free datives has been a source of debate. Terms frequently used to single out subclasses include “beneficiary dative” or “dativus (in)commodi”, “possessor dative” or “dative of pertinence”, and “dativus iudicantis” (dative of the one who judges). Examples are provided in (1)–(3).

(1) Paul backte Maria     einen Kuchen.
    Paul baked  Maria.DAT a     cake
    ‘Paul baked Maria a cake.’
    (classical “beneficiary” dative/“dativus commodi”)

(2) Paul verband  Maria     den Arm.
    Paul bandaged Maria.DAT the arm
    ≈ ‘Paul bandaged Mary’s arm.’
    (“possessor” dative (sometimes with a beneficiary undertone))

(3) Paul     ist die Treppe    zu  steil.
    Paul.DAT is  the staircase too steep
    ≈ ‘Paul finds the staircase too steep.’
    (“dativus iudicantis”)

Maria in (1) can be seen as a beneficiary because the speaker thinks that Paul intended Maria to benefit from the cake that Paul made. In (2), Maria is the possessor of the arm that was bandaged, hence the term “possessive” dative. Paul in (3) is the one who makes the judgment that the stairs are too steep, and this is the motivation for the traditional label “dativus iudicantis”. “Dativus iudicantis” structures always occur with a predication that asserts a degree of a property with respect to some lower or upper threshold of appropriateness.

German Free Datives and Knight Move Binding


The range of meanings associated with free datives just enumerated (“beneficiary”, “possessive”, “judging”) has been a theoretical challenge in German linguistics for a long time. We will take a reductionist and categorical stance towards the thematic involvement of free datives in section 6.3. The truth-functional import felt to be present in free dative sentences that goes beyond the minimal thematic entailments assumed there will be tied to other parts of the interpreted structure, namely to the phrases that host the variables bound by the respective datives. In this way, we will for instance be able to reconcile the intuition of possession in (2) with components of event perception and beneficiency.

3. The criterion for free datives

The criterion applied here to distinguish free datives from subcategorized-for datives is the complete syntactic and semantic omissibility of free datives. What this means can be illustrated with the minimal pairs in (4) and (5).

(4) a. Paul zeigt Touristen die Stadt.
       Paul shows tourists.DAT the town
       ‘Paul shows the town to tourists.’
    b. Paul zeigt die Stadt.
       Paul shows the town
       ‘Paul shows the town.’
    c. (4-b) entails ‘There is someone who is shown the town.’

(5) a. Paul kocht Maria eine Bouillon.
       Paul cooks Maria.DAT a broth
       ‘Paul cooks a broth for Mary.’
    b. Paul kocht eine Bouillon.
       Paul cooks a broth
       ‘Paul is cooking a broth.’
    c. (5-b) does not entail ‘There is someone who is cooked a broth.’

(4-a) is a sentence with a dative that is subcategorized for by the verb used, viz. zeigen ‘show’. If the dative is dropped, as in (4-b), the meaning changes in certain ways, but, crucially, the fact that someone is shown the town remains stable. Put differently, dropping the dative argument preserves the existential closure of the dative argument of zeigen ‘show’. The situation is different in (5). Here dropping the dative argument goes along with the complete nullification of the dative involvement. Thus, (5-b) does not entail that there is someone who is cooked a broth. (6) states our criterion for free datives.


Daniel Hole

(6) Syntactico-semantic deletion test for free datives
    A dative argument D not dependent on a preposition is free in a simple positive declarative sentence S of German iff
    (i) S without D is grammatical;
    (ii) S without D does not entail that there is an individual (α) which participates in the event described by S and (β) which could be encoded as a dative argument.

Let us return to sentences (2) and (3) from above (repeated here as (7) and (8)), because they are not as easily seen to conform to (6) as, for instance, (5).

(7) Paul verband Maria den Arm.
    Paul bandaged Maria.DAT the arm
    ≈ ‘Paul bandaged Mary’s arm.’
    (“possessor” dative (sometimes with a beneficiary undertone))

(8) Paul ist die Treppe zu steil.
    Paul.DAT is the staircase too steep
    ≈ ‘Paul finds the staircase too steep.’
    (“dativus iudicantis”)

If we drop Maria in (7), the intuition persists that there is someone who gets his or her arm bandaged. This is, however, a fact about the real world, and not about grammar; arms are typically parts of human bodies. Therefore the intuition of an additional individual participating in the event at hand can be classified as an inference. This conclusion is supported very clearly if we keep the construction stable, but exchange a body-part nominal for a possessum that doesn’t partake in a part-whole structure. This is done in (9).

(9) Paul stopfte (Maria) den/ihren Ärmel.
    Paul darned Maria.DAT the/her sleeve
    ‘Paul darned the/her sleeve (for Mary).’

If the dative argument is dropped, the entailed involvement of Mary in the event goes away, too. This holds even if a possessive pronoun is used instead of a definite article in the accusative argument. If the dative is dropped in (9), Mary need not be present in the situation, or be intended by Paul to know what Paul did. These are the thematic entailments free datives may have (cf. section 6.3 and Hole (2008, ch. 9, 10)). This proves that datives as in (9) are free if (6) is the diagnostic.

For (8), too, it may seem at first that without a dative (cf. (10-a)) the existential closure of the dative involvement persists (≈ ‘There is someone who finds the staircase too steep.’). The important point is that someone who has no personal benefit from a different degree of steepness and who may utter (10-a) may not felicitously utter (10-b) (= (8)). The context given in (10) makes this clear.


(10) [Paul is an inexperienced carpenter. He has built a staircase in a new house, but after he’s done he notices that the staircase doesn’t conform to the blueprint. He thinks:]
     a. Die Treppe ist zu steil.
        the staircase is too steep
        ‘The staircase is too steep.’
     b. #Mir ist die Treppe zu steil.
        me.DAT is the staircase too steep
        ‘I find the staircase too steep.’

The fact that (10-b) is bad in the given context (this is not fully reflected in the approximate translational equivalent) has something to do with the fact that the staircase is too steep if compared with the sketches, and not with a use that the staircase could have for Paul. We will see in section 4 how the tie-up between free datives and purposes can be explicated. What is important here is that the constructional environment of the dativus iudicantis alone (the threshold-related assertion of a degree) does not entail the event participation of a referent that could be expressed with a free dative.

We may say by way of a summary that datives are free iff they conform to (6), i.e., iff they can be omitted without any syntactic or semantic residue, and that at least the following traditional categories fall under the category label ‘free dative’: “beneficiary” datives (dativus (in)commodi), “possessor” datives, and the dativus iudicantis. In the remainder of this paper, only “possessor” datives and “beneficiary” datives will be treated. Cf. Hole (2008) for details concerning other types of free datives.
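As an informal illustration only, and not as part of the analysis itself, the two clauses of the deletion test in (6) can be stated as a small decision procedure. The sketch below assumes that speaker judgments are recorded by hand; the record type OmissionJudgment, its field names and the function is_free_dative are hypothetical labels introduced here, and the judgments entered for (4), (5) and (9) are the ones reported in the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OmissionJudgment:
    """Hand-recorded judgments about a sentence S once its dative D is dropped.

    Hypothetical record type: the field names are illustrative, not the paper's.
    """
    sentence: str
    grammatical_without_dative: bool   # clause (i) of (6)
    entails_dative_participant: bool   # clause (ii) of (6), before negation


def is_free_dative(judgment: OmissionJudgment) -> bool:
    """Return True iff both clauses of the deletion test (6) are met."""
    return (judgment.grammatical_without_dative
            and not judgment.entails_dative_participant)


# Judgments as reported in the text for examples (4), (5) and (9).
JUDGMENTS = [
    OmissionJudgment("(4) Paul zeigt (Touristen) die Stadt.", True, True),
    OmissionJudgment("(5) Paul kocht (Maria) eine Bouillon.", True, False),
    OmissionJudgment("(9) Paul stopfte (Maria) den Ärmel.", True, False),
]

if __name__ == "__main__":
    for j in JUDGMENTS:
        label = "free" if is_free_dative(j) else "subcategorized for"
        print(f"{j.sentence} -> {label}")
```

The procedure only combines judgments that have to be obtained independently; in particular, it does not decide whether an intuition such as the one discussed for (7) is an entailment or a mere inference.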

4. Free datives bind a variable in the local tense domain

In this section, I will provide arguments to the effect that (i) free datives are binders and that (ii) they bind a variable in the local tense domain. Just like a subject of a German sentence with a reflexive pronoun binds a reflexive pronoun, i.e., a variable, in the local tense domain, the free dative binds a variable in the local tense domain. Since, in the case of free datives, the variable (and even the larger constituent containing it) is frequently not pronounced, this property is easy to overlook.

4.1. Sloppy identity

The example in (11) shows a sloppy-identity effect for so-called “possessor” datives. (Here and in the following, I use indexes not just on pronominals proper, but also on other elements with anaphoric uses that may be targeted by dative


binding – which is to say that each of the indexed elements is assumed to have a variable in its denotation, at least in the uses discussed here. Elements of this kind are prepositions with definite endings (vom ‘from the’ as in (11)), bridging definites in general and particles with anaphoric components (hin ‘away from perspectival center’, her ‘to perspectival center’).)

(11) Dem Patienten_i platzte ein Stück Gips von seinem_i/vom_i Arm ab, und dem Arzt auch.
     [the patient]_dat cracked a piece cast off his/off.the arm off and [the doctor]_dat too
     ✓ ‘It happened to [the patient]_i that part of the cast on his_i arm came off, and it happened to [the doctor]_j that part of the cast on his_j arm came off, too.’
     * ‘It happened to [the patient]_i that part of the cast on his_i arm came off, and it happened to [the doctor]_j that another part of the cast on the arm of [the patient]_i came off.’

Given coindexation as indicated in the first conjunct, the second conjunct has no mishap reading where it happens to the doctor that part of the cast on the arm of the patient came off; this would be a strict identity reading. The only available reading is the one where the doctor, just like the patient, has a cast on his arm, and part of that cast came off, too. This is the sloppy identity pattern indicative of a binding relationship in both conjuncts and, crucially, in the first conjunct. Either dative thus binds a possessor variable in the possessum DP which forms part of vom/von seinem Arm. This holds even though possessive pronouns as such are not restricted to bound uses in German. The binding requirement thus stems from the particular configuration in which the pronoun is used in (11). We will argue in section 6.3 that the thematic contribution of the dative DP itself is a locative LANDMARK entailment requiring the VP eventuality to be valid relative to the neighborhood region of the dative referent.
(Simultaneously, the dative referent must be able to perceive the VP eventuality, an entailment that we will dub P-EXPERIENCERhood in section 6.3.) A parallel sloppy identity effect can be observed with “beneficiary” datives.

(12) J.R. mixte Sue-Ellen_i einen Drink zu ihrer_i/zur_i Entspannung, und seiner Mutter auch.
     J.R. fixed Sue-Ellen.DAT a drink for her/for.the relaxation and [his mother]_dat too


     lit.: ‘J.R. fixed [Sue-Ellen]_dat a drink for her relaxation, and [his mother]_dat, too.’
     ✓ ‘J.R. fixed Sue-Ellen a drink so that Sue-Ellen could relax, and J.R. fixed his mother a drink so that his mother could relax.’
     * ‘J.R. fixed Sue-Ellen a drink so that Sue-Ellen could relax, and J.R. fixed his mother a drink so that Sue-Ellen could relax.’

If the dative referent and the person to relax are to be identical in the first conjunct, the same must hold for the second conjunct. Both Sue-Ellen and J.R.’s mother are thus to relax. This is the binding construal which has both the first variable and the variable in the elided conjunct bound by the local antecedent. This state of affairs differs from a coreference construal1: (12) could not be used to describe a situation where Sue-Ellen has been busy serving everybody, and finally J.R. helps her by preparing a drink for her and his mother so that Sue-Ellen alone can relax. This would be a coreference construal where both variables are interpreted as referring to Sue-Ellen.

What renders (12) interesting beyond the forced sloppy-identity construal are two things. For one thing, (12) gives us a first impression of how the alleged “beneficiary” thematic involvement of free datives can be reduced. Since the benefactive involvement is spelled out inside the purposive PP in (12), the dative argument itself is free to encode a thematic involvement other than beneficiency, namely P(OTENTIAL)-EXPERIENCERhood, as was the case with the example in (11), where we argued that a LANDMARK semantics was combined with a P-EXPERIENCER semantics. The second noteworthy thing about (12) is that the way its thematic dative involvement is separated from the purposive involvement parallels the case of the “possessor” dative in (11). In (11), too, the purported possessive semantics of the free dative was stated to have its real locus in the position of the bound variable in the PP containing the possessum phrase (vom/von seinem Arm). In this way, the dative in (11) was “set free” to encode a LANDMARK (and P-EXPERIENCER) relationship alone. In (12), the beneficiary semantics is encoded in the purposive PP, and the dative is again “set free” to encode a P-EXPERIENCER relationship, and not a possessive relationship. This paves the way for a parallel treatment of “possessor” datives and “beneficiary” datives. While it is conceded that possessor raising analyses make the same binding predictions that we make for (11) – traces must be bound – the parallel treatment of “possessor” datives and “beneficiary” datives is beyond the reach of such analyses. This constitutes a first clear advantage of our voice-based binding account of free datives.

1 Friendly Voice wonders whether there’s a difference between binding on the one hand and coindexation plus c-command on the other. Indeed there is a difference, but it only materializes if it makes a difference. In the standard case of identical reference of an antecedent and, say, a possessive pronoun, the ambiguity between binding and co-reference is spurious; the interpretation of Paul_i phoned his_i father comes out the same no matter if a binding relationship enforces identical reference of Paul and his (which means that the pronominal variable is bound), or if his just happens to have the same index as its antecedent (which means that there is mere co-reference as mentioned in the main text). Contexts in which an ambiguity between binding/sloppy-identity readings and co-reference/strict-identity readings crops up are precisely those diagnostic contexts that are used in the main text: Paul_i phoned his_i father, and Mary did, too has a binding and a co-reference reading (binding: Mary called her own father; co-reference: Mary called Paul’s father). In the recent literature, the (unwanted) spuriousness of the ambiguity in simple cases, and its highly relevant non-spuriousness in the ellipsis cases, is given an account in terms of informativeness: Derive an ambiguity just in case the readings differ in truth-conditions; derive a binding relationship otherwise. This is the content of Büring’s (2005a, 121) Rule Have Local Binding! as in (i) (cf. also Reinhart (1983), Heim (1993), Fox (2000)).
  (i) For any two NPs α and β, if α could bind β (i.e., if it c-commands β and β is not bound in α’s c-command domain already), α must bind β, unless that changes the interpretation.

4.2. Accommodated possessors and beneficiaries

(13) through (15) combine free datives with VP-internal material that includes no pronounced pronoun that could be bound by the dative. Nonetheless, the sentences receive interpretations in which a variable bound by the dative DP forms part of the interpreted structure.

(13) Paul trat Maria gegen einen Stein.
     Paul kicked Maria.DAT against a stone
     lit.: ‘Paul kicked Maria_dat against a stone’
     (i.e., ‘Paul kicked against a stone of Maria’s, and it wasn’t excluded that Maria noticed that.’)

(14) Paul wischte Maria einen Stein sauber.
     Paul wiped Maria.DAT a stone clean
     lit.: ‘Paul wiped Maria_dat a stone clean.’
     (i.e., ‘Paul cleaned a stone, and Paul intended Maria to benefit from the stone being clean, and it wasn’t excluded that Maria noticed the eventuality at hand.’)

(15) Die Treppe des Mondmoduls war meiner Großmutter zu steil.
     the stairs of.the lunar.module was [my grandmother]_dat too steep
     ‘The stairs of the lunar module were too steep for my grandmother, and she noticed that.’

In (13) the indefiniteness of the prepositional object einen Stein ‘a stone’ and the absence of a pronounced possessive pronoun do not preclude its being interpreted as ‘one of her stones’, where her is Maria. Maria may own a valuable collection of stones, or she may be responsible for them. Whatever the exact relationship is, it is one that may be encoded by the possessive pronoun ihrer


‘her’ in a phrase like einen ihrer Steine ‘one of her stones’, and this amounts to a binding relationship between Maria and the implicit pronoun.2 Without a context for (14), we don’t know what benefit Maria is to have if the stone is clean as opposed to dirty, but it is implied that (the speaker thought) Paul thought the clean stone would have a benefit for her. Structurally, this may be explicated as Maria binding a beneficiary variable in a purpose phrase of the same type as in (12) above, i.e., zum . . . ‘to her purpose of . . . ’ (for instance zum Draufsetzen ‘for her purpose of sitting down on it’, or zum Mitnehmen ‘for her purpose of taking it with her’). In (15), finally, the steepness of the stairs must be judged by my grandmother in a context in which the stairs, if they hadn’t been so steep, could have fulfilled a purpose of hers. Perhaps she went to a space museum with me, and if the stairs of the lunar module hadn’t been so steep, she could have entered the module with me. Or she sees the lunar module on TV and simply doesn’t like steep stairs, and thus the steepness of the stairs fails to make a positive or beneficial aesthetic impression on her. Crucially, (15) may not be used if my grandmother finds the stairs of the lunar module too steep for the astronauts to get in and out. Put differently, the dative binds the implicit beneficiary variable, and the accommodated purpose may not be that of a person different from the dative referent.

As said before, the binding relationships between datives and unpronounced pronouns illustrated above are predicted by possessor raising analyses the same way as we predict them in our framework. The approach taken here has a larger empirical coverage, though, since “possessor” datives are, to a certain extent, treated on a par with “beneficiary” datives and “iudicantis” datives.

4.3. Locality

The binding requirement of free datives must be satisfied in the local tense domain.
This puts dative binding on a par with reflexive binding in German where the binding domain of reflexives is likewise the local tense domain (at least for the SELF reflexive sich selbst; cf. Hole (2008, 55–56)). The only difference is that German reflexives are subject-oriented, i.e., their antecedent must – with few exceptions – be a subject, whereas the antecedent in the case of free dative binding is a dative DP. (16) states the locality constraint of free dative binding, and (17) and (18) deliver data to underpin the constraint. ((16-a) is to be read in such a way that the first three omission marks may not represent material that contains another left TP or CP boundary.)

2 If a benefit that kicking against the stone has for Maria is in the context, or can be accommodated, (13) may also receive a beneficiary interpretation. In this case the reasoning for (14) applies, too. Or (13) may receive both interpretations at a time. As will become clear below, our theory predicts this range of interpretive options. Thanks to Friendly Voice for pointing out the benefactive interpretive option for (13).

(16) a. [TP ... [DP free dative]_i ... [(*CP/*TP) ... [*(PRON_i) ... ]_possessum/purpose ]_j ]
     b. Free datives must bind a variable in the local tense domain.

(17) Binding into definites: bridging is strictly local
     a. Paul hat Paula_i in die_i Suppe gespuckt.
        Paul has Paula.DAT in the soup spat
        ‘Paul spat (Paula_dat) in Paula’s soup.’
        bridging reading construes without effort: binding of the possessor of the soup
     b. Paul hat Paula_i in die Tasse, [CP in die [TP die_*i Suppe sollte]], gespuckt.
        Paul has Paula.DAT in the bowl in which the soup should spat
        lit.: ‘Paul spat Paula_dat in the bowl in which the soup was supposed to go.’
        bridging reading unavailable: left CP/TP boundaries intervene

(18) Binding of overt pronouns across a TP boundary is available, but it doesn’t satisfy the specific dative binding requirement.
     a. Der Lehrer hat Paula_i ein [TP von ihr_i weggeworfenes] Buch auf den_i Tisch gelegt.
        the teacher has Paula.DAT a by her thrown.away book on the table put
        ‘The teacher put a book thrown away by Paula on Paula’s table (for Paula).’
     b. Der Lehrer hat Paula_i ein [TP von ihr_i weggeworfenes] Buch hin_i-gelegt.
        the teacher has Paula.DAT a by her thrown.away book deictic.to-put
        ‘The teacher put a book thrown away by Paula to a place related to Paula (for Paula).’

The definite die Suppe ‘the soup’ in (17-a) is interpreted as a bridging definite with the denotation ‘Paula’s soup’.3,4 The bridging requirement vanishes if a T(ense) node intervenes between the dative antecedent and the definite. This is shown in (17-b). (18) illustrates the following: if a pronounced pronoun receives a bound interpretation, but is situated across a Tense node with respect to the dative antecedent, then the binding of a local variable is forced alongside. In (18-a) the variable is situated in the definite which, thus, is accommodated to denote a bridging definite with the interpretation ‘Paula’s table’. In (18-b), the deictic particle hin has a variable as part of its denotation. This variable denotes the individual which is not at the perspectival center, but towards which the motion entailed in the sentence is directed. It must be bound by the dative even if a binding relationship across a T node has independently been established.

A difference between (17) and (18) concerns the presence of bridging effects in (17), and the presence of overt pronouns in (18). The binding of implicit variables in bridging definites is impossible across a T node. This is what the argument drawn from (17) rests upon. In (18) a pronounced variable (a pronoun) can be bound across a T node, but this binding doesn’t fulfil the local binding requirement postulated for free datives. The converging evidence that may be drawn from (17) and (18) is that whatever may get bound by a free dative across a T node, a local variable must always be bound alongside.

3 If the sentence is construed with a pure “beneficiary” reading (a marginal reading of (17-a)), then it becomes possible to interpret the definite die Suppe without possessive implications. This reading may be rendered as ‘Paul spat in the soup, Paula benefitted from this, and she must have been able to perceive this.’ The fact that this marginal reading is available does not undermine my argumentation. In fact, it supports it. In the absence of material forcing a bridging interpretation, it is generally predicted that the accommodation of a purpose/benefit of the dative referent should be possible. This accommodated purpose will then provide the required variable that is bound by the dative.
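The locality condition in (16) can be paraphrased procedurally, purely as an informal illustration. The sketch below assumes that each variable bound by the free dative is represented by the list of category labels crossed on the way down from the dative to that variable; this encoding, the particular label sequences given for (17) and (18), and the names is_local and satisfies_dative_binding are assumptions made for the example, not part of the analysis.

```python
# Category labels that close off the local tense domain, following (16-a).
TENSE_BOUNDARIES = frozenset({"TP", "CP"})


def is_local(path):
    """True if no TP or CP label lies between the dative and the variable."""
    return TENSE_BOUNDARIES.isdisjoint(path)


def satisfies_dative_binding(paths_to_bound_variables):
    """True if at least one variable bound by the dative is local.

    Non-local bound variables are tolerated, as in (18), but they do not
    discharge the binding requirement of the free dative.
    """
    return any(is_local(path) for path in paths_to_bound_variables)


# Hypothetical label sequences; the exact labels are illustrative guesses.
EXAMPLE_17A = [["PP", "DP"]]                         # die_i Suppe
EXAMPLE_17B_BRIDGING = [["PP", "DP", "CP", "TP", "DP"]]  # die Suppe in the relative clause
EXAMPLE_18A = [["DP", "TP", "PP"], ["PP", "DP"]]     # ihr_i across TP, den_i Tisch locally

if __name__ == "__main__":
    for name, paths in [("(17-a)", EXAMPLE_17A),
                        ("(17-b), bridging reading", EXAMPLE_17B_BRIDGING),
                        ("(18-a)", EXAMPLE_18A)]:
        print(name, "->", satisfies_dative_binding(paths))
```

On this encoding the overt pronoun in (18-a) contributes a non-local path, so the requirement is met only because the bridging definite den Tisch contributes a local one, which mirrors the argument drawn from (17) and (18).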

5. Knight Move Binding

“Knight Move Binding” (Rösslsprungbindung or Pferdchensprungbindung in German) is a term to capture the tree-geometric peculiarity of the kind of binding that free datives trigger. The binding requirement of free datives is not satisfied by coargument binding, or by binding of an argument embedded in the complement of an argument, but only by binding of the possessor variable of a coargument possessum, or by binding of a beneficiary variable of a coargument purpose phrase. Similar to knights in the chess game, which may only move in a specific oblique way (two squares in any non-diagonal direction, then one to the left or right), a free dative may only bind the possessor or beneficiary on the left branch of a prepositional coargument. If we are allowed to classify the beneficiary variable of a purpose phrase such as zur_i/zu ihrer_i Entspannung on a par with possessor variables we can rephrase the requirement of Knight Move Binding as in (19).

4 The way the sentences in (17) are presented identifies the definite article of the bridging definite as the element which hosts the variable bound by the dative; cf. also the discussion at the beginning of 4.1. Hole (2008) takes a slightly different perspective in that, there, the NP complement of the article hosts the variable. The variant chosen here results in a certain ease of representation, which I am happy to make use of in this paper.

(19) Knight Move Binding
     Binding configuration in which the binder targets the possessor variable of a c-commanded coargumental possessum or purpose phrase.

In this section we will first aim to demonstrate that (19) really holds. We will then move on to present crosslinguistic evidence underpinning the privileged status of Knight Move Binding in grammar and grammaticalization. The section concludes with thoughts on how Knight Move Binding should be modeled, but the matter is left unsettled.

5.1. Free datives must enter into a Knight Move Binding relationship

We want to show that the kind of binding that satisfies the binding requirement of free datives is always Knight Move Binding. Whatever else free datives may bind alongside, they must also enter into a configuration of Knight Move Binding.

5.1.1. Configurations with a bound DP-internal complement variable

For the first argument in support of obligatory Knight Move Binding with free datives a case is checked where, instead of the possessor variable, the free dative binds a complement variable inside a complex DP. (20-b) is a pertinent example. (20-a) is a similar sentence with Knight Move Binding.

(20) a. Sie zerstreuten Paul_i [seinen_i Verdacht].
        they dispelled Paul.DAT his suspicion
        lit.: ‘They dispelled [Paul_dat]_i his_i suspicion.’
     b. Sie_k zerstreuten Paul_i (zu seiner_i Entlastung) [ihren_j Verdacht gegen ihn_i].
        they dispelled Paul.DAT to his exoneration their suspicion against him
        lit.: ‘They_k dispelled [Paul_dat]_i their_j suspicion against him_i (to his_i exoneration).’5

5 Friendly Voice doubts the availability of the purposive reading if the parenthesis is not there/not pronounced. I assume that the reading becomes available more reliably if more context is delivered, or if fewer pronouns are used. (i) is a variant of (20-b) with reduced pronoun use, and more contextual clues:
  (i) Die Anwälte zerstreuten Paul_i den Verdacht der Staatsanwaltschaft gegen ihn_i.
      the lawyers dispelled Paul.DAT the suspicion of.the attorneys against him
      ‘The lawyers dispelled the attorneys’ suspicion against him_i for Paul_i.’


Paul has a suspicion about someone. His children talk him out of it. This is a context for (20-a). Paul is the possessor of his suspicion, the possessor variable gets bound by Paul, and no more need be said. In (20-b) things are different. Now somebody else, say, the attorneys (with index j), have a suspicion against him. Paul binds the complement variable of Verdacht gegen ihn ‘suspicion against him’. But, as the altogether different interpretation of the sentence shows, this is not enough. Even in the absence of the material in parentheses a benefit must be accommodated that Paul has from the dispelling of the suspicion. If a purpose is accommodated, or if the material in parentheses is pronounced, Paul binds the beneficiary inside the purpose phrase. The contrast in (20) thus shows that Knight Move Binding is enforced by the dative. If, as in (20-a), the dative binds the possessor variable of seinen Verdacht, Knight Move Binding has also been instantiated. Note that (20-a), as opposed to (20-b), need not imply that Paul is also a beneficiary, i.e., there needn’t be a purpose phrase in it, not even an implicit one. The people who dispel his suspicion may well have bad intentions if they are, say, his prospective heirs who plan to kill him, and the speaker of (20-a) may know this. In (20-b) the variable in the complement of Verdacht ‘suspicion’ is in the wrong position to instantiate Knight Move Binding. Therefore an additional purpose phrase must be added, explicitly or implicitly.

5.1.2. Concurring binding by a question operator

A second argument in support of obligatory Knight Move Binding with free datives may be derived from the patterns that result if either the free dative or the potential binding target is bound by a question operator Q. If free dative binding is always Knight Move Binding, then it is predicted that Q-bound datives should pose no problem. They are bound by the Q-operator, and they may themselves bind their binding target.
But, so the prediction goes, if the binding target of the free dative is Q-bound already, the dative can’t bind it anymore. Such configurations should either lead to ungrammaticality, or force readings with an accommodated binding target. These predictions are borne out.

(21) a. Wem_i hat der Lehrer die_i/seine_i Hand festgehalten?
        who.DAT has the teacher the/his hand held.tight
        lit.: ‘[Who_dat]_i did the teacher hold his_i hand?’
        ≈ ‘Whose hand did the teacher hold tight?’
     b. (*) Wessen Hand hat der Lehrer ihm festgehalten?
        whose hand has the teacher him.DAT held.tight
        lit.: ‘Whose hand did the teacher hold him_dat tight?’
        good as: ‘Whose hand did the teacher hold tight for him?’
     c. Seine_i/j Hand hat der Lehrer ihm_i/wem_i festgehalten./?
        his hand has the teacher him.DAT/who.DAT held.tight
        lit.: ‘The teacher held [him_dat]_i/[who_dat]_i his_i hand tight.’
        ‘The teacher held his/whose hand tight./?’

Example (21-a) is the case where the dative is Q-bound, and the dative itself binds the possessor variable. (21-b) tests the reverse configuration. The dative cannot bind the possessor variable because the possessor variable is Q-bound. Thus no reading parallel to (21-a) is available and hence the sentence turns out deviant on the possessive reading.6 But it can be rescued if a purposive interpretation is chosen (i.e., if a benefit of holding the hand for Paul is accommodated). In this case ihm can bind the possessor/beneficiary variable in the silent purpose phrase. (21-c) just serves to show that the surface order of the wh-question is irrelevant to the available binding options (in German). The dative binds the possessor variable even though the possessum DP has been topicalized. Therefore, by analogy, it is not the surface order of (21-b) that leads to the (potential) ungrammaticality of this sentence.

6 The ungrammaticality of the relevant reading of (21-b) is not a WCO effect. Generally, German does not display the typical weak crossover effects (cf. the availability of a good reading of (i)); specifically, bound readings are also available in WCO-prone configurations in German if the antecedent is a direct object and the bindee is a possessor in a dative DP which is undoubtedly of the high kind, and not of the low kind as with aussetzen ‘expose to’ or überschreiben ‘transfer to’ (Haider (2000)) (cf. (ii)). In other words, if dative binding is not obstructed, Q-bound accusatives may bind into free dative DPs, thereby bearing witness to the absence of WCO effects in ACC_i DAT-t_i sequences, too: in (ii) the dative binding requirement is independently satisfied by binding of a beneficiary in an implicit purpose phrase.
  (i) Wen_i hat sein_i Onkel angerufen?
      who.ACC has [his uncle]_nom phoned
      ‘Who_i was phoned by his_i uncle?’ (cf. the ungrammaticality of *Who(m)_i did his_i uncle phone?)
  (ii) Wen_i hat die Super-Nanny seinen_i Eltern zurechtgebogen?
       who.ACC has the Supernanny [his_i parents]_dat straightened.out
       ‘Who_i was straightened out by Supernanny for his_i parents?’
       lit.: ‘*[Who_acc]_i has Supernanny [his_i parents]_dat straightened out?’
  Thanks to Friendly Voice (and Martin Salzmann) for bringing up the WCO issue, and to Daniel Büring for first pointing out to me that the behavior of (21-b) cannot be reduced to WCO.

5.1.3. Bound coarguments

A third argument to demonstrate the Knight Move Binding requirement comes from sentences where a free dative binds the sole c-commanded coargument. It is again predicted that, even though the dative binds something, binding of an implicit variable in a Knight Move position should be detectable. If only a looser binding requirement held true – say: A free dative must bind a c-commanded variable within the same tense domain – then coarguments bound by free datives should do the job. But they don’t. The kind of structure that we’re going to test is sentences similar to Paula trat ihm_i ihn_i/sich_i ‘Paula kicked him_dat him(self)_acc’. A certain concern regarding the German reflexive pronoun sich must be dealt with before that. Predictions will differ if sich is classified as a subject-oriented reflexive or not. If it is a subject-oriented reflexive, then accusative and dative antecedents of sich should anyway be marginal at best. Things get complicated by competing SELF-reflexive forms such as sich selbst with a binding behavior of their own, and the contrast between stressed and unstressed variants of sich (Grewendorf (2003, 106)). Therefore, Hole (2008) evades the problem of third person anaphora altogether and uses the binding behavior of pronouns for speech-act participants for his argumentation. Even though the pronouns for first and second person lack distinguished reflexive forms in German and many other continental European languages, it has long been established that they may be interpreted as bound variables (“fake indexicals”; cf. Heim (1994), Kratzer (2009)). Just consider the sloppy-identity effect in the line from a pop song I’ve played all my cards, and that’s what you’ve done too, which means that the addressee has played his own cards, and not those of the speaker. With this background in mind, consider the sentences in (22) and (23).

(22) a. Paul trat mir_i gegen mein_i/das_i Schienbein.
        Paul kicked me.DAT against my/the shin
        ‘Paul kicked me_dat in the shin.’
     b. Paul trat mich.
        Paul kicked me.ACC
        ‘Paul kicked me.’

Paul michi unter dem Tisch. (23) ?Wie ausgemacht trat miri as agreed.upon kicked me.DAT Paul me.ACC under the table lit.: ‘As we had agreed upon, Paul kicked medat meacc under the table.’/ ‘As we had agreed upon, Paul kicked me under the table to my benefit.’ (22-a) is a sentence with a standard Knight Move Binding configuration. The free dative binds the possessor variable in the directional complement. In (22-b) the same verb treten ‘kick’ as in (22-b) is used in a different argument frame; it only takes an accusative argument, and no directional complement. In (23) the latter argument frame is used, and a free dative in addition. If the free dative could bind just any c-commanded local coargument and thereby fulfil its binding requirement, (23) should get the interpretation ‘As we had agreed upon, Paul kicked me under the table, and I could notice this’. But these truth-conditions are incomplete. If the sentence gets an interpretation at all (cf. the question mark


Daniel Hole

that marks (23) as odd), we must accommodate a purpose that the kicking has for the speaker. Maybe the speaker knows that he frequently says things that, later on, he wishes he hadn’t said, and therefore asks his friend to kick him under the table whenever such a situation comes up. What counts for the argument to go through is not so much that sentences like (23) are impeccable – they are not – but that if they receive an interpretation, a beneficiary semantics is invariably added to the sentence meaning. A beneficiary semantics is the only possibility because the binding target inside a normal possessum phrase as in (22-a) is not available due to the use of the argument frame as in (22-b). Neither (22-a) with the dative nor (22-b) with the accusative has the benefactive entailment, so it can be neither the dative nor the accusative as such that triggers it. Our analysis, which assumes obligatory Knight Move Binding into a silent purpose phrase, makes the right prediction in such cases.

5.1.4. Grammaticalization of reflexives

Our last argument in support of Knight Move Binding does not aim at proving that all free datives enter into Knight Move Binding configurations, but notes the crosslinguistically privileged status of Knight Move Binding in the emergence of reflexive pronouns. To be sure, the argument thus derived has no status in the justification of the Knight Move Binding claim made for German free datives. What it lends support to, though, is the idea that the peculiar configuration under scrutiny here is, for whatever reason, a special binding configuration in natural language. As such, the argument makes Knight Move Binding as a requirement look less exotic. The argument is easily stated.
Next to the combination of a pronominal with an emphatic particle, possessum phrases of the general make-up “possessor pronoun + body-part noun” constitute the most frequent source of reflexive anaphors in the world’s languages (Faltz (1985), König and Siemund (2000b), Schladt (2000), Gast et al. (2007)). Depending on how far the grammaticalization of such body-part reflexives proceeds, the underlying structure may continue to be transparent (cf. Georgian tavi ‘head’), or develop into opaque affixes (cf. Lamang (Chadic) -va < ghv ‘body’). Note that, to the best of my knowledge, not a single reflexive pronoun is attested which derives from a structure “noun + pronominal complement”. I.e., the following types of reflexive pronoun etymologies are unattested: (i) “picture noun + content pronominal” (e.g., ‘picture of PRON’, where PRON denotes the content of the picture, and not its possessor);7 (ii) “propositional noun + com-

7 Schladt (2000, 105-7, 110-1) identifies ‘reflection of PRON on water’ as a rare […]

[…] (1 SG
   ‘You will see me.’                    2nd > 1st
b. yoʔ nek-ac ki newoh-peʔn
   3SG.NOM 1SG-ACC FUT see-3SG>1SG
   ‘He will see me.’                     3rd > 1st

Languages with GCS vary in two respects: (i) the relevant scale and (ii) the realization of the case split on either DPext or DPint. The following table shows a survey of GCS languages:

Table 1. Languages with Global Case Splits

Language                          Reference
Arizona Tewa (Kiowa-Tanoan)       Kroskrity (1978; 1985)
Awtuw (Sepik-Ramu)                Feldman (1986)
Fore (Trans-New Guinea)           Scott (1978)
Kashmiri (Indo-European)          Wali & Koul (1997)
Kolyma Yukaghir (Yukaghir)        Maslova (2003)
Umatilla Sahaptin (Penutian)      Rigsby & Rude (1996)
Yurok (Algic)                     Robins (1958)

There is a large body of literature on Local Case Splits (cf. among others Silverstein (1976); Comrie (1979); Lazard (1984); Bossong (1985); Aissen (1999; 2003); Keine & Müller (2010)), but there are only very few formal approaches to Global Case Splits. This is remarkable given that the latter are more problematic for derivational syntactic theories: it seems that the decision which case to assign needs a non-local representation of structure that includes both coarguments and the case assigner, hence the name ‘global’ split.

2. Global Case Splits and locality: a challenge

Global Case Splits like the one in Yurok are called ‘global’ because it seems that the case assigner must have access to the properties of two arguments in order to decide which case to assign to one of them. Recent minimalist syntactic approaches, however, try to reduce globality and to model restrictions within small subparts of the derivation. Therefore, GCS pose serious problems for a derivational syntactic theory like minimalism. Before I can illustrate this point, I briefly sum up standard minimalist assumptions about case


Doreen Georgi

assignment in transitive contexts and structure building in general (Chomsky (1995; 2000; 2001)).

(5) Structure building and case assignment in minimalism
    a. Syntactic structure unfolds step by step in a bottom-up fashion.
    b. All operations are in accordance with the Strict Cycle Condition (cf. (6)).
    c. Structure-building (Merge) is feature-driven (by c-selection features represented as [•F•]).
    d. DPs enter the derivation with an unvalued case feature [CASE: ] that has to be valued by Agree with a c-commanding functional head.
    e. v has a dual role: it assigns case to DPint and selects DPext.
    f. T assigns case to DPext.

(6) Strict Cycle Condition (SCC, based on Chomsky (1973))
    a. No operation can apply to a domain dominated by a cyclic node α in such a way as to affect solely a proper subdomain of α dominated by a node β which is also a cyclic node.
    b. Every projection is a cyclic node.

(7) Structure of transitive vP
    ➀: v assigns Case value F to DPint
    ➁: v selects DPext

    [vP DPext [v' v{[•D•] (➁), [∗CASE:F∗] (➀)} [VP DPint[CASE: ] V ]]]

This system has been developed on the basis of languages without case splits like English, where an argument is assigned a case value independently of its coargument. However, if we try to derive a Global Case Split in the same way, a dilemma arises: There are two possible derivations depending on which operation-inducing feature on v is discharged first (the case assigning feature or the c-selection feature), but each derivation violates some core principle of a strictly derivational framework. Let me illustrate this on the basis of a global split like the one in Yurok on DPint. (i) ➀ ≻ ➁ (case assignment precedes c-selection): v assigns case to DPint directly after it has merged with VP. But the case value of DPint (Nom vs. Acc) also depends on the properties of DPext which has not yet been

A Local Derivation of Global Case Splits


merged. Hence, case valuation would need look-ahead, which is impossible in a strictly derivational syntax. (ii) ➁ ≻ ➀ (c-selection precedes case assignment): In order to circumvent the look-ahead problem one could assume that the order of operations induced by v is reversed such that DPext is merged before v values case on DPint. But then case valuation is counter-cyclic given the strictest version of the Strict Cycle Condition in (6): case assignment affects only v' although this projection is already dominated by vP. Thus, no matter which order of operations is chosen, none is in accordance with minimalist assumptions about locality and cyclicity.1 Apart from the look-ahead and the cyclicity problem further issues arise: Somehow v must be able to compare the properties of DPint and DPext in order to decide which case to assign. The question is thus how v communicates with two arguments. Finally, there must be a mechanism which fixes the case value that v assigns (Nom or Acc in Yurok). There could for example be a feature-changing operation in the syntax (see Noyer (1998)), or the case feature could be inserted after v has compared the properties of the coarguments. Previous analyses of GCS include Aissen (1999); De Hoop & Malchukov (2008); Keine (2010). But each of them faces at least one of the problems mentioned above. The most problematic component of these approaches is that they are all global in the sense that the decision which case to assign is made on the basis of a representation which includes both arguments of a transitive verb and rules/constraints which make reference to both of these arguments. As a result, it is necessary in some cases to apply case assignment counter-cyclically. Béjar & Řezáč (2009) also develop a local account but it is not clear to me how the case value that v assigns is fixed.

1 A different problem comes up for languages with the case split on DPext. There is neither look-ahead nor a counter-cyclic operation because when DPext receives its case value, both arguments are merged and hence potentially accessible for the case assigner of DPext, the functional head T (both are in the c-command domain of T). However, if the strict version of the Phase Impenetrability Condition, which restricts the search space of a probe, is adopted, it is impossible for T to access DPint:

(i) Phase Impenetrability Condition (PIC, Chomsky (2001))
    a. In a phase α with the head H, the domain of H is not accessible to operations outside α, only H and its edge are accessible to such operations.
    b. The domain is the complement of a phase head, the edge is its specifier.

Under standard assumptions v is a phase head. This means that case assignment from T to DPext cannot refer to the properties of DPint because DPint is in the domain of v and hence no longer accessible as soon as vP is completed. There might be several solutions for this problem that are independently motivated, e.g. movement of DPint to the phase edge or cyclic Agree via v (Legate (2008)). Hence, with respect to locality, splits on DPint are more problematic for a derivational account than splits on DPext.
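The ordering dilemma described in this section can be made concrete with a toy check. The following Python sketch is only an illustration under simplifying assumptions: the operation labels and the function `problems` are hypothetical names introduced here, and look-ahead and the Strict Cycle Condition are each reduced to a single test on whether DPext has already been merged.

```python
# Toy illustration of the ordering dilemma; all names are hypothetical.
CASE = "assign_case_DPint"   # operation 1: v values case on DPint
SELECT = "select_DPext"      # operation 2: v merges DPext, projecting vP


def problems(order):
    """Return the principle violated by a given order of v's operations."""
    ext_merged = False
    issues = []
    for op in order:
        if op == SELECT:
            ext_merged = True
        elif op == CASE:
            if ext_merged:
                # vP already dominates v', so valuing case inside v' is counter-cyclic
                issues.append("Strict Cycle Condition")
            else:
                # the case value depends on DPext, which is not yet in the structure
                issues.append("look-ahead")
    return issues


print(problems([CASE, SELECT]))  # ['look-ahead']
print(problems([SELECT, CASE]))  # ['Strict Cycle Condition']
```

Whichever order is chosen, the list of problems is non-empty, which is the dilemma in miniature.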

3. Analysis

3.1. A new perspective

The desideratum is to derive GCS without violating the SCC in (6), but this means that we face the look-ahead problem. In order to circumvent it, I propose that the data should be looked at from a new perspective. GCS has always been described in a way that the case value of an argument depends on the properties of two coarguments and this is what brings about the global character of the phenomenon. I suggest that the data can also be characterized as follows:

(8) A different perspective on GCS
    It is not case marking that depends on the properties of the coarguments. Rather, the properties of DPint determine what properties DPext can have.

This means that the selection properties of v are restricted by the properties of DPint. Consider the case of Yurok in (4), where DPint bears an overt case marker (called ‘accusative’) if it is higher on the person hierarchy in (3) than DPext. All combinations are displayed in the following table:

Table 2. GCS in Yurok

person of DPext    case of DPext    person of DPint    case of DPint
1st/2nd            Nom              1st/2nd            Nom
3rd                Nom              1st/2nd            Acc
1st/2nd            Nom              3rd                Nom
3rd                Nom              3rd                Nom
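The descriptive generalization behind Table 2 can be written down as a small lookup. The Python sketch below is an illustration only: the function name and the string labels are assumptions made here. Note that the function takes both arguments' person values at once, which is exactly the global formulation that the analysis sets out to avoid.

```python
# Hypothetical encoding of Table 2; names and labels are illustrative assumptions.
LOCAL = {"1st", "2nd"}  # speech act participants outrank 3rd person


def yurok_case_of_dpint(person_ext: str, person_int: str) -> str:
    """Return the surface case of DPint as listed in Table 2."""
    # DPint is overtly marked only if it outranks DPext on the person scale.
    if person_int in LOCAL and person_ext not in LOCAL:
        return "Acc"
    return "Nom"


for ext in ("1st", "3rd"):
    for int_ in ("1st", "3rd"):
        case = yurok_case_of_dpint(ext, int_)
        print(f"DPext={ext} (Nom), DPint={int_} ({case})")
```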

Under the new perspective GCS in Yurok can be described as follows: If DPint is 1st/2nd person nominative, DPext has to be 1st/2nd person as well; if DPint is 1st/2nd person accusative, DPext has to be 3rd person; if DPint is 3rd person, there are no restrictions on the person of DPext . This means that the person of DPext is the dependent feature, not the case of DPint . A local analysis is now possible if there is a way to let the first merged argument DPint influence the properties of DPext , depending on the case and person features of DPint . Two questions arise in this context: (i) How can DPint restrict the nature of DPext ? (ii) What drives the occurrence of the overt case marker? My answer to the first question is that there is a repair operation, called Maraudage, which


steals features originally provided for DPext depending on the features of DPint. With respect to the second question I propose that the case marker is a reflex of Maraudage which is realized postsyntactically.

3.2. Assumptions

In this subsection I summarize my theoretical assumptions which are the basis for the derivations in the next section. I assume a strictly derivational model of syntax with the following properties (cf. (5)): The syntactic derivation unfolds bottom-up in accordance with the Strict Cycle Condition by alternating applications of the basic operations Merge and Agree. All operations are feature-driven ([•F•] triggers Merge, [∗F∗] triggers Agree; for the notation cf. Sternefeld (2006); Heck & Müller (2007)). v agrees with DPext and DPint in phi-features (cf. the two arguments against one head-configuration in Anagnostopoulou (2003); Adger & Harbour (2007); Řezáč (2008); Richards (2008); Béjar & Řezáč (2009); Heck & Richards (2010); Keine (2010)). In order to be able to agree with two arguments v provides two sets of probe features [∗F∗]: one for checking with DPext and another one for checking with DPint. A v in a transitive context thus has the following features when it enters the derivation:

(9) { [•V•] [•D•], [∗F∗]ext, [∗F∗]int }

First, v wants to merge with VP and it selects a DP, the external argument. Furthermore, it provides a probe for Agree with DPext (= [∗F∗]ext) and for Agree with DPint (= [∗F∗]int). The operation Agree is defined as follows (based on Chomsky (2000; 2001)):

(10) Agree
     Agree between a probe P and a goal G applies if
     a. P c-commands G.2
     b. G is the closest goal to P.
     c. P and G have matching feature values (Match = feature identity).
     d. P and G have matching set indices.
     e. Result: P and G check their matching features.

2 In the present analysis v must enter into an Agree relation with DPext. I assume percolation of features from v to v' so that v can c-command DPext. Alternatively, one could replace ‘c-command’ in the definition of Agree with ‘m-command’. Nothing in the analysis depends on the choice between the two options.

This is the standard definition of Agree in which a probe with operation-inducing features is checked by the closest matching goal in its c-command domain. What


is added for the purposes of this article is condition (10-d). Given the assumption that v has two probe sets, one for each argument, an argument can only check the features of the set which is coindexed with it, i.e., DPext can only check features in probe set [∗F∗]ext and DPint can only check features in probe set [∗F∗]int.3 Note that in this definition Match is a prerequisite for Agree and therefore has to apply before the actual Agree operation (checking) takes place. This will become relevant in what follows.

3 The indices of the probe sets are sufficient for the purposes of this article, but they might be problematic when Agree / case assignment over clause boundaries is considered, e.g. when v Agrees with an argument of an embedded clause in ECM constructions. This argument may be the external argument DPext of the embedded clause and then the internal probe set could not match with DPext due to different set indices. A solution which can be adopted for case splits and ECM would be to order the probe features on a stack such that the set with fewer features is the highest probe set and the probe set with more features is below this set. Since only the highest feature on a stack is accessible for operations, the lower set can only trigger operations if the first set is checked and deleted. In this way no indices are necessary to account for the order in which the probe sets are discharged and ECM is not problematic anymore.

Merge and Agree are triggered by the need to check operation-inducing features (c-selection features and probe features) as demanded by the principle Full Interpretation (Chomsky (1995)). If a clause contained unchecked features at the end of a derivation, it would not be interpretable at the interfaces. I assume that not only operation-inducing features must be checked, but that certain phi-features of the arguments must also enter into an Agree relation in order to get checked. Which phi-features are subject to this constraint depends on the Silverstein scale which drives the split. In Yurok, for example, the person scale is relevant for the case split and hence, person features of the DPs must enter into Agree with v. This requirement is formulated in the constraint FEATURE CHECKING (cf. the Person Licensing Condition in Béjar & Řezáč (2009)):

(11) FEATURE CHECKING (FC)
     Goal features have to be checked (person, animacy, . . . depending on the relevant scale in a language).

As a consequence, Full Interpretation does not only hold of operation-inducing features but also of goal features:

(12) FULL INTERPRETATION (FULL INT)
     A clause must not contain unchecked features (c-selection features, probe features, goal phi-features).

Furthermore, I follow Béjar (2003); Béjar & Řezáč (2009), based on Harley & Ritter (2002), in that inherent features of DPs like 1st person, [+animate], etc. are complex objects which are decomposed into privative features and represented by bundles of these privative features (for the same basic idea but with different privative features cf. Harbour (2008)). They argue for a decomposition of person


into three, in part semantically motivated, privative features. There is a general person feature π which differentiates person from e.g. number or animacy. The feature [Participant] encodes speech act participants (1st and 2nd person) and [Speaker] encodes the speaker of a speech act (1st person). These features are abbreviated as [π], [2] and [1], respectively.

(13) 1st person: [speaker] = [1]
     2nd person: [participant] = [2]
     3rd person: [person] = [π]

The traditional person values 1st, 2nd, and 3rd person are then represented as bundles of these privative features:

(14) 1st: {[π], [2], [1]}
     2nd: {[π], [2]}
     3rd: {[π]}

The important point of this decomposition is that there are entailment relations between the privative features: If a category contains [Speaker], it also contains [Participant] and [π]. In this way, hierarchies are encoded in the representation of phi-features: A value which is high on a scale is encoded by a superset of features compared to a value which is lower on this scale; in the case at hand this means 1st ≻ 2nd ≻ 3rd person. I adopt this decomposition and will apply its logic also to other goal features. However, I will represent the privative features by more abstract letters (instead of numbers or abbreviations of their semantics) in order to allow for comparison of patterns between languages in which different phi-features are responsible for the case split (cf. section 4.3). For the person features of Béjar & Řezáč (2009) this looks as in (15).

(15) Abstract person features
     a. [C] = general person feature (= π)
     b. [B] = participant feature (= [Participant])
     c. [A] = speaker (= [Speaker])
     d. [C] = 3rd person, [BC] = 2nd person, [ABC] = 1st person

Finally, I assume that v has expectations about the properties of its arguments: It expects the typical unmarked case that DPint is lower on the hierarchy than DPext. In the present system, this means that the probe features which v provides in the internal probe feature set are a subset of the probe features in the external probe feature set. In Yurok, where a distinction is made between local person and non-local person, v enters the derivation with the following features (local person encoded by [BC], non-local person by [C]; the feature [A] is irrelevant because 1st and 2nd person behave alike with respect to the case split):


(16) v in Yurok
     v { [•V•] [•D•], [∗BC∗]ext, [∗C∗]int }

These assumptions have the following consequences: Because of incremental structure building, v agrees first with DPint at a stage of the derivation where DPext has not yet been merged. If DPint is atypical in that it possesses more features than v provides for it (viz., if DPint is higher on a scale than expected), it cannot check all of its features and violates FC. Take Yurok as an example; v is repeated in (17).

(17) v in Yurok
     v { [∗BC∗]ext, [∗C∗]int }

v expects DPint to be 3rd person [C], but if it is 1st or 2nd person [BC], the feature [B] of the goal cannot be checked and the constraint FEATURE CHECKING (FC) will be violated. I propose that there is a repair strategy, called Maraudage, which can apply in order to avoid the violation of FC: v possesses the required probe feature [∗B∗], but it is in the wrong probe set, the set provided for Agree with DPext. What happens is that the required feature is displaced into the probe set for Agree with DPint.

(18) Maraudage: Features on v can be displaced from probe set A to probe set B.

This means that features which were originally provided for checking with DPext are displaced from set [∗F∗]ext into set [∗F∗]int. Afterwards, DPint can check the displaced feature in [∗F∗]int as well:

     input:      v { [∗BC∗]ext, [∗C∗]int }
     Maraudage:  [∗B∗] is displaced from the external set (struck through in its original set)
     result:     v { [∗C∗]ext, [∗BC∗]int }
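The feature decomposition in (15) and the displacement in (18) lend themselves to a set-based sketch. The Python fragment below is an illustration under stated assumptions, not the chapter's formalism: person values are modelled as frozensets of the letters A, B, C, v is a dictionary with the hypothetical keys `ext` and `int`, and the function names are invented for this example.

```python
# Sketch of (15)-(18); the dictionary layout and function names are assumptions.
PERSON = {"1st": frozenset("ABC"), "2nd": frozenset("BC"), "3rd": frozenset("C")}


def outranks(p, q):
    """p is higher on the person scale than q iff its bundle is a proper superset."""
    return PERSON[p] > PERSON[q]


def maraudage(v, goal):
    """Displace from the external probe set the features DPint needs but cannot check."""
    missing = goal - v["int"]      # goal features without a matching internal probe
    stolen = missing & v["ext"]    # only features that v actually provides can move
    return {"ext": v["ext"] - stolen, "int": v["int"] | stolen}


v_yurok = {"ext": frozenset("BC"), "int": frozenset("C")}   # cf. (17)
print(maraudage(v_yurok, frozenset("BC")))  # local-person DPint: [B] is displaced
print(maraudage(v_yurok, frozenset("C")))   # 3rd person DPint: nothing is displaced
```

Encoding the scale as a proper-superset relation mirrors the entailment between privative features: a higher value on the scale always carries every feature of a lower one.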

If Maraudage is to lead to the satisfaction of FC for DPint , a certain order of operations is to be adhered to: First, every privative probe feature that wants to enter into an Agree relation looks for a matching goal, because Match is a prerequisite for Agree. If it finds a goal but this goal has a superset of features of the probe and hence there is no one-to-one relation between probe features and matching goal features, Maraudage can apply. Afterwards, the actual Agree operation takes place which checks the involved features. In this way it is guaranteed that the marauded feature can enter into an Agree relation with DPint , too, which is the desired result. Note that Maraudage cannot apply freely; it is a repair strategy that only takes place when it is necessary to satisfy FC, but it is usually prohibited. This is expressed in the following constraint:


(19) NO MARAUDAGE (NoM)
     Do not displace probe features from probe set A to probe set B.

The fact that Maraudage is a repair operation suggests an optimality-theoretic analysis (Prince & Smolensky (1993)). If NoM is ranked below FC, it is possible to violate NoM and to displace features in order to fulfill FC for DPint. Whether Maraudage takes place or not has different consequences for what remains in set [∗F∗]ext. After Maraudage has taken place, only [∗C∗] remains in set [∗F∗]ext and therefore, DPext cannot be 1st or 2nd person [BC]; if it were [BC], the privative feature [B] of DPext could not be checked because the probe feature [∗B∗] had been checked by DPint after Maraudage. Hence, FC would be violated. If Maraudage does not take place, i.e., if DPint is 3rd person [C], [∗BC∗] remains in the external probe set and DPext can be 1st or 2nd person [BC]. In this way, the restrictions on DPext are brought about by the properties of DPint.4 Crucially, FC and NoM are checked at each derivational step in order to guarantee a local derivation of GCS (for the motivation of this concept of extremely local optimization cf. Heck & Müller (2007)). In particular, because of incremental structure building, there is the stage v' of the derivation to which these constraints apply. At this stage, DPint is the only argument in the structure. It can trigger Maraudage before DPext is merged. DPext has to cope with the remaining features. Hence, DPext depends on the properties of DPint. The output of this first optimization is then the input for the next evaluation: DPext is merged with the optimal v'-derivation and vP is projected. The constraints FC and NoM apply again, this time at the vP-level, but they cannot access DPint, which has been part of the previous optimization. Hence, there is no stage in the derivation at which the constraints can evaluate both arguments at the same time. The approach is thus local and not global.

In contrast to FC and NoM, the constraint FULL INT does not apply at every stage of the derivation, but only at the phase level, viz. at vP. It cannot apply at v' because v has probe features for DPint and DPext and the latter can only be checked after DPext has been merged.

4 Cf. also Adger & Harbour (2007) for the idea that an atypical DPint can absorb features on a functional head which were originally needed to select DPext. For Adger & Harbour such a derivation is bound to crash. In this way, they derive the strong version of the Person Case Constraint (PCC). In the present analysis, however, it depends on the exact properties of DPext whether the derivation converges or not. A related difference between the two approaches is that the checking of the feature provided for DPext by DPint is unavoidable in Adger & Harbour’s approach but optional in mine. This is due to the fact that the strong version of the PCC that they discuss is not a global phenomenon but rather a local one: whenever DPint (i.e., the accusative marked direct object of a transitive verb) is 1st/2nd person, the PCC arises regardless of the features of the coargument (the dative). The global version of the PCC is the weak PCC in which the emergence of a PCC effect also depends on the properties of the coargument. When I discuss Local Case Splits, Maraudage will also be obligatory, cf. section 5.1. Hence, Adger & Harbour develop a similar idea but they only apply it to local phenomena.
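How the restriction on DPext follows from what is left in the external probe set can be sketched in a few lines. The Python fragment below is a minimal illustration, assuming the same set-based encoding of features as in (15); the function `agree` and the dictionary keys are hypothetical names. It models only condition (10-d) and the constraint FC, and deliberately leaves out c-command, closeness and the FULL INT check on leftover probe features.

```python
# Sketch of Agree with set indices (10-d) and of FC (11); names are assumptions.
def agree(v, goal, index):
    """Check the goal against the probe set that bears the same index.

    Returns the goal features left unchecked; any such feature violates FC.
    """
    return goal - v[index]


LOCAL, THIRD = frozenset("BC"), frozenset("C")
v_plain = {"ext": frozenset("BC"), "int": frozenset("C")}      # no Maraudage
v_marauded = {"ext": frozenset("C"), "int": frozenset("BC")}   # [B] displaced

for label, v in (("no Maraudage", v_plain), ("Maraudage", v_marauded)):
    for name, dp_ext in (("1st/2nd", LOCAL), ("3rd", THIRD)):
        left = agree(v, dp_ext, "ext")
        status = "FC satisfied" if not left else f"FC violated by {sorted(left)}"
        print(f"{label}, DPext {name}: {status}")
```

The loop only ever consults the external probe set, so the evaluation at the vP-level never looks back at DPint, in line with the local character of the approach.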


3.3. Morphological realization

In this subsection I address the question of what the overt case marker realizes. The overt case marker shows up when DPint is higher on a scale than DPext. It is exactly in these contexts that Maraudage can apply. Therefore, I propose that the overt case marker is a morphological reflex of Maraudage.5 When Maraudage takes place, the diacritic ‘ ’ is generated in a probe set on v, represented as follows if the probe set contains a feature [F]: [F]. Let us assume for the moment that it attaches to the marauded feature, as in the shaded box below (18). This diacritic is passed on (copied) to the argument that checks the displaced feature via Agree. I propose that the overt case marker is the morphological realization of this diacritic on an argument.

5 Harbour (2008) develops a similar idea, namely that the head which initiates Agree (V with DPint and v with DPext) has different expectations on the properties of the arguments: DPext should have certain phi-features which encode semantic properties, whereas the expectations for DPint are underspecified, viz. V probably selects only for the category N but not for certain phi-features. The idea is then that copying of phi-features takes place if the specifications of the DP and the expectation about the specifications on the head do not exactly match: a highly specified, atypical DPint copies phi-features on V because V has fewer selection features, whereas a modestly specified DPext receives phi-features provided by v for it. The core point is that an overt case marker is a realization of the copied phi-features. Our analyses are very similar in the sense that the case marker is a realization of displaced phi-features appearing with atypical arguments. In my approach the displacement is on the functional head which Agrees with the arguments whereas in Harbour’s account it takes place between a head and an argument. The crucial difference is that Harbour’s approach derives only local splits; he tentatively extends his analysis to the strong version of the Person Case Constraint which he takes to be a global phenomenon, but the strong version is in fact a local phenomenon, too; it is only the weak version of the PCC which is global (cf. fn. 4).

This can be modeled in a postsyntactic, realizational model of morphology like Distributed Morphology (DM, Halle & Marantz (1993), Halle & Marantz (1994), Harley & Noyer (1999)). In DM, syntax operates solely on morphosyntactic feature bundles. Phonological information is added after the syntactic computation. Vocabulary items (VIs) which pair phonological information with morphosyntactic features are inserted into terminal nodes of the syntactic structure in accordance with the Subset Principle and Specificity: The most specific VI which realizes a subset of the features of the terminal node is inserted. In GCS languages, there is a VI which is sensitive to the diacritic generated by Maraudage, the element which is the overt case marker:

(20) Vocabulary items
     a. /X/ ↔ [ ]
     b. Ø ↔ [ ]

When a DP possesses the diacritic (DPint in the Yurok example), the first vocabulary item is inserted because it is more specific than the second. Otherwise, the


zero exponent is inserted; alternatively, there is no zero exponent and hence, no vocabulary item can be inserted if Maraudage has not taken place.6 As noted in the introduction, GCS languages differ with respect to the location of the split on either DPext (ergative) or DPint (accusative). I propose that this difference arises as a consequence of where the diacritic is generated: If it is generated in the probe feature set in which the displaced feature ends up (= [∗F∗]int), it is passed on to DPint via Agree and hence, an accusative case marking pattern arises, as e.g. in Yurok. If the diacritic is generated in the set from which the feature is displaced (= [∗F∗]ext), it is passed on to DPext via Agree and an ergative pattern arises. Hence, either a language marks that the set [∗F∗]int is atypical in that there are more features after Maraudage than there were originally, or the language marks that something unusual happened to the set [∗F∗]ext in that features have been displaced from it.

3.4. Intermediate summary

Let me briefly summarize how the problems for an analysis of GCS in a derivational framework laid down in section 2 are solved in the present system. The case assigner v can communicate with two arguments because it provides Agree-triggering probe features for both DPext and DPint. Look-ahead is no longer needed because DPext depends on the properties of DPint under the new perspective in (8): DPint determines which properties DPext can have and this can be modeled in a derivational bottom-up syntax without look-ahead. Finally, case assignment is in accordance with the Strict Cycle Condition because the diacritic which is spelled out by the overt case marker is assigned cyclically by Agree. It is generated on v as a consequence of Maraudage when DPext has not yet been merged. The system as it is presented up to now overgenerates, because there is not always an overt case marker when DPint is high on a scale (see Table 2). The influence of DPext is illustrated in the next section.

6 Note that there is no abstract case in the system developed in section 3.2: Arguments are not assigned a case value in the syntax which can then be morphologically realized or not. Rather, there is just a morphological reflex of the operation Maraudage which can be called morphological case. But in a sense, there is an equivalent of the case filter in my analysis: the standard minimalist implementation of the case filter is that DPs have an uninterpretable case feature [uCASE] which must be checked as a consequence of an Agree relation with a functional head. The requirement that DPs must Agree with functional heads is formulated in the constraint FEATURE CHECKING in (11) in the Maraudage approach. Hence, there is also a licensing condition on DPs, namely that they must check their phi-features with v; the only difference is that this does not additionally result in checking or assignment of a case value.
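The insertion logic of section 3.3 (Subset Principle plus Specificity) can be sketched as follows. This Python fragment is only an illustration: the label `"DIACRITIC"` is a placeholder introduced here for the diacritic generated by Maraudage, the exponent strings stand in for the vocabulary items in (20), and specificity is approximated by the number of features an item mentions.

```python
# Sketch of vocabulary insertion for (20); "DIACRITIC" is a placeholder label.
VOCABULARY = [
    ("/X/", frozenset({"DIACRITIC"})),  # overt case marker
    ("Ø", frozenset()),                 # elsewhere (zero) exponent
]


def insert_exponent(node):
    """Pick the most specific item whose features are a subset of the node's."""
    matching = [(exp, feats) for exp, feats in VOCABULARY if feats <= node]
    return max(matching, key=lambda item: len(item[1]))[0]


print(insert_exponent(frozenset({"B", "C", "DIACRITIC"})))  # DP that checked a marauded feature
print(insert_exponent(frozenset({"C"})))                    # DP without the diacritic
```

Under the alternative mentioned in the text, where there is no zero exponent, the second entry would simply be dropped from the hypothetical `VOCABULARY` list and no item would match a node without the diacritic.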

4. Derivations

Languages with GCS can be divided into (at least) two groups: those which depend on a binary Silverstein scale and those which depend on a tripartite scale. I call the appearance of an overt case marker a binary scale effect in the former and a tripartite scale effect in the latter.

4.1. Binary scale effects

4.1.1. Yurok

In this subsection I go through the derivation of binary scale effects in detail. The first example is Yurok. We have already seen that DPint bears an overt case marker if it is higher on the person scale in (21) than DPext.

(21) Person hierarchy in Yurok
     1st/2nd ≻ 3rd

Table 3. Person/case combinations in Yurok

            person of DPext   case of DPext   person of DPint   case of DPint
pattern 1   1st/2nd           Nom             1st/2nd           Nom
pattern 2   3rd               Nom             1st/2nd           Acc
pattern 3   1st/2nd           Nom             3rd               Nom
pattern 4   3rd               Nom             3rd               Nom

The split is driven by a binary scale that only distinguishes speech act participants from non-participants; hence, only the general person feature [C] and the participant feature [B] play a role for GCS.

(22) Encoding of person in Yurok
     a. 3rd person: [C]
     b. 1st/2nd person: [BC]

v expects DPext to be higher on the person scale than DPint.

(23) v in Yurok
     v [[∗BC∗]ext, [∗C∗]int]

In order to derive a Global Case Split, Maraudage must be optional if it can apply. The reason is that we do not always find an overt case marker when DPint is atypical, i.e., 1st or 2nd person [BC]. Whether it shows up or not depends on the features of DPext (cf. Table 3), but since the present approach is local and cannot access the features of DPext at the stage v' when it is decided whether Maraudage applies, Maraudage must be optional. It is only at the next cycle, vP, that a decision is made whether the candidate that has applied / has not applied Maraudage at v' wins, depending on the features of DPext. In OT, optionality can be expressed by a tie between the relevant constraints, namely FC, which eventually triggers Maraudage, and NoM, which prohibits Maraudage: FC ◦ NoM. This is a conjunctive local tie (cf. Müller (2000, ch. 5)), which means that the two constraints form a complex constraint that is violated if one of its subconstraints is violated. At the vP-level, FullInt will become decisive. If v or DPext has an unchecked feature, the derivation must crash in order to derive that some logically possible patterns are not attested. The crash of the derivation can be represented in OT in the following way: there is a candidate which is empty, Ø, the empty output. If this candidate becomes optimal, nothing is pronounced, which is in a sense the same as the crash of the derivation: a certain combination of features cannot be uttered. The Empty Output Condition militates against the empty output:

(24) Empty Output Condition (EOC)
     Avoid the empty output.

The final ranking of constraints for languages with a GCS is shown in (25):

(25) Ranking in GCS languages
     FullInt_vP ≫ FC ◦ NoM ◦ EOC

I now go through the derivations to show how the patterns in Table 3 are derived in a local way. The output(s) of the optimization at the v'-stage is (are) the input(s) for the optimization at the vP-stage of the derivation. Since FullInt_vP does not apply at v', I leave it out in the tableaux which evaluate this stage of the derivation. Let us begin with a typical DPint which is 3rd person [C]. Checked features are indicated by a strike-through [F]; a marauded feature is represented as [/F/] in its original set.
‘ ’ is the representation of the diacritic generated by Maraudage.
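The evaluation under a conjunctive local tie can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `profile` and `optimal`, the candidate labels, and the encoding of a ranking as a list of strata are hypothetical and not part of the analysis. The violation marks are the ones given for a 3rd person and a 1st/2nd person DPint at the v'-level in the tableaux of this section. A tie is modelled by summing the marks of all constraints in one stratum, which is one way to implement the idea that the tied constraints form a single complex constraint.

```python
from typing import Dict, List, Sequence, Tuple

Violations = Dict[str, int]
# A ranking is a sequence of strata, highest first; a stratum that lists
# several constraints is a tie (hypothetical encoding for this sketch).
Ranking = Sequence[Tuple[str, ...]]


def profile(violations: Violations, ranking: Ranking) -> Tuple[int, ...]:
    """Sum the marks inside each stratum, so tied constraints count as one."""
    return tuple(sum(violations.get(c, 0) for c in stratum) for stratum in ranking)


def optimal(tableau: Dict[str, Violations], ranking: Ranking) -> List[str]:
    """Return every candidate whose profile is lexicographically minimal."""
    best = min(profile(v, ranking) for v in tableau.values())
    return [name for name, v in tableau.items() if profile(v, ranking) == best]


# v'-level: FullInt does not apply yet, only the tie FC ◦ NoM ◦ EOC is active.
V_BAR_RANKING = [("FC", "NoM", "EOC")]

# DPint is 3rd person [C]: plain Agree violates nothing.
typical_dp_int = {
    "Agree": {},
    "Maraudage": {"NoM": 1},
    "no Agree": {"FC": 1},
    "empty output": {"EOC": 1},
}
# DPint is 1st/2nd person [BC]: [B] stays unchecked without Maraudage.
atypical_dp_int = {
    "Agree": {"FC": 1},
    "Maraudage": {"NoM": 1},
    "no Agree": {"FC": 2},
    "empty output": {"EOC": 1},
}

print(optimal(typical_dp_int, V_BAR_RANKING))
print(optimal(atypical_dp_int, V_BAR_RANKING))
```

With a typical DPint only plain Agree survives, whereas with an atypical DPint three candidates tie, which is the formal source of the optionality of Maraudage.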

Scenario 1: DPint is 3rd person [C]

(26) Stage of the derivation = v'
     Input: v {[∗BC∗]ext, [∗C∗]int}, DPint = [C]

    Candidate                                                   FC    NoM   EOC
  ☞ C1: v {[∗BC∗]ext, [∗C∗]int}, DPint [C] (Agree for [C])
    C2: v {[∗/B/C∗]ext, [∗BC∗]int}, DPint [C] (Maraudage)             *!
    C3: v {[∗BC∗]ext, [∗C∗]int}, DPint [C] (no Agree)           *!
    C4: Ø                                                                   *!

C1, in which the only probe feature of v in [∗F∗]int is checked with the only person feature of DPint, is the optimal candidate because it does not violate any constraint. All other candidates violate the constraint tie once: C4 is the empty output and violates EOC; in C3 no Agree applies and hence the person feature [C] of DPint is not checked, which violates FC; C2 marauds a feature, but since Maraudage is not necessary in this case, it is blocked by a violation of NoM. C1 is thus the input for optimization at the vP-level. Since C1 is a candidate without Maraudage, there will be no overt case marker if DPint is 3rd person. The first case to consider at the vP-level is one in which a 1st/2nd person DPext is merged with the output of the previous optimization:

(27) Stage of the derivation = vP, DPext is 1st/2nd person [BC]
     Input: v {[∗BC∗]ext, [∗C∗]int}, DPext = [BC]

    Candidate                                                   FullInt  FC    NoM   EOC
  ☞ C1: v {[∗BC∗]ext, [∗C∗]int}, DPext [BC] (full Agree)
    C2: v {[∗BC∗]ext, [∗C∗]int}, DPext [BC] (partial Agree)     *!*      *
    C3: v {[∗BC∗]ext, [∗C∗]int}, DPext [BC] (no Agree)          *!***    **
    C4: Ø                                                                            *!

If DPext is 1st/2nd person [BC], all features of DPext and v can be checked since v provides exactly the probe feature counterparts [∗BC∗] and hence, no constraint is violated. If only one of the features or no feature has been checked (C2 and C3), FullInt is fatally violated by the unchecked probe features of v and the unchecked person features of DPext. The empty output violates EOC. Next, assume that a 3rd person DPext [C] is merged with the optimal output C1 of the optimization in (26). This derivation should crash since the probe feature [∗B∗] of v cannot be checked, given that DPext is [C]. However, this pattern is attested. The crucial observation is that when both arguments of a transitive verb are 3rd person [C], it is already clear before the derivation starts that [∗B∗] can never be checked because neither DPint nor DPext possesses a feature [B]; both are 3rd person [C]. I propose that the system is able to detect such a situation and provides a mechanism that solves the problem already in the numeration before the derivation starts:

(28) F-deletion7
     A probe feature [∗F∗] can be deleted on a head α in the numeration if it is impossible to check F in the first place, because none of the arguments of α possesses a matching feature F (where F is a variable over the privative features A, B, and C).

A feature which is deleted in the numeration is set in gray in the tableaux. The derivation with a 3rd person DPext is then as follows:

(29) Stage of the derivation = vP, DPext is 3rd person [C], F-deletion applies to [∗B∗]
     Input: v {[∗BC∗]ext, [∗C∗]int}, DPext = [C]

    Candidate                                                   FullInt  FC    NoM   EOC
  ☞ C1: v {[∗BC∗]ext, [∗C∗]int}, DPext [C] (full Agree)
    C2: v {[∗BC∗]ext, [∗C∗]int}, DPext [C] (no Agree)           *!       *
    C3: Ø                                                                            *!

As in the derivation in (27), C1 is the optimal output because no unchecked features remain on v or DPext. If no Agree took place, FullInt and FC would be violated. To conclude, we have derived patterns 3 and 4 of Table 3: If DPint is 3rd person, it bears no overt case marker, regardless of the features of DPext. There is no overt case marker because the optimal candidate of the v'-evaluation is a candidate without Maraudage, and since the case marker reflects Maraudage, it cannot appear if DPint is 3rd person.

We now turn to the derivations in which DPint is atypical, i.e., 1st or 2nd person [BC]. We start with the evaluation of the v'-level. Since DPint wants to check more features than v provides for it, Maraudage can apply to the probe feature [∗B∗], displacing it from set [∗F∗]ext to set [∗F∗]int.
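The two operations on probe sets that drive the analysis, F-deletion in the numeration and Maraudage at the v'-level, can be pictured as simple set operations. The following Python sketch is an illustration only: the function names `f_delete` and `maraud`, the use of frozensets for feature bundles, and the Boolean that stands in for the diacritic are assumptions made for the example, not the formalization used in the text.

```python
from typing import FrozenSet, Iterable, Tuple

Features = FrozenSet[str]


def f_delete(probe: Features, arguments: Iterable[Features]) -> Features:
    """F-deletion as in (28): keep only those probe features that at least
    one argument in the numeration can match."""
    available: Features = frozenset().union(*arguments)
    return probe & available


def maraud(ext: Features, int_: Features, dp_int: Features) -> Tuple[Features, Features, bool]:
    """Maraudage: displace from the external probe set the features that
    DPint has but the internal probe set lacks. The Boolean records whether
    anything was displaced, i.e. whether a diacritic is generated."""
    needed = (dp_int - int_) & ext
    return ext - needed, int_ | needed, bool(needed)


THIRD, PARTICIPANT = frozenset("C"), frozenset("BC")   # encoding in (22)
V_EXT, V_INT = frozenset("BC"), frozenset("C")         # v in Yurok, (23)

# Both arguments are 3rd person: [*B*] can never be checked and is deleted.
print(sorted(f_delete(V_EXT, [THIRD, THIRD])))

# DPint is 1st/2nd person: [*B*] moves from the external to the internal set.
ext, int_, diacritic = maraud(V_EXT, V_INT, PARTICIPANT)
print(sorted(ext), sorted(int_), diacritic)
```

In this sketch Maraudage with a 3rd person DPint leaves both probe sets unchanged and generates no diacritic, which corresponds to the observation that a typical DPint never bears an overt case marker.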

7

See Heck & Müller (2003) for arguments that access to elements in the numeration is not another instance of look-ahead.

Scenario 2: DPint is 1st/2nd person [BC]

(30) Stage of the derivation = v'
     Input: v {[∗BC∗]ext, [∗C∗]int}, DPint = [BC]

    Candidate                                                   FC    NoM   EOC
  ☞ C1: v {[∗BC∗]ext, [∗C∗]int}, DPint [BC] (Agree for [C])     *
  ☞ C2: v {[∗/B/C∗]ext, [∗BC∗]int}, DPint [BC] (Maraudage)            *
    C3: v {[∗BC∗]ext, [∗C∗]int}, DPint [BC] (no Agree)          **!
  ☞ C4: Ø                                                                   *

C3 does not apply Agree at all, which leads to two violations of FC, one of which is fatal. All other candidates violate the constraint tie only once and are thus optimal: C1 checks only the feature [C] of DPint, but its feature [B] remains unchecked because v does not provide [∗B∗] in the internal probe set. C2 marauds [∗B∗] and can thereby avoid a violation of FC, but causes a violation of NoM.8 The empty output violates the EOC. The empty output can be ignored: it cannot be further expanded by structure building since it does not have any operation-inducing features; the two other optimal candidates can merge an external argument and can be further evaluated. As a result of the evaluation at v', C1 without Maraudage and C2 with Maraudage are the input for the optimization at the vP-level. In both, a DPext of 1st/2nd or 3rd person can be merged. There are thus four possible derivations, but only two of them will converge and produce patterns 1 and 2 of Table 3. We continue with C1, in which no Maraudage applied, and merge a 3rd person DPext in (31).

8

One might think of other candidates which represent further repair strategies besides Maraudage, e.g. deletion of the unchecked feature of a DP, or insertion of a probe feature counterpart on v, etc. I assume that these repair strategies are not available because the faithfulness constraints that militate against these repairs outrank the highest ranked constraint FullInt.

Scenario 2.1: C1 continued

(31) Stage of the derivation = vP, DPext is 3rd person [C]
     Input: v {[∗BC∗]ext, [∗C∗]int}, DPext = [C]

    Candidate                                                   FullInt  FC    NoM   EOC
    C1: v {[∗BC∗]ext, [∗C∗]int}, DPext [C] (Agree for [C])      *!
    C2: v {[∗BC∗]ext, [∗C∗]int}, DPext [C] (no Agree)           *!**     *
  ☞ C3: Ø                                                                            *

Since no Maraudage applied at the v'-stage, [∗BC∗] remains on v and must be checked. But DPext provides only [C], such that FullInt is inevitably violated by the unchecked probe feature [∗B∗]. Only the empty output does not violate the highest ranked constraint FullInt and is thus optimal. This means that there does not exist a pattern with a 1st/2nd person DPint, a 3rd person DPext and without an overt case marker (since there is no Maraudage in this case). This is correct, cf. Table 3. Next, consider the case where a 1st/2nd person DPext is merged with C1 of (30):

(32) Stage of the derivation = vP, DPext is 1st/2nd person [BC]
     Input: v {[∗BC∗]ext, [∗C∗]int}, DPext = [BC]

    Candidate                                                   FullInt  FC    NoM   EOC
  ☞ C1: v {[∗BC∗]ext, [∗C∗]int}, DPext [BC] (full Agree)
    C2: v {[∗BC∗]ext, [∗C∗]int}, DPext [BC] (partial Agree)     *!*      *
    C3: v {[∗BC∗]ext, [∗C∗]int}, DPext [BC] (no Agree)          *!***    **
    C4: Ø                                                                            *!

In this case all remaining probe features of v and all the features of DPext can be checked. No constraint is violated, as shown in C1. In C2 and C3 only one or none of the features are checked and hence, FC and FullInt are fatally violated. This derives pattern 1 in Table 3: In a context with a 1st/2nd person DPint and a 1st/2nd person DPext there is no overt case marker because Maraudage did not apply.

The next two scenarios are those in which DPext is merged with the second optimal output C2 of (30), the candidate in which Maraudage did apply. As a consequence of Maraudage, only the probe feature [∗C∗] remains in the set [∗F∗]ext on v. We start with a 3rd person DPext.

Scenario 2.2: C2 continued

(33) Stage of the derivation = vP, DPext is 3rd person [C]
     Input: v {[∗/B/C∗]ext, [∗BC∗]int}, DPext = [C]

    Candidate                                                   FullInt  FC    NoM   EOC
  ☞ C1: v {[∗/B/C∗]ext, [∗BC∗]int}, DPext [C] (full Agree)
    C2: v {[∗/B/C∗]ext, [∗BC∗]int}, DPext [C] (no Agree)        *!*      *
    C3: Ø                                                                            *!

In C1 v and DPext can check all of their features and hence none of the constraints is violated. No checking at all (C2) or the empty output (C3) violate FC and the EOC, respectively. The optimal candidate is thus C1. This derives pattern 2 in Table 3: in the context 3rd person DPext and 1st/2nd DPint, DPint bears an overt case marker since Maraudage applied at the v'-level and there is a converging continuation of this derivation at the vP-level. The final case is the one in which C2 of (30) and a 1st/2nd person DPext are merged.

(34) Stage of the derivation = vP, DPext is 1st/2nd person [BC]
     Input: v {[∗/B/C∗]ext, [∗BC∗]int}, DPext = [BC]

    Candidate                                                   FullInt  FC    NoM   EOC
    C1: v {[∗/B/C∗]ext, [∗BC∗]int}, DPext [BC] (Agree for [C])  *!       *
    C2: v {[∗/B/C∗]ext, [∗BC∗]int}, DPext [BC] (no Agree)       *!**     **
  ☞ C3: Ø                                                                            *

v provides only [∗C∗] after Maraudage, but DPext needs to check [BC]. Hence, FullInt is inevitably violated by the unchecked feature [B] of DPext. As a consequence, the empty output becomes the optimal candidate. This means that there is no pattern in which both DPint and DPext are 1st/2nd person and DPint bears an overt case marker, cf. Table 3. Thus, all the patterns in Table 3 and the non-existence of the other logically possible combinations of person features and overt vs. zero case marking in Yurok are derived.
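The complete two-cycle derivation for Yurok can be replayed mechanically. The sketch below is a simplified model under several assumptions that go beyond the text: the candidate sets are restricted to Agree, no Agree, Maraudage and the empty output (the partial-Agree candidates are left out, since they never win); FullInt marks are counted as the unchecked probe features of the external set plus the unchecked features of DPext; F-deletion is applied once, before the first cycle; and all function and variable names are hypothetical.

```python
from itertools import product

PERSON = {"1st/2nd": frozenset("BC"), "3rd": frozenset("C")}   # (22)
V_EXT, V_INT = frozenset("BC"), frozenset("C")                 # (23)
V_BAR = [("FC", "NoM", "EOC")]                 # FullInt is inactive at v'
VP = [("FullInt",), ("FC", "NoM", "EOC")]      # ranking (25)


def winners(tableau, ranking):
    """Candidates with the lexicographically smallest violation profile;
    marks of tied constraints in one stratum are added up."""
    def profile(marks):
        return tuple(sum(marks.get(c, 0) for c in stratum) for stratum in ranking)
    best = min(profile(m) for m in tableau.values())
    return [cand for cand, m in tableau.items() if profile(m) == best]


def v_bar_cycle(ext, dp_int):
    """Optimize at v'. Returns (remaining external probe set, marauded?) for
    every optimal output that can be expanded further (not the empty output)."""
    moved = ext - V_INT          # in Yurok this is [B], unless F-deleted
    tableau = {
        ("agree", ext, False): {"FC": len(dp_int - V_INT)},
        ("no agree", ext, False): {"FC": len(dp_int)},
        ("empty", None, False): {"EOC": 1},
    }
    if moved:
        tableau[("maraud", ext - moved, True)] = {
            "NoM": 1, "FC": len(dp_int - (V_INT | moved))}
    return [(rest, m) for kind, rest, m in winners(tableau, V_BAR) if kind != "empty"]


def vp_cycle(ext, dp_ext):
    """Optimize at vP. True if the Agree candidate wins (convergence),
    False if the empty output wins (crash)."""
    tableau = {
        "agree": {"FullInt": len(ext ^ dp_ext), "FC": len(dp_ext - ext)},
        "no agree": {"FullInt": len(ext) + len(dp_ext), "FC": len(dp_ext)},
        "empty": {"EOC": 1},
    }
    return winners(tableau, VP) == ["agree"]


def derive(dp_ext, dp_int):
    """Set of attested case forms of DPint for a given argument combination."""
    e, i = PERSON[dp_ext], PERSON[dp_int]
    ext = V_EXT & (e | i)        # F-deletion (28) in the numeration
    return {"Acc" if marauded else "Nom"
            for rest, marauded in v_bar_cycle(ext, i)
            if vp_cycle(rest, e)}


for dp_ext, dp_int in product(PERSON, repeat=2):
    print(f"DPext {dp_ext:8} DPint {dp_int:8} case of DPint: {derive(dp_ext, dp_int)}")
```

Under these assumptions every combination of DPext and DPint yields exactly one converging outcome, and the outcomes coincide with the four patterns of Table 3. Note that at no point does `v_bar_cycle` receive the features of DPext, which mirrors the locality of the derivation; only F-deletion inspects both arguments, and it does so in the numeration.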
In a nutshell, the derivation goes as follows: Since Maraudage is optionally triggered if DPint is high on the person hierarchy, namely 1st/2nd person, (i) there cannot be an overt case marker with a 3rd person DPint and (ii) a converging derivation of vP does not necessarily result in an overt case marker because both a derivation with and one without Maraudage are optimal at the v'-level if DPint is atypical. Only if DPext is such that it can check all of its own features and those of v does a converging derivation arise. Otherwise, the empty output wins, which amounts to a crash of the derivation, i.e., such a pattern does not exist. Remember that the diacritic generated by Maraudage in the probe set [∗F∗]int is transmitted via Agree to DPint in Yurok. There it is realized by the vocabulary item given in (35) (it attaches only to singular arguments, hence the context restriction):9

(35) Case exponent in Yurok
     -ac ↔ [ ] / [sg]

4.1.2. Umatilla Sahaptin

Another example of a binary split can be found in Umatilla Sahaptin (Penutian). DPext bears an ergative marker (glossed as INV.ERG) if DPint is higher on the person scale in (36) than DPext.

(36) Person hierarchy in Umatilla Sahaptin
     1st/2nd ≻ 3rd

(37) GCS in Umatilla Sahaptin (Rigsby & Rude (1996, 676-677))
     a. ɨwínš i-tuxnana yáamaš-na
        man 3NOM-shot mule.deer-OBJ
        ‘The man shot a mule deer.’ 3rd sg > 3rd
     b. Ín=aš á-q’inu-ša awínš-in-aman
        1SG.NOM=1SG 3-see-IMPV men-DU-OBJ.PL
        ‘I see the two men.’ 1st > 3rd
     c. ɨwínš-nɨm=nam i-q’ínu-ša
        man-INV.ERG=2SG 3NOM-see-IMPV
        ‘The man sees you.’ 3rd sg > 2nd
     d. ɨwínš-nɨm=naš i-wyánawi-yawan-a
        man-INV.ERG=1SG 3SG-arrive-APPL-PST
        ‘The man came to me / my place.’ 3rd sg > 1st
     e. Čáw=nam paamaná á-yk-ša?
        NEG=2SG 3PL.OBJ 3-hear-IMPV
        ‘Don’t you hear them?’ 2nd > 3rd

This leads to the following attested patterns:

9

Plural forms of a 1st/2nd person pronoun in Yurok never bear the accusative marker. This restriction cannot be handled by a second Maraudage operation that applies to number features because then we would expect to obtain restrictions on the number of DPext by the number of DPint . But this is not the case. Therefore, I assume that -ac is a context-sensitive marker that can only be inserted in singular contexts.

Table 4. Person/case combinations in Umatilla Sahaptin

             person of DPext   case of DPext   person of DPint   case of DPint
Pattern 1:   1st/2nd           Abs             1st/2nd           Abs
Pattern 2:   3rd               Erg             1st/2nd           Abs
Pattern 3:   1st/2nd           Abs             3rd               Abs
Pattern 4:   3rd               Abs             3rd               Abs

The patterns in Umatilla Sahaptin are exactly the same as those in Yurok (compare Table 3) and the derivations are thus exactly the same; the only difference is the location of the split: an overt marker shows up on DPint in Yurok, but on DPext in Umatilla Sahaptin. As was already discussed in 3.3, this difference is handled by a parameter which concerns the emergence of the diacritic when Maraudage takes place: it emerges in the set [∗F∗]int in Yurok (attached to the marauded feature), but in the set [∗F∗]ext (attached to the remaining features in the set which was affected by Maraudage) in Umatilla Sahaptin. It is then passed on to DPext in the latter and realized by an overt marker:

(38) Case exponent in Umatilla Sahaptin10
     /nɨm/ ↔ [ ] / [sg]

4.2. Tripartite scale effects

There are also GCS languages in which overt case marking depends on a tripartite scale, namely Fore, Kashmiri and Awtuw. In this article, I concentrate on Fore (Trans-New Guinea) for ease of exposition. Case marking in Fore is driven by the animacy scale in (39). DPext bears an ergative suffix if it is lower on that scale than DPint.

(39) Animacy hierarchy in Fore
     human ≻ animate ≻ inanimate

(40) GCS in Fore (Scott (1978, 116))
     a. Yagaa-wama wá aegúye
        pig-ERG man hit
        ‘The pig hits the man.’ anim > hum
     b. Yagaa wá aegúye
        pig man hit
        ‘The man hits (or kills) the pig.’ hum > anim

10

As in Yurok, the marker only attaches to singular arguments, hence the context restriction.

Table 5. Animacy/case combinations in Fore

             animacy of DPext   case of DPext   animacy of DPint   case of DPint
Pattern 1:   hum                Abs             hum                Abs
Pattern 2:   anim               Erg             hum                Abs
Pattern 3:   inanim             Erg             hum                Abs
Pattern 4:   hum                Abs             anim               Abs
Pattern 5:   anim               Abs             anim               Abs
Pattern 6:   inanim             Erg             anim               Abs
Pattern 7:   hum                Abs             inanim             Abs
Pattern 8:   anim               Abs             inanim             Abs
Pattern 9:   inanim             Abs             inanim             Abs

The only important difference between binary and tripartite scale effects for the analysis is the decomposition of features: three privative features are needed to encode a tripartite scale in order to distinguish the three steps on the hierarchy, but only two privative features are needed for a binary scale. As was done for person in Yurok and Umatilla Sahaptin, animacy is decomposed into privative features such that the value which is higher on the animacy scale has a superset of features compared to the value lower on the scale. In Fore, [C] is a general animacy feature (as opposed to person, number, etc.), [B] encodes [+animate], and [A] means [+human]. The following encodings result:

(41) Representation of animacy features
     a. [C] encodes inanimates.
     b. [BC] encodes animates.
     c. [ABC] encodes humans.

Again, v expects DPext to be higher on the scale than DPint, namely that DPext is human and DPint is inanimate. Hence, Maraudage, which results in an overt case marker, can potentially take place when DPint is animate [BC] or human [ABC], cf. Table 5.

(42) Lexical entry for v in Fore
     v [[∗ABC∗]ext, [∗C∗]int]

All other assumptions are exactly as in Yurok and Umatilla Sahaptin, especially the ranking of the constraints, i.e., Maraudage is optional. Under these assumptions the patterns in Fore can be derived in exactly the same way as in the two other languages; there are just more combinations of DPint and DPext that could potentially be generated, and Maraudage can apply to more than one feature if DPint is human [ABC]. The gist of the analysis is again that since Maraudage need not and hence cannot apply with an inanimate DPint, there will never be an overt case marker in these cases. Maraudage applies if DPint is animate or human. Regardless of whether Maraudage takes place or not at the v'-level, the derivation only converges at the vP-level when DPext is such that it matches exactly the remaining probe features of v; otherwise v and/or DPext have unchecked features and cause a fatal violation of FullInt, which leads to the crash of the derivation.

4.3. Direction marking

In direction marking languages the occurrence of an overt verbal marker is driven by the same abstract pattern as the occurrence of the overt case marker in GCS languages: The verb bears an overt marker, the inverse marker, if DPint is higher on a Silverstein scale than DPext. The verb in a direct context is usually zero-marked. Thus, direction marking differs from Global Case Splits only in the locus of the exponent: head-marking in direction marking languages vs. dependent-marking in GCS languages (Nichols (1986)). Hence, direction marking is another global argument encoding phenomenon which can be analysed in the same way as Global Case Splits (Zúñiga (2006); Drellishak (2008)). To see the similarity more clearly, consider Nocte (Sino-Tibetan, Aissen (1999)): the direct, zero-marked verb form is used if DPext is higher on the person scale in (43) than DPint or if both are 3rd person (non-coreferent); the inverse marker -h is attached to the verb if DPint is higher on the scale than DPext.

(43) Person scale in Nocte
     1st ≻ 2nd ≻ 3rd

(44) Person hierarchy effects in Nocte (Das Gupta (1971, 21))
     a. hetho-min
        teach-1PL
        ‘I will teach you(pl).’ 1st > 2nd
     b. hetho-o
        teach-2
        ‘You will teach them.’ 2nd > 3rd
     c. hetho-h-ang
        teach-INV-1
        ‘You/he will teach me.’ 2nd/3rd > 1st
     d. hetho-h-o
        teach-INV-2
        ‘He will teach you.’ 3rd > 2nd

If person is decomposed as introduced in section 3.2, the following encodings of person features arise:

(45) Person features in Nocte
     a. 3rd person: [C]
     b. 2nd person: [BC]
     c. 1st person: [ABC]

If we now compare the abstract patterns of Nocte (inverse) with those of Fore (GCS), we can see that they are identical (the gaps in Nocte are due to the fact that person is the decisive feature in Nocte: the gaps are reflexive contexts; since the relevant feature in Fore is animacy, no such gaps arise):

(46) Patterns of Fore and Nocte (DPext–DPint)

     Fore                          Nocte
     ergative      Ø               inverse       Ø
     [BC]–[ABC]    [ABC]–[ABC]     [BC]–[ABC]    –
     [C]–[ABC]     [ABC]–[BC]      [C]–[ABC]     [ABC]–[BC]
     [C]–[BC]      [BC]–[BC]       [C]–[BC]      –
                   [ABC]–[C]                     [ABC]–[C]
                   [BC]–[C]                      [BC]–[C]
                   [C]–[C]                       [C]–[C]

Thus, these patterns should be derived in the same way, although one shows dependent-marking and the other head-marking. Under the Maraudage analysis developed in this article, direction marking in Nocte can be derived in the same local way as the Global Case Split in Fore. Note that no further assumptions are necessary to derive direction marking; the pattern is even expected under the analysis, given that the diacritic which shows that Maraudage has taken place is generated on v, a verbal projection. If DPint is atypical, Maraudage can apply. Depending on the properties of DPext, a derivation in which Maraudage has applied can become optimal. The diacritic is then realized postsyntactically on v instead of on an argument DP.11 The difference in locus between GCS and direction marking languages can be modeled by context restrictions on the relevant VI which make them category-sensitive:

(47) a. /X/ ↔ [ ] / [v]     inverse marker
     b. /X/ ↔ [ ] / [D]     case marker

11

It is of no importance whether in direction marking languages the diacritic is also transmitted to a DP via Agree or not. Either it is not copied to a DP or it is copied just as in GCS languages but it is simply not spelled out on the DP. Since there are languages which have direction marking and GCS simultaneously, e.g. Arizona Tewa (Kroskrity (1978; 1985); Zúñiga (2006)), the second option seems preferable.
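The tripartite pattern shared by Fore and Nocte can be checked with a short Python sketch. It rests on assumptions that are not spelled out in the text and should be read as such: (i) any non-empty subset of the features that DPint has but the internal probe set lacks may be marauded, (ii) every such option is passed on to the vP cycle, (iii) F-deletion in the numeration removes probe features that neither argument can match, and (iv) convergence at vP is reduced to the requirement stated above that DPext matches exactly the remaining external probe features. The names `SCALE`, `subsets` and `outcomes` are hypothetical.

```python
from itertools import combinations, product

# Encoding (41) for Fore; with person values, (45) for Nocte has the same shape.
SCALE = {"hum": frozenset("ABC"), "anim": frozenset("BC"), "inanim": frozenset("C")}
V_EXT, V_INT = frozenset("ABC"), frozenset("C")        # lexical entry (42)


def subsets(features):
    """All subsets of a feature set, the empty set (no Maraudage) included."""
    items = sorted(features)
    for size in range(len(items) + 1):
        for combo in combinations(items, size):
            yield frozenset(combo)


def outcomes(dp_ext, dp_int):
    """Converging outcomes for one argument combination:
    True = diacritic present (overt marker), False = zero marking."""
    ext_f, int_f = SCALE[dp_ext], SCALE[dp_int]
    ext_probe = V_EXT & (ext_f | int_f)         # F-deletion (28), assumption (iii)
    needed = (int_f - V_INT) & ext_probe        # what Maraudage could displace
    result = set()
    for marauded in subsets(needed):
        remaining = ext_probe - marauded
        if remaining == ext_f:                  # exact match at vP, assumption (iv)
            result.add(bool(marauded))
    return result


for dp_int, dp_ext in product(SCALE, repeat=2):
    marked = outcomes(dp_ext, dp_int)
    case = "Erg" if marked == {True} else "Abs"
    print(f"{dp_ext:7} > {dp_int:7} case of DPext: {case}")
```

Under these assumptions each of the nine combinations has exactly one converging outcome, and the overt marker appears in the three configurations listed as ergative in Table 5. Whether the resulting diacritic is spelled out on a DP (case marker) or on v (inverse marker) is a matter of the context restriction in (47) and does not affect the derivation itself.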

5. Languages without Global Case Splits

The analysis developed for languages with Global Case Splits might seem to be designed for the relatively small number of languages with GCS. In this section I show that the analysis can also handle languages with Local Case Splits (LCS) and languages without case splits. Finally, I address how Burzio’s generalization can be derived from the Maraudage approach.

5.1. Local Case Splits

The main difference between Global and Local Case Splits is that in the latter, case marking solely depends on the properties of DPint and not also on those of the coargument. To model this in the present analysis, Maraudage must be obligatory if it can apply, i.e., whenever DPint is atypical. Hence, FC must outrank NoM: FC ≫ NoM. As a result, only one candidate will be optimal when the v'-level is evaluated: if DPint is higher on the scale than expected, the candidate that applies Maraudage is optimal and it will be the input for optimization at the vP-level. There will thus always be an overt case marker with an internal argument which is high on a Silverstein scale. The ranking in languages with Local Case Splits is given in (48):

(48) Ranking for LCS
     EOC ≫ FullInt ≫ FC ≫ NoM

Another contrast between languages with GCS and LCS is that the constraint EOC must be the highest ranked in the latter. The reason is the context in which DPext and DPint are both high on a scale. Take, for example, a language that is like Yurok in that the split depends on a binary scale, but that has a local split. v expects DPint to be low on the scale: v { [∗BC∗]ext, [∗C∗]int }. If DPint is high on the binary scale, viz. [BC], Maraudage must apply under the ranking FC ≫ NoM. The result is that v provides only [∗C∗]ext for the external argument. If, however, DPext is also high on the scale, viz. [BC], its feature [B] cannot be checked because [∗B∗] has been marauded and checked by DPint and hence, FullInt and FC must be violated. If the EOC were ranked as in GCS languages, this would wrongly predict that the empty output is the optimal candidate in such a situation. But in LCS languages, if there is an overt case marker, i.e., if Maraudage has applied, a DPext with any properties can be merged: it will always lead to an attested pattern and hence, the derivation must converge (cf. Hebrew and Tauya below). This means that a violation of FullInt and FC by an unchecked feature of a DP cannot be fatal in LCS languages. Therefore, the EOC is the highest ranked constraint; the empty output can never be optimal.
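The effect of reranking can be made concrete by evaluating the same violation profiles under the GCS ranking in (25) and the LCS ranking in (48). The following Python sketch is illustrative only: the helper `winners` and the candidate labels are hypothetical, and the marks are those of the Yurok-type tableaux for a high DPint at v' and for a high DPext merged after Maraudage.

```python
def winners(tableau, ranking):
    """Optimal candidates; marks of constraints tied in one stratum are summed."""
    def profile(marks):
        return tuple(sum(marks.get(c, 0) for c in stratum) for stratum in ranking)
    best = min(profile(m) for m in tableau.values())
    return [cand for cand, m in tableau.items() if profile(m) == best]


GCS = [("FullInt",), ("FC", "NoM", "EOC")]           # ranking (25)
LCS = [("EOC",), ("FullInt",), ("FC",), ("NoM",)]    # ranking (48)

# v'-level, DPint = [BC]: v provides only [*C*] in the internal probe set.
v_bar = {
    "no Maraudage": {"FC": 1},
    "Maraudage": {"NoM": 1},
    "no Agree": {"FC": 2},
    "empty": {"EOC": 1},
}
# vP-level after Maraudage, DPext = [BC]: [B] of DPext cannot be checked.
vp = {
    "Agree": {"FullInt": 1, "FC": 1},
    "no Agree": {"FullInt": 3, "FC": 2},
    "empty": {"EOC": 1},
}

for name, ranking in (("GCS", GCS), ("LCS", LCS)):
    print(name, winners(v_bar, ranking), winners(vp, ranking))
```

Under the GCS ranking three candidates survive at v' and the empty output wins at vP, whereas under the LCS ranking Maraudage is the only winner at v' and the Agree candidate wins at vP despite its unchecked feature.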

With respect to the locus of the diacritic, the same variation arises as in GCS languages: the diacritic indicating Maraudage can be generated in [∗F∗]ext or in [∗F∗]int and is passed on to DPext and DPint, respectively. The latter case is the most common local split, in which the same argument on whose properties the split depends exhibits the split. An example of such an accusative case marking pattern is Hebrew, as shown in (2). In Hebrew, DPint bears a case marker if it is high on the binary definiteness scale, viz. definite (a pronoun, a name or a definite noun). In any other configuration, it is zero-marked (nominative):

(49) Definiteness hierarchy in Hebrew
     definite ≻ indefinite

Table 6. Definiteness/case combinations in Hebrew

             DPext       case of DPext   DPint   case of DPint
Pattern 1:   def/indef   Nom             def     Acc
Pattern 2:   def/indef   Nom             indef   Nom

What has to be done is to decompose definiteness into privative features: [C] is a general definiteness feature (as opposed to person, animacy, etc.) and [B] means [+definite]. Hence, the following feature bundles for definites and indefinites arise:

(50) Representation of definiteness features
     a. [C] encodes an indefinite referent
     b. [BC] encodes a definite referent

v in Hebrew also expects DPint to be lower on the scale than DPext:

(51) v in Hebrew
     v [[∗BC∗]ext, [∗C∗]int]

Under the ranking in (48) the patterns in Table 6 are derived. The diacritic is generated in [∗F∗]int and transferred to DPint under Agree, where it is spelled out as ʔet. There are also LCS languages with an ergative case marking pattern, i.e., in which the split depends on the properties of DPint but the case marker alternation shows up on DPext. Tauya (Trans-New Guinea) is such a language. In Tauya, DPext has an overt marker if DPint is high on the binary animacy hierarchy, viz. if it is human.

(52) Animacy hierarchy in Tauya
     human ≻ non-human

(53) LCS in Tauya (MacDonald (1990, 120-121 & 316))
     a. ya-ni/*Ø fanu Ø-yau-e-ʔa
        1SG-ERG/*ABS man 3SG-see-1/2-IND
        ‘I saw the man.’ hum > hum
     b. ya-Ø pai yau-e-ʔa
        1SG-ABS pig see-1/2-IND
        ‘I saw the pig.’ hum > non-hum

Table 7. Animacy/case combinations in Tauya

             DPext         case of DPext   DPint     case of DPint
Pattern 1:   hum/non-hum   Erg             hum       Abs
Pattern 2:   hum/non-hum   Abs             non-hum   Abs

The split can be derived in the present system if animacy is decomposed as in (54) and under the ranking in (48).

(54) Representation of animacy features
     a. [C] is a general animacy feature
     b. [B] means [+human]
     c. [C] encodes a non-human referent
     d. [BC] encodes a human referent

(55) v in Tauya
     v [[∗BC∗]ext, [∗C∗]int]

The diacritic indicating Maraudage is generated in the set [∗F∗]ext on v and transmitted to DPext via Agree, where it is realized by an overt marker. This gives rise to an ergative pattern of case marking.

5.2. Languages without case splits

In languages without case splits an argument A always shows the same case marker, regardless of the nature of DPint. In German, for example, DPint of a transitive verb always bears accusative case (except when a verb assigns inherent case which overwrites the default accusative). If Maraudage leads to overt case marking, it must apply in every derivation in these languages. To guarantee that this is enforced, languages without case splits have to be treated like languages with Local Case Splits, i.e., Maraudage must be obligatory if possible: FC ≫ NoM. In order to ensure that Maraudage can apply even with typical internal arguments which are low on a scale, the probe set [∗F∗]int on v must be empty:

(56) v { [∗(A)(B)C∗]ext , [∗F∗]int }

The external probe set must at least contain the feature which is part of every value of a category, i.e., [C] (it can contain more but this does not matter for the core of the analysis). Note that this representation also matches the requirement in GCS and LCS languages that v provides for Agree with DPint a subset of the features provided for Agree with DPext. The consequence of the empty internal probe set is that any DPint can trigger Maraudage because DPint always contains at least the feature [C]. Given the ranking FC ≫ NoM, Maraudage must then apply. As in LCS languages, the properties of DPext are of no importance: FullInt and FC are non-fatally violable under the ranking in (48). Depending on where the diacritic that indicates Maraudage is generated (in the set [∗F∗]ext or in the set [∗F∗]int), an ergative or an accusative pattern of case marking arises.

5.3. Burzio’s Generalization

An interesting consequence of the present approach is that Burzio’s Generalization can be accounted for. Reformulated in modern terms, Burzio (1986) states that only the v which selects an external argument can assign accusative. Burzio’s generalization concerns abstract case, but in the present system there is no abstract case feature (cf. footnote 6); hence the correlation can only hold for morphological case. The derivation goes as follows: Morphological accusative case is always the indicator of the operation Maraudage. Maraudage is only possible if there are two probe feature sets on v. But since in intransitive contexts it is generally assumed that there can be only one probe for the single argument if the derivation is to converge (v { [∗F∗] }), Maraudage is excluded: there is no other probe feature set from which features could be displaced. Consequently, no diacritic is generated and there can be no accusative marking. The same holds for transitive verbs which are passivized. A necessary step for passivization is argument reduction, which can be brought about by deletion of [•D•] on v in the present approach. Now, if this includes deletion of one of the probe sets on v, since passivization is a detransitivizing operation and intransitive v has only one probe set, then again only a single probe set remains on v after argument reduction and Maraudage is precluded.
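The reasoning behind Burzio’s Generalization amounts to the observation that Maraudage needs a donor set. A minimal Python sketch, assuming that the probe sets of v are stored under the hypothetical keys "ext" and "int", that a single remaining probe set is stored as "int", and that argument reduction deletes the external set (the text leaves open which set is deleted):

```python
from typing import Dict, FrozenSet

ProbeSets = Dict[str, FrozenSet[str]]


def passivize(v: ProbeSets) -> ProbeSets:
    """Argument reduction, assumed here to delete the external probe set."""
    return {name: probes for name, probes in v.items() if name != "ext"}


def maraudage_possible(v: ProbeSets, dp_int: FrozenSet[str]) -> bool:
    """Maraudage needs a second probe set that holds features DPint lacks."""
    donor = v.get("ext")
    if donor is None:
        return False
    return bool((dp_int - v["int"]) & donor)


# v in a language without case splits, cf. (56): the internal set is empty.
transitive = {"ext": frozenset("C"), "int": frozenset()}
intransitive = {"int": frozenset("C")}
dp = frozenset("C")

print(maraudage_possible(transitive, dp))             # two probe sets
print(maraudage_possible(passivize(transitive), dp))  # one set after reduction
print(maraudage_possible(intransitive, dp))           # one set to begin with
```

Only the transitive v allows Maraudage in this sketch, so only there can a diacritic, and hence morphological accusative, arise.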

6. Conclusion

In this paper I have shown how Global Case Splits can be derived in a local and cyclic way. GCSs have rarely been addressed in the literature although they are a challenge for a cyclic derivational syntax: they either require look-ahead or counter-cyclic case assignment under standard minimalist assumptions about structure building and case assignment. I proposed that the data should be looked at from a new perspective: The features of DPext depend on those of DPint. Whenever DPint is atypical (viz. high on a scale), the repair operation Maraudage can apply: it steals features on v provided for DPext in order to use them for Agree with DPint. As only the features of DPint are relevant for Maraudage, a local derivation is possible. At no point in the derivation is the information about both arguments of a transitive verb accessible. Overt case marking is analysed as a reflex of Maraudage. The typology of global and local case marking strategies is derived by two parameters: (i) Maraudage is obligatory (FC ≫ NoM) or optional (FC ◦ NoM) and (ii) the diacritic is generated in the external or the internal probe set on v, which accounts for ergative vs. accusative patterns of case marking. Furthermore, it was shown that the analysis carries over to another global argument encoding pattern, namely inverse marking, and to languages without case splits, and finally that it can derive Burzio’s Generalization.

Bibliography

Adger, David & Daniel Harbour (2007): ‘Syntax and Syncretisms of the Person Case Constraint’, Syntax 10, 2–37.
Aissen, Judith (1999): ‘Markedness and Subject Choice in Optimality Theory’, Natural Language and Linguistic Theory 17, 673–711.
Aissen, Judith (2003): ‘Differential Object Marking: Iconicity vs. Economy’, Natural Language and Linguistic Theory 21, 435–483.
Anagnostopoulou, Elena (2003): The Syntax of Ditransitives: Evidence from Clitics. Mouton de Gruyter, Berlin.
Béjar, Susana (2003): Phi-Syntax: A Theory of Agreement. PhD thesis, University of Toronto.
Béjar, Susana & Milan Řezáč (2009): ‘Cyclic Agree’, Linguistic Inquiry 40, 35–73.
Bossong, Georg (1985): Differenzielle Objektmarkierung in den neuiranischen Sprachen. Narr, Tübingen.
Burzio, Luigi (1986): Italian Syntax. Kluwer, Dordrecht.
Chomsky, Noam (1965): Aspects of the Theory of Syntax. MIT Press, Cambridge, Mass.
Chomsky, Noam (1973): Conditions on Transformations. In: S. Anderson & P. Kiparsky, eds., A Festschrift for Morris Halle. Academic Press, New York, pp. 232–286.
Chomsky, Noam (1995): The Minimalist Program. MIT Press, Cambridge, Mass.
Chomsky, Noam (2000): Minimalist Inquiries: The Framework. In: R. Martin, D. Michaels & J. Uriagereka, eds., Step by Step. MIT Press, Cambridge, Mass., pp. 89–155.
Chomsky, Noam (2001): Derivation by Phase. In: M. Kenstowicz, ed., Ken Hale. A Life in Language. MIT Press, Cambridge, Mass., pp. 1–52.
Comrie, Bernard (1979): ‘Definite and Animate Direct Objects: A Natural Class’, Linguistica silesiana 3, 13–21.


Das Gupta, Kamalesh (1971): An Introduction to the Nocte Language. North-East Frontier Agency, Shillong.
De Hoop, Helen & Andrej L. Malchukov (2008): ‘Case-Marking Strategies’, Linguistic Inquiry 39, 565–587.
Drellishak, Scott (2008): Complex Case Phenomena in the Grammar Matrix. In: S. Müller, ed., The Proceedings of the 15th International Conference on Head-Driven Phrase Structure Grammar. CSLI Publications, Stanford, pp. 67–86. Available at http://cslipublications.stanford.edu/HPSG/9/.
Feldman, Harry (1986): A Grammar of Awtuw. Australian National University, Canberra.
Georgi, Doreen (2009): Local Modelling of Global Case Splits. Master’s thesis, University of Leipzig.
Hale, Ken (1972): A New Perspective on American Indian Linguistics. In: A. Ortiz, ed., New Perspectives on the Pueblos. University of New Mexico Press, Albuquerque, pp. 87–103.
Halle, Morris & Alec Marantz (1993): Distributed Morphology and the Pieces of Inflection. In: K. Hale & S. J. Keyser, eds., The View from Building 20. MIT Press, Cambridge, Mass., pp. 111–176.
Halle, Morris & Alec Marantz (1994): Some Key Features of Distributed Morphology. In: A. Carnie, H. Harley & T. Bures, eds., Papers on Phonology and Morphology. Vol. 21 of MIT Working Papers in Linguistics, MITWPL, Cambridge, Mass., pp. 275–288.
Harbour, Daniel (2008): A Feature Calculus for Silverstein Hierarchies. Handout of a talk given at the Workshop on Scales, University of Leipzig, March 2008.
Harley, Heidi & Elisabeth Ritter (2002): ‘Person and Number in Pronouns: A Feature-Geometric Analysis’, Language 78, 482–526.
Harley, Heidi & Rolf Noyer (1999): ‘Distributed Morphology’, GLOT International 4/4, 3–9.
Heck, Fabian & Gereon Müller (2003): ‘Derivational Optimization of Wh-Movement’, Linguistic Analysis 33, 97–148.
Heck, Fabian & Gereon Müller (2007): Extremely Local Optimization. In: E. Brainbridge & B. Agbayani, eds., Proceedings of WECOL 26. California State University, Fresno, pp. 170–183.
Heck, Fabian & Marc Richards (2010): ‘A Probe-Goal Approach to Agreement and Non-Incorporation Restrictions in Southern Tiwa’, Natural Language and Linguistic Theory 28, 681–721.
Keine, Stefan (2010): Case and Agreement from Fringe to Core. A Minimalist Approach. Linguistische Arbeiten 536, de Gruyter, Berlin.
Keine, Stefan & Gereon Müller (2010): Non-zero/Non-zero Alternations in Differential Object Marking. In: P. Brandt & M. García García, eds., Transitivity. Form, Meaning, Acquisition, and Processing. John Benjamins, Amsterdam/Philadelphia, pp. 119–140. Linguistik Aktuell/Linguistics Today 166.
Kroskrity, Paul V. (1978): ‘Aspects of Syntactic and Semantic Variation Within the Arizona Tewa Speech Community’, Anthropological Linguistics 20, 235–257.
Kroskrity, Paul V. (1985): ‘A Holistic Understanding of Arizona Tewa Passives’, Language 61, 306–328.
Lazard, Gilbert (1984): Actance Variations and Categories of the Object. In: F. Plank, ed., Objects: Towards a Theory of Grammatical Relations. Academic Press, London, pp. 249–289.
Legate, Julie Anne (2008): ‘Morphological and Abstract Case’, Linguistic Inquiry 39, 55–101.
MacDonald, Lorna (1990): A Grammar of Tauya. Mouton de Gruyter, Berlin.
Maslova, Elena (2003): A Grammar of Kolyma Yukaghir. Mouton Grammar Library, de Gruyter, Berlin, New York.
Müller, Gereon (2000): Elemente der optimalitätstheoretischen Syntax. Stauffenburg, Tübingen.
Nichols, Johanna (1986): ‘Head-Marking and Dependent-Marking Grammar’, Language 62, 56–119.
Noyer, Rolf (1998): Impoverishment Theory and Morphosyntactic Markedness. In: S. Lapointe, D. Brentari & P. Farrell, eds., Morphology and its Relation to Phonology and Syntax. CSLI, Palo Alto, pp. 264–285.


Prince, Alan & Paul Smolensky (1993): Optimality Theory. Constraint Interaction in Generative Grammar. Book ms., Rutgers University.
Řezáč, Milan (2008): ‘The Syntax of Eccentric Agreement: The Person Case Constraint and Absolutive Displacement in Basque’, Natural Language and Linguistic Theory 26, 61–106.
Richards, Marc (2008): Quirky Expletives. In: R. d’Alessandro, G. H. Hrafnbjargarson & S. Fischer, eds., Agreement Restrictions. Mouton de Gruyter, Berlin, pp. 181–213.
Rigsby, Bruce & Noel Rude (1996): Sketch of Sahaptin, a Sahaptin language. In: I. Goddard, ed., Handbook of North American Indians: Languages. Smithsonian Institution, Washington, pp. 666–692.
Robins, Robert H. (1958): The Yurok Language: Grammar, Texts, Lexicon. University of California Press, Berkeley.
Scott, Graham (1978): The Fore Language of Papua New Guinea. Pacific Linguistics (PL-B47), Canberra.
Silverstein, Michael (1976): Hierarchy of Features and Ergativity. In: R. Dixon, ed., Grammatical Categories in Australian Languages. Australian Institute of Aboriginal Studies, Canberra, pp. 112–171.
Sternefeld, Wolfgang (2006): Syntax. Stauffenburg, Tübingen. Two volumes.
Wali, Kashi & Omkar N. Koul (1997): Kashmiri. A Cognitive-Descriptive Grammar. Routledge, London.
Zúñiga, Fernando (2006): Deixis and Alignment. Inverse Systems in Indigenous Languages of the Americas. Vol. 70 of Typological Studies in Language, John Benjamins, Amsterdam.

Institut für Linguistik, Universität Leipzig

Hans-Martin Gärtner

Function Composition and the Linear Local Modeling of Extended NEG-Scope*

Abstract

This paper is devoted to exploring the local modeling of non-local dependencies from the perspective of Categorial Grammar (CG). Work by Błaszczak and Gärtner (2005) on a CG-based analysis of extended NEG-scope in English is revisited. The crucial role played by linear adjacency and function composition in this analysis is highlighted. Particular attention is paid to showing how empirically inadequate alternative derivations are blocked and thus to demonstrating that the system does not overgenerate. Also, a challenge by Wagner (2005) concerning the prosodic adequacy of the analysis is addressed.

1. Introduction

This paper will be devoted to exploring the local modeling of non-local dependencies from the perspective of Categorial Grammar (henceforth CG). In particular, I will recapitulate and review work by Błaszczak and Gärtner (2005) on a CG-based analysis of extended NEG-scope in English. It will be shown how constraints on linear order play a crucial role in controlling the main CG-tool for scope extension, i.e., function composition. These constraints are local in the sense that they restrict each step of composition as it applies to linearly adjacent constituents. The analysis is couched in the particular framework of Combinatory Categorial Grammar (CCG) as developed by Steedman (1996; 2000a), enriched with the tool of “structural inhibition” introduced in the type-logical branch of CG by Morrill (1994) (cf. Steedman and Baldridge (2009)).

I will proceed as follows: Section 2 introduces the “Condition on Extended Scope Taking” (CEST), which descriptively captures crucial aspects of the scopal behavior of negative quantifiers in English. Next, CEST will be given a CG-based treatment (section 3). Section 4 will unfold some of the mechanics of CG that were taken for granted by Błaszczak and Gärtner (2005). In particular, further constraints on how to block empirically inadequate alternative derivations are discussed. Also, a challenge to CEST by Wagner (2005) will be addressed. Section 5 provides some discussion embedding the current analysis within its wider context. Finally, section 6 draws some very brief conclusions.

* For discussion of various aspects related to this paper I thank Andreas Haida, Greg Kobele, and Michael White, the audience at DGfS-Bamberg, as well as an anonymous reviewer. Common disclaimers apply. This work was supported by Bundesministerium für Bildung und Forschung (BMBF) (Grant No. 01UG0711).

Local Modelling of Non-Local Dependencies in Syntax, 337-352. Artemis Alexiadou, Tibor Kiss & Gereon Müller (eds.). Linguistische Arbeiten 547, de Gruyter 2012.

2. The Condition on Extended Scope Taking (CEST)

Inspired by and in reaction to Kayne (1998), Błaszczak and Gärtner (2005) study aspects of the scopal behavior of negative quantifiers in English. The main controversy addressed there concerns the status of the generative “Y-model,” which makes interaction between PF- and LF-properties of the grammar indirect. Particularly contentious is the proper place in grammar of linearization. Following Chomsky (1995, 334), who found “[...] no clear evidence that order plays a role at LF or in the computation from N to LF,” minimalist theorists have made linearization the exclusive business of PF.1 By contrast, Błaszczak and Gärtner (2005, 1) “suggest that extending the scope of negative quantifiers in English and German is sensitive to linear and prosodic, i.e., PF-properties.” More specifically, they formulate the following “Condition on Extended Scope Taking” (CEST) (ibid., 9).

(1) Condition on Extended Scope Taking (CEST)
    Extending the scope of a negative quantifier Q¬ over a region σ requires σ to be linearly and prosodically continuous.

Evidence for linear continuity is provided by the contrast between the sentences in (2) vs. (3).

(2) a. She requested that we read not a single linguistics book
    b. They have forced us to turn down no one

(3) a. She requested that not a single student read Aspects
    b. They have forced us to turn no one down

The negative quantifiers Q¬ in (2), not a single linguistics book and no one, can take scope over the matrix predicate. This is impossible in (3). As a consequence, (2-a) can be used to report on an absence of requests for the reading of linguistics books, while use of (3-a) would limit the speaker to reporting on a request for students to refrain from reading a particular linguistics book, namely, Aspects. In terms of CEST, this contrast can be illustrated as follows.

(4) a. (σ She requested that we read ) not a single linguistics book
    b. (σ They have forced us to turn down ) no one

(5) a. (σ She requested that ) not a single student (σ read Aspects )
    b. (σ They have forced us to turn ) no one (σ down )

1 “N” refers to the “numeration,” i.e., the lexical repository serving as input (and “reference set”) for minimalist derivations (Chomsky (1995, 225)). An early plea for linearization-free “tectogrammatics” was made by Curry (1961). Dowty (1982; 1995) pursues this idea further in a Montagovian setting. Within GPSG, linearization was given greater autonomy through the ID/LP-format (Gazdar et al. (1985)) and HPSG has seen further radicalizations in terms of “word order domains” (Reape (1994)). For recent intensive discussion one can also consult work by Fox and Pesetsky (2005) and Müller (2007).

(4) shows continuous and (5) discontinuous σ-regions. This corresponds to the possibility vs. impossibility of wide scope for Q¬ in (2) vs. (3), respectively. The prosodic evidence for CEST is based on examples like (6).

(6) a. She requested ‖ that we read not a single linguistics book
    b. She requested that the students who finish first | read not a single linguistics book

Inserting prosodic boundaries of type intonational phrase (‖) or intermediate phrase (|) – the latter induced by an additional relative clause – into (2-a) eliminates the wide scope option for Q¬. (7) illustrates the corresponding discontinuities predicted by CEST.

(7) a. (σ She requested ) ‖ (σ that we read ) not a single linguistics book
    b. (σ She requested that the students who finish first ) | (σ read ) not a single linguistics book
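The continuity requirement of CEST can be pictured as a simple check over a tokenized sentence. The following is only a minimal sketch under stated assumptions: the function names, the encoding of prosodic boundaries as ordinary tokens (‖ and |), and the representation of Q¬ as an index span are all illustrative choices and not part of the original analysis.

```python
from typing import List, Tuple

# Assumed encoding: prosodic boundaries are ordinary tokens in the word list.
BOUNDARIES = {"‖", "|"}

def sigma_regions(tokens: List[str], q_span: Tuple[int, int]) -> List[List[str]]:
    """Return the maximal continuous chunks of material outside the quantifier.

    q_span is the half-open index range [start, end) occupied by the
    negative quantifier. A chunk ends wherever the quantifier or a
    prosodic boundary intervenes.
    """
    regions, current = [], []
    for i, tok in enumerate(tokens):
        interrupts = q_span[0] <= i < q_span[1] or tok in BOUNDARIES
        if interrupts:
            if current:
                regions.append(current)
                current = []
        else:
            current.append(tok)
    if current:
        regions.append(current)
    return regions

def cest_allows_extended_scope(tokens: List[str], q_span: Tuple[int, int]) -> bool:
    """CEST as a check: the sigma-region must come out as a single chunk."""
    return len(sigma_regions(tokens, q_span)) <= 1

ex_2a = "She requested that we read not a single linguistics book".split()
ex_3a = "She requested that not a single student read Aspects".split()
ex_6a = "She requested ‖ that we read not a single linguistics book".split()

print(cest_allows_extended_scope(ex_2a, (5, 10)))   # continuous region
print(cest_allows_extended_scope(ex_3a, (3, 7)))    # region split by the quantifier
print(cest_allows_extended_scope(ex_6a, (6, 11)))   # region split by a boundary
```

The check only models extended scope taking. It says nothing about cases where wide scope arises by other, more local means.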

As a brief aside, let me stress that CEST is explicitly designed to capture extended scope, not “standard” scope. This subtlety can be brought out by considering ECM structures such as (8).

(8) We expect not a single student to have read Aspects

(8) can be used to report on the absence of expectations regarding the reading of Aspects by linguistics students. This means that wide scope is an option for Q¬ in (8) in spite of the fact that the σ-region (σ We expect ) (σ to have read Aspects ) is discontinuous. To reconcile (8) with CEST one can adopt a “raising-to-object” approach to ECM, according to which not a single student is part of the matrix clause at surface structure already. Consequently, wide scope in (8) would be a “standard” or “local” effect, not an instance of non-local extended scope taking (cf. Błaszczak and Gärtner (2005, 10)).

3. CEST and Categorial Grammar: the basics

What makes categorial grammar attractive for capturing CEST is the fact that linear order is a key ingredient of all standard CG-frameworks, given the way categories are defined.2 In addition, among the main principles of CCG is a condition on (linear) adjacency as formulated in (9) (Steedman (2000a, 54)).

(9) The Principle of Adjacency
    Combinatory rules may only apply to finitely many phonologically realized and string-adjacent entities.

This means that non-adjacency, i.e., discontinuity, will crucially affect syntactic derivations.3 In particular, extended scope taking can be treated by “assembling” σ-regions via function composition. Given (9), function composition will be disrupted where σ-regions are discontinuous. This simple fact leads to an extremely elegant analysis of CEST, the details of which we turn to next. To begin with, we have to introduce the rule of forward function composition (Steedman (2000a, 40)).

(10) Forward Composition (C>)
     X/Y  Y/Z  ⇒C  X/Z          C f g ≡ λv.f(g(v))

The workings of (10) are best explained by going right into the analysis of wide scope in (2-a).

(11)  1. she: S/(S\NP)
      2. requested: (S\NP)/S
      3. that: S/S
      4. they: S/(S\NP)
      5. read: (S\NP)/NP
      6. not.a.single.lb: S\(S/NP)
      7. she+requested: S/S                                  | C> 1,2
      8. she+requested+that: S/S                             | C> 7,3
      9. she+requested+that+they: S/(S\NP)                   | C> 8,4
     10. she+requested+that+they+read: S/NP                  | C> 9,5
     11. she+requested+that+they+read+not.a.single.lb: S     | E\ 10,6
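Derivation (11) can be replayed mechanically. The sketch below is an illustration only: the data structure `Fn`, the helper names, and the encoding of categories as nested tuples are assumptions of this example, not notation from the analysis. It implements just the two rules that (11) uses, forward composition and leftward application.

```python
from typing import NamedTuple, Union

class Fn(NamedTuple):
    """A functor category: result, slash direction, argument."""
    result: "Cat"
    slash: str   # "/" seeks the argument to the right, "\\" to the left
    arg: "Cat"

Cat = Union[str, Fn]

def show(c: Cat) -> str:
    """Print a category in the usual slash notation."""
    if isinstance(c, str):
        return c
    wrap = lambda x: x if isinstance(x, str) else "(" + show(x) + ")"
    return wrap(c.result) + c.slash + wrap(c.arg)

def forward_compose(left: Cat, right: Cat):
    """Forward composition: X/Y  Y/Z  =>  X/Z. Returns None if inapplicable."""
    if (isinstance(left, Fn) and isinstance(right, Fn)
            and left.slash == "/" and right.slash == "/"
            and left.arg == right.result):
        return Fn(left.result, "/", right.arg)
    return None

def backward_apply(left: Cat, right: Cat):
    """Leftward application: Y  X\\Y  =>  X. Returns None if inapplicable."""
    if isinstance(right, Fn) and right.slash == "\\" and right.arg == left:
        return right.result
    return None

VP = Fn("S", "\\", "NP")                       # S\NP
she = Fn("S", "/", VP)                         # S/(S\NP)
requested = Fn(VP, "/", "S")                   # (S\NP)/S
that = Fn("S", "/", "S")                       # S/S
they = Fn("S", "/", VP)                        # S/(S\NP)
read = Fn(VP, "/", "NP")                       # (S\NP)/NP
quantifier = Fn("S", "\\", Fn("S", "/", "NP")) # S\(S/NP)

# Steps (11.7)-(11.10): compose left to right, one adjacent word at a time.
derived, steps = she, []
for word in (requested, that, they, read):
    derived = forward_compose(derived, word)
    steps.append(derived)
    print(show(derived))

# Step (11.11): the quantifier takes the assembled region as its argument.
print(backward_apply(derived, quantifier))
```

Because each rule returns None when the categories do not match, a failed step is visible immediately, which is convenient when exploring alternative derivations.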

Steps (11.7) to (11.10) are all applications of forward composition. Each time, the output category of the second constituent – written to the left of its main dividing line – is identical to the input category – written to the right of the main dividing line – of the first constituent. Forward composition is applied due to the rightward orientation (/) of the first constituents. Application of the rule leads to cancelation of the shared identical “output-input-categories” plus preservation of the remaining categorial structure. The final step, (11.11), consists of a leftward function application (or “elimination”) rule, (E\). Here, the negative object quantifier not a single linguistics book, which has a leftward orientation (\), takes the remainder of the sentence as its argument. As we are going to see presently, this results in the desired wide scope interpretation. Let us therefore turn to another attractive feature of CG, its transparent way of relating syntax to semantics. Each syntactic rule is accompanied by a structurally corresponding semantic rule.4 In the case of composition, this rule, as indicated in (10), amounts to feeding an appropriate variable v to the lower functor g, making the result g(v) the argument of the higher functor f, and finally λ-abstracting over v from the outside. The resulting functor will then require arguments of the type originally required by the lower functor g and yield something of the type that would have been yielded by the higher functor f. (12)-(15) show this process for (11.7)-(11.10), respectively.5

(12) a. she+requested: S/S
     b. → λP.P(she) ◦ λp.λx.REQUESTED(x, p)
        = λq[ λP.P(she)(λp.λx.REQUESTED(x, p)(q)) ]
        = λq[ λP.P(she)(λx.REQUESTED(x, q)) ]
        = λq[ λx.REQUESTED(x, q)(she) ]
        = λq.REQUESTED(she, q)

(13) a. she+requested+that: S/S
     b. → λq.REQUESTED(she, q) ◦ λp.p
        = λr[ λq.REQUESTED(she, q)(λp.p(r)) ]
        = λr[ λq.REQUESTED(she, q)(r) ]
        = λr.REQUESTED(she, r)

(14) a. she+requested+that+they: S/(S\NP)
     b. → λr.REQUESTED(she, r) ◦ λP.P(they)
        = λQ[ λr.REQUESTED(she, r)(λP.P(they)(Q)) ]
        = λQ[ λr.REQUESTED(she, r)(Q(they)) ]
        = λQ.REQUESTED(she, Q(they))

(15) a. she+requested+that+they+read: S/NP
     b. → λQ.REQUESTED(she, Q(they)) ◦ λy.λx.READ(x, y)
        = λz[ λQ.REQUESTED(she, Q(they))(λy.λx.READ(x, y)(z)) ]
        = λz[ λQ.REQUESTED(she, Q(they))(λx.READ(x, z)) ]
        = λz[ REQUESTED(she, λx.READ(x, z)(they)) ]
        = λz.REQUESTED(she, READ(they, z))

2 Muskens (2007) develops a categorial approach that separates linearization from hierarchical organization. Kracht (2003: chapter 5) provides additional pertinent discussion.

3 As it stands, the Principle of Adjacency is a methodological assumption in the same way that assuming Merge to be binary is a tenet of minimalism. Steedman (1996, 6; 2000a, 268, fn.3) points out that this principle is a counterpart to a ban on the use of variables in structural descriptions of transformation rules in TG. That the use of such variables has to be constrained has – famously – been argued by Ross (1967).

4 The correspondence is due to the so-called “Curry-Howard Isomorphism”, for which I refer readers to Morrill (1994: chapter 2). For the same purpose, Steedman (2000a, 37) formulates the following “transparency” principle:
  (i) The Principle of Type Transparency
      All syntactic combinatory rules are type-transparent versions of one of a small number of simple semantic operations over functions.

5 → denotes the translation relation, ◦ stands for semantic composition. For the sake of brevity, the semantics contains various obvious simplifications.

Finally, (16) shows the crucial step of semantic application corresponding to (11.11). This amounts to wide scope taking for Q¬.

(16) a. she+requested+that+they+read+not.a.single.lb: S
     b. → λP.¬∃x.[ LB(x) ∧ P(x) ](λz.REQUESTED(she, READ(they, z)))
        = ¬∃x.[ LB(x) ∧ λz.REQUESTED(she, READ(they, z))(x) ]
        = ¬∃x.[ LB(x) ∧ REQUESTED(she, READ(they, x)) ]
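The reductions in (12)-(16) can be traced with ordinary Python closures. This is a rough sketch under simplifying assumptions: formulas are encoded as nested tuples, the bound variable is always the fixed name "x", and all identifiers below are invented for the illustration.

```python
def compose(f, g):
    """Semantic counterpart of composition: lambda v. f(g(v))."""
    return lambda v: f(g(v))

# Lexical meanings, following the translations used in (12)-(16).
she = lambda P: P("she")                              # λP.P(she)
requested = lambda p: lambda x: ("REQUESTED", x, p)   # λp.λx.REQUESTED(x, p)
that = lambda p: p                                    # λp.p
they = lambda P: P("they")                            # λP.P(they)
read = lambda y: lambda x: ("READ", x, y)             # λy.λx.READ(x, y)

def not_a_single_lb(P):
    """λP.¬∃x.[LB(x) ∧ P(x)], with 'x' as a fixed variable name (a simplification)."""
    return ("NOT", ("EXISTS", "x", ("AND", ("LB", "x"), P("x"))))

# (12)-(15): assemble the region she+requested+that+they+read by composition.
sigma = she
for meaning in (requested, that, they, read):
    sigma = compose(sigma, meaning)

# (16): the quantifier applies to the assembled region, so negation ends up outermost.
wide = not_a_single_lb(sigma)
print(wide)
```

Feeding a placeholder such as `sigma("z")` shows the open formula of (15) before the quantifier is applied.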

Section 4.1 will give a detailed account of why wide scope is not an option for Q¬ in (3-a). Before turning to that, let me address the effect of prosodic discontinuity illustrated in (6). Here, the analysis proposed by Błaszczak and Gärtner (2005) makes appeal to the technique of “structural inhibition” (Morrill (1994)). The key assumption is that prosodic boundaries possess the following kind of category.6

(17) %: []ϕX/X

This has the effect that a prosodic boundary applies to a neighboring constituent of category X and converts it into a “structurally inhibited” constituent of category []ϕX. Two additional assumptions are then necessary. First, []ϕX shows up nowhere else in the grammar. Accordingly, no constituent can directly combine with anything of type []ϕX. Second, there is a (semantically vacuous) one-place elimination rule for []ϕX. This is given in (18).

(18) []ϕ-Elimination (E[]ϕ)
     []ϕX  ⇒E  X

Let us study the workings of (17) and (18) by having a look at the analysis of (6-a).

(19)  1. she: S/(S\NP)
      2. requested: (S\NP)/S
      3. that: S/S
      4. they: S/(S\NP)
      5. read: (S\NP)/NP
      6. not.a.single.lb: S\(S/NP)
      7. %: []ϕS/S
      8. they+read: S/NP                                        | C> 4,5
      9. they+read+not.a.single.lb: S                           | E\ 8,6
     10. that+they+read+not.a.single.lb: S                      | E/ 3,9
     11. %+that+they+read+not.a.single.lb: []ϕS                 | E/ 7,10
     12. %+that+they+read+not.a.single.lb: S                    | E[]ϕ 11
     13. requested+%+that+they+read+not.a.single.lb: S\NP       | E/ 2,12
     14. she+requested+%+that+they+read+not.a.single.lb: S      | E/ 1,13

6 Steedman (2000b) provides a different, more comprehensive, treatment of prosodic boundaries in CCG.

Crucially, given (18), []ϕ can only be eliminated if it shows up on the main functor, S in the case of (19). It follows that the complement S has to be fully assembled before it can combine with material of the matrix clause in step (19.13). As a consequence, Q¬ has to be built in before the matrix predicate, and this necessarily results in narrow scope for Q¬. (20) gives the semantics for building in Q¬ at step (19.9).

(20) a. they+read+not.a.single.lb: S
     b. → λP.¬∃x.[ LB(x) ∧ P(x) ](λy.READ(they, y))
        = ¬∃x.[ LB(x) ∧ λy.READ(they, y)(x) ]
        = ¬∃x.[ LB(x) ∧ READ(they, x) ]

Let me repeat that function composition from the matrix clause into the subordinate clause is blocked in the presence of %: []ϕS/S. Direct composition is ruled out due to categorial mismatch, []ϕS ≠ S, and composition of (she+)requested with that, skipping %, would violate adjacency, i.e., condition (9). As a caveat it has to be added that the relevant prosodic boundaries have to be of a low enough type such as sentential or VP-boundaries.7 Higher types, combinable with individual lexical items directly, would be eliminable without the desired blocking effect. This is shown for the relevant subpart of (6-a) in (21).

(21)  1. that: S/S
      2. %: []ϕ(S/S)/(S/S)
      3. %+that: []ϕ(S/S)     | E/ 2,1
      4. %+that: S/S          | E[]ϕ 3

If this were allowed, deriving a wide scope reading of Q¬ in (6-a) would be possible since (21) could serve as a building block for a derivation similar to (11).

7 For the naturalness of this assumption, see discussion by Ladd (1996).
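The blocking effect of structural inhibition, and the caveat about boundary types, can be checked with a small extension of the tuple encoding of categories. As before, this is a sketch under assumptions: `Inh` stands for the inhibited category []ϕX, and all class and function names are hypothetical.

```python
from typing import NamedTuple, Union

class Fn(NamedTuple):
    """A functor category: result, slash direction, argument."""
    result: "Cat"
    slash: str
    arg: "Cat"

class Inh(NamedTuple):
    """A structurally inhibited category, standing for []ϕ X."""
    body: "Cat"

Cat = Union[str, Fn, Inh]

def forward_apply(left, right):
    # Rightward application: X/Y  Y  =>  X
    if isinstance(left, Fn) and left.slash == "/" and left.arg == right:
        return left.result
    return None

def forward_compose(left, right):
    # Forward composition: X/Y  Y/Z  =>  X/Z
    if (isinstance(left, Fn) and isinstance(right, Fn)
            and left.slash == "/" and right.slash == "/"
            and left.arg == right.result):
        return Fn(left.result, "/", right.arg)
    return None

def eliminate(cat):
    # One-place elimination: []ϕ X  =>  X
    return cat.body if isinstance(cat, Inh) else None

def boundary(x):
    # Category of a prosodic boundary over X: []ϕ X / X
    return Fn(Inh(x), "/", x)

VP = Fn("S", "\\", "NP")
she = Fn("S", "/", VP)
requested = Fn(VP, "/", "S")
that = Fn("S", "/", "S")
she_requested = forward_compose(she, requested)            # S/S

# Sentential boundary: composing into the complement fails ([]ϕS is not S).
blocked = forward_compose(she_requested, boundary("S"))

# Instead the complement S is completed first, then marked and unmarked.
marked = forward_apply(boundary("S"), "S")                  # []ϕS
clause = eliminate(marked)                                  # S
matrix = forward_apply(she, forward_apply(requested, clause))

# Boundary typed over S/S as in (21): inhibition is removed too early,
# and composition from the matrix clause becomes possible again.
leaky = eliminate(forward_apply(boundary(that), that))      # S/S
reopened = forward_compose(she_requested, leaky)

print(blocked, matrix, reopened)
```

The contrast between `blocked` and `reopened` mirrors the point made about (21): only boundaries of a sufficiently low type enforce prior assembly of the complement clause.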


4. CEST and Categorial Grammar: some refinements

Let me turn to two areas in which refinements of the analysis by Błaszczak and Gärtner (2005) are called for. First, a more in-depth analysis of how the system prevents wide scope for Q¬ in (3) can be given. Secondly, a challenge to CEST discovered by Wagner (2005) should be addressed and disposed of.

4.1. Linear discontinuity and narrow NEG-scope

It is sometimes objected to CG-approaches that they are overly powerful and unrestricted.8 To counter such an impression, this section will provide a (somewhat pedantic and therefore potentially tedious) account of why wide scope is not an option for Q¬ in (3-a), repeated below for convenience.

(3) a. She requested that not a single student read Aspects

I will look at five potential alternative derivations and show why they don’t lead to undesirable consequences. The gist of this manoeuver is given in (22).

(22) a. Rightward function application: fixes local scope
     b. Forward function composition: fixes local scope
     c. Function composition of discontinuous σ-regions: disturbs word order
     d. Backward (crossed) function composition: is not possible
     e. Type raising over matrix: must be ruled out

To begin with, let’s look at the effect of introducing Q¬ via rightward function application. This requires an argument of type S\NP. (23)-(25) show the crucial syntactic and semantic steps.

(23)  1. she+requested+that: S/S
      2. not.a.single.student: S/(S\NP)
      3. read+Aspects: S\NP
      4. not.a.single.student+read+Aspects: S                         | E/ 2,3
      5. she+requested+that+not.a.single.student+read+Aspects: S      | E/ 1,4

(24) a. not.a.single.student+read+Aspects: S
     b. → λP.¬∃x.[ STUDENT(x) ∧ P(x) ](λy.READ(y, a))
        = ¬∃x.[ STUDENT(x) ∧ READ(x, a) ]

(25) a. she+requested+that+not.a.single.student+read+Aspects: S
     b. → λp.REQUESTED(she, p)(¬∃x.[ STUDENT(x) ∧ READ(x, a) ])
        = REQUESTED(she, ¬∃x.[ STUDENT(x) ∧ READ(x, a) ])

8 For much more specific and highly relevant criticism of CG-style linguistics, see von Stechow (1989). It should be noted that variants of CCG have been shown to belong to the “mildly context-sensitive” grammar formalisms (cf. Joshi, Vijay-Shanker and Weir (1991)), so premature discarding of this kind of approach to grammar would seem to be rather unmotivated. A similar complexity result for minimalist grammars is now available due to work by Michaelis (2001a; 2001b).

Clearly, the scope of Q¬ is fixed within the subordinate clause in (24). The resulting formula serves as an unalterable building block for embedding within the matrix clause. Derivation (22-b) has similar consequences. Introduction of Q¬ by forward function composition with the matrix clause assigns Q¬ the role of “lower” argument functor in the semantics. Again, narrow scope is an irrevocable consequence. The crucial steps are provided in (26)-(28).

(26)  1. she+requested+that: S/S
      2. not.a.single.st: S/(S\NP)
      3. read+Aspects: S\NP
      4. she+requested+that+not.a.single.st: S/(S\NP)                 | C> 1,2
      5. she+requested+that+not.a.single.st+read+Aspects: S           | E/ 4,3

(27) a. she+requested+that+not.a.single.st: S/(S\NP)
     b. → λp.REQ(she, p) ◦ λP.¬∃x.[ ST(x) ∧ P(x) ]
        = λQ[ λp.REQ(she, p)(λP.¬∃x.[ ST(x) ∧ P(x) ](Q)) ]
        = λQ[ λp.REQ(she, p)(¬∃x.[ ST(x) ∧ Q(x) ]) ]
        = λQ.REQ(she, ¬∃x.[ ST(x) ∧ Q(x) ])

(28) a. she+requested+that+not.a.single.st+read+Aspects: S
     b. → λQ.REQ(she, ¬∃x.[ ST(x) ∧ Q(x) ])(λy.READ(y, a))
        = REQ(she, ¬∃x.[ ST(x) ∧ λy.READ(y, a)(x) ])
        = REQ(she, ¬∃x.[ ST(x) ∧ READ(x, a) ])
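That the application route in (23)-(25) and the composition route in (26)-(28) end in the same narrow scope formula can be verified directly. The sketch below again encodes formulas as nested tuples with a fixed variable name "x"; these choices, and all identifiers, are assumptions made for illustration.

```python
def compose(f, g):
    """Semantic counterpart of composition: lambda v. f(g(v))."""
    return lambda v: f(g(v))

she_requested_that = lambda p: ("REQ", "she", p)      # λp.REQ(she, p)
read_aspects = lambda y: ("READ", y, "a")             # λy.READ(y, a)

def not_a_single_st(P):
    """λP.¬∃x.[ST(x) ∧ P(x)], with 'x' as a fixed variable name (a simplification)."""
    return ("NOT", ("EXISTS", "x", ("AND", ("ST", "x"), P("x"))))

# Route (22-a): the quantifier first applies to read+Aspects, as in (24)-(25).
by_application = she_requested_that(not_a_single_st(read_aspects))

# Route (22-b): the quantifier is composed with the matrix material, as in (27)-(28).
by_composition = compose(she_requested_that, not_a_single_st)(read_aspects)

print(by_application == by_composition)
print(by_application[0])   # outermost operator of the resulting formula
```

In both routes the matrix predicate is the outermost operator, so the quantifier is confined to the embedded clause.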

Quite obviously, forward composition of Q¬ with smaller constituents on its left doesn’t change anything. Thus, composing that+not.a.single.student would receive the semantics in (29), continuations of which will lead to narrow scope again.

(29) a. that+not.a.single.st: S/(S\NP)
     b. → λp.p ◦ λP.¬∃x.[ ST(x) ∧ P(x) ]
        = λQ[ λp.p(λP.¬∃x.[ ST(x) ∧ P(x) ](Q)) ]
        = λQ[ λp.p(¬∃x.[ ST(x) ∧ Q(x) ]) ]
        = λQ.¬∃x.[ ST(x) ∧ Q(x) ]

Method (22-c) starts from the insight that successful wide scope taking for Q¬ in (2-a) depends on assembly of the entire σ-region before Q¬ is introduced. This was shown in (11)-(16). So, why don’t we start by building she+requested+that+read+Aspects? The semantics for the required composition step is given in (30).

(30) a. she+requested+that+read+Aspects: S\NP
     b. → λp.REQ(she, p) ◦ λx.READ(x, a)
        = λy[ λp.REQ(she, p)(λx.READ(x, a)(y)) ]
        = λy[ REQ(she, READ(y, a)) ]

Indeed, if we now use the result as input argument for the generalized quantifier not.a.single.student, the undesirable wide scope reading arises, as shown in (31).

(31) λP.¬∃x.[ ST(x) ∧ P(x) ](λy[ REQ(she, READ(y, a)) ])
     = ¬∃x.[ ST(x) ∧ λy[ REQ(she, READ(y, a)) ](x) ]
     = ¬∃x.[ ST(x) ∧ REQ(she, READ(x, a)) ]

However, syntactic word order will revolt against this approach to (3-a). Consider the derivation in (32).

(32)  1. she+requested+that: S/S
      2. not.a.single.st: S/(S\NP)
      3. read+Aspects: S\NP
      4. she+requested+that+read+Aspects: S\NP                        | C> 1,3
      5. not.a.single.st+she+requested+that+read+Aspects: S           | E/ 2,4
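Derivation (32) can be replayed with word strings attached to categories, which makes visible where the quantifier ends up. The sketch rests on several assumptions: the `Item` pairing of strings and categories, the function names, and in particular the composition step, which combines a rightward functor with a leftward one and is therefore implemented here as a mixed-direction ("crossed") variant of composition.

```python
from typing import NamedTuple, Tuple, Union

class Fn(NamedTuple):
    """A functor category: result, slash direction, argument."""
    result: "Cat"
    slash: str
    arg: "Cat"

Cat = Union[str, Fn]

class Item(NamedTuple):
    """A derived constituent: its word string plus its category."""
    words: Tuple[str, ...]
    cat: Cat

def forward_crossed_compose(left: Item, right: Item):
    # Mixed-direction composition: X/Y  Y\Z  =>  X\Z, on string-adjacent items.
    l, r = left.cat, right.cat
    if (isinstance(l, Fn) and isinstance(r, Fn)
            and l.slash == "/" and r.slash == "\\" and l.arg == r.result):
        return Item(left.words + right.words, Fn(l.result, "\\", r.arg))
    return None

def forward_apply(left: Item, right: Item):
    # Rightward application: X/Y  Y  =>  X; the functor string precedes its argument.
    if isinstance(left.cat, Fn) and left.cat.slash == "/" and left.cat.arg == right.cat:
        return Item(left.words + right.words, left.cat.result)
    return None

VP = Fn("S", "\\", "NP")
she_requested_that = Item(("she", "requested", "that"), Fn("S", "/", "S"))
quantifier = Item(("not", "a", "single", "student"), Fn("S", "/", VP))
read_aspects = Item(("read", "Aspects"), VP)

# Step (32.4): assemble the discontinuous region as if it were adjacent.
region = forward_crossed_compose(she_requested_that, read_aspects)

# Step (32.5): the rightward-looking quantifier must precede its argument.
sentence = forward_apply(quantifier, region)

target = "she requested that not a single student read Aspects"
print(" ".join(sentence.words))
print(" ".join(sentence.words) == target)
```

The derivation converges on category S, but only for a string in which the quantifier is sentence-initial, not for the target string.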

Given the rightward orientation of Q¬, and given the constraint on adjacency, (9), the rightward application in step (32.5) yields a “topicalized” position for Q¬ corresponding to an overall structure that in addition violates the that-t condition, as shown in (33).

(33) *Not a single student, she requested that read Aspects

Clearly, method (22-c) is not suited for providing (3-a) with its proper linearization.9 So, yet another way of violating CEST is blocked, given the CG-approach. The previous point also serves as an indication that the technique of “wrapping,” used in various branches of CG for scope taking (Moortgat (1988); Morrill (1994)), must be avoided in the analysis of extended scope taking. Thus, consider the derivation in (34).

(34)  1. she+requested+that+⇓+read+Aspects: S↑NP
      2. not.a.single.st: S↓(S↑NP)
      3. she+requested+that+not.a.single.st+read+Aspects: S           | W 2,1

9 It should be stressed that the that-t-violation is not the issue here but a purely coincidental side effect. Nothing is changed – i.e., method (22-c) still puts Q¬ in the wrong position – if the that-less variant in (i) is chosen, which incidentally seems to be quite unacceptable too.
  (i) ??Not a single student she requested read Aspects
It is an independent question whether topicalization has an impact on scope, a question that I won’t pursue in this paper. Thanks to the anonymous reviewer for raising this issue.

⇓ serves as a placeholder for which a quantifier will be substituted by wrapping prefix and suffix of ⇓ around it. S↑NP is the category of a predicate which combines with its argument by wrapping and S↓(S↑NP) is the category of a generalized quantifier to be wrapped into a “wrapping” predicate. The semantics of the wrapping step in (34.3) would, of course, have to be identical to the one in (31). Clearly, discontinuity of the σ-region would no longer be an obstacle to extended scope taking for Q¬, an unwelcome result given CEST.

The careful reader will have observed a subtlety of derivation (32), discussion of which will serve as introduction to the more involved argument related to technique (22-d). Forward composition as introduced in (10) requires both functors to be of equal linear orientation, namely, rightward. This requirement was actually violated in (32.4), where categories S/S and S\NP were composed. The rule that does this properly is a minimal variation on (10) called “forward crossed composition,” as stated in (35) (cf. Steedman (2000a, 55)).

(35) Forward Crossed Composition (C×>)
     X/Y  Y\Z  ⇒C  X\Z          C f g ≡ λv.f(g(v))

As we have seen, C×> has a reordering effect such that an argument of the second constituent that should have been placed in between the two constituents in fact precedes them both. This brings us to method (22-d) of analyzing (3-a). It has to be shown that Backward Crossed Composition, as given in (36), cannot be used to derive the undesired wide scope reading for Q¬ either.

(36) Backward Crossed Composition (C×<)
     Y/Z  X\Y  ⇒C  X/Z          C f g ≡ λv.f(g(v))

Of course, the idea is to try to make Q¬ into the higher functor of composition while preserving its position in the middle of the σ-region. Semantically, the crucial composition step would have to be (37).

(37) a. requested+that+not.a.single.st
     b. → λP.¬∃x.[ ST(x) ∧ P(x) ] ◦ λp.λy.REQ(y, p)
        = λq[ λP.¬∃x.[ ST(x) ∧ P(x) ](λp.λy.REQ(y, p)(q)) ]
        = λq[ λP.¬∃x.[ ST(x) ∧ P(x) ](λy.REQ(y, q)) ]
        = λq.¬∃x.[ ST(x) ∧ λy.REQ(y, q)(x) ]
        = λq.¬∃x.[ ST(x) ∧ REQ(x, q) ]

But this does not work! Through backward composition, Q¬ will bind the variable of the matrix subject, so it ends up with wide scope but in the wrong argument position. Moreover, problems arise in the syntax already.

(38)  1. requested+that: (S\NP)/S
      2. not.a.single.st: S/(S\NP)

For C×< to apply, Q¬ should be leftward oriented, which it isn’t. Also, for simple backward composition to apply, both functors should be leftward oriented, which they aren’t. And even if (38.1) and (38.2) were allowed to combine by “funny” backward composition, problems would arise further down the line, since the resulting category should be S/S. But in order to create an argument of type S from the remaining materials, we would have to combine she with read Aspects, which would yield the preposterously garbled string in (39) and would suffer from the semantic problem illustrated in (37) in addition.

(39) *Requested that not a single student she read Aspects

We can therefore conclude that the only sensible and grammatical way of applying composition to matrix constituents and subject Q¬ is by standard forward composition, as we have seen. And this we have shown in (26)-(28) to lead to narrow scope. Thus, method (22-d) is equally unavailable for deriving a violation of CEST for (3-a).

Finally, let us consider type raising as another means of giving Q¬ in (3-a) scope over the matrix. (40) and (41)-(42) provide the required syntax and semantics.

(40)
1. not.a.single.st: (S/(S\NP))\(S/S)
2. she+requested+that: S/S
3. read+Aspects: S\NP
4. she+requested+that+not.a.single.st: S/(S\NP) | E\ 2,1
5. she+requested+that+not.a.single.st+read+Aspects: S | E/ 4,3

(41) a. she+requested+that+not.a.single.st: S/(S\NP)
b. → λ℘.λP.¬∃x.[ ST(x) ∧ ℘(P(x)) ](λp.REQ(she, p))
= λP.¬∃x.[ ST(x) ∧ λp.REQ(she, p)(P(x)) ]
= λP.¬∃x.[ ST(x) ∧ REQ(she, P(x)) ]

(42) a. she+requested+that+not.a.single.st+read+Aspects: S
b. → λP.¬∃x.[ ST(x) ∧ REQ(she, P(x)) ](λy.READ(y, a))
= ¬∃x.[ ST(x) ∧ REQ(she, λy.READ(y, a)(x)) ]
= ¬∃x.[ ST(x) ∧ REQ(she, READ(x, a)) ]
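The β-reductions in (37) and (41)-(42) can be replayed with ordinary closures. This is only an illustrative sketch: formulas are built as strings, the Python names are assumptions introduced here, and the bound variable is fixed as x, so no renaming of variables is attempted.

```python
def q_neg(P):
    """λP.¬∃x.[ST(x) ∧ P(x)]"""
    return f"¬∃x.[ST(x) ∧ {P('x')}]"


def requested_that(p):
    """λp.λy.REQ(y, p)"""
    return lambda y: f"REQ({y}, {p})"


def read_aspects(y):
    """λy.READ(y, a)"""
    return f"READ({y}, a)"


def composed_37(q):
    """(37): Q¬ composed with requested+that, following Cfg ≡ λv.f(g(v))."""
    return q_neg(requested_that(q))


def q_neg_raised(F):
    """(41): λ℘.λP.¬∃x.[ST(x) ∧ ℘(P(x))]"""
    return lambda P: f"¬∃x.[ST(x) ∧ {F(P('x'))}]"


def she_requested_that(p):
    """λp.REQ(she, p)"""
    return f"REQ(she, {p})"


print(composed_37("q"))
print(q_neg_raised(she_requested_that)(read_aspects))
```

The first line printed shows the quantifier binding the requester argument, as in (37); the second reproduces the final line of (42), where it binds the subject of read.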


However, in CCG – as in most other standard calculi for semantics – type raising is only allowed to raise arguments over the functions they are arguments of. (43) provides the appropriate rule (cf. Steedman (2000a, 44)).

(43) (Backward) Type Raising (T<)
X ⇒T T\(T/X)
Ta ≡ λf.f(a)

T is a variable standing for result types of functions over X. We have already used object generalized quantifiers as a raised type derivable via (43): Subject plus transitive verb constitute a predicate of type S/NP. Therefore, any object argument of type NP can be lifted to type S\(S/NP). Here T has been instantiated by S, the result type of function S/NP. Let us now reconsider (40). Clearly, the type assignment for Q¬ in (40.1) is in violation of T<, given that S/(S\NP) – or NP for that matter – is not an argument of S/S nor is it the result type of an appropriate function. Therefore, even method (22-e) won’t be available for deriving a violation of CEST for (3-a).

Although we have no formal proof that such a violation cannot be derived by some other means within the sketched version of CCG + structural inhibition, I hope to have shown that such a violation will be hard to derive by any obvious means. I conclude that the CG-based analysis of CEST provided by Błaszczak and Gärtner (2005) is not only elegant but also essentially formally sound.

4.2. An apparent violation of CEST

Wagner (2005) discusses CEST within a full-fledged theory of grammar-prosody interaction. This leads him to hypothesize that relative strength of prosodic boundaries should play a role overlooked by Błaszczak and Gärtner (2005). And indeed, in examples like (44), “wide scope is possible, at least if the prosodic boundaries in the italicized domain are not stronger than the boundary that sets off the negative quantifier” (Wagner (2005, 113)).

(44) She expected the students who failed the mid-term to take none of the advanced tutorials, did she?
What this shows is that the formulation of “prosodic continuity” as subpart of CEST needs some refinements. (45) can serve as a first step toward such a reformulation.

(45) A linearly continuous string σ is prosodically discontinuous iff there is a prosodic boundary ϕ such that
(i) ϕ breaks up σ into a non-vacuous prefix σ1 followed by ϕ followed by a non-vacuous suffix σ2 (σ = [ σ1 ϕ σ2 ]), and
(ii) ϕ is stronger than prosodic boundaries ϕ′ and ϕ″ setting off σ from its immediate surroundings ([... ϕ′ [ σ1 ϕ σ2 ] ϕ″ ... ]).

For CG-implementation it suffices to assume that only locally strongest boundaries, i.e., those that induce prosodic discontinuity according to (45), function as structural inhibitors. Thus, only these strongest boundaries are of category []ϕX/X as stated in (17). All others are “trivial” in the sense of bearing category X/X. Such a trivial boundary can be taken to intervene between mid-term and to in (44), which means that forward composition will not be interrupted and scope-taking for Q¬ remains a possibility.
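Definition (45) and the resulting category assignment can be sketched as follows. The encoding is an assumption made purely for illustration: boundary strength is represented by integers (larger means stronger), the function names are hypothetical, and the boundaries in `internal` are taken to lie strictly inside σ, so that prefix and suffix are non-vacuous by construction.

```python
from typing import Sequence


def is_prosodically_discontinuous(internal: Sequence[int], left: int, right: int) -> bool:
    """(45): some boundary inside σ is stronger than both boundaries setting σ off."""
    return any(b > left and b > right for b in internal)


def boundary_category(strength: int, left: int, right: int) -> str:
    """Locally strongest boundaries act as inhibitors; all others are trivial (X/X)."""
    return "[]ϕX/X" if strength > left and strength > right else "X/X"


# Hypothetical strengths: a weak boundary inside σ, flanked by boundaries of strength 1 and 2
print(is_prosodically_discontinuous([1], left=1, right=2))
print(boundary_category(1, left=1, right=2))

# A boundary inside σ that is stronger than both flanking boundaries
print(is_prosodically_discontinuous([3], left=1, right=2))
print(boundary_category(3, left=1, right=2))
```

In the first configuration the internal boundary is trivial and composition may proceed; in the second it counts as a structural inhibitor.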

5. Discussion

Let us now try to put the above analysis into a somewhat larger perspective. To begin with, it should be noted that there are other more general attempts to combine CCG with type-logical CG. In particular, Steedman and Baldridge (2009) introduce “modalizing features” on slashes that control the combinatorics of categories by constraining the type of operations applicable to such categories. Thus, they “assume that function categories may be restricted as to the rules that allow them to combine with other categories, via slashes typed with four feature values: ⋆, ♦, ×, and •” (ibid., 7). For our purposes it suffices to note that “the lexical type ⋆ is the most restricted and allows only the most general applicative rules” (ibid., 8). It follows that structurally inhibitory prosodic boundaries, as introduced in (17), could be given the alternative analysis in (46).

(46) %: X/⋆X

This would equally prevent function composition across such boundaries. The only way of introducing (46) will be via E/, i.e., rightward application. The analysis of (6-a) given in (19) will then carry over straightforwardly.10

Next, let us turn to the status of the “Y-model” in Chomskyan generative grammar and the proper place of linearization. Based on the analysis by Kayne (1998), Błaszczak and Gärtner (2005) present and discuss a “Y-model”-preserving minimalist implementation of CEST in which all linearization can be done at PF. The present CG-analysis, on the other hand, profits from incorporating linear order into syntactic derivations directly. One concession to the spirit of the Y-model, which lies in keeping PF and LF separate as far as possible, consists in the fact that the rule of []ϕ-Elimination does not come with a corresponding semantic interpretation. Use of an identity function could, of course, provide a trivial semantics to that rule if such a concession is deemed undesirable.

It should also be pointed out that, in a pre-minimalist version of generative syntax, Koster (1987) presented a theory that comes very close to the CG-analysis above. Central to that approach is the notion of “dynasty” as characterized in (47) (ibid., 36).

(47) A dynasty is a chain of governors such that each governor (except the last one) governs the minimal domain containing the next governor.

Dynasties are the counterparts of extended domains and thus relevant to an implementation of CEST. The second ingredient introduced into “dynasty-theory” is the idea that there are “certain types of agreement among the successive domain governors” one of which being “directionality” (ibid.). Extended NEG-scope in (2-a) could therefore be put down to the fact that all governors in the “chain” ⟨requested, that, read⟩ govern rightward. Koster (1987: chapter 5) successfully applied his system to differential “restructuring” effects in OV vs. VO languages. Thus, to the extent that the minimalist abandonment of linearization and government in “narrow syntax” is not taken to be irreversible, this theory should be reconsidered as an interesting rival to the CG-approach outlined above.

10 It has to be assumed in addition that there is no category that takes X/⋆X as an argument. Steedman and Baldridge (2009) do not discuss any such category, nor do they formally exclude its existence, as far as I can see.

6. Conclusion

The present paper has demonstrated how to treat NEG-scope extensions in English, which are sensitive to linear and prosodic continuity, within categorial grammar. The core ingredient of the account is the operation of (forward) function composition, which is responsible for “assembling” extended domains over which negative quantifiers can scope. We have seen how constraints on linear consistency as well as prosodic continuity make the right predictions as to the applicability of function composition. C(C)G should therefore be taken seriously as a framework for the (linear) local modeling of non-local dependencies.

Bibliography

Błaszczak, Joanna and Hans-Martin Gärtner (2005): ‘Intonational Phrasing, Discontinuity, and the Scope of Negation’, Syntax 8, 1–22.
Chomsky, Noam (1995): The Minimalist Program. MIT Press, Cambridge, Massachusetts.
Curry, Haskell B. (1961): Some Logical Aspects of Grammatical Structure. In: R. Jakobson, ed., Structure of Language and its Mathematical Aspects. American Mathematical Society, Providence, Rhode Island, pp. 56–68.
Dowty, David (1982): Grammatical Relations. In: P. Jacobson and G. Pullum, eds., The Nature of Grammatical Representation. Reidel, Dordrecht, pp. 79–130.
Dowty, David (1995): Toward a Minimalist Theory of Syntactic Structure. In: H. Bunt and A. van Horck, eds, Syntactic Discontinuity, Mouton, The Hague, pp. 11–62.
Fox, Danny and David Pesetsky (2005): ‘Cyclic Linearization of Syntactic Structure’, Theoretical Linguistics 31, 1–45.
Gazdar, Gerald, Ewan Klein, Geoffrey Pullum and Ivan Sag (1985): Generalized Phrase Structure Grammar. Blackwell, Oxford.
Joshi, Aravind, K. Vijay-Shanker and David Weir (1991): The Convergence of Mildly Context-Sensitive Grammar Formalisms. In: P. Sells, S. Shieber and Th. Wasow, eds, Foundational Issues in Natural Language Processing, MIT Press, Cambridge, Mass., pp. 31–81.
Kayne, Richard (1998): ‘Overt vs. Covert Movement’, Syntax 1, 128–191.
Koster, Jan (1987): Domains and Dynasties. Foris, Dordrecht.
Kracht, Marcus (2003): The Mathematics of Language. Mouton de Gruyter, Berlin.
Ladd, D. Robert (1996): Intonational Phonology. Cambridge University Press, Cambridge.
Michaelis, Jens (2001a): Derivational Minimalism is Mildly Context-Sensitive. In: M. Moortgat, ed., Logical Aspects of Computational Linguistics (LACL ’98), Springer, Heidelberg, pp. 179–198.
Michaelis, Jens (2001b): ‘On Formal Properties of Minimalist Grammars’, Linguistics in Potsdam 13.
Moortgat, Michael (1988): Categorial Investigations. Foris, Dordrecht.
Morrill, Glyn (1994): Type Logical Grammar. Kluwer, Dordrecht.
Müller, Gereon (2007): Towards a Relativized Concept of Cyclic Linearization. In: U. Sauerland and H.-M. Gärtner, eds, Recursion + Interfaces = Language? Chomsky’s Minimalism and the View from Syntax-Semantics, Mouton de Gruyter, Berlin, pp. 61–114.
Muskens, Reinhard (2007): ‘Separating Syntax and Combinatorics in Categorial Grammar’, Research on Language and Computation 5, 267–285.
Reape, Michael (1994): Domain Union and Word Order Variation in German. In: J. Nerbonne, K. Netter and C. Pollard, eds, German in Head-Driven Phrase Structure Grammar, CSLI Publications, Stanford, CA, pp. 151–197.
Ross, John Robert (1967): Constraints on Variables in Syntax. PhD thesis, MIT, Cambridge, Massachusetts.
Steedman, Mark (1996): Surface Structure and Interpretation. MIT Press, Cambridge, Mass.
Steedman, Mark (2000a): The Syntactic Process, MIT Press, Cambridge, Mass.
Steedman, Mark (2000b): ‘Information Structure and the Syntax-Phonology Interface’, Linguistic Inquiry 34, 649–689.
Steedman, Mark and Jason Baldridge (2009): Combinatory Categorial Grammar. In: R. Borsley and Kersti Borjars, eds, Non-Transformational Syntax: Formal and Explicit Models of Grammar. Wiley, Oxford, pp. 181–224.
Stechow, Arnim von (1989): ‘Categorial Grammar and Linguistic Theory.’ Arbeitspapiere der Fachgruppe Sprachwissenschaft der Universität Konstanz 15.
Wagner, Michael (2005): ‘Prosody and Recursion.’ PhD thesis, MIT, Cambridge, Massachusetts.

Department of Theoretical and Experimental Linguistics Research Institute for Linguistics Hungarian Academy of Sciences

Masaya Yoshida & Ángel J. Gallego

Ellipsis and Phases: Evidence from Antecedent Contained Sluicing*

Abstract

It has long been observed that movement obeys strict cyclicity. Various attempts have been made in the history of generative grammar to capture the step-wise nature of movement. This paper argues that one of the current theories of cyclicity, Chomsky’s (2000; 2001; 2004; 2007; 2008) Phase Theory, captures the properties of ellipsis in the so-called Antecedent Contained Sluicing (ACS) construction (Yoshida (2010)) in a straightforward fashion. In so doing, we show that, similarly to movement, the relation between the ellipsis site and its antecedent is established in cyclic nodes, and provide support for a particular theory of cyclicity.

1. Introduction

The aim of this paper is to study some new properties of the construction in (1) below, which, following Yoshida (2006), we refer to as Antecedent Contained Sluicing (henceforth, ACS), within the context of Chomsky’s (2000 through the present) Phase Theory:

(1) John kissed someone without knowing [CP who e ]

As will be shown, some properties of ACS are particularly relevant for any approach that tries to provide locality-based accounts of ellipsis, since this process appears to require either TP or VP to be the unique antecedent for TP-ellipsis (sluicing), a fact we will capitalize on in order to claim that the selection of the antecedent corresponds to the domain affected by Chomsky’s (2004) Transfer (formerly, Spell-out), an operation responsible for cashing out dedicated chunks of structure to the interfaces. In particular, we argue that a theory incorporating a semantic identity condition on ellipsis and a cyclic Transfer to the LF-interface (as in most phase/cycle-related frameworks) predicts that ellipsis can be licensed by a semantically parallel (though syntactically unparallel) antecedent, and this process of licensing is governed by cyclic Transfer.1 Through a close examination of the data, we will argue that ACS shows a distribution exactly along the parameters just mentioned: TP-ellipsis is licensed (i.e., can have as antecedent) either by VP or TP, depending on the way the derivation unfolds. In other words, we will show that the generalization in (2) (which will be commented on in section 4 below) holds.

(2) The antecedent of ACS corresponds to the complement domain of phase heads.

We will argue that, to the extent that (2) turns out to be correct, it reinforces the status of periodic Transfer to the interfaces (particularly so under Single Cycle Output approaches to syntax, like Chomsky’s Phase Theory), and its connection to grammatical phenomena such as ellipsis. The paper is divided as follows: section 2 provides an overview of the background assumptions that will be made with respect to ellipsis, Phase Theory, and cyclic Transfer; in section 3, we turn our attention to the main properties exhibited by Yoshida’s (2006) ACS; section 4 presents some additional properties of ACS, which will be argued to fall into place under Chomsky’s phase-based framework; finally, section 5 summarizes the main conclusions.

* Ángel J. Gallego wants to acknowledge partial support by grants from the Generalitat de Catalunya (2009SGR-1079), the Ministerio de Ciencia e Innovación (FFI2011-29440-C03-01), and the British Academy (2008 Visiting Fellowship). The authors want to express their gratitude for suggestions, comments, and judgments to Cedric Boeckx, Ricardo Etxepare, Tomo Fujii, Takuya Goro, Norbert Hornstein, Howard Lasnik, Jeff Lidz, Jason Merchant, Chizuru Nakao, Juan Uriagereka, and Luis Vicente. Usual disclaimers apply.

Local Modelling of Non-Local Dependencies in Syntax, 353-370 Artemis Alexiadou, Tibor Kiss & Gereon Müller (eds.) LINGUISTISCHE ARBEITEN 547, de Gruyter 2012

1 Cf. Uriagereka (1999) for a different, but not incompatible, conception of Transfer – particularly concerned with PF issues. Cf. Uriagereka (2008) for more general background discussion.

2. Ellipsis, semantic parallelism, and Single Output Syntax

In this section, we lay out the background assumptions necessary to account for the properties of ACS. In our treatment we argue for an analysis of ellipsis along the lines of Merchant (2001; 2002), coupled with a specific locality theory (Chomsky’s framework of phases), which crucially restricts the range of possible antecedents of sluicing (a variety of TP ellipsis) in a way consistent with the empirical evidence we will review.

2.1. Ellipsis: semantic parallelism

Let us begin by focusing on ellipsis, and, more particularly, on two key issues about this phenomenon: (i) what is its nature? and (ii) how must the identity requirement it has been argued to obey be understood? (cf. Lasnik (2005) and Merchant (2001) for discussion). In a brief but insightful passage, Chomsky & Lasnik (1995, 125) argue in favor of a view that treats ellipsis as a deletion process at the PF-component. In particular, they sketch out a proposal whereby ellipsis is related to deaccenting, i.e., they claim that ellipsis is the most radical manifestation of deaccenting, namely phonological deletion. From such a perspective, (3-b) is nothing but a more deaccented version of (3-a):

(3) a. John said that he was looking for a cat, and so did Bill [say that he was looking for a cat]. (bracketed material = deaccented)
b. John said that he was looking for a cat, and so did Bill [e].

Chomsky & Lasnik’s claim is backed up by a property shared by the examples in (3): they point out that both deaccenting and ellipsis are subject to the same parallelism requirement: the interpretation of the pronoun he and the indefinite NP a cat in the elided and deaccented clauses must be the same as in the antecedent clause. Although he basically follows Chomsky & Lasnik’s (1995) approach, Merchant (2001, ch.1) has pointed out some problems with treating ellipsis as a stronger form of deaccenting. To be precise, Merchant (2001) shows that the environments in which an XP can be deaccented (i.e., treated as given information) are not exactly the same as the environments in which an XP can be deleted. To support his claim, Merchant (2001) reviews different pieces of evidence that threaten syntactic isomorphism accounts of ellipsis. Consider, to see this, the example in (4), taken from Merchant (2001).

(4) Abby was reading, but I don’t know what [Abby was reading t_what ] [from Merchant (2001, 19)]

The problem here – as Merchant (2001) emphasizes – is that reading is intransitive in the antecedent clause (i.e., it does not contain a direct object), but transitive in the ellipsis clause (i.e., it contains the direct object what). Therefore, the antecedent and the sluiced site are not syntactically parallel. Clearly, then, structural isomorphism is not available in (4), because the structure of the relevant VP is not the same. Merchant (2001) points out further cases that are problematic for syntactic parallelism accounts of ellipsis. To overcome the problems of those approaches, but still salvage the evidence (such as a cross-linguistic generalization on P-stranding and Case-matching connectivity under sluicing) that strongly suggests the presence of fully-fledged syntactic structure under ellipsis, Merchant (2001) pursues a semantic-parallelism approach to ellipsis.
As a part of his proposal, Merchant assumes the condition on ellipsis in (5), based on the notion of e-GIVENness, which is in turn defined in (6):


(5) Focus condition on XP-ellipsis
An XP α can be deleted only if α is e-GIVEN. [adapted from Merchant (2001, 26)]

(6) E-GIVENness
An expression E counts as e-GIVEN iff E has a salient antecedent A and, modulo ∃-type shifting,
(i) A entails F-clo(E), and
(ii) E entails F-clo(A) [adapted from Merchant (2001, 26)]

Under Merchant’s e-GIVENness approach, ellipsis is licensed if the focus-closed ellipsis site and the focus-closed antecedent site hold a mutual entailment relation. This indeed predicts that a sentence like (4) is grammatical because both the elided TP_E and the antecedent TP_A have the following semantic representation: F-clo (TP) = ∃x. Abby read x. Furthermore, an e-GIVENness-based approach explains why the following example is excluded as a legitimate instance of sluicing.

(7) *I know how many politicians she called an idiot, but I don’t know ... WHICH (politicians) [IP she insulted t_which politicians ] [from Merchant (2001, 31)]

As Merchant (2001, 23 & 31) notes, calling someone an idiot certainly entails insulting someone, semantically speaking; however, this is not enough for ellipsis to be licensed. The mutual entailment relation does not hold, for insulting someone does not entail calling someone an idiot: this is the problem for theories that do not resort to the Focus condition above (see (5)), which crucially relies on a mutual entailment relation between the elided site and its antecedent.

To recap so far, we have seen that ellipsis can indeed be treated as a PF phenomenon (i.e., an instance of PF deletion that leaves syntactic structure intact; cf. Merchant (2001) and Lasnik (1999; 2001)), but not actually as a radical version of deaccenting (contrary to Chomsky & Lasnik’s (1995) observation). With Merchant (2001), we have assumed ellipsis to be governed by a parallelism condition that is stated semantically (involving mutual entailment, as just noted), but not syntactically.2
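The mutual entailment requirement in (6) can be made concrete with a toy check. The sketch below is an illustration under strong simplifying assumptions: focus closures are represented as plain strings, entailment is a hand-listed relation (`ENTAILS` is hypothetical, not a semantic theory), and the closures given for (7) are simplified paraphrases rather than Merchant’s own representations.

```python
from typing import Set, Tuple

# Hypothetical, hand-listed one-way entailments between focus-closed propositions.
ENTAILS: Set[Tuple[str, str]] = {
    ("∃x. she called x an idiot", "∃x. she insulted x"),
}


def entails(a: str, b: str) -> bool:
    """A proposition entails itself; other entailments must be listed in ENTAILS."""
    return a == b or (a, b) in ENTAILS


def e_given(fclo_antecedent: str, fclo_ellipsis: str) -> bool:
    """(6): the antecedent and the ellipsis site must entail each other's F-closure."""
    return entails(fclo_antecedent, fclo_ellipsis) and entails(fclo_ellipsis, fclo_antecedent)


# (4): antecedent and ellipsis site share the F-closure ∃x. Abby read x
print(e_given("∃x. Abby read x", "∃x. Abby read x"))

# (7): calling someone an idiot entails insulting them, but not the other way round
print(e_given("∃x. she called x an idiot", "∃x. she insulted x"))
```

The first check succeeds and the second fails, mirroring the contrast between (4) and (7): one-way entailment is not sufficient for deletion.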

2 Lasnik (2005, 261) notes that some gaps remain to be accommodated under a semantic identity approach to ellipsis. In particular, it is not clear at all why (i) is ruled out if formal identity is not required for ellipsis to take place:
(i) *Someone shot Ben, but I don’t know by whom [ Ben was shot t ]
Together with (i), Lasnik (2005) mentions two problematic cases for Merchant’s account: agreement mismatches (which disfavor sloppy identity), and VP ellipsis with be:

(ii) a. ??Mary washed her car and John did [ wash his car ] too.
b. *Mary is a doctor and John will [ be a doctor ] too.

2.2. Phase theory and cyclic transfer

A defining property of linguistic theory is its reliance on some notion of compositionality, which imposes that the interpretation of larger units be based on the interpretation of smaller ones. Within generative grammar, this leading idea took a serious form in the fifties, with Chomsky et al.’s (1956) pioneering work on stress, later on extended, and further applied to morpho-phonology, semantics, and syntax. In recent years, several attempts have tried to refine and reformulate the notion of cycle (cf. Lasnik (2006)): that was the goal behind explorations on bounding nodes (cf. Chomsky (1973)), barriers (cf. Chomsky (1986)), and phases (cf. Chomsky (2000; 2001; 2004; 2007; 2008)). One key assumption of recent frameworks that endorse the idea of there being dedicated domains (though building on different assumptions, dedicated domains are entertained in Chomsky (2000), Grohmann (2003), Hale & Keyser (1993; 2002), Rizzi (2006), and Uriagereka (1999)) is that these have a special status not only during the syntactic computation (an issue we return to), but also after syntactic objects are sent to the interpretive components of PF and LF, where they can be further manipulated:

(8) Narrow Syntax – PF / LF Mapping
Lexicon → Narrow Syntax → (Cyclic Transfer) → LF Component + PF Component

To defend the role played by the cycle, different syntactic processes (Case, agreement, binding, theta-role assignment, etc.) have been argued to operate obeying severe locality constraints, taking place within a relevant ‘derivational window’, neither before nor after. One such particular phenomenon is movement, which, as recently argued by Abels (2003) and Boeckx (2007), must apply in a very local fashion, possibly targeting the specifiers of all projections until the final site is reached. In this paper we want to explore the possibility that ellipsis can be included into the list above, at least in the specific case of ACS. For that to be possible, we would like to start by highlighting the traits of cyclic Transfer models. To begin with, we will assume the idea that there is an operation of Transfer that sends dedicated XPs to the interface components (i.e., PF and LF).3 In Chomsky’s (2000 through the present) recent theory, these XPs correspond to the complement domain of the phase heads v* and C, as can be seen in (10).

(9) Transfer
Transfer hands D[erivation]-NS over to [PHON] and [SEM] [from Chomsky (2004, 107)]

(10)

[PhP ZP [Ph′ Ph YP ]]
• Ph = v* or C
• YP = complement domain

The gist behind this type of framework concerns its taking very seriously the idea that only YP in (10) can be transferred to the interface levels. Cashing out a so designated portion of structure to both the LF and PF components is one of the major innovations with respect to the concept of cycle in the framework of phases. In Chomsky (1995), Spell-out was originally understood as a rule stripping the phonological features from the computation and sending them to the PF-interface, and, as such, it affected neither the syntactic component nor the semantic interpretation. Spell-out was, also importantly, regarded as a point of parametric variation: languages (English vs. Chinese or English vs. Spanish) were thought to vary depending on when Spell-out applied. As can be seen, the new conception is different in a way we can take advantage of.

3 The difference between ‘component’ and ‘level (of representation)’ is important but not crucial for what we have to say here. Cf. Uriagereka (2008) for relevant discussion.

The formulation in (9) and (10) also differs from the classical notion of cycle, which was understood strictly as a domain that imposed boundary conditions on morpho-phonological (and later on syntactic) transformations (cf. Chomsky et al. (1956), Chomsky (1973), Jackendoff (1972), Lasnik (2006), and Uriagereka (2008)); as such, the notion of cycle was independent from semantic interpretation. In sum, under both the early minimalist understanding of Spell-out and the classical concept of cycle, it is not immediately obvious that semantic interpretation can be constrained by the principle of the cycle. In Chomsky’s recent Phase Theory, on the other hand, it is possible to entertain such a possibility, as it is clear that cyclic Transfer (formerly, Spell-out) operates feeding the LF and PF components simultaneously. Accordingly, Phase Theory predicts that semantic interpretation is also affected by phase-based dynamics.
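The schema in (10) can be illustrated with a toy computation of Transfer domains. The sketch assumes a simplified clausal spine C – T – v* – V and represents each head only by the label of its complement; the names `PHASE_HEADS` and `transfer_domains` are hypothetical and serve illustration only.

```python
from typing import List, Tuple

PHASE_HEADS = {"C", "v*"}  # Ph = v* or C, as in (10)


def transfer_domains(spine: List[Tuple[str, str]]) -> List[str]:
    """Given (head, complement label) pairs listed top-down, return the complements
    of phase heads in bottom-up order, i.e. the order in which a cyclic derivation
    would reach them."""
    return [comp for head, comp in reversed(spine) if head in PHASE_HEADS]


# Assumed spine: C takes TP, T takes v*P, v* takes VP, V takes DP
clause = [("C", "TP"), ("T", "v*P"), ("v*", "VP"), ("V", "DP")]
print(transfer_domains(clause))
```

On this simplified spine the transferred domains are VP and TP, the two categories that figure as possible antecedents in the discussion of ACS.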

3. Antecedent Contained Sluicing

So far we have reviewed two main issues: semantic parallelism approaches to ellipsis on the one hand (cf. Merchant (2001)), and phase-based approaches to the concept of cycle on the other (cf. Chomsky (2000) and subsequent work). The next question worth asking is what the predictions of a theory of syntax that incorporates both assumptions are. One natural possibility is that the semantic parallelism condition on ellipsis (cf. Fox (2000), Merchant (2001)) is calculated cyclically. To be precise, suppose that both a potential antecedent site and a potential ellipsis site reside in the same cyclic Transfer domain, i.e., the complement domain of a phase-head. If the semantic parallelism requirement can be satisfied in such situations, ellipsis will be licensed locally, as roughly illustrated in (11):

(11) [Tree diagram: PhP dominates the phase head Ph and its complement YP; YP contains Y and XP; the labels “antecedent site” and “ellipsis site” mark positions within this complement domain] – where Ph = v* or C

In what follows we show that ACS features a licensing pattern that fits with the scenario just sketched. We start by reviewing the basic properties exhibited by ACS, and then we introduce some new data, which we will capitalize on in order to put forward a new analysis of ACS.


3.1. Antecedent containment, modals and negation

As we noted in section 1, a typical example of ACS is (12).

(12) [TP John kissed someone [PP without knowing who [TP e ]]]

It has been pointed out in the literature (cf. Yoshida (2006) for details) that this construction raises two problems: (i) antecedent containment, and (ii) unavailability of functional categories such as modals and negation. Consider them in turn. To do that, though, let us first diagnose the position of the PP that contains the elided TP. Various constituency tests show that the PP is attached to the VP. For example, the PP can be substituted by do so (a pro-form for VP), together with the VP, as shown in (13-a). In the same way, the PP can be inside the scope of VP-ellipsis, as illustrated in (13-b). Finally, the PP can be fronted together with the VP, as can be seen in (13-c), thus indicating that PP and VP form a syntactic unit.

(13) a. John kissed someone without knowing who, and Mary did so [VP kiss someone [PP without knowing who]] too.
b. John kissed someone without knowing who, and Mary did [VP kiss someone [PP without knowing who]] too.
c. John kissed someone without knowing who, and [VP kiss someone [PP without knowing who]], he did indeed.

Given that the PP is attached to the VP, the structure gives rise to an infinite regress problem under the assumption that the TP-ellipsis requires a TP-antecedent: if the elided TP in (12) requires a TP antecedent, then the only available antecedent is the whole matrix TP that contains the elided TP itself. Therefore, if the structure of the ellipsis site is recovered from the antecedent TP (the matrix TP), then it gives rise to an infinite regress. The second problem of this construction is concerned with its interpretation. Somehow, modals, negation, and similar functional material in the matrix clause are not interpreted in the elided site. This is exemplified by (14).4

4 As far as we can tell, the same effect is found in Spanish:
(i) Juan no debe besar a nadie sin saber a quién. (Spanish)
Juan not must-3.SG kiss-INF to anyone without know-INF to who
‘Juan must not kiss anyone without knowing who’
Thus, in (i), the ellipsis site is interpreted as sin saber a quién besa (Eng. without knowing who he kisses).


(14) John must not kiss anyone without knowing who.
= ... without knowing who he is kissing.
≠ ... without knowing who he must not kiss.

Yoshida (2006) points out that neither of the issues above is problematic under Merchant’s (2001) semantic parallelism approach to ellipsis. As we have seen, under such an account, ellipsis can be licensed if the ellipsis site and the antecedent site hold a mutual entailment relation, without requiring syntactic parallelism of any kind. Consider this in more detail. Assuming some version of the so-called VP Internal Subject Hypothesis (cf. Fukui & Speas (1986), Kitagawa (1986), Kuroda (1988), and Koopman & Sportiche (1991), among others), a theory adopting a semantic parallelism condition on ellipsis will allow for VP to be an antecedent of TP. A VP segment that excludes the PP containing the elided TP while including all the other arguments is available ([VP [VP kissed someone] [PP without knowing who]]), and the ∃-type shifting process yields the representations in (15) for the antecedent and the ellipsis sites; as can be seen, mutual entailment between VP_A and TP_E is satisfied, and so TP-ellipsis can be licensed.5,6

(15) F-clo (VP_A) = ∃x. John kiss x
F-clo (TP_E) = ∃x. John kissed x

If on track, this approach can explain why modals and negation are not interpreted in the ellipsis site. This is so, quite simply, because the antecedent of TP-ellipsis is not the matrix TP, but the matrix VP, which therefore excludes the higher functional categories.

3.2. Ambiguity and attachment site

Although we have argued that modals and negation are not interpreted in the elided TP in ACS, this is not always the case. As we have just shown, when the PP containing the TP-ellipsis is attached to the VP these elements are not recovered into the ellipsis site. However, the moment the PP is rightward shifted as in (16), these elements are available, and the sentence becomes ambiguous.7

5 Pursuing a QR solution will not be helpful, since it will create more variables than binders in this particular instance. Cf. Sprouse (2006) for relevant discussion.
6 A similar analysis is adopted by Merchant (2002) to account for some peculiar properties of so-called swiping.
7 We will put the interpretation of negation aside here. Some native speakers we interviewed clearly get the interpretation of negation, but some speakers reported to us that the interpretation of the sentence itself is pretty complicated due to the combination of the negation and a downward entailing preposition, without. Therefore, in this study, we keep concentrating on the interpretation of the modal, assuming that the same is true for negation.

Masaya Yoshida & Ángel J. Gallego

(16) John must kiss someone tomorrow without knowing who.
     = . . . without knowing who he is kissing.
     = . . . without knowing who he must kiss.

In (16) it is plausible to assume that the shifted PP is attached to the TP, due to the presence of the temporal adverb tomorrow. If so, a potential question is whether the resulting phrase structure gives rise to an infinite regress. This is, however, not really problematic: as shown in (17), if the PP is attached to the TP, then a TP-segment that excludes the PP is always available for ellipsis resolution. Therefore, the antecedent containment scenario does not arise here either, and the potential infinite regress problem vanishes.

(17) [TP [TP John must kiss someone tomorrow] [PP without knowing who [TP e]]].

The fact that the ambiguity arises if the PP is attached to the TP, but not to the VP, suggests that there is a correlation between the interpretation of the elided TP and the attachment site. Descriptively speaking, when the PP is attached to VP, modals and negation are not available in the TP-ellipsis in ACS, but if the PP is attached to TP, they are available. In the following section, we argue that this situation can be accommodated in a framework incorporating phases.

4. Phase Theory and ACS resolution

Having considered the properties of ACS and the technical assumptions that are required to account for them, in this section we focus on the asymmetry we mentioned at the outset of the paper and presented in the previous section, namely the difference with respect to ellipsis resolution in ACS. The generalization we reached in the discussion so far can be rephrased as follows.

(18) ACS Generalization
     Attachment site (structural height/size) and the interpretation/antecedent of the ellipsis site correlate.
     a. When the PP is attached to VP, VP is the antecedent for TP-ellipsis.
     b. When the PP is attached to TP, both VP and TP can be the antecedent for TP-ellipsis.

Now, let us consider what this generalization may follow from. From the perspective of Chomsky’s system, the answer is simple: it follows from the principle regulating cyclicity, i.e., cyclic Transfer operating at the phase level. As can be seen, the two attachment sites indicated in (18) coincide with the complement

Ellipsis and Phases: Evidence from Antecedent Contained Sluicing


domains of phase heads: VP and TP, if we follow Chomsky in assuming that C and v* are the phase heads.8 If we put all the pieces together, we can understand the correlation between the attachment site and the possible interpretation(s) of the TP-ellipsis in the following way: when the PP is attached to the VP, both the VP and the PP are transferred together. At the LF-interface, semantic calculation takes place; if the parallelism requirement on TP-ellipsis is satisfied, TP-ellipsis is licensed at this point of the derivation. Notice that the material transferred at this point is a VP containing all its arguments and the PP containing the ellipsis site (which we take to be attached to the VP). With Merchant (2001), we assume that the VP and the elided TP stand in a mutual entailment relation, and so TP-ellipsis can be licensed taking VP as its antecedent; consequently, high functional categories such as modals and negation are not available. Schematically, this can be represented as in (19).

(19)

[Tree diagram: v*P structure with the PP adjoined to VP; the VP, containing the antecedent VPA and the PP with the elided TPE, is the domain of cyclic Transfer.]

If, on the other hand, the PP is attached to the TP, the PP is transferred to the interface levels together with all the material contained within the TP. At this step of the derivation, the elided TP can take the whole matrix TP as its antecedent. As a result, modals and negation become available, as expected. This is represented in (20).

8 We will leave aside the possibility that DP constitutes a phase too, as argued by Svenonius (2004). Cf. also Chomsky (2007) and Hiraiwa (2005) for additional discussion.

(20) [Tree diagram: CP structure with the PP adjoined to TP; the TP, containing the antecedent TPA and the PP with the elided TPE, is the domain of cyclic Transfer.]
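The two Transfer scenarios in (19) and (20) can be summarized in a small sketch. It simply encodes generalization (18) as a lookup from attachment site to available antecedents; the dictionary, the function name and the reading strings are illustrative assumptions, not part of the analysis itself.

```python
from typing import Dict, List

# Generalization (18): the attachment site of the PP determines which
# constituents can serve as antecedents for the elided TP.
# All names here are hypothetical and chosen for illustration only.
ANTECEDENTS_BY_ATTACHMENT: Dict[str, List[str]] = {
    "VP": ["VP"],        # (18a): PP transferred with VP, the complement of v*
    "TP": ["VP", "TP"],  # (18b): PP transferred with TP, the complement of C
}


def ellipsis_readings(attachment: str, vp_reading: str, tp_reading: str) -> List[str]:
    """Return the readings of the elided TP available for an attachment site.

    vp_reading lacks modals and negation (they sit above VP);
    tp_reading includes them.
    """
    if attachment not in ANTECEDENTS_BY_ATTACHMENT:
        raise ValueError(f"unknown attachment site: {attachment!r}")
    by_antecedent = {"VP": vp_reading, "TP": tp_reading}
    return [by_antecedent[a] for a in ANTECEDENTS_BY_ATTACHMENT[attachment]]


if __name__ == "__main__":
    # PP attached to VP: only the modal-less reading.
    print(ellipsis_readings("VP", "who he is kissing", "who he must kiss"))
    # PP attached to TP, as in (16): both readings, hence the ambiguity.
    print(ellipsis_readings("TP", "who he is kissing", "who he must kiss"))
```

The lookup makes the asymmetry explicit: the modal reading appears only when TP is among the transferred antecedents.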

In this way, the ACS generalization follows very naturally if we assume both cyclic Transfer and e-GIVENness-based ellipsis. Although this fits nicely with our hypothesis in (2), we cannot fail to mention some immediate problems with this phase-based approach to ACS ambiguity. At this point, we are aware of the following three shortcomings: (i) the position of the external argument (EA), (ii) the timing of ellipsis, and (iii) the issue of optionality.

The first problem is blatant if we adopt the fairly standard idea that the EA is generated as a specifier of the light verb v* (cf. Chomsky’s (1995) original claim, building on ideas of Hale & Keyser (1993) and others), outside the complement domain of the phase (in what Chomsky refers to as the edge). This is not trivial: if the EA is externally merged with the v*P, then the EA is not available in the domain of cyclic Transfer, and as a result the VP and the elided TP cannot be semantically parallel, due to the unavailability of the EA.

(21)

[Tree diagram: v*P with the EA (DP) in the specifier of v*, outside the Transfer domain; the VP, containing VPA and the PP with the elided TPE, is the domain of cyclic Transfer.]

If the VP is to be the antecedent of the TP (counting as a full proposition, for semantic parallelism to go through), then it must be the case that the EA is generated below the light verb v* and transferred together with the VP, as in (22):

(22)

[Tree diagram: v*P in which the EA (DP) is generated below v*, inside VPA, together with V, the IA (DP) and the PP containing the elided TPE; the VP is the domain of cyclic Transfer.]

An analysis of the v*P along these lines overcomes the problem, but raises the questions of how the EA receives its theta role and how it then moves to SPEC-T. The first question can be tackled if one dispenses with the (influential) hypothesis that the EA theta role is assigned by v* (or by the v*-VP configuration); for instance, one could assume that the relevant theta role is checked by movement (cf. Hornstein (2001) and references therein), or else assigned in an ECM-like fashion. Another route (still compatible with Hale & Keyser’s framework) would be to argue that it is not v*, but V, that encodes the relevant semantic flavor. Whatever the ultimate analysis of the interpretation of the EA, we believe there are independent reasons to pursue (22). One such reason is offered by Gallego (2007), and comes from Romance VOS sentences, which, following Ordóñez (1998), we analyze as involving object shift over the in situ subject:

(23) Recogió          todo  coche_i su_i propietario.   (Spanish)
     picked-up-3.SG   every car     its  owner
     ‘Its owner picked up every car.’

The important thing to note in (23) is that we are dealing with a bona fide instance of (object) A-movement, given the binding facts. If we were to stick to Chomsky’s (1995) analysis, (23) should be abstractly represented as follows, with the shifted object in an outer SPEC-v*.

(24) [v*P Object [v*P Subject v* [VP V tObject]]]

Though compatible with the facts, (24) shows some properties that are unexpected under the phase-based system of Chomsky (2008). In recent years, Chomsky has been arguing for a parallel behavior of CP and v*P with respect to the Case/agreement systems, assuming C and v* are the loci of structural Case. Relevantly, Chomsky (2008) puts forward an inheritance process that passes ϕ-features down from phase heads to non-phase heads. Chomsky (2008) connects ϕ-feature inheritance to subject and object raising processes (i.e., EPP effects) to specifiers of non-phase heads. Crucially, if inheritance precedes probing (for reasons discussed in Chomsky (2007, 22)), then the analysis in (24) cannot be correct, as it would predict that ϕ-features in v* can probe prior to inheritance. Let us therefore suppose that ϕ-features are indeed passed down to V, from where they probe the object, raising it to SPEC-V.9 If that much is assumed, then the next step is to accept that the EA is first-Merged below v*, as depicted in (22). Otherwise, the binding facts would remain unaccounted for. The step-by-step derivation would be as in (25):

(25) a. [v*P v*ϕ [Subject [VP V Object]]]              ϕ-feature inheritance
     b. [v*P v* [Object [Subject [VP Vϕ tObject]]]]    object raising

9 This is in line with Chomsky’s (2007; 2008) reinterpretation of the Lasnik-Saito-Koizumi approach to Postal’s raising-to-object.

From (25) the binding data noted by Ordóñez (1998) fall into place. Moreover, the fact that object raising is to V and not to the edge of the v*P phase explains why objects in VOS do not necessarily receive a discourse-oriented interpretation (say, specificity; cf. Chomsky (2001)).

There is one other question. If the EA is transferred (together with the VP), then it is not clear how it can be raised up to TP. We can easily sidestep this problem by adopting the following two assumptions: first, the EA moves to v*P before Transfer applies to VP (for reasons discussed by Mayr (2007), reformulating ideas of Moro (2000)); second, this movement leaves a copy of the EA in the VP-adjoined position.

The second problem with our approach to (18) has to do with the timing of the ellipsis process. Notice, in this respect, that for the elided TP to be transferred within the complement domain of a phase, be it VP or TP, it must somehow wait. In other words, the elided TP must not be transferred until the antecedent is, for otherwise the antecedent could not be calculated as desired. For the Transfer operation to be delayed, we will assume that the preposition selects a defective domain, similar to that of raising/ECM (or passive/unaccusative) cases. We believe this is tenable, as the tense interpretation of the CP is parasitic on that of the matrix clause (as in raising/ECM; cf. Pesetsky & Torrego (2007)). Thus, if the ACS domain is defective, then Transfer does not have to apply. Within Phase Theory, all we need is to assume a weak version of what Chomsky calls the Phase Impenetrability Condition. In his writings, Chomsky has considered two distinct versions of the PIC:

(26) Phase Impenetrability Condition (PIC-1)
     In phase α with head H, the domain of H is not accessible to operations outside α; only H and its edge are accessible to such operations. [from Chomsky (2000, 108)]

(27) Phase Impenetrability Condition (PIC-2)
     [Given structure [ZP Z . . . [HP α [H YP]]], with H and Z the heads of phases]: The domain of H is not accessible to operations at ZP; only H and its edge are accessible to such operations. [from Chomsky (2001, 14)]

Under the second version of the PIC, cyclic Transfer does not apply immediately upon completion of a phase: it must wait until the next phase head shows up. In the case of ACS, given that the C selected by the preposition does not qualify as a strong phase head (it is a weak phase head, in Chomsky’s (2001) sense), Transfer is decided in the matrix clause, by either v* or C.

The final problem is related to the ambiguity noted in (16), where the elided TP can take either the VP or the TP as its antecedent. As we have already pointed out, this ambiguity can be due to the copy of the EA left in the v*P. However, there is another possible source for the ambiguity. We would like to argue that such a possibility also arises depending on which copy of the PP is interpreted. If the PP starts its derivational life as a VP adjunct and then moves to TP, there will be two copies. This can be seen in (28). Interpreting the high copy will yield a reading where the functional structure is available; interpreting the low copy will block that interpretation.

(28)

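[Tree diagram: CP structure with two copies of the PP, a high copy attached to TP and a low copy attached to VP; the VP contains the EA, V and the IA.]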
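The difference between the two versions of the PIC in (26) and (27), on which the delayed Transfer of the ACS domain relies, can be made concrete with a short sketch. It is a simplification under stated assumptions: a clause is reduced to a bottom-up list of heads, and the function name, parameters and head labels are hypothetical.

```python
from typing import List, Set


def domain_accessible(probe: str, phase_head: str, spine: List[str],
                      phase_heads: Set[str], version: int) -> bool:
    """Can `probe` reach into the complement domain of `phase_head`?

    `spine` lists the heads of one clause bottom-up (first merged first);
    `probe` must be merged later than `phase_head`.
    """
    low, high = spine.index(phase_head), spine.index(probe)
    if high <= low:
        raise ValueError("probe must be merged above the phase head")
    if version == 1:
        # PIC-1 (26): the domain is closed to every operation outside the phase.
        return False
    if version == 2:
        # PIC-2 (27): the domain closes only once the next phase head is merged.
        return not any(h in phase_heads for h in spine[low + 1:high + 1])
    raise ValueError("version must be 1 or 2")


if __name__ == "__main__":
    spine = ["V", "v*", "T", "C"]
    phases = {"v*", "C"}
    for probe in ("T", "C"):
        for version in (1, 2):
            print(probe, version, domain_accessible(probe, "v*", spine, phases, version))
```

Under PIC-2 the non-phase head T can still access the domain of v*, whereas C cannot; under PIC-1 neither can. This is the sense in which Transfer "waits" for the next phase head.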

We started this section emphasizing the ambiguity displayed by ACS, a novel observation due to Yoshida (2010) that we have dealt with by taking advantage of the phase-based machinery we have been assuming all along: if we are correct, the ambiguity can be related to the Single Output Syntax model of Chomsky (2000 through the present), where specific portions of the structure are cyclically cashed out. These portions, interestingly, are VP and TP, which is exactly what is needed to account for the ambiguity. Promising as this is, the account has, to our mind, three potential loopholes, for which we have offered ways out that do not require new or ad hoc strategies and are furthermore fully compatible with the framework of phases.

5. Conclusions

In the preceding pages, we have argued that a particular implementation of the cycle (Chomsky’s Phase Theory) can be used to account for the varying properties of what Yoshida (2006; 2010) calls Antecedent Contained Sluicing (ACS):

(29) John must kiss someone without knowing who.

The interesting properties of (29), as Yoshida (2006) discusses at length, have to do with the infinite regress problem and the interpretation of the ellipsis site (where the modal verb is unavailable). In order to get around these problems, Yoshida (2006) adopts Merchant’s (2001) theory of ellipsis, which is based on a semantic parallelism condition (the Focus condition in (5)) holding between antecedent and ellipsis sites; in particular, Yoshida (2006) argues, by and large following Merchant’s (2002) analysis of swiping, that TP ellipsis in (29) can take matrix VP as its antecedent: since matrix VP and embedded TP are semantically parallel, ellipsis can take place under a theory like Merchant’s (2001). Yet the main new puzzle posed by ACS is the following: as soon as the PP is attached to some higher position (TP, we have assumed here), modals (and negation) are available again. This was seen in (16), repeated here as (30) for convenience:

(30) John must kiss someone tomorrow without knowing who.

This time, placing the PP without knowing who after the adverb tomorrow has an unexpected effect: must can be interpreted in the ellipsis site. We have tried to connect these data and give them a unitary treatment by adopting Chomsky’s claim that there are dedicated chunks of the structure that feed semantic (and phonological) interpretation: the complement domains of phase heads. If C and v* are the relevant phase heads, then TP and VP are the dedicated domains for interpretive purposes. Happily, this fits with the facts in (29) and (30). If correct, then whether TP can be taken as antecedent in ACS largely depends on where the PP is attached: if it is attached to TP (within the CP phase), then TP is the antecedent; if it is attached to VP (within the v*P phase), then VP is the antecedent.

More generally, there is a weak and a strong conclusion to be drawn from our study of ACS. The weak conclusion is that adopting a framework incorporating some version of periodic Transfer (e.g. Chomsky’s Phase Theory) suffices to account for the properties exhibited by this construction; the strong conclusion is much more interesting, for what we have discussed points to the possibility that ACS (a particular instance of ellipsis), like many other grammatical phenomena (Case assignment, movement, theta-role assignment, binding, agreement, and so on), is regulated by some sort of cyclic dynamics. Interestingly, and crucially for us, the fact that TP and VP are the antecedents in ACS helps sharpen this leading idea, for it tells us that not just any phrase counts as a cycle: only dedicated domains (Chomsky’s phases) do.

Bibliography

Abels, Klaus (2003): Successive Cyclicity, Anti-Locality, and Adposition Stranding. PhD dissertation, UConn.
Boeckx, Cedric (2007): Understanding Minimalist Syntax. Lessons from Locality in Long-Distance Dependencies. Blackwell, Malden.
Chomsky, Noam (1973): Conditions on Transformations. In: S. Anderson and P. Kiparsky, eds, A Festschrift for Morris Halle. Holt, Rinehart and Winston, New York, pp. 232–286.
Chomsky, Noam (1986): Barriers. MIT Press, Cambridge.
Chomsky, Noam (1995): The Minimalist Program. MIT Press, Cambridge, Massachusetts.
Chomsky, Noam (2000): Minimalist Inquiries: The Framework. In: R. Martin, D. Michaels and J. Uriagereka, eds, Step by Step: Essays on Minimalist Syntax in Honor of Howard Lasnik. MIT Press, Cambridge, Massachusetts, pp. 89–155.
Chomsky, Noam (2001): Derivation by Phase. In: M. Kenstowicz, ed., Ken Hale. A Life in Language. MIT Press, Cambridge, Massachusetts, pp. 1–52.
Chomsky, Noam (2004): Beyond Explanatory Adequacy. In: A. Belletti, ed., Structures and Beyond. The Cartography of Syntactic Structures (vol. 3). Oxford University Press, Oxford, pp. 104–131.
Chomsky, Noam (2007): Approaching UG from Below. In: U. Sauerland & H-M. Gärtner, eds, Interfaces + Recursion = Language? Chomsky’s Minimalism and the View from Syntax-Semantics. Mouton de Gruyter, Berlin, pp. 1–30.
Chomsky, Noam (2008): On Phases. In: R. Freidin, C. P. Otero and M. L. Zubizarreta, eds, Foundational Issues in Linguistic Theory. MIT Press, Cambridge, Massachusetts, pp. 133–166.
Chomsky, Noam and Howard Lasnik (1995): The Theory of Principles and Parameters. In: The Minimalist Program. MIT Press, Cambridge, pp. 13–127.
Chomsky, Noam, Morris Halle and Fred Lukoff (1956): On Accent and Juncture in English. In: M. Halle et al., eds, For Roman Jakobson. Mouton, The Hague, pp. 65–80.
Fox, Daniel (2000): Economy and Semantic Interpretation. MIT Press, Cambridge.
Fukui, Naoki and Margaret Speas (1986): ‘Specifiers and Projection’, MIT Working Papers in Linguistics 8, 51–67.
Gallego, Ángel J. (2007): Phase Theory and Parametric Variation. PhD dissertation, UAB.
Grohmann, Kleanthes (2003): Prolific Domains. John Benjamins, Amsterdam.
Hale, Ken and Samuel J. Keyser (1993): On the Argument Structure and the Lexical Expression of Syntactic Relations. In: K. Hale & S. Keyser, eds, The View from Building 20. MIT Press, Cambridge, pp. 53–109.
Hale, Ken and Samuel J. Keyser (2002): Prolegomenon to a Theory of Argument Structure. MIT Press, Cambridge.
Hiraiwa, Ken (2005): Dimensions of Symmetry in Syntax: Agreement and Clausal Architecture. PhD dissertation, MIT.
Hornstein, Norbert (2001): Move! A Minimalist Theory of Construal. Blackwell, Oxford.
Jackendoff, Ray (1972): Semantic Interpretation in Generative Grammar. MIT Press, Cambridge.
Kitagawa, Yoshihisa (1986): Subjects in Japanese and English. PhD dissertation, UMass, Amherst.
Koopman, Hilda and Dominique Sportiche (1991): ‘The Position of Subjects’, Lingua 85, 211–258.
Kuroda, Shige-Yuki (1988): Whether We Agree or Not: Comparative Syntax of English and Japanese. In: W. J. Poser, ed., Papers from the Second International Workshop on Japanese Syntax. CSLI, Stanford. [also in Linguisticae Investigationes, 12, pp. 1–47].
Lasnik, Howard (1999): Pseudogapping Puzzles. In: S. Lappin & E. Benmamoun, eds, Fragments: Studies in Ellipsis and Gapping. Oxford University Press, New York, pp. 141–74.
Lasnik, Howard (2000): When Can You Save a Structure by Destroying it? In: M. Kim & U. Strauss, eds, Proceedings of NELS 31. GLSA, University of Massachusetts (Amherst), pp. 356–362.
Lasnik, Howard (2005): ‘Review of The Syntax of Silence’, Language 81, 259–265.
Lasnik, Howard (2006): Conceptions of the Cycle. In: L. Cheng & N. Corver, eds, Wh-Movement: Moving On. MIT Press, Cambridge, pp. 197–216.
Mayr, Clemens (2007): Subject-Object Asymmetries and the Relation Between Internal Merge and Pied-Piping. Ms., Harvard University.
Merchant, Jason (2001): The Syntax of Silence: Sluicing, Islands, and the Theory of Ellipsis. Oxford University Press, Oxford.
Merchant, Jason (2002): Swiping in Germanic. In: J.-W. Zwart & W. Abraham, eds, Studies in Comparative Germanic Syntax: Proceedings from the 15th Workshop on Comparative Germanic Syntax. John Benjamins, Amsterdam.
Merchant, Jason (2005): Revisiting Syntactic Identity Conditions. Paper presented at Workshop on Ellipsis. University of California, Berkeley.
Moro, Andrea (2000): Dynamic Antisymmetry. MIT Press, Cambridge.
Ordóñez, Francisco (1998): ‘Post-Verbal Asymmetries in Spanish’, Natural Language and Linguistic Theory 16, 313–346.
Pesetsky, David and Esther Torrego (2007): The Syntax of Valuation and the Interpretability of Features. In: S. Karimi, V. Samiian and W. K. Wilkins, eds, Phrasal and clausal architecture: Syntactic derivation and interpretation. Benjamins, Amsterdam, pp. 262–294.
Potsdam, Eric (2007): ‘Malagasy Sluicing and Its Consequences for the Identity Requirement on Ellipsis’, Natural Language and Linguistic Theory 25, 577–613.
Rizzi, Luigi (2006): On the Form of Chains: Criterial Positions and ECP Effects. In: L. Cheng & N. Corver, eds, Wh-Movement: Moving on. MIT Press, Cambridge, pp. 97–133.
Sprouse, Jon (2006): Antecedent Contained Deletion and Movement Reconsidered. In: C. Davis et al., eds, Proceedings of NELS 36. GLSA Publications, Amherst.
Svenonius, Peter (2004): On the Edge. In: D. Adger et al., eds, Peripheries: Syntactic Edges and their Effects. Kluwer, Dordrecht, pp. 259–287.
Uriagereka, Juan (1999): Multiple Spell-Out. In: S. D. Epstein and N. Hornstein, eds, Working Minimalism. MIT Press, Cambridge, Massachusetts, pp. 251–282.
Uriagereka, Juan (2008): Syntactic Anchors. On Semantic Structuring. Cambridge University Press, Cambridge.
Yoshida, Masaya (2006): Sometimes Smaller Is Better: Sluicing, Gapping and Semantic Identity. In: C. Davis et al., eds, Proceedings of NELS 36. GLSA Publications, Amherst.
Yoshida, Masaya (2010): ‘“Antecedent Contained” Sluicing’, Linguistic Inquiry 41, 348–356.

(Yoshida) Department of Linguistics, Northwestern University
(Gallego) Departament de Filologia Espanyola, Universitat Autònoma de Barcelona

Chiyo Nishida

Restructuring and Clitic Climbing in Romance: A Categorial Grammar Analysis*

Abstract

This paper proposes an analysis of restructuring and clitic climbing in Romance using a version of categorial grammar that includes a type-changing rule, Division. The paper primarily examines Spanish data, but the analysis can be extended to other Romance languages. A minimalist mono-clausal account of Italian data by Cinque (2006) analyzes restructuring verbs as heads of various functional projections. Cinque’s proposal and its variants, however, are challenged by Spanish data, where object control verbs like permitir ‘to permit’, enseñar ‘to teach’, etc. allow clitic climbing. We first establish that these are indeed instances of restructuring, against Cinque’s claim to the contrary, and propose a uniform treatment of all the data involved. Our categorial grammar analysis consists of applying Division to both the clitic and the restructuring verb, expanding their combinatorial properties. As a result, the clitic can directly combine with the restructuring verb, with which the embedded verb then merges, yielding the surface structure [[CLITIC+V1]+V2]. Nonetheless, Division preserves the thematic properties of the clitic and the restructuring verb; therefore, the clitic is properly linked to V2. One advantage of our alternative analysis is that it offers a straightforward account of the coordination of two conjuncts each comprising a clitic and a restructuring verb, for which the minimalist account has no simple solution.

1. Introduction and background

This paper proposes a local treatment of restructuring (RS hereafter) and clitic climbing (CC hereafter) in Romance using the Generalized Categorial Grammar based on the Lambek Calculus (Moortgat (1988)), which includes a type-changing rule, Division, and polymorphic categories, and argues that this treatment has empirical advantages over the comparable minimalist analysis proposed by Cinque (2000; 2004/2006). This paper primarily deals with Spanish data; however, the analysis proposed here can easily be extended to other Romance languages that allow RS and CC.

In most Romance languages, certain verbs taking a non-finite clausal complement optionally allow the pronominal clitic that is thematically associated with the embedded verb to appear phonologically attached to them, as shown in the following examples from Spanish.

(1) a. José quiere leerlo.1
       Joe wants to.read.CL-ACC.3SG.MASC
    b. José lo quiere leer.
       ‘Joe wants to read it.’ (both)2
(2) a. José siguió mirándome.
       Joe kept-3SG watching.CL-1SG
    b. José me siguió mirando.
       ‘Joe kept watching me.’ (both)

* I would like to thank Cinzia Russi for her help with the Italian data and Elisenda Grigsby with the editing of the manuscript. Thanks are also due to Fred Hoyt and Jason Baldridge for calling my attention to their paper on a similar topic and offering discussions. Any errors are mine.

Local Modelling of Non-Local Dependencies in Syntax, 371-400. Artemis Alexiadou, Tibor Kiss & Gereon Müller (eds.). Linguistische Arbeiten 547, de Gruyter 2012

Clitic climbing gives evidence that sentences like (1-b) and (2-b) have a mono-clausal structure, where the matrix verb and the embedded verb form a complex predicate. On the basis of clitic climbing and some other related facts, Rizzi (1978/1982) argues for a syntactic operation called RESTRUCTURING, which collapses a bi-clausal structure into a mono-clausal one.3 As a result, the pronominal clitic thematically associated with the embedded verb becomes a clause mate of the matrix verb and is phonologically attached to it, yielding strings like (1-b) and (2-b). The restructuring operation is supposed to apply optionally, and, according to Rizzi, its application is lexically induced. Strozer (1976), Rivas (1977), Contreras (1979), and Perlmutter and Aissen (1976) (also see Aissen and Perlmutter (1983)) independently propose a similar mechanism for Spanish.

Since then several mono-clausal analyses of RS and CC have been proposed within the generative framework for Italian, for the theory can no longer sustain the restructuring operation. One such proposal comes from Thomas Rosen (1990), who postulates that querer ‘to want’, for instance, may be used as a full-fledged verb or as a “light verb” (Napoli (1981)), which needs to borrow arguments from the embedded verb. Roberts (1997) proposes a similar analysis treating RS verbs as “auxiliary verbs”.4 Most recently, Cinque (ibid.), working within the minimalist framework, goes one step further and proposes that RS verbs are heads of various functional projections, which form the universal template shown in (3).

1 The clitic is enclitic when the host verb is in a non-finite (infinitive or gerundive) form; otherwise it is proclitic. Orthographically, the clitic is written as an independent word when it is proclitic; otherwise it is written as part of the verb.
2 Semantically, the two variants are commonly thought to be equivalent. See Napoli (1981) for a discussion as to how they may differ.
3 Other constructions that allow RS include long object preposing (LOP) and auxiliary selection.
4 Also see Burzio (1981; 1986); Goodall (1987); Di Sciullo and Williams (1987); Sadock (1991) for other types of analyses. Monachesi (1999) proposes a lexical analysis of Italian clitics using the HPSG framework.

(3) Functional Projections (Cinque (2004/2006, 12)):
    MoodPspeech act > MoodPevaluative > ModPepistemic > TPpast > TPfuture > MoodPirrealis > AspPhabitual > AspPrepetitive(I) > AspPfrequentative > ModPvolitional > AspPcelerative(I) > TPanterior > AspPterminative > AspPcontinuative > AspPretrospective > AspPproximative > AspPdurative > AspPgeneric/progressive > AspPprospective > ModPobligation > ModPpermission/ability > AspPcompletive > VoiceP > AspPcelerative(II) > AspPrepetitive(II) > AspPfrequentative(II)

Furthermore, Cinque (2004/2006) postulates that even when there is no evidence of CC, the RS verb is invariably generated as the head of a functional projection; thus, (1-a) and (1-b) both have exactly the same mono-clausal underlying structure in his analysis. The clitic may optionally move from the embedded verb to adjoin to any of the RS-based functional heads, i.e., ModP and AspP, yielding strings like (1-b) and (2-b).

Cinque’s analysis of RS verbs as functional heads is problematic for two reasons.5 First, he claims that RS verbs, in their roles as functional heads, are raising verbs like seem (sembrare in Italian or parecer in Spanish) and do not assign a θ-role to their subject. Many RS verbs, however, are what are commonly analyzed as subject control verbs like querer ‘to want’, tratar ‘to try’, etc., which, unlike purely modal verbs like poder ‘can’ and deber ‘must’, do assign a θ-role to their subject. The more serious problem with Cinque’s analysis, however, is that in Spanish, RS/CC is evidenced, as discussed in Luján (1980), Contreras (1979), and Suñer (1980), with some object control verbs, which include directive verbs like permitir ‘to permit’, ordenar ‘to order’, prohibir ‘to prohibit’, mandar ‘to command’, etc., as illustrated in (4-b), and enseñar ‘to teach’, as shown in (5-b).

(4) a. José me permitió leerlo.
       Joe me permitted to.read=it
    b. José me lo permitió leer.
       ‘Joe permitted me to read it.’ (both)
(5) a. José me enseñó a leerlo.
       Joe me taught to read=it
    b. José me lo enseñó a leer.
       ‘Joe taught me how to read it.’ (both)

5 Also see Laca (2004) for the inadequacy of Cinque’s proposal from a semantic point of view.

Obviously, object control verbs cannot be reduced to the heads of some functional projections, which poses a serious challenge to Cinque’s proposal. The mono-clausal analyses that treat RS verbs as light verbs or auxiliary verbs also face the same problem. Kayne (1989) and Cinque (ibid.), the latter following the former, argue that strings in (4) and (5) are “hidden instances of the causative construction” (2006, 24), suggesting that these strings need to be handled independently from those like (1) and (2), on a par with strings like (6).

(6) a. José me hizo leerlo.
       Joe me made to.read=it
    b. José me lo hizo leer.
       ‘Joe made me read it.’ (both)

Indeed, the strings in (4) are superficially similar to the causative sentences in (6). Nonetheless, the empirical data indicate, as shall be seen in Section 2, that object control verbs like permitir, enseñar, etc. behave differently from the causative verb hacer, and that (4-b) and (5-b) are indeed instances of restructuring, like (1-b) and (2-b). Although Rizzi does not discuss object control RS verbs, the bi-clausal analysis using the restructuring operation should be able to account for strings like (4-b) and (5-b) in the same fashion as (1-b) and (2-b), since this operation amalgamates the complements of both the matrix and the embedded verb as it reduces two clauses into one. Ironically, the subsequently proposed mono-clausal analyses, although they have done away with the theoretically unfavorable operation, have retained empirical inadequacies.

In this paper we propose an analysis of RS and CC that can account for all RS/CC strings in a uniform fashion, adopting a generalized version of Categorial Grammar (CG, henceforth), as explored in Moortgat (1988). Our analysis allows the clitic/s in RS/CC strings like (1-b) or (4-b) to first combine with the RS verb and then with the embedded verb, as in [[lo=quiero] leer] or [[me=lo=permitieron] leer], respectively, while thematically linking each clitic with the appropriate verb. We show that the CG analysis can not only handle the relevant data but also offers a straightforward account of certain coordination facts involving RS and CC, for which the analyses of Cinque and his predecessors mentioned above do not yield an easy solution.

The organization of this paper is as follows. Section 2 defines the scope of the data dealt with in this paper. It also establishes that strings like (4-b) and (5-b) are instances of restructuring and need to be accounted for in the same way as strings like (1-b) and (2-b). Section 3 provides a brief description of the theoretical framework used in this paper. Section 4 presents our CG analysis of RS and CC and discusses its empirical advantages over the comparable minimalist analysis and alternative CG analyses. Section 5 summarizes the key points.

Restructuring and Clitic Climbing in Romance: A Categorial Grammar Analysis


2. Data

2.1. Basic data on RS/CC

Not all verbs that take a non-finite clausal complement allow RS, since CC out of a non-finite clause is not allowed in some cases, as shown in (7) below.

(7) a. José espera leerlo.
       Joe hopes/expects to.read=it
       ‘Joe hopes to read it.’
    b. *José lo espera leer.

Verbs that allow RS/CC are commonly defined as motion, modal, and aspectual verbs (Rizzi ibid.). Although it is true that many of the RS verbs resemble the raising verb sembrare/parecer ‘to seem’ in that they do not assign a θ-role to their subject, as Cinque claims, not all RS verbs share this property. Subject control verbs are one such case, and object control verbs are another, as discussed in 2.2 below. In essence, the RS/CC verbs do not constitute a completely homogeneous semantic or syntactic class. We postulate that RS verbs are lexically marked (cf. Rizzi (1978/1982), Thomas Rosen (1990); Monachesi (1999), inter alia). The fact that the list of RS verbs varies from speaker to speaker also supports this position.

There are cases in which RS/CC appears to occur with a clause headed by a preposition, as shown in (8) and (9).

(8) a. José trató de leerlo.
    b. José lo trató de leer.
       ‘Joe tried to read it.’ (both)

(9) a. José empezó a leerlo.
    b. José lo empezó a leer.
       ‘Joe began reading it.’ (both)

Following Luján (1982), we assume that de and a, as used in these strings, are not real prepositions but rather functional elements which show up only when certain verbs embed an infinitival clause. RS may involve not only two clauses, as shown in (1-b) above or (10-b) below, but also three or four, as shown in (10-c) and (10-d), respectively, allowing the clitic thematically linked with the most deeply embedded verb to show up with each of the RS verbs.

(10) a. José quiere tratar de empezar a leerlo.
     b. José quiere tratar de empezarlo a leer.
     c. José quiere tratarlo de empezar a leer.
     d. José lo quiere tratar de empezar a leer.
        ‘José wants to try to begin reading it.’ (all)


Chiyo Nishida

2.2. Object control restructuring verbs

As mentioned above, the empirical adequacy of Cinque’s analysis hinges critically upon whether verbs like permitir and enseñar, as in (4-b) and (5-b), are RS verbs or not. This is not an issue specific to Spanish data; Italian also allows CC with one object control verb, insegnare ‘to teach’, as shown in (11) below, taken from Cinque’s own example (ex. 46: 24).

(11) a. Gli ho insegnato a farlo io.
        him have taught to do=it I
     b. Gliel’ho insegnato a fare io.
        ‘I taught him (how) to do it.’ (both)

Do (4-b), (5-b), and (11-b) indeed illustrate “hidden instances of the causative construction,” as Kayne conjectures and Cinque claims?6 The dative object of the Romance causative construction representing the causee is commonly analyzed as the underlying external argument of the embedded clause, and not as an argument semantically related to the causative verb itself (Burzio (1981; 1986); Kayne (1989); Zubizarreta (1985); Zagona (1988); Rosen (1990); Moore (1998); inter alia).7 Cinque tries to support his claim by demonstrating that the dative occurring with insegnare, for instance, behaves differently from the ordinary dative, but similarly to the one occurring with the causative verb fare. He shows that the dative argument of a ditransitive verb meaning ‘to give’ can be cliticized as si if it is reflexive or reciprocal, as shown in (12-b). Note that this does not hold for the dative occurring with fare or, importantly, with insegnare, as shown in (13-b) and (14-b), respectively. (Cinque’s data are slightly modified to show the contrast.)

(12) a. Gianni e Mario regalarono un disco a Carlo/l’uno all’altro.
        Gianni and Mario gave a disk to Carlo/to each other
        ‘Gianni and Mario gave a disk to Carlo/to each other.’
     b. Gianni e Mario si regalarono un disco.
        Gianni and Mario si gave a disk
        ‘Gianni and Mario gave themselves/each other a disk.’

(13) a. Gianni e Mario fecero imparare la procedura a Carlo/l’uno all’altro.
        Gianni and Mario had learn the procedure to Carlo/to each other
        ‘Gianni and Mario had Carlo/each other learn the procedure.’
     b. Gianni e Mario *si fecero imparare la procedura.

(14) a. Gianni e Mario insegnarono la procedura a Carlo/l’uno all’altro.
        Gianni and Mario taught the procedure to Carlo/to each other
        ‘Gianni and Mario taught Carlo/each other the procedure.’
     b. Gianni e Mario *?si insegnarono la procedura.
        Gianni and Mario *?si taught the procedure 8

6 Cinque explains that ‘to teach someone (how) to do something’ is semantically causative because it can be decomposed into ‘to make someone learn to do something’.
7 Bordelois (1988), however, analyzes the causative construction as the object control construction.

Cinque’s argument, however, cannot be sustained for Spanish, because neither permitir nor enseñar patterns with hacer with respect to dative-related se-cliticization. First, just as in Italian, a ditransitive verb can occur with the dative-related reflexive/reciprocal clitic se, as shown in (15), whereas the causative verb hacer cannot, as shown in (16).

(15) Juan y José se regalaron un disco. (ditransitive)
     Juan and Joe se gave a disk
     ‘John and Joe gave a disk to themselves/to each other.’

(16) Juan y José *se hicieron aprender el proceso. (causative with hacer)
     Juan and Joe se-3 made-3PL to.learn the procedure
     (Intended: ‘John and Joe made themselves/each other learn the procedure.’)

However, unlike in Italian, the dative-related reflexive/reciprocal clitic se may occur with enseñar, as shown in (17) and (18), and permitir, as shown in (19).9,10

(17) “. . . olvidas que yo me enseñé a mí a hacerte gozar.”
     . . . you.forget that I se-1SG taught to myself to have.you enjoy
     ‘. . . you forget that I taught myself how to have you enjoy.’

(18) Juan y José se enseñaron el uno al otro a hacer los trucos.
     John and Joe se-3 taught-3PL the one to.the other to do the tricks
     ‘John and Joe taught each other how to do the tricks.’

8 My two Italian native speakers commented that this sentence is odd because, without the disambiguating phrase l’uno all’altro ‘each other’, the sentence tends to be interpreted as reflexive, and the act of teaching oneself is odd in Italian. However, both accepted that si can be used with insegnare if the intended meaning is reciprocal. This fact considerably weakens Cinque’s argument.
9 Note that in (17) and (19), the clitic se changes its form in agreement with the person/number value of the subject.
10 Examples in (17) and (19) are taken from the Spanish Royal Academy’s Corpus de Referencia del Español Actual (CREA).


(19) ¿Cómo te permitiste, (por ayudar a un lunático), correr un riesgo de ese tamaño?
     How se-2SG permitted-2SG, (for helping a lunatic), to.run a risk of that size
     ‘How did you permit yourself, (for helping a lunatic), to run a risk of that size?’

In view of the data in (15)–(19), the dative of permitir or enseñar in (4) or (5), respectively, is no different from the dative of the ditransitive verb regalar, but behaves differently from the one occurring with the causative verb hacer. There are several other pieces of evidence showing that strings like (4) and (5), containing permitir and enseñar, respectively, as the matrix verb, are different from the purely causative construction. First and most importantly, for the causative verb, the causee may be case-marked as dative, as shown in (6), or accusative, as shown in (20), depending on the transitivity of the embedded clause.11

(20) José lo hizo caminar todo el día.
     José CL-ACC.3SG.MASC made to.walk all the day
     ‘José made him walk all day long.’

With permitir and enseñar, on the other hand, no such variation is evidenced in any variety of Spanish. The object is uniformly dative and never accusative, regardless of the transitivity of the embedded clause, as shown in (21) and (22).

(21) José le/*lo permitió caminar todo el día.
     Joe him/her-DAT/*him-ACC permitted to.walk all day long

(22) José le/*lo enseñó a tocar el piano.
     Joe him/her-DAT/*him-ACC taught to.play the piano

Second, when permitir takes a finite clause, the dative argument can commonly remain, controlling the subject of the finite embedded clause, as shown in (23).

(23) José le_i permitió que pro_i leyera el libro.
     Joe him/her-DAT permitted that pro read-SUBJ.IMP the book
     ‘Joe permitted him/her to read the book.’

However, with hacer, strings like (24) below, where the dative remains and controls the subject of the finite embedded clause, are judged either ill-formed or extremely marginal by native speakers.12

(24) *?José le_i hizo que pro_i leyera el libro.
     Joe him/her-DAT made that pro read-SUBJ.IMP the book

Moore (1998) uses some constituency tests to demonstrate that verbs like permitir cannot be assimilated to the causative verb. One such test is the formation of cleft sentences, as shown in (25) below.

(25) a. Lo que me permitió/ordenó/mandó fue [barrer la vereda].
        ‘What s/he permitted/ordered/commanded me was [to sweep the sidewalk].’
     b. *Lo que me hizo/dejó fue [barrer la vereda].
        ‘What s/he made/let me was [to sweep the sidewalk].’
        (Moore’s examples (31-a,b))

The embedded non-finite clause of enseñar can also be clefted, as shown in (26).

(26) Lo que me enseñó fue [hacer los trucos].
     ‘What s/he taught me was to do the tricks.’

Based on the data in (25) and some others, Moore argues that for permitir the dative is the object of this verb, which controls the subject of the embedded clause, whereas for hacer the dative is (underlyingly) the subject of the embedded clause.

In sum, the data presented in this section provide ample evidence that verbs like permitir and enseñar are indeed object control verbs, and that strings like (4-b), (5-b), and (11-b) are instances of RS and CC. In Section 4, we propose an analysis that provides a uniform account of strings like (1-b) and (2-b) and of those like (4-b), (5-b), and (11-b). In this paper, however, we will not deal with the causative construction, as in (6) and (20), or the construction involving perception verbs, as in (27), which is assumed to have an underlying structure similar to that of the causative construction.

(27) José me lo vio romper.
     Joe me it saw break
     ‘Joe saw me break it.’

11 This is the prescriptive rule given by the Spanish Royal Academy (Real Academia Española). In spoken Spanish, this rule may not be strictly observed, and there may be other factors governing the case alternation. However, this fact does not weaken our argument because permitir, enseñar, etc. do not allow case alternation in any variety of Spanish.
12 My Spanish native speaker consultants were from Costa Rica (1), Mexico (1), and Central Spain (2).


3. Theoretical framework: Generalized Categorial Grammar

3.1. Categorial lexicon: assigning expressions to syntactic categories

Categorial Grammar has two components: a) the Categorial Lexicon, where expressions are assigned to syntactic categories (CAT hereafter) according to their lexical properties, and b) a set of reduction rules. Linguistic expressions are assigned to atomic or complex categories. (28) shows some of the atomic categories from Spanish.

(28) Atomic CAT: Some examples
     S:  El niño está muy contento.  ‘The boy is very happy.’
     N:  niño                        ‘boy’
     NP: el niño                     ‘the boy’
     AP (predicative): muy contento  ‘very happy’

Complex categories are formed on the basis of atomic categories and connectives. In the first place, we have the so-called FUNCTOR categories, which contain the connectives /, \, and |, as shown in (29) below.

(29) Complex (Functor) CAT:
     a. right-looking   X/Y
     b. left-looking    Y\X 13
     c. bi-directional  X|Y
     where X and Y can be basic or complex CAT.

These categories are considered as functions from CAT Y (domain) to CAT X (range). Linguistically, X/Y, Y\X, and X|Y mean that expressions that belong to these categories can combine with expressions of CAT Y to their right (/), to their left (\), or in either direction (|), respectively, to yield expressions of CAT X. Since X and Y can be atomic or complex categories, given atomic categories A, B, C, and D, X/Y, for instance, can be A/B, (C\A)/B, (C\A)/(B/D), etc. Besides complex categories, we also use PRODUCT categories, as shown in (30) below.

(30) Product CAT: (A • B) where A and B are atomic categories.

An expression assigned to CAT (A • B) is a concatenation of an expression of CAT A and an expression of CAT B. CAT (C\D)/(A • B), therefore, is equivalent to CAT ((C\D)/B)/A.

Complement-taking verbs are good examples of complex categories. Transitive verbs like lee ‘reads’ and leer ‘to read’, for instance, belong to the complex categories, as shown in (31-a) and (31-b), respectively. The thematic correspondence of a complex category is provided using the lambda notation.

(31)    Linguistic Expression   Syntactic CAT          Thematic Representation
     a. lee ‘reads’             (NPSub[3sg]\S)/NPDO    λxλy[lee’(x)(y)]
     b. leer ‘to.read’          (NPSub\Sinf)/NPDO      λxλy[leer’(x)(y)]

13 We would like to alert the reader that the practitioners of Combinatory Categorial Grammar (cf. Steedman (2000)) consistently place the domain category on the right-hand side of the connective. Thus, the left-division functor CAT is represented as X\Y instead of Y\X.

The verb lee or leer takes the NPDO to the right, forming an expression whose combination with a subject NP to the left would yield a finite S or an infinitival S, respectively; for lee, the subject NP must be [3SG], whereas for leer, the subject’s person/number value is unspecified.14 Ditransitive verbs like da ‘gives’ and dar ‘to give’ take both the DO and the IO besides the subject to form an S. Here we use a product category (NPDO • a NPIO)15 to represent their complement structure, as shown in (32) below.

(32)    Linguistic Expression   Syntactic CAT                        Thematic Representation
     a. da ‘gives’              (NPSub[3sg]\S)/(NPDO • a NPIO)       λxλyλz[da’(x)(y)(z)]
     b. dar ‘to.give’           (NPSub\Sinf)/(NPDO • a NPIO)         λxλyλz[dar’(x)(y)(z)]

Note that the syntactic category CAT (NPSub\S)/(NPDO • a NPIO) is equivalent to CAT ((NPSub\S)/a NPIO)/NPDO. Quiere ‘wants’, as used as a subject control verb, is assigned to the following category.

(33) quiere ‘wants’  (NPSub[3sg]\S)/(NPSub\Sinf):16  λPλy[quiere’(P(ana’ y))(y)]

Quiere combines with an expression of CAT(NPSub \Sinf ), i.e., infinitival VP, to the right, forming an expression whose combination with a subject NP[3sg] to the left yields a finite S. Following Steedman (2000), we represent an argument that is controlled by y as (ana y), an analogue to the controlled PRO. Here y is the subject of the matrix verb. Likewise, object control verbs like permite ‘permits’ belong to the CAT shown in (34).
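The category notation introduced in (29)–(33) can be made concrete with a small sketch. The Python below is only an illustration under stated assumptions: the class and function names (`Atom`, `Functor`, `Product`, `rslash`, `lslash`, `curry_product`, `apply_right`, `apply_left`) are hypothetical, the person/number features such as [3sg] are ignored, and the bi-directional connective | is left out. It shows functor categories behaving as functions from a domain CAT to a range CAT, and the stated equivalence between X/(A • B) and (X/B)/A.

```python
from dataclasses import dataclass
from typing import Optional, Union


@dataclass(frozen=True)
class Atom:
    name: str                      # e.g. "NPSub", "S", "Sinf"


@dataclass(frozen=True)
class Functor:
    result: "Cat"                  # range X
    arg: "Cat"                     # domain Y
    direction: str                 # "/" seeks Y to the right, "\\" to the left


@dataclass(frozen=True)
class Product:
    left: "Cat"                    # A in (A • B)
    right: "Cat"                   # B in (A • B)


Cat = Union[Atom, Functor, Product]


def rslash(x: Cat, y: Cat) -> Functor:
    """Build the right-looking category X/Y."""
    return Functor(x, y, "/")


def lslash(y: Cat, x: Cat) -> Functor:
    """Build the left-looking category Y\\X (domain written first, as in the text)."""
    return Functor(x, y, "\\")


def curry_product(cat: Cat) -> Cat:
    """Rewrite X/(A • B) as (X/B)/A; any other category is returned unchanged."""
    if isinstance(cat, Functor) and cat.direction == "/" and isinstance(cat.arg, Product):
        return rslash(rslash(cat.result, cat.arg.right), cat.arg.left)
    return cat


def apply_right(functor: Cat, argument: Cat) -> Optional[Cat]:
    """X/Y combined with a Y on its right yields X; otherwise None."""
    if isinstance(functor, Functor) and functor.direction == "/" and functor.arg == argument:
        return functor.result
    return None


def apply_left(argument: Cat, functor: Cat) -> Optional[Cat]:
    """A Y combined with Y\\X on its right yields X; otherwise None."""
    if isinstance(functor, Functor) and functor.direction == "\\" and functor.arg == argument:
        return functor.result
    return None


# Atomic categories (features such as [3sg] are deliberately omitted).
NP_SUB, NP_DO, A_NP_IO = Atom("NPSub"), Atom("NPDO"), Atom("aNPIO")
S, S_INF = Atom("S"), Atom("Sinf")

VP_FIN = lslash(NP_SUB, S)         # NPSub\S
VP_INF = lslash(NP_SUB, S_INF)     # NPSub\Sinf

leer = rslash(VP_INF, NP_DO)                       # (NPSub\Sinf)/NPDO, cf. (31-b)
dar = rslash(VP_INF, Product(NP_DO, A_NP_IO))      # (NPSub\Sinf)/(NPDO • a NPIO), cf. (32-b)
quiere = rslash(VP_FIN, VP_INF)                    # (NPSub\S)/(NPSub\Sinf), cf. (33)

infinitival_vp = apply_right(leer, NP_DO)          # leer + direct object
finite_vp = apply_right(quiere, infinitival_vp)    # quiere + infinitival VP
sentence = apply_left(NP_SUB, finite_vp)           # subject + finite VP
```

In this sketch `sentence` comes out as the atomic category S, and `curry_product(dar)` yields ((NPSub\Sinf)/a NPIO)/NPDO, mirroring the equivalence noted after (32).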

14 Spanish is a null-subject language; however, in this paper, we will not deal with how the null subject can be handled within Categorial Grammar, since it is not critical for the purpose of this paper.
15 The IO in Spanish is always accompanied by the particle a. There is no consensus among linguists on whether the IO is an NP or PP. In this paper, we will simply represent it this way since it is not critical for our purposes.
16 Hereafter, we use a simple S for finite clauses.


(34) permite ‘permits’  ((NPSub[3sg]\S)/a NPIO)/(NPSub\Sinf):  λPλxλy[permite’(P(ana’ x))(x)(y)]

Permite first takes an infinitival VP to the right, then an IO to the right, and finally a 3rd person singular subject to the left to yield an S. Note in the thematic representation that the controlled subject of the embedded verb here, (ana’ x), is anaphoric to the IO of permite.

To what categories do clitics belong? Following Miller (1992) and Miller and Sag (1996), we take the strong lexicalist approach, defining clitics as verb affixes which need to combine with a verb in the lexicon. The most critical empirical evidence in favor of this position comes from the following fact: a clitic, unlike a word, cannot take wide scope in a coordinate structure, as shown in (35-a) below, but needs to be attached to each of the two coordinated verbs, as shown in (35-b). See Miller (1992) and Miller and Sag (1996) for further arguments.

(35) a. José *lo [compró y leyó].
        Joe it [bought and read]
     b. José lo compró y lo leyó.
        Joe it bought and it read
        ‘Joe bought and read it.’

We assume that an ACC clitic or a DAT clitic belongs to a functor category that takes a verb needing a DO or an IO, respectively, and possibly another complement, and partially instantiates the DO or IO, respectively, by specifying some morphological features for these complements. In other words, the ACC clitic lo and the DAT clitic le, for instance, belong to the type-raised, polymorphic categories shown in (36) and (37), respectively.17 Note that clitics belong to bi-directional categories in Spanish because they can be proclitic (right-division) or enclitic (left-division), depending on whether their host verb is finite or non-finite, respectively.

(36) ACC clitics
     lo [ACC.3SG.MASC]  CAT ((NPsub[β]\S[αfin])/$)|(((NPsub[β]\S[αfin])/$)/NPDO):  λP[P(lo)]
     where
     a. α is + or −;
     b. if α is +, X|Y is to be interpreted as X/Y and β contains a specified person/number value; otherwise X|Y is to be interpreted as Y\X and β contains no specified person/number value;
     c. [−fin] can be infinitival or gerundive;
     d. $ is a complement type that can co-occur with NPDO, which includes a NPIO, PP, NPsub\Sinf (= infinitival VP), CP, and ∅.

(37) DAT clitics
     le [DAT.3SG]  CAT ((NPsub[β]\S[αfin])/&)|(((NPsub[β]\S[αfin])/a NPIO)/&):  λP[P(le)]
     where
     a., b., c. same as for ACC clitics;
     d. & is a complement type that can co-occur with an a NPIO, which includes NPDO, NPsub\Sinf (= infinitival VP), CP, and ∅.

17 A polymorphic CAT contains variables like $ or &, as in (36) and (37).

Details of how clitic doubling is handled are beyond the scope of this paper; however, we assume that, when a clitic-doubled NP is incorporated into a string, its referential and semantic properties are unified with those of the clitic, and together they fully instantiate the DO or IO.

3.2. Reduction rules

The second component of Categorial Grammar comprises two kinds of reduction rules: a) binary rules, which combine two expressions to form a new expression, and b) unary rules, which change the category assigned to a set of expressions. The first binary rule, Functional Application, has two versions, as shown in (38).

(38) Functional Application (abbreviated as FA)
     a. Forward Application (>FA):   X/Y : f   Y : a   →   X : f(a)
     b. Backward Application (<FA):  Y : a   Y\X : f   →   X : f(a)

>FC

Both alternatives, D+FA and FC, are able to account for RS/CC; however, as we shall discuss in Section 4.3.2 below, we prefer the first alternative for its empirical advantage.
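That the two alternatives, D+FA and FC, arrive at the same thematic interpretation can be checked with a minimal sketch of the semantic side only. The Python below makes several assumptions that are not part of the original text: thematic terms are encoded as nested tuples, the helper names `divide_sem` and `compose` are hypothetical, and the meaning of Division is taken to be λf.λg.λz[f(g(z))], which is the pattern visible in the D steps of the derivations in Section 4 (for instance λP[P(lo)] → λRλS[(R(S))(lo)]).

```python
def divide_sem(f):
    """Assumed semantics of Division (D): f becomes λg.λz[f(g(z))]."""
    return lambda g: lambda z: f(g(z))


def compose(f, g):
    """Semantics of forward Functional Composition (FC): λx[f(g(x))]."""
    return lambda x: f(g(x))


# Lexical meanings, following (31-b), (33) and (36); tuples stand in for the
# primed constants, so ("leer", x, y) encodes leer'(x)(y).
leer = lambda x: lambda y: ("leer", x, y)
quiere = lambda P: lambda y: ("quiere", P(("ana", y)), y)   # subject control
lo = lambda P: P("lo")                                      # type-raised clitic

# Route 1 (D + FA): the clitic combines with the RS verb first, then with leer.
lo_quiere = divide_sem(lo)(divide_sem(quiere))   # λS.λy[quiere'((S(lo))(ana' y))(y)]
route_division = lo_quiere(leer)("jose")

# Route 2 (FC): the two verbs form a complex verb first, then the clitic applies.
quiere_leer = compose(quiere, leer)              # λx.λy[quiere'(leer'(x)(ana' y))(y)]
route_composition = lo(quiere_leer)("jose")
```

Under these assumptions both routes reduce to the same term, quiere’(leer’(lo)(ana’ jose’))(jose’), so the choice between them has to be made on empirical grounds, as the text goes on to argue, and not on the basis of the interpretation they deliver.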

4. Analysis: restructuring and clitic climbing

4.1. RS/CC with subject control, modal, aspectual, and motion verbs

We assume that non-RS/CC strings like (1-a) and their RS/CC counterparts like (1-b) are formed differently. (1-a) is constructed by successive FAs, analogous to a derivation in phrase structure grammars. First, the clitic combines with its host verb and is phonologically affixed to it in the lexicon, as shown in (42-a), as the variable $ takes the value ∅. The result, shown under the line, is an expression that needs a subject NP to form an infinitival S. The thematic correspondence to this syntactic process is provided in (42-b), where the clitic lo ‘it’ is properly interpreted as the internal argument of the verb leer.

(42) Cliticization:
     a. leer ‘to.read’ (NPSub\Sinf)/NPDO

18

=

lo ‘it’ (((NPSub\Sinf)/$)/NPDO)\((NPSub\Sinf)/$) FA (& = ∅, Z = NPDO, α = [3sg], W = (NPSub\Sinf)/NPDO)

D1


→ λy[quiere’(leer’(lo)(ana’ y))(y)]

FA to form a complex verb [empezar a leer]. Likewise, tratar ‘to try’ undergoes a category change by D and combines with this complex verb to form another complex verb, [tratar de empezar a leer]. As shown in (51) below, the previously formed string, lo=quiere ‘it=wants’, combines with this complex verb by >FA, and finally the subject is incorporated into the sentence, yielding (10-d).

(51) a. José: NPSub[3sg]
        lo=quiere: (NPSub[3sg]\S)/((NPSub\Sinf)/NPDO)
        tratar de empezar a leer: (NPsub\Sinf)/NPDO
        >FA (lo=quiere + tratar de empezar a leer): NPSub[3sg]\S
        <FA (José + lo=quiere tratar de empezar a leer): S
     b. >FA: λPλy[quiere’((P(lo))(ana’ y))(y)](λwλz[tratar’(empezar’((leer’(w))(ana’ z))(ana’ z))(z)])
        → λy[quiere’((λwλz[tratar’(empezar’((leer’(w))(ana’ z))(ana’ z))(z)](lo))(ana’ y))(y)]
        → λy[quiere’(λz[tratar’(empezar’((leer’(lo))(ana’ z))(ana’ z))(z)](ana’ y))(y)] (where z = y)
        → λy[quiere’(tratar’(empezar’((leer’(lo))(ana’ y))(ana’ y))(ana’ y))(y)]

(50) a. tratar ‘to try’: (NPsub\Sinf)/(NPsub\de Sinf)
        de empezar ‘to begin’: (NPsub\de Sinf)/(NPsub\a Sinf)
        a leer ‘to read’: (NPsub\a Sinf)/NPDO
        D1 (de empezar): ((NPsub\de Sinf)/Z)/((NPsub\a Sinf)/Z)
        >FA1 (Z = NPDO): (NPsub\de Sinf)/NPDO
        D2 (tratar): ((NPsub\Sinf)/W)/((NPsub\de Sinf)/W)
        >FA2 (W = NPDO): (NPsub\Sinf)/NPDO
     b. >FA1: λQλxλy[empezar’((Q(x))(ana’ y))(y)](λwλz[(leer’(w))(z)])
        → λxλy[empezar’((λwλz[(leer’(w))(z)](x))(ana’ y))(y)]
        → λxλy[empezar’(λz[(leer’(x))(z)](ana’ y))(y)]
        → λxλy[empezar’((leer’(x))(ana’ y))(y)]
        D2: λRλx[tratar’(R(ana’ x))(x)] → λTλwλz[tratar’((T(w))(ana’ z))(z)]
        >FA2: λTλwλz[tratar’((T(w))(ana’ z))(z)](λxλy[empezar’((leer’(x))(ana’ y))(y)])
        → λwλz[tratar’((λxλy[empezar’((leer’(x))(ana’ y))(y)](w))(ana’ z))(z)]
        → λwλz[tratar’(λy[empezar’((leer’(w))(ana’ y))(y)](ana’ z))(z)] (where y = z)
        → λwλz[tratar’(empezar’((leer’(w))(ana’ z))(ana’ z))(z)]

(52) DAT=ACC Clitic-cluster formation
     a. me ‘me’: ((NPsub[α]\S[+fin])/&)/(((NPsub[α]\S[+fin])/a NPIO)/&)
        lo ‘it’: ((NPsub[β]\S[+fin])/$)/(((NPsub[β]\S[+fin])/$)/NPDO)
        D (me): (((NPsub[α]\S[+fin])/&)/Z)/((((NPsub[α]\S[+fin])/a NPIO)/&)/Z)
        >FA ($ = a NPIO, β = α, & = ∅, Z = ((NPsub[α]\S[+fin])/a NPIO)/NPDO):
        me=lo: (NPsub[α]\S[+fin])/(((NPsub[α]\S[+fin])/a NPIO)/NPDO)
     b. D: λR[R(me)] → λQλP[(Q(P))(me)]
        >FA: λQλP[(Q(P))(me)](λT[T(lo)])
        → λP[(λT[T(lo)](P))(me)]
        → λP[(P(lo))(me)]

(53) a. me=lo ‘me it’: (NPSub[α]\S)/(((NPSub[α]\S)/a NPIO)/NPDO)
        permite ‘permits’: ((NPSub[3/sg]\S)/a NPIO)/(NPSub\Sinf)
        D1 (permite): (((NPSub[3/sg]\S)/a NPIO)/Z)/((NPSub\Sinf)/Z)
        D2 (me=lo): ((NPSub[α]\S)/W)/((((NPSub[α]\S)/a NPIO)/NPDO)/W)
        >FA (α = 3/sg, Z = NPDO, W = (NPSub\Sinf)/NPDO):
        me=lo=permite: (NPSub[3/sg]\S)/((NPSub\Sinf)/NPDO)
     b. D1: λPλxλy[permite’(P(ana’ x))(x)(y)] → λQλzλxλy[permite’((Q(z))(ana’ x))(x)(y)]
        D2: λP[(P(lo))(me)] → λRλS[((R(S))(lo))(me)]
        >FA: λRλS[((R(S))(lo))(me)](λQλzλxλy[permite’((Q(z))(ana’ x))(x)(y)])
        → λS[((λQλzλxλy[permite’((Q(z))(ana’ x))(x)(y)](S))(lo))(me)]
        → λS[(λzλxλy[permite’((S(z))(ana’ x))(x)(y)](lo))(me)]
        → λS[λxλy[permite’((S(lo))(ana’ x))(x)(y)](me)]
        → λSλy[permite’((S(lo))(ana’ me))(me)(y)]


Next, as shown in (53), this clitic cluster combines with the verb permite by FA after D changes the category assigned to each expression and the variables α, Z, and W take specified values. This step is exactly the same as the one that forms lo=quiere, seen in (45) above. Subsequently, the embedded verb and then the subject NP are incorporated into the string, as shown in (54), both by FA, yielding the string (4-b).

(54) a. José ‘Joe’: NPSub[3sg]
        me=lo=permite ‘me.it.permits’: (NPSub[3sg]\S)/((NPSub\Sinf)/NPDO)
        leer ‘to read’: (NPSub\Sinf)/NPDO
        >FA (me=lo=permite + leer): NPSub[3sg]\S
        <FA (José + me=lo=permite leer): S
     b. >FA: λSλy[permite’((S(lo))(ana’ me))(me)(y)](λxλz[leer’(x)(z)])
        → λy[permite’((λxλz[leer’(x)(z)](lo))(ana’ me))(me)(y)]
        → λy[permite’(λz[leer’(lo)(z)](ana’ me))(me)(y)]
        → λy[permite’(leer’(lo)(ana’ me))(me)(y)]
        <FA: λy[permite’(leer’(lo)(ana’ me))(me)(y)](jose’)
        → permite’(leer’(lo)(ana’ me))(me)(jose’)

Note that, although the two clitics were combined with the matrix verb at the same time, each one of them is interpreted as a complement of the verb to which it is linked. In order to incorporate the new data, we revise the two lexical rules proposed for RS verbs and clitics, as in (55) and (56), respectively.

(55) Type-changing Rule (D) for RS verbs in Spanish (revisited):
     [V]((NPsub\S[±fin])/$)/(NPsub\S[−fin]) → [V](((NPsub\S[±fin])/$)/Z)/((NPsub\S[−fin])/Z)
     where
     a) V = {querer, tratar, permitir, enseñar, . . . };
     b) $ = ∅ or a NPIO;
     c) Z = NPDO, a NPIO, or (NPDO • a NPIO) if $ = ∅; otherwise Z = NPDO.

(56) Type-changing Rule (D or D’) for CLs (revisited):
     [CL]X|Y → [CL](X/Z)|(Y/Z)
     where CL can be a single clitic or a cluster of clitics.

here to form the cluster me=lo; we can also use the categories assigned to enclitics to form the same cluster.
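The observation that each clitic of a climbed cluster ends up as a complement of the verb it is linked to can be traced with a short sketch of the thematic side of (52)–(54). This is an illustration only, under assumptions not stated in the original: terms are encoded as nested tuples, `divide_sem` is a hypothetical helper, and Division is assumed to mean λf.λg.λz[f(g(z))], matching the D steps shown in (52-b) and (53-b).

```python
def divide_sem(f):
    """Assumed semantics of Division (D): f becomes λg.λz[f(g(z))]."""
    return lambda g: lambda z: f(g(z))


# Lexical meanings; ("permite", p, x, y) encodes permite'(p)(x)(y).
leer = lambda x: lambda z: ("leer", x, z)
permite = lambda P: lambda x: lambda y: ("permite", P(("ana", x)), x, y)  # object control
me = lambda R: R("me")     # DAT clitic, type-raised as in (37)
lo = lambda T: T("lo")     # ACC clitic, type-raised as in (36)

# (52-b): D applies to me, then FA with lo gives the cluster λP[(P(lo))(me)].
me_lo = divide_sem(me)(lo)

# (53-b): D applies to both the cluster and permite, then FA combines them,
# giving λS.λy[permite'((S(lo))(ana' me))(me)(y)].
me_lo_permite = divide_sem(me_lo)(divide_sem(permite))

# (54-b): the embedded verb and then the subject are supplied.
sentence = me_lo_permite(leer)("jose")
```

In the resulting term, lo fills the object slot of leer’ while me fills the indirect-object slot of permite’ and, through ana’, controls the subject of leer’, which is the linking described in the note after (54).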


4.3. Comparisons

4.3.1. CG analysis vs. Minimalist analysis

Our CG analysis of Romance RS/CC has one additional advantage beyond its ability to handle RS/CC strings with object control verbs without a special mechanism. Since it allows the clitic to combine directly with the RS verb, it offers a straightforward account of the coordinate structure involving RS/CC, as shown in (57).

(57) José [lo puede y lo debe] leer.
     Joe [it can and it must] read
     ‘Joe can and must read it.’

(57) is constructed first by forming each of the coordinated strings by combining the clitic and the RS verb. (58) shows the derivation of lo=puede.23 The other coordinated string, lo=debe ‘it.must’, is formed similarly, resulting in an expression of CAT (NPSub[3sg]\S)/((NPsub\Sinf)/NPDO) with the thematic representation λPλx[debe’((P(lo))(x))]. Next, using the coordination rule proposed by Steedman (2000), as shown in (60), the two strings are coordinated, as shown in (59).

(58) a. lo ‘it’: ((NPSub[α]\S)/&)/(((NPSub[α]\S)/&)/NPDO)
        puede: (NPsub[3sg]\S)/(NPsub\Sinf)
        D1 (puede): ((NPsub[3sg]\S)/Z)/((NPsub\Sinf)/Z)
        D2 (lo): (((NPSub[α]\S)/&)/W)/((((NPSub[α]\S)/&)/NPDO)/W)
        >FA (α = 3sg, & = ∅, Z = NPDO, W = (NPsub\Sinf)/NPDO):
        lo=puede: (NPSub[3sg]\S)/((NPsub\Sinf)/NPDO)
     b. >FA: λRλS[(R(S))(lo)](λQλrλx[puede’((Q(r))(x))])
        → λS[(λQλrλx[puede’((Q(r))(x))](S))(lo)]
        → λS[λrλx[puede’((S(r))(x))](lo)]
        → λSλx[puede’((S(lo))(x))]

(59) b. <Φn>: λSλx[puede’((S(lo))(x))] and λSλx[debe’((S(lo))(x))]
        → λSλx[and’(puede’((S(lo))(x)))(debe’((S(lo))(x)))]

(60) Coordination (<Φn>) (Steedman (2000, 39))
     X : g   CONJ : b   X : f   ⇒<Φn>   X : λ . . . b(f . . . )(g . . . )

Finally, the embedded verb and then the subject are incorporated into the string, as follows.

(61) a. José ‘Joe’: NPSub[3sg]
        lo=puede y lo=debe: (NPSub[3sg]\S)/((NPsub\Sinf)/NPDO)
        leer ‘to read’: (NPsub\Sinf)/NPDO
        >FA (lo=puede y lo=debe + leer): NPSub[3sg]\S
     b. >FA: λSλx[and’(puede’((S(lo))(x)))(debe’((S(lo))(x)))](λzλy[leer’(z)(y)])
        → λx[and’(puede’((λzλy[leer’(z)(y)](lo))(x)))(debe’((λzλy[leer’(z)(y)](lo))(x)))]
        → λx[and’(puede’(λy[leer’(lo)(y)](x)))(debe’(λy[leer’(lo)(y)](x)))]
        → λx[and’(puede’(leer’(lo)(x)))(debe’(leer’(lo)(x)))]
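The coordination derivation in (58)–(61) can be followed step by step in a short sketch of its thematic side. The encoding is an assumption made for illustration: terms are nested tuples, `divide_sem`, `cliticize` and `coordinate` are hypothetical names, Division is assumed to mean λf.λg.λz[f(g(z))], and the coordination rule is restricted to the two-argument case needed for (59).

```python
def divide_sem(f):
    """Assumed semantics of Division (D): f becomes λg.λz[f(g(z))]."""
    return lambda g: lambda z: f(g(z))


# Lexical meanings; puede' and debe' are treated as in (58-b), i.e. they pass
# the subject x on to the embedded predicate: λP.λx[puede'(P(x))].
leer = lambda z: lambda y: ("leer", z, y)
puede = lambda P: lambda x: ("puede", P(x))
debe = lambda P: lambda x: ("debe", P(x))
lo = lambda P: P("lo")


def cliticize(verb):
    """D on clitic and verb followed by FA, as in (58): λS.λx[verb'((S(lo))(x))]."""
    return divide_sem(lo)(divide_sem(verb))


def coordinate(conj, left, right):
    """Two-argument instance of the coordination schema: λS.λx[conj'(left S x)(right S x)]."""
    return lambda s: lambda x: (conj, left(s)(x), right(s)(x))


lo_puede = cliticize(puede)
lo_debe = cliticize(debe)
lo_puede_y_lo_debe = coordinate("and", lo_puede, lo_debe)   # cf. (59-b)

vp = lo_puede_y_lo_debe(leer)      # cf. (61-b): still waiting for the subject
sentence = vp("jose")
```

Because each conjunct is built by combining its own copy of the clitic with its RS verb before coordination applies, both occurrences of lo are accounted for without deriving them from the single embedded verb leer.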


Note that in Cinque’s analyses, on the other hand, there is no straightforward way to derive coordinated RS strings like (57). Since he postulates that the clitic in the RS construction must come from the embedded verb, it is not clear how two instances of the clitic lo can be produced by movement from one source, i.e., leer. For all other mono-clausal analyses mentioned above, coordinated RS strings like (57) are equally problematic.

4.3.2. Alternative CG analyses

There are a couple of possible alternative CG analyses of RS/CC. In the first alternative, the string (1-b), for instance, is constructed as shown in (62). In this alternative, the two verbs are first combined by FC to form a complex verb. Subsequently, the clitic and then the subject NP are incorporated into the string, both by FA. The thematic interpretation agrees with that derived as in (45)–(46).

The above alternative analysis, however, runs into several difficulties. First, if allowed in the system as a combinatory rule, FC would freely overgenerate ill-formed strings like (7-b), where a non-RS verb appears in an RS/CC context. In order to block strings like (7-b), FC would have to be restricted from applying in certain cases. It seems more desirable to capture this lexically governed process by a type-changing rule, as Moortgat (1988) claims.24 Second, the analysis by FC would require a special mechanism to handle the affixation of the clitic to its host verb since, in this analysis, the clitic combines with a complex verb and not directly with the RS verb. When the RS verb is finite, the clitic can simply be affixed to the verb on its right. However, when the RS verb is non-finite, the clitic must be placed to the right of the RS verb, as in [empezarlo a leer] ‘to begin reading it’ (see (10-b)); thus, a concatenative combinatory operation like FA does not suffice.25 Moreover, because the clitic does not directly combine with the RS verb, a coordinated structure like (57) cannot be accounted for.

Hoyt and Baldridge (2008) use what they call “D-rules”, based on the D combinator (Curry and Feys (1958)), as formulated in (63), to account for various types of cross-conjunct extraction phenomena, and use this rule to process strings like (57) seen above.
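The overgeneration argument against a freely applying FC rule can be made concrete on the syntactic side with a small sketch. Everything in it is an illustrative assumption: categories are encoded as nested tuples, features such as [3sg] and the variables $ and & are dropped (both set to ∅), the proclitic lo is given the category (NPsub\S)/((NPsub\S)/NPDO), and the names `compose`, `divide`, `divide_rs_verb` and `RS_VERBS` are hypothetical. The list in `RS_VERBS` stands in for the open list V in rule (55).

```python
def slash(x, y):
    """X/Y: seeks a Y to the right and yields X."""
    return ("/", x, y)


def back(y, x):
    """Y\\X: seeks a Y to the left and yields X."""
    return ("\\", y, x)


def apply_right(f, a):
    """>FA: X/Y followed by Y yields X; otherwise None."""
    if isinstance(f, tuple) and f[0] == "/" and f[2] == a:
        return f[1]
    return None


def compose(f, g):
    """>FC as a free combinatory rule: X/Y followed by Y/Z yields X/Z."""
    if (isinstance(f, tuple) and isinstance(g, tuple)
            and f[0] == "/" and g[0] == "/" and f[2] == g[1]):
        return slash(f[1], g[2])
    return None


def divide(cat, z):
    """Division: X/Y becomes (X/Z)/(Y/Z)."""
    _, x, y = cat
    return slash(slash(x, z), slash(y, z))


RS_VERBS = {"querer", "tratar", "permitir", "enseñar"}   # stands in for V in (55)


def divide_rs_verb(lemma, cat, z):
    """Division as a lexical rule: available only to verbs listed as RS verbs."""
    return divide(cat, z) if lemma in RS_VERBS else None


VP_FIN, VP_INF = back("NPsub", "S"), back("NPsub", "Sinf")
leer = slash(VP_INF, "NPDO")
quiere = slash(VP_FIN, VP_INF)      # RS verb
espera = slash(VP_FIN, VP_INF)      # same category, but not an RS verb, cf. (7)
lo = slash(VP_FIN, slash(VP_FIN, "NPDO"))

# Free FC: both verbs compose with leer, so *lo espera leer is derivable.
fc_quiere = apply_right(lo, compose(quiere, leer))
fc_espera = apply_right(lo, compose(espera, leer))

# Lexically governed Division: only querer can host the climbed clitic.
lo_divided = divide(lo, leer)
d_quiere = divide_rs_verb("querer", quiere, "NPDO")
d_espera = divide_rs_verb("esperar", espera, "NPDO")
lo_quiere = apply_right(lo_divided, d_quiere)
```

In this sketch `fc_espera` is a well-formed finite VP category, which is exactly the unwanted result, whereas `d_espera` is `None`, so no lo=espera host is ever built. The sketch says nothing about the affixation problem or about coordination, which are the other two difficulties raised for the FC analysis.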

24 Moortgat (1988), following Hoeksema’s criticism, analyzes Dutch verb clusters through a lexical rule based on Division.
25 Non-concatenative operations like “wrapping” (Bach (1984)) and “infixation” (Moortgat (1988)) have been proposed in the categorial grammar literature. However, these operations have not received much support because they tend to complicate the system.

(62) a. José ‘Joe’: NPsub[3sg]
        lo ‘it’: ((NPSub[α]\S)/&)/(((NPSub[α]\S)/&)/NPDO)
        quiere ‘wants’: (NPsub[3sg]\S)/(NPsub\Sinf)
        leer ‘to.read’: (NPsub\Sinf)/NPDO
        >FC (quiere + leer): (NPsub[3sg]\S)/NPDO
        >FA (α = 3sg, & = ∅) (lo + quiere leer): NPsub[3sg]\S
        <FA (José + lo quiere leer): S
     b. >FC: λx[λPλz[quiere’(P)(z)](leer’(x)(ana’ z))]
        → λxλz[quiere’(leer’(x)(ana’ z))(z)]
        >FA: λQ[Q(lo)](λxλz[quiere’(leer’(x)(ana’ z))(z)])
        → λxλz[quiere’(leer’(x)(ana’ z))(z)](lo)
        → λz[quiere’(leer’(lo)(ana’ z))(z)]


(63) x/(y/z) : f   y/w : g   →   x/(w/z) : λh. f(λx.ghx)

Note that this combinatory rule has the same effect as Division (applied to the second functor CAT) plus FC; thus, it is able to combine a CL directly with an RS verb. However, because it is a combinatory rule, it cannot be blocked from generating ill-formed strings like (7-b) seen above or (64) below, where a clitic is combined with a non-RS verb.

(64) *Juan lo quería y lo esperaba leer.
     John it wanted-3SG and it hoped-3SG to read

In sum, from an empirical point of view, the analysis that uses the lexically governed Division is the best CG alternative for analyzing Romance RS and CC.

4.4. A remaining issue

Before we close this section, we address one remaining issue, which concerns ill-formed strings like (65-b) and (65-c), where only one of the two clitics linked to the embedded verb has climbed.

(65) a. José quiere mandármelo.
        Joe wants-3SG send.CL-DAT.1SG.CL-ACC.3SG
        ‘Joe wants to send it to me.’
     b. *José me quiere mandarlo.
     c. *José lo quiere mandarme.

Our CG analysis would derive both strings, as illustrated below for (65-b).

(66) *José me=quiere mandar=lo
José: NPsub[3sg]
me=quiere ‘me.wants’: (NPsub[3sg]\S)/((NPsub\Sinf)/a NPIO)
mandar=lo ‘to send.it’: (NPsub\Sinf)/a NPIO
me=quiere mandar=lo: NPsub[3sg]\S (by >FA)
José me=quiere mandar=lo: S

quiere ‘wants’ + me ‘me’: (NPSub[3sg]\S)/LEX((NPSub[3sg]\Sinf)/NPDO), by FA (& = ∅, Z = NPDO, α = [3sg], W = LEX(NPSub\Sinf)/NPDO)


Bibliography

Aissen, Judith and David Perlmutter (1983): Clause Reduction in Spanish. In: D. Perlmutter, ed., Studies in Relational Grammar 1. The University of Chicago Press, Chicago, pp. 360–403.
Bach, Emmon (1984): Some Generalizations of Categorial Grammar. In: F. Landman and F. Veltman, eds, Varieties of Formal Semantics. Foris, Dordrecht, pp. 1–23.
Bordelois, Ivonne (1988): ‘Causatives: From Lexicon to Syntax’, Natural Language and Linguistic Theory 6, 57–93.
Burzio, Luigi (1981): Intransitive Verbs and Italian Auxiliaries. PhD thesis, MIT, Cambridge, Mass.
Burzio, Luigi (1986): Italian Syntax: A Government and Binding Approach. Foris, Dordrecht.
Cinque, Guglielmo (2000): Restructuring and Functional Structure. Ms., University of Venice.
Cinque, Guglielmo (2006): Restructuring and Functional Structure. The Cartography of Syntactic Structures, Vol. 4. Oxford University Press, New York. (Chapter 1 originally appeared in: A. Belletti, ed., (2004), Structures and Beyond. The Cartography of Syntactic Structures, Vol. 3. Oxford University Press, New York, pp. 132–191.)
Contreras, Heles (1979): ‘Clause Reduction, the Saturation Constraint, and Clitic Promotion in Spanish’, Linguistic Analysis 5, 161–182.
Curry, Haskell B. and Robert Feys (1958): Combinatory Logic, Vol. 1. North Holland, Amsterdam.
Di Sciullo, Anna-Maria and Edwin Williams (1987): On the Definition of Word. MIT Press, Cambridge, Mass.
Goodall, Grant (1987): Parallel Structures in Syntax: Coordination, Causatives and Restructuring. Cambridge University Press, Cambridge.
Hoyt, Frederick and Jason Baldridge (2008): A Logical Basis for the D Combinator and Normal Form Constraints in Combinatory Categorial Grammar. In: Proceedings of the 46th Annual Meeting of the Association for Computational Linguistics, pp. 326–334.
Jacobson, Pauline (1999): ‘Towards a Variable-Free Semantics’, Linguistics and Philosophy 22, 117–184.
Kayne, Richard (1989): Null Subjects and Clitic Climbing. In: O. Jaeggli and K. Safir, eds, The Null Subject Parameter. Kluwer, Dordrecht, pp. 239–261.
Laca, Brenda (2004): Romance “Aspectual” Periphrases: Eventuality Modification versus “Syntactic” Aspect. In: J. Guéron and J. Lecarme, eds, The Syntax of Time. MIT Press, Cambridge, pp. 425–440.
Luján, Marta (1980): ‘Clitic Promotion and Mood in Spanish Verbal Complements’, Linguistics 18, 381–484.
Miller, Philip H. (1992): Clitics and Constituents in Phrase Structure Grammar. Garland, New York. (Doctoral dissertation, University of Utrecht, 1991.)
Miller, Philip H. and Ivan Sag (1996): ‘French Clitic Movement Without Clitics or Movement’, Natural Language and Linguistic Theory 15, 573–639.
Monachesi, Paola (1999): A Lexical Approach to Italian Cliticization. CSLI Publications, Stanford.
Moore, John (1998): Object Controlled Restructuring in Spanish. Ms., University of California, San Diego.
Moortgat, Michael (1988): Categorial Investigations. Foris, Dordrecht.
Napoli, Donna Jo (1981): ‘Semantic Interpretation vs. Lexical Governance: Clitic Climbing in Italian’, Language 57, 841–887.
Perlmutter, David and Judith Aissen (1976): ‘Clause Reduction in Spanish’, Berkeley Linguistics Society 2, 1–30.
Rizzi, Luigi (1978): A Restructuring Rule in Italian Syntax. In: S.J. Keyser, ed., Recent Transformational Studies in European Languages. MIT Press, Cambridge, pp. 113–158. (Also in: Rizzi, L. (1982): Issues in Italian Syntax. Foris, Dordrecht.)
Rivas, Alberto (1977): Theory of Clitics. Unpublished doctoral dissertation, MIT.
Roberts, Ian (1997): ‘Restructuring, Head Movement and Locality’, Linguistic Inquiry 28, 423–460.
Rosen, Sara Thomas (1990): Argument Structure and Complex Predicates. Garland, New York. (Ph.D. dissertation, Brandeis University, 1989.)
Sadock, Jerrold M. (1991): Autolexical Syntax: A Theory of Parallel Grammatical Representations. University of Chicago Press, Chicago.
Steedman, Mark (2000): The Syntactic Process. MIT Press, Cambridge.
Steedman, Mark and Jason Baldridge (2011): Combinatory Categorial Grammar. In: R. Borsley and K. Börjars, eds, Non-Transformational Syntax: Formal and Explicit Models of Grammar. Blackwell, Oxford, pp. 181–223.
Strozer, Judith (1976): Clitics in Spanish. Unpublished doctoral dissertation, UCLA.
Suñer, Margarita (1980): Clitic Promotion in Spanish Revisited. In: F. Nuessel, ed., Contemporary Studies in Romance Languages. Indiana University Linguistics Club, Bloomington, pp. 300–330.
Zagona, Karen (1988): Verb Phrase Syntax: A Parametric Study of English and Spanish. Kluwer Academic Publishers, Dordrecht.
Zubizarreta, Maria Luisa (1985): ‘The Relation Between Morphophonology and Morphosyntax: The Case of the Romance Causatives’, Linguistic Inquiry 16, 247–289.

Department of Spanish and Portuguese University of Texas at Austin

Christina Unger

A Derivational View on Movement Constraints*

Abstract

This paper presents Ulf Brosziewski’s model of syntactic derivations as an implementation of a strongly derivational organization of grammar that keeps syntactic representations minimal. Based on Brosziewski’s local encoding of movement dependencies, the paper explores how this model can capture different constraints on movement, in particular remnant movement and Freezing, the Condition on Extraction Domain, weak island phenomena, minimality effects and across-the-board movement.

1. Introduction

This paper explores conditions on movement in a strongly derivational organization of grammar. The setting is Ulf Brosziewski’s model of syntactic derivations (see Brosziewski (2003)). It offers a local encoding of movement dependencies that combines properties of transformational and feature-based approaches. It is based on simple assumptions about syntactic expressions and the operations defined on them, and is thus suitable for investigating how much is needed to derive, or at least express, fundamental characteristics of movement. Moreover, it reduces the number of representations needed for syntactic operations to a minimum and thus constitutes an exploration of how derivational syntax can be.

The outline of the paper is as follows. In section 2, Brosziewski’s view on derivations is introduced. In section 3, I address a problem that it faces with extraction from phrases that are themselves extracted, and show how to solve it. I then turn to generally assumed constraints on movement. In section 4, I review how the Condition on Extraction Domain is expressed in Brosziewski’s model. In section 5, I consider how weak islands can be captured. Section 6 deals with minimality effects. Finally, in section 7, I show how across-the-board extraction can be incorporated.

* For thorough remarks and helpful suggestions, I am very grateful to Gereon Müller, Eric Reuland, and Andreas Pankau.

Local Modelling of Non-Local Dependencies in Syntax, 401-429 Artemis Alexiadou, Tibor Kiss & Gereon Müller (eds.) Linguistische Arbeiten 547, de Gruyter 2012


2. On derivations

2.1. Syntactic expressions

In approaches like HPSG and the Minimalist Program, features play a central role in constructing and manipulating syntactic structures. Brosziewski’s model of syntactic derivations shares this assumption. Simple expressions are taken to be feature bundles containing phonological, syntactic, and semantic features. In the following, phonological and semantic features will be ignored and the focus will be on syntactic features only. With respect to these, Brosziewski sticks to a very simple feature system. For encoding syntactic dependencies, he assumes that features come in two varieties: as plain features f, and as features ∧f with a prefix indicating that the feature wants to be matched with a corresponding plain feature. Following Adger (2008), among others, I add a distinction between two types of features. The first type are category features Cat; they subsume subcategorization features, which are checked when two expressions are merged and which are assumed to be ordered according to the argument structure of the selecting expression. The second type are morphosyntactic features Form, which trigger movement and require no ordering. Although this distinction is not made explicitly by Brosziewski, the presentation of features here, and of the relations based on them later on, does not differ in any essential way from his view in Brosziewski (2003, sect. 2.2).

Definition 1. The set of syntactic features is given by:
Cat ::= B | ∧B
Form ::= F | ∧F
B ::= V | N | P | v | T | C
F ::= wh | case | . . .

In the following I will assume that matching features f and ∧f are deleted when two expressions merge, and that undeleted syntactic features at the interfaces to phonology and semantics cause a crash of the derivation. However, there is nothing deep in this view on features. In fact, a lot of phenomena would call for a more sophisticated theory of features, especially because the combinatorial rules for merge and movement are quite primitive.
But since I am mainly interested in core aspects of syntactic operations and restrictions on them, I will leave the feature system as simple as possible.

Based on features as basic building blocks, syntactic expressions are defined as either simple feature bundles or complex expressions built from other syntactic expressions. Simple expressions are triples containing a phonological representation phon, an ordered list of category features and an unordered list of formal features.1 Complex expressions are pairs of expressions.

Definition 2.
Exp ::= (phon, [Cat], [Form]) | ⟨Exp, Exp⟩

I use round brackets for triples and angled brackets for pairs just to make it easy to distinguish simple and complex expressions at first sight. Moreover, in the following I will use α, β, γ as variables for simple expressions, and x, y, z as variables for arbitrary expressions. Examples of simple expressions are the following.

(1) (Ulf, [N], [∧case])
(2) (punch, [∧N, V], [ ])
(3) (ε, [∧V, ∧N, v], [case])

(1) constitutes an NP, which has a categorial feature N and a formal feature ∧case expressing that it still requires case to be assigned. A transitive verb looks like (2). It is of category V and additionally subcategorizes for its internal argument of category N. The v-head in (3) subcategorizes for a VP and an external argument of category N, and also carries a formal feature case expressing that it can assign case. Its phonological content is empty, which is represented by ε.

The idea behind simple and complex expressions is the following. Derivations are assumed not to build phrase structures. When simple expressions are merged, they are combined into another simple expression, whose features and semantic value are a combination of the features and semantic values of the daughters. Since no structure is built, information about the expressions that were combined and about possible structural configurations is forgotten. As an example, suppose we want to merge the two expressions (from, [∧N, P], [case]) and (Mars, [N], [∧case]). Assuming that merging them checks not only the categorial but also the case features, the two expressions will end up inactive in the sense that they do not carry any features that still require checking.
In the further derivation, it will not be necessary to look into the structure (4); it can rather be treated as one constituent. All that needs to be kept is the simple expression (from Mars, [P], [ ]), because it contains all the relevant information.

1 As mentioned before, expressions may also carry a semantic interpretation, which is omitted here.

(4) [ PP (from Mars, [P], [ ]) [ P (from, [P], [ ]) ] [ NP (Mars, [ ], [ ]) ] ]
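As a rough illustration of Definitions 1 and 2, expressions can be encoded as plain data. The following Python sketch is not part of Brosziewski's formalization: the class names Simple and Pair, the helper show, and the ASCII prefix ^ standing in for ∧ are assumptions made purely for illustration.

```python
from dataclasses import dataclass
from typing import FrozenSet, Tuple, Union


@dataclass(frozen=True)
class Simple:
    """A simple expression (phon, [Cat], [Form]); semantics is omitted."""
    phon: str                            # "" plays the role of ε
    cat: Tuple[str, ...] = ()            # ordered list of category features
    form: FrozenSet[str] = frozenset()   # unordered formal features


@dataclass(frozen=True)
class Pair:
    """A complex expression: a pair of expressions."""
    edge: "Exp"
    nucleus: "Exp"


Exp = Union[Simple, Pair]


def _fmt(features) -> str:
    items = list(features)
    return "[" + ", ".join(items) + "]" if items else "[ ]"


def show(x: Exp) -> str:
    """Render an expression in the notation of examples (1)-(3)."""
    if isinstance(x, Simple):
        return f"({x.phon or 'ε'}, {_fmt(x.cat)}, {_fmt(sorted(x.form))})"
    return f"⟨{show(x.edge)}, {show(x.nucleus)}⟩"


ulf = Simple("Ulf", ("N",), frozenset({"^case"}))                # (1)
punch = Simple("punch", ("^N", "V"))                             # (2)
little_v = Simple("", ("^V", "^N", "v"), frozenset({"case"}))    # (3)
# A complex expression: one expression paired with another one.
a_pair = Pair(Simple("what", (), frozenset({"wh"})), Simple("punch", ("V",)))

for e in (ulf, punch, little_v, a_pair):
    print(show(e))
```

Keeping the category features in a tuple and the formal features in a frozenset mirrors the ordered/unordered distinction of Definition 2; nothing else in the encoding is meant to carry theoretical weight.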

The same holds if a whole CP like (5) has been built. Everything inside the CP is inactive, so the subexpressions are inaccessible for further operations, and they and their structural configurations no longer play a role in the derivation. They can therefore be discarded; all that syntactic operations need to care about in the further derivation is the root of the tree, corresponding to the simple expression (6).

(5) [ CP C [ TP Mary1 [ T [ vP t1 [ v’ v [ VP punches [ NP a unicorn]]]]]]]
(6) (Mary punches a unicorn, [C], [ ])

These simple expressions behave like lexical items in the sense that syntactic operations do not have access to their parts or their structure, similar to the objects obtained after Spell-Out in Uriagereka (1999). However, the situation is different when there are expressions that have to check formal features at a later stage of the derivation, i.e., have to move,2 like the wh-pronoun in What does Mary punch?. When merging the verb punch with its internal argument what, we cannot build another simple expression and forget about the subexpressions it was built from, because what is not inactive yet; it still has a wh-feature to check later in the derivation and therefore needs to remain accessible until that point. To keep information that still plays a role and thus cannot be forgotten, there is a mechanism for copying expressions. One copy is merged in base position (where it contributes its categorial feature to fulfil subcategorization requirements), and one copy is kept as the first element of a pair and carried along until it reaches the landing site. Merging punch and what yields the following complex expression:

(7) ⟨(what, [ ], [wh]), (punch, [V], [ ])⟩

Later in the derivation, when the C-head enters the derivation, what can check its wh-feature and will be merged into a simple expression. In the following I adopt Brosziewski’s notation.
This means that (7) will be abbreviated as ⟨what∗, [punch c_what]⟩. Simple expressions are not written as feature bundles but as strings, with c representing the (phonologically empty) copy staying in base position, and with the percolating copy marked by an asterisk. For ease of reading, an expression (foo, [X], [ ]) will often be written as [X foo] or [XP foo]. Relevant features will sometimes be indicated by superscripts.

A complex expression ⟨α1, ⟨α2, . . . ⟨αn, β⟩⟩⟩ can be seen as a syntactic expression β that has been built, together with a list (or stack) of expressions α1, α2, . . . , αn that were extracted from β, that are now percolated, and that are going to be remerged at a later point of the derivation. β will be called the nucleus of the pair, and sometimes I will refer to the αi as being at the edge of the complex expression (this notion will be made explicit in Definition 6 in section 3). In modern Chomskyan terms, the αi could also be seen as being at the edge of a phase, while everything contained in β was sent to the interfaces. I will not go into the details of this comparison here; I will turn to a short discussion of the relation between Brosziewski’s implementation of unbounded dependencies and the successive-cyclic implementation in Phase Theory at the end of the section. But first I want to demonstrate the mechanism for merging and moving expressions.

The operations of merging and moving are driven by the need of expressions to check features. This is expressed in their sensitivity to two relations, subcategorization and licensing, which can be defined on the basis of the two types of syntactic features.

Definition 3. An expression x subcategorizes for an expression y if x has a subcategorization feature ∧b and y carries the corresponding categorial feature b.

Definition 4. An expression x licenses an expression y if x has a formal feature ∧f and y carries the corresponding formal feature f, or vice versa.

2 I solely concentrate on wh-movement, but in principle nothing prevents one from applying Brosziewski’s mechanism to other long-distance phenomena as well, such as long-distance agreement (which is treated at length in other contributions to this volume).

These definitions are general in that they apply to both simple and complex expressions. In order to be applicable to complex expressions, we need to specify what the syntactic features of a complex expression are. At this point, remember that in a complex expression ⟨x, y⟩, y is the root of the structure built so far, whereas x was merged somewhere in y and is just carried along. Thus ⟨x, y⟩ should be subcategorized or licensed by some expression if y is subcategorized or licensed by it. To this end, we say that the syntactic features of a complex expression ⟨x, y⟩ are the syntactic features of y. With these two relations it is possible to specify the notions of complement, specifier, and adjunct. Brosziewski assumes that specifiers are a special case of adjuncts (cf. Brosziewski (2003, 25f)).

Definition 5. Let α and β be two expressions that are merged. Then β is a complement of α if α subcategorizes for β. Otherwise, β is an adjunct. Furthermore, an adjunct β is a specifier of α if α licenses β.
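Definitions 3 to 5 can be read as simple predicates over feature lists. The sketch below is one possible rendering under stated assumptions: features are strings with ^ standing in for ∧, the classes Simple and Pair and the function names nucleus, subcategorizes, licenses and role are hypothetical, and Definition 3 is implemented literally (any subcategorization feature of x may match, without regard to the ordering of the list).

```python
from dataclasses import dataclass
from typing import FrozenSet, Tuple, Union


@dataclass(frozen=True)
class Simple:
    phon: str
    cat: Tuple[str, ...] = ()
    form: FrozenSet[str] = frozenset()


@dataclass(frozen=True)
class Pair:
    edge: "Exp"
    nucleus: "Exp"


Exp = Union[Simple, Pair]


def nucleus(x: Exp) -> Simple:
    """The syntactic features of a pair are those of its nucleus."""
    while isinstance(x, Pair):
        x = x.nucleus
    return x


def subcategorizes(x: Exp, y: Exp) -> bool:
    """Definition 3: x has a feature ^b and y carries b."""
    cats_y = nucleus(y).cat
    return any(f.startswith("^") and f[1:] in cats_y for f in nucleus(x).cat)


def licenses(x: Exp, y: Exp) -> bool:
    """Definition 4: x has ^f and y carries f, or vice versa."""
    fx, fy = nucleus(x).form, nucleus(y).form
    return any("^" + f in fy for f in fx) or any("^" + f in fx for f in fy)


def role(alpha: Exp, beta: Exp) -> str:
    """Definition 5: the role of beta when it is merged with alpha.

    A specifier is a special case of an adjunct, so "specifier" is only
    returned when alpha does not subcategorize for beta.
    """
    if subcategorizes(alpha, beta):
        return "complement"
    return "specifier" if licenses(alpha, beta) else "adjunct"


from_ = Simple("from", ("^N", "P"), frozenset({"case"}))
mars = Simple("Mars", ("N",), frozenset({"^case"}))
scientists = Simple("scientists", ("N",), frozenset({"^case"}))
from_mars = Simple("from Mars", ("P",))
cp = Simple("mary punch", ("C",), frozenset({"^wh"}))
what = Simple("what", (), frozenset({"wh"}))

print(role(from_, mars))            # Mars relative to the preposition
print(role(scientists, from_mars))  # the PP relative to the noun
print(role(cp, what))               # the wh-item relative to the CP
```

Because nucleus looks through pairs, a complex expression is subcategorized or licensed exactly when its nucleus is, as required in the text.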


2.2. Operations

There is one general combinatoric function, merge, that combines two expressions into a third one. It is defined for simple and complex expressions and is thus independent of whether movement takes place or not. We will look at all possible cases in turn. If no movement is involved, two simple expressions are merged. This is handled by (M1) below. If movement takes place, a complex expression is merged with either a simple or another complex expression. These cases are covered by (M2) and (M3) below. Let us start with the case of merging two simple expressions, i.e., the case where no movement is involved. The result is also a simple expression.

(M1) merge α β = γ
where γ has the same syntactic features as α except those that were discharged by β, and the phonological representation of γ is derived from the phonological representations of α and β

One example that already appeared above is merging a preposition with an NP it subcategorizes for, repeated here as (8). The features of the preposition that are discharged by the NP are ∧N and case, so only P projects.

(8) merge (from, [∧N, P], [case]) (Mars, [N], [∧case]) = (from Mars, [P], [ ])

A case where no subcategorization features are checked is adjunction, as in (9).

(9) merge (scientists, [N], [∧case]) (from Mars, [P], [ ]) = (scientists from Mars, [N], [∧case])

Now let us turn to the encoding of movement dependencies. Movement rests on the same basis as dislocation in GPSG (Gazdar (1981), Gazdar et al. (1985)), namely on dividing the relation between base position and landing site into three parts: bottom, where the dependency is introduced and encoded (in our case, where the expression enters the derivation); middle, where the information about the dependency (in our case, a copy of the expression) is percolated; and top, where the dependency is finally established (in our case, where the expression is remerged). Let us first look at bottom.
The motivation for a movement mechanism is that some expressions play a double role: they fill an argument position and also check features somewhere else in the structure. One possible assumption, which is adopted by Brosziewski, is that there is a derivational mechanism that duplicates expressions so that they can be merged at the two relevant points of the derivation. This mechanism is a function that copies elements in the base position where they start their movement. This operation is assumed to be optional.


(C1) Copying
An expression y can be copied in (merge x y).

However, copying here does not mean duplicating an expression or its features in the literal sense; rather, copying splits an expression by creating two new expressions among which the features of the split expression are distributed complementarily. One part stays in base position, whereas the other part is transported to the landing site. The definition of the function copy has the following form:

(C2) copy α = ⟨α∗, c_α⟩

For example, when copying a wh-expression, the expression merged in base position should contain the categorial feature in order to fulfil subcategorization requirements there, while the expression that is percolated should carry the wh-feature and the phonological representation to the landing site. Copying a wh-pronoun would then look as in (10).

(10) copy (who, [N], [wh]) = ⟨(who, [ ], [wh]), (ε, [N], [ ])⟩

The question now is how the features are distributed among the copies in general. To start with, the categorial features of the copied expression clearly have to remain in base position in order to satisfy the subcategorization requirements of the expression that subcategorizes for it. Formal features, however, are features that trigger movement and will be needed later in the derivation. So they should be carried along, unless they can already be checked in base position. This is summarized as follows.3

(FD) Feature Distribution
All syntactic features of α that can be projected or that can be checked with the element with which α is merged (via subcategorization or licensing) are assigned to the nucleus of (copy α).

In other words, features that can be projected or licensed in their base position cannot enter into movement processes.

Now let us turn to the middle of a movement dependency, the percolation of copies. Since nothing is supposed to happen with these copies while they are carried along, their percolation can be handled by merge.
All this operation is supposed to do is combine expressions as specified in (M1) and, in doing so, simply ignore the percolating copies. However, it cannot quite do that yet, because in (M1) merge was only defined for simple expressions. In order to handle the percolation of extracted expressions, it also has to be defined for complex expressions. This is done in the definitions (M2) and (M3). Let us start with (M2). The idea behind it is that if an expression α is merged with a pair ⟨x, y⟩, then this merging is actually about the nucleus y and not about the percolating copy x, so α combines with y while x is simply retained at the edge of the pair. The following definition does exactly that (it will be slightly modified later, in section 4):

(M2) (preliminary version) merge α ⟨x, y⟩ = ⟨x, merge α y⟩

As an example, consider again the VP from above, repeated in (11), where the internal argument of the verb punch is a wh-phrase that was copied in order to check its wh-feature later in the course of the derivation. (Let us ignore case assignment for the moment.)

(11) ⟨what∗, [ VP punch c_what ]⟩

When this complex VP is selected by a v-head, (M2) applies. This means that v is merged with the VP it subcategorizes for, while the extracted expression what is simply taken along.

(12) merge v ⟨what∗, [ VP punch c_what ]⟩
= ⟨what∗, merge v [ VP punch c_what ]⟩
= ⟨what∗, [ vP punch c_what ]⟩

But (M2) captures just one case of merging a complex expression, namely when a simple expression selects a complex one. Another case that can arise is that a complex expression selects another expression. This is the case, for example, when the resulting expression in (12) selects the external argument of the verb. What, then, should happen in the following case?

merge ⟨what∗, [ vP punch c_what ]⟩ mary

The idea is exactly the same as with (M2): the nucleus of the pair, the vP, should be merged with mary, and the copy of what should be percolated further. The definition (M3) implements that.

3 This paper will stay ignorant with respect to phonological and semantic features. Just a short note on the effects of their distribution: If the phonological features are not merged in base position but move, this gives the phonological effect of overt movement, whereas if the phonological features stay in base position, effects of covert movement can be obtained. Similarly, depending on which expression the semantic value is associated with, the derivation results in a surface interpretation or can show reconstruction effects.
(M3) merge ⟨x, y⟩ z = ⟨x, merge y z⟩

So applying (M3) to our case above we get:

(13) merge ⟨what∗, [ vP punch c_what ]⟩ mary
= ⟨what∗, merge [ vP punch c_what ] mary⟩
= ⟨what∗, [ vP mary punch c_what ]⟩
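The definitions (M1) to (M3), together with (C2) and (FD), are close to a recursive program, so a minimal Python sketch may help to trace examples (8) to (13). Several points are assumptions of the sketch rather than of the model: ^ stands in for ∧; copy takes the merge partner as a second argument so that (FD) can be applied; only the first subcategorization feature of α may be discharged; linear order is not fixed by (M1), so a hypothetical flag left places β before α; the base copy c_α is represented with empty phonology, so it does not show up in the strings; and leftover features of β are silently dropped instead of causing a crash.

```python
from dataclasses import dataclass
from typing import FrozenSet, Tuple, Union


@dataclass(frozen=True)
class Simple:
    phon: str
    cat: Tuple[str, ...] = ()
    form: FrozenSet[str] = frozenset()


@dataclass(frozen=True)
class Pair:
    edge: "Exp"
    nucleus: "Exp"


Exp = Union[Simple, Pair]


def counterpart(f: str) -> str:
    """Map f to ^f and ^f to f."""
    return f[1:] if f.startswith("^") else "^" + f


def copy(alpha: Simple, partner: Simple) -> Pair:
    """(C2) with (FD): split alpha into a percolating and a base copy."""
    stay = frozenset(f for f in alpha.form if counterpart(f) in partner.form)
    moving = Simple(alpha.phon, (), alpha.form - stay)   # the copy α*
    base = Simple("", alpha.cat, stay)                   # the copy c_α
    return Pair(moving, base)


def merge(a: Exp, b: Exp, left: bool = False) -> Exp:
    if isinstance(a, Pair):                      # (M3)
        return Pair(a.edge, merge(a.nucleus, b, left))
    if isinstance(b, Pair):                      # (M2)
        return Pair(b.edge, merge(a, b.nucleus, left))
    # (M1): a keeps its features except those discharged by b
    cat = a.cat
    if cat and cat[0].startswith("^") and cat[0][1:] in b.cat:
        cat = cat[1:]
    form = frozenset(f for f in a.form if counterpart(f) not in b.form)
    words = [b.phon, a.phon] if left else [a.phon, b.phon]
    return Simple(" ".join(w for w in words if w), cat, form)


from_ = Simple("from", ("^N", "P"), frozenset({"case"}))
mars = Simple("Mars", ("N",), frozenset({"^case"}))
pp = merge(from_, mars)                                  # (8)
np = merge(Simple("scientists", ("N",), frozenset({"^case"})), pp)  # (9)

what = Simple("what", ("N",), frozenset({"wh"}))
punch = Simple("punch", ("^N", "V"))
vp = merge(punch, copy(what, punch))                     # (11)
little_vp = merge(Simple("", ("^V", "^N", "v")), vp)     # (12), by (M2)
with_subject = merge(little_vp, Simple("mary", ("N",)), left=True)  # (13), by (M3)
```

Testing for a complex first argument before a complex second argument reproduces the order in which (M3) and (M2) apply when two pairs are merged, so the edge of the second pair ends up nested inside the first.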


The rule (M3) also captures the case when a complex expression selects a complex expression. The way (M2) and (M3) are formulated, the second one of two merged pairs is incorporated into the first one. That is, in the case of merge ⟨x, y⟩ ⟨z, w⟩, (M3) applies before (M2) and we get ⟨x, merge y ⟨z, w⟩⟩, which reduces to ⟨x, ⟨z, merge y w⟩⟩. I will briefly come back to the significance of this in section 6, when discussing the relevance of the ordering in a pair.

Now that we can handle the percolation of copies, we still need to specify what resolves the dependency at the top. What happens is the following: if a percolated expression can finally check its formal features with the nucleus of the pair it is part of, i.e., in a configuration ⟨x^{∧f}, y^{f}⟩ or ⟨x^{f}, y^{∧f}⟩, merge is triggered. This is implemented by the following operation IM (reminiscent of internal merge).

(IM) IM ⟨x, y⟩ = merge y x   if y licenses x
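The operation IM can be sketched in the same style. As before, this is an illustration under assumptions, not the author's implementation: ^ stands in for ∧, the names Simple, Pair, licenses, merge and im are hypothetical, the remerged expression is linearized to the left by default, and an inapplicable IM simply returns the pair unchanged. Since Definition 4 is symmetric, it does not matter which of the two expressions carries the prefixed feature; here the C-level nucleus carries ^wh.

```python
from dataclasses import dataclass
from typing import FrozenSet, Tuple, Union


@dataclass(frozen=True)
class Simple:
    phon: str
    cat: Tuple[str, ...] = ()
    form: FrozenSet[str] = frozenset()


@dataclass(frozen=True)
class Pair:
    edge: "Exp"
    nucleus: "Exp"


Exp = Union[Simple, Pair]


def counterpart(f: str) -> str:
    return f[1:] if f.startswith("^") else "^" + f


def nucleus(x: Exp) -> Simple:
    while isinstance(x, Pair):
        x = x.nucleus
    return x


def licenses(x: Exp, y: Exp) -> bool:
    """Definition 4, evaluated on the nuclei of x and y."""
    fy = nucleus(y).form
    return any(counterpart(f) in fy for f in nucleus(x).form)


def merge(a: Exp, b: Exp, left: bool = False) -> Exp:
    if isinstance(a, Pair):                      # (M3)
        return Pair(a.edge, merge(a.nucleus, b, left))
    if isinstance(b, Pair):                      # (M2)
        return Pair(b.edge, merge(a, b.nucleus, left))
    cat = a.cat                                  # (M1)
    if cat and cat[0].startswith("^") and cat[0][1:] in b.cat:
        cat = cat[1:]
    form = frozenset(f for f in a.form if counterpart(f) not in b.form)
    words = [b.phon, a.phon] if left else [a.phon, b.phon]
    return Simple(" ".join(w for w in words if w), cat, form)


def im(pair: Pair, left: bool = True) -> Exp:
    """(IM): remerge the edge with the nucleus if the nucleus licenses it."""
    x, y = pair.edge, pair.nucleus
    if not licenses(y, x):
        return pair              # not applicable: x keeps percolating
    return merge(y, x, left)


what_star = Simple("what", (), frozenset({"wh"}))
v_stage = Pair(what_star, Simple("mary punch", ("v",)))
c_stage = Pair(what_star, Simple("mary punch", ("C",), frozenset({"^wh"})))

print(im(v_stage) == v_stage)    # no licensing yet, nothing happens
print(im(c_stage))               # the wh-feature is checked and the pair collapses
```

Once IM has applied at the C level, the result is again a simple expression with no unchecked features, which is the configuration the text describes as inactive.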

Now if the derivation outlined above in (12) and (13) reaches the point at which the C-head is merged, it will result in the expression:

⟨what∗^{∧wh}, [ CP mary punch ]^{wh}⟩

This is a configuration where IM applies, so what can be merged with the CP, whereupon the wh-feature is checked. IM is assumed to apply as soon as possible, following the idea that operations in general have to be performed as soon as possible (e.g., stated in the form of the Earliness Principle by Pesetsky (1989) and later adopted by Chomsky (2000) in his condition Maximize Matching Effects). This means that movement cannot skip a potential landing site. Before moving on, let us walk through one more example.

2.3. Example derivation

Consider the sentence Who punched a unicorn with the structure in (14), i.e., the subject wh-phrase originates in vP, first moves to SpecT to check its case feature, and then moves further on to SpecC in order to check its wh-feature.

(14) [ CP who [ C’ C [ TP t_who [ T’ T [ vP t_who punched a unicorn ]]]]]

Suppose we have already built the vP, which corresponds to the expression in (15).

(15) ⟨who∗^{∧case,wh}, [ vP punched a unicorn ]⟩

Next the T-head is merged.

(16) merge T^{case} ⟨who∗^{∧case,wh}, [ vP punched a unicorn ]⟩
= ⟨who∗^{∧case,wh}, merge T^{case} [ vP punched a unicorn ]⟩
= ⟨who∗^{∧case,wh}, [ TP punched a unicorn ]^{case}⟩

At this point of the derivation, IM can apply, which results in merging the two simple expressions. In doing so, the case feature can be checked.

(17) IM ⟨who∗^{∧case,wh}, [ TP punched a unicorn ]^{case}⟩
= merge [ TP punched a unicorn ]^{case} who∗^{∧case,wh}

Now the two expressions can either be merged into another simple expression by applying (M1), or who can be copied, applying (C1). Since it still has an unchecked wh-feature, let us assume it is copied in order to percolate further.

(18) merge [ TP punched a unicorn ] (copy who∗^{wh})
= merge [ TP punched a unicorn ] ⟨who∗^{wh}, c_who⟩
= ⟨who∗^{wh}, merge [ TP punched a unicorn ] c_who⟩
= ⟨who∗^{wh}, [ TP c_who punched a unicorn ]⟩

Merging the C-head will then result in

(19) ⟨who∗^{wh}, [ CP punched a unicorn ]^{∧wh}⟩

Now, IM can apply again, this time in order to check the wh-feature. Then who does not have any more features to check, so both simple expressions can be merged by (M1) into another simple expression.

(20) IM ⟨who∗^{wh}, [ CP punched a unicorn ]^{∧wh}⟩
= merge [ CP punched a unicorn ]^{∧wh} who∗^{wh}
= [ CP who punched a unicorn ]

Written as a feature bundle, this expression is (who punched a unicorn, [C], [ ]). Now that we have seen how derivations proceed in Brosziewski’s model, I also want to make a few remarks on the relation of Brosziewski’s model to other theories, such as Chomsky’s Phase Theory (see Chomsky (2000)).

2.4. Connection to other frameworks

Although Brosziewski’s theory was developed independently of Phase Theory, they share the core idea that expressions assembled by syntax are sent to the interfaces as soon as possible, and that expressions which need to be available for syntactic operations later on have to escape that procedure. In Phase Theory, escaping happens by means of movement to the phase edge (usually the specifier of a designated head), which is not sent to the interfaces immediately but is kept until the next phase is finished. In Brosziewski’s account, escaping happens by copying an expression and thereby keeping its relevant features accessible throughout the further derivation. Both implementations are very similar. In fact, the αi in an expression ⟨α1, ⟨α2, . . . ⟨αn, β⟩⟩⟩ of Brosziewski’s model could be seen as being at the edge of a phase, while everything contained in β was sent to the interfaces. One difference from Phase Theory is that, although the extracting expressions αi are percolated through every step of the derivation, this is not done by application of merge, and thus does not require the insertion of edge features that legitimate the application of that operation. Another difference is the way of reducing the part of the structure that is accessible for syntactic operations. Instead of restricting accessibility to a certain domain (like vP and CP, or every phrase XP), it is restricted to active elements, i.e., expressions that carry as yet unchecked features, independently of how deep in the structure they were introduced.
I do not want to pursue a more detailed comparison with Phase Theory here, but it is interesting to note that Brosziewski’s model is not per se restricted to a certain view on movement. In principle, it can mimic different possibilities:

• Movement in one fell swoop can be implemented by stating that expressions at the edge of a pair are inaccessible to all operations except percolation and remerging. They would then not be able to check features along the way, and could not show morphological reflexes or semantic effects.
• Successive-cyclic movement with uniform paths is what might seem to be the default assumption for Brosziewski’s mechanism: extracted expressions are available during all steps in the derivation, and are thus always accessible for syntactic (as well as semantic and phonological) operations.
• Successive-cyclic movement with punctuated paths would be achieved by the somewhat less obvious assumption that expressions at the edge of a pair are accessible only at certain points in the derivation. This would mean that they could trigger morphological reflexes or show semantic effects in some but not all intermediate positions.

Which of these we prefer to adopt does not play a role for this paper. For a discussion of evidence that points towards punctuated movement paths, cf. Abels and Bentzen (this volume).

I want to mention one possible conceptual objection that can arise from viewing complex expressions as trees, namely that (M2) and (M3) look like counter-cyclic operations because merge operates only on a proper substructure of the expression. First note that when thinking of a complex expression ⟨α, β⟩ as a syntactic expression β together with a stack containing α, this does not have to be the case. Under this view it seems reasonable to assume that the strict cycle only refers to the expression β but not to the stack. This is reasonable since we do not want to require every operation to operate on the stack, especially because during the whole middle part of a dependency the stack should just be passed on without being modified. But when thinking of complex expressions in terms of a tree structure, α is indeed merged below the root:

(21) merge α [ x y ] −→ [ x [ α y ] ]

The only way to avoid this and have a cyclic derivation is to assume that (M2) actually consists of the following two steps: first merging α with the whole expression and then moving x to the edge again.

(22) merge α [ x y ] −→ [ α [ x y ] ] −→ [ x [ α [ y ] ] ]

Arguably, the counter-cyclic (21) and the cyclic (22) are not distinguishable in Brosziewski’s terms. The results of both derivations correspond to the expression ⟨x, merge α y⟩.
To accommodate this, we could formulate a sloppy version of the cycle which – analogously to the stack point of view – refers to the nucleus of an expression only and leaves the edge of an expression out of consideration, i.e., which states that an operation may not apply to a proper subpart of the nucleus of an expression. In phase-theoretic terms this is like always tucking in expressions below the phase edge, but requiring that below the phase edge merge respects the strict cycle. In this sloppy sense, none of the operations throughout the paper violates the cycle. In fact it is impossible for any operation to do so, for the nucleus of an expression is always a simple

A Derivational View on Movement Constraints


expression without internal structure, so no operation can reach into it. (Which is not surprising at all because that was the idea behind expressions in the first place.) Another way to think about (21) is to regard it as the adjoining operation of Tree Adjoining Grammar (cf. Joshi et al. (1975), Kroch and Joshi (1985)): percolating an extracting element through the middle of a dependency without anything happening to it very much resembles adjoining an intermediate structure. The difference is that under Brosziewski’s view the dislocated element is present while building the intermediate structure, thus can trigger morphological reflexes or semantic effects.

3. Remnant movement and Freezing

Starting from the basic mechanism outlined above, Brosziewski concentrates on extending it to also capture head movement (cf. Brosziewski (2003, sect. 3.3)). I want to depart from his considerations and instead take a closer look at possible and impossible movement configurations. In particular, I want to show what remnant movement and Freezing configurations amount to in Brosziewski-style derivations, which problem arises with them, and how to solve it. This will lead to a slight modification of Brosziewski’s mechanism, which – as far as I can see – is compatible with his modifications for incorporating head movement.

For these explorations always keep in mind that complex expressions arise from copying and percolating an expression that is extracted, and that an XP that contains an element α that is extracted from XP is a pair of the form ⟨α, XP⟩ (possibly with more copied subexpressions).

Now, which types of movement are possible up to now? So far, copy is only defined for simple expressions, thus movement is restricted to constituents without subextraction. For example, the extractions in (23) are unproblematic because only simple expressions move.

(23) a. β . . . α . . . [XP . . . t_β . . . t_α . . . ]
     b. [XP . . . α . . . t_α . . . ] . . . t_XP . . .

However, configurations like those in (24) cannot be derived. In both, an XP moves and an expression α is extracted from this XP. So at some point of the derivation, when the whole phrase has to be copied, copy would have to apply to the complex expression ⟨α, XP⟩. Since up to now the operation copy is not defined for complex expressions, this is not possible, and hence it is not possible to move an XP with an extracting subexpression.

(24) a. [XP . . . t_α . . . ] . . . [ . . . α . . . [ . . . t_XP . . . ]]
     b. α . . . [ [XP . . . t_α . . . ] . . . t_XP . . . ]


What differs between (24-a) and (24-b) is the derivational order; (24-a) corresponds to (A), and (24-b) corresponds to (B).

(A) Remnant movement
    First α moves out of XP, then (the rest of) XP moves.
(B) Freezing configuration
    First XP moves, then α moves out of XP.

Extraction from a moved XP is generally taken to be prohibited in languages, whereas remnant movement is assumed to be possible. Consider the following examples from German. (25) instantiates remnant movement: the NP das Buch scrambles out of the VP, and then the remnant VP is topicalized. (26), on the other hand, shows a Freezing effect: first the NP ein Buch worüber is scrambled, and subsequently the PP worüber is extracted from this NP via wh-movement.

(25) [VP t1 Gelesen ]2 hat [NP das Buch ]1 keiner t2
     read has the book_acc no-one_nom
(26) *[PP Worüber ]2 hat [NP ein Buch t2 ]1 keiner [VP t1 gelesen ] ?
     about what has a book_acc no-one_nom read

Let us first look at (26). At some point of the derivation, the NP is built, amounting to the complex expression

(27) ⟨[PP worüber]∗, [NP ein Buch c_PP]⟩

If merged with the V-head, this NP would have to be copied in order to move further up.⁴ But since it is a complex expression, copy cannot apply. So, according to the reasoning above, the Freezing configuration in (26) is indeed excluded.

⁴ Assuming that scrambling is triggered by some feature.

But for the same reason every instance of remnant movement is impossible as well. Consider (25). When the VP is built, it amounts to the following expression.

(28) ⟨[NP das Buch]∗, [VP c_NP gelesen]⟩

This is again a complex expression that cannot be copied, although it would need to be copied when merged with v.

To remedy this, the definition of copy has to be extended to complex expressions. A first try could be to simply keep the list of extracting expressions and copy the nucleus of the pair (for we want to copy the constituent we built and not the list of moving elements):

copy ⟨x, y⟩ = ⟨x, copy y⟩

This allows for remnant movement: in abstract terms, it is now possible to copy the complex expression ⟨α, XP⟩ in the following way:


copy ⟨α, XP⟩ = ⟨α, ⟨XP∗, c_XP⟩⟩

Hence the remnant movement derivation for (25) could proceed with copying the VP in (28) and then merging it with v, yielding the following vP.

(29) ⟨[NP das Buch]∗, ⟨[VP c_NP gelesen]∗, [vP ε c_VP]⟩⟩

The NP and VP at the edge are simple expressions and thus can both be merged internally in the course of the derivation. But for the same reason, Freezing effects are no longer obtained. Copying the NP in (27) and merging it with V gives the following VP:

(30) ⟨[PP worüber]∗, ⟨[NP ein Buch c_PP]∗, [VP gelesen c_NP]⟩⟩

Again, the two expressions at the edge are simple and can be merged internally in the course of the derivation. The situation is exactly the same as with remnant movement. The problem is that copying applies in base position and that there is no way to look ahead that could tell which element lands first, α or XP. Thus keeping both of them as separate elements at the edge, i.e., building an expression ⟨α, ⟨XP, . . .⟩⟩, does not allow one to distinguish between Freezing configurations and remnant movement.

What to do about that? In fact, there is a possibility to distinguish between the case where α lands before the rest of the XP (as in remnant movement) and the case where the complete XP lands first (as in Freezing configurations). The trick is to not percolate α and XP as separate elements at the edge, but instead move the whole expression ⟨α, XP⟩ as one constituent. So instead of building an expression ⟨α, ⟨XP, . . .⟩⟩, we will build an expression ⟨⟨α, XP⟩, . . .⟩, with ⟨α, XP⟩ at the edge. Now when in the course of the derivation α is merged internally first, what remains of this complex expression is just XP, a simple expression. So in this case, IM is applied to simple expressions only. If, however, the complete XP tried to land first, it would at that point of the derivation still be the complex expression ⟨α, XP⟩. So in this case, IM has to be applied to a complex expression, for which it is not defined.

In a nutshell, the idea lies in exploiting the derivational difference between remnant movement and Freezing configurations by moving the expression ⟨α, XP⟩ as one constituent. Then if α lands first and XP lands later on, both can be merged as simple expressions without a problem; however, if the whole, complex expression tries to land first, it will fail to do so.

Technically, the only thing we need to ensure is that α and XP are not kept as separate, independent elements at the edge, but instead remain one constituent while moving. That is, when copying the complex expression ⟨α, XP⟩, the derivation should not produce ⟨α, ⟨XP∗, c⟩⟩, but instead should produce ⟨⟨α, XP∗⟩, c⟩.⁵

⁵ Pied Piping might seem like a Freezing configuration in this theory, because a whole phrase is moved although only a subexpression x (the wh-element itself) in it would be copied. But this need not be the case. One way to account for Pied Piping is to assume that x is not copied but has the last resort option to project its features. Then the more inclusive phrase will be copied, percolated into landing position and there check the projected feature. This is similar to the treatment of Pied Piping in GPSG with the help of feature percolation. Another possibility is to follow more recent theories like Cable (2007). Adopting Cable’s idea, copy could be triggered not by wh-features but rather by some feature of a Q-morpheme higher up in the structure.

With this in place, copying the NP in the Freezing case and merging it with V would not result in (30) but in the following.

(31) ⟨⟨[PP worüber]∗, [NP ein Buch c_PP]∗⟩, [VP gelesen c]⟩

At some point of the derivation, the whole NP would want to land, before the PP worüber reaches its landing site. Then internal merge would find the complex expression ⟨[PP worüber]∗, [NP ein Buch c_PP]∗⟩ at the edge, for which it is not defined. Thus the Freezing effect is obtained.

The situation is different with remnant movement, although it starts out the same, because copying the VP and merging it with v does not result in (29) but in the following.

(32) ⟨⟨[NP das Buch]∗, [VP c_NP gelesen]∗⟩, [vP ε c]⟩

The difference is that the NP reaches its landing site first. Since it is a simple expression, it is allowed to be remerged. Assuming that it adjoins to vP, the result is this:

(33) ⟨[VP c_NP gelesen]∗, [vP das Buch ε c]⟩

Now the extracted VP is a simple expression and nothing prevents it from being remerged later. Thus a remnant movement derivation is perfectly fine.⁶

⁶ The remnant movement account outlined in this section is stronger than, for example, the one in Abels (2007). This is because it prohibits every type of Freezing configuration, while Abels only prohibits Freezing configurations if the movement of the inclusive XP is ‘less urgent’ (higher in a certain hierarchy) than the extraction of an element α out of that XP. On the other hand, Abels permits Freezing configurations if the movement of the XP is ‘more urgent’ (lower in the hierarchy) than α’s movement. He supports this claim with data that show that A-movement can feed wh-movement and topicalization. An approach like Brosziewski’s could possibly account for these cases, for example if based on a treatment of A-movement as it is proposed by Kobele (this volume).

In order for this to work technically, we need to adapt the definitions involved. First of all, we need to make explicit which expressions are considered to be at the edge of a complex expression and thus can be reached by IM.

Definition 6. An expression x is at the edge of a complex expression ⟨y, z⟩ if one of the following three conditions holds:
(i) x is equal to y
(ii) x is at the edge of y
(iii) x is at the edge of z

Clauses (i) and (iii) capture the cases we already encountered. For example, (i) reaches what in ⟨what, z⟩ and (iii) reaches who in ⟨what, ⟨who, z⟩⟩. Up to now (ii) would have been equivalent to (i), because y was always simple. Now (ii) captures the new case of reaching what in

⟨⟨what, x⟩, z⟩. Next we need to generalize IM slightly, in order to consider these new cases of edges as well, i.e., to reach not only the first element of the pair it is applied to but also the edge of this element. What will have to remain, most importantly, is that only simple expressions can be remerged with IM. Finally, the definition of copy needs to be extended with a clause for complex expressions that yields copy ⟨x, α⟩ = ⟨⟨x, α∗⟩, c⟩. For exact definitions see the appendix.
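The interplay of the extended copy, Definition 6, and the restriction of IM to simple expressions can be illustrated with a small Python sketch. It is only an approximation under simplifying assumptions and not the formalization given in the appendix: the names Simple, Pair, copy, edge and internal_merge are hypothetical, features are ignored, expressions are reduced to labels, and the effect that remerging has on the nucleus (for instance adjunction to vP) is left out, so that only the bookkeeping at the edge is modelled.

```python
from dataclasses import dataclass
from typing import List, Union


@dataclass(frozen=True)
class Simple:
    """A simple expression; only its label is modelled (assumption of this sketch)."""
    label: str


@dataclass(frozen=True)
class Pair:
    """A complex expression <first, second>."""
    first: "Expr"
    second: "Expr"


Expr = Union[Simple, Pair]


def copy(e: Expr) -> Pair:
    """copy x = <x*, c_x> for simple x; copy <x, a> = <<x, a*>, c> for a pair."""
    if isinstance(e, Simple):
        return Pair(Simple(e.label + "*"), Simple("c_" + e.label))
    if not isinstance(e.second, Simple):
        raise ValueError("sketch restriction: the nucleus must be simple")
    # The extracting element and the copied phrase stay ONE element at the edge.
    return Pair(Pair(e.first, Simple(e.second.label + "*")), Simple("c"))


def edge(e: Expr) -> List[Expr]:
    """Definition 6: x is at the edge of <y, z> if x = y, or x is at the edge of y or z."""
    if isinstance(e, Simple):
        return []
    return [e.first] + edge(e.first) + edge(e.second)


def _remove(e: Expr, target: Simple) -> Expr:
    """Return e without the first occurrence of target at its edge."""
    if isinstance(e, Simple):
        return e
    if e.first == target:
        return e.second
    reduced = _remove(e.first, target)
    if reduced != e.first:
        return Pair(reduced, e.second)
    return Pair(e.first, _remove(e.second, target))


def internal_merge(e: Expr, target: Expr) -> Expr:
    """Remerge target from the edge of e; return what is left of e afterwards."""
    if not isinstance(target, Simple):
        raise ValueError("IM is not defined for complex expressions")
    if target not in edge(e):
        raise ValueError("target is not at the edge")
    return _remove(e, target)


# Remnant movement, cf. (32) and (33): the NP lands first, then the remnant VP.
remnant = copy(Pair(Simple("NP*"), Simple("VP")))
after_np = internal_merge(remnant, Simple("NP*"))
after_vp = internal_merge(after_np, Simple("VP*"))

# Freezing, cf. (31): the whole NP tries to land while the PP is still part of it.
frozen = copy(Pair(Simple("PP*"), Simple("NP")))
try:
    internal_merge(frozen, frozen.first)
    freezing_blocked = False
except ValueError:
    freezing_blocked = True
```

In the remnant case the NP is a simple expression at the edge, so it can be remerged and leaves the VP behind as a simple expression. In the Freezing case the only way to land the NP is to remerge the pair that still contains the PP, which the sketch rejects.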

4. The Condition on Extraction Domain

A widely assumed constraint on movement is the Condition on Extraction Domain (CED), which originated in Huang (1982). In a simple form, it states that movement must not cross a barrier, where an XP is a barrier iff it is not a complement. Assuming that complements are exactly those expressions that check category features when they are merged, the CED can be stated as in (34).

(34) Condition on Extraction Domain
     Extraction is prohibited from expressions which have been merged without checking category features.

In other words, extraction out of complements is fine (cf. (35)), while extraction out of adjuncts (cf. (37)) and specifiers (cf. (36)⁷) is out.

(35) What1 did Mary tell [a story about t1 ]?
(36) *What1 has [a punching of t1 ] annoyed you?
(37) *What1 did you feel bad [because Mary punched t1 ]?

⁷ This example assumes that subjects occupy a specifier position. I will discuss data on subjects and their treatment below.

To see how this constraint can be expressed in Brosziewski’s terms, recall again that expressions with subextraction are encoded in complex expressions: an XP that contains an element α that is extracted from XP is a pair of the form ⟨α, XP⟩. Those expressions are created by copy, percolated by (M2) and (M3), and remerged by IM. So these operations are where an account of the CED could be rooted. A restriction on copy or IM, however, is not practicable, because these operations do not see whether α is part of a complement, specifier, or adjunct. What is rather called for is a restriction on (M2). (Note that (M3) can be ignored for that purpose, because application of it always boils down to either the case in (M1) or the one in (M2).) The definition of (M2) is repeated here for convenience.

(M2) (preliminary version)
     merge α ⟨x, y⟩ = ⟨x, merge α y⟩

Up to now there is no restriction on (M2), so it allows merge with complex complements (i.e., complements out of which something is extracted), with complex specifiers (i.e., specifiers out of which something is extracted), and with complex adjuncts (i.e., adjuncts out of which something is extracted). A straightforward way to disallow complex specifiers and adjuncts is to restrict (M2) to elements that are subcategorized for, in accordance with the formulation of the CED.

(M2) (still preliminary version)
     merge α ⟨x, y⟩ = ⟨x, merge α y⟩ iff α subcategorizes for y

However, (M2) as it is now is too strong, because it also excludes cases like the one we encountered in the second line of (18), here repeated as (38), where an expression that was internally merged is copied again in order to move further.

(38) merge [TP punches a unicorn] ⟨who∗_wh, c_who⟩

In this case, the TP does not subcategorize for the second element of the pair. Thus merge cannot operate. The problem arises from the fact that there is no difference between pairs which result from copying and pairs where the first element is a proper subexpression of the nucleus, i.e., between a whole phrase moving (as in (38)) and expressions extracting from a bigger constituent. So Brosziewski needs to add another clause to (M2) in order to exempt pairs of copies from the subcategorization restriction. The final version of (M2) then is the following (cf. Brosziewski (2003, (73-b) on p. 56)).

(M2) (final version)
     merge α ⟨x, y⟩ = ⟨x, merge α y⟩ iff
     (i) α subcategorizes for y, or
     (ii) ⟨x, y⟩ is a pair of copies

After having seen how to render adjuncts and specifiers opaque while leaving complements transparent, let us consider subject islands. The situation there is less clear-cut.
In many languages, subjects are islands (as in English), but in some languages extraction from subjects is possible (for example in Japanese, Hungarian, Turkish, Palauan, see, e.g., Stepanov (2007)). In Brosziewski’s model there are basically two options to treat subjects. The first one is to assume that subjects are specifiers. In that case they are predicted to be islands. Languages that allow extraction from them have to lack the restrictions on (M2). (Specifiers created by movement will still be islands because of a Freezing effect.) The other option is to treat them as complements (of either V or v). This is straightforward, since neither multiple specifiers nor multiple complements are excluded. The prediction then is that subjects are transparent for subextraction. Their islandhood can be derived with the help of Freezing, so that subjects are barriers only if they moved from their base position. This is unproblematic in languages in which the subject obligatorily moves to SpecT, as it does in English. However, there are languages in which the subject can stay in situ. Contrary to what is predicted, they might exhibit subject island effects. This can be seen in the following German examples. (That the subject stays in situ is based on the assumption that particles like denn and wohl demarcate the vP edge.)

(39) *Was1 haben denn [DP t1 für Bücher ] [DP den Fritz ] beeindruckt ?
     what have PRT for books_nom the Fritz_acc impressed
(40) *[PP Über wen ]1 hat wohl [DP ein Buch t1 ] [DP den Fritz ] beeindruckt ?
     about whom has PRT a book_nom the Fritz_acc impressed

A very interesting fact is that these subjects can become transparent if the direct object scrambles across them, targeting a higher specifier position of vP.⁸

(41) Was1 haben [DP den Fritz ]2 [DP t1 für Bücher ] t2 beeindruckt ?
     what have the Fritz_acc for books_nom impressed
(42) [PP Über wen ]1 hat [DP den Fritz ]2 [DP ein Buch t1 ] t2 beeindruckt ?
     about whom has the Fritz_acc a book impressed

I have to leave open here what could account for this behavior of subjects (and indirect objects). For different treatments of subjects in the framework of HPSG – either on the argument list of a verb or not – see, e.g., Pollard and Sag (2004) and Müller (2007).
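The final version of (M2) discussed in this section can be sketched in Python as follows. This is a minimal illustration under assumptions of my own, not Brosziewski's definition: subcategorization is reduced to matching the first category on a list of selected categories, a pair of copies is marked by a boolean flag, and the names Simple, Pair, subcategorizes, merge_simple and merge_m2 as well as the category labels are hypothetical. Linear order inside the bracketed labels is not meant to be significant.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Simple:
    label: str
    cat: str                       # category feature of the expression
    selects: Tuple[str, ...] = ()  # categories it still subcategorizes for


@dataclass(frozen=True)
class Pair:
    first: Simple
    second: Simple
    copies: bool = False  # True for a pair <x*, c_x> that results from copying x


def subcategorizes(head: Simple, y: Simple) -> bool:
    """The head selects y if y's category is the next one on its selection list."""
    return bool(head.selects) and head.selects[0] == y.cat


def merge_simple(head: Simple, y: Simple) -> Simple:
    """Merge of two simple expressions; the head projects."""
    rest = head.selects[1:] if subcategorizes(head, y) else head.selects
    return Simple(f"[{head.label} {y.label}]", head.cat, rest)


def merge_m2(head: Simple, pair: Pair) -> Pair:
    """(M2), final version: defined iff head selects the nucleus or pair is a pair of copies."""
    if not (subcategorizes(head, pair.second) or pair.copies):
        raise ValueError("(M2) is undefined: the nucleus is not subcategorized for")
    return Pair(pair.first, merge_simple(head, pair.second))


what = Simple("what*", "D")

# (35): extraction out of a complement is fine.
tell = Simple("tell", "V", ("D",))
complement = Pair(what, Simple("a story about c_what", "D"))
ok_complement = merge_m2(tell, complement)

# (37): extraction out of an adjunct is blocked.
feel_bad = Simple("feel bad", "V")
adjunct = Pair(what, Simple("because Mary punched c_what", "C"))
try:
    merge_m2(feel_bad, adjunct)
    adjunct_blocked = False
except ValueError:
    adjunct_blocked = True

# (38): a pair of copies is exempt from the subcategorization requirement.
tp = Simple("punches a unicorn", "T")
who_copies = Pair(Simple("who*", "D"), Simple("c_who", "D"), copies=True)
ok_copies = merge_m2(tp, who_copies)
```

The result of merging with a pair of copies is no longer flagged as a pair of copies, since its nucleus is now the larger phrase that contains the copy.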

⁸ These melting effects have been identified in Müller (2010). They can also be observed with indirect objects and also occur with scrambling in Czech.

5. Weak islands

Weak islands like wh- and topic-islands do not follow immediately from the derivational mechanism employed so far. In order to account for them, Brosziewski (2003, 64f) hints at how the concept of subcategorization can be widened to involve both syntactic and semantic information, such that it can be sensitive to contrasts like ±finite, ±propositional, and so on. Islands then amount to expressions that are not subcategorized for. But obviously the success of such an approach depends on an appropriate theory of selection.

I want to propose an alternative route that makes use only of the formal means we already have, plus a simple additional constraint: the principle of Unambiguous Domination from Müller (1998). Müller introduced it to capture a core restriction on remnant movement, namely that remnant XPs cannot undergo a certain type of movement if the subextracted expression that created the remnant has undergone the same type of movement. The derivational version of the principle reads as follows.

(43) Unambiguous Domination (derivational version)
     In a structure . . . [A . . . B . . . ] . . ., A and B may not undergo the same kind of movement.

This condition can straightforwardly be expressed in Brosziewski’s terms as a filter over expressions that states that an element with some feature f may not be percolated across a constituent that bears the same feature f.

(UD) * ⟨x_f, y_f⟩

Besides the generalization about remnant movement that motivated (UD), it can cover weak islands such as wh- and topic islands, examples of which are given in (44-a) and (45-a), respectively. If we assume that wh- and top-features on a CP are not deleted when checked but remain visible, the embedded CP corresponds to expressions (44-b) and (45-b), respectively – expressions that are excluded by (UD).

(44) a. *What1 does John think [CP whether Mary punches t1 ] ?
     b. ⟨what∗_wh, [CP whether Mary punches c_what ]_wh⟩
(45) a. *[PP With her bare hands ]2 John thinks that [CP [ a unicorn ]1 Mary punched t1 t2 ]
     b. ⟨[PP with her bare hands]∗_top, [CP a unicorn Mary punched c_PP ]_top⟩

A nice consequence is the following.
With Müller’s version of Unambiguous Domination, it is necessary to define a local domain in which the condition applies, because otherwise sentences like (46) would wrongly be predicted to be out, for the wh-element in the embedded CP and the more inclusive NP undergo the same kind of movement.

(46) [NP Wessen Frage [CP was1 du t1 magst ]]2 hat t2 dich geärgert ?
     whose question what you like has you annoyed

The crucial insight here is that (43) only plays a role in cases where B moves out of A, but not if B moves within A. The representational version that Müller uses (stating that an α-trace may not be α-dominated) does not capture that. However, in a derivational setting, this problem does not arise, because the movement of B will already be finished when A is moved. Thus these two movements should not interfere. And indeed, in a Brosziewski-style derivation of (46), the embedded CP is a simple expression, because was1 has already reached its landing site, and even if the wh-feature is still visible on the CP, it will not project when the NP is built. The NP will be a simple expression with a ∧wh-feature itself; it will be copied and percolated until it reaches its own landing site. At no point of the derivation does a configuration like (UD) arise.
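The filter (UD) from this section lends itself to a short Python sketch. It is an illustration only: features are modelled as plain strings on a frozen set, checked features on a CP are assumed to stay visible as stated in the text, and the names Simple, Pair, edge_elements, nucleus and violates_ud are hypothetical. The check compares each element at the edge with the nucleus only, so two elements at the edge that share a feature are not filtered out.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Union


@dataclass(frozen=True)
class Simple:
    label: str
    features: FrozenSet[str] = frozenset()


@dataclass(frozen=True)
class Pair:
    first: Simple
    second: "Expr"


Expr = Union[Simple, Pair]


def edge_elements(e: Expr) -> List[Simple]:
    """Elements at the edge of a nested pair, listed from outermost to innermost."""
    out = []
    while isinstance(e, Pair):
        out.append(e.first)
        e = e.second
    return out


def nucleus(e: Expr) -> Simple:
    """The innermost second element of a nested pair."""
    while isinstance(e, Pair):
        e = e.second
    return e


def violates_ud(e: Expr) -> bool:
    """(UD): no element at the edge may share a movement feature with the nucleus."""
    shared = nucleus(e).features
    return any(x.features & shared for x in edge_elements(e))


WH = frozenset({"wh"})
what = Simple("what*", WH)
who = Simple("who*", WH)

# (44-b): extraction out of a wh-island.
wh_island = Pair(what, Simple("[CP whether Mary punches c_what]", WH))
# The same extraction out of a CP without a wh-feature (a hypothetical control case).
plain_cp = Pair(what, Simple("[CP that Mary punches c_what]"))
# Two wh-elements at the edge of a nucleus without a wh-feature.
two_at_edge = Pair(what, Pair(who, Simple("[TP c_who punches c_what]")))
```

Under these assumptions `violates_ud(wh_island)` holds, while the two control expressions pass the filter. In a derivation of (46) no pair with matching features is ever built, so the filter never gets a chance to apply.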

6. The Minimal Link Condition

Besides rigid constraints on movement like the CED and island constraints, there is another kind of movement constraint, one that is based on the competition of expressions that carry the same relevant feature. A widely assumed instance of these minimality constraints is the Minimal Link Condition (MLC) from Chomsky (1995). In general terms it requires that of two expressions with the same feature the one which is closest to the target moves. Specifying closeness in terms of c-command or dominance (cf., e.g., Fitzpatrick (2002)), it can be stated as in the following definition.

(47) Minimal Link Condition
     In a structure α_∧f . . . [ . . . β_f . . . γ_f . . . ], movement to ∧f can only affect the expression bearing the f-feature that is closer to ∧f, where β is closer than γ if β dominates or c-commands γ.

First it should be noted that, since derivations do not build phrase structures, dominance and c-command can only play a role to the extent to which they are retrievable from the expressions built. So let us start with examining in which respect the structure in a pair constitutes a representational residue. Clearly, dominance is encoded in pairs: the nucleus α of a pair ⟨x1, ⟨x2, . . . ⟨xn, α⟩⟩⟩ dominates every extracting expression x1, x2, . . . , xn (or is equal to it, as would be the case with a pair of copies). This is always the case, because the xi are extracted from α. With c-command, the situation is different. Although it is the case that a structure in which x c-commands y corresponds to a pair where x is nested deeper in the pair than y, this correspondence does not hold in the other direction. Consider the following structures, where α and β are assumed to have formal features that need to be checked by movement:

[XP α [X′ X β ]]          [XP [YP α Y ] [X′ X β ]]


Although in one α c-commands β whereas in the other one it does not, the derivation of both will build a pair of the following form:

⟨β∗, ⟨α∗, [XP . . . c_α . . . c_β ]⟩⟩

Closer scrutiny reveals that what nestedness in a pair amounts to is the point at which the expressions entered the derivation. In a pair ⟨x1, ⟨x2, . . . ⟨xn, α⟩⟩⟩, xi was merged before xi+1, i.e., the most recently merged expression will end up deepest in the pair (or closest to the nucleus). This is due to how (M2) and (M3) distribute expressions over pairs.

A consequence of this is that intervention effects are bound to work in terms of derivational order. It is therefore predicted that intervention without c-command should be possible – which indeed is the case, as observed by Heck and Müller (2000) and as evidenced, e.g., by the following example.

(48) ?*Who2 did [DP the woman that punched what ] imitate t2 ?

However, it is unclear how exactly to account for these data here. The main problem is that the intervening wh-expression is in situ. If in situ expressions are considered not to be copied, they should not be able to intervene, unless their features somehow remain visible. The same problem arises if one wants to account for superiority effects in languages where only one wh-expression moves overtly, as in the following standard examples from English.

(49) Who1 t1 punches what ?
(50) *What2 does who punch t2 ?

For the moment, let us assume that in a structure like (51-a), all wh-expressions are copied and there is a mechanism to delete the copies of those expressions that do not move overtly. (In fact, it is possible to formulate such a mechanism without much ado, see the appendix for details.) Then the complex expression corresponding to (51-a) is (51-b).

(51) a. [CP C_∧wh [TP who_wh punches what_wh ]]
     b. ⟨what∗_wh, ⟨who∗_wh, [CP c_who punches c_what ]_∧wh⟩⟩

The next step of the derivation would be to apply IM in order to check C’s ∧wh-feature. But, although IM applies to pairs, nothing has been said yet about which pair it applies to if they are nested. IM can apply to the inner pair as well as to the outer pair, either checking C’s feature with who or with what. In order to derive superiority effects, IM has to be required to always proceed inside out. This strategy, however, does not need to be hardwired into the operation IM itself but can simply be a fact about how a derivation proceeds.

The advantage of the approach sketched here over the MLC is that superiority is not expected to be a universal effect. Analogously to superiority effects,


anti-superiority effects, where the farthest expression moves, can be achieved simply by choosing the opposite strategy, i.e., by letting IM proceed outside in. Moreover, these two strategies can vary not only across languages, but also across configurations within one language, e.g., along the lines of Richards (2001), who proposes that anti-superiority is a property of multiple attraction to a single position, whereas superiority is a property of attraction to multiple positions.

Finally, note that (UD)-configurations as in (52-a), where an element at the edge carries the same feature as the nucleus, look very similar to Minimality configurations as in (52-b), where two expressions at the edge carry the same feature.

(52) a. * ⟨x_f, y_f⟩
     b. ⟨x_f, ⟨y_f, z⟩⟩

Indeed, (52-a) can be seen as a Minimality effect under certain assumptions about IM. This becomes clear when considering how a derivation that has built the expression ⟨x_f, y_f⟩ proceeds. The next step involving the pair would be merging it with another expression z. This can have two forms: either the pair ⟨x_f, y_f⟩ subcategorizes for z, or z subcategorizes for the pair ⟨x_f, y_f⟩. The first case will again yield a pair of that form:

(53) merge ⟨x_f, y_f⟩ z = ⟨x_f, merge y_f z⟩ = ⟨x_f, [y z]_f⟩

In the other case, i.e., when z subcategorizes for ⟨x_f, y_f⟩, the pair needs to be copied. Otherwise y would not be able to check its feature f during the further derivation, because it is z that projects its features. The result is the following:

(54) merge z ⟨x_f, y_f⟩ = merge z (copy ⟨x_f, y_f⟩) = merge z ⟨⟨x_f, y_f⟩, c⟩ = ⟨⟨x_f, y_f⟩, z⟩

At some point in the derivation there will be an expression with a corresponding feature ∧f, yielding:

(55) ⟨⟨x_f, y_f⟩, [. . .]_∧f⟩

This configuration triggers IM. Now, if IM is required to target the ‘highest’ element at the edge, i.e., the pair ⟨x_f, y_f⟩ instead of x_f, then the derivation will crash because IM is not defined for remerging complex expressions. That is, under the minimality assumption that IM has to pick the expression that was merged latest, the prohibition in (52-a) can be derived.
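The choice between the two strategies for IM in nested pairs can be sketched in Python as follows. The sketch is a deliberately simplified illustration: a nested pair is flattened into a list of its edge elements from outermost to innermost, features are plain strings, and the names Item and im_target as well as the inside_out parameter are hypothetical. It only models which element IM picks, not the remerge step itself.

```python
from dataclasses import dataclass
from typing import FrozenSet, List


@dataclass(frozen=True)
class Item:
    label: str
    features: FrozenSet[str]


def im_target(edge: List[Item], feature: str, inside_out: bool = True) -> Item:
    """Pick the edge element that checks `feature`.

    `edge` lists the edge elements of a nested pair from outermost to innermost,
    so the last matching element is the one that was merged most recently.
    """
    candidates = [x for x in edge if feature in x.features]
    if not candidates:
        raise ValueError("no element at the edge bears the feature " + feature)
    return candidates[-1] if inside_out else candidates[0]


WH = frozenset({"wh"})
# Edge of (51-b): what is outermost, who is closest to the nucleus.
edge_51b = [Item("what*", WH), Item("who*", WH)]

superiority = im_target(edge_51b, "wh", inside_out=True)
anti_superiority = im_target(edge_51b, "wh", inside_out=False)
```

Proceeding inside out selects who, which corresponds to (49), whereas proceeding outside in selects what, the pattern that is excluded in English (50) but would be expected under an anti-superiority strategy.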


7. Across-the-board extraction

Since extraction in Brosziewski-style derivations combines the idea behind dislocation in GPSG with a Minimalist setting, let us finally take a look at a constraint that is very easy to express in GPSG but turned out to pose serious problems for Minimalist movement approaches: the Coordinate Structure Constraint (cf. Ross (1967)).

(56) Coordinate Structure Constraint
     In a coordinate structure, no conjunct may be moved, nor may any element contained in a conjunct be moved out of that conjunct, unless movement simultaneously affects both conjuncts.

The following examples illustrate this constraint.

(57) *I wonder what1 Mary [ [ punched ] and [ t1 ]].
(58) *I wonder what1 [ [ John hugged t1 ] and [ Mary punched a unicorn ]].
(59) I wonder what1 [ [ John hugged t1 ] and [ Mary punched t1 ]].

The gist of the GPSG account (see especially Gazdar (1981)) is that only expressions of the same category can be coordinated. And since gaps are encoded in the category of an expression, this automatically accounts for across-the-board extraction.

Now let us first look at how the generalization that only expressions of the same category can be coordinated can be achieved with Brosziewski’s model. This is actually fairly easy; all we have to assume is that a coordinating expression like and subcategorizes for the same category twice. The lexical entry for and that does that is given in (60), where X is a polymorphic feature that can be instantiated by any categorial feature.

(60) and = (and, [∧X, ∧X, X], [ ])

But what this lexical item cannot account for yet is the fact that across-the-board extraction is possible, because – unlike in GPSG – extraction is not encoded in the syntactic category. In fact, it cannot be, because even when an expression is moved, a copy is merged in base position in order to fulfill subcategorization requirements. Thus from the point of view of syntax, the constituent is complete. In other words, there are no gaps in Brosziewski’s model. However, the derivational mechanism does keep track of extracted expressions. Thus across-the-board extraction can be achieved by defining a recursive coordination function like the following.

(CO) coord α β = merge (merge and β) α
     coord ⟨x, y⟩ ⟨x, z⟩ = ⟨x, coord y z⟩


This function is only defined for expressions that have exactly the same edge: either none at all, or equivalent copies of extracted expressions. As an example for how this works, consider (61) with parts of the derivation in (62).

(61) I wonder what1 [ [ John hugged t1 ] and [ Mary punched t1 ]].
(62) coord ⟨what∗, [Mary punched c_what ]⟩ ⟨what∗, [John hugged c_what ]⟩
     = ⟨what∗, coord [Mary punched c_what ] [John hugged c_what ]⟩
     = ⟨what∗, merge (merge and [Mary punched c_what ]) [John hugged c_what ]⟩

Although an extra function for coordination is needed, it reduces to merge, such that no additional means for deleting the edge of one conjunct have to be presupposed. Moreover, since merge is a binary function and the first clause of the definition (CO) assumes it to first combine and with the second conjunct and only after that with the first one, hierarchy effects for the two conjuncts are expected, as they are found for example with binding as in (63).

(63) a. Mary punched [ [ every unicorn ]i and [ itsi owner ]].
     b. *Mary punched [ [ itsi owner ] and [ every unicorn ]i ].

One minor flaw, however, is that the definition (CO) still allows for the case of whole conjuncts extracting, as long as they are the same, as, e.g., in the ungrammatical What1 did Mary punch [t1 and t1 ]. To exclude this, it would have to be assumed that α and β in the first clause of (CO) cannot be phonologically empty.
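The recursive definition (CO) can be tried out with the following Python sketch. It is a toy model under assumptions of my own: expressions are reduced to labels, merge simply brackets its two arguments, category matching via the entry in (60) is not modelled, and the names Simple, Pair, AND, merge and coord are hypothetical. The first argument of coord is taken to be the first conjunct, following the first clause of (CO) as stated above.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Simple:
    label: str


@dataclass(frozen=True)
class Pair:
    first: "Expr"
    second: "Expr"


Expr = Union[Simple, Pair]

AND = Simple("and")


def merge(head: Simple, other: Simple) -> Simple:
    """Toy merge of two simple expressions: bracket them; order in the label is not significant."""
    return Simple(f"[{head.label} {other.label}]")


def coord(a: Expr, b: Expr) -> Expr:
    """(CO): coord a b = merge (merge and b) a;  coord <x,y> <x,z> = <x, coord y z>."""
    if isinstance(a, Simple) and isinstance(b, Simple):
        return merge(merge(AND, b), a)
    if isinstance(a, Pair) and isinstance(b, Pair) and a.first == b.first:
        return Pair(a.first, coord(a.second, b.second))
    raise ValueError("coord is only defined for conjuncts with identical edges")


what = Simple("what*")
john = Pair(what, Simple("John hugged c_what"))
mary = Pair(what, Simple("Mary punched c_what"))

# Across-the-board extraction as in (59): both conjuncts carry the same edge.
atb = coord(john, mary)

# Extraction from one conjunct only, as in (58): the edges differ.
try:
    coord(john, Simple("Mary punched a unicorn"))
    csc_blocked = False
except ValueError:
    csc_blocked = True
```

Because the shared edge is stripped off once and put back once, the result carries a single occurrence of what∗, so nothing has to be deleted from either conjunct.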

8. Conclusion

This paper started by presenting Ulf Brosziewski’s model of syntactic derivations as an implementation of a strongly derivational organization of grammar. Let me finally indicate where it would be classified with respect to the hierarchy of derivational approaches developed in Brody (2002). Brody calls a derivational theory weakly representational if the objects that derivations generate are transparent in the sense that the material they are assembled of is accessible for later operations. With respect to this characterization, Brosziewski’s expressions could suitably be called half-transparent: first because only some of the assembled material is still accessible later on, and second because it is so only as long as necessary, but not throughout the whole derivation. So Brosziewski’s model can be argued to reside somewhere between nonrepresentational and weakly representational theories. Even more, Brosziewski’s theory can be understood as a case in point for attempts to reduce the amount of representations in syntax to a minimum.⁹

⁹ A drawback for this point of view, however, is my use of (UD) to account for weak islands in section 5, because (UD) is a constraint over the expressions generated – something that Brody

426

Christina Unger

I then focused on showing how different kinds of constraints on movement can be captured and that actually very little is necessary to cover the core facts. The employed means were of two sorts. The first one were filters over expressions or over the input of functions. One such filter consists in restricting the operation merge such that complex expressions can only be merged if they are subcategorized for. That allowed to account for a range of CED effects. Another filter was M¨uller’s principle of Unambiguous Domination, that was expressed as disallowing expressions of the form x f , y f . Besides the general restriction on remnant movement that it was designed for, it captured weak island effects like induced by wh- and topicalization islands. Another kind of constraint that seemed to naturally emerge in a derivational approach were conditions on the order in which operations apply. It was indicated how the order of application of the function IM in nested pairs allows to derive minimality effects or the lack of them in a quite flexible way. However, these effects also highlighted a problem, namely how expressions that stay in situ are visible, e.g., in order to be able to intervene. In the appendix I provide a provisional solution for languages in which exactly one wh-expression moves, but clearly more needs to be said about multiple wh-movement across languages and about intervention effects in general. Another open issue is to derive the considered movement constraints from more general properties of the computational system, requirements of the interfaces, or the feature make-up of lexical expressions. This mainly calls for a refinement of the feature system, with respect to syntactic features as well as regarding phonological and semantic features, whose influence on derivations was not considered yet at all.

Appendix Three things in the main text were left open. For section 3, the extension of IM still has to be given. This is captured by the following definition. (64) IM x,y = merge x ,y α

for some α that

(i) is at the edge of x, and (ii) is licensed by the nucleus of y where x is x without α . The nucleus of an expression can be explicitly defined by the following recursive definition: would classify as strongly representational. However, instead of prohibiting expressions x f ,y f , one could also specify merge x f y f as undefined or derive (UD) as a minimality effect as sketched in section 6.


nuc α = α
nuc ⟨x,y⟩ = nuc y

From the same section also the exact, recursive definition of copy was postponed. It is given by:

(C2) copy ⟨x,y⟩ = ⟨⟨x, y[ŷ := ŷ1]⟩, ŷ2⟩
where ŷ = nuc y, ŷ1 = π1 (copy ŷ), ŷ2 = π2 (copy ŷ),

where y[x := z] is like the expression y except that x is replaced by z, and where π1 and π2 grab the first and second element of a pair, respectively.

In section 6, I claimed that a mechanism to delete the copies of those expressions that do not move overtly can be stated rather easily, in order to capture languages in which exactly one wh-expression moves. What is needed are three assumptions:
(i) The operation copy is not optional but obligatory (triggered by unchecked formal features).
(ii) After a feature f was checked, all other instances of f in the expression are deleted.
(iii) Occurrences of empty expressions (ε, [ ], [ ]) can be deleted since they will never play any role in the derivation, i.e., ⟨(ε, [ ], [ ]), y⟩ = y.

Consider the example (49) from section 6, here repeated as (65). When the derivation has built the expression in (66-a), it proceeds as follows. As assumed in that section, IM will apply inside out, so C’s feature ∧wh will be checked with the wh-feature of who and we get (66-b). Then, according to the second assumption above, all other instances of the wh-feature are deleted, yielding (66-c).

(65) Who1 t1 punches what ?

(66) a. ⟨what∗wh, ⟨who∗wh, [ CP cwho punches cwhat ]∧wh⟩⟩
b. ⟨what∗wh, [ CP who cwho punches cwhat ]⟩
c. ⟨what∗, [ CP who∗ cwho punches cwhat ]⟩

Now two cases are possible. First, what∗ can have other unchecked features. In that case the derivation proceeds as usual. Extraction of what is then motivated by some other feature checking and could for example constitute a topicalization movement. The second case is that what∗ does not have any other unchecked features. Then this expression can be of two forms: either it is the expression (what, [ ], [ ]) or the expression (ε , [ ], [ ]), depending on which copy


actually carries the phonological representation. In the first case, the expression is not empty and thus cannot be deleted. Neither can it be remerged with IM at any later point (this is because IM is always triggered by formal features which the expression does not have). Thus the derivation will never reach a simple expression and thus cannot succeed. In the second case, the expression is empty and can be deleted according to the third assumption. Then the desired effect is achieved: the wh-feature is deleted, so the derivation can proceed. Moreover the phonological content is on the copy that was merged in situ, thus the resulting expression will appear as if what was not moved at all. What this would come closest to in a classical movement analysis is feature movement.

Bibliography
Abels, Klaus and Kristine Bentzen (this volume): Are Movement Paths Punctuated or Uniform?
Abels, Klaus (2007): ‘Towards a Restrictive Theory of (Remnant) Movement’, Linguistic Variation Yearbook 7, 53–120.
Adger, David (2008): A Minimalist Theory of Feature Structure. Ms., Queen Mary University of London.
Brody, Michael (2002): On the Status of Representations and Derivations. In: S. Epstein and T.D. Seely, eds, Derivation and Explanation in the Minimalist Program. Blackwell, pp. 19–41.
Brosziewski, Ulf (2003): Syntactic Derivations. A Nontransformational View. Linguistische Arbeiten 470, Niemeyer. (PhD thesis from 2000).
Cable, Seth (2007): The Grammar of Q: Q-Particles and the Nature of Wh-Fronting, as Revealed by the Wh-Questions of Tlingit. PhD thesis, MIT, Cambridge, Mass.
Chomsky, Noam (1995): The Minimalist Program. MIT Press.
Chomsky, Noam (2000): Derivation by Phase. In: M. Kenstowicz, ed., Ken Hale: A Life in Language. MIT Press, pp. 1–54.
Fitzpatrick, Justin (2002): ‘On Minimalist Approaches to the Locality of Movement’, Linguistic Inquiry 33, 443–463.
Gazdar, Gerald (1981): ‘Unbounded Dependencies and Coordinate Structure’, Linguistic Inquiry 12, 155–184.
Gazdar, Gerald, Ewan Klein, Geoffrey K. Pullum and Ivan A. Sag (1985): Generalized Phrase Structure Grammar. Blackwell, Oxford.
Heck, Fabian and Gereon Müller (2000): Successive Cyclicity, Long-Distance Superiority, and Local Optimization. In: R. Billerey and B. D. Lillehaugen, eds, Proceedings of WCCFL 19. Somerville, MA: Cascadilla Press, pp. 218–231.
Huang, Cheng-Teh James (1982): Logical Relations in Chinese and the Theory of Grammar. PhD thesis, MIT, Cambridge, Mass.
Joshi, Aravind K., Leon S. Levy and Masako Takahashi (1975): ‘Tree Adjunct Grammars’, Journal of Computer and System Science 10, 136–163.
Kobele, Gregory M. (this volume): Deriving Reconstruction Asymmetries.
Kroch, Anthony and Aravind K. Joshi (1985): The Linguistic Relevance of Tree Adjoining Grammar. Technical report, University of Pennsylvania Department of Computer and Information Sciences Technical Report MS-CIS-85-16.
Müller, Gereon (1998): Incomplete Category Fronting. Kluwer.
Müller, Gereon (2010): On Deriving CED Effects from the PIC. Linguistic Inquiry 41, 35–82.
Müller, Stefan (2007): Head-Driven Phrase Structure Grammar: Eine Einführung. Stauffenburg, Tübingen.
Pesetsky, David (1989): Language-Particular Processes and the Earliness Principle. Ms., MIT, Cambridge, Mass.
Pollard, Carl and Ivan A. Sag (2004): Head-Driven Phrase Structure Grammar. University of Chicago Press, Chicago.
Richards, Norvin (2001): Movement in Language. Oxford University Press, Oxford.
Ross, John Robert (1967): Constraints on Variables in Syntax. PhD thesis, MIT, Cambridge, Mass.
Stepanov, Arthur (2007): ‘The End of CED? Minimalism and Extraction Domains,’ Syntax 10, 80–126.
Uriagereka, Juan (1999): Multiple Spell-Out. In: S. Epstein and N. Hornstein, eds, Working Minimalism. MIT Press, pp. 251–282.

Semantic Computing Group
Universität Bielefeld

Klaus Abels & Kristine Bentzen

Are Movement Paths Punctuated or Uniform?*

Abstract
Generative syntacticians in the orthodox Chomskyan tradition have usually assumed that movement paths are punctuated in the sense of Abels (2003). In this paper we raise the question of what type of evidence would bear on the issue of punctuation. We then investigate whether existing arguments for punctuation stand up to scrutiny. While the evidence proposed in Abels (2003) is not convincing, that presented in Nissenbaum (2001) is valid. We propose a new argument for the punctuated nature of movement paths based on scope and binding reconstruction in Norwegian. If our conclusions are accepted, then movement should not be modeled in terms of a one-fell-swoop operation nor should it be understood as completely local percolation of features, local adjunction, or local function composition. Instead, the orthodox view of cyclic hopping finds empirical support.

1. Introduction
In this paper we investigate how movement dependencies should be modeled. Movement is certainly the most-studied example of a long-distance dependency in language and it is therefore the focus of our investigation. We take the issue of whether movement dependencies are mediated in a very local, medium range local, or long-distance manner to be an empirical question. There are several different modes of investigation that one could use to pursue this issue. They will rely on effects that movement has on the material that has been crossed by the movement. Whether such effects exist, where and how they are expressed are all empirical questions. There is no a priori answer to the question of whether the fact that movement occurred in (1) has an effect on the material that has been crossed.

(1) Which book does John think that Mary said that Frank believes that he should tell the police that it is unlikely that Edward has read twhich book

* We would like to thank the audience at the 2008 DGfS workshop on Local Modelling of Non-Local Dependencies in Syntax, and Winnie Lechner for a constructive review.

Local Modelling of Non-Local Dependencies in Syntax, 431-452. Artemis Alexiadou, Tibor Kiss & Gereon Müller (eds.). Linguistische Arbeiten 547, de Gruyter 2012


What is clear is that the material that falls between a filler and the corresponding gap in a movement dependency has an effect on that dependency. The most obvious case is that of island effects. While (2-a) is ambiguous between the readings in (2-a-i) and (2-a-ii), the ambiguity disappears under a minor change of an element along the path of movement, that is, the replacement of that by how in (2-b). Such effects of the material crossed on the dependency necessitate some notion of path of movement.

(2) a. When did the boy say that he hurt himself?
(i) When did the boy say [that he hurt himself when]?
(ii) When did the boy say [that he hurt himself] when?
b. When did the boy say how he hurt himself?
(i) *When did the boy say [how he hurt himself when]?
(ii) When did the boy say [how he hurt himself] when?

Example (3) shows that changing that to how along the linear path between filler and gap does not always give rise to the effect seen in (2). This is why paths must be construed in hierarchical terms. All modern theories of grammar make available the relevant notion of path in one way or another.

(3) a. When did [the boy who told his mother [that he hurt himself]] go to bed when?
b. When did [the boy who told his mother [how he hurt himself]] go to bed when?

Given this much, it is unsurprising that the influence between paths and filler-gap dependencies also goes the other way. What lies along the path of movement influences whether and what type of movement is possible. Conversely, movement along a path seems to exert an influence on what lies along the path. This is shown by familiar effects from word order (e.g., the famous inversion under question formation in Spanish (Torrego (1983; 1984); Uribe-Echevarria (1992))) and morphology (e.g., the alternation in the shape of the complementizer in Irish (McCloskey (1979; 1990; 2002); Noonan (1997))), shown in (4). Reconstruction effects to places along the path – like the reconstruction effects for binding theory to intermediate landing sites, sometimes called pit-stop reflexives, as discussed in Barss (1986) – show yet another type of interaction between path and moving item.

(4) Irish examples from McCloskey (1990, 205)
a. Dúirt sé gur bhuail tú é
said he COMP struck you him
‘He said that you struck him.’
b. an rud a shíl mé a dúirt tú a dhéanfá
the thing COMPt thought I COMPt said you COMPt do-COND-2SG
‘the thing that I thought you said you would do.’


We introduce here a distinction between two types of theories which comes from Abels (2003). Abels distinguishes between punctuated and uniform paths. A path will be called punctuated if some nodes along it are affected by having been moved through while others are not. A path will be called uniform if all nodes along it are affected in the same way. HPSG, Categorial Grammar, and the theory of the Configurational Matrix are examples of theories where paths are treated uniformly. All nodes along the path are affected – and are affected in the same way. Tree Adjoining Grammars offer a theory that is uniform in a very different way: the nodes along the movement path remain uniformly unaffected. On the other hand, theories in the narrower Chomskyan tradition postulate punctuated paths. This is true of the Extended Standard Theory of the seventies, where only selected nodes, namely the COMP nodes, along the path were affected. This is true of the Barriers theory, where intermediate landing sites are available at some nodes along the path while they are unavailable at others. The same is true, of course, also of the more recent idea of little vP and CP as phases. Even in theories where landing sites are quite close together, as for example in Chomsky and Lasnik (1993), Takahashi (1994), Stroik (1999), Boeckx (2001; 2008), and Bošković (2007), it still remains true that only the maximal projections along the path are affected, but not intermediate projections.1

1 Abels (2003) calls theories where the nodes affected by movement are very close together quasi uniform. The reason for this terminological move was the assumption that it would be empirically very difficult to distinguish quasi uniform theories from uniform theories, while it seemed at the time easier to distinguish punctuated theories with wide gaps between the affected nodes from the other two types. We believe that this assumption was wrong. We might still end up with a category of quasi-uniform theories. For example Lechner (2009) proposes that every instance of external merge and most (see Lechner’s paper for details) instances of internal merge trigger displacement, leading to a theory where there can be several intermediate landing sites within one and the same maximal projection. Still, not every node is affected identically.

Most so-called cyclicity effects have no direct bearing on the question of punctuated versus uniform paths. The Irish data, for example, are compatible with various uniform and punctuated analyses. Bouma et al. (2001) treat the alternating element as a preverbal particle rather than a complementizer, but this has no bearing on the logic of the situation. They use a theory where paths are uniform, HPSG, and make the shape of the alternating element depend on whether its sister has an empty or a non-empty SLASH value. The analyses of the same alternation that McCloskey has given over the years (with the exception of McCloskey (1979)) treat the alternation in terms of a punctuational model where the shape of the complementizer depends on a local relation with the moving element at various stages of the derivation. The moving element itself “leapfrogs” as it were, leaving many nodes along the path completely untouched. Finally, we can give a uniform non-local account of the alternation. We could assume the following realization rule for the complementizer in Irish.2

(5) a. Realize an instance of the complementizer C0 as aL (leniting) if there is a movement chain in which the head c-commands C0 and C0 c-commands the foot.
b. Otherwise realize an instance of the complementizer C0 as aN (nasalizing) if it is locally c-commanded (Spec-Head) by an operator.
c. Otherwise realize an instance of the complementizer C0 as go.
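The ordered rule in (5) can be made concrete with a small Python sketch. The encoding is an assumption of this illustration and not part of the proposal: positions are integers on a single c-command spine (a smaller number is structurally higher), a chain is a pair of head and foot positions, and the function name realize_c0 is hypothetical.

```python
def realize_c0(c_depth, chains, operator_in_spec):
    """Return the form of an Irish complementizer under the ordered rule (5).

    c_depth: position of C0 on a single c-command spine (smaller = higher).
    chains: (head_depth, foot_depth) pairs, one per movement chain.
    operator_in_spec: True if an operator locally c-commands C0 (Spec-Head).
    """
    # (5-a): some chain's head c-commands C0 and C0 c-commands its foot.
    if any(head < c_depth < foot for head, foot in chains):
        return "aL"
    # (5-b): otherwise, an operator sits in the local specifier.
    if operator_in_spec:
        return "aN"
    # (5-c): the elsewhere form.
    return "go"
```

The condition in the first branch refers to the head and the foot of a chain at arbitrary distance from C0, which is what makes this account non-local; no intermediate landing site is consulted.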

Similar considerations make even fairly complex arguments that demonstrate the existence of a particular reconstruction site silent on the issue of punctuated versus uniform movement paths; thus, while (6) argues for the existence of a reconstruction site for the topicalized noun phrase in between the position of the subject and the object of ask, it does not bear on the question whether all nodes between the subject and object of ask can serve as reconstruction sites or just some.

(6) a. [The papers that he1 wrote for Ms. Brown2 ] every student1 [VP t’ asked her2 to grade t]
b. *[The papers that he1 wrote for Ms. Brown2 ] she2 [t’ asked every student1 to revise t]
(Lebeaux (1990), see also Fox (2000, 10-11))

In this paper we argue that movement paths are punctuated. In section 2 we discuss the shape that a true argument for the punctuated path hypothesis would have to take. In section 3 we investigate whether the argument in Abels (2003) for the punctuated path hypothesis is compelling, reaching a negative conclusion. In section 4, we offer a set of data from Norwegian as empirical support in favor of punctuated movement paths. Section 5 is a short survey of other configurations that would be involved in constructing prima facie arguments for the punctuated path hypothesis. The relevant cases have not been investigated yet. Section 6 contains a brief speculation on the location of intermediate reconstruction sites. Section 7 concludes the paper.

2 There might be an indirect argument here against a non-local treatment. The rules (5-a) and (5-b) are not ordered by the elsewhere principle unless ‘c-commands’ is replaced by ‘locally c-commands’ in the formulation of the first condition.


2. What constitutes a valid argument for punctuated paths?
The putative arguments for the punctuated nature of movement paths mentioned in the previous section can all be construed as arguments from reconstruction: reconstruction for (local) agreement in the case of Irish complementizer agreement and reconstruction for binding and scope in the case of topicalization. What these arguments seem to show is that some nodes along the path of movement are affected because they are reconstruction sites. These arguments do not bear on the question of the punctuated nature of paths, since they are fully compatible with a theory where all nodes along the path are affected. To give a true argument for the punctuated nature of paths, we therefore need to show that some nodes along the path are unaffected by movement while others are affected. As noted for example in Abels (2003) and Boeckx (2008), there is little if any convincing empirical evidence to argue for the absence of reconstruction to a particular position. The situation is complicated by the fact that even the lack of reconstruction (construed in the broadest sense) to a particular position is not direct evidence for the punctuated nature of paths; a node might have been affected by movement, yet, for independent reasons, we might be unable to show this. Boeckx (2008, 58) expresses this clearly at the end of the following quotation: “Whereas the copy theory of movement readily accounts for reconstruction by involving the interpretation of unpronounced copies, we cannot conclude from this that if no reconstruction effect is found, no copy is available at the relevant site. All we can conclude from the absence of reconstruction is either that there is no copy present, or that a copy was created, but for some (perhaps interpretive) reason cannot be interpreted in the relevant position.”

A well-known case where reconstruction is blocked is provided by the readings that quantified arguments get when they are extracted from a weak island. Consider example (7). There is no reconstruction of the restriction of the wh-phrase into the weak island (see Longobardi (1991); Cinque (1990); Cresti (1995); Frampton (1999)), hence, only a de re reading of the wh-moved NP is available. This could be taken to indicate that there is no copy of the wh-phrase inside of the weak island. This conclusion would be rash however – and a different explanation for the lack of reconstruction has to be sought – since there is reconstruction into the island for other properties such as binding (Cinque (1990); Starke (2001)).

(7) a. How many people do you think that John invited?
b. How many people do you wonder whether to invite?

What is striking about this case and others like it is that while reconstructive behavior is not uniform along the entire length of the path, it is monotonic: for


some reconstructive property P, the path is cut into two contiguous bits one of which allows and the other one of which disallows reconstruction. Let us make a terminological distinction between uniform, (non-uniform) monotonic, and punctuated reconstruction patterns. Uniform reconstruction patterns are those where no two points along a path can be distinguished by their reconstructive behavior, i.e., either reconstruction is possible to every point along the path or to none. In Figure 1 this would correspond to a situation where either reconstruction is possible at all points along the path in between any two elements, i.e., where reconstruction to all of α, β, γ, and δ is possible, or else where no reconstruction is possible at all, i.e., none of α-δ are available for reconstruction. On the other hand, non-uniform monotonic patterns are those where the path can be divided into two contiguous bits one of which allows and the other one of which disallows reconstruction. In Figure 1 this would be the case if reconstruction was available to α and β but not to γ and δ. The weak island extraction facts are a case of this sort, where reconstruction of the nominal restriction is possible above the island inducing element but not below it.3 A punctuated reconstruction pattern is one where there are sites for reconstruction both above and below sites that do not allow reconstruction. In Figure 1, we would speak of a punctuated reconstructive pattern if α and δ were possible reconstruction sites while β and γ were not, if reconstruction sites alternated with non-reconstruction sites, etc.

[Tree diagram. Node labels from top to bottom: XP, A, α, B, β, C, γ, D, δ, E, tXP]
Figure 1. Path between tXP and XP with four distinct points along the path, α-δ
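The three-way terminology can be restated as a small Python sketch that classifies a sequence of sites by how often availability of reconstruction switches along the path. The function name classify and the boolean-list encoding are assumptions made for this illustration; treating every pattern with two or more switches as punctuated is likewise a simplifying assumption that covers the alternating cases mentioned in the text.

```python
def classify(sites):
    """Classify a reconstruction pattern along a movement path.

    sites: booleans ordered from the top of the path to the bottom;
    True means that reconstruction to that point is possible.
    """
    # Count how often availability changes between neighbouring points.
    switches = sum(a != b for a, b in zip(sites, sites[1:]))
    if switches == 0:
        return "uniform"
    if switches == 1:
        return "non-uniform monotonic"
    return "punctuated"


# The four points of Figure 1, in the order alpha, beta, gamma, delta.
weak_island_like = classify([True, True, False, False])
alpha_and_delta_only = classify([True, False, False, True])
```

With the four points of Figure 1, availability at α and β only yields the non-uniform monotonic pattern, and availability at α and δ only yields the punctuated one.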

3 Notice that uniform reconstructive patterns are also monotonic, hence the modifier ‘non-uniform.’

Different theories of movement give rise to different expectations regarding reconstructive patterns. Uniform theories of movement predict uniform reconstructive patterns and need to invoke additional assumptions to handle non-uniform monotonic and punctuated reconstructive patterns. Theories of movement that predict punctuated movement paths, on the other hand, give rise to the expectation that we should find punctuated reconstructive patterns, and need additional assumptions to deal with non-uniform monotonic and uniform patterns. Therefore, if a punctuated reconstructive pattern can be found, this provides a prima facie argument for a punctuated theory of movement paths. Such an argument will fall if an independent reason can be found why reconstruction to particular points along the path is blocked (the second disjunct in the quote from Boeckx above) or if reconstructive behavior for different properties does not align, i.e., if a position is not a scope reconstruction site but it is a binding reconstruction site, etc. Since these matters have not been investigated in detail before, we will present here two case studies of reconstructive patterns, the second of which will turn out to provide a prima facie argument for punctuated paths. We also sketch the logic of other potential cases which should be but have not been investigated. We are not able at this moment to tell whether our prima facie argument will eventually fall for one of the two reasons mentioned above.

3. Proposed evidence for punctuated paths (Abels (2003))
Let us start by looking at a case involving binding condition A. The locality inherent in Principle A of binding theory allows us to probe for lack of intermediate landing sites. Given that, in a language like English, binding condition A roughly requires the antecedent and the anaphor to be clausemates, binding condition A is a relatively coarse measure of the absence/presence of intermediate landing sites. The relevant structure is given below in Figure 2. In the structure there is an anaphor contained in a moving phrase, XP. Under the punctuated path hypothesis, there would be various traces/copies of XP, concretely in Figure 2 there are three. For each of the copies there is a certain local domain within which the anaphor has to be bound if that copy is to be involved. This is schematized in Figure 2 by the nodes labeled DomainP which are cosuperscripted with the trace/copy for which they constitute the binding domain. Finally, there are various potential antecedents for the anaphor. DomainP indicates the maximal possible binding domain of the anaphor within the moving constituent from the closest trace position. What we would test is whether there are DPs that cannot antecede the anaphor despite the fact that they c-command one or more copies of it, simply because these DPs are not sufficiently local to any of the intermediate copies. This is illustrated in Figure 2,


where pit-stop binding by Antecedent3 and Antecedent1 ought to be possible, while the same should not hold for Antecedent2.

[Tree diagram. Node labels from top to bottom: XP (containing the anaphor), DomainP1, Antecedent1, t1XP, Antecedent2, DomainP2, Antecedent3, t2XP, DomainP3, tXP]
Figure 2. Schematic representation of the argument in Abels (2003)
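The test schematized in Figure 2 can be stated as a short Python sketch. The encoding is an assumption of this illustration: every node is given an integer depth on a single spine (a smaller number is structurally higher), each copy of XP is paired with the depth of its DomainP, and the function name can_bind and the concrete depths are hypothetical.

```python
def can_bind(antecedent, copies):
    """True if the antecedent is local enough to some copy of the moved XP.

    antecedent: depth of the potential antecedent on a single spine.
    copies: (domain_top, copy_depth) pairs, one per copy of XP, where
    domain_top is the depth of the DomainP node for that copy.
    """
    # The antecedent must c-command the copy from inside the copy's domain.
    return any(top < antecedent < copy for top, copy in copies)


# Hypothetical depths mirroring Figure 2: DomainP1 = 1, Antecedent1 = 2,
# t1 = 3, Antecedent2 = 4, DomainP2 = 5, Antecedent3 = 6, t2 = 7,
# DomainP3 = 8, base position = 9.
copies = [(1, 3), (5, 7), (8, 9)]
```

Under these depths Antecedent1 and Antecedent3 fall inside the domain of some copy, whereas Antecedent2 c-commands two copies without being in the domain of either, which is the configuration the argument is looking for.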

Abels (2003) attempts an argument of this shape. It is of course well-known that anaphors may be bound at various points along the movement path (cf. Barss (1986)). This is illustrated in (8). In (8-a), the anaphor himself within the wh-phrase is bound by John in its surface position, whereas in (8-b), herself is bound in a position below Mary, presumably its base position. In (8-c) himself is bound by John in some intermediate position.

(8) a. Johni wonders which pictures of himselfi Mary likes.
b. John wonders which pictures of herselfi Maryi likes.
c. Which pictures of himselfi does Jane believe (that) Johni thinks (that) she likes?

However, these examples do not tell us anything about whether paths are uniform or punctuated. A clause like (8-c) can be construed with either a uniform or a punctuated path, as shown in (9)-(10).

(9) Uniform path: [which picture of himself]i [ ... John ... [ vP ti v0 [ VP ti thinks [ CP ti that [ TP ti Mary [ ...]]]]]]

(10) Punctuated path: [which picture of himself]i [ ... John ... [ vP ti v0 [ VP thinks [ CP ti that [ TP Mary [ ...]]]]]]

Crucially, though, Abels (2003) provides a context in which binding of a moved anaphor is not possible. Consider the pair in (11). In (11-a), the experiencer of seem can bind the anaphor in the moved wh-phrase. In (11-b), when seem is used as a raising verb, however, this is not possible.

(11) a. Which picture of himselfi did it seem to Johni that Mary liked?
b. *Which picture of himselfi did Mary seem to Johni to like?

Abels (2003) claims that this is because in (11-a) there is an available SpecCP position below John, which the wh-phrase moves through or adjoins to, and this position is local enough for John to bind the anaphor, as shown in (12).

(12) [Which picture of himself]i it [ VP1 seem [ VP2 to John tseem [ CP ti that [ TP Mary [ VP3 ti liked ti ]]]]]

In (11-b) on the other hand there is no such position available, as illustrated in (13). The raising infinitive is taken to be a TP rather than a CP, and following Chomsky (1986), adjunction to TP is not allowed. Furthermore, the wh-phrase could not have moved through SpecTP, as the trace of Mary occupies this position.

(13) [Which picture of himself]i Mary [ VP1 seem [ VP2 to John tseem [ TP tMary to [ VP3 ti like ti ]]]]

Abels takes this as evidence for punctuated paths; that is, the moving element only has intermediate stops in certain positions, in CP but not in TP. We are aware of two potential challenges to this argument. Gereon Müller makes the following observation concerning the two crucial examples: While in (11-a) only a single phrase, the wh-phrase, is moving, there are two moving phrases in (11-b). In the latter, the wh-phrase and the raising subject move along overlapping paths. This raises the possibility that there is an intermediate landing site both for the wh-phrase and the subject above the embedded SpecTP position but below the experiencer, as schematized in (14).

(14) [Which picture of himself]i Mary [ VP1 seem [ VP2 to John [ tMary [ ti [ tseem [ TP tMary to [ VP3 ti like ti ]]]]

Notice that all traces of the wh-phrase in (14) that are c-commanded by the experiencer are also c-commanded by the subject and various traces thereof. Notice further that in every single case the subject or its trace are closer to (the trace of) the wh-phrase than the experiencer. Assuming a binding theory strictly in terms of closest c-command, the subject would always be the relevant binder of the anaphor in (14). Given that the subject does not raise in (11-a), there is


an intermediate position where the experiencer is the closest potential binder for the anaphor. Hence, Müller argues, the contrast between (11-a) and (11-b) does not provide evidence for the punctuated nature of movement paths. This objection, of course, is only as strong as the binding theoretic assumptions that it rests upon, namely, that anaphors in English can only be bound by the closest c-commanding antecedent. This assumption is problematic, as the examples in (15) illustrate. As is well known, the DP object in such examples c-commands into the PP, (15-a). However, and this undermines the strength of Müller’s objection, in example (15-b) the subject can antecede the anaphor despite the fact that it is not the closest potential c-commanding antecedent, which is the DP object as in (15-a).4

(15) a. Mary explained the man to himself.
b. Mary explained the man to herself.

4 There might be ways of rescuing the closest c-command theory of anaphor binding. Thus Lechner (2009), for unrelated reasons, posits an intermediate structure where the subject locally c-commands the second object in a double-object structure. If Lechner’s theory is correct and if binding could be read off this structure, then the closest c-command approach to anaphor binding might be workable for English after all. On the other hand we might accept as fact that the closest c-command theory of anaphor binding is wrong but assume that binding domains are upward bounded by subjects and that intermediate traces of subjects count as subjects. Under this latter approach (suggested to us by Winnie Lechner) Müller’s objection would again stand. We will not pursue these issues here simply because we believe that there is a second, stronger objection to Abels’ argument, to which we now turn.

A second, more damaging problem for the argument is pointed out by Boeckx (2008). If the contrast between (11-a) and (11-b) was only due to the presence versus absence of a CP below John, then we would expect anaphor binding to be possible also into a more deeply embedded CP, i.e., we would expect a punctuated pattern of binding reconstruction. The expectation then is that all examples in (16) should be fine. However, (16-c) is ungrammatical. It seems that reconstruction of the moved wh-phrase to an intermediate landing site below the experiencer in a raising construction is blocked in general. In the terminology of the previous section, we are dealing with a monotonic reconstruction pattern. We argued that non-uniform monotonic patterns like this one (or the reconstruction of the nominal restriction into a weak island discussed above) require additional assumptions no matter what we assume about movement paths in general and cannot, therefore, provide prima facie evidence one way or another.

(16) a. Which picture of himselfi did Mary tell Johni that she liked?
b. Which picture of himselfi does it seem to Jane that Mary told Johni that she liked?
c. *Which picture of himselfi does Mary seem to Jane to have told Johni that she liked?

Thus, we agree with Boeckx that when the data in (16) are taken into account, the contrast between (11-a) and (11-b) does not constitute an argument for punctuated paths. In the next section, however, we present some new facts that we believe count as evidence for punctuated paths, namely reconstruction data from Norwegian (cf. Bentzen (2007)).

4. New empirical evidence: reconstruction in Norwegian

In this section we investigate the interaction of scope and variable binding to give us information about the absence of sites for intermediate reconstruction. The idea is the following. Suppose a moved quantifier can take either wide or narrow scope with respect to another scope bearing element. If the quantifier needs to take scope below the other scope bearing element and simultaneously bind into an even lower XP, this will only be possible if there is a possible site for reconstruction in between the two but not if there is no such reconstruction site between them. The situation is illustrated in Figures 3 and 4, where the trace between the scope-bearing element and XP in Figure 3 marks the availability of a reconstruction site while its absence in Figure 4 indicates the absence of such a site. Both figures are concrete instantiations of the abstract schema in the earlier Figure 1, where in Figure 3 there are intermediate copies/reconstruction sites everywhere while in Figure 4 such an intermediate copy is missing between the scope bearing element and XP. The expectations created by the two structures are quite different: given that scope reconstruction of QP is possible below the scope bearing element by assumption, the structure in Figure 3 gives rise to the expectation that narrow scope of QP should be able to go hand in hand with binding into XP; the structure in Figure 4 gives rise to the expectation that low scope of QP and binding into XP cannot happen simultaneously.

We now apply this logic to data from Norwegian, and as we will see, the observations we present provide support for the punctuated nature of movement paths. First consider example (17). There are two readings for this example, one in which the quantified DP some girls has surface wide-scope over the adverb probably, yielding the reading that some girls are probably such that they will come to the party. Alternatively, the quantified DP may be reconstructed into a position between probably and come (indicated by t in the gloss), yielding the reading that it is probable that some girls are such that they will come to the party.


[Figure 3: tree diagram. From top to bottom: QP1, the scope bearing element, an intermediate trace tQP, and XP containing pronoun1 and the base trace tQP.]
Figure 3. Low scope and high binding possible with intermediate trace

[Figure 4: tree diagram. From top to bottom: QP1, the scope bearing element, and XP containing pronoun1 and the base trace tQP; there is no intermediate trace.]
Figure 4. Low scope and high binding impossible without intermediate trace

(17) Noen jenter vil sannsynligvis komme på festen. (Norwegian)
     some girls will probably t come on party.the
     ‘Some girls will probably come to the party.’

In (17) it is not clear exactly where reconstruction position t is located; it could either be the DP’s base position (presumably in SpecvP), or some intermediate position. Thus, to probe whether or not an intermediate reconstruction site is indeed available, we need to construct a context in which reconstruction into the base position can be excluded semantically. (18) provides exactly such a context.


(18) ... at noen gutter sannsynligvis må ha dratt til Roma. (Norwegian)
     that some boys probably t′ must have t gone to Rome
     ‘... that some boys probably must have gone to Rome.’

(18) is three-way ambiguous. The quantified DP some boys may get a surface wide-scope reading: Some boys are such that they probably must have gone to Rome. A second reading is possible if the DP reconstructs into a position between the adverb probably and the modal must. This yields the reading that it is probable that some boys are such that they must have gone to Rome. In the third reading, then, the DP reconstructs below must, yielding the reading that it is probable that it must be the case that some boys have gone to Rome. Of course what we are interested in is the availability of the intermediate reading in (18), t′.5 The availability of the intermediate reading, or its absence, provides information about the availability (or lack thereof) of an intermediate site for scope-reconstruction of the moved subject. This is the first ingredient in the crucial example we are about to present.

The second ingredient involves variable binding. Since binding requires scope, we can force the subject to take scope at least as high as some other phrase, by forcing the subject to bind into that other phrase. The relevant phrase is the PP ‘på eget initiativ’ – on their own initiative in (19).

(19) Noen jenter vil sannsynligvis på eget initiativ komme på festen.
     some girls will probably *t′ on own initiative *t come on party.the
     ‘Some girls will probably on their own initiative come to the party.’

In (19) reconstruction into the DP’s base position is blocked for reasons of binding, indicated by the starred t, as the DP needs to bind the reflexive own inside the adverbial on their own initiative. If paths were uniform, it should still be possible to reconstruct the subject in a position in between ‘sannsynligvis’ and ‘på eget initiativ.’ However, it turns out that only a wide-scope reading of the DP is available in (19), suggesting that there is no reconstruction site for the subject in between ‘sannsynligvis’ and ‘på eget initiativ.’ The observation thus argues that there is no intermediate reconstruction site for the subject in the position of the starred t′, and, in conjunction with the observation that intermediate reconstruction is available in (18), this constitutes support for the assumption that paths are punctuated. Note that the observation in these two clauses cannot be accounted for by simply assuming monotonic (non-uniform) paths. The starred trace t is there, but cannot be used for reasons of binding, while the starred trace t′ must be assumed to be absent.

A possibly even more telling contrast is that between (18) and (20). The examples form a minimal pair; the only difference is the addition of ‘mot sin vilje’ – against his own will to the right of ‘sannsynligvis’ in (20).

(20) ... at noen gutter sannsynligvis *t′′ mot sin vilje *t′ må ha *t dratt til Roma.
     that some boys probably against REFL will must have gone to Rome
     ‘... that some boys probably must have gone to Rome against their will.’

Reconstruction of the subject to a position below the added PP is impossible, since this would leave the possessive anaphor unbound. This explains why reconstruction to t is impossible. Structurally, it seems that both trace t′′ and t′ in (20) could be in a position corresponding to t′ in (18). However, we know that reconstruction to t′ is impossible in (20) for binding reasons, a restriction not found in (18). If there were an intermediate landing site in the position of t′′, (20) should still be ambiguous, though, between a reading where the subject takes scope over ‘sannsynligvis’ and a reading where ‘sannsynligvis’ takes scope over the subject. However, the example is unambiguous: only the wide scope reading for the subject is available. This suggests, again, that there is no trace in the position of t′′, which in turn suggests – together with the three-way ambiguity of (18) – that paths are punctuated. There is no reconstruction site at t′′; there are reconstruction sites at t′ and t, but they are unavailable in (20) because of binding. Although reconstruction below the PP containing the possessive anaphor is unavailable, the fact that there is no reconstruction site in between ‘sannsynligvis’ and the PP makes the pattern overall punctuated.

In our view, these Norwegian constructions constitute the appropriate kind of test case for the availability of intermediate reconstruction sites, and we therefore believe that the data illustrated in this section provide real support for the claim that movement paths are punctuated rather than uniform.

5 A potential challenge to the argument we are developing here might come from the treatment of scope phenomena in the absence of syntactic scope by way of quantification over semantic objects of higher types (see Engdahl (1980; 1986); Chierchia (1993); Kratzer (1998); Sauerland (1998; 2004) and in particular the application of these ideas to scopal interactions between quantifiers and modals in Abels and Martí (2010)).
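The logic of the test can be made concrete with a small sketch. It is an illustration only, not part of the original argument: the encoding of (18)–(20) as lists of scope-bearing elements and phrases to be bound into, the gap numbering, and the names `available_readings`, `uniform_sites` and `PUNCTUATED` are all assumptions introduced here. The sketch enumerates the scope readings open to the moved subject under a uniform hypothesis (every gap hosts a copy) and under a punctuated hypothesis (only selected gaps do).

```python
def available_readings(items, sites):
    """Return the set of scope readings available to a moved subject.

    items: top-down list of (kind, label) pairs below the subject's surface
           position; kind is "op" (scope-bearing element) or "bindee"
           (a phrase the subject must bind into).
    sites: gap indices where a copy of the subject may be interpreted;
           gap 0 is the surface position, gap i sits just below items[i-1].
    A reading is the tuple of operators that outscope the subject.
    """
    found = set()
    for gap in sorted(sites):
        above = items[:gap]
        # Binding requires scope: the subject may not be interpreted below
        # a phrase that it has to bind into.
        if any(kind == "bindee" for kind, _ in above):
            continue
        found.add(tuple(label for kind, label in above if kind == "op"))
    return found


def uniform_sites(items):
    """Uniform paths: every gap along the path is a reconstruction site."""
    return set(range(len(items) + 1))


# Hypothetical encodings of (18), (19) and (20); labels are English glosses.
EX18 = [("op", "probably"), ("op", "must")]
EX19 = [("op", "probably"), ("bindee", "on own initiative")]
EX20 = [("op", "probably"), ("bindee", "against REFL will"), ("op", "must")]

# Punctuated paths (assumed from the discussion): no copy directly below
# 'probably' in (19) and (20), but one below 'probably' in (18).
PUNCTUATED = {"18": {0, 1, 2}, "19": {0, 2}, "20": {0, 2, 3}}

for name, items in (("18", EX18), ("19", EX19), ("20", EX20)):
    uniform = available_readings(items, uniform_sites(items))
    punctuated = available_readings(items, PUNCTUATED[name])
    print(name, len(uniform), len(punctuated))
```

Under these assumptions both hypotheses yield three readings for (18), but only the punctuated one restricts (19) and (20) to the surface wide-scope reading; the uniform hypothesis would additionally allow the subject to scope below ‘sannsynligvis’ in both.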

5. Other cases

5.1. Scope: the best tool we have

Scope, as in the Norwegian facts just discussed, is the best tool we have for probing the punctuated versus uniform nature of movement paths. When two scope-bearing elements, whose relative scope we can independently determine, lie along a movement path, it is in principle a simple task to find out whether the moving element can scope below, above, and/or in between them. Arbitrarily fine spatial distinctions can in principle be made this way. Nissenbaum (2001) discusses a case of roughly this shape in his thesis, though it may be debatable whether ‘scope’ is exactly the right notion. He observes, following an observation originally due to Larson, that in a situation with several VP adjuncts, if one of them contains a parasitic gap, then all the ones closer to the VP must obligatorily also contain a parasitic gap.6 This is schematized in (21).

(21) Nissenbaum (2001): V ([ . . . PG ]) ([ . . . PG]) [ . . . no PG] (*[ . . . PG ]) ([. . . no PG ])

The examples in (22) illustrate this generalization. If both adjuncts contain a parasitic gap, as in (22-a), no problem arises. If neither of them does, as in (22-b), still no problem arises. However, when only one of the adjuncts contains a parasitic gap it must be the one closer to the verb, as the contrast between (22-c) and (22-d) illustrates.

(22) Examples from Nissenbaum (2001, 82-83)
     a. Who did you [VP [VP [VP [VP praise who to the sky VP ] [after criticizing PG] VP ] ] [in order to surprise PG] VP ] who VP ] ?
     b. Who did you [VP [VP [VP [VP praise who to the sky VP ] who VP ] [after criticizing him] VP ] [in order to surprise the poor man] VP ]?
     c. Who did you [VP [VP [VP [VP praise who to the sky VP ] [after criticizing PG] VP ] who VP ] [in order to surprise him] VP ]?
     d. *Who did you [VP [VP [VP praise who to the sky VP ] [after criticizing him] VP ] [in order to surprise PG] VP ]?

Nissenbaum accounts for these facts by assuming that in successive cyclic movement an intermediate copy, interpreted as a variable of type <e>, and a λ-binder, which gives rise to an abstract of type <e,t>, are created. Clauses without parasitic gaps are of type <t>, while those containing a parasitic gap are of type <e,t>. This allows a straightforward explanation of Larson’s generalization in terms of a type mismatch, as shown in Figure 5. If a clause with a parasitic gap adjoins too high, a type-mismatch occurs, leading to a failure of interpretation higher up in the tree. The same is true if a clause without a parasitic gap adjoins too low. Adjunction sites that are even lower than the λ-binder will have to be ruled out, presumably by syntactic stipulation.7

6 The observation as relating to Heavy NP Shift is apparently due to Larson (1988).
7 To complete the argument, multiclausal structures have to be considered. To the extent that Nissenbaum discusses them, they indicate that Larson’s generalization is neither linear nor monotonic but holds separately of each VP along the path of movement, i.e., the pattern is punctuated.


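[Figure 5: tree diagram. Above the λ-binder, which dominates X and YP (containing trace1), the adjuncts AP and BP, each containing a PG, adjoin to XP; the intermediate trace (trace2) merges above them; the adjunct CP, also containing a PG, adjoins above trace2, and the resulting node is marked XP! for the type mismatch.]
Figure 5. Type-annotated tree illustrating Nissenbaum’s account of (21)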
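The type-mismatch reasoning can be checked mechanically. The following is a minimal sketch under simplifying assumptions that are not Nissenbaum's own formulation: the node above the λ-binder has type <e,t>, the intermediate copy saturates it to yield type <t>, an adjunct with a parasitic gap has type <e,t>, an adjunct without one has type <t>, and an adjunct composes only with a sister of its own type. The function names `composes` and `interpretable` are hypothetical.

```python
def composes(adjuncts, trace_pos):
    """Check one derivation for type mismatches.

    adjuncts:  list of booleans ordered from closest to the verb outwards;
               True means the adjunct contains a parasitic gap.
    trace_pos: number of adjuncts merged below the intermediate copy.
    """
    node = "<e,t>"  # assumed type of the node just above the lambda-binder
    for i, has_pg in enumerate(adjuncts):
        if i == trace_pos:
            node = "<t>"  # the intermediate copy (type <e>) saturates the abstract
        adjunct_type = "<e,t>" if has_pg else "<t>"
        if adjunct_type != node:
            return False  # type mismatch: this derivation is uninterpretable
    return True


def interpretable(adjuncts):
    """True if some position of the intermediate copy avoids a mismatch."""
    return any(composes(adjuncts, k) for k in range(len(adjuncts) + 1))


# The four configurations of (22), inner adjunct first.
print(interpretable([True, True]))    # (22-a)
print(interpretable([False, False]))  # (22-b)
print(interpretable([True, False]))   # (22-c)
print(interpretable([False, True]))   # (22-d)
```

On these assumptions only the configuration of (22-d) comes out as uninterpretable: whichever position the intermediate copy occupies, either the inner adjunct without a gap or the outer adjunct with a gap meets a sister of the wrong type. More generally, the admissible sequences are exactly those in which all adjuncts with a parasitic gap precede all adjuncts without one, which is the shape of (21).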

To the extent that it is empirically and theoretically sound, Nissenbaum’s analysis provides an argument for the punctuated nature of paths. His account of Larson’s generalization relies crucially on a distinction being made between nodes between the intermediate copy/trace of movement and the λ-binder and those above the intermediate copy/trace of movement. A uniform theory of paths, where all nodes are treated the same way – Nissenbaum himself mentions the slashed categories of HPSG – has no way of making such a distinction. Hence, not every node along the path can be treated as identically affected by the movement, and we have a prima facie argument for punctuated paths.

5.2. Condition C and scope for binding

Certain interactions between scope for binding and condition C of the binding theory are potentially informative regarding the punctuated nature of paths. Recall examples (6), repeated as (23). In the good example, (23-a), the quantifier that binds into the moved phrase c-commands the pronoun that potentially interacts with the R-expression in the moved phrase via condition C. In the bad example, (23-b), the c-command relations are reversed. The pronoun c-commands the quantifier: hence, if the quantifier is to bind into the moved phrase, so will the pronoun.

(23) a. [The papers that he1 wrote for Ms. Brown2 ] every student1 [VP t’ asked her2 to grade t]
     b. *[The papers that he1 wrote for Ms. Brown2 ] she2 [t’ asked every student1 to revise t]
     (Lebeaux (1990), see also Fox (2000, 10-11))

Now, if an example just like the acceptable example, (23-a), could be found that was unacceptable so long as the quantifier and the pronoun were structurally very close, but which improved once the structural distance between them was increased, this would be an argument for the punctuated nature of paths. In the hypothetical case, represented in Figure 6, there is no possible intermediate node from which the moving element could take scope below the quantifier – which it needs to allow binding into XP – and above “her2” – which it needs to avoid a violation of Binding Condition C. This situation would then be remedied if the quantifier and the pronoun are structurally separated, as in Figure 7. This would make available an intermediate reconstruction site. Whether such cases exist needs to be investigated.

[Figure 6: tree diagram, marked * as unacceptable. The moved XP (. . . he1 . . . Ms. Brown2) sits above QP1 (every student), which is immediately followed by her2 and then by the base copy of XP; there is no intermediate copy between QP1 and her2.]
Figure 6. A hypothetical unacceptable variant of (23-a)

[Figure 7: tree diagram. As in Figure 6, but with an intermediate copy of XP (. . . he1 . . . Ms. Brown2) between QP1 (every student) and her2.]
Figure 7. Hypothetical repair of the example in Figure 6

6. Speculations on the location of reconstruction sites

The section on Norwegian demonstrated what we consider evidence for the punctuated nature of movement paths. In this section we will briefly speculate on the question of why certain intermediate positions are potential reconstruction sites, while others are not. Consider again (18) and (19). The intermediate position t′ appears to be the same in the two examples. Still, as pointed out, reconstruction to this position is only possible in (18), and not in (19). Why would this be?

Within a phase-based framework, one suggestion is that reconstruction sites are related to phase edges. According to Chomsky (2000), vP and CP are phases, and their edges have been argued to display various special properties. One such edge property is the function of being an escape hatch for moving elements from one phase into another. Movement is consequently perceived as proceeding cyclically through phase edges. Furthermore, it has also been demonstrated that reconstruction appears to be possible at precisely these escape hatch positions. Thus, the vP edge and the CP edge have been claimed to be the sites available for reconstruction.

An argument for reconstruction at the vP edge is provided in Fox (2000). As mentioned in section 1, Fox has shown that there must be a reconstruction site to the left of VP. The relevant context, example (6-a) in section 1, is repeated here as (24) for convenience. As discussed above, the topicalized noun phrase needs to reconstruct below the subject ‘every student’ in order for the pronoun ‘he’ to be bound by the subject. Fox locates the particular reconstruction site in a position adjoined to VP, but this can be reinterpreted as the edge of vP.

(24) [The papers that he1 wrote for Ms. Brown2 ] every student1 [VP t’ asked her2 to grade t]

Furthermore, Abels (2003) and Svenonius (2004) demonstrate that reconstruction is also available at the edge of CP. We saw an example of this in (12), here repeated as (25). Here, the wh-phrase is assumed to reconstruct at the edge of the lower CP, allowing ‘John’ to bind the anaphor ‘himself’.

(25) [ Which picture of himself]i it [ VP1 seem [ VP2 to John tseem [ CP ti that [ TP Mary [ VP3 ti liked ti ]]]]]

Now let us consider the Norwegian examples again, in light of pinpointing the location of reconstruction sites to phase edges. We will focus on examples (18) and (19). On the assumption that reconstruction is (only) available at the edge of a phase, we expect the possibility of a surface wide scope reading in both these examples. In (18) we also expect the narrow scope reading as a result of reconstruction to the vP phase edge. Furthermore, it is predicted that the narrow scope reading in (19) is unavailable because the vP-external phrase on their own initiative needs to be bound by some girls, thus blocking reconstruction to the vP phase edge. However, for (18), we pointed out that an intermediate reading is in fact also available, while in (19) it is not. Within the “reconstruction at phase edges” approach, the lack of intermediate reconstruction is expected in (19). The intermediate position t′ does not correspond to a phase edge in either (18) or (19), according to the definition of a phase in Chomsky (2000). The question then is what makes reconstruction to t′ available in (18).

Based on verb movement facts in non-V2 contexts in Northern Norwegian, Bentzen (2007) suggests that not just vP and CP constitute phases, but that in fact all finite verbs induce a new phase in their surface position. If this suggestion is on the right track, we might have an explanation for the difference in intermediate reconstruction between (18) and (19). Assuming that reconstruction only occurs at phase edges and that finite verbs induce phases, we expect reconstruction to be available at the vP (and CP) edge, and also at the left edge of the finite verb. And in (18) this is exactly what we see; the intermediate reconstruction site can be associated with the specifier of the finite modal ‘måtte’. However, in (19), there is no finite verb at the position of t′. The finite verb ‘vil’ surfaces in the V2 position, so there is no phase edge in the intermediate position. Thus, this phase-based approach correctly predicts which intermediate positions are possible reconstruction sites in Norwegian. One might of course question whether the phase is the correct concept here. However, we note that the available reconstruction sites we have identified in Norwegian correspond to the structural positions that have independently been argued to constitute phase edges.
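The prediction can be spelled out in a short sketch. It is illustrative only: the flat top-down encoding of the clausal spine, the gap numbering (gap 0 is the surface position of the subject, gap i sits immediately above the i-th element), and the function name `phase_edge_sites` are assumptions made here, and the treatment of the finite verb as a phase head in its surface position follows the suggestion attributed to Bentzen (2007) above.

```python
def phase_edge_sites(spine):
    """Return the gap indices predicted to host a reconstruction site.

    spine: top-down list of (label, is_phase_head) pairs for the material
           below the surface position of the moved subject.
    Gap 0 is the subject's surface position; gap i (i > 0) is the position
    immediately above spine[i], i.e. the left edge of that element.
    """
    sites = {0}  # the surface position is always interpretable
    for i, (_, is_phase_head) in enumerate(spine):
        if is_phase_head:
            sites.add(i)  # the left edge of a phase head hosts a copy
    return sites


# (18): embedded non-V2 clause; the finite modal stays below the adverb.
EX18 = [("sannsynligvis", False), ("må", True), ("ha", False), ("vP", True)]

# (19): V2 main clause; the finite verb 'vil' sits directly below the subject.
EX19 = [("vil", True), ("sannsynligvis", False),
        ("på eget initiativ", False), ("vP", True)]

print(sorted(phase_edge_sites(EX18)))  # gap 1 lies between the adverb and the modal
print(sorted(phase_edge_sites(EX19)))  # no gap between the adverb and the PP
```

With these encodings, (18) receives a site between ‘sannsynligvis’ and the finite modal in addition to the vP edge, whereas in (19) the edge of the finite verb coincides with the surface position of the subject, so that no site is predicted between ‘sannsynligvis’ and ‘på eget initiativ’.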


7. Concluding remarks

In this paper we have sketched the logic that arguments for punctuated and against uniform movement paths should follow. In particular, punctuated reconstruction patterns can lend support to the position that movement paths are punctuated, a position that has long formed part of the orthodoxy of Chomskyan syntactic theory without being backed by truly decisive arguments. We concede that the evidence for punctuated paths originally proposed in Abels (2003) turns out not to stand up to scrutiny, but the case for punctuated paths can still be made. We illustrated this using the interaction of scope reconstruction and binding in Norwegian. The Norwegian examples suggest that a moving element only makes pit-stops in selected positions along the movement path, positions we might want to call phase edges.

To complete the argument, one would have to show that different properties cluster in their reconstructive behavior: if the positions involved in morphosyntactic changes under movement were limited in a cross-linguistic perspective, if they coincided with the positions crucially involved in locality theory, and if those same nodes were the only possible reconstruction sites, then this would constitute very strong evidence for the punctuated nature of paths. For the moment our knowledge, especially that of lacking reconstruction sites, is too limited to warrant such conclusions. Since the Norwegian facts discussed here are fairly subtle and subject to a certain amount of variation, we conclude by noting that the true issues raised by the punctuated paths hypothesis have barely been probed and that the paucity of compelling evidence in favor of the punctuated paths position remains a challenge to those who wish to uphold it.

Bibliography

Abels, Klaus (2003): Successive Cyclicity, Anti-locality, and Adposition Stranding. PhD thesis, University of Connecticut, Storrs, Connecticut.
Abels, Klaus and Luisa Martí (2010): ‘A Unified Approach to Split Scope’, Natural Language Semantics 18, 435–470.
Barss, Andrew (1986): Chains and Anaphoric Dependence: On Reconstruction and its Implications. PhD thesis, MIT.
Bentzen, Kristine (2007): Order and Structure in Embedded Clauses in Northern Norwegian. PhD thesis, University of Tromsø, Norway.
Boeckx, Cedric (2001): Mechanisms of Chain Formation. PhD thesis, University of Connecticut, Storrs, Connecticut.
Boeckx, Cedric (2008): Understanding Minimalist Syntax: Lessons from Locality in Long-Distance Dependencies, vol. 9 of Generative Syntax. Blackwell Publishing.
Bošković, Željko (2007): ‘On the Locality and Motivation of Move and Agree: An Even More Minimal Theory’, Linguistic Inquiry 38, 589–644.
Bouma, Gosse, Robert Malouf, and Ivan A. Sag (2001): ‘Satisfying Constraints on Extraction and Adjunction’, Natural Language and Linguistic Theory 19, 1–65.
Chierchia, Gennaro (1993): ‘Questions with Quantifiers’, Natural Language Semantics 1, 181–234.
Chomsky, Noam (1986): Barriers. MIT Press, Cambridge, Mass.
Chomsky, Noam (2000): Minimalist Inquiries: The Framework. In: R. Martin, D. Michaels and J. Uriagereka, eds, Step by Step: Minimalist Essays in Honor of Howard Lasnik. MIT Press, Cambridge, Mass., pp. 89–155.
Chomsky, Noam and Howard Lasnik (1993): The Theory of Principles and Parameters. In: J. Jacobs, A. von Stechow, W. Sternefeld and Th. Vennemann, eds, Syntax: An International Handbook of Contemporary Research, vol. 1. Walter de Gruyter, Berlin, pp. 506–569.
Cinque, Guglielmo (1990): Types of A’-Dependencies. MIT Press, Cambridge, Mass.
Cresti, Diana (1995): ‘Extraction and Reconstruction’, Natural Language Semantics 3, 79–122.
Engdahl, Elisabet (1980): The Syntax and Semantics of Questions in Swedish. PhD thesis, University of Massachusetts, Amherst, Mass.
Engdahl, Elisabet (1986): Constituent Questions – The Syntax and Semantics of Questions with Special Reference to Swedish. Reidel, Dordrecht.
Fox, Danny (2000): Economy and Semantic Representation. MIT Press and MITWPL, Cambridge, Mass.
Frampton, John (1999): ‘The Fine Structure of Wh-Movement and the Proper Formulation of the ECP’, The Linguistic Review 16, 43–61.
Giorgi, Alessandra and Giuseppe Longobardi (1991): The Syntax of Noun Phrases. Cambridge University Press, Cambridge.
Kratzer, Angelika (1998): Scope or Pseudoscope? In: S. D. Rothstein, ed., Events in Grammar. Kluwer, Dordrecht, pp. 163–197.
Larson, Richard K. (1988): Light Predicate Raising. Technical Report, Center for Cognitive Science, MIT, Cambridge, Mass.
Lebeaux, David (1990): Relative Clauses, Licensing, and the Nature of the Derivation. In: R.-M. Déchaine, B. Philip and T. Sherer, eds, Proceedings of NELS 20. GLSA, University of Massachusetts, Amherst, pp. 318–332.
Lechner, Winfried (2009): Evidence for Survive from Covert Movement. In: M. T. Putnam, ed., The Survive Principle in a Crash Proof Syntax. John Benjamins, Amsterdam, pp. 231–256.
Longobardi, Giuseppe (1991): Extraction from NP and the Proper Notion of Head Government. In: Giorgi and Longobardi (1991), chap. 2, pp. 57–112.
McCloskey, James (1979): Transformational Syntax and Model Theoretic Semantics: A Case Study in Modern Irish. Synthese Language Library. D. Reidel Publishing Company, Dordrecht, Boston, and London.
McCloskey, James (1990): Clause Structure, Ellipsis and Proper Government in Irish. Syntax Research Center, Cowell College, University of California at Santa Cruz.
McCloskey, James (1990): Resumptive Pronouns, A’-Binding, and Levels of Representation in Irish. In: R. Hendrick, ed., The Syntax of the Modern Celtic Languages. Syntax and Semantics 23. Academic Press, San Diego, pp. 199–248.
McCloskey, James (2002): Resumption, Successive Cyclicity, and the Locality of Operations. In: S. D. Epstein and T. D. Seely, eds, Derivation and Explanation in the Minimalist Program. Blackwell, Malden, MA and Oxford, UK, pp. 184–226.
Nissenbaum, John (2001): Investigations of Covert Phrasal Movement. PhD thesis, MIT.
Noonan, Maire (1997): ‘Functional Architecture and Wh-Movement: Irish as a Case in Point’, Canadian Journal of Linguistics 42, 111–139.
Sauerland, Uli (1998): The Meaning of Chains. PhD thesis, MIT, Cambridge, Mass.
Sauerland, Uli (2004): ‘The Interpretation of Traces’, Natural Language Semantics 12, 63–127.
Starke, Michal (2001): Move Dissolves into Merge: a Theory of Locality. PhD thesis, University of Geneva.
Stroik, Thomas (1999): ‘The Survive Principle’, Linguistic Analysis 29, 282–309.
Svenonius, Peter (2004): On the Edge. In: D. Adger, C. de Cat and G. Tsoulas, eds, Peripheries: Syntactic Edges and their Effects. Kluwer, Dordrecht, pp. 261–287.
Takahashi, Daiko (1994): Minimality of Movement. PhD thesis, University of Connecticut.
Torrego, Esther (1983): ‘More Effects of Successive Cyclic Movement’, Linguistic Inquiry 14, 561–565.
Torrego, Esther (1984): ‘On Inversion in Spanish and Some of Its Effects’, Linguistic Inquiry 15, 103–129.
Uribe-Echevarria, María (1992): On the Structural Positions of Subjects in Spanish, and their Consequences for Quantification. In: J. A. Lakarra and J. Ortiz de Urbina, eds, Syntactic Theory and Basque Syntax. ASJU, San Sebastian, pp. 447–493.

(Abels) Linguistics, University College London
(Bentzen) Department of Linguistics and CASTL, University of Tromsø

Chris Worth

A Hypothetical Proof Account of Chamorro Wh-Agreement*

Abstract

Chamorro is an Austronesian language spoken primarily in Guam, which is generally taken to have VSO word order. It displays an interesting pattern of agreement in unbounded dependency constructions, whereby the verb agrees via infixation and/or suffixation with one of its dependents, be it subject, object, or oblique, from which an element has been extracted. Convergent Grammar (CVG) is a relational, multi-modal, type-theoretic, resource sensitive grammatical framework which “can be seen as a coming together of ideas of widely varying provenances, be they transformational, phrase-structural, or categorial.” (Pollard (2007)) The question of how a verbal head can agree with a dependent out of which an element has been extracted can be accounted for in this framework using a combination of lexical specification and rules of natural deduction, in particular the notion of hypothetical proof. Embedded constructions are of particular interest, as each verb’s agreement morphology varies with the corresponding variance in the grammatical role of the dependent from which extraction has occurred.

1. Introduction

Chamorro is an Austronesian language, spoken mostly in Guam, which is generally taken to have primarily VSO word order. It displays an interesting pattern of agreement in certain unbounded dependency constructions, whereby “the verb . . . agrees in grammatical function with the gap” (Chung and Georgopoulos (1984)), be it subject, object, or oblique. Chung (1998) revises this to agreement with a trace, but both formulations are slightly incorrect, as agreement is based not on the verb’s relation to the gap (or trace), but on the verb’s relation to whichever of its dependents (syntactic argument or adjunct) contains a gap. In some cases the gap may be the entire dependent itself. The question of how a verbal head can agree with an element from which extraction has taken place can be accounted for in a framework using a combination of lexical specification and rules of natural deduction (ND), in particular the notion of hypothetical proof.

* The ideas presented in this paper would no doubt be poorer were it not for the gracious help of Carl Pollard, David Dowty, Bob Levine, Judith Tonhauser, Scott Martin, Andy Plummer, Vedrana Mihalicek, and Lia Mansfield. Special thanks are due to several anonymous reviewers, the participants of the LMNLDS workshop at DGfS 30, and to Sandra Chung, whose timely email helped clarify a puzzling piece of data. Anything misleading or untrue is undoubtedly the responsibility of the author.

Local Modelling of Non-Local Dependencies in Syntax, 453-476
Artemis Alexiadou, Tibor Kiss & Gereon Müller (eds.)
Linguistische Arbeiten 547, de Gruyter 2012


This type of analysis has the benefit of accounting for the data in a way which is based on generally accepted logical principles, without resorting to explicitly movement-based strategies. Convergent Grammar (CVG) provides a natural way to show that phenomena like wh-agreement can be given a straightforward and logically precise analysis in a categorial framework with rules of hypothetical proof.

In the second section, I lay out relevant data exemplifying the wh-agreement phenomenon, which is then summarized in section three – “The wh-agreement paradigm”. Section four explores the basic issues presented by the phenomenon. The fifth section is an introduction to the basics of CVG syntax and semantics, with examples from English and Chamorro. The sixth section outlines the ideas underlying my analysis, and section seven discusses the importance of hypothetical proof for this account. In section eight, I discuss the CVG perspective on linear word order, and define the operation ‘right wrap’. Section nine consists of a CVG sample derivation for the most complicated example from section two. Sections ten and eleven summarize the analysis, and lay out questions for future research.

2. Examples

Morphology relevant to the wh-agreement phenomenon appears in boldface in the following examples.

2.1. Canonical declaratives

(1) a. Ha-fa’gasi si Juan i kareta.
       AGR-wash UNM Juan the car
       ‘Juan washed the car.’ (Chung (1998, (ch.6)52))
    b. Ha-ottu i petta i patas-su.
       AGR-bang the door the foot-AGR
       ‘The door banged my foot.’ (Chung (1998, (ch.2)31-a))
    c. Mang-ákati i famagu’un.
       AGR-cry.PROG the children
       ‘The children are crying.’ (Chung (1998, (ch.2)23-a))

In (1-a), we see an ordinary transitive verb construction. The verb is sentence initial, with the third singular realis subject-agreement prefix ha-. It is followed by the subject, Juan, and the object, i kareta. The reader will note the case marker si, used to indicate that a proper name is in the unmarked case. Unmarked case, perhaps obviously, typically has no explicit marking. This is the case with the common nouns in (1-b), where the transitive verb ha-ottu is followed by its unmarked subject i petta and unmarked object i patas-su. Example (1-c) shows an intransitive construction, with an unmarked subject.

2.2. Subject extraction

(2) a. Hayi fuma’gasi __ i kareta?
       who? WH[NOM].wash the car
       ‘Who washed the car?’ (Chung (1998, (ch.6)53-a))
    b. Hayi um-ayuda hao __ ?
       who? WH[NOM]-help you
       ‘Who helped you?’ (Chung (1998, (ch.5)41-c))

Both (2-a) and (2-b) illustrate similar constructions. In (2-a), the fronted wh-word is followed by a wh-agreeing verb fuma’gasi, which is in turn followed by an unmarked common noun phrase. In (2-b), we have a wh-word followed by a wh-agreeing verb and an overt pronoun, indicating that the choice of noun phrase type is unimportant to the wh-agreement phenomenon.

2.3. Object extraction

(3) a. Hafa fina’tinas-ñiha i famalao’an __ ?
       what? WH[OBJ].make-AGR the women
       ‘What did the women cook?’ (Chung (1998, (ch.5)39-a))
    b. Hafa s-in-angane-nña si Maria nu hagu __ ?
       what? WH[OBJ2].say.to-AGR UNM Maria OBL you
       ‘What did Maria tell you?’ (Chung (1998, (ch.6)73-a))
    c. Hafa kinannóno’-mu __ ?
       what? WH[OBJ].eat.PROG-AGR
       ‘What are you eating?’ (Chung (1998, (ch.6)58-a))

In (3-a), we see a constituent question where an object extraction has taken place. The wh-word is followed by the wh-agreeing verb, which exhibits the -in- infix, and the suffix -ñiha, a nominalization suffix, which appears in a form agreeing with the subject i famalao’an. A slightly more complicated example is presented in (3-b). Here, the verb shows the same essential morphology as in (3-a), albeit with a differently agreeing nominalization suffix. The subject, Maria, is in unmarked case, and the first object hagu is marked by the oblique case marker for pronouns, nu. As previously noted, first objects typically appear in unmarked case. This anomalous case marking is required by the nominalization of the verb, which has the effect of forcing all of the verb’s complements to be marked as oblique (Chung, p.c.). In (3-c), we see a wh-agreeing verb with no overt subject. Chamorro allows for ‘understood’ pronominal subjects; in this case, we can tell that the subject corresponds to second person singular by the form of the nominalization marker -mu.

2.4. Adjunct extraction

(4) a. Hafa pära fa’gase-mmu ni kareta __ ?
       what? FUT WH[OBL].wash-AGR OBL car
       ‘What are you going to wash the car with?’ (Chung (1998, (ch.6)53-c))
    b. Hayi ma’a’ñao-mu __ ?
       who? WH[OBL].afraid-AGR
       ‘Who are you afraid of?’ (Chung (1998, (ch.5)5-a))

In (4-a), we see a constituent question based on an extracted instrument. The wh-word is followed by a future marker, and the wh-agreeing verb fa’gase-mmu. It is important to note that, as with (3-c), the subject is understood. The verb does not appear with the infix -in-, which is mandatory for transitive verbs with object extractions. It is only this fact that indicates that kareta is the first object, and that the extracted dependent is something else, in this case, an instrument. We consider an intransitive verb in (4-b), which appears with an understood subject, and an extracted oblique.

2.5. Embedded constructions

(5) a. Hafa sinangan-ña si Juan pära godde-tta ni chiba __ ?
       what? WH[obj].say.his-AGR UNM Juan Fut WH[obl].tie.our-AGR OBL goat
       ‘What did Juan say that we should tie up the goat with?’ (Hukari and Levine (2006, ch.3, ex. 63))

In an embedded construction, we see differing agreement markings on the verbs. The wh-word is followed by a verb bearing wh-agreement for an extracted object. This verb is followed by its subject, Juan, and then by a clausal object. This clause is headed by a verb bearing oblique agreement morphology. The subject of the lower clause is understood to be first person dual, and it is followed by the first object chiba, which is marked oblique by the case marker ni. As with (4-a), the verb does not bear morphology indicative of object extraction (the infix -in-). So the extracted element bears an oblique relationship to the lower verb, namely that of an instrument. The wh-agreement on the higher verb, however, does appear with the marking characteristic of an object extraction, indicating that the extraction has taken place within its own object.


3. The wh-agreement paradigm

The wh-agreement paradigm is characterized through a combination of infixation and suffixation on the agreeing verbs. The agreement marking patterns according to the relationship the verb has to whichever of its dependents contains the gap from which extraction has taken place. In all cases, the wh-agreement marking supersedes whatever agreement marking would be typical for the verb and its dependents. If the dependent dominating the gap is nominative (that is, corresponds to the subject of the verb), then the verb appears with the infix -um-, as illustrated by the examples in (2). If the dependent dominating the gap is a first (direct) object, then the verb appears with the infix -in-, and is optionally nominalized, typically by a number of markers corresponding to the agreement markers for possessor-noun agreement. This is illustrated by the examples in (3). If the dependent dominating the gap is oblique (e.g., an instrument, or second (indirect) object), then the verb is mandatorily nominalized, as shown in the examples in (4). Additionally, if the verb is unaccusative, it may occur with the optional infix -in-. In a declarative sentence, subjects and first objects appear in the unmarked case, while second objects and instruments appear in the oblique case. When verbs are nominalized, their complements appear in oblique case.
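The paradigm just described can be restated as a small lookup, given here only as a reading aid. The sketch below is in Python; the names `WhMarking` and `wh_marking` are hypothetical, and two points are assumptions rather than claims of the text: that subject extraction involves no nominalization (the text is silent on this), and that the optional -in- on unaccusatives belongs to the oblique row.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class WhMarking:
    infix: Optional[str]      # infix on the verb, if any
    infix_optional: bool      # True when the infix may be left out
    nominalization: str       # "none", "optional", or "mandatory"


def wh_marking(relation: str, unaccusative: bool = False) -> WhMarking:
    """Marking on a verb, keyed by its relation to the dependent containing the gap."""
    if relation == "subject":
        # Assumption: no nominalization, since the text mentions none.
        return WhMarking("-um-", False, "none")
    if relation == "object":
        return WhMarking("-in-", False, "optional")
    if relation == "oblique":
        if unaccusative:
            # Assumption: the optional -in- is read as part of the oblique row.
            return WhMarking("-in-", True, "mandatory")
        return WhMarking(None, False, "mandatory")
    raise ValueError(f"unknown relation: {relation!r}")


print(wh_marking("subject"))
print(wh_marking("oblique"))
```

The lookup is keyed on the verb's relation to its own dependent, not on the grammatical function of the gap itself, which is the point of the paradigm.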

4. The central issues

The marking of extraction on items along the extraction path has been well documented, notably by McCloskey’s work on Irish complementizer alternation. As noted by Hukari and Levine (2006), Irish complementizers present a parallel case to Chamorro wh-agreeing verbs; they exhibit the same patterns regardless of whether the extractees are arguments or adjuncts. These examples are similar to the phenomenon of wh-agreement in that certain constituents along the extraction path are morphosyntactically different than corresponding constituents in non-extraction contexts. The following examples from Irish illustrate this point:

(6) a. Deir siad gur ghoid na síogaí í
       say they C-[PAST] stole the fairies her
       ‘They say that the fairies stole her away.’ (McCloskey (2001, 1))
    b. an ghirseach a ghoid na síogaí __
       the girl aL stole the fairies
       ‘the girl that the fairies stole away’ (McCloskey (2001, 2))
    c. Cá fhad a bhí siad fá Bhaile Átha Cliath __ ?
       WH length aL be[PST] they around Dublin
       ‘How long were they in Dublin?’ (McCloskey (2001, 7))

In (6-a), a declarative sentence, we see the complementizer gur. In (6-b), a relative clause construction with an extracted object, the complementizer appears in the form of the particle glossed aL. The complementizer takes the same form in (6-c), a constituent question which contains an adjunct extraction. Chamorro presents a subtly different problem. While verbal agreement with various complements is certainly not a rare phenomenon, the issue of agreement with adjuncts is somewhat more problematic. If adjuncts are taken to adjoin at the VP level, then by what mechanism can the verbal head ‘see’ the adjunct? It is noteworthy that the Irish complementizers mark entire clauses, and as such, may be immediately sensitive to adjunct material. But this is not the case with Chamorro – how is it possible that a verb can be required to agree with material that does not appear to be accessible by the verb itself? An additional complication is presented by the fact that the agreement paradigm varies according to the grammatical relationships between each verb and its own particular dependents, rather than the relationships between the verb and the gap itself. Chamorro verbs are sensitive not only to the fact that there has been an extraction somewhere, but to which of their dependents it has occurred in.

5. An overview of Convergent Grammar

Convergent Grammar (CVG) is a grammatical framework whose basic setup should be familiar to anyone with a background in phrase structure or categorial grammars. CVG itself is essentially a categorial grammar; it is proof-theoretic and derivational in character. A full CVG proof term is a triple, consisting of a ‘phonological’ or ‘prosodic’ term, a syntactic term, and a semantic term (each of which is typed). Each proof term is asserted by the turnstile ($\vdash$) to be axiomatic, or to follow from natural deduction rule schemata. In CVG, the syntactic proof term is made explicit, making it slightly dissimilar to most other categorial grammars, where it is typically omitted. Grammatical features such as case, number, etc. are realized as syntactic types, with underspecified types (such as the NP type of proper names) treated as intersections of these more basic types. Dependencies are modeled with implication-type connectives of various modalities (‘flavors’), each of which represents different types of grammatical dependencies such as subject, complement, modifier, etc. These are roughly analogous to the ‘left’ and ‘right’ connectives in most categorial grammars, although CVG connectives exist in abstract syntax, and do not necessarily themselves completely specify linear word order. Instead, this is envisioned as part of the interface between the syntactic and phonological derivations, a discussion of which follows in the “Linearization and Wrapping” section.

Each syntactic mode of implication has its own natural deduction rules associated with it. Local dependencies are modeled with flavors of implication that have implication elimination (modus ponens), or ‘merge’ rules. Nonlocal dependencies are modeled with flavors of implication with introduction rules (‘move’), but no elimination rule (like Moortgat’s (1988) ↑). Nonlocal dependencies are discussed further in the section “The Role of Hypothetical Proof”.

5.1. English syntactic technicalia

Some basic CVG syntactic types and connectives for English:

Fin – a finite sentence
Nom – a nominative noun phrase
Acc – an accusative noun phrase
$\multimap_{\mathrm{SU}}$ – an implication operator for subjects
$\multimap_{\mathrm{C}}$ – an implication operator for complements

In the upcoming example, I intentionally obscure the distinction between the two types of NP, instead treating both as simple NPs. This is not intended to be any kind of theoretical claim, merely a way to more easily illustrate how a CVG derivation proceeds.

Some basic CVG lexical entries for English:

$\vdash \text{Sylvester} : \mathrm{NP}$
$\vdash \text{Tweety} : \mathrm{NP}$
$\vdash \text{chased} : \mathrm{NP} \multimap_{\mathrm{C}} (\mathrm{NP} \multimap_{\mathrm{SU}} \mathrm{Fin})$

Natural deduction rule schemata for English:

(7) Subject Merge (SU–Merge): If $\Gamma \vdash v : A \multimap_{\mathrm{SU}} B$, and $\Gamma' \vdash s : A$, then $\Gamma, \Gamma' \vdash ({}^{\mathrm{SU}}\ s\ v) : B$.
(8) Complement Merge (C–Merge): If $\Gamma \vdash v : A \multimap_{\mathrm{C}} B$, and $\Gamma' \vdash o : A$, then $\Gamma, \Gamma' \vdash (v\ o\ {}^{\mathrm{C}}) : B$.

A sample CVG syntactic derivation for English follows. For purposes of legibility, these tree-style proofs contain only the numbers which correspond to the line numbers of the line-by-line derivation below.

    1     2
    ---------- C–Merge
        3         4
        ------------- SU–Merge
              5

1. $\vdash \text{chased} : \mathrm{NP} \multimap_{\mathrm{C}} (\mathrm{NP} \multimap_{\mathrm{SU}} \mathrm{Fin})$   (Lexical)
2. $\vdash \text{Tweety} : \mathrm{NP}$   (Lexical)
3. $\vdash (\text{chased}\ \text{Tweety}\ {}^{\mathrm{C}}) : \mathrm{NP} \multimap_{\mathrm{SU}} \mathrm{Fin}$   (C–Merge)
4. $\vdash \text{Sylvester} : \mathrm{NP}$   (Lexical)
5. $\vdash ({}^{\mathrm{SU}}\ \text{Sylvester}\ (\text{chased}\ \text{Tweety}\ {}^{\mathrm{C}})) : \mathrm{Fin}$   (SU–Merge)

Lines 1 and 2 assert the lexical entries for chased and Tweety. Line 3 shows their combination by the merge rule for complements, yielding the type of a nearly-saturated VP, $\mathrm{NP} \multimap_{\mathrm{SU}} \mathrm{Fin}$. Line 4 asserts the lexical entry for Sylvester, and line 5 shows its combination with chased Tweety via the merge rule for subjects, to yield a proof of the finite sentence Sylvester chased Tweety.

5.2. English semantic technicalia

A compositional semantics may be given for this fragment of English in a fairly straightforward manner. Here, the lexical entries from the previous example are additionally given semantic terms and types in the calculus of Responsibility and Commitment (RC), a thorough introduction to which is given in Pollard (2008a;b). These should be familiar to those with a general background in Montague semantics and typed lambda calculi in general. It may be useful to the reader to think of the RC type ι as Montague’s type e of individuals, and the RC type π as Montague’s t, the type which will be interpreted as propositions in the model. The usual axioms of α- and η-conversion hold, as does β-reduction. The reader will note that the semantic ND (natural deduction) rules corresponding to Subject Merge and Complement Merge have been relabeled (→E), since both syntactic Merge rules correspond to implication elimination rules in the semantics. The syntax-semantics interface itself is envisioned as a number of ND rules specifying the ways in which derivations in both logics may proceed in parallel. The theorems are of the form $\vdash$ syn term, sem term : syn type, sem type.

Some basic CVG lexical entries (with semantics) for English:

$\vdash \text{Sylvester}, \mathsf{sylvester}' : \mathrm{NP}, \iota$
$\vdash \text{Tweety}, \mathsf{tweety}' : \mathrm{NP}, \iota$
$\vdash \text{chased}, \lambda y\, \lambda x\, \mathsf{chase}'(y)(x) : \mathrm{NP} \multimap_{\mathrm{C}} (\mathrm{NP} \multimap_{\mathrm{SU}} \mathrm{Fin}), \iota \to (\iota \to \pi)$

Natural deduction rule schemata for English (with semantics):

(9) Subject Merge (SU–Merge): If $\Gamma \vdash v, f : A \multimap_{\mathrm{SU}} B, C \to D$, and $\Gamma' \vdash s, a : A, C$, then $\Gamma, \Gamma' \vdash ({}^{\mathrm{SU}}\ s\ v), f(a) : B, D$.
(10) Complement Merge (C–Merge): If $\Gamma \vdash v, f : A \multimap_{\mathrm{C}} B, C \to D$, and $\Gamma' \vdash o, a : A, C$, then $\Gamma, \Gamma' \vdash (v\ o\ {}^{\mathrm{C}}), f(a) : B, D$.

In the following example, only the semantic derivation will be given, as the syntactic derivation was given above. We use $\rightsquigarrow$ to indicate the normalized term which results from a series of β-reductions.

    1     2
    ---------- →E
        3         4
        ------------- →E
              5

1. $\vdash \lambda y\, \lambda x\, \mathsf{chase}'(y)(x) : \iota \to (\iota \to \pi)$   (Lexical)
2. $\vdash \mathsf{tweety}' : \iota$   (Lexical)
3. $\vdash \lambda y\, \lambda x\, \mathsf{chase}'(y)(x)[\mathsf{tweety}'] \rightsquigarrow \lambda x\, \mathsf{chase}'(\mathsf{tweety}')(x) : \iota \to \pi$   (→E) (β)
4. $\vdash \mathsf{sylvester}' : \iota$   (Lexical)
5. $\vdash \lambda x\, \mathsf{chase}'(\mathsf{tweety}')(x)[\mathsf{sylvester}'] \rightsquigarrow \mathsf{chase}'(\mathsf{tweety}')(\mathsf{sylvester}') : \pi$   (→E) (β)
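The parallel operation of the Merge rules in syntax and semantics can be made concrete with a small executable sketch. What follows is not part of CVG itself: it is a minimal Python model under simplifying assumptions, in which contexts and semantic types are omitted, syntactic terms are nested tuples, and semantic terms are Python functions. The names `Imp`, `Sign`, and `merge` are hypothetical.

```python
from dataclasses import dataclass
from typing import Any, Union


@dataclass(frozen=True)
class Imp:
    """A flavored implication: arg -o_flavor res."""
    arg: "Type"
    flavor: str
    res: "Type"


Type = Union[str, Imp]


@dataclass(frozen=True)
class Sign:
    syn: Any    # syntactic term
    sem: Any    # semantic term
    typ: Type   # syntactic type


def merge(functor: Sign, argument: Sign, flavor: str) -> Sign:
    """Implication elimination for one flavor, with semantics applied in step."""
    t = functor.typ
    if not isinstance(t, Imp) or t.flavor != flavor or t.arg != argument.typ:
        raise TypeError(f"cannot {flavor}-merge {functor.syn} with {argument.syn}")
    if flavor == "SU":
        syn = ("SU", argument.syn, functor.syn)   # (SU s v)
    else:
        syn = (functor.syn, argument.syn, flavor)  # (v o C)
    return Sign(syn, functor.sem(argument.sem), t.res)


chased = Sign("chased",
              lambda y: lambda x: ("chase", y, x),
              Imp("NP", "C", Imp("NP", "SU", "Fin")))
tweety = Sign("Tweety", "tweety", "NP")
sylvester = Sign("Sylvester", "sylvester", "NP")

vp = merge(chased, tweety, "C")        # line 3 of both derivations
s = merge(vp, sylvester, "SU")         # line 5 of both derivations
print(s.syn)  # ('SU', 'Sylvester', ('chased', 'Tweety', 'C'))
print(s.sem)  # ('chase', 'tweety', 'sylvester')
print(s.typ)  # Fin
```

Because the flavor is checked before combination, attempting to merge Tweety as a subject of the bare verb raises an error, mirroring the fact that the complement must be taken first.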

5.3. Chamorro technicalia

Some basic CVG types and connectives for Chamorro:

Unm – a noun phrase of ‘Unmarked’ case
Obl – a noun phrase of ‘Oblique’ case
Fin – a finite sentence
Det – a determiner
$\multimap_{\mathrm{C}}$ – an implication operator for complements
$\multimap_{\mathrm{SP}}$ – an implication operator for specifiers (used here for determiners)

Chamorro has a complicated morphological nominal case system, with three cases, ‘unmarked’, ‘oblique’, and ‘local’. Each of these is realized differently on common nouns, proper names, and pronouns. Local case is not discussed here. The choice to treat ‘unmarked’ as a case specification is motivated by several factors. In a typical finite sentence, subjects and first objects appear in unmarked case, with instruments and other oblique nominal material appearing in (explicitly marked) oblique case. With common nouns and pronouns, unmarked case is morphologically exactly that. However, unmarked case is realized on proper names through an overt case marker, si. The reader will note the absence of an implication connective for subjects. In the context of a VSO language, I consider the grammatical ‘subject’ to be simply the least oblique element of a series of the verb’s syntactic dependents. While Chamorro is not rigidly VSO with respect to subjects, I take the surface VSO order to be the normal state of affairs, and stipulate that an implication connective for subject modality could easily be defined, with different phonological linearization rules as needed.

Some basic CVG lexical entries for Chamorro:

$\vdash \text{petta} : \mathrm{Det} \multimap_{\mathrm{SP}} \mathrm{Unm}$
$\vdash \text{patas-su} : \mathrm{Det} \multimap_{\mathrm{SP}} \mathrm{Unm}$
$\vdash \text{i} : \mathrm{Det}$
$\vdash \text{ha-ottu} : \mathrm{Unm} \multimap_{\mathrm{C}} (\mathrm{Unm} \multimap_{\mathrm{C}} \mathrm{Fin})$

Natural deduction rule schemata for Chamorro:

(11) Specifier Merge (SP–Merge): If $\Gamma \vdash n : A \multimap_{\mathrm{SP}} B$, and $\Gamma' \vdash d : A$, then $\Gamma, \Gamma' \vdash ({}^{\mathrm{SP}}\ d\ n) : B$.
(12) Complement Merge (C–Merge): If $\Gamma \vdash v : A \multimap_{\mathrm{C}} B$, and $\Gamma' \vdash o : A$, then $\Gamma, \Gamma' \vdash (v\ o\ {}^{\mathrm{C}}) : B$.

Since the semantics of Chamorro noun phrases themselves are not the primary topic of discussion in this paper, I hope that the reader will grant me the latitude to treat them as semantically primitive, in order to better illustrate the analysis of the wh-agreement phenomenon. To that end, I propose the following with respect to noun phrases:

$\vdash ({}^{\mathrm{SP}}\ \text{i}\ \text{petta}), \mathsf{door}' : \mathrm{Unm}, \iota$
$\vdash ({}^{\mathrm{SP}}\ \text{i}\ \text{patas-su}), \mathsf{foot}' : \mathrm{Unm}, \iota$

As before, syntactic and semantic derivations are written separately for purposes of legibility.

5.4. A sample Chamorro CVG derivation

A syntactic derivation of (1-b) Ha-ottu i petta i patas-su – ‘the door banged my foot’.

           1     2
           --------- SP–Merge
    4          3              6     7
    -------------- C–Merge    --------- SP–Merge
          5                       8
          -------------------------- C–Merge
                      9

1. $\vdash \text{patas-su} : \mathrm{Det} \multimap_{\mathrm{SP}} \mathrm{Unm}$   (Lexical)
2. $\vdash \text{i} : \mathrm{Det}$   (Lexical)
3. $\vdash ({}^{\mathrm{SP}}\ \text{i}\ \text{patas-su}) : \mathrm{Unm}$   (SP–Merge)
4. $\vdash \text{ha-ottu} : \mathrm{Unm} \multimap_{\mathrm{C}} (\mathrm{Unm} \multimap_{\mathrm{C}} \mathrm{Fin})$   (Lexical)
5. $\vdash (\text{ha-ottu}\ ({}^{\mathrm{SP}}\ \text{i}\ \text{patas-su})\ {}^{\mathrm{C}}) : \mathrm{Unm} \multimap_{\mathrm{C}} \mathrm{Fin}$   (C–Merge)
6. $\vdash \text{petta} : \mathrm{Det} \multimap_{\mathrm{SP}} \mathrm{Unm}$   (Lexical)
7. $\vdash \text{i} : \mathrm{Det}$   (Lexical)
8. $\vdash ({}^{\mathrm{SP}}\ \text{i}\ \text{petta}) : \mathrm{Unm}$   (SP–Merge)
9. $\vdash ((\text{ha-ottu}\ ({}^{\mathrm{SP}}\ \text{i}\ \text{patas-su})\ {}^{\mathrm{C}})\ ({}^{\mathrm{SP}}\ \text{i}\ \text{petta})\ {}^{\mathrm{C}}) : \mathrm{Fin}$   (C–Merge)

Lines 1 through 3 and 6 through 8 show the creation of two unmarked case NPs through the process of common nouns taking determiners as their arguments, via specifier merge. Line 4 asserts the lexical entry for the main verb of the sentence. In line 5, the object argument is taken as a complement. The final line shows the verb taking its subject argument to yield a finite sentence, through complement merge. The corresponding semantic derivation is as follows:

    1     2
    ---------- →E
        3         4
        ------------- →E
              5

1. $\vdash \lambda y\, \lambda x\, \mathsf{bang}'(y)(x) : \iota \to (\iota \to \pi)$   (Lexical)
2. $\vdash \mathsf{foot}' : \iota$   (Lexical)
3. $\vdash \lambda y\, \lambda x\, \mathsf{bang}'(y)(x)[\mathsf{foot}'] \rightsquigarrow \lambda x\, \mathsf{bang}'(\mathsf{foot}')(x) : \iota \to \pi$   (→E) (β)
4. $\vdash \mathsf{door}' : \iota$   (Lexical)
5. $\vdash \lambda x\, \mathsf{bang}'(\mathsf{foot}')(x)[\mathsf{door}'] \rightsquigarrow \mathsf{bang}'(\mathsf{foot}')(\mathsf{door}') : \pi$   (→E) (β)

6. Basic strategy

Since the cases we are examining are ones where wh-agreement is morphologically marked on the verbs, it is possible to think of the differing verb forms as having slightly different lexical specifications. Each verb must be sensitive to which one of its dependents (arguments and adjuncts) has something ‘missing’ and must in turn report that fact to ‘higher’ material, in order to preserve the informational pathway between filler and gap. Of course, this still fails to address the issue of adjunct extraction. Chamorro’s system for nominal case marking provides a tantalizing hint of how the empirical phenomenon may be captured. We have already noted that nominalized verbs force their complements to be marked with oblique case, rendering the complements morphologically similar to adjuncts in sentences which do not involve extraction. It is a fairly small step to assume, then, that the difference between argument and adjunct in extracted contexts is less than one initially expects. Those items which would be adjuncts in ordinary declarative sentences are actually arguments to wh-agreeing verbs. While it may initially seem counterintuitive to treat a verb as selecting for what would otherwise be an adjunct, I note that the wh-extracted adjunct agreement morphology in Chamorro appears in exactly the contexts where there is an adjunct gap, indicating that effectively, the verbs do require that the otherwise adjunctive material be present. As such, there is nothing terribly odd about analyzing adjuncts as arguments in these specific cases. This proposal is in some ways a spiritual successor to Bouma, Malouf and Sag (2001), which differs somewhat from the standard treatment of unbounded dependency constructions in HPSG. In HPSG, UDCs are modeled as a list of values for the SLASH feature, which is a nonlocal feature. The subcategorization requirements of various verbs are listed in SUBCAT (or COMPS / SUBJ, etc.), which is taken to be a local feature. We will deploy the concept of SLASH in a different way; it will be a particular ‘flavor’ of implication mnemonically abbreviated $\multimap_{\mathrm{SL}}$. Only local features are accessible for purposes of subcategorization, making it difficult to subcategorize for phrases which bear explicit gaps, which seems to be exactly what wh-agreeing Chamorro verbs do. That is, for purposes of subcategorization, there initially appears to be no way to differentiate between phrases with gaps and phrases without gaps in HPSG without modification to the framework. Bouma et al. (and subsequent work by Sag) do present a way to account for these facts in a phrase-structural manner. However, the purpose of the present work is not to debate the relative merits of different grammatical frameworks per se, but instead to illustrate how CVG provides a natural way of accounting for phenomena like wh-agreement.
It would be possible to argue, instead, that adjuncts select for characteristically marked verbs. This has two unappealing consequences. First, in situations where the entire adjunct-corresponding constituent is extracted, it seems odd to think of ‘missing’ material subcategorizing for other material which is overt. Second, this fails to address the issue of the morphological difference between the verbs themselves. If there is fundamentally no difference between the verbs, other than marking, then this would incorrectly allow one to construct grammatical sentences using wh-agreeing verbs where no extraction has taken place. Take, for example, the following lexical entry for the verb godde-tta ‘tie (up)’:

(13) $e : A \vdash \text{godde-tta} : (\mathrm{Obl} \multimap_{\mathrm{SL}} \mathrm{Obl}) \multimap_{\mathrm{C}} (\mathrm{Obl} \multimap_{\mathrm{C}} (\mathrm{Unm} \multimap_{\mathrm{C}} \mathrm{Fin}))$

From the right of the turnstile, we read this as ‘godde-tta takes, as its sequential list of complements, an oblique argument gap ($\mathrm{Obl} \multimap_{\mathrm{SL}} \mathrm{Obl}$), then an oblique NP (Obl, corresponding to a direct object), and finally an unmarked NP (Unm, corresponding to a subject), and yields a (hypothetical) finite sentence’. Here, ($\mathrm{Obl} \multimap_{\mathrm{SL}} \mathrm{Obl}$) represents the extracted instrument.
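The reading of (13) as a sequential list of complements can be checked mechanically. The following Python sketch is only an illustration: the class `Imp` and the helpers `show` and `complements` are hypothetical names, and the connective is printed in ASCII as `-o` followed by its flavor.

```python
from dataclasses import dataclass
from typing import List, Tuple, Union


@dataclass(frozen=True)
class Imp:
    """A flavored implication: arg -o_flavor res."""
    arg: "Type"
    flavor: str
    res: "Type"


Type = Union[str, Imp]


def show(t: Type) -> str:
    """Print a type, writing the connective as -o plus its flavor."""
    if isinstance(t, str):
        return t
    return f"({show(t.arg)} -o{t.flavor} {show(t.res)})"


def complements(t: Type) -> Tuple[List[Type], Type]:
    """Peel off C-flavored arguments in order; return them with the result type."""
    args: List[Type] = []
    while isinstance(t, Imp) and t.flavor == "C":
        args.append(t.arg)
        t = t.res
    return args, t


gap = Imp("Obl", "SL", "Obl")   # an oblique phrase containing an oblique gap
godde_tta = Imp(gap, "C", Imp("Obl", "C", Imp("Unm", "C", "Fin")))

args, result = complements(godde_tta)
print([show(a) for a in args], result)
# ['(Obl -oSL Obl)', 'Obl', 'Unm'] Fin
```

The SL-flavored type appears as an ordinary member of the complement list, which is what allows the verb to select for a gapped dependent.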


The issue of filler-gap connectivity is also addressed in a partially lexical manner. Verb forms which select for extracted material must contain the information that something is missing. I propose to treat this as a lexical specification: the verb forms themselves maintain the ‘gappy’ nature of their argument structure. This is represented in the lexical entry (13) above by the slashed variable e appearing as a hypothesis to the left of the turnstile. I take the semantics of wh-agreeing verbs to be similar to those of their non-wh-agreeing counterparts, with one significant difference: the translation and typing of the dependent containing the gap is slightly more complicated. This is illustrated in the semantics for godde-tta:

(14) $z : \iota \vdash \lambda f\, \lambda y\, \lambda x\, \mathsf{tie}'(\mathsf{with}'(f(z)))(y)(x) : (\iota \to \iota) \to (\iota \to (\iota \to \pi))$

In a non-wh-agreeing context, the semantic type of the dependent in question would be ι, an individual. Here, the argument which would otherwise correspond to the “missing” NP is represented by the hypothetical variable z, and the semantics of the gap itself are a function f of type ι → ι. Eventually, an identity function will arise as a result of the invocation of a hypothetical proof rule in the semantic logic. We will wish the verb to take this function as an argument which will be predicated of z, thus preserving the semantic connection between the original gap site and its ultimate binding.
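To see how the term in (14) behaves once the identity function is supplied, consider the following Python sketch. It is an informal model only: semantic terms are nested tuples, the hypothesis z is modelled as a function parameter, and the constants "goat", "we", and "rope" are hypothetical stand-ins for an object, an understood subject, and an eventual filler.

```python
def godde_tta(z):
    """Open term (14): the hypothesis z stays a parameter until it is bound."""
    return lambda f: lambda y: lambda x: ("tie", ("with", f(z)), y, x)


def identity(v):
    """What hypothesising and at once withdrawing a variable of type iota gives."""
    return v


def clause(z):
    # Feed the verb its gap argument, then its object, then its subject.
    return godde_tta(z)(identity)("goat")("we")


print(clause("z"))
# ('tie', ('with', 'z'), 'goat', 'we')


def question_body(z):
    """Withdrawing z at a later stage leaves a function awaiting the filler."""
    return clause(z)


print(question_body("rope"))
# ('tie', ('with', 'rope'), 'goat', 'we')
```

After the gap argument is consumed, z is still free in the result, so the link between the gap site and its eventual binder survives the verb's saturation.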

7. The role of hypothetical proof

The use of hypothetical proof in CVG allows us to distinguish phrases with gaps from ‘intact’ phrases in our syntactic typing, and it is possible to write lexical entries for the wh-agreeing verbs that effectively select for phrases with gaps of the requisite type. Since the logical formalism on which CVG is based is one which contains logical rules of hypothetical proof, it is possible to model wh-extraction using the introduction of hypotheses and their subsequent withdrawal via natural deduction rules analogous to implication introduction (here called a ‘move’ rule). The strategy is to treat a trace as a variable of a certain syntactic type which is stored in the SLASH field of the variable context. The label of this field will be systematically omitted in the rest of this paper for purposes of increasing legibility. We write ‘move’ rules which correspond to rules of hypothetical proof, which have the effect of modifying the syntactic typing so as to encode the fact that the term in question contains a hypothetical variable.

(15) If $t : A, \Gamma \vdash s : B$, then $\Gamma \vdash \lambda^{\mathrm{SL}}_{t}\, s : A \multimap_{\mathrm{SL}} B$.

It is precisely this mechanism that allows phrases with gaps to be subcategorized for. We now have a syntactic type, $A \multimap_{\mathrm{SL}} B$, which represents a syntactic term of type B that contains a gap of type A. However, it is important to note that the hypothesis has been discharged. Once a verb takes a constituent of this type as an argument, all record that it contains a gap is lost, since ‘traces’ are characterized by undischarged hypotheses in CVG. But the information that a gap exists must still be available to ‘higher’ material, for purposes of subsequent agreement and eventual association with a filler. The question remains: how may this information be maintained? The final step is to specify that the lexical entries for wh-agreeing verbs contain hypotheses of their own, allowing for selection from and embedding within higher material. This is illustrated, as previously noted, by the material to the left of the turnstile in (13). Verbs which subcategorize for material containing gaps themselves carry hypotheses. These are different from the hypotheses which have been withdrawn to create gaps in the typing of the material which is selected for. This has the nice benefit of providing a straightforward account of how multiple verbs along an extraction path may have differing agreement morphology. Since each verb which subcategorizes for a gap has its own hypothesis, it is possible to subsequently withdraw that hypothesis as well, and so on. The semantic ND rule of hypothetical proof corresponding to (15) ‘move’ is straightforward:

(16) If $x : A, \Gamma \vdash s : B$, then $\Gamma \vdash \lambda x\, s : A \to B$.

In the cases under examination, the hypothesizing and withdrawal of a syntactic variable is represented similarly in the semantics, albeit with a semantic variable instead. Using this hypothetical proof rule immediately after assuming a semantic variable results in the creation of an identity function on terms of the type of that variable. Since the variables in question here are of type ι, this rule yields a function of type ι → ι, corresponding to the $A \multimap_{\mathrm{SL}} A$ typing in the syntax. Recall the lexical entry given jointly in (13) and (14).
Semantically, the verb takes this identity function as its first argument, as it takes the first gap as its argument in the syntax. This identity is predicated of the argument in the semantics corresponding to the ‘missing’ dependent subcategorized for by the verb. Thus the gap is also maintained in the semantics, and a variable corresponding to that argument is returned intact and still available for eventual binding. Now we can see that the agreement between verbs and arguments containing gaps can easily and naturally be described by this iterative hypothesis introduction and withdrawal strategy. Once the hypothesis is withdrawn for the first time, it is no longer important what type it is, since the filler in all of these cases will just be a wh-word which is not marked for specific grammatical function. This lack of specific syntactic connectivity between filler and gap in wh-agreement phenomena is described by Chung (1998) as “not prototypical . . . they do not involve covarying values for the features of person or number” (p. 58). What is important to maintain is the knowledge that somewhere down the line, there is a gap. We now have everything we need to account for (5-a). A line-by-line derivation of the syntax and semantics for (5-a) follows in section 9.
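The iterative introduction and withdrawal of hypotheses can be traced in a small Python sketch of the syntactic side alone. It is a simplification under stated assumptions: contexts are tuples of (name, type) pairs, the type of the verb's own hypothesis is left as the placeholder "A" exactly as in (13), and "pro" is a hypothetical stand-in for the understood subject. The names `Sequent`, `hypothesis`, `move`, and `c_merge` are illustrative, not CVG terminology.

```python
from dataclasses import dataclass
from typing import Any, Tuple, Union


@dataclass(frozen=True)
class Imp:
    arg: "Type"
    flavor: str
    res: "Type"


Type = Union[str, Imp]
Context = Tuple[Tuple[str, Type], ...]


@dataclass(frozen=True)
class Sequent:
    ctx: Context   # undischarged hypotheses ('traces')
    term: Any
    typ: Type


def hypothesis(name: str, typ: Type) -> Sequent:
    return Sequent(((name, typ),), name, typ)


def move(seq: Sequent, name: str) -> Sequent:
    """Rule (15): withdraw hypothesis `name`, recording the gap in the type."""
    found = dict(seq.ctx)
    if name not in found:
        raise ValueError(f"no hypothesis named {name!r}")
    rest = tuple(h for h in seq.ctx if h[0] != name)
    return Sequent(rest, ("lam_SL", name, seq.term), Imp(found[name], "SL", seq.typ))


def c_merge(functor: Sequent, argument: Sequent) -> Sequent:
    """Complement Merge; the two contexts are pooled."""
    t = functor.typ
    if not (isinstance(t, Imp) and t.flavor == "C" and t.arg == argument.typ):
        raise TypeError("complement type mismatch")
    return Sequent(functor.ctx + argument.ctx, (functor.term, argument.term, "C"), t.res)


# The gap: hypothesise t : Obl and withdraw it at once.
gap = move(hypothesis("t", "Obl"), "t")

# Lexical entry (13), carrying its own hypothesis e : A.
verb = Sequent((("e", "A"),), "godde-tta",
               Imp(Imp("Obl", "SL", "Obl"), "C", Imp("Obl", "C", Imp("Unm", "C", "Fin"))))

step = c_merge(verb, gap)
clause = c_merge(c_merge(step, Sequent((), "ni chiba", "Obl")), Sequent((), "pro", "Unm"))
higher = move(clause, "e")

print(gap.ctx, step.ctx, clause.typ, higher.ctx)
```

The gap's own hypothesis is gone before the verb sees it, yet the resulting clause still carries `e`, which a higher verb can select for after a further application of `move`.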

8. Linearization and wrapping

VSO languages present an interesting problem for categorial grammars which do not make a distinction between syntactic dependency and word order. While it would be possible to maintain that verbs take their subjects as first arguments, this strategy has unpleasant semantic consequences. Dowty (2007) points out that doing things in this way considerably complicates the semantics with respect to the accessibility of arguments to adverbial modifiers, as well as violating cross-linguistic facts about syntactic obliqueness. One alternative is to treat dependency and word order as separate phenomena, the utility of which has been recognized at least since Curry (1961). It is possible to posit a ‘right wrap’ operation, originally formulated in Bach (1979), which is the basic strategy pursued by Dowty with respect to English, and which is the strategy I wish to pursue here, although with some additional complications. The syntactic term of the CVG triple is understood to model only syntactic dependencies (tectogrammar), with the actual linear word order represented in the phenogrammatical term (henceforth φ-term). The connectives of the syntactic logic are not prosodically interpreted themselves; instead, linearization takes place in a multi-step process. A separate phenogrammatical derivation proceeds in parallel to the syntactic derivation, resulting in the creation of “structured phonologies” which are subsequently given a linear, string-based interpretation. In this paper I have endeavored to make the syntactic analysis of Chamorro as simple as possible, although I grant the complete facts about word order in Chamorro are somewhat more complicated. The phonological logic of CVG is similar to that proposed in Oehrle (1994), although it is a positive typed lambda calculus rather than a higher order logic, and incorporates ideas from Morrill and Solias (1993).
It has one basic type St, as well as ‘wrappers’, which are ordered pairs of strings, denoted by terms of type St × St, abbreviated here as Wr. String-pair terms (s, t) are written $s \mathbin{\circ\uparrow} t$. Informally, this is meant to suggest an ‘insertion point’ between s and t, or the ‘position’ in the wrapper where the phonology of a wrapped constituent will appear. The phonologies of most words are denoted by constants of type St, but verbs, being wrappers, have a slightly more complicated form:

(17) $\vdash \text{HA-OTTU} \mathbin{\circ\uparrow} \varepsilon : \mathrm{Wr}$

Here, the constant ε denotes the empty string. String concatenation is denoted by the constant ◦, which is of type St → St → St (written as an infix operator for the sake of clarity). The usual term equivalences are assumed, ensuring that in a Henkin model, St is interpreted as a free monoid:

$a \circ \varepsilon = a$
$\varepsilon \circ a = a$
$(a \circ b) \circ c = a \circ (b \circ c)$

It is worth noting that ε is only taken to be identity on the concatenation operator, and not on an insertion point. This allows for a consistent definition of right wrap, and ensures that wrappers can be thought of as pairs of strings into which another string may be inserted. It is assumed that only strings are themselves pronounceable; of course, every word of a language is in principle pronounceable. This necessitates the definition of a function $\mathit{say}_{P}$ which has the effect of taking φ-terms and transforming them into strings:

$\mathit{say}_{\mathrm{St}} = \lambda x\, x$
$\mathit{say}_{\mathrm{Wr}} = \lambda u\, (\pi(u) \circ \pi'(u))$

Here, π and π are the first and second projections of the φ -term, roughly, the left and right side of a wrapper, respectively. It is now possible to give a right wrap a formal definition: the function rwrap of type Wr → St → Wr is the term λu λx (π (u) ◦↑ (x ◦ π (u))) (Mansfield, Scott, Pollard and Worth (2009)). What remains is to specify how this operation works with respect to Chamorro. My earlier assumption that all verbal dependents are effectively complements allows an easy interface between syntax and phenogrammar. Effectively, all complements are wrapped. An insertion point is maintained as long as is necessary, but multiple insertion points are disallowed. Since the subject corresponds to the final argument taken by the verb, it will appear to the verb’s immediate right, with objects to the right of the subject. The interface between syntax and phonology (or tectogrammar and phenogrammar) is envisioned similarly to the syntax-semantics interface. A series of ND rules is given detailing how the syntactic and phonological derivations may proceed. For the most part, I assume that simple concatenation is the method by which most phonological words combine in Chamorro. However, the verbs are different, and an enhanced rule of Complement Merge is required (where P is a metavariable ranging over φ -types): (18) Complement Merge (C–Merge): If Γ w, v : Wr, A C B, and Γ u, o : P, A, then Γ, Γ rwrap(w)(say(u)), (v o C ) : Wr, B. In order to complete the illustration of the mechanics of wrap in Chamorro, I offer the following phenogrammatical derivation of (1-b) (Ha-ottu i petta i


patas-su, ‘The door banged my foot’). As before, for expository simplicity, I hope that the reader will grant me the latitude to treat noun phrases as simple combinations of the form D ◦ N.

1. HA-OTTU ◦↑ ε : Wr (Lexical)
2. I ◦ PATAS-SU : St
3. rwrap(HA-OTTU ◦↑ ε)(say(I ◦ PATAS-SU)) = HA-OTTU ◦↑ ((I ◦ PATAS-SU) ◦ ε) : Wr (C–Merge, from 1 and 2)
4. I ◦ PETTA : St
5. rwrap(HA-OTTU ◦↑ ((I ◦ PATAS-SU) ◦ ε))(say(I ◦ PETTA)) = HA-OTTU ◦↑ ((I ◦ PETTA) ◦ ((I ◦ PATAS-SU) ◦ ε)) : Wr (C–Merge, from 3 and 4)

Subjected finally to say, the resulting string (with ◦ replaced by a space and parentheses eliminated) is, as expected, Ha-ottu i petta i patas-su.
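The wrapping mechanics can also be traced mechanically. The following is only a minimal sketch under stated assumptions: strings of type St are modelled as Python strings, ε as the empty string, ◦ as space-separated concatenation, and wrappers as pairs. The names `Wrapper`, `concat` and `c_merge_pheno` are illustrative and not part of CVG; `say` and `rwrap` follow the definitions given above.

```python
from dataclasses import dataclass

EMPTY = ""  # assumption: the empty string stands in for ε


def concat(s: str, t: str) -> str:
    """Concatenation ◦ with ε as identity; words are separated by a space."""
    if s == EMPTY:
        return t
    if t == EMPTY:
        return s
    return s + " " + t


@dataclass(frozen=True)
class Wrapper:
    """A φ-term of type Wr: a pair of strings with an insertion point between them."""
    left: str   # π(u)
    right: str  # π′(u)


def say(phi):
    """say_St is the identity; say_Wr concatenates the two sides of a wrapper."""
    if isinstance(phi, Wrapper):
        return concat(phi.left, phi.right)
    return phi


def rwrap(u: Wrapper, x: str) -> Wrapper:
    """Right wrap: insert x immediately to the right of the insertion point."""
    return Wrapper(u.left, concat(x, u.right))


def c_merge_pheno(w: Wrapper, u) -> Wrapper:
    """Phenogrammatical half of C-Merge: rwrap(w)(say(u))."""
    return rwrap(w, say(u))


verb = Wrapper("HA-OTTU", EMPTY)        # step 1
obj = concat("I", "PATAS-SU")           # step 2
step3 = c_merge_pheno(verb, obj)        # step 3
subj = concat("I", "PETTA")             # step 4
step5 = c_merge_pheno(step3, subj)      # step 5
print(say(step5))
```

Because the subject is the last argument to be wrapped, it ends up immediately to the right of the verb, with the object following it.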

9. Derivation of selected example

9.1. Rules

The first three rules (Right Merge, Case Merge, and Specifier Merge) are given as syntactic proof rules only. This is not in any way intended to be a claim about the insignificance of their semantics. Right Merge is used with the future marker pära (glossed FUT below), and Case Merge and Specifier Merge are concerned with the formation of noun phrases. As noted previously, I do not wish to discuss the semantics of these particular constructions, as they are not immediately relevant to the phenomenon under discussion. All other rules are written as interface rules, specifying both syntax and semantics.

(19) Right Merge (R–Merge): If Γ ⊢ m : A ⊸_R B, and Γ′ ⊢ n : A, then Γ, Γ′ ⊢ (m n ^R) : B.

This is a simple modus ponens (implication elimination) rule schema that is specifically relative to the R connective. This will allow for VP modification, used here with respect to the future marker pära.

(20) Case Merge (CA–Merge): If Γ ⊢ m : A ⊸_CA B, and Γ′ ⊢ n : A, then Γ, Γ′ ⊢ (m n ^CA) : B.

As above, but relative to the CA connective. This will allow case marking of NPs.

(21) Specifier Merge (SP–Merge): If Γ ⊢ n : A ⊸_SP B, and Γ′ ⊢ d : A, then Γ, Γ′ ⊢ (^SP d n) : B.


As above, but relative to the SP connective. This will allow for the combination of nouns and determiners.

(22) Lexical Entry / Axiom: ⊢ a, a′ : A, B.

This allows us to assert axioms, or lexical entries for words.

(23) Complement Merge (C–Merge): If Γ ⊢ v, v′ : A ⊸_C B, C → D, and Γ′ ⊢ o, o′ : A, C, then Γ, Γ′ ⊢ (v o ^C), v′(o′) : B, D.

This is a modus ponens (implication elimination) rule schema that is specifically relative to the C connective. This will be used to allow verbs to take their complements. In the semantic logic, it corresponds to function application.

(24) Trace: t, t′ : A, B ⊢ t, t′ : A, B

This rule schema introduces syntactic and semantic hypotheses, which we conceive as typed variables in each logic.

(25) Move: If t, x : A, A′, Γ ⊢ s, s′ : B, B′, then Γ ⊢ λ_t^SL s, λx s′ : A ⊸_SL B, A′ → B′ (x fresh).

This rule schema is a rule of hypothetical proof (implication introduction), relative syntactically to the SL (SLASH) flavor of implication, for terms of type A, where A represents a metavariable ranging over the set of types {Unm, Obl} (unmarked or oblique case NPs, respectively). Semantically, this implication introduction is the lambda-binding of a hypothetical semantic variable.

(26) Wh-Question Rule: If ⊢ w, w′ : Wh, (ι → π) → κ1, and ⊢ s, s′ : A ⊸_SL Fin, ι → π, then ⊢ q(w, s), w′(s′) : Q, κ1.

This is a non-logical rule (or term constructor) schema allowing the formation of wh-questions. Semantically, it corresponds to function application, yielding a κ1-type, the type of unary constituent questions.

9.2. Embedded constructions

(27) Hafa sinangan-ña si Juan pära godde-tta ni chiba __ ?
     what? WH[OBJ].say.his-AGR UNM Juan FUT WH[OBL].tie.our-AGR OBL goat
     ‘What did Juan say that we should tie up the goat with?’ (Hukari and Levine (2006))

This example exhibits extraction across more than one verb. The lower verb, godde-tta, is nominalized (by -tta), but the infix -in- is absent, indicating that the agreement pattern is for an extracted oblique (an instrument, in this case). The higher verb sinangan-ña bears the -in- infix as well as nominalization (-ña). This is the agreement pattern for an extracted object. The object, in this case, is the clausal complement headed by the lower verb.

9.3. Axioms / lexical entries

Syntax
⊢ Hafa : Wh
g : A ⊢ sinangan-ña : (A ⊸_SL Fin) ⊸_C (Unm ⊸_C Fin)
⊢ si : Name ⊸_CA Unm
⊢ Juan : Name
⊢ pära : (Unm ⊸_C Fin) ⊸_R (Unm ⊸_C Fin)
e : A ⊢ godde-tta : (Obl ⊸_SL Obl) ⊸_C (Obl ⊸_C (Unm ⊸_C Fin))
⊢ pro : Unm
⊢ ni : (Det ⊸_SP Unm) ⊸_CA Obl
⊢ chiba : Det ⊸_SP Unm

Semantics
⊢ what : (ι → π) → κ1
y : ι ⊢ λf λx say(f(y))(x) : (ι → π) → (ι → π)
⊢ j : ι
z : ι ⊢ λf λy λx tie(with(f(z)))(y)(x) : (ι → ι) → (ι → (ι → π))
⊢ we : ι
⊢ goat : ι

9.4. Syntax

1. e : A ⊢ godde-tta : (Obl ⊸_SL Obl) ⊸_C (Obl ⊸_C (Unm ⊸_C Fin)) (Lexical)
2. t : Obl ⊢ t : Obl (Trace)
3. ⊢ λ_t^SL t : Obl ⊸_SL Obl (Move, from 2)
4. e : A ⊢ (godde-tta λ_t^SL t ^C) : Obl ⊸_C (Unm ⊸_C Fin) (C–Merge, from 1 and 3)
5. ⊢ ni : (Det ⊸_SP Unm) ⊸_CA Obl (Lexical)
6. ⊢ chiba : Det ⊸_SP Unm (Lexical)
7. ⊢ (ni chiba ^CA) : Obl (CA–Merge, from 5 and 6)
8. e : A ⊢ ((godde-tta λ_t^SL t ^C) (ni chiba ^CA) ^C) : Unm ⊸_C Fin (C–Merge, from 4 and 7)
9. ⊢ pära : (Unm ⊸_C Fin) ⊸_R (Unm ⊸_C Fin) (Lexical)
10. e : A ⊢ (pära ((godde-tta λ_t^SL t ^C) (ni chiba ^CA) ^C) ^R) : Unm ⊸_C Fin (R–Merge, from 9 and 8)
11. ⊢ pro : Unm (Lexical)
12. e : A ⊢ ((pära ((godde-tta λ_t^SL t ^C) (ni chiba ^CA) ^C) ^R) pro ^C) : Fin (C–Merge, from 10 and 11)
13. ⊢ λ_e^SL ((pära ((godde-tta λ_t^SL t ^C) (ni chiba ^CA) ^C) ^R) pro ^C) : A ⊸_SL Fin (Move, from 12)
14. g : A ⊢ sinangan-ña : (A ⊸_SL Fin) ⊸_C (Unm ⊸_C Fin) (Lexical)
15. g : A ⊢ (sinangan-ña (λ_e^SL ((pära ((godde-tta λ_t^SL t ^C) (ni chiba ^CA) ^C) ^R) pro ^C)) ^C) : Unm ⊸_C Fin (C–Merge, from 14 and 13)
16. ⊢ si : Name ⊸_CA Unm (Lexical)
17. ⊢ Juan : Name (Lexical)
18. ⊢ (si Juan ^CA) : Unm (CA–Merge, from 16 and 17)
19. g : A ⊢ ((sinangan-ña (λ_e^SL ((pära ((godde-tta λ_t^SL t ^C) (ni chiba ^CA) ^C) ^R) pro ^C)) ^C) (si Juan ^CA) ^C) : Fin (C–Merge, from 15 and 18)
20. ⊢ λ_g^SL ((sinangan-ña (λ_e^SL ((pära ((godde-tta λ_t^SL t ^C) (ni chiba ^CA) ^C) ^R) pro ^C)) ^C) (si Juan ^CA) ^C) : A ⊸_SL Fin (Move, from 19)
21. ⊢ Hafa : Wh (Lexical)
22. ⊢ q(Hafa, λ_g^SL ((sinangan-ña (λ_e^SL ((pära ((godde-tta λ_t^SL t ^C) (ni chiba ^CA) ^C) ^R) pro ^C)) ^C) (si Juan ^CA) ^C)) : Q (Wh-Question Rule, from 21 and 20)
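The type bookkeeping of the syntactic derivation can be checked with a few lines of code. This is a minimal sketch, assuming that flavoured implications are represented as records carrying a flavour label, that contexts (Γ and the hypotheses e, g, t) are ignored, and that the metavariable A is treated as an atomic type for simplicity. The names `Atom`, `Imp`, `merge` and `move` are hypothetical and serve only this illustration.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Atom:
    name: str


@dataclass(frozen=True)
class Imp:
    """A flavoured implication: arg ⊸_flavor res."""
    flavor: str
    arg: "Ty"
    res: "Ty"


Ty = Union[Atom, Imp]


def merge(flavor: str, functor: Ty, argument: Ty) -> Ty:
    """Implication elimination relative to one flavour (R, CA, SP or C)."""
    if not isinstance(functor, Imp) or functor.flavor != flavor:
        raise TypeError("functor is not a " + flavor + "-implication")
    if functor.arg != argument:
        raise TypeError("argument type does not match")
    return functor.res


def move(hypothesis: Ty, body: Ty) -> Ty:
    """Implication introduction for the SL (SLASH) flavour."""
    return Imp("SL", hypothesis, body)


Obl, Unm, Fin, A = Atom("Obl"), Atom("Unm"), Atom("Fin"), Atom("A")
vp = Imp("C", Unm, Fin)

godde_tta = Imp("C", Imp("SL", Obl, Obl), Imp("C", Obl, vp))
para = Imp("R", vp, vp)
sinangan_na = Imp("C", Imp("SL", A, Fin), vp)

step3 = move(Obl, Obl)                      # Move
step4 = merge("C", godde_tta, step3)        # C-Merge
step8 = merge("C", step4, Obl)              # C-Merge with (ni chiba) : Obl
step10 = merge("R", para, step8)            # R-Merge
step12 = merge("C", step10, Unm)            # C-Merge with pro : Unm
step13 = move(A, step12)                    # Move
step15 = merge("C", sinangan_na, step13)    # C-Merge
step19 = merge("C", step15, Unm)            # C-Merge with (si Juan) : Unm
step20 = move(A, step19)                    # Move
print(step20)
```

Attempting `merge("R", godde_tta, step3)` raises a `TypeError`, since godde-tta is a C-implication and not an R-implication; this is the sense in which each merge rule is relative to its own connective.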


9.5. Semantics

1. z : ι ⊢ λf λy λx tie(with(f(z)))(y)(x) : (ι → ι) → (ι → (ι → π)) (Axiom)
2. z : ι ⊢ z : ι (Hypothesis)
3. ⊢ λz z : ι → ι (→I)
4. z : ι ⊢ λf λy λx tie(with(f(z)))(y)(x)[λz z] (→E)
   = λy λx tie(with(λz z [z]))(y)(x) (β)
   = λy λx tie(with(z))(y)(x) : ι → (ι → π) (β)
5. ⊢ goat : ι (Axiom)
6. z : ι ⊢ λy λx tie(with(z))(y)(x)[goat] (→E)
   = λx tie(with(z))(goat)(x) : ι → π (β)
7. ⊢ we : ι (Axiom)
8. z : ι ⊢ λx tie(with(z))(goat)(x)[we] (→E)
   = tie(with(z))(goat)(we) : π (β)
9. ⊢ λz tie(with(z))(goat)(we) : ι → π (→I)
10. y : ι ⊢ λf λx say(f(y))(x) : (ι → π) → (ι → π) (Axiom)
11. y : ι ⊢ λf λx say(f(y))(x)[λz tie(with(z))(goat)(we)] (→E)
   = λx say(λz tie(with(z))(goat)(we)[y])(x) (β)
   = λx say(tie(with(y))(goat)(we))(x) : ι → π (β)
12. ⊢ j : ι (Axiom)
13. y : ι ⊢ λx say(tie(with(y))(goat)(we))(x)[j] (→E)
   = say(tie(with(y))(goat)(we))(j) : π (β)
14. ⊢ λy say(tie(with(y))(goat)(we))(j) : ι → π (→I)
15. ⊢ what : (ι → π) → κ1 (Axiom)
16. ⊢ what(λy say(tie(with(y))(goat)(we))(j)) : κ1 (→E)
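The semantic derivation can be replayed with ordinary function application and abstraction. The sketch below is an illustration only, under the assumption that semantic constants may be modelled as Python functions that build printable strings, and that a lexical meaning with a free hypothesis variable (z for tie, y for say) is modelled as a function of that variable. The Python names are hypothetical.

```python
def tie(z):
    # z : ι ⊢ λf λy λx tie(with(f(z)))(y)(x)
    return lambda f: lambda y: lambda x: f"tie(with({f(z)}))({y})({x})"


def say(y):
    # y : ι ⊢ λf λx say(f(y))(x)
    return lambda f: lambda x: f"say({f(y)})({x})"


def what(p):
    # what : (ι → π) → κ1; the bound variable is printed as y for readability
    return f"what(λy.{p('y')})"


identity = lambda v: v                              # step 3: λz z

# Steps 4-8 (→E and β), then step 9 (→I) withdraws the hypothesis z.
step9 = lambda z: tie(z)(identity)("goat")("we")

# Steps 11-13 (→E and β), then step 14 (→I) withdraws the hypothesis y.
step14 = lambda y: say(y)(step9)("j")

# Step 16: the wh-word applies to the abstracted proposition.
step16 = what(step14)
print(step16)
```

Each →I step corresponds to a Python `lambda` over the hypothesis variable, and each →E step to a function call, so β-reduction is carried out by the interpreter.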


10. Summary

We have seen how a combination of natural-deduction-style hypothetical proof rules and lexical subcategorization for extraction sites can begin to model the initially puzzling facts about Chamorro wh-agreement. While it at first seems odd to treat verbs which do not ordinarily select for certain elements (say, instruments) as selecting for exactly those elements, this conclusion is motivated by the facts concerning verbal agreement morphology and extracted elements. Since the various verb forms appear with different syntactic structures, there is nothing unnatural about treating their selectional properties as different as well. A hypothetical-proof-based strategy therefore accounts effectively for the phenomenon of wh-agreement.

11. Future research

While this paper is intended to account only for the syntactic structure of constituent questions, the wh-agreement phenomenon also occurs in relative clauses and focus constructions. This analysis could plausibly be extended to those areas as well, although the entire paradigm for wh-agreement is more complicated than can be dealt with in the present forum. The following example has been pointed out to me by an anonymous reviewer:

(28) Hafa na patti gi atumobit malägu’ hao u-ma-fa’maolik __ ?
     What ? part LOC car AGR-want you WH-NOM-AGR-PASS-fix
     ‘Which part of the car do you want to be fixed?’

Cases where the verbal nominalization is optional, as well as unbounded dependency constructions where wh-agreement does not occur, are additional topics of current research. I consider these to be somewhat simpler, and assume that they are most likely accounted for in a manner similar to the one described in Pollard (2007). In this approach, syntactic hypotheses are simply maintained up until the point at which they are withdrawn and bound (by a wh-word, for example). This stands in mild contrast to the successive ‘hypothesize and withdraw’ strategy I have pursued here. In the above example, I assume that only the embedded verb actually subcategorizes for a gap. While such a strategy will likely account for the grammaticality of (28), it remains to be seen whether the ungrammaticality of a similar sentence with a non-agreeing embedding verb can be captured.

It has been suggested to me (by participants at the LMNLDS workshop at DGfS 30) that the wh-agreement phenomenon is similar to the morphological voice system in other Western Austronesian languages. Chung (1998) contends that the Chamorro voice system is fairly uncomplicated by comparison to languages like Tagalog, and that wh-agreement is not simply a voice alternation, but further examination of the Chamorro voice constructions (passive and antipassive) is necessary.

One further issue is the precise nature of Chamorro NP structure. Chamorro has a complicated morphological case marking system which interacts with different categories of NPs (common nouns, pronouns, and proper names) and determiners in interesting ways. These deserve greater attention on both the syntactic and semantic fronts than has been given here. While the internal construction of NPs is beyond the scope of this work, it is certainly of importance.

Bibliography

Bach, Emmon (1979): ‘Control in Montague Grammar’, Linguistic Inquiry 10, 515–531.
Bouma, Gosse, Rob Malouf and Ivan A. Sag (2001): ‘Satisfying Constraints on Extraction and Adjunction’, Natural Language and Linguistic Theory 19, 1–65.
Chung, Sandra (1998): The Design of Agreement: Evidence from Chamorro. University of Chicago Press, Chicago.
Chung, Sandra and Carol Georgopoulos (1984): Agreement with Gaps in Chamorro and Palauan. Stanford Agreement Conference. (Unpublished).
Curry, Haskell B. (1961): ‘Some Logical Aspects of Grammatical Structure’. In: R. Jakobson, ed., Proceedings of the Twelfth Symposium in Applied Mathematics. American Mathematical Society, Providence, RI.
Dowty, David (2007): Compositionality as an Empirical Problem. In: C. Barker and P. Jacobson, eds, Direct Compositionality. Oxford University Press, Oxford, pp. 23–101.
Dukes, Michael (2000): ‘Agreement in Chamorro’, Journal of Linguistics 36, 575–588.
Hukari, Thomas E. and Robert D. Levine (1995): ‘Adjunct Extraction’, Journal of Linguistics 31, 195–226.
Hukari, Thomas E. and Robert D. Levine (2006): The Unity of Unbounded Dependency Constructions. CSLI Publications, Stanford University.
Mansfield, Lia, Scott Martin, Carl Pollard and Chris Worth (2009): ‘Phenogrammatical Labelling in Convergent Grammar: The Case of Wrap’. Ms., Ohio State University; http://www.ling.ohio-state.edu/~scott/cvg/wrap.pdf.
McCloskey, James (2001): ‘The Morphosyntax of Wh-Extraction in Irish’, Journal of Linguistics 37, 67–100.
Moortgat, Michael (1988): Categorial Investigations: Logical and Linguistic Aspects of the Lambek Calculus. Walter de Gruyter, Berlin.
Morrill, Glyn and Teresa Solias (1993): Tuples, Discontinuity, and Gapping in Categorial Grammar. In: Proceedings of the Sixth Conference of the European Chapter of the Association for Computational Linguistics. ACL, Morristown, NJ, pp. 287–296.
Oehrle, Richard (1994): ‘Term-Labelled Categorial Type Systems’, Linguistics and Philosophy 17, 633–678.
Pollard, Carl and Ivan A. Sag (1994): Head-Driven Phrase Structure Grammar. University of Chicago Press, Chicago.
Pollard, Carl (2007): Nonlocal Dependencies via Variable Contexts. In: R. Muskens, ed., Proceedings of the Workshop on New Directions in Type-Theoretic Grammar. (ESSLLI 2007, Dublin); http://www.ling.ohio-state.edu/~hana/hog/pollard2007-nonlocal.pdf.
Pollard, Carl (2008a): Covert Movement in Logical Grammar. Presented at the Workshop on Continuations and Symmetric Calculi. (ESSLLI 2008, Hamburg, Germany); http://www.ling.ohio-state.edu/~scott/cvg/covert.pdf.
Pollard, Carl (2008b): The Calculus of Responsibility and Commitment. In: Proceedings of the Workshop on Ludics, Dialog, and Interaction, Autrans, France, May 2008; http://www.ling.ohio-state.edu/~scott/cvg/autrans.pdf.

Department of Linguistics Ohio State University

Gregory M. Kobele

Deriving Reconstruction Asymmetries

Abstract

There appears to be a systematic difference in the reconstructability of noun phrases and predicates. In this paper I show that reconstructing the A/A-bar distinction in terms of slash-feature percolation and movement allows for a simple derivational formulation of the principles of binding and scope which derives a generalization very much along the lines of the one presented by Huang (1993).

1. Introduction

One of the important results of the study of syntactic dependencies is that different construction types (passive, raising, relative clause formation, wh-movement, topicalization, etc.) have been revealed to systematically cohere in terms of the properties exhibited by their constitutive dependencies. Movement-type dependencies typically instantiate one of two attested dependency types, which are called A-movement and A-bar movement. Much work has gone into determining which properties movements of either of these two types have (see figure 1), with the ultimate goal being, of course, an explanation of why these facts obtain.

                                 A      A-bar
Reconstruction is obligatory     no     yes
Licenses parasitic gaps          no     yes
Induces crossover effects        no     yes
Can escape tensed clauses        no     yes

Figure 1. Some A, A-bar Distinctions

Even in the absence of an explanation as to why movement dependencies divide in two, researchers can use, and have used, the descriptive characteristics of these dependency types to classify other dependencies in other languages (such as scrambling; see Mahajan (1990); Müller and Sternefeld (1993); Tada (1993)).

Local Modelling of Non-Local Dependencies in Syntax, 477-500. Artemis Alexiadou, Tibor Kiss & Gereon Müller (eds.). LINGUISTISCHE ARBEITEN 547, de Gruyter 2012

1.1. Reconstruction

For a large number of cases, the simple principles of the binding theory and the basic scope principle (see figure 2) give correct results.

Principle A: A reflexive must be c-commanded by a co-indexed expression within the same tensed clause.
Principle B: A pronoun must not be c-commanded by a co-indexed expression within the same clause.
Principle C: A proper noun must not be c-commanded by a co-indexed expression.
Scope: A quantifier can bind a pronoun only if it c-commands the pronoun.

Figure 2. Principles of Binding, and of Scope

There are, however, a wide variety of exceptions to these principles, as exemplified below.

(1) Principle A
    a. Kwasi criticized himself.
    b. Himself, Kwasi criticized.
(2) Principle B
    a. *Kwasi_i criticized him_i.
    b. *Criticize him_i, Kwasi_i did.
(3) Principle C
    a. *He_i criticized Kwasi_i.
    b. *Criticize Kwasi_i, he_i did.
(4) Scope
    a. *He_i criticized every boy_i.
    b. *Every boy_i, he_i criticized.
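Since all four principles are stated in terms of c-command, it may help to see the relation computed on toy structures for (1-a) and (1-b). This is a minimal sketch under loudly stated assumptions: trees are nested tuples whose first element is a node label, the labels S, S' and VP and the trace t are illustrative choices, and c-command is simplified to "some sister of a contains b".

```python
def leaves(tree):
    """Return the words of a tree; a tree is a word or a tuple (label, child, ...)."""
    if isinstance(tree, str):
        return [tree]
    return [word for child in tree[1:] for word in leaves(child)]


def c_commands(tree, a, b):
    """True if word a has a sister constituent that contains word b."""
    if isinstance(tree, str):
        return False
    children = tree[1:]
    for i, child in enumerate(children):
        if child == a:
            sisters = children[:i] + children[i + 1:]
            if any(b in leaves(sister) for sister in sisters):
                return True
    return any(c_commands(child, a, b) for child in children)


# (1-a) Kwasi criticized himself.
base = ("S", "Kwasi", ("VP", "criticized", "himself"))

# (1-b) Himself, Kwasi criticized. (surface structure, with a trace t)
topicalized = ("S'", "himself", ("S", "Kwasi", ("VP", "criticized", "t")))

print(c_commands(base, "Kwasi", "himself"))         # antecedent c-commands reflexive
print(c_commands(topicalized, "Kwasi", "himself"))  # no longer the case on the surface
print(c_commands(topicalized, "himself", "Kwasi"))  # the reflexive c-commands instead
```

On the surface structure of (1-b) the reflexive is not c-commanded by its antecedent, whereas in the structure of (1-a) it is.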


According to principle A, reflexive pronouns must be co-indexed with a c-commanding DP in the same tensed clause. Thus, we correctly predict sentence (1-a) to be well-formed. However, sentence (1-b) is mysterious, as here the reflexive c-commands its supposed antecedent, violating principle A, but the sentence is nonetheless well-formed. Principle B states that pronouns must not be co-indexed with a c-commanding DP within the same clause. Thus, sentence (2-a) is correctly predicted to be ill-formed. Sentence (2-b) is also ill-formed, despite the fact that the pronoun and the DP it is co-indexed with are mutually independent with respect to the c-command relation, and thus do not violate principle B. The ill-formed sentence (3-a) has a proper noun c-commanded by a co-indexed DP (he), thus violating principle C. Principle C is not violated in the nevertheless ill-formed (3-b), as the proper noun and the pronoun are again mutually independent with respect to the c-command relation. Finally, it is correctly predicted that the pronoun in (4-a) cannot be bound by the non-c-commanding quantified noun phrase every boy. However, in (4-b), the quantified noun phrase does c-command the pronoun, but is still unable to bind it.

In each of the above cases, the second sentence of the pair can be subsumed under the binding theoretic or scope principles if these are taken not to apply to the surface structure representations of the mysteriously ill-/well-formed sentences, but rather to a representation in which the topicalized constituent is ‘put back’ in its pre-movement position (yielding sentences identical to the first member of each of the above pairs; see Chomsky (1976)). The phenomenon above (that the second of each of the above pairs can be dealt with by a slightly more abstract version of the binding/scope theory) is called ‘reconstruction’, after the literal reconstruction process proposed to account for these facts by Chomsky (1976).

1.2. The predicate/argument asymmetry

Huang (1993) notes that whereas moved DPs seem to be able to reconstruct in non-first-merged positions (as in sentence (5)), moved predicates (as in (6)) are much more restricted in their reconstruction possibilities.

(5) Which portrait of himself_{i/j} does Kwasi_i believe that Diego_j criticized?
(6) Criticize himself_{*i/j}, Kwasi_i believes that Diego_j did.

In sentence (5), there are two different potential antecedents for the anaphor; the sentence is semantically ambiguous. In other words, which portrait of himself can be reconstructed either in its θ position (sister to V), or in the intermediate SPEC-CP position. The fronted VP in sentence (6), on the other hand, can be reconstructed only in its deep structure position, as complement to Infl, as evidenced by the


fact that only the lower subject (Diego) is acceptable as an antecedent for the anaphor. Huang proposes to tie this contrast to the presence of an unbound trace in the fronted constituent. According to Huang (who adopts the VP-internal subject hypothesis (Koopman and Sportiche (1991))), the structure of (6) above is as in (7) below.

(7) [t_i criticize himself]_j Kwasi believes that Diego_i did t_j

Huang assumes that anaphors must be bound by a binder within the minimal node containing the expression of which the anaphor is an argument, along with all other arguments of that expression. In the case of verbs, this is the vP where all arguments of the verb have been base-generated. Since in (7) the fronted element contains all arguments of the verb, the anaphor must be bound by one of them (in this case, the trace t_i), regardless of whether or where the phrase is ‘reconstructed.’

1.3. A and A-bar movement in the Minimalist Program

Some of the properties distinguishing movement types, such as the inability of A-movement to license parasitic gaps, or to escape from tensed clauses, can be accounted for in purely configurational terms (i.e., without recourse to the distinction between movement types; see Nunes (2001); Kobele (2006; 2008)), and thus are better viewed as an accidental, rather than as an essential, characteristic of movement types. Attempts have been made to deal with the more semantic differences configurationally as well (Sportiche (2003)), but these lead to nonstandard analyses of even the most familiar phenomena. Perhaps the most influential account of the A/A-bar distinction in the minimalist program is propounded by Lasnik (1999) (see also Chomsky (1995); Fox (2000); Boeckx (2000)), who suggests that we attribute the different properties of these two movement types to whether a movement step leaves behind a copy of the moved item, or a simple trace (or nothing). His goal is to account for the observation that reconstruction into an A-position is not obligatory, whereas reconstruction into an A-bar position is. His analysis of the difference in behaviour between the two movement types is conceptually neat, as it traces the difference to a natural theoretical distinction (the difference between a full copy and an unstructured trace). However, it necessarily leaves some things to stipulation. First, no explanation seems readily available for the ban on improper movement; if movement can ‘decide’ to leave behind either a trace or a copy, why does the decision of a previous movement step influence the decision of the next? Second, why are the operations of grammar limited to copying and deletion/trace insertion? In Lasnik’s story, this isn’t


derivable from anything else; it must simply be stipulated. Finally, Lasnik’s idea seems committed to an LF perspective on interpretation, and incompatible with a compositional, Montagovian one. Lasnik’s theory can be thought of as embodying the intuition that expressions only ‘count as being there’ in their A-bar positions (where they survive as copies), not in their A-positions (where they are traces). Manzini and Roussou (2000) take this intuition literally, and reformulate Lasnik’s ideas derivationally. They observe that the ban on improper movement structures chains in such a manner as to require that as soon as an expression counts as being there, it must be there in all of its higher chain positions as well (figure 3).

[Figure 3. Chains, as per the ban on improper movement: a chain c_n, c_{n−1}, ..., c_i, c_{i−1}, c_{i−2}, ..., c_1 whose positions are divided into a ‘there’ segment and a ‘not there’ segment.]

This makes possible a simple timing account of the A/A-bar distinction, if we make the crucial assumption that an expression may begin satisfying dependencies before it is merged. Kobele (2007) shows how this can be done. He formalizes Manzini and Roussou’s ‘feature attraction’ operation using slash-feature percolation (as in generalized phrase-structure grammar; see Gazdar et al. (1985)), and shows how slash-features and movement can naturally co-exist in the minimalist grammar formalism (Stabler (1997)), giving rise without additional stipulation to the ban on improper movement.

The remainder of this paper is structured as follows. First, we introduce minimalist grammars with slash-feature percolation. Next, we show how the particular properties of slash-feature percolation force ‘moved’ predicates to be actually moved, and not introduced via slash-feature percolation. This fact, in conjunction with a natural theory of reconstruction (presented in the section thereafter), derives Huang’s generalization.

2. Slash features and movement

Slash-feature percolation and movement are two ways of establishing long-distance dependencies between tree positions. Though they appear to be inverses of each other, they have quite different formal properties: minimalist grammars with just slash-feature percolation are context-free, whereas minimalist grammars with just movement are mildly context-sensitive. As we will soon see, the reason for this difference lies in the fact that proper binding condition violating


remnant movement is possible only with movement, and is not simulable with slash-features.

2.1. Features

As the generating functions merge and move are taken to be universal and invariant, differences between languages reside in the lexicon. Lexical items have various features, which determine how they behave in the derivation. In addition to movement being feature driven, I will assume that merger is as well, and that, moreover, the kinds of features which are relevant for the merge and the move operations are distinct, and will be called selection and licensing features, respectively. Each feature has an attractor and an attractee variant (figure 4), and these must match in order for an operation to apply.

          attractor    attractee
merge     =x           x
move      +y           -y

Figure 4. Features

Each time an operation is applied, it checks both an attractor and an attractee feature, of the appropriate variety. We will also not attempt here to model inflection or agreement, nor to make a link between morphological features and the syntactic features of expressions which we take here to drive derivations. Thus, we make no distinction between interpretable and uninterpretable features: here all features behave as though uninterpretable.

2.2. Lexical items

Syntax relates form and meaning. Lexical items are the building blocks of this relation, atomic pairings of form and meaning, along with the syntactic information necessary to specify the distribution of these elements in more complex expressions. Here, simplifying somewhat, we take lexical items to be pairings of abstract lexemes such as dog, cat, bank1, . . . with feature bundles. Feature bundles are taken to be ordered (see Stabler (1997)), so that some features can be available for checking only after others have been checked. We will represent feature bundles as lists, and the currently accessible feature is at the beginning (leftmost) position of the list. An example lexical item is shown in figure 5. Its feature bundle ‘=d V’ indicates that it first selects a DP argument, and then can be selected for as a VP.

⟨praise, =d V⟩

Figure 5. A lexical entry for praise

Treating feature bundles as ordered is convenient, as it allows us to take explicit syntactic control of which arguments get selected when. Certain lexical items, such as the Saxon genitive ’s, are naturally thought of as selecting two syntactic arguments, an NP complement and a DP specifier. As the notions of complement and of specifier reduce to first-merged and not-first-merged in the context of the minimalist program, we need a way of ensuring that the first merged argument of ’s is the NP, and not the DP. Here we simply structure the feature bundle for ’s so as to have the noun phrase selected before the determiner phrase (the lexical entry might have the following form: ⟨’s, =n =D D⟩). One question which arises at this point is whether this ordering must be taken as a primitive, or whether it is derivable from some deeper property (or a conspiracy of such) of grammar.¹ A second question is whether a substantive theory of feature bundles can be developed, to which the here hypothesized linear ordering is perhaps only a rough approximation.² These questions are non-trivial, and cannot be pursued further here.

¹ The ordering between selection features (=x) might seem relatable to the semantic type of an expression, under natural assumptions about the syntax-semantics interface. However, with no obvious reason to prefer a function f of two arguments to its permuted counterpart f′, where f′(a)(b) = f(b)(a), this ‘reduction’ would seem to put the cart before the horse.

² An interesting approach is suggested by Müller (2010), where the licensor and selector features of an expression are ordered only with respect to other features of the same type (and so licensor features are ordered with respect to other licensor features, but not with respect to selector features). He proposes a general principle which forces licensor features to be checked as soon as possible. Note first that, regardless of the shape of a feature bundle, a derivation imposes a linear order on the features in each of the feature bundles of the items occurring in it: this order is simply the order in which those features were checked in that particular derivation. (Or, if we allow multiple features to be checked en masse, the derivation imposes an equivalence relation over features in feature bundles, and then imposes a total order over equivalence classes of features.) A feature bundle, then, stands proxy for a set of totally ordered feature bundles (or of totally ordered feature equivalence class bundles); these are given by those total orderings which can be imposed upon it by some derivation in which it can occur. (In the system here, each feature bundle, being already totally ordered, stands for the unit set containing just it.) A natural desideratum for a lexicalized grammar formalism is that it supports a type system rich enough to be able to completely account for the distribution of each expression with a single category (feature bundle). Thus, one goal of a substantive theory of feature bundles is to be able to derive all and only the necessary linearly ordered feature bundles which describe the distribution of a lexical item from a single feature bundle. Another goal is to make those feature bundles which describe the distribution of linguistically possible lexical items less marked (i.e., simpler) than those which do not. This ‘markedness’ can take the form either of simply ruling out linguistically impossible feature bundles, or of abbreviatory conventions (à la Chomsky (1965)) which make linguistically natural feature bundles more notationally natural.


2.3. Syntactic objects

We write lexical items using the notation ⟨α, δ⟩, where α is a lexeme (such as praise), and δ is a feature bundle (such as ‘=d V’). Complex expressions are written using a labeled bracket notation, as per the following:

[δ α β]

The above represents an expression whose head has feature bundle δ, and which consists of the two immediate sub-expressions α and β. As an example, if we assign the lexical entries ⟨’s, =n =D D⟩ and ⟨brother, n⟩ to the Saxon genitive and to brother respectively, then the complex expression ’s brother, which is the result of merging brother as the complement of ’s, is represented as the below.

[=D D ’s [ brother]]

As can be seen, the above expression has feature bundle ‘=D D’, which means that after it merges with an expression of category D (such as ⟨Kwasi, D⟩), it will itself be of category D.

[D Kwasi [ ’s [ brother]]]

Note that the features (D) of the head (’s) of the above expression are only represented once, and on the most maximal projection of that head. When a pair of brackets no longer has any unchecked features (as is the case with [ brother] and with [ ’s [ brother]] above), we no longer need to make these constituency distinctions (without features no constituent can move), and sometimes for convenience will leave those brackets out. The above expression would under this convention be rendered as per the following.

[D Kwasi ’s brother]

Moving expressions are simply sub-expressions which have not checked all of their features. For example, the expression below is a sentence (IP) which contains a wh-phrase which has not yet checked its -wh feature.

[i Kwasi [will [praise [-wh who]]]]

We will apply our convention on leaving out brackets here as well, eliminating those brackets around constituents whose heads have no unchecked features. The expression above would be given as the below.

[i Kwasi will praise [-wh who]]


Deriving Reconstruction Asymmetries

2.4. Merge

The merge operation applies to two arguments, A and B, resulting in the new object A + B, just in case the head of A has some selector feature =x as the first unchecked feature in its feature bundle, and the head of B has the corresponding x as the first unchecked feature in its feature bundle. In the resulting A + B, both first features used in this derivational step are checked, making available the next features of both feature bundles. There are two cases of the merge operation, depending on whether B will surface as a specifier or as a complement of A (figure 6). The first case of merge is the merger of a complement, or first-merge.

merge(A, B) = [A A B]    if B is a complement
merge(A, B) = [A B A]    if B is a specifier

Figure 6. Cases of merge
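The two cases in figure 6 can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the names Expr, show and merge, and the lexical flag used to tell first-merge from later merge, belong to this example only, and head movement is not modelled.

```python
from dataclasses import dataclass
from typing import List, Union


@dataclass
class Expr:
    features: List[str]                   # unchecked features of the head
    children: List[Union["Expr", str]]    # sub-expressions or bare lexemes
    lexical: bool = False                 # True for lexical items (assumption of this sketch)


def show(e: Union[Expr, str]) -> str:
    if isinstance(e, str):
        return e
    return "[" + " ".join(e.features) + " " + " ".join(show(c) for c in e.children) + "]"


def merge(a: Expr, b: Expr) -> Expr:
    """Check =x on A against x on B and combine the two expressions."""
    if not a.features or not b.features or a.features[0] != "=" + b.features[0]:
        raise ValueError("merge is undefined: first unchecked features do not match")
    arg = Expr(b.features[1:], b.children)        # B with its first feature checked
    if a.lexical:                                 # first-merge: B is a complement
        return Expr(a.features[1:], a.children + [arg])
    head = Expr([], a.children)                   # A's remaining features project to the root
    return Expr(a.features[1:], [arg, head])      # later merge: B is a specifier


criticize = Expr(["=d", "+k", "V"], ["criticize"], lexical=True)
himself = Expr(["d", "-k"], ["himself"], lexical=True)
print(show(merge(criticize, himself)))            # [+k V criticize [-k himself]]

genitive = Expr(["=n", "=D", "D"], ["'s"], lexical=True)
brother = Expr(["n"], ["brother"], lexical=True)
kwasi = Expr(["D"], ["Kwasi"], lexical=True)
print(show(merge(merge(genitive, brother), kwasi)))   # [D [ Kwasi] [ 's [ brother]]]
```

The complement ends up to the right of the head and the specifier to its left, which is the ordering effect the text attributes to the two cases.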

Let us represent the active voice head in English (little-v) with the following lexical item: ⟨-ε, =V +k =d v⟩.[3] Because the first feature in the feature bundle of this lexical item is =V, it can be merged with the expression below (the VP praise Kwasi).

[V praise [-k Kwasi]]

Note that the selecting lexical item is a suffix (marked by the hyphen preceding the lexeme). This triggers head movement from its complement.[4]

merge(⟨-ε, =V +k =d v⟩, [V praise [-k Kwasi]]) = [+k =d v praise-ε [t_praise [-k Kwasi]]]    (head movement)

Leaving out the traces, phonetically empty elements, and unnecessary internal structure, this expression can be abbreviated as the below.

[+k =d v praise [-k Kwasi]]

Note again that both of the matching first features of the arguments to merge (=V and V) have been checked in (i.e., deleted from) the result.

The second case of merge is merger of a specifier. This happens whenever the first argument to merge is not a lexical item (note that first-merge happens when the first argument is a lexical item). As can be seen in figure 6, the main difference between first and later merges lies in the positioning of the merged item with respect to the head (first merged expressions come after, later merged expressions before, the head). Thus, we are essentially computing the effects of Kayne’s (1994) Linear Correspondence Axiom (LCA) incrementally during the derivation of an expression. Readers who are uncomfortable with this can view it equivalently as a convenient shorthand for first building an unordered tree, and then applying the best current linearization algorithm to this unordered tree.

[3] The symbol ε is the empty string. The hyphen in front of it indicates that it is a suffix, and thus triggers head movement; its complement’s head raises to it.
[4] See Stabler (2001) for more details. The basic idea is that head movement is not syntactic (i.e., feature-driven) movement (see Matushansky (2006)).

2.5. Move

The move operation applies to a single syntactic object A just in case it contains a sub-expression B with first unchecked feature -y, and the first unchecked feature of A is +y. In order to rule out nondeterminacy, move will only be defined if there is exactly one such sub-expression beginning with a matching -y feature.[5] Just as with the merge operation, move checks the matching features of the expression to which it applies. As an example, consider the expression below (which is similar to praise Kwasi derived above, but with who having been merged instead of Kwasi).

[+k =d v praise-ε [-k -wh who]]

The move operation applies to this expression, as the main expression has as its first feature a licensor +k, and there is exactly one sub-expression whose first feature is the matching licensee -k.

move([+k =d v praise-ε [-k -wh who]]) = [=d v [-wh who] [praise-ε t_who]]

[5] This is a radical version of the Shortest Move Constraint (Chomsky (1995)), and will be called the SMC – it requires that an expression move to the first possible landing site. If there is competition for that landing site, the derivation crashes (because the losing expression will have to make a longer movement than absolutely necessary). Note that the SMC plays on the fact that structuring feature bundles as lists allows features to be temporarily ‘hidden’. Using the SMC as a constraint on movement has desirable computational effects (such as guaranteeing efficient recognizability – see Harkema (2001); Michaelis (2001)), although other constraints have been explored in Gärtner and Michaelis (2007). There are well-known counter-examples to this restriction on movement. Notable among them are the multiple-wh fronting constructions familiar in the Slavic languages (Rudin (1988)). To deal with such phenomena, it seems natural to introduce a wh-cluster forming operation, which ‘saves’ otherwise SMC violating configurations by fusing together the offending expressions (see Grewendorf (2001); see also fn. 20 in Gärtner and Michaelis (2005)). The wh-in-situ strategy of multiple questions (as in English) is dealt with by imposing a restriction on wh-cluster formation in such languages which requires that the phonological matrices of all fused wh-items but one be null – forcing spell-out of these wh-chains in their base (or case) positions.
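The move step just shown, together with the SMC, can be sketched as a search through a tree followed by a replacement. The Python below is a hedged illustration: Expr, show, find, replace and move are names chosen for this example, traces are written as a plain "t", and the empty suffix ε is left out of the lexeme for simplicity.

```python
from dataclasses import dataclass
from typing import List, Union


@dataclass
class Expr:
    features: List[str]                   # unchecked features of the head
    children: List[Union["Expr", str]]    # sub-expressions or bare lexemes


def show(e: Union[Expr, str]) -> str:
    if isinstance(e, str):
        return e
    return "[" + " ".join(e.features) + " " + " ".join(show(c) for c in e.children) + "]"


def find(e: Expr, feature: str) -> List[Expr]:
    """All proper sub-expressions of e whose first unchecked feature is `feature`."""
    hits = []
    for c in e.children:
        if isinstance(c, Expr):
            if c.features and c.features[0] == feature:
                hits.append(c)
            hits.extend(find(c, feature))
    return hits


def replace(e: Expr, target: Expr, new: str) -> Expr:
    """Copy of e in which the occurrence `target` (by identity) is replaced by `new`."""
    kids: List[Union[Expr, str]] = []
    for c in e.children:
        if c is target:
            kids.append(new)
        elif isinstance(c, Expr):
            kids.append(replace(c, target, new))
        else:
            kids.append(c)
    return Expr(e.features, kids)


def move(a: Expr) -> Expr:
    if not a.features or not a.features[0].startswith("+"):
        raise ValueError("move is undefined: no licensor feature +y")
    licensee = "-" + a.features[0][1:]
    candidates = find(a, licensee)
    if len(candidates) != 1:              # the SMC: exactly one matching sub-expression
        raise ValueError("move is undefined: need exactly one " + licensee)
    b = candidates[0]
    remainder = replace(Expr([], a.children), b, "t")
    return Expr(a.features[1:], [Expr(b.features[1:], b.children), remainder])


vp = Expr(["+k", "=d", "v"], ["praise", Expr(["-k", "-wh"], ["who"])])
print(show(move(vp)))                     # [=d v [-wh who] [ praise t]]
```

Note that find has to walk the whole tree before move can decide whether it applies at all; this is the unbounded search that the discussion of locality in section 2.7 sets out to eliminate.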


Again, leaving out traces and internal structure, we write the result of this move step as per the below.

[=d v [-wh who] praise]

Note again that both features involved in the move operation (+k and -k) are deleted/checked in the result.

2.6. Slash-feature percolation

The intuition behind Manzini and Roussou’s (2000) feature attraction mechanism is that features which need to be checked can be ‘piled up’, and checked en masse by a newly merged expression, provided that the source positions of the features form an appropriate movement chain with the newly merged expression. The slash-feature percolation mechanism introduced into minimalist grammars by Kobele (2007) is designed to minimize the bookkeeping necessary to ensure that the features which are piling up do indeed form a legitimate chain. As the move operation already builds movement chains, Kobele introduces a ‘dummy’ expression (an assumption), with content just a list of features. The move operation then treats this dummy expression just like a real one, thereby ensuring that the features that are ‘piling up’ do indeed stand in an appropriate relation to one another. The second step involves merging in an expression with appropriate features, i.e., one that could have taken the place of the dummy expression (thereby discharging the assumption). To do all this, two new operations are introduced: assume and discharge.

2.6.1. Assume

The operation assume takes a single argument, the first feature of whose head is =x, and results (non-deterministically) in an expression where the =x feature of the head has been satisfied by the assumption of an expression with initial feature sequence xδ (for some δ), as shown in figure 7.

assume([=xγ A]) → [γ A [δ xδ]]    if the assumption is a complement
assume([=xγ A]) → [γ [δ xδ] A]    if the assumption is a specifier

Figure 7. Cases of assume

Assumptions are written just as are normal expressions (i.e., in brackets notated with active features); however, instead of containing tree structure, all that is represented is the originally assumed feature sequence. For example, the below represents an assumed expression with a single feature (-wh) left to be checked, where the original assumption was d -k -wh.

[-wh d -k -wh]

As a further example, given the lexical item ⟨praise, =d V⟩, one possible output of applying the assume operation is given below.

[V praise [-k d -k]]

2.6.2. Discharge

Once an assumption is made, it needs eventually to be discharged. The discharge operation provides a means of doing this. The discharge operation takes as its two arguments an expression containing an assumption, and an expression which will replace, or ‘cash out’, this assumption. The assumption to be discharged must have exactly one unchecked feature remaining, and the original assumption must be an initial segment of the feature bundle of the expression replacing it.[6] As an example, consider the instance of the discharge operation below.

discharge([V praise [-k d -k]], [d -k -wh which boy]) = [V praise [-k -wh which boy]]

Note first that all features of the discharging element are deleted up to (but not including) the remaining feature in the assumption’s feature bundle (in this case, it is just the one d feature). Note also that the discharging expression actually replaces the assumption in the original expression.

[6] That there be at least one feature remaining is to ensure that the assumption is still ‘active’ at the point it is discharged. That there be at most one feature remaining is to make sure that not both move and discharge can be applied to the same hypothesis at any given time.

2.7. On locality

The reader will have noticed that the move and discharge operations have been presented ‘by example’, instead of being given a general definition as have been merge and assume. Let us denote by A_δ an expression of the form [δ α β], and by A⟨B⟩ an expression A which contains a designated occurrence of B. Then if A⟨B⟩ and C are expressions, A⟨C⟩ is the result of replacing the occurrence of B in A by C. We define move and discharge as per figure 8.

move(A_{+yδ}⟨B_{-yγ}⟩) = [δ B_γ A⟨t_B⟩]
discharge(A⟨[-y δ -y]⟩, B_{δ-yγ}) = A⟨B_{-yγ}⟩

Figure 8. move and discharge

In this figure, we see that both move and discharge have a non-local character in the following sense: in order to determine whether they can apply to an expression A, and, if so, what the result is, one must conduct a search of unbounded depth inside A (in the case of move for an expression B which is to be moved, and in the case of discharge, for a hypothesis which is featurally compatible with the second argument). However, because of the SMC, we can keep a finite list of which expressions internal to the main expression have which features. The representation of an expression praise Kwasi then looks as per the below, where the finite list of feature sequences to the right of the • indicates which moving pieces the expression contains.

[V praise [-k Kwasi]] • -k

In other words, we can simply enrich our representations of expressions with the information about which moving subcomponents they contain. This representational enrichment then completely eliminates the non-locality we observed in determining whether move and discharge can apply to an expression. The other source of non-locality (that of computing the result of move or discharge) can be similarly eliminated if, instead of representing just the features of the moving expressions internal to another after the •, we display the expressions themselves.

[V praise t_Kwasi] • [-k Kwasi]

We can redefine merge so that, if its second argument is going to move later on (i.e., if it has licensee features), a trace of it is merged in its place, and it is placed into the list of the first argument.

merge([=d V praise] • , [d -k Kwasi] • ) = [V praise t_Kwasi] • [-k Kwasi]

The same strategy of representing them external to the main expression can be applied to assumptions. The general case of merging a to-be-moved expression is given in figure 9.

merge(A_{=xγ} • φ, B_{x-yδ} • ψ) = [γ A t_B] • (φ B_{-yδ} ψ)

Figure 9. Merging an expression which will later move

Note that the lists of moving expressions in the arguments to merge are put together (denoted by juxtaposition) in the result, which results in something like slash-feature percolation (but with moving expressions instead of (or in addition to) slash-features). With this augmented representation, the superficial non-locality of move and discharge has been eliminated. Figure 10 shows the definitions of these operations on the augmented representations.

move(A_{+yδ} • (φ1 B_{-yγ} φ2)) = [δ B A] • (φ1 φ2)            if γ = ε
move(A_{+yδ} • (φ1 B_{-yγ} φ2)) = [δ t_B A] • (φ1 B_γ φ2)      otherwise
discharge(A • (φ1 [-y δ -y] φ2), B_{δ-yγ} • ψ) = A • (φ1 B_{-yγ} φ2 ψ)

Figure 10. move and discharge, locally

One might wonder whether and to what extent our augmented representations are ‘the same’ as the old ones (and thereby whether we have really shown that the non-locality of move and discharge is only apparent). As this augmented representation changes neither the derivations, nor the thing we ultimately derive, and is moreover easily obtainable from the original representation, it is not clear what kind of empirical content could be given to the claim that the augmented representations constitute a significantly different claim about the nature of our linguistic faculty. Indeed, all that we have done is identify where our operations have to do some extra work, determine that this extra work is in fact unnecessary, and change our representations so as to eliminate this extra work. It is like switching from decimal to binary representations of numbers, because we see that we have to multiply by two more often than by ten. That being said, I will continue to use the ‘non-local’ representation in the rest of this paper. It has the advantage of looking familiar.

2.8. Movement and slash-features

Adding slash-feature percolation (in the form of the operations assume and discharge) to the minimalist grammar framework results in an explosion of syntactic ambiguity. Even simple sentences like Kofi smiled, previously unambiguous, now have (at least) two derivations: one where the DP Kofi is merged with the verb smiled, and one where the DP Kofi discharges an assumption of the form [-k d -k]. However, although every derivation using slash-features has a corresponding derivation using movement, the reverse is not true.[7] In particular, slash-feature percolation cannot describe cases where a remnant ‘moves’ over something that has been extracted out of it (violating the proper binding condition, see Fiengo (1977)),[8] as in the configuration below.

[YP ... t_XP ...] ... XP ... t_YP

The reason for this is that at the point in the derivation where XP is put into its surface position, its source position (inside YP) does not yet exist. Instead, only an assumption that we will have a YP has been made – nothing has been said about whether we will also have an XP. By contrast, in the case of movement, both XP and YP are present in the derivation before either of them needs to move. What this means is that if we can find examples in language of remnant movement, then we will be able to analyze them only in terms of actual movement, not using slash-feature percolation. In terms of the theory in the following section, remnants must reconstruct below the lowest expression extracted out of them.

[7] The system in Kobele (2007) uses a slightly different formulation of the discharge operation, which, allowing for smuggling (in the sense of Collins (2005)), can no longer be simulated in every case by movement.
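The assume and discharge operations of section 2.6 can be sketched on the familiar tree-shaped representation as follows. This is an illustration under stated assumptions rather than a definitive implementation: the names Expr, show, assume and discharge are chosen for this example, only the complement case of assume is covered, and the requirement that exactly one assumption match is a simplification made here.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple, Union


@dataclass
class Expr:
    features: List[str]                          # unchecked features
    children: List[Union["Expr", str]]           # sub-expressions or bare lexemes
    assumed: Optional[Tuple[str, ...]] = None    # original feature sequence, for assumptions only


def show(e: Union[Expr, str]) -> str:
    if isinstance(e, str):
        return e
    if e.assumed is not None:                    # an assumption shows only its assumed features
        body = " ".join(e.assumed)
    else:
        body = " ".join(show(c) for c in e.children)
    return "[" + " ".join(e.features) + " " + body + "]"


def assume(a: Expr, hypothesis: Tuple[str, ...]) -> Expr:
    """Satisfy =x on A by hypothesising a complement with feature sequence x δ."""
    if not a.features or a.features[0] != "=" + hypothesis[0]:
        raise ValueError("assume is undefined: features do not match")
    hyp = Expr(list(hypothesis[1:]), [], assumed=tuple(hypothesis))
    return Expr(a.features[1:], a.children + [hyp])


def discharge(a: Expr, b: Expr) -> Expr:
    """Replace an assumption with one feature left by B, whose bundle it must prefix."""
    def matches(h: Expr) -> bool:
        return (h.assumed is not None and len(h.features) == 1
                and tuple(b.features[:len(h.assumed)]) == h.assumed)

    def collect(e: Expr) -> List[Expr]:
        hits = []
        for c in e.children:
            if isinstance(c, Expr):
                if matches(c):
                    hits.append(c)
                hits.extend(collect(c))
        return hits

    hits = collect(a)
    if len(hits) != 1:
        raise ValueError("discharge is undefined: no unique matching assumption")
    h = hits[0]
    new = Expr(b.features[len(h.assumed) - 1:], b.children)

    def rebuild(e: Expr) -> Expr:
        kids: List[Union[Expr, str]] = []
        for c in e.children:
            if c is h:
                kids.append(new)
            elif isinstance(c, Expr):
                kids.append(rebuild(c))
            else:
                kids.append(c)
        return Expr(e.features, kids, e.assumed)

    return rebuild(a)


praise = Expr(["=d", "V"], ["praise"])
which_boy = Expr(["d", "-k", "-wh"], ["which boy"])
step1 = assume(praise, ("d", "-k"))
print(show(step1))                        # [V praise [-k d -k]]
print(show(discharge(step1, which_boy)))  # [V praise [-k -wh which boy]]
```

The discharging expression keeps the one feature the assumption had left (-k) together with everything after it (-wh), as in the worked example of section 2.6.2.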

3. A theory of reconstruction

Minimalist grammars with slash-feature percolation give us a natural way of dealing with reconstruction phenomena, one that takes advantage of the sudden multiplicity of derivations for sentences. What the hybrid merge-move and assume-discharge system does is to allow moving elements to be introduced into the derivation at any point between their chain-initial and chain-final positions (figure 11). It is thus natural to tie the point at which an expression is inserted into the derivation to its reconstruction possibilities. We can think of two obvious ways to do this. First, we might demand that an expression be reconstructed into the position at which it enters the derivation. This approach minimizes as much as possible the potential spurious ambiguity introduced into the minimalist grammar system by having both movement and slash-feature percolation. The other option is a relaxation of the first, requiring only that an expression be reconstructed no lower than the position at which it enters the derivation. Only with the first approach however will we be able to derive Huang’s generalization without making any further assumptions.[9]

[Derivation trees not reproduced. Node labels: merge, move, assume, discharge; leaves: C[Q], Infl, criticize, Diego, who.]

Figure 11. Three derivations for the sentence Who criticized Diego?

[8] The slash-feature mechanism here thus implements a (restricted, as there is no downward movement) version of the trace-binding algorithm sought after by Pullum (1979).
[9] We can think of both of these options as similar to Epstein and Seely’s (2006) ideas on syntactic relations, whereby relations such as c-command are incrementally specified during the derivation, with each derivational step potentially adding new relata to the relation.

We restate Principle A of the Binding Theory in the following terms (whether and how the other principles in figure 2 should be relativized to positions in which the talked about elements entered the derivation is orthogonal to the present issue).

Principle A (revised)
A reflexive must be c-commanded by a co-indexed expression within the first tensed clause above the point at which it entered the derivation.

Now that we have specified both a syntactic theory (minimalist grammars with both slash-feature percolation and movement) and an interpretative theory (the revision of Principle A above), we are finally in a position to derive Huang’s generalization. Let us fix our analysis of a tiny fragment of English consisting of transitive verbs like criticize, and of sentential complement verbs like believe, as in figure 12.

⟨will, =v +k S⟩          ⟨-ε, =V =d v⟩
⟨criticize, =d +k V⟩     ⟨believe, =S =d v⟩
⟨Kwasi, d -k⟩            ⟨himself, d -k⟩
⟨ε, =v v -top⟩           ⟨ε, =S +top S⟩

Figure 12. A fragment of English

The lexicon given in figure 12 generates many non-sentences, such as Himself will criticize Kwasi. These will be ruled out by principle A of the binding theory, which acts as a restriction on the distribution of the lexical item himself. With the syntactic operations our grammar formalism permits us, the only derivation of a sentence like Criticize himself, Kwasi will is one where the verb phrase criticize himself is first merged as sister to will.[10]

[10] As we shall see, this is not a good description of what happens, as it is not the v criticize himself which is sister to will, but rather the vP Kwasi criticize himself.
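The fragment in figure 12 is small enough to write down as data. The Python sketch below is only an illustration: the names LEXICON, kind and category are assumptions of this example, the empty string stands in for ε, and a bare hyphen stands in for the suffix -ε.

```python
from typing import List, Tuple

# (lexeme, feature bundle) pairs transcribed from figure 12.
LEXICON: List[Tuple[str, str]] = [
    ("will", "=v +k S"),
    ("-", "=V =d v"),          # the suffix written -ε in the text
    ("criticize", "=d +k V"),
    ("believe", "=S =d v"),
    ("Kwasi", "d -k"),
    ("himself", "d -k"),
    ("", "=v v -top"),         # phonetically empty head
    ("", "=S +top S"),         # phonetically empty head
]


def kind(feature: str) -> str:
    """Classify a feature by its prefix."""
    if feature.startswith("="):
        return "selector"
    if feature.startswith("+"):
        return "licensor"
    if feature.startswith("-"):
        return "licensee"
    return "category"


def category(bundle: str) -> str:
    """The category feature of a bundle, i.e. the x that a selector =x looks for."""
    cats = [f for f in bundle.split() if kind(f) == "category"]
    if len(cats) != 1:
        raise ValueError("expected exactly one category feature in: " + bundle)
    return cats[0]


# Items with identical bundles have identical syntactic distribution.
same_as_kwasi = [word for word, bundle in LEXICON if bundle == "d -k"]
print(same_as_kwasi)                 # ['Kwasi', 'himself']
print(category("=v v -top"))         # v
```

Because Kwasi and himself carry the same feature bundle, the syntax alone cannot tell them apart, which is why the fragment overgenerates strings such as Himself will criticize Kwasi and leaves their exclusion to Principle A.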


First, we give a well-formed derivation of this sentence. We begin by merging together criticize and the anaphor himself.

1. merge(⟨criticize, =d +k V⟩, ⟨himself, d -k⟩)
   [+k V criticize [-k himself]]

Next the verb assigns case to its object. Note that both +k and -k features of the verb and its object are checked by the move operation.

2. move(1)
   [V [ himself] [criticize t_himself]]

In the next step, little-v merges with the previous expression. Note that this triggers head movement of the V criticize.

3. merge(⟨-ε, =V =d v⟩, 2)
   [=d v criticize-ε [[ himself] [t_criticize t_himself]]]

Then the agent argument is selected.

4. merge(3, ⟨Kwasi, d -k⟩)
   [v [-k Kwasi] [criticize-ε [[ himself] [t_criticize t_himself]]]]

Next, the vP is marked as requiring topicalization.

5. merge(⟨ε, =v v -top⟩, 4)
   [v -top ε [[-k Kwasi] [criticize-ε [[ himself] [t_criticize t_himself]]]]]

Then will selects the above vP.

6. merge(⟨will, =v +k S⟩, 5)
   [+k S will [-top ε [[-k Kwasi] [criticize-ε [[ himself] [t_criticize t_himself]]]]]]

At this point, the expression derived is getting too big to be written on a single line, and so we abbreviate it using the convention discussed earlier as the below.

[+k S will [-top [-k Kwasi] criticize himself]]

The next step of the derivation is to move the subject Kwasi for case. Note again that both +k and -k features are checked.


7. move(6)
   [S [Kwasi] will [-top t_Kwasi criticize himself]]

Next, a head hosting a +top feature merges with the expression thus far derived.

8. merge(⟨ε, =S +top S⟩, 7)
   [+top S ε [[Kwasi] will [-top t_Kwasi criticize himself]]]

Finally, the vP moves to check its -top feature.

9. move(8)
   [S [t_Kwasi criticize himself] [ε [[Kwasi] will t_{criticize himself}]]]

Abbreviating the expression derived in 9 as per our conventions, we obtain the below.

[S criticize himself Kwasi will]

As per our theory of reconstruction, because in the derivation of this sentence the fronted predicate was merged low, it behaves for the purposes of reconstruction as though it were in its base position. A (short and unsuccessful) derivation which attempts to use slash-feature percolation to deal with the topicalized VP follows. This shows this sentence to be derivationally unambiguous (at least with respect to the fronted predicate), and thus (generalizing a little) that the only reconstructive possibilities available to sentences with fronted predicates are those in which the predicate is interpreted in its base position. The previous derivation introduced the vP in its lower chain position; to introduce it (via discharge) in its higher chain position, the to-be-moved vP is first introduced as an assumption.

1. assume(⟨will, =v +k S⟩)
   [+k S will [-top v -top]]

Already at this point, the derivation can continue no further, as there is no subexpression with matching feature -k which can be used to check the +k feature of will. In other words, while we have assumed that we will find some expression with feature bundle v -top, we do not know that it will itself contain a moving expression waiting to check its -k feature.[11]

[11] It is instructive to consider how to extend the system to allow this kind of derivation. In other words, why can’t we simply make our hypotheses more explicit (so, not just ‘we have some expression with feature bundle v -top’, but rather ‘we have some expression with feature bundle v -top, which itself contains an expression with feature bundle -k’)? The reasoning is subtle, and revealing of an important but oft neglected fact. The answer is, simply put, that there are no expressions with feature bundle -k derivable in our grammars! The DP moving for case (with feature bundle -k) does not exist in isolation, but only as a part of a larger containing expression. This is true not just of minimalist grammars, but in all variants of minimalism. Formally speaking, we are confronted with the distinction between derivational and derived constituents. A DP with feature bundle -k is a derived constituent, but the discharge operation (and grammatical operations in general) are defined only over derivational constituents.

We can compare the derivation of predicate fronting sentences to those with fronted DPs, like Which portrait of himself did Kwasi believe that Diego criticized?, which allow for the fronted DP which portrait of himself to be reconstructed either in its base position (thereby giving rise to the reading in which himself is coreferent with Diego) or in the intermediate SPEC-CP position (giving rise to the reading where himself and Kwasi are coreferent). Here we will see only two derivations of the (shorter) sentence Which portrait of himself did Kwasi criticize, which serves to illustrate the fact that the fronted DP is not restricted in its reconstruction possibilities.[12] We first extend our lexicon in figure 12 with the complex wh-anaphor which portrait of himself, to which we assign the type d -k -wh,[13] and a +WH Comp position.

⟨which portrait of himself, d -k -wh⟩
⟨-ε, =S +wh S⟩

[12] The fact that the anaphor will be unbound in the second derivation is not of concern to us here.
[13] Of course, this is a complex expression composed (at least) of the lexical items which, portrait, and himself. As the internal structure of this expression is not relevant for our purposes here (which are simply to investigate the reconstruction possibilities allowed to sentences containing it), we can safely ignore these niceties.

In the first derivation of interest to us, the wh-phrase is introduced in its case position via the operation discharge. This derivation will correspond to a reading of the sentence where the anaphor is bound by its co-argument. We begin by assuming the existence of an appropriate DP for criticize.

1. assume(⟨criticize, =d +k V⟩)
   [+k V criticize [-k d -k]]

Next, we discharge the assumption by replacing it with which portrait of himself.

2. discharge(1, ⟨which portrait of himself, d -k -wh⟩)
   [+k V criticize [-k -wh which portrait of himself]]

Case is then assigned to the object.


3. move(2)
   [V [-wh which portrait of himself] [criticize t_which]]

Next the little-v head selects the VP thus derived.

4. merge(⟨-ε, =V =d v⟩, 3)
   [=d v criticize-ε [[-wh which portrait of himself] [t_criticize t_which]]]

We then merge the agent Kwasi.

5. merge(4, ⟨Kwasi, d -k⟩)
   [v [-k Kwasi] [criticize-ε [[-wh which portrait of himself] [t_criticize t_which]]]]

As our derived expression is rapidly becoming unwieldy, we abbreviate it as per our convention as the below.

[v [-k Kwasi] criticize [-wh which portrait of himself]]

In the next derivational step, will selects the expression derived thus far.

6. merge(⟨will, =v +k S⟩, 5)
   [+k S will [[-k Kwasi] criticize [-wh which portrait of himself]]]

Next the subject raises to check its case features.

7. move(6)
   [S [ Kwasi] [will [t_Kwasi criticize [-wh which portrait of himself]]]]

The next derivational step is to introduce a +WH Comp, the merger of which induces head movement of the infl element will.

8. merge(⟨-ε, =S +wh S⟩, 7)
   [+wh S will-ε [[ Kwasi] [t_will [t_Kwasi criticize [-wh which portrait of himself]]]]]

Finally, the wh-phrase moves to check its wh-feature.

9. move(8)
   [S [ which portrait of himself] [will-ε [[ Kwasi] [t_will [t_Kwasi criticize t_which]]]]]

[Derivation trees not reproduced. Node labels: merge, move, assume, discharge; leaves: C[Q], will, v, Kwasi, criticize, which portrait of himself.]

Figure 13. Two derivations of the sentence which portrait of himself will Kwasi criticize?

In this derivation, the wh-anaphor is introduced early enough to be bound by its co-argument, Kwasi, as depicted in the tree on the left in figure 13. Introducing the wh-anaphor after the S containing its co-argument has been completed (as depicted in the tree on the right in figure 13) would force it to be bound by an argument in a higher clause. This second derivation begins with the assumption of a +wh DP as the object of criticize.

1. assume(⟨criticize, =d +k V⟩)
   [+k V criticize [-k -wh d -k -wh]]

Next, case is assigned to the hypothetical object.

2. move(1)
   [V [-wh d -k -wh] [criticize t_{d -k -wh}]]

Next, little-v is merged, triggering head movement of criticize.

3. merge(⟨-ε, =V =d v⟩, 2)
   [=d v criticize-ε [[-wh d -k -wh] [t_criticize t_{d -k -wh}]]]

The agent is merged.


4. merge(3, ⟨Kwasi, d -k⟩)
   [v [-k Kwasi] [criticize-ε [[-wh d -k -wh] [t_criticize t_{d -k -wh}]]]]

Next will is merged with the vP.

5. merge(⟨will, =v +k S⟩, 4)
   [+k S will [[-k Kwasi] [criticize-ε [[-wh d -k -wh] [t_criticize t_{d -k -wh}]]]]]

Abbreviating, we obtain the below.

[+k S will [-k Kwasi] criticize [-wh d -k -wh]]

The subject moves to receive case.

6. move(5)
   [S [Kwasi] [will t_Kwasi criticize [-wh d -k -wh]]]

Next, a +WH position is made available, and will head-moves to the new head.

7. merge(⟨-ε, =S +wh S⟩, 6)
   [+wh S will-ε [[Kwasi] [t_will t_Kwasi criticize [-wh d -k -wh]]]]

Now, the expression which portrait of himself discharges the assumption made earlier. Note that the anaphor is outside the binding domain of the subject Kwasi.

8. discharge(7, ⟨which portrait of himself, d -k -wh⟩)
   [+wh S will-ε [[Kwasi] [t_will t_Kwasi criticize [-wh which portrait of himself]]]]

Finally, the wh-phrase moves to check its wh feature.

9. move(8)
   [S [which portrait of himself] [will-ε [[Kwasi] [t_will t_Kwasi criticize t_which]]]]
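The two derivations above, and the failed attempt at predicate fronting, can be replayed at the level of features alone, using the local format of figures 9 and 10 (a head's features plus a list of moving pieces). The Python below is a hedged sketch: all names (Mover, Expr, lex, require and the four operations) are assumptions of this example, strings and head movement are ignored, and the check that an assumption is discharged before its last feature is used is a simplification added here.

```python
from dataclasses import dataclass
from typing import Tuple

Feats = Tuple[str, ...]


@dataclass(frozen=True)
class Mover:
    feats: Feats          # features still to be checked
    name: str             # "?" for an undischarged assumption
    assumed: Feats = ()   # originally assumed sequence (empty for real expressions)


@dataclass(frozen=True)
class Expr:
    feats: Feats
    movers: Tuple[Mover, ...] = ()
    name: str = ""


def require(ok: bool, msg: str) -> None:
    if not ok:
        raise ValueError(msg)


def lex(name: str, bundle: str) -> Expr:
    return Expr(tuple(bundle.split()), (), name)


def merge(a: Expr, b: Expr) -> Expr:
    require(a.feats[0] == "=" + b.feats[0], "merge: features do not match")
    moving = (Mover(b.feats[1:], b.name),) if b.feats[1:] else ()
    return Expr(a.feats[1:], a.movers + moving + b.movers, a.name)


def move(a: Expr) -> Expr:
    require(a.feats[0].startswith("+"), "move: no licensor")
    hits = [m for m in a.movers if m.feats[0] == "-" + a.feats[0][1:]]
    require(len(hits) == 1, "move: nothing to move, or SMC violated")
    require(not (hits[0].assumed and len(hits[0].feats) == 1),
            "move: assumption must be discharged first")
    rest = tuple(Mover(m.feats[1:], m.name, m.assumed) if m is hits[0] else m
                 for m in a.movers)
    return Expr(a.feats[1:], tuple(m for m in rest if m.feats), a.name)


def assume(a: Expr, hypothesis: str) -> Expr:
    hyp = tuple(hypothesis.split())
    require(len(hyp) > 1 and a.feats[0] == "=" + hyp[0], "assume: bad hypothesis")
    return Expr(a.feats[1:], a.movers + (Mover(hyp[1:], "?", hyp),), a.name)


def discharge(a: Expr, b: Expr) -> Expr:
    hits = [m for m in a.movers if m.assumed and len(m.feats) == 1
            and b.feats[:len(m.assumed)] == m.assumed]
    require(len(hits) == 1, "discharge: no matching assumption")
    new = Mover(b.feats[len(hits[0].assumed) - 1:], b.name)
    return Expr(a.feats, tuple(new if m is hits[0] else m for m in a.movers) + b.movers, a.name)


will, v, comp = lex("will", "=v +k S"), lex("-e", "=V =d v"), lex("-e", "=S +wh S")
criticize, kwasi = lex("criticize", "=d +k V"), lex("Kwasi", "d -k")
wh = lex("which portrait of himself", "d -k -wh")

# First derivation: the wh-phrase enters at step 2, in its case position.
e = move(discharge(assume(criticize, "d -k"), wh))
early = move(merge(comp, move(merge(will, merge(merge(v, e), kwasi)))))

# Second derivation: the wh-phrase enters at step 8, above the completed S.
l = move(merge(will, merge(merge(v, move(assume(criticize, "d -k -wh"))), kwasi)))
late = move(discharge(merge(comp, l), wh))

print(early == late, early.feats, early.movers)
```

Both runs end in the same feature state (category S with nothing left to move), so the sketch cannot distinguish them by their outcome; they differ only in the step at which the wh-phrase is inserted, which is the information the revised Principle A refers to.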

4. Conclusion

We have seen that under a natural account of the syntax-semantics interface according to which the position into which elements are reconstructed depends on the point at which they are inserted into the derivation, Huang’s generalization can be derived as a consequence of the architecture of the hybrid merge-move/assume-discharge minimalist grammar system.


Nothing has been said in this paper about where slash-features stop and movement begins – in other words, we have been simply looking at the architecture of the system, and have abstracted away from questions like which dependencies should be A-dependencies, and which A-bar. Although it is natural to stipulate, in the context of DPs, that they be introduced via assumptions, and discharged in their case positions (recovering the traditional perspective on the A/A-bar distinction), this is not necessary to derive Huang’s generalization, and has in fact been argued against by Sportiche (2003), who notes that reconstruction is sometimes possible into what are traditionally considered A-positions. An interesting alternative made possible by this formal system is to view the reconstruction differences between A- and A-bar movement as the result, not of a grammatical prohibition, but rather of a parsing preference. If we assume that both slash-feature percolation and movement are always available at each derivational step, but that the parser first pursues parses involving slash-feature percolation, we derive that, in traditional cases of A-movement, derivations without reconstruction in A-positions are recovered first. Note that this does not affect our derivation of Huang’s generalization, as in cases of remnant movement, the only available derivations are those which involve reconstruction into lower chain positions.

Finally, note that Heycock (1995) has argued against the adequacy of Huang’s generalization. On the basis of examples such as the below, she offers a new generalization based on referentiality, according to which “referential” phrases, but not “non-referential” ones, may be reconstructed into positions other than their base positions.

(8) Which stories about Diana_i did she_i most object to?
(9) *How many stories about Diana_i is she_i likely to invent?

As discussed above, in the present framework, there are no constraints on what can be first merged where save for those imposed by the inability of slash-feature percolation to support remnant movement. Heycock’s insight surrounding the referentiality distinction can be implemented here to restrict the otherwise spurious ambiguity engendered by the addition of hypothetical reasoning to the minimalist grammar system.

Bibliography

Boeckx, Cedric (2000): ‘A Note on Contraction’, Linguistic Inquiry 31, 357–366.
Chomsky, Noam (1965): Aspects of the Theory of Syntax. MIT Press, Cambridge, Massachusetts.
Chomsky, Noam (1976): ‘Conditions on Rules of Grammar’, Linguistic Analysis 2, 303–351.
Chomsky, Noam (1995): The Minimalist Program. MIT Press, Cambridge, Massachusetts.
Collins, Chris (2005): ‘A Smuggling Approach to the Passive in English’, Syntax 8, 81–120.
Epstein, Samuel and T. Daniel Seely (2006): Derivations in Minimalism, Volume 111 of Cambridge Studies in Linguistics. Cambridge University Press, Cambridge.
Fiengo, Robert (1977): ‘On Trace Theory’, Linguistic Inquiry 8, 35–61.
Fox, Danny (2000): Economy and Semantic Interpretation. MIT Press, Cambridge, Massachusetts.
Gärtner, Hans-Martin and Jens Michaelis (2005): A Note on the Complexity of Constraint Interaction: Locality Conditions and Minimalist Grammars. In: P. Blache, E. Stabler, J. Busquets, and R. Moot, eds, Logical Aspects of Computational Linguistics, Volume 3492 of Lecture Notes in Computer Science. Springer, Berlin, pp. 114–130.
Gärtner, Hans-Martin and Jens Michaelis (2007): Some Remarks on Locality Conditions and Minimalist Grammars. In: U. Sauerland and H.-M. Gärtner, eds, Interfaces + Recursion = Language?, Volume 89 of Studies in Generative Grammar. Mouton de Gruyter, Berlin, pp. 161–195.
Gazdar, Gerald, Ewan Klein, Geoffrey Pullum, and Ivan Sag (1985): Generalized Phrase Structure Grammar. Harvard University Press, Cambridge, Massachusetts.
Grewendorf, Günther (2001): ‘Multiple Wh-Fronting’, Linguistic Inquiry 32, 87–122.
Harkema, Henk (2001): Parsing Minimalist Languages. PhD thesis, UCLA.
Heycock, Caroline (1995): ‘Asymmetries in Reconstruction’, Linguistic Inquiry 26, 547–570.
Huang, Cheng-Teh James (1993): ‘Reconstruction and the Structure of VP: Some Theoretical Consequences’, Linguistic Inquiry 24, 103–138.
Kayne, Richard (1994): The Antisymmetry of Syntax. MIT Press, Cambridge, Massachusetts.
Kobele, Gregory M. (2006): Generating Copies: An Investigation into Structural Identity in Language and Grammar. PhD thesis, University of California, Los Angeles.
Kobele, Gregory M. (2007): A Formal Foundation for A and A-bar Movement in the Minimalist Program. In: M. Kracht, G. Penn, and E. P. Stabler, eds, Mathematics of Language 10. UCLA.
Kobele, Gregory M. (2008): Across-the-board Extraction in Minimalist Grammars. In: Proceedings of the Ninth International Workshop on Tree Adjoining Grammar and Related Formalisms (TAG+9), pp. 113–128.
Koopman, Hilda and Dominique Sportiche (1991): ‘The Position of Subjects’, Lingua 85, 211–258.
Lasnik, Howard (1999): Chains of Arguments. In: S. D. Epstein and N. Hornstein, eds, Working Minimalism, Number 32 in Current Studies in Linguistics. MIT Press, Cambridge, Massachusetts, pp. 189–215.
Mahajan, Anoop (1990): The A/A-bar Distinction and Movement Theory. PhD thesis, Massachusetts Institute of Technology.
Manzini, Rita and Anna Roussou (2000): ‘A Minimalist Theory of A-Movement and Control’, Lingua 110, 409–447.
Matushansky, Ora (2006): ‘Head Movement in Linguistic Theory’, Linguistic Inquiry 37, 69–109.
Michaelis, Jens (2001): On Formal Properties of Minimalist Grammars. PhD thesis, Universität Potsdam.
Müller, Gereon (2010): ‘On Deriving CED Effects from the PIC’, Linguistic Inquiry 41, 35–82.
Müller, Gereon and Wolfgang Sternefeld (1993): ‘Improper Movement and Unambiguous Binding’, Linguistic Inquiry 24, 461–507.
Nunes, Jairo (2001): ‘Sideward Movement’, Linguistic Inquiry 32, 303–344.
Pullum, Geoffrey K. (1979): ‘The Nonexistence of the Trace-Binding Algorithm’, Linguistic Inquiry 10, 356–362.
Rudin, Catherine (1988): ‘On Multiple Questions and Multiple WH Fronting’, Natural Language and Linguistic Theory 6, 445–501.
Sportiche, Dominique (2003): Reconstruction, Binding and Scope. Available at: http://ling.auf.net/lingBuzz/000017.
Stabler, Edward (1997): Derivational Minimalism. In: C. Retoré, ed., Logical Aspects of Computational Linguistics, Volume 1328 of Lecture Notes in Computer Science. Springer-Verlag, Berlin, pp. 68–95.
Stabler, Edward (2001): Recognizing Head Movement. In: P. de Groote, G. F. Morrill, and C. Retoré, eds, Logical Aspects of Computational Linguistics, Volume 2099 of Lecture Notes in Artificial Intelligence. Springer Verlag, Berlin, pp. 254–260.
Tada, Hiroaki (1993): A/A-bar Partition in Derivation. PhD thesis, MIT, Cambridge, Mass.

Computation Institute and Department of Linguistics University of Chicago

Dalina Kallulli

Local Modelling of Allegedly Local but Really Non-Local Phenomena: Lack of Superiority Effects Revisited*

Abstract

Starting from the observation that, while English generally exhibits superiority effects, “D-linked” wh-phrases can violate superiority, this paper proposes an account of lack of superiority effects that relies on and provides further support for the view that locality constraints are often hard to detect because of spell-out forms that obscure the presence of syntactic structure. I argue that the ‘lack’ of superiority effects is due to the existence of a relative clause within a null copular construction, as well as a resumptive null object pronoun inside this concealed relative clause. Crucially then, constructions exhibiting lack of superiority effects involve a hidden bi-clausal structure, which creates the illusion that superiority is violated when in fact it is not. The empirical adequacy of this proposal is supported by a range of diverse syntactic phenomena to which it is extended, such as lack of weak crossover in appositives, lack of Principle C effects in relative clauses, and so-called ATB movement phenomena. The proposal also accounts for other well-known asymmetries across languages, such as those involving interpretive differences between wh-questions with resumptive pronouns versus traces, the distribution of various types of distributive versus functional readings of relative clauses in equational versus predicational contexts, and others.

* Versions of this paper were presented at the DGfS 2008 workshop Local Modelling of Non-Local Dependencies in Syntax, at the workshop Theory of Grammar (36th Österreichische Linguistiktagung), and at the 35th Incontro di Grammatica Generativa. I thank the audiences at these events for their feedback and an anonymous reviewer for detailed comments.

Local Modelling of Non-Local Dependencies in Syntax, 501-524. Artemis Alexiadou, Tibor Kiss & Gereon Müller (eds.). Linguistische Arbeiten 547, de Gruyter 2012.

1. Introduction

A syntactic constraint that entails locality is Superiority, requiring that a dependency between a filler and its gap may not be interrupted by a wh-phrase syntactically superior to the gap (Chomsky (1973)). But while English generally exhibits superiority effects, as in (1), so-called ‘D-linked’ wh-phrases (Pesetsky (1987)) can violate superiority, as in (2-a,b), both of which are acceptable to many (though not all) English speakers, according to Frazier and Clifton (2002).

(1) a. Mary asked [who_i [e_i read what ]]?
    b. *Mary asked [what_i [who read e_i ]]?
    (Pesetsky (1987, 104, ex. 21))

(2) a. Mary asked which man_i [e_i read which book ]?
    b. Mary asked which book_i [which man read e_i ]?
    (Frazier and Clifton (2002, ex. 3))

Starting from this basic observation, in this paper I propose an account of lack of superiority effects, which relies on and provides further support for the view that locality constraints are often hard to detect because of spell-out forms that obscure the presence of agreement chains (Kratzer (2009)). The central claim that I put forward is that such ‘lack’ of superiority effects is due to the existence of tacit (in the sense, phonetically null) structure, specifically a relative clause within a null copular construction, as well as a (resumptive) null object pronoun inside the relative clause. Thus, the goal of the paper is to show that there is more to the underlying structure of the construction under scrutiny than meets the eye. Crucially, I contend that constructions exhibiting lack of superiority effects involve a bi-clausal structure, which creates the illusion that superiority is violated, when in fact it is not. The structure of this paper is as follows. In section 2, I flesh out the proposal, which, as just mentioned, rests on two basic ingredients: the existence of a null object pronoun on the one hand (specifically pro), and of a phonetically null copular construction (i.e., a silent copula) on the other. Since English is generally not thought to have null object pronouns or null copulas, in this section I also provide evidence for their existence. Section 3 then highlights the empirical adequacy of my proposal. I show that it can be extended to account for a variety of other facts of English syntax that have been recalcitrant to analysis and/or have thus far remained mysterious. 
In addition, in this section I also show that my proposal accounts for other well-known asymmetries across languages, such as those involving interpretive differences between wh-questions with resumptive pronouns versus traces, the distribution of different types of (roughly) distributive versus functional readings of relative clauses in equational versus predicational contexts, and others. In section 4, I compare my proposal with a (in part) similarly-spirited one, namely the one in Safir (1999). Finally, drawing on Guilliot and Malkawi (2006), in section 5 I show that while potentially able to account for lack of Principle C reconstruction effects, my proposal does not entail absence of Principle A effects.

2. The nuts and bolts of the proposal: massively tacit

I submit that lack of superiority effects is due to the existence of a tacit bi-clausal structure. Specifically, I contend that a construction like (2-b) has a structure like the one in (3), the highlighted part of which is a silent copular construction containing a chunk that has been elided under identity with a previously mentioned linguistic expression (namely book). Crucially then, the (first) wh-phrase (i.e., which book) is not raised from inside the relative clause but is externally merged in the upper CP.1

(3) Mary asked [CP which book_k is [DP such / the one (book_k) ]_j [CP that which man read it_j / pro_j ]]

Thus, the dependency between the clause-initial wh-phrase in the first embedded CP and its purported thematic position (i.e., the complement of the verb read) is under this proposal not established by wh-movement, but by variable binding.2 That is, the thematic position is occupied by a phonetically null pronoun, either pro or PF-elided, depending on the exact nature of the concealed relative clause (see footnote 1).3 This pronoun is in turn bound either by (the restrictor of) the wh-phrase (in the upper CP), or alternatively by a (PF-deleted) copy of the restrictor of this wh-phrase.4 Hence, the null (object) pronoun inside the concealed relative is a bound variable (i.e., resumptive) pronoun. In sum, there is no superiority violation at all.5 Thus, my proposal amounts to the following two claims:

1. Lack of superiority effects is restricted to concealed relatives, and
2. Resumption is restricted to (sometimes concealed) relatives.

Of these two claims, only the second one is easily testable empirically, as I discuss in detail in section 3.4. However, if my overall proposal is empirically adequate, as I try to show in section 3, then this should be sufficient corroboration for the first claim, too. Turning to the distinction between trace and (resumptive) pro, while one may imagine it to be material for languages where it can be shown that (overt) resumptive pronouns have different properties from traces (such as for instance Hebrew – cf. Doron (1982), Sells (1984), Shlonsky (1987), Sharvit (1999)), it is legitimate to ask whether this distinction is independently motivated for English. I believe it is. First, the existence of resumptive pro in English has already been independently argued for in Cinque (1990) in connection with parasitic gap constructions.6 Crucially, Cinque observed that parasitic gaps are restricted to the category DP, as shown in (4).

(4) a. This is a neighbourhood which you should work in before residing *(in).
    b. This is a neighbourhood which you should work in before residing in (it).

Secondly, notice that though English does not generally allow for gaps (in the sense: pro) in the object position (or otherwise) outside the realm of parasitic gap constructions, there are contexts that require a gap, such as (5-a,b), which contrast with their close paraphrase, namely the so-called “‘unlyrical’ such that” relative (Quine (1960, 110)), which needs an overt resumptive pronoun, as shown in (6).7

(5) a. Which movie is of the kind_i that you like (*it_i)?
    b. Which movie is the one_i that you like (*it_i)?

(6) Which book is such that you bought *(it)?

This amounts to evidence that the gap in constructions such as (4-a,b) is in fact a resumptive pro and not a trace.8

1 Note the alternation between the elements such and the one in the structure in (3). For the purposes of this paper, it is not important to distinguish between these two alternatives. What is important here is the existence of a concealed relative clause in the structure. Note also that depending on whether the concealed relative is a such that or its the one that alternative, the bound variable pronoun inside it will be either PF-deleted, or simply null (i.e., pro), but at any rate non-overt.

2 This is reminiscent of a proposal in Adger and Ramchand (2005) for Scottish Gaelic.

3 Note in this context that as has often been pointed out “[b]inding is a [...] relation between NPs and does not require strict semantic identity between the two coreferential NPs involved [...]; it can also be a very loose relation, a vague ‘aboutness’ relation” (Demirdache (1991, 177)), as shown in the examples under (i):

(i) a. John, I hate the bastard.
    b. John, I really hate that man / the man.
    c. The shirt that John is wearing, I really hate that kind of shirt.
    d. John, I really can’t stand that type of guy.
    (Demirdache (1991, 176))

This is a relevant point, because it shows again that the hidden relative clause in the structure in (3) may be either a such that or a the one that relative (see footnote 1).

4 Of course, this issue depends on the exact structure inside the first (embedded) CP, specifically the precise (external) merging position of the wh-phrase inside this CP, which is however not the concern of the present paper. Crucially, however, note that a sentence such as What is it (that) you want? is completely fine in English, the idea being that the post-copular (null) DP in the structure in (3) has the same status as the overt pronoun it in such sentences.

5 A reviewer points out that with regard to the analysis of (2-b) in (3), it has to be assumed that the relative clause in (3) can be interpreted as a relative clause and a question at the same time. But this, the reviewer maintains, seems to be impossible because relative clauses are not propositions but properties. S/he suggests that empirically, one would have to show that question relative clauses such as that which man read it exist independently of the analysis presented here. This is an objection that I am not able to deal with at present. The reviewer also objects to my analysis of (2-b) by arguing that indirect questions do not show up as adjuncts, and that relative clauses are invariably adjuncts. This observation is not quite correct however; for instance, Aoun and Li (2003) assume a complementation structure for English wh-relatives.

6 See also Ross (1967), Perlmutter (1972), Obenauer (1984/1985), who argue that all extractions involve empty resumptive pronouns.

7 In this context, see also van Riemsdijk (2008) and Kallulli (2008) for the idea that such in a such that relative in fact means the kind (of x) that. In other words, the antecedent of the bound variable pronoun is deleted under identity with a previously mentioned linguistic expression also in these constructions.

Finally, positing a resumptive pro in English-like languages as well is theoretically appealing in view of the ideas in Hornstein (1999; 2001) and Boeckx and Hornstein (2003; 2004), who argue for the existence of arbitrary (i.e., non-resumptive) pro in non-obligatory control structures. That is, if a language has arbitrary pro, there is no a priori reason why resumptive pro should be unavailable in that language. Turning to null copulas, these have been postulated for languages as different as Arabic (Benmamoun (2000)), Austronesian (Paul (2001)), Hebrew, Russian (Pereltsvaig (2001)), Irish (Carnie (1997)), Japanese (Fukaya and Hoji (1999)), Korean (Lee (1995)), Turkish (Ince (2006)), Welsh (Rouveret (1996)) and even varieties of English. Indeed copular constructions in English have been notoriously recalcitrant to analysis, and their behaviour also in relative clauses and other contexts (such as ellipsis-related environments) has been rather difficult to accommodate in linguistic theory, as can be illustrated through the following quote from Merchant (2004), who in spite of very rigorous attempts to restrict ellipsis to constituent deletion is forced to allow some kind of non-constituent deletion in cases involving copular constructions:

“In short, I’m proposing a kind of ‘limited ellipsis’ analysis, one in which a demonstrative (such as this/that or a pronoun in a demonstrative use) or expletive subject and the copula are elided – given the appropriate discourse context, which will be almost any context where the speaker can make a deictic gesture, and where the existence predicate can be taken for granted.” (Merchant (2004, 725))

Similar arguments have also been made for French. Thus, according to Dufter (2008), a construction which is traditionally included in the class of c’est-clefts, but is strictly speaking not covered by run-of-the-mill definitions of clefts (such as the one formulated in Lambrecht (2001)), is clefting of copula complements:9

“As can be seen in [(7-a)], in this structure, the post-QU subpart l’abus du tems obligatorily surfaces without a verb form. Nonetheless, it would not seem unreasonable to assume an elliptical copula in CP, and thus underlyingly a biclausal syntactic format. In any event, [(7-a)] has to be considered a marked syntactic variant of [(7-b)].” (Dufter (2008, 6), emphasis mine)

(7) a. C’est un grand mal que l’abus du tems (*est)
       it’s a big evil that the.abuse of.the time (*is)
    b. L’abus du tems est un grand mal.
       the.abuse of.the time is a big evil
       ‘Wasting time is a big evil.’
    (Dufter (2008))

Interestingly, Dufter (2008) also reports an overall increase of clefting in French, a phenomenon that has been reported for Spanish as well (Helfrich (2003)), a language in which major constituent orderings have remained much more flexible than in French to date. Let me note again, however, that the null copular construction that I have postulated here is not meant to be understood as a result of PF-deletion, but rather as due to the existence of a silent (i.e., phonetically null) copula (possibly differing from the overt one also with respect to other properties), similar to the null copulas and/or other verbs that have been argued to exist in various languages (see in addition to the references above especially van Riemsdijk (2002), and Marušič and Žaucer (2006) on independent, phonologically null lexical verbs in Dutch/German and Slovenian, respectively, which go beyond the more common accounts with a null have, as in Larson et al. (1997)).10 Of course, delineating the properties of this null copula is an important and non-trivial task in and of itself, but this paper does not contribute to this issue.11 (For arguments on a lexically and structurally non-uniform, so-called copula be in English, see Becker (2000; 2004), Schütze (2004) and references therein, however.) Having presented the core of the proposal and some preliminary evidence for the tacit structure that I have postulated, in the next section I turn to other, to my mind significant gains that the present account readily achieves.

8 In this context, note that such that relatives are also fine with overt ‘copies’ involving an overt pronominal only:

(i) Which book is such that you read that book?
(ii) *Which book is such that you read the book?

9 According to Lambrecht (2001, 467) a cleft construction “is a complex sentence structure consisting of a matrix clause headed by a copula and a relative or relative-like clause whose relativized argument is coindexed with the predicative argument of the copula. Taken together, the matrix and the relative express a logically simple proposition, which can also be expressed in the form of a single clause without a change in truth conditions.”

10 See especially Kayne (2005) and van Riemsdijk (2002; 2003; 2006), who crucially extend the inventory of null elements also to those not having an antecedent.

11 I would, however, like to put forward the hypothesis that the phenomenon of do-support in English might have started out precisely in order to recover this tacit structure, and was subsequently generalized to the extent attested in Modern English. Likewise, other clues such as prosody might serve to recover (bits of) the null copular construction that I have postulated. At this point, I am not aware of any systematic corpus studies on the emergence of do-support in English which would exclude this hypothesis.

3. The scope of the proposal

The goal of this section is to show that in spite of the massively null structure that I have postulated to explain (alleged) lack of superiority effects in English, my proposal is not only theoretically appealing but also empirically adequate, as it is able to account for other facts of English syntax that have thus far remained analytically obscure. Specifically, my proposal can straightforwardly capture lack of weak crossover effects in appositives. In addition, I would like to speculate that it can also be extended to account for strong crossover obviation (i.e., lack of Principle C reconstruction) effects. Furthermore, I provide evidence for my proposal by looking at languages such as Hebrew, where traces and (overt) resumptive pronouns have been known to give rise to interpretive differences.

3.1. Weak crossover obviation in appositives

A well-known observation (originally due to Safir (1986)) is that appositives do not exhibit weak crossover effects, as in (9), thus contrasting with restrictive relatives (8).

(8) ?*A man_i who_i his_i wife loves t_i arrived early.
(9) John_i, who_i his_i wife loves t_i, arrived early.

My core proposal can also be extended to account for the obviation of weak crossover effects in appositives, whose structure will accordingly be as in (10).

(10) John_i, who_i is [DP such / the one ]_i that his_i wife loves him_i/pro_i, arrived early.

3.2. ATB phenomena

So-called Across-The-Board (ATB) movement phenomena as in (11) have long puzzled syntacticians as the single exception to the Coordinate Structure Constraint (CSC).

(11) Who did John like and Mary hate?

My proposal can be extended to account for ATB phenomena in the way given in (12).12

(12) Who_k is [DP the one / the person ]_j [CP that [IP John liked pro_j ] and [IP Mary hated pro_j ]]

12 Of course more needs to be said about the extension of this proposal to ATB phenomena, including do-insertion. I leave this issue open to future work.

3.3. Lack of Principle C reconstruction effects

As noted by Munn (1994), sentences like the one in (13) constitute a problem for the promotion or head-raising analysis of relative clauses given in (14-a), since under this analysis the configuration in (14-b) should be ungrammatical due to illicit binding of a name (i.e., a Principle C violation).

(13) The picture of John_i which he_i saw in the paper is very flattering.

(14) a. [DP ... name_i ... ]_j [CP pronoun_i ... t_j ]
     b. [DP ... name_i ... ]_j [CP pronoun_i ... [DP ... name_i ... ]_j ] (LF reconstruction)

Furthermore, sentences like (13) contrast in this respect with analogous wh-questions, as given in (15) through (18) (examples from Sauerland (1998) and Safir (1999)).

(15) a. The picture of John_i which he_i saw in the paper is very flattering.
     b. *Which picture of John_i did he_i see in the paper?

(16) a. The pictures of Marsden_i which he_i displays prominently are generally the attractive ones.
     b. *Which pictures of Marsden_i does he_i display prominently?

(17) a. I have a report on Bob’s_i division he_i won’t like.
     b. *Which report on Bob’s_i division won’t he_i like?

(18) a. In pictures of Al_i which he_i lent us, he_i is shaking hands with the President.
     b. *Which pictures of Al_i did he_i lend us?

Yet, the (b) examples in (15) through (18) are fine in certain contexts, such as contrastive ones (evidenced through the use of the emphatic reflexive expression), as shown in (19).

(19) a. Which picture of John_i did he_i himself see in the paper?
     b. Which pictures of Marsden_i does he_i himself display prominently?
     c. Which report on Bob’s_i division won’t he_i himself like?
     d. Which pictures of Al_i did he_i himself lend us?

My proposal can be extended to the (a) sentences in (15) through (18), as well as to those in (19). That is, I would like to propose that a sentence like the one in (15-a) is derived from the structure in (20), in a manner analogous to what was said for the structure of (2-b) above. Thus, in a sentence like (15-a) the wh-phrase neither ‘reconstructs’ in its putative external merging site (i.e., as the object of the verb saw), nor is deleted at PF.13

(20) [CP [DP The picture_k of John_i [CP which is [DP such / the one (picture_k) ]_j [CP that he_i saw it_j/pro_j in the paper ]]] is very flattering ]

Similarly, I suggest that a sentence like the one in (19-a) is derived from the structure in (21).14

(21) [CP [DP Which picture_k of John_i ] is [DP such / the one (picture_k) ]_j [CP that he_i himself did see it_j/pro_j in the paper ]]

13 See Citko (2001) for the view that the wh-phrase in sentences like (15-a) through (18-a) does not reconstruct but is instead deleted at PF.

14 As a reviewer justly points out, a non-trivial question concerning the application of my proposal to strong crossover obviation effects, as well as ATB phenomena, involves the phenomenon of do-insertion. I leave this issue and the complex of problems that it relates to (such as, the postulated movement of do to C, the nature of the relation between do and the main verb, etc.) open to future research.

3.4. Resumption in interrogatives

A well-known observation (for an overview, see Boeckx (2003)) is that across languages resumption in wh-constructions is restricted to D-linked wh-phrases, as illustrated in (22-a) versus (22-b) for Hebrew and (23-a) versus (23-b) for Albanian, respectively.15

(22) a. Eyze student nifgaSta (ito)
        which student you.met with.him
        ‘Which student did you meet?’
     b. *Mi nifgaSta ito
        who you.met with.him
        ‘Who did you meet with?’
     (Hebrew, Sharvit (1999, 591))

(23) a. Çfarë (*e) solli Ana?
        what 3S.CL.ACC brought Ana.NOM
        ‘What did Ana bring?’
     b. Cil-in libër (e) solli Ana?
        which-the.ACC book 3S.CL.ACC brought Ana.NOM
        ‘Which book did Ana bring?’
     (Albanian)

15 Note that one form of resumption consists in the use of pronominal clitics. Hence, the term ‘resumption’ is throughout this paper used broadly, to include what is generally known as the phenomenon of clitic doubling as well. Consequently, clitic doubling emerges as a form of resumption.

I extend the analysis sketched in section 2 to wh-constructions with resumptives (i.e., to D-linked questions with resumptives). Specifically, I claim that resumption is restricted to (sometimes concealed) relatives. Thus, I contend that when the clitic in (23-b) is present, this sentence is derived from a bi-clausal structure involving a concealed relative clause and a null resumptive pronoun (i.e., pro) within a phonetically null copular construction (as was explicated for the construction types in section 2), roughly as in (24):16

(24) [CP cili libër_i është i tillë/ai [CP që e_i solli Ana pro_i -in_acc ]]
     which book is such/it/the.one that 3S.CL.ACC brought Ana pro

That is, when the clitic in it is present, the sentence in (23-b) has the same structure as the overtly bi-clausal one in (25).17

(25) Cil-i libër është i tillë/ai që e solli Ana?
     which-the.NOM book is such/it that 3S.CL.ACC brought Ana
     ‘Which book is such that Ana brought *(it)?’

As witnessed by the fact that the wh-phrase in (25) bears nominative case and not accusative case like the clitic, it cannot be the wh-phrase in the specifier of the (matrix) CP that the clitic in the relative clause doubles here, but a phonetically null embedded object, specifically pro, which is anaphorically linked with (the restrictor of) this wh-phrase, as rendered in (26).18

(26) [CP cili libër_i është i tillë libër_i [CP që e solli Ana pro_i ]]
     which book_i is such book_i that 3S.CL.ACC brought Ana pro_i

16 Similar proposals (involving a bi-clausal structure) have been made by McCloskey (1990) and Demirdache (1991, 42ff.) for questions with resumptive pronouns in Irish and Arabic, respectively.

17 Note also that the asymmetry overt versus null resumptive that we saw for English between such that and the one that relatives is not replicated in Albanian, where the resumptive is uniformly pro in either case.

18 That the clitic in sentences like (26) ‘doubles’ a pro is argued for in detail in Sportiche (1996).

Local Modelling of Allegedly Local but Really Non-Local Phenomena

511

Thus, the only difference between (24) and (25) is that the copular structure containing the (elided) relative head while spelled out in the latter is not spelled out in the former.19,20 Concerning the English counterpart of (25), it must then be the case that, since unlike in Albanian pro has a restricted distribution in this language, the pronoun it is not the counterpart of the (doubling) clitic, but rather the counterpart of pro, which is anaphoric with book in the matrix. On the other hand, I claim that sentences like (23-b) have a monoclausal structure when no clitic is present. This structural difference between the ‘clitic’ and the ‘no-clitic’ versions of a sentence like (23-b) is corroborated by the following facts. In Albanian and other clitic doubling languages, a sentence such as the one in (27-a) is ungrammatical due to a weak crossover effect, just as its counterpart is in English. However, the clitic doubled counterpart of (27-a) is grammatical, as shown in (27-b). That is, the clitic in (27-b) triggers weak crossover obviation.21 (27) a. *Cil-in djal¨ei pa n¨ena e tiji ? which-the.ACC boy saw.3 S mother AGR his ‘*Which boyi did hisi mother see?’ b. Cil-in djal¨ei e pa n¨ena e tiji ? which-the.ACC boy 3 S . CL . ACC saw.3 S mother AGR his ‘Which boyi is such that hisi mother saw himi ?’ or: ‘Which boyi is the one that hisi mother saw?’ Under my bi-clausal analysis, the grammaticality of the Albanian example in (27-b) is unsurprising since the wh-phrase here c-commands the embedded sub19

20

21

Note that my concealed relative clause analysis shares quite a bit with van Riemsdijk’s (2008) analysis of what he calls ‘aboutness’ such that relatives of the type The mathematical system such that two and two are four is Peano arithmetic. Crucially, van Riemsdijk proposes that such a sentence is derived through which is deletion. That is, according to him, the sentence above is derived from something like: The mathematical system which is such that two and two in it are four is Peano arithmetic. In this context, note that the part-whole relationship that van Riemsdijk highlights as the essential notion underlying ‘aboutness’ does indeed carry over to the Albanian (and also English) case, as is obvious from the structures in (24) and (25), where cili lib¨er ‘which book’ neccessarily stands in a part-whole relationship with i till¨e lib¨er ‘such a book’. One residual problem under this view is how to account for the fact that cilin lib¨er ‘which book’ in (23-b) bears accusative case also when the clitic is present. A tentative solution would be along the following lines: pro needs to be case marked (Rizzi (1986, 519–520)), but since the bound morpheme -in (the. ACC) cannot attach to it (as pro does not have phonetic content), it will attach to pro’s recovering element cili lib¨er ‘which book’ in the matrix. A ramification of this view then is that, contrary to traditional wisdom (but see Bhatt and Takahashi (2008)), case is in fact a diagnostic for structure. One piece of evidence for this conception of case is that superiority effects are generally absent in case marking languages. This is in fact part of a more general pattern; also other forms of resumption in languages without clitic doubling lead to an alleviation of weak crossover effects (cf. e.g., Shlonsky (1992); Demirdache (1991)).

512

Dalina Kallulli

ject nëna e tij ‘his mother’ from an A-position, therefore binding the pronoun in it. Naturally, the account of the structural asymmetry between the ‘clitic’ and the ‘no-clitic’ versions of (23-b) that I have posited leads one to expect asymmetries with respect to reconstruction. These do indeed exist. For instance, while the (mono-clausal) sentence in (28-a) shows Principle C effects, the minimally different one in (28-b) containing a clitic does not.

(28) a. *Cil-ën fotografi të Anës_i pa (ajo)_i në gazetë?
        which-the.ACC picture of Ana saw.3S she in newspaper
        ‘*Which picture of Ana_i did she_i see in the newspaper?’
     b. Cil-ën fotografi të Anës_i e pa (ajo)_i në gazetë?
        which-the.ACC picture of Ana 3S.CL.ACC saw.3S (she) in newspaper
        ‘Which picture of Ana_i is such that she_i saw it in the newspaper?’

Under the bi-clausal analysis that I have proposed, the lack of a Principle C effect in (28-b) is straightforwardly accounted for, since under this analysis, the clitic does not double the wh-phrase in the matrix clause but an (embedded) object pro.22 Other facts that speak for the correctness of my bi-clausal analysis can be adduced, such as interpretive differences between the clitic and the no-clitic version of sentences such as (23-b). To see this, consider the differences between (23-b) in its no-clitic and clitic varieties, as given in (29-a) vs. (29-b), respectively.

(29) a. Cil-in libër solli Ana?
        which-the.ACC book brought Ana.NOM
        ‘Which book did Ana bring?’
     b. Cil-in libër e solli Ana?
        which-the.ACC book 3S.CL.ACC brought Ana.NOM
        ‘Which is the book that Ana brought?’

As the English translations of the sentences in (29) suggest, there are very clear interpretive differences between the sentence in (29-a) and that in (29-b). Both (29-a) and (29-b) presuppose that Ana brought a certain book. Indeed under the analysis of which-phrases as definite expressions (Katz and Postal (1964), Kuroda (1969)), it is predicted that these, like definite expressions, are presuppositional. The wh-words in (29-a) and (29-b) could then be viewed as the source of the presupposition that these sentences carry, namely that Ana brought a certain

22 In section 5 I explain why, unlike for Principle C, Principle A effects are not obviated in the presence of a resumptive clitic.

Local Modelling of Allegedly Local but Really Non-Local Phenomena


book. What is puzzling, however, is the fact that while this presupposition can be cancelled for (29-a), it cannot for (29-b). This is shown in (30) and (31), respectively (as is also replicated in their English translations through the (in)felicity of if any).23

(30) Cil-in libër solli Ana (në qoftë se solli libër)?
     which-the.ACC book brought Ana.NOM in case that brought book
     ‘Which book did Ana bring (if any)?’
(31) Cil-in libër e solli Ana (#në qoftë se solli libër)?
     which-the.ACC book 3S.CL.ACC brought Ana.NOM in case that brought book
     ‘Which book is such that Ana brought it (#if any)?’

Strikingly, while the wh-phrase in (29-a) can appear in what seems to be its base position, namely the object of the verb solli ‘brought’, still retaining its question interpretation, the wh-phrase in the clitic construction in (29-b) cannot do so. This contrast is illustrated in (32-a) vs. (32-b).24

(32) a. Ana solli cil-in libër?
        Ana.NOM brought which-the.ACC book
        ‘Ana brought which book?’
     b. *Ana e solli cil-in libër?
        Ana.NOM 3S.CL.ACC brought which-the.ACC book

My proposal straightforwardly accounts for other interpretive differences between wh-questions with resumptive pronouns versus gaps (here, in the sense: traces), such as the fact that wh-questions with resumptive pronouns (versus gaps) only allow functional answers and not pair-list readings, as discussed in Sharvit (1999),25 who illustrates this contrast through (33) versus (34):26

23 I thank Edwin Williams (personal communication) for suggesting the if any test.

24 While I have not provided the full paradigm of direct object clitic doubling patterns in Albanian here, note that Albanian is a so-called free word-order language, and that also in non-wh-constructions direct objects may occur in the same position as wh-phrases irrespective of whether they are clitic doubled or not (for details, see Kallulli (1999; 2000)). Importantly, no ungrammaticality arises if the wh-phrase in (32-b) is replaced by a [−wh] DP. That is, wh-phrases contrast with non-wh-phrases in this respect.

25 Note that Sharvit (1999) uses the term ‘gap’ to denote a trace and not a null pronominal, as I do in section 2.

26 Sharvit shows that this holds “even if the pronoun cannot alternate with a trace for syntactic reasons (i.e., to avoid an ECP violation)” ... “[a] pair-list reading is strongly disfavoured even if the second member of each pair happens to be, for example, the mother of the first member” (Sharvit (1999, 595)), as in (i):

(i) Ezyo iSa kol gever rakad ita
    which woman every man danced with-her
    ‘Which woman did every man dance with?’


(33) Ezyo iSa kol gever hizmin
     which woman every man invited
     ‘Which woman did every man invite?’
     a. et Gila
        ACC Gila
     b. et im-o
        ACC mother-his
     c. Yosi et Gila; Rami et Rina
        Yosi ACC Gila Rami ACC Rina
(34) Ezyo iSa kol gever hizmin ota
     which woman every man invited her
     ‘Which woman did every man invite?’
     a. et Gila
     b. et im-o
     c. *Yosi et Gila; Rami et Rina

Sharvit also discusses the distribution of different types of (roughly) distributive versus functional readings of relatives in equational versus predicational contexts. The intuitions behind my analysis fit with these facts too. Specifically, Sharvit points out that the contrast between traces and resumptive pronouns observed in Doron (1982) disappears in specificational sentences. Doron’s (1982) observation is that when a trace in a relative clause is c-commanded by a quantified expression, the sentence is ambiguous between a ‘single-individual’ and a ‘multiple-individual’ reading, but if the trace position is filled by a resumptive pronoun, the multiple-individual interpretation is not available.27 This contrast is illustrated in (35) versus (36).28

(35) ha-iSa Se kol gever hizmin hodeta lo
     the-woman OP every man invited thanked to-him
     a. The woman every man invited thanked him.
     b. For every man x, the woman that x invited thanked x.
(36) ha-iSa Se kol gever hizmin ota hodeta lo
     the-woman OP every man invited her thanked to-him
     The woman every man invited thanked him (= y).

On the basis of this contrast, Doron argues that there is a fundamental difference between traces and resumptive pronouns, a position that has been challenged by

27 Note also that pair-list readings disappear across islands (Hagstrom (1998), Dayal (2002)).

28 The multiple-individual reading of (35) poses the question how it is obtained, since the quantified expression seems to bind a pronoun outside its scope (recall that relative clauses are scope islands and as such they presumably block long-distance QR).


Sharvit (1997; 1999) on the basis of the fact that it disappears in specificational sentences, as shown in her example under (37).

(37) ha-iSa Se kol gever hizmin /ota hayta iSt-o
     the-woman OP every man invited her was wife-his
     a. The woman every man invited was his (he = y) wife.
     b. For every man x, the woman x invited was x’s wife.

Sharvit (1999) claims that relative clauses in equative, or specificational, sentences correspond to natural functions, whereas in non-equative sentences they correspond to lists of arbitrary pairs. Therefore, although traces are licensed in both types of sentences, resumptive pronouns are licensed only in equative sentences. But as Sharvit herself assumes based on Chierchia (1991; 1993), the pair-list reading is also a functional reading (albeit of a different kind). That is, semantic type alone does not differentiate between ‘natural’ functions and sets of (possibly arbitrary) pairs; both are functions from individuals to individuals (i.e., type ⟨e, e⟩). Sharvit (1999, 602) suggests that: “resumptive pronouns support natural function readings but not pair-list questions because natural functions (for whatever reason) are permissible referents of pronouns, but sets of arbitrary pairs are not”, a claim that she corroborates with additional data. Crucially then, Sharvit’s analysis relies on the assumption that there is a semantic/pragmatic distinction between natural functions and pair-lists, which goes beyond semantic type denotation and which relies heavily on the notion of D-linking. Importantly, however, Sharvit (1999, 595) notes that “satisfaction of the D-linking requirement alone does not suffice to license a resumptive pronoun”. Thus, D-linked wh-phrases come in (at least) two blends, which is exactly how I have analysed seemingly simple D-linked wh-questions, namely: as underlyingly mono-clausal versus underlyingly bi-clausal ones. Thus, the implication of my claim at the beginning of this section (namely that resumption is restricted to sometimes concealed relatives) is only one way: resumption with D-linked wh-phrases entails a bi-clausal structure, but bi-clausality does not entail resumption/clitic doubling. One obvious ramification of this view then is that simple wh-phrases should always have D-linked uses;29 note the well-formedness of the sentences in (38).

(38) a. What is the thing that John likes? (What are the things John likes?)
     b. Who is the one that John likes? (Who are the ones that John likes?)

To conclude, my analysis derives Sharvit’s ‘D-linking’ assumption in a purely syntactic fashion.

29 This is often disputed.


3.5. More on tacit structure in English

One more piece of evidence for the existence of tacit structure in English can be construed on the basis of the phenomenon of so-called “wh&wh” questions (Citko (2008)), illustrated in (39).30

(39) a. What and why did you eat?
     b. When and where did you see them?
     c. What or whom did John see?
     d. What and when did Bush know on Iran?
     e. What and where did the phrase ‘bread and roses’ come from?
     f. How, why and what to count?
     g. When and what should bloggers disclose?
     h. When and whose cookery tips we copy . . .
     (from Citko (2008))

As shown in (40), a striking property of such questions is that they only allow single-pair (i.e., individual) but not pair-list readings, thus differing diametrically from ‘normal’, non-conjoined multiple question word constructions, which exhibit exactly the reverse pattern in that they only allow a pair-list but no single-pair interpretation, as shown in (41).

(40) a. What and why did John eat?
     b. John ate breakfast because he was hungry. ✓SP
     c. John ate breakfast because he was hungry, he ate lunch out of habit, he ate dinner because he had a dinner reservation. *PL

(41) a. Who ate what?
     b. John ate a doughnut. *SP
     c. John ate a doughnut, Bill ate a piece of cake, and Tom ate some pretzels. ✓PL

This contrast is thus strongly reminiscent of the facts discussed in section 3.1 for Hebrew, where there is no free alternation between traces and resumptive pronouns. It has been claimed that wh&wh questions, unlike conjoined wh-questions in English, only allow so-called at all but not it readings (Gračanin-Yuksek (2007)). Thus, a question like (42-a) can only be interpreted as in (42-b) but not as in (42-c). Interestingly however, in the presence of a parasitic gap, not only does the it reading for a wh&wh question such as (42-a) become available, but the at all reading disappears altogether, as shown in (43).

(42) a. What and why did you eat?
     b. What did you eat and why did you eat at all? (at all reading)
     c. What did you eat and why did you eat it? (it reading)

(43) a. What and why did you eat without intending to devour?
     b. The meat, because it was warm (and I was starving).

30 Other terms for this phenomenon include “conjoined wh-questions” (Gračanin-Yuksek (2007)) and “conjoined question words constructions” (Zhang (2007)).

While a detailed discussion of the structure of wh&wh questions is well beyond the scope of this paper, this fact seems to constitute evidence for the existence of a (resumptive) bound variable pro object in the structure of (43-a).31 Thus, I have provided evidence for a phonetically null resumptive pronoun with properties different from traces also in languages like English. To conclude, a variety of facts across several construction types that have thus far been recalcitrant to analysis can be accounted for in a straightforward manner by my proposal that English (like other languages) has a null copular construction containing (the head of) a (concealed) relative clause, as well as a null pronoun (pro), which like other types of pronouns can receive a bound variable interpretation.

4. Comparison with other accounts

My account of lack of Principle C effects is similar to that in Safir (1999), which builds on Fiengo and May’s (1994) independently motivated mechanism of Vehicle Change. This is a procedure that replaces a name with its ‘pronominal correlate’ (i.e., a pronoun bearing the same index), as depicted below:

(44) A picture of John_i which he_i thought Mary would like to have was recently stolen.
(45) A picture of John_i which he_i thought Mary would like to have picture of John_i was recently stolen. (LF reconstruction)
(46) A picture of John_i which he_i thought Mary would like to have picture of him_i was recently stolen. (Vehicle Change)

But as remarked in Citko (2001), there is a major problem with Safir’s Vehicle Change approach, namely that it predicts the lack of Principle C effects in many environments in which they do occur, as mentioned earlier (see section 3.3) and as repeated in (47):

(47) a. The picture of John_i that he_i likes is on display.
     b. *Which picture of John_i does he_i like?

31 Note also that several accounts (e.g., Gračanin-Yuksek (2007), Citko (2008)) posit a bi-clausal structure for wh&wh questions in English.


The crucial differences between Safir (1999) and my analysis are that: (i) I take the bound variable pronoun to be pro (or a PF-deleted one if the structure involves a concealed such that relative), which has a different, obviously more restricted, distribution in English relative to that of overt pronominals; and (ii) pro is co-indexed with a c-commanding (elided copy of a) DP in a phonetically null copular structure (as described in section 3.3). Given the restricted distribution of pro or putatively other null pronouns in English (relative to overt ones), my analysis avoids the objections raised against Safir (1999) in Citko (2001). However, the question arises as to why emphatic wh-questions pattern with relative clauses while non-emphatic wh-questions do not. That is, what is it that licenses the concealed relative clause strategy, and why is it available for (15-a) through (18-a) (as well as (19)) but not for (15-b) through (18-b) (and/or (47-b)), since both involve D-linked wh-phrases? At this point, I can only speculate that it is the bi-clausal (hidden) structure of emphatic wh-questions that is responsible for their presuppositional structure, which as mentioned earlier and as repeated under (48) is different from that of their non-emphatic counterparts, among other things.

(48) a. Which book did Ana bring (if any)?
     b. Which book is such that Ana brought it (#if any)?
     c. Which book is of the kind that Ana brought (#if any)?
     d. Which book is the one that Ana brought (#if any)?

It might be informative to also look at other cues for syntactic structure, such as intonation, the idea being that recoverability of elliptic or silent structure might also be achieved through prosody.

5. More on resumption and reconstruction

A major property of resumption noticed by Aoun et al. (2001) is that it sometimes allows for reconstruction, as the contrast between the Lebanese Arabic sentences in (49-a) and (49-b) (the latter containing a strong island) demonstrates.

(49) a. [təlmiiz-a_1 l-kəsleen]_2 ma baddna nxabbir wala mʕallme_1 ʔənno huwwe_2 zaʕbar
        student-her the-bad NEG want.1PL tell.1PL no teacher that he cheated.3SG.M
        ‘Her_1 bad student_2, we don’t want to tell any teacher_1 that he_2 cheated.’
     b. *[təlmiiz-a_1 l-kəsleen]_2 ma ħkiina maʕ wala mʕallme_1 ʔabl-ma huwwe_2 yuuṣal
        student-her the-bad NEG talked.1PL with no teacher before he arrive.3SG.M
        ‘Her_1 bad student_2, we didn’t talk to any teacher_1 before he_2 arrived.’

To account for this contrast, Aoun et al. (2001) distinguish between apparent resumption (where no strong island intervenes) and true resumption (i.e., resumption in the presence of an island), claiming that only the former but not the latter can be derived via movement, as shown in (50-a) and (50-b), where RE stands for Resumptive Element.

(50) a. Apparent resumption: [DP ... pronoun_1 ...]_2 [IP ... QP_1 ... [CP ... [DP [DP ... pronoun_1 ...]_2 RE_2 ]]]
     b. True resumption: [DP ... pronoun*_1 ...]_2 [IP ... QP_1 ... [Island ... [DP RE_2 ]]]

In this way, Aoun et al. (2001) extend the traditional take on reconstruction to resumption as follows: even with resumption, if an XP allows for reconstruction, movement of that XP has occurred.32 Thus, reconstruction of an XP should not occur within islands. However, newly noted data involving resumption in languages as different as Jordanian Arabic, French and Albanian argue for reconstruction within (strong) islands, as illustrated for French and Albanian in (51), (52) and (53), (54), respectively.

(51) La photo_1 de sa_2 classe, tu es fâché parce que chaque prof_2 l_1’a déchirée.
     ‘The picture of his class, you are furious because each teacher tore it.’
     (Guilliot and Malkawi (2006, 170))
(52) Quelle photo_1 de lui_2 es-tu fâché parce que chaque homme_2 l_1’a déchirée?
     ‘Which picture of him are you furious because each man tore it?’
     (Guilliot and Malkawi (2006, 170))
(53) Fotografinë_1 e klasës së vet_2, ti je i nxehur që secili mësues_2 e_1 grisi.
     picture.the of class.the of self you are furious that each teacher it tore

32 Traditional accounts of reconstruction posit that if an XP allows for reconstruction, movement of that XP has occurred (Lebeaux (1990), Chomsky (1995)).


(54) Cilën fotografi_1 të tij_2 je (ti) i inatosur që/sepse secili burrë_2 e_1 grisi?
     which.the picture of him are you furious that/because each man it tore

These data thus challenge traditional accounts of reconstruction (including its extended version in Aoun et al. (2001)) by presenting the following paradox: if reconstruction is only due to syntactic movement, how come it is possible in strong islands? This paradox has however been solved successfully in Guilliot and Malkawi (2006), who claim that what really matters for reconstruction is on the one hand the type of resumption, and on the other hand the type of binding condition. Specifically, Guilliot and Malkawi (2006) show that reconstruction with weak resumption (e.g., a clitic) is sensitive to the type of binding condition (there is reconstruction with bound variable anaphora but not with R-expressions) but insensitive to islandhood (it occurs even in strong islands), whereas reconstruction with strong resumption (e.g., a strong pronoun or epithet) is sensitive to islandhood (present in no or weak islands and absent in strong islands), but insensitive to the type of binding condition. The central claim in Guilliot and Malkawi (2006) is that reconstruction of an XP follows from interpretation of a copy of that XP. Capitalizing on the difference between two distinct processes as the origin of copies, namely movement and ellipsis, Guilliot and Malkawi argue that reconstruction with weak resumption follows from ellipsis (specifically via an extension of Elbourne’s (2001) NP-deletion analysis of third person pronouns to resumptive pronouns), whereas reconstruction with strong resumption is a result of movement. Elbourne (2001), who assimilates third person pronouns to definite determiners, assumes two alternative structures for them, corresponding to two different anaphoric processes, as depicted in (55).

(55) a. [DP [D the/it ] NP ]
     b. [DP the/it 1 ]

In (55-a), the pronoun takes an NP-complement as argument (which undergoes NP-deletion under identity with a linguistic antecedent), whereas in (55-b), the pronoun takes an index (variable) as argument. Under weak resumption, reconstruction with bound variable anaphora is now predicted, since presence of the elided copy (in accordance with the analysis in (55-a)) allows for the bound variable interpretation of the pronoun. Lack of reconstruction with weak resumption when Principle C is at stake is under Guilliot and Malkawi’s approach derived by assuming that weak resumptives can also be analysed with an index as argument, as in (55-b). I adopt this analysis to derive the asymmetry between presence of reconstruction with bound variable anaphora and its absence with R-expressions (i.e., lack of Principle C effects) in the presence of a clitic (hence, weak resumption) in Albanian (see footnote 22).
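The generalization attributed above to Guilliot and Malkawi (2006) can be laid out as a small decision procedure. The following Python sketch is only an illustration of the summary given in the text, not part of their proposal or of mine: the names Resumption, Condition and reconstructs are hypothetical, and the two-way distinctions are deliberately simplified (weak islands are treated like no island).

```python
from enum import Enum


class Resumption(Enum):
    """Hypothetical labels for the two types of resumptive element."""
    WEAK = "weak"      # e.g. a clitic
    STRONG = "strong"  # e.g. a strong pronoun or epithet


class Condition(Enum):
    """Hypothetical labels for the binding condition at stake."""
    BOUND_VARIABLE = "bound variable anaphora"
    PRINCIPLE_C = "R-expression (Principle C)"


def reconstructs(resumption: Resumption, condition: Condition,
                 in_strong_island: bool) -> bool:
    """Return True if reconstruction is predicted under the summary above.

    Assumption: weak islands pattern with the absence of an island, so
    only strong islands are tracked.
    """
    if resumption is Resumption.WEAK:
        # Weak resumption: the copy comes from ellipsis (structure (55-a)),
        # so islands are irrelevant; the index structure (55-b) is also
        # available, so no Principle C effect arises.
        return condition is Condition.BOUND_VARIABLE
    # Strong resumption: the copy comes from movement, so any binding
    # condition reconstructs, but strong islands block it.
    return not in_strong_island


if __name__ == "__main__":
    for r in Resumption:
        for c in Condition:
            for island in (False, True):
                print(r.value, "|", c.value, "| strong island:", island,
                      "->", reconstructs(r, c, island))
```

On this sketch, the Albanian and French clitic examples in (51) through (54) fall under the weak case with bound variable anaphora inside a strong island, for which reconstruction is predicted, while (28-b) falls under the weak case with an R-expression, for which it is not.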

6. Conclusion

My main agenda in this paper was to show that though locality constraints are often hard to detect because of spell-out forms that obscure their presence, they still exist and are obeyed, a view that has been argued for most recently in Kratzer (2009) in connection with the relationship between bound variable pronouns and their antecedents. More specifically, I have argued that agreement chains can be established or mediated through either part-whole or specificational relations, and that in particular, resumption with D-linked wh-phrases is restricted to (sometimes) concealed relatives. It is precisely this hidden structure that is responsible for apparent lack of superiority effects, among other things. Crucially, I have provided evidence for a phonetically null resumptive pronoun with properties different from traces also in languages like English.

Bibliography

Adger, David and Gillian Ramchand (2005): ‘Merge and Move: Wh-Dependencies Revisited’, Linguistic Inquiry 36, 161–193.
Aoun, Joseph and Yen-hui Audrey Li (2003): Essays on the Representational and Derivational Nature of Grammar: The Diversity of Wh-Constructions. MIT Press, Cambridge, Massachusetts.
Aoun, Joseph, Lina Choueiri and Norbert Hornstein (2001): ‘Resumption, Movement, and Derivational Economy’, Linguistic Inquiry 32, 371–403.
Becker, Misha (2000): The Development of the Copula in Child English: The Lightness of Be. PhD thesis, University of California, Los Angeles.
Becker, Misha (2004): ‘Is Isn’t Be’, Lingua 114, 399–418.
Benmamoun, Elabbas (2000): The Feature Structure of Functional Categories: A Comparative Study of Arabic Dialects. Oxford University Press, Oxford.
Bhatt, Rajesh and Shoichi Takahashi (2008): When to Reduce and when not to: Crosslinguistic Variation in Phrasal Comparatives. Talk given at GLOW 2008, March 28, Newcastle University.
Boeckx, Cedric (2003): Islands and Chains: Resumption as Stranding. John Benjamins, Amsterdam.
Boeckx, Cedric and Norbert Hornstein (2003): ‘Reply to “Control Is Not Movement”’, Linguistic Inquiry 34, 269–280.
Boeckx, Cedric and Norbert Hornstein (2004): ‘Movement under Control’, Linguistic Inquiry 35, 431–452.
Carnie, Andrew (1997): ‘Two Types of Non-verbal Predication in Modern Irish’, Canadian Journal of Linguistics 42, 57–73.
Chierchia, Gennaro (1991): Functional WH and Weak Crossover. In: D. Bates, ed., Proceedings of WCCFL 10. CSLI Publications, Stanford University, pp. 75–90.
Chierchia, Gennaro (1993): ‘Questions with Quantifiers’, Natural Language Semantics 1, 181–234.
Chomsky, Noam (1973): Conditions on Transformations. In: S. Anderson and P. Kiparsky, eds, A Festschrift for Morris Halle. Holt, Rinehart and Winston, New York, pp. 232–286.
Chomsky, Noam (1995): The Minimalist Program. MIT Press, Cambridge, Massachusetts.
Cinque, Guglielmo (1990): Types of Ā-Dependencies. MIT Press, Cambridge, Massachusetts.


Citko, Barbara (2001): Deletion Under Identity in Relative Clauses. In: M. Kim and U. Strauss, eds, Proceedings of NELS 31. GLSA, University of Massachusetts, Amherst, pp. 131–145.
Citko, Barbara (2008): How and Why do Wh-Questions Linearize? Handout of talk given at GLOW 2008, March 29, Newcastle University.
Dayal, Veneeta (2002): ‘Single-Pair versus Multiple-Pair Answers: Wh-in-Situ and Scope’, Linguistic Inquiry 33, 512–520.
Demirdache, Hamida (1991): Resumptive Chains in Restrictive Relatives, Appositives and Dislocation Structures. PhD thesis, MIT, Cambridge, Massachusetts.
Doron, Edit (1982): On the Syntax and Semantics of Resumptive Pronouns. In: R. Bley-Vroman, ed., Texas Linguistics Forum 19. University of Texas at Austin, pp. 1–48.
Dufter, Andreas (2008): On Explaining the Rise of c’est-Clefts in French. In: U. Detges and R. Waltereit, eds, The Paradox of Grammatical Change: Perspectives from Romance. Benjamins, Amsterdam, pp. 31–56.
Elbourne, Paul (2001): ‘E-Type Anaphora as NP-Deletion’, Natural Language Semantics 9, 241–288.
Fiengo, Robert and Robert May (1994): Indices and Identity. MIT Press, Cambridge, Massachusetts.
Frazier, Lyn and Charles Clifton (2002): ‘Processing “d-Linked” Phrases’, Journal of Psycholinguistic Research 31, 633–659.
Fukaya, Teruhiko and Hajime Hoji (1999): Stripping and Sluicing in Japanese and some Implications. In: S. Bird, A. Carnie, J. D. Haugen and P. Norquest, eds, Proceedings of WCCFL 18. Cascadilla Proceedings Project, Somerville, Massachusetts, pp. 145–158.
Gračanin-Yuksek, Martina (2007): About Sharing. PhD thesis, MIT, Cambridge, Massachusetts.
Guilliot, Nicolas and Nouman Malkawi (2006): When Resumption Determines Reconstruction. In: D. Baumer, D. Montero and M. Scanlon, eds, Proceedings of WCCFL 25. Cascadilla Proceedings Project, Somerville, Massachusetts, pp. 168–176.
Hagstrom, Paul Alan (1998): Decomposing Questions. PhD thesis, MIT, Cambridge, Massachusetts.
Helfrich, Uta (2003): Hendidas y seudo-hendidas: un análisis empírico-diacrónico. In: F. S. Miret, ed., Actas del XXIII Congreso Internacional de Lingüística y Filología Románica, Salamanca 2001. Vol. 2, Niemeyer, Tübingen, pp. 439–451.
Hornstein, Norbert (1999): ‘Movement and Control’, Linguistic Inquiry 30, 69–96.
Hornstein, Norbert (2001): Move! A Minimalist Theory of Construal. Blackwell, Oxford.
Ince, Atakan (2006): Pseudo-Sluicing in Turkish. In: N. Kazanina, U. Minai, P. Monahan and H. Taylor, eds, University of Maryland Working Papers in Linguistics 14. College Park, Maryland, pp. 111–126.
Kallulli, Dalina (1999): The Comparative Syntax of Albanian: On the Contribution of Syntactic Types to Propositional Interpretation. PhD thesis, University of Durham.
Kallulli, Dalina (2000): Direct Object Clitic Doubling in Albanian and Greek. In: F. Beukema and M. den Dikken, eds, Clitic Phenomena in European Languages. John Benjamins, Amsterdam, pp. 209–248.
Kallulli, Dalina (2008): Resumption, Relativization, Null Objects and Information Structure. In: J. M. Hartmann, V. Hegedűs and H. van Riemsdijk, eds, Sounds of Silence: Empty Elements in Syntax and Phonology. Elsevier, Amsterdam, pp. 235–264.
Katz, Jerrold and Paul Postal (1964): An Integrated Theory of Linguistic Descriptions. MIT Press, Cambridge, Massachusetts.
Kayne, Richard S. (2005): Movement and Silence. Oxford University Press, Oxford.
Kratzer, Angelika (2009): ‘Making a Pronoun: Fake Indexicals as Windows into the Properties of Pronouns’, Linguistic Inquiry 40, 187–237.
Kuroda, Sige-Yuki (1969): English Relativization and Certain Related Problems. In: D. Reibel and S. Schane, eds, Modern Studies in English: Readings in Transformational Grammar. Prentice Hall, Englewood Cliffs, New Jersey, pp. 264–287.
Lambrecht, Knud (2001): ‘A Framework for the Analysis of Cleft Constructions’, Linguistics 39, 463–516.
Larson, Richard, Marcel den Dikken and Peter Ludlow (1997): Intensional Transitive Verbs and Abstract Clausal Complementation. Ms., SUNY at Stony Brook and Vrije Universiteit Amsterdam.
Lebeaux, David (1990): Relative Clauses, Licensing and the Nature of the Derivation. In: J. Carter, R.-M. Déchaine, B. Philip and T. Sherer, eds, Proceedings of NELS 20. GLSA, University of Massachusetts, Amherst, pp. 318–332.
Lee, Jeong-Shik (1995): ‘A Study on Predicate Clefting’, Studies in Generative Linguistics 5, 531–584.
Marušič, Franc and Rok Žaucer (2006): ‘On the Intensional Feel-Like Construction in Slovenian: A Case of a Phonologically Null Verb’, Natural Language & Linguistic Theory 24, 1093–1159.
McCloskey, James (1990): Resumptive Pronouns, Ā-Binding and Levels of Representation in Irish. In: R. Hendrick, ed., The Syntax of the Modern Celtic Languages. Academic Press, New York, pp. 199–248.
Merchant, Jason (2004): ‘Fragments and Ellipsis’, Linguistics and Philosophy 27, 661–673.
Munn, Alan (1994): A Minimalist Account of Reconstruction Asymmetries. In: M. Gonzàlez, ed., Proceedings of NELS 24. GLSA, University of Massachusetts, Amherst, pp. 397–410.
Obenauer, Hans-Georg (1984/1985): ‘On the Identification of Empty Categories’, The Linguistic Review 4, 153–202.
Paul, Ileana (2001): ‘Concealed Pseudo-Clefts’, Lingua 111, 707–727.
Pereltsvaig, Asya (2001): On the Nature of Intra-Clausal Relations: A Study of Copular Sentences in Russian and Italian. PhD thesis, McGill University.
Perlmutter, David (1972): Evidence for Shadow Pronouns in French Relativization. In: P. M. Peranteau, J. N. Levi and G. C. Phares, eds, The Chicago Which Hunt: Papers from the Relative Clause Festival. Chicago Linguistic Society, Chicago, pp. 73–105.
Pesetsky, David (1987): Wh-in-Situ: Movement and Unselective Binding. In: E. J. Reuland and A. G. B. ter Meulen, eds, The Representation of (In)definiteness. MIT Press, Cambridge, Massachusetts, pp. 98–129.
Quine, Willard van Orman (1960): Word and Object. MIT Press, Cambridge, Massachusetts.
Rizzi, Luigi (1986): ‘Null Objects in Italian and the Theory of pro’, Linguistic Inquiry 17, 501–557.
Ross, John Robert (1967): Constraints on Variables in Syntax. PhD thesis, MIT, Cambridge, Massachusetts.
Rouveret, Alain (1996): Bod in the Present Tense and in other Tenses. In: R. D. Borsley and I. Roberts, eds, The Syntax of the Celtic Languages: A Comparative Perspective. Cambridge University Press, Cambridge, pp. 125–170.
Safir, Ken (1986): ‘Relative Clauses in a Theory of Binding and Levels’, Linguistic Inquiry 17, 663–689.
Safir, Ken (1999): ‘Vehicle Change and Reconstruction in Ā-Chains’, Linguistic Inquiry 30, 587–620.
Sauerland, Uli (1998): The Meaning of Chains. PhD thesis, MIT, Cambridge, Massachusetts.
Schütze, Carson T. (2004): Why Nonfinite Be Is Not Omitted While Finite Be Is. In: A. Brugos, L. Micciulla and C. E. Smith, eds, Proceedings of BUCLD 28. Cascadilla Press, Somerville, Massachusetts, pp. 506–521.
Sells, Peter (1984): Syntax and Semantics of Resumptive Pronouns. PhD thesis, University of Massachusetts, Amherst.
Sharvit, Yael (1997): The Syntax and Semantics of Functional Relative Clauses. PhD thesis, Rutgers University.
Sharvit, Yael (1999): ‘Resumptive Pronouns in Relative Clauses’, Natural Language & Linguistic Theory 17, 587–612.
Shlonsky, Ur (1987): Donkey Parasites. In: J. McDonough and B. Plunkett, eds, Proceedings of NELS 17. GLSA, University of Massachusetts, Amherst, pp. 569–580.
Shlonsky, Ur (1992): ‘Resumptive Pronouns as a Last Resort’, Linguistic Inquiry 23, 443–468.
Sportiche, Dominique (1996): Clitic Constructions. In: J. Rooryck and L. Zaring, eds, Phrase Structure and the Lexicon. Kluwer, Dordrecht, pp. 213–276.
van Riemsdijk, Henk (2002): ‘The Unbearable Lightness of GOing: The Projection Parameter as a Pure Parameter Governing the Distribution of Elliptic Motion Verbs in Germanic’, Journal of Comparative Germanic Linguistics 5, 143–196.


van Riemsdijk, Henk (2003): Some Thoughts on Specified Ellipsis. In: L.-O. Delsing, C. Falk, G. Josefsson and H. Á. Sigurðsson, eds, Grammar in Focus: Festschrift for Christer Platzack. Vol. 2, Department of Scandinavian Languages, Lund, pp. 257–263.
van Riemsdijk, Henk (2006): The Morphology of Nothingness: Some Properties of Elliptic Verbs. In: S. Haraguchi, O. Fujimura and B. Palek, eds, Proceedings of LP 2002. Karolinum Press, Charles University, Prague, pp. 465–486.
van Riemsdijk, Henk (2008): Identity Avoidance: OCP Effects in Swiss Relatives. In: R. Freidin, C. P. Otero and M. L. Zubizarreta, eds, Foundational Issues in Linguistic Theory: Essays in Honor of Jean-Roger Vergnaud. MIT Press, Cambridge, Massachusetts, pp. 227–250.
Zhang, Niina Ning (2007): ‘The Syntactic Derivations of two Paired Dependency Constructions’, Lingua 117, 2134–2158.

Department of Linguistics University of Vienna


NEG-scope, 338, 342 negative quantifiers, 338–339

voice, 236–237

object control verbs, 373, 376–379, 384–391 object experiencer psych verbs, 157, 159, 177

weak crossover, 92, 507 weak islands, 419–421 wh & wh questions, 516–517 wrapping, 346, 467

path, 432–433 punctuated, 31, 433 uniform, 31, 433

Y-model, 338, 350–351