Phases: Developing the Framework 9783110264104, 9783110264098

This volume explores and develops the framework of phases (so-called Phase Theory), first introduced in Chomsky (2000).


English Pages 453 [454] Year 2012

Table of contents :
Preface
Contents
Foreword
Introduction: A framework of phases for linguistic theory
Phases beyond explanatory adequacy
Phase periodicity
Exploring phase based implications regarding clausal architecture. A case study: Why structural Case cannot precede theta
Phase cycles in service of projection-free syntax
Feature-splitting Internal Merge and its implications for the elimination of A/A’-position types
On feature inheritance, defective phases, and the movement–morphology connection
The size of phases
Consequences of phases for morpho–phonology
Phonological interpretation by phase: Sentential stress, domain encapsulation, and edge sensitivity
Phases and semantics
Phases in NPs and DPs
Phases, head movement and second-position effects
Index of subjects



Author / Uploaded
Ángel J. Gallego (editor)
Noam Chomsky (editor)


Phases

Studies in Generative Grammar 109

Editors

Henk van Riemsdijk Harry van der Hulst Jan Koster

De Gruyter Mouton

Phases: Developing the Framework

Edited by Ángel J. Gallego

Foreword by Noam Chomsky

De Gruyter Mouton

The series Studies in Generative Grammar was formerly published by Foris Publications Holland.

ISBN 978-3-11-026409-8
e-ISBN 978-3-11-026410-4
ISSN 0167-4331

Library of Congress Cataloging-in-Publication Data
A CIP catalog record for this book has been applied for at the Library of Congress.

Bibliographic information published by the Deutsche Nationalbibliothek
The Deutsche Nationalbibliothek lists this publication in the Deutsche Nationalbibliografie; detailed bibliographic data are available in the Internet at http://dnb.dnb.de.

© 2012 Walter de Gruyter GmbH & Co. KG, Berlin/Boston
Printing: Hubert & Co. GmbH & Co. KG, Göttingen
Printed on acid-free paper
Printed in Germany
www.degruyter.com

Preface

Phases have become the focus of much linguistic inquiry in the literature of the last twelve years, but the topic is actually an old one in contemporary linguistics (i.e., the cycle). Like most old topics, so-called Phase Theory touches upon different phenomena that lie at the very heart of the language faculty and its interaction with the interfaces: recursion, compositionality, cyclicity, economy, agreement, etc. This collection of essays is an attempt to explore phases, and in so doing to understand the nature of core aspects of language.

Quite a few people deserve to be thanked for their help with the development of this book. First and foremost, I would like to express my most sincere gratitude to Henk van Riemsdijk, whose support and interest in this project were crucial. Henk provided me with sound advice and encouragement every time I contacted him. I must also thank Cedric Boeckx for his help; he was the first to know about my intention to prepare a volume on phases, and his advice was also key for this project to take shape. Noam Chomsky deserves special gratitude too. He was much interested and willing to help with this project from the very beginning, and I am very happy that he could finally contribute.

Many thanks to the people who agreed to review the papers in the present volume: Peter Ackema, Roberta D’Alessandro, Artemis Alexiadou, Susana Béjar, Theresa Biberauer, Carlo Cecchetto, Jeroen van Craenenbroeck, Marcel den Dikken, Gorka Elordieta, Caterina Donati, Antonio Fábregas, Kleanthes Grohmann, Shinichiro Ishihara, Hilda Koopman, Thomas Leu, Clemens Mayr, David Medeiros, Masao Ochi, Dennis Ott, Paul Pietroski, Anna Maria Di Sciullo, Viola Schmitt, Peter Svenonius, and Luis Vicente. Thanks also to Emily Farrell, Ursula Kleinhenz, Wolfgang Konwitschny, and Julie Miess at Mouton de Gruyter for editorial help.

Finally, and most importantly, I would like to thank my wife Sandra for her support, clever advice, and constant understanding. She is the ‘totem’ that gives me a base-line reality.

Contents

Foreword
Noam Chomsky

Introduction: A framework of phases for linguistic theory
Ángel J. Gallego

Phases beyond explanatory adequacy
Cedric Boeckx

Phase periodicity
Juan Uriagereka

Exploring phase based implications regarding clausal architecture. A case study: Why structural Case cannot precede theta
Samuel D. Epstein, Hisa Kitahara, and T. Daniel Seely

Phase cycles in service of projection-free syntax
Hiroki Narita

Feature-splitting Internal Merge and its implications for the elimination of A/A’-position types
Miki Obata

On feature inheritance, defective phases, and the movement–morphology connection
Marc D. Richards

The size of phases
Julie Anne Legate

Consequences of phases for morpho–phonology
Bridget Samuels

Phonological interpretation by phase: Sentential stress, domain encapsulation and edge sensitivity
Yosuke Sato

Phases and semantics
Wolfram Hinzen

Phases in NPs and DPs
Željko Bošković

Phases, head movement and second-position effects
Ian G. Roberts

Index of subjects

Foreword
Noam Chomsky

Phase-theoretic ideas are a development of several concepts that have played a significant role in linguistic theory in the various forms it has taken since the 1950s. One such concept is locality, the conclusion that what superficially appear to be long-distance relations decompose into more local ones. In (1), for example, the relation between which books and buy is decomposed into local operations that relate which books in clause-initial position to an unpronounced entity W that receives its semantic role in the normal way, as in “buy those books”:

(1) Guess which books they think that the storekeeper hopes that the customers will buy W

The locality principle in turn falls under an overriding principle of Minimizing Computation (MC), a guideline since the origins of modern linguistics, with traditional roots. It is a very natural guideline if we view language (more explicitly, I-language) as a computational system internal to the mind/body, and subject to conditions of general biology, perhaps beyond, of which some version of MC is plausibly one. In this context, MC is among the third factor properties, interacting with external data and genetic endowment to determine the course of growth and development of language in the individual (language acquisition).

Another of these concepts is cyclicity, in essence, the intuition that the properties of larger linguistic units depend on the properties of their parts. While ubiquitous in traditional practice, the concept perhaps received its first clear formulation and application in a 1956 paper on stress contours (Chomsky, Halle & Lukoff 1956). At the time, one of the most lively topics in American structural linguistics was the study of stress and pitch levels, developed in its fullest form by George Trager and Henry L. Smith (1951 [1956]), which provided a four-stress, four-pitch notation allegedly sufficient for all English dialects. The Chomsky-Halle-Lukoff system proposed that the stress contours are determined by a few simple rules operating cyclically, in accord with syntactic structure, ideas since elaborated in many ways.

The cyclicity proposal was inconsistent with fundamental assumptions of American (and, to a considerable extent, European) structural linguistics, in particular the procedural approach to language that conceived of linguistic theory substantially in terms of operations of segmentation and classification, perhaps supplemented by some others, which could be applied to a corpus to yield a structuralist descriptive grammar. That approach imposed a ban on “mixing of levels”: establishing “higher levels” such as syntactic or even word structure before the sound system is fixed (with marginal exceptions, which raised their own problems).

The cyclicity proposal also coexisted uneasily with the notions of Phrase Structure Grammar (PSG), which were then under development, with their “top-down” conception of generation. The latter tension was resolved by the abandonment of PSG, for substantial reasons, in favor of X-bar Theory, which takes larger units to be constructed from smaller ones, ultimately heads drawn from the Lexicon. That approach, along with its various descendants, is based on a notion of compositionality that differs from the PSG conception: specifically, it requires complete endocentricity, and thus rules out the kind of analysis standard in early PSG, beginning with the first rule postulated, (2), which establishes no hierarchic relation between its two constituents:

(2) S → NP VP

X-bar Theory, in contrast, assumes that the head of one of them projects (V in early versions), with the subject the specifier of the predicate. In later versions, T projects and the subject is the specifier of TP. This, however, is a stipulation, unmotivated within X-bar Theory, which just as readily tolerates the conclusion that TP is the specifier of the head of the subject (wrong, but for no motivated reason).

The simplest conception of compositionality holds that if objects X, Y have been generated, then a new object K can be constructed from them, whether they are heads or not. Still keeping to MC, the operation that forms K (call it Merge) takes K to be {X, Y}, unordered, with X, Y unmodified by the operation. X-bar Theory yields the notions complement, specifier, second-specifier, etc. Apart from complement (= first Merge when one of X, Y is a head), none of these notions is definable under Merge, which is to be preferred, under MC, unless there is substantial counter-evidence.

Any serious approach to I-language has to at least meet the condition that it incorporates a generative procedure G that yields an infinite array of structured expressions that can be interpreted at two interfaces, the sensorimotor interface SM for externalization, and the conceptual-intentional interface C-I for thought and planning of action. In any such system, however constructed, there will be some operation that yields K given X, Y, however it may be buried within G (perhaps taken to be a Fregean ancestral, a PSG, a collection of filters, an axiom system, or some other mechanism). A Merge-based system appears to be the simplest of these variants, hence again to be modified only under the pressure of strong counter-evidence.

Assuming only Merge, it should follow that the operations leading to C-I (narrow syntax, formal semantics) rely only on hierarchy, not order. Linearization or other forms of ordering (depending on modality of externalization) should be part of the mapping to SM, in essence a reflex of properties of SM, which requires more than hierarchy. That conclusion has far-reaching consequences, discussed elsewhere, and faces interesting empirical challenges, which I will put to the side here.

As a matter of simple logic, there are two cases in which X, Y are merged: Internal Merge (IM), in which one of X, Y is a term of the other (a member of it or a member of a term); and External Merge (EM), in which neither X nor Y is a term of the other. In the latter case, Merge (X, Y) = {X, Y}. If, say, Y is a term of X, then Merge (X, Y) again is {X, Y}, but with two occurrences of Y, one outside of X and one within X (which is unchanged, by MC); there are two copies of Y. There are no operations Copy or Remerge, just simple Merge.

G will have to have some device to distinguish copies from repetitions. The basic intuition is straightforward. Suppose that some item, say the preposition in, is selected twice from the Lexicon in the course of generating (3):

(3) The man in the house was in a hurry.

The two occurrences of in are interpreted independently at the interfaces; they are repetitions, not copies. In contrast, in (1), in a Merge-based theory W = which books, and the occurrences of this syntactic object are copies, not repetitions. At the C-I level (1), the copies will be given an operator-variable interpretation along the general lines of (4), and the lower copy will be deleted by another application of MC, minimizing externalization, in general a vast reduction of computation (Chomsky 2008):

(4) for which x, x = books, the customers will buy x
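The set-based definition of Merge and the distinction between External and Internal Merge can be made concrete with a small sketch. Everything in it is an illustrative assumption rather than part of the text: heads are modeled as plain strings, Merge(X, Y) as a Python frozenset, and the helper names merge, terms, and is_internal are hypothetical.

```python
# Illustrative sketch only: heads are strings, Merge(X, Y) is the unordered set {X, Y}.

def merge(x, y):
    """Form the unordered set {x, y}; x and y themselves are left unmodified."""
    return frozenset([x, y])

def terms(obj):
    """Yield the terms of obj: its members and, recursively, members of its terms."""
    if isinstance(obj, frozenset):
        for member in obj:
            yield member
            yield from terms(member)

def is_internal(x, y):
    """Internal Merge: one of x, y is a term of the other; otherwise External Merge."""
    return y in terms(x) or x in terms(y)

wh = merge("which", "books")        # External Merge of two heads
vp = merge("buy", wh)               # External Merge: neither is a term of the other
print(is_internal("buy", wh))       # False
print(is_internal(vp, wh))          # True: wh is already a term of vp
raised = merge(vp, wh)              # Internal Merge: again just {vp, wh}
print(wh in raised and wh in terms(vp))  # True: two occurrences of one object
```

Note that this toy encoding treats any two identical strings as the same object, so it cannot tell the two occurrences of in in (3) apart from two copies of a single item. That limitation is exactly why some further device is needed to distinguish copies from repetitions.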

One goal of Phase Theory is to provide the mechanisms to distinguish copies from repetitions, as straightforwardly as possible. If Merge is binary, then generated X and Y can intersect only if one is a term of the other. If n-ary operations are added for n > 2, other options arise, including those studied in multidominance theories, which also require some new notion of copy, and of phase (if they incorporate the latter notion). It is in fact likely that binary Merge in its simplest form is insufficient, and that some extensions of Merge are licensed by UG, an interesting topic I will not try to pursue here.

In PSG, non-terminal nodes are stipulated, and the notion of projection is undefined, captured (improperly) by notational devices, as in the rule VP → V NP. In X-bar Theory, projection is determined by the requirement that composition must be head-oriented, and the stipulation of the projected head when non-heads are merged. But in the simpler Merge-based theory, the notion is undefined, which I think makes sense: unlike such properties as compositionality, non-contiguous relations (i.e., displacement), and order, projection is not (virtually) observed in phenomena, but is a theory-internal notion. In early versions of such theories (e.g., Chomsky 1995), Merge was complicated to provide a label as part of the generated object, yielding projection, but by stipulation.

From a more principled perspective, labeling should not be marked at all in generated structures but rather regarded as a property of G. If X is to be interpreted at the interfaces, then information is needed about what kind of syntactic object it is. In the simplest case, that information will be provided by a single designated element of X, optimally an atom of computation (a head), though here questions arise about the relations among heads, atoms of computation, and lexical items, which again I will put aside. G then incorporates an algorithm A that may detect a head of X, or may not, and indeed should not if X must be modified to be interpreted. That option does not arise, by stipulation, in X-bar Theory, but does in the simpler Merge-based approaches, and I think has empirical support in addition to its conceptual advantages. Note that the analogue to (2) in a Merge-based theory is legitimate, and could be correct.
Interesting questions arise about the status of the simple subject-predicate construction and others that violate endocentricity. In the simplest case, A should reduce to the Probe-Goal relation that also values uninterpreted features, as in agreement under minimal search, where the relation values the φ-features of the Probe and the structural Case of the Goal. Labeling differs in that it does not modify syntactic objects, but only provides information as to how they enter into subsequent computation.

Another notion that has been invoked repeatedly in work on generative grammar is strict cyclicity, a stronger version of cyclicity. The underlying intuition is that for certain elements X constructed in the course of derivation, further computation should not modify X: specifically, by valuation of features or IM. The concept has clear motivation in terms of MC, substantially improving computational efficiency.
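The idea that an algorithm A "may detect a head of X, or may not" can be pictured with a deliberately simplified sketch. The text does not spell A out, so the behavior below is an assumption made for illustration only: heads are strings, phrases are frozensets, and the hypothetical function label returns the unique head among the immediate members of an object, or None when no single head is found.

```python
# Hypothetical sketch: heads are strings, phrases are frozensets of heads and phrases.

def label(obj):
    """Return the head found by a shallow search, or None if no unique head exists."""
    if isinstance(obj, str):
        return obj
    heads = [member for member in obj if isinstance(member, str)]
    return heads[0] if len(heads) == 1 else None

vp = frozenset(["buy", frozenset(["which", "books"])])
print(label(vp))                        # buy

subject = frozenset(["the", "man"])
predicate = frozenset(["T", vp])
print(label(frozenset([subject, predicate])))   # None
```

The second call mirrors the subject-predicate case: when two non-heads are combined, this sketch finds no head at all, and nothing in the object itself is changed by the search.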


Consider for example (1), repeated here, with some structure added, interpreted at SM with the lower copy deleted, and at C-I along the lines of (4):

(1) Guess [C which books C [? they think that the storekeeper hopes that the customers will buy which books]]

C is the head detected by the labeling algorithm, and we leave open the status of ?. Suppose that further computation embeds (1) in (5):

(5) The man who T (1)

We do not expect the bracketed CP complement of guess in (1) to be modified in that process, though guess may be, through interaction with T (inflection). Furthermore, if CP is generated alone, not embedded as the object of guess, then the complement ?P of C will not be modified in further computation, though which books, in the periphery of C, can be; for example, by merging with a higher C (raising). In these cases it is the interior of the phrases (the complement of the head) that is immune to further change. The residue, the edge, can be modified.

Let us take a phase to be the smallest syntactic object which, like (1) and its complement CP (in isolation), has an interior that is immune to change. Immunity of the interior to change is what has been called the Phase Impenetrability Condition (PIC). PIC is guaranteed by Transfer to the interfaces of all information that would allow the interior to be modified by G. This principle must be defined with care (more care than in my own publications on the topic) to ensure that the interior, while not further modified, can nevertheless be interpreted in other positions (see Obata 2010). Thus if (5) is the complement of “find”, and is then raised to the subject position of “was found”, it will be interpreted at the interfaces in its surface position, not the base position in which it entered the computation. The optimal way to achieve this result is another question I will leave open.

One desideratum for phases is that all operations take place at the phase level, including IM and transfer of the interior to the interfaces. That suffices to distinguish copies from repetitions, since the relevant information is available locally at the phase level. Again, the appropriate mechanism must be specified. One operation cannot be restricted to the phase level: EM, which provides the syntactic objects that ultimately constitute a phase. But optimally that will be the only exception.
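The division of a phase into a modifiable edge and an interior that becomes immune to change after Transfer can be sketched as a small data structure. This is a toy model under stated assumptions, not a formalization from the text: the class Phase, its fields, and its methods are hypothetical names, and Transfer is reduced to freezing the interior.

```python
from dataclasses import dataclass

@dataclass
class Phase:
    head: str
    edge: list              # periphery: remains accessible to later operations
    interior: list          # complement of the phase head
    transferred: bool = False

    def transfer(self):
        """Hand the interior to the interfaces; from now on it cannot be changed."""
        self.interior = tuple(self.interior)
        self.transferred = True

    def modify_interior(self, item):
        """Any attempt to alter a transferred interior violates the PIC."""
        if self.transferred:
            raise PermissionError("PIC: interior of a transferred phase is immune to change")
        self.interior.append(item)

    def raise_from_edge(self):
        """Edge material can still be targeted, e.g. by merging with a higher C."""
        return self.edge.pop()

cp = Phase(head="C", edge=["which books"], interior=["they", "think", "..."])
cp.transfer()
print(cp.raise_from_edge())     # which books
```

Calling cp.modify_interior("x") after cp.transfer() raises PermissionError in this sketch, while the edge stays available. The sketch leaves out the harder requirement noted above, namely that a frozen interior must still be interpretable in a new position when the whole object is later displaced.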


The next task is to determine which syntactic objects are phases. Here several plausible criteria have been considered (see the introduction of this volume for discussion). To satisfy MC, phases should be as small as possible, but not so small that the interior may be modified at a later stage. Any subpart subject to further modification has to be raised to the edge, an operation that always requires justification, and some account of why it cannot remain in that position (if it cannot). Another condition is provided by uninterpretable features, a concept that came into prominence with the important work of the late Jean-Roger Vergnaud. These features must be eliminated by valuation before transfer, or the derivation will crash at the interface, with an uninterpretable feature. It follows that among the phases are the syntactic objects in which structural case and unvalued φ-features are valued: tensed clauses and transitive verb phrases v*P. There is good reason to suppose that the tensed clause is CP but not TP (see Chomsky 2008).

Note that a raised XP (say a wh-phrase) cannot remain at the edge of v*P but must move on, ultimately to the edge (left periphery) of CP. That is sometimes true of XP at the edge of CP (successive-cyclic movement), but not always, as in embedded interrogatives. There is a straightforward explanation for these results in terms of the labeling algorithm, but I will put it aside here.

Note that although the interior of a phase cannot be modified, it can be inspected, determining valuation of a higher probe. There are several interesting cases, the best known being the inherent Nominative object of an experiencer verb (assuming the VP to be a phase, like other transitive verb phrases), which can determine the φ-features of higher T (not without some intricacy; see Sigurðsson & Holmberg 2008; and for other possible cases, Chomsky 2008).

Are there phases other than CP and v*P? There is conflicting evidence, which again I will not pursue.
I have tried to sketch some of the motivations for phase theory, and the general framework in which it is embedded, leaving a host of unanswered questions, including many not even mentioned. The essays that follow develop various approaches to this intricate web of issues, proposing answers to many questions, opening the way to new and challenging inquiries.


References

Chomsky, Noam 1995 The Minimalist Program. Cambridge, MA: MIT Press.

Chomsky, Noam 2008 On Phases. In Foundational Issues in Linguistic Theory, R. Freidin et al. (eds.), 133-166. Cambridge, MA: MIT Press.

Chomsky, Noam, Morris Halle, and Fred Lukoff 1956 On Accent and Juncture in English. In For Roman Jakobson: Essays on the occasion of his sixtieth birthday, M. Halle et al. (eds.), 65-80. The Hague: Mouton & Co.

Obata, Miki 2011 Root, Successive-Cyclic and Feature-Splitting Internal Merge: Implications for Feature-Inheritance and Transfer. Ph.D. Dissertation, U. Michigan.

Sigurðsson, Halldór Ármann, and Anders Holmberg 2008 Icelandic dative intervention: Person and Number are separate probes. In Agreement Restrictions, R. D’Alessandro et al. (eds.), 251-280. Berlin: Mouton de Gruyter.

Trager, George L., and Henry L. Smith 1951 An outline of English structure. Norman, OK: Battenberg Press. (Corrected edition, 1956, Washington DC: American Council of Learned Societies).

Introduction: A framework of phases for linguistic theory*
Ángel J. Gallego

1. Introduction

Linguistic theory relies on some notion of compositionality, the interpretation of complex units being dependent on the interpretation of smaller ones. Within generative grammar, this leading idea took form under the so-called phonological cycle, with Chomsky et al.’s (1956) pioneering work on stress, later on extended in Chomsky & Halle (1968), and quickly adopted in the domains of morphology, semantics, and syntax, where different cyclic conditions were argued to regulate derivational dynamics (see Lasnik 2006 and Uriagereka 2011 for recent discussion).

Chomsky (2000) revamps the cycle under the rubric of phase, trying to accommodate it within a minimalist, ‘from below’, approach to the Faculty of Language (henceforth, FL). As has often been noted in the literature (Boeckx & Grohmann 2007), current phases roughly correspond to bounding nodes (Chomsky 1977) and barriers (Chomsky 1986) of earlier frameworks, hence posing the question of how this (apparently terminological) twist can help us approach language from more precise angles. The answers that Chomsky has offered in his writings have been interpreted in different ways, giving rise to varying (sometimes even conflicting) perspectives on what phases are and do (see Gallego 2010 for a summary). The goal of this volume is to provide the reader with a sample of those approaches, the results they have achieved, and the matters that are still under debate.

Discussion is organized as follows: Section 2 reviews some of the conceptual and empirical arguments that have been offered to motivate phases; these fall into two broad categories, interface conditions and computational efficiency, and, as will be shown, they fail to provide a stable characterization of phases. Section 3 delves into Chomsky’s primary focus on φ-features in order to define phases, a restricted view that is consistent with v*P and CP being cyclic domains, but at the same time casts doubt on extensions to other categories (DP, PP, etc.). As discussion unfolds, I will make reference to the papers of this volume (when it is relevant), both highlighting their specific observations and putting them into a broader context. Section 4 summarizes the main conclusions.

2. Conceptual and empirical motivations for cycles

A general trait of a minimalist approach to language is the endeavour to inspect and reduce any aspects (rules, filters, devices, formatives, etc.) that we regard as stipulative or unprincipled in the hope that such reduction will help us better understand the nature of this species-specific capacity. To be sure, determining what a stipulation is is by no means an easy task, for it largely depends on our background assumptions and the perspective we adopt to study linguistic phenomena. Chomsky (1993 et seq.) invites us to follow this ‘minimalist’ route (already present in generative grammar since its inception) by capitalizing on two aspects (better known as the “third factor”; Chomsky 2005) in order to reduce the complexity of the machinery that accumulated towards the end of the GB era:

(1) a. Interface conditions
    b. General considerations of computational efficiency (not exclusive of FL)

Consider (1b) first. In the minimalist writings, Chomsky seriously alludes to computational complexity in (1995:227-228) when discussing the possibility that convergent derivations are compared by taking an initial choice of lexical items: a reference set. The main empirical aim at the time was to differentiate lexical items within a given computational domain (for instance, two occurrences of “John” in the sentence John was killed, as opposed to the two tokens of “John” in John killed John). In Chomsky’s words:

An elementary empirical condition on the theory is that expressions “usable” by the performance systems be assigned interface representations in a manner that does not induce too much computational complexity. We want to formulate economy conditions that avoid “exponential blowup” in construction and evaluation of derivations. A local interpretation of reference sets is a step in this direction. [from Chomsky 1995:228]

The reference set of Chomsky (1995) is an initial storage of lexical items that feeds a derivation. When the notion of phase is introduced in 2000, Chomsky dubs this collection Lexical Array (LA), and capitalizes on it in order to reduce the access to the lexicon:1

Is it […] possible to reduce access to Lex? The obvious proposal is that derivations make a one-time selection of a lexical array LA from Lex, then map LA to expressions, dispensing with further access to Lex. That simplifies computation far more than the preceding steps. If the derivation accesses the lexicon at every point, it must carry along this huge beast, rather like cars that constantly have to replenish their fuel supply. [from Chomsky 2000:100-106]

Basically, then, in order to derive an expression like (2b), we form the LA in (2a), which is used as a storage that Merge can feed from, dispensing with direct access to the lexicon.

(2) a. Lexical Array: {boy1, C1, read1, T1, the2, v*1, book1}
    b. Output: The boy read the book
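The role of the LA as a one-time store of tokens can be illustrated with a multiset. The sketch below is an assumption-laden illustration, not a claim about the actual derivation: the LA in (2a) is encoded as a Counter whose values are the numerical indices, the function derive is a hypothetical name, and the order in which items are drawn is chosen only for the example.

```python
from collections import Counter

def derive(lexical_array, merge_order):
    """Consume tokens from a one-time lexical array, with no further access to Lex."""
    remaining = Counter(lexical_array)
    for item in merge_order:
        if remaining[item] == 0:
            raise ValueError(f"{item!r} is not available in the lexical array")
        remaining[item] -= 1
    return +remaining   # unary plus drops entries whose count has reached zero

# The LA in (2a): each index records how many tokens of the item were selected.
la = {"boy": 1, "C": 1, "read": 1, "T": 1, "the": 2, "v*": 1, "book": 1}

# Illustrative order of selection only.
left = derive(la, ["the", "book", "read", "v*", "the", "boy", "T", "C"])
print(left)   # Counter()
```

An empty result means the array has been exhausted, which is the state the derivation of (2b) must reach. Asking for an item that was never selected, or for a third token of the, raises an error instead of going back to the lexicon.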

Pushing this idea further, Chomsky (2000:100-106) restricts the access to LAs so that only a subpart is placed in active memory. The main reason for this move was to prevent the possibility that comparisons between derivations became intractable: an LA with n tokens could potentially correspond to n! possible orders of Merge applications, and therefore at least n! derivations to compare (see Johnson & Lappin 1997, 1999 for further criticism along the same lines). To tackle these objections, Chomsky proposed that derivations could only be compared if they had the same LA at a given derivational step. Consider, in this respect, the example in (3b), with the numeration in (3a). As Chomsky (2000) observed, (3c) cannot be formed from (3a).

(3) a. {there1, likely1, a1, T1, proof1, to1, discovered1, is1, be1}
    b. There is likely to be a proof discovered
    c. *There is likely a proof to be discovered
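To get a feel for the n! figure mentioned above, the following sketch counts the possible orders in which the tokens of an array could be drawn. It only illustrates the arithmetic of the worst case; the function name count_orders is hypothetical, and nothing here models which orders would actually yield convergent derivations.

```python
from itertools import permutations
from math import factorial

def count_orders(tokens):
    """Count all orders in which distinct tokens could be selected, by enumeration."""
    return sum(1 for _ in permutations(tokens))

small = ["there", "is", "likely"]
print(count_orders(small), factorial(len(small)))   # 6 6

# The numeration in (3a) contains nine tokens.
numeration = ["there", "likely", "a", "T", "proof", "to", "discovered", "is", "be"]
print(factorial(len(numeration)))                   # 362880
```

Even for a nine-token array the number of orders already runs to hundreds of thousands, which is the kind of growth that restricting comparison to smaller domains is meant to avoid.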

The explanation provided by Chomsky in order to account for the asymmetry between (3b) and (3c) relied on the possibility that the comparison between operations is made at specific derivational stages. In particular, if we first focus on the stage depicted in (4), we have two options to form (3b): Either we Move a proof (from the syntactic object already assembled) or we Merge there (accessing the LA again). Both routes are shown in (5a) and (5b).

(4) [TP T to [vP be a proof discovered]]

(5) a. [TP a proof T to [vP be t(a proof) discovered]]    Move a proof
    b. [TP there T to [vP be a proof discovered]]    Merge there
       {there0, likely1, a0, T0, proof0, to0, discovered0, is1, be0}

Under the (nowadays abandoned) assumption that Merge is more economical than Move, the preference for taking there from the subarray was accounted for. However, as Chomsky quickly noted, merger of there is not always an option to bar movement. Although that might follow from the expletive not being part of the lexical array (as in, e.g., the glass broke), it will not always be the case (as in, e.g., there is a possibility that the proofs will be discovered). In order to solve this new puzzle, Chomsky put forward the hypothesis that one does not take the whole LA to construct the expression, but only subparts of it: a “sub-LA” or “phase”.

Suppose we select LA as before [...] suppose further that at each stage of the derivation a subset LAi is extracted, placed in active memory (the “workspace”), and submitted to the procedure L. When LAi is exhausted, the computation may proceed if possible; or it may return to LA and extract LAj, proceeding as before [...] Operative complexity in some natural sense is reduced. [from Chomsky 2000:100-106]
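The effect of working with one subarray at a time can be sketched as follows. The partition of the array into two subarrays is a hypothetical one chosen for illustration, it uses words rather than the functional heads a real array would contain, and the function options_at_stage is an invented name.

```python
def options_at_stage(subarray_in_workspace, candidates):
    """Only items in the subarray currently in active memory can be merged from the array."""
    return [item for item in candidates if item in subarray_in_workspace]

# Hypothetical partition of the array for
# "there is a possibility that the proofs will be discovered".
la_embedded = ["that", "the", "proofs", "will", "be", "discovered"]
la_matrix = ["there", "is", "a", "possibility"]

print(options_at_stage(la_embedded, ["there"]))   # []
print(options_at_stage(la_matrix, ["there"]))     # ['there']
```

While the embedded subarray is in the workspace, merging there is simply not an available option, so it cannot compete with moving the proofs. Once that subarray is exhausted and the next one is extracted, there becomes available.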

The important question that emerges is how those small subarrays are selected, and here is where proposals diverge.2 For some authors, phases are CP and v*P alone (Chomsky 2000, Gallego 2010); others incorporate PPs (Abels 2003) and DPs (Svenonius 2004, Hiraiwa 2005, Bošković this volume), and others yet take predications, every phrase, and even every application of Merge to constitute phases (Den Dikken 2006, Müller 2010, Epstein & Seely 2002), and there may be more options I am forgetting.3 This alone should make us realize, on the one hand, how stimulating and prolific the theory of phases is and, on the other hand, how difficult it is to compare the existing approaches. To recap so far, the computational complexity arguments given by Chomsky in 1995, and then again in 2000 (which directly align with the oft-used, but ill-understood, “computational efficiency” motto), though intuitively appealing and sound, have never been worked out in detail. Of course, we want the system to invoke “least effort” metrics (eliminating superfluous elements, barring redundant steps, restricting the search space, and so on and so forth; see Chomsky 1991, Collins 1997, Fukui 1996, Uriagereka 1997, and references therein), but it is not clear how. At first glance, one application of Merge is more economical than two applications, which is in turn more economical than three, but we know that this type of metric will not work (see Chomsky 2007, 2008 for discussion), especially if we endorse some version of the hypothesis that ‘grammars do not have counters’. What the precise metrics to determine complexity in FL are is still, as far as I know, under debate, and only interdisciplinary collaborations will allow us to advance in this terrain.4

In part as a response to this slippery scenario, Juan Uriagereka’s chapter explores the possibility that the existence of phases is a consequence of the (conflicting) properties of complex dynamic systems. Uriagereka takes as a starting point Chomsky’s (2001, 2008) assumption that phases must be syntactic objects with the form in (6), where P and N stand for phase and nonphase heads respectively:

(6) P–N

Uriagereka notes that the P – N space is insufficient to account for many cross-linguistic phenomena (cartographic approaches would in fact say that (6) is at best a simplistic idealization). Building on ideas discussed in Gallego (2009), this author argues that (6) can expand to P – N – N, a template he relates to two basic rhythmic units: (i) + − and (ii) + − − (where “+” and “−” correspond to edge and complement, being actually bigger domains). Adding a right edge, also external to the complement of the phase, provides the patterns in (7), which, Uriagereka reasons, are witnessed elsewhere in language (e.g., in syllabic patterns), with deeper connections to more general conditions in nature (see Uriagereka 1998):

(7) a. + −
    b. + − −
    c. + − +
    d. + − − +

As Uriagereka shows, the possibility of expanding the right and left edges of phases can capture the well-known elasticity of some domains. This seems relevant for language variation (the left edge of some languages contains richer structures, be this captured through more heads or more specifiers) and for stylistic or extra-prosodic arrangements pushing syntactic constituents to the right (as afterthoughts of sorts):

(8) a. . . . [CP que [FP cuánto_i [ él [v*P v adoraba [ el campo]] t_i ]]]]
       that how he adored-3.SG the country-side
       ‘. . . that how much he adored the country-side’
    b. [ [a rumor t_j]_i [ v [emerged t_i] ]] [about the candidate from Chicago]_j

A similar effort to naturalize phases is made in Cedric Boeckx’s contribution. As this author points out, a system without some type of cyclic, phase-by-phase Transfer will fail to provide units that can be read and manipulated by the interfaces. What must be determined is when Transfer applies and how much it takes away. These questions are related to what Chomsky calls the Phase Impenetrability Condition (PIC), which captures the idea that, after cyclic Transfer applies, some part of the phase must be left in the syntax, for otherwise displacement could not be captured (Chomsky 2004):5

(9) Phase Impenetrability Condition (PIC)
    In phase α with head H, the domain of H is not accessible to operations outside α; only H and its edge are accessible to such operations [from Chomsky 2000:108]
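Read as an accessibility check, the PIC can be sketched as follows. This is only a toy encoding: the `Phase` class, its field names, and the flat string representation of syntactic objects are assumptions introduced here, not part of any of the proposals under discussion.

```python
from dataclasses import dataclass, field

@dataclass
class Phase:
    head: str
    edge: list = field(default_factory=list)    # material at the edge of the head
    domain: list = field(default_factory=list)  # complement (domain) of the head

def accessible_from_outside(phase, item):
    """PIC: only the head and its edge are visible to outside operations."""
    return item == phase.head or item in phase.edge

# Hypothetical v*P with a wh-phrase at its edge.
vP = Phase(head="v*", edge=["what"], domain=["V", "t_what"])
```

On this encoding, a phrase that has reached the edge remains available for further displacement, while anything left inside the domain does not.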

In the literature on phases, the when and how much questions have been addressed in different ways:

(10) When question
     a. Transfer applies right after the phase head is introduced (Chomsky 2000, 2007, 2008)
     b. Transfer applies right after the next phase head is introduced (Chomsky 2001)

(11) How much question
     a. Transfer targets the complement domain of phases (except for root clauses; Chomsky 2000 et seq.)
     b. Transfer may target the phase head and its complement (if the head is not needed for selection purposes) (Ott 2011)
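The two answers to the when question in (10) can be compared in a toy derivation. The sketch below is a simplification built on several assumptions of mine: a clause is reduced to a bottom-up list of heads, each head stands in for its whole projection, edges are ignored, Transfer always takes the complement (option (11a)), and the name `visible_on_merge` is hypothetical.

```python
PHASE_HEADS = {"v*", "C"}

def visible_on_merge(spine, timing):
    """For each head, list the lower heads it can still search when merged.

    spine  -- heads in bottom-up Merge order, e.g. ["V", "v*", "T", "C"]
    timing -- "10a": a phase head's complement is transferred as soon as
                     that head has been introduced (after it has probed);
              "10b": it is transferred only when the next phase head comes in.
    """
    if timing not in ("10a", "10b"):
        raise ValueError("timing must be '10a' or '10b'")
    transferred = set()
    previous_phase_head = None           # position of the last phase head
    report = {}
    for i, head in enumerate(spine):
        is_phase_head = head in PHASE_HEADS
        if is_phase_head and timing == "10b" and previous_phase_head is not None:
            # the earlier phase head's complement goes now
            transferred.update(spine[:previous_phase_head])
        report[head] = [h for h in spine[:i] if h not in transferred]
        if is_phase_head:
            if timing == "10a":
                transferred.update(spine[:i])   # own complement goes at once
            previous_phase_head = i
    return report

early = visible_on_merge(["V", "v*", "T", "C"], "10a")
late = visible_on_merge(["V", "v*", "T", "C"], "10b")
```

In this toy run the two options differ only in what T can still search: under (10b) the complement of v* is still in the syntax when T is merged, whereas under (10a) it is already gone.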

Boeckx’s paper focuses on the how much question, arguing for the existence of two kinds of phases: transitive and intransitive ones, the former corresponding to cases where φ-feature valuation is required. Both options can be seen in (12), where strikethrough signals the amount of phase cashed out as a consequence of cyclic Transfer.

(12) a. Transitive phase: {γ, {α, β}}
     b. Intransitive phase: {α, β}

Boeckx suggests that the dichotomy in (12) provides the fundamental categories syntax requires to satisfy interface conditions. Building on work by Zenon Pylyshyn, this author relates (12a) and (12b) to more general cognitive categories, the WHAT and WHERE categories, which have a demonstrative and locative nature respectively. Boeckx further connects this reduction to the traditional distinction between open (nominal-like) and closed (relational-like) categories, and the way they operate in derivations. His proposal has clear implications for the study of lexical and functional categories, whose fine-grained classifications (e.g., the different types or flavors of C, v, P, etc. that have been identified in the recent literature; Rizzi 1997, Svenonius 2008, Folli & Harley 2004) are derived in the morphological interface, by contextual strategies. As just pointed out, having some transfer / mapping mechanism is compulsory and in fact guarantees the following:

(13) Effects of cyclic transfer / mapping
     a. What is transferred cannot be changed (tampered with) in subsequent stages (barring backtracking)
     b. Interfaces have access to small (manageable) objects, so that the outputs can be informative enough (reducing complexity)
     c. Information contained in a phase P1 and the information of subsequent phases (P2, P3, … Pn) will not engage in special meaning (allosemy) / form (allomorphy) interactions

The latter property of cyclic systems has often been related to morphophonological domains (Marantz 2001, 2007). Bridget Samuels discusses some of the most relevant arguments to postulate cyclic conditions in the realm of morphology and phonology. As this author argues, all phonological rules are governed by the PIC, but in different ways: Some obey the PIC on a scale on which every phase head is relevant, whereas others obey the PIC on a scale on which only the clause-level phase heads (C, v, and D) count. This cut roughly corresponds with lexical/(sub)word-level rules and phrase-level/post-lexical rules. The first type of rules concerns the distinction between, e.g., adjectival (stative) and eventive passives in English, which differ in a series of respects. One of them is the possibility of attaching causative or applicative morphemes:

(14) a. *The men are baked a cake (stative interpretation)
     b. *These tomatoes are grown (‘cultivated’ interpretation)

(15) a. The men were baked a cake
     b. These flowers were grown by farmers

With Marantz (2001), Samuels argues that the asymmetry between (14) and (15) does not follow from adjectival passives being formed in the lexicon and eventive passives in the syntax, but from the special negotiation that a categorizing (phase) head and the root establish. If only the stative v is merged directly with the root (see (16a)), one can account for the facts above by assuming that these elements will be transferred together, becoming inaccessible to subsequent operations. This way any morpheme that has been plugged in above the [v √ROOT] unit will not be able to ‘talk to’ its components.

(16) a. Adjectival passive: [ x [vP v(stative) √ROOT ]]
     b. Eventive passive: [ x [ v(passive) [vP v √ROOT ]]]

This inner vs. outer (Class 1/stem level vs. Class 2/word level) distinction between affixes proves useful for other morpho-phonological operations that apply at the sub-word level, such as stress assignment. The stress contour of párent, paréntal and párenthood can be explained if we assume that Class 1 affixes affect stress, while Class 2 affixes do not. Samuels argues that, for parent, stress assignment rules apply to the n + √PARENT structure first, constructing a unary foot and yielding stress on the first syllable. In the case of parental, which involves a Class 1 suffix, the input to the stress rules will be the a + √PARENT structure, and the result will be regular penultimate stress. Finally, párenthood contrasts with paréntal because it contains two cycles: The first cycle applies to n + √PARENT, with the same result as in the noun/verb párent; then -hood is added, but the root is inaccessible because of the PIC.

Samuels’ chapter also considers the effects of the PIC with larger domains: phonological (φ) phrases, which are intimately connected to the focus of Yosuke Sato’s chapter. Sato studies Nuclear Sentence Stress (NSS) assignment, proposing that languages select either the leftmost or rightmost edge for phrasal stress within each Transfer (Spell-Out) domain. Assuming that the PIC transfers the complement domain of phases (option (11a) above), which thus becomes a potential domain for phrasal stress, Sato proposes the following rule:

(17) The Nuclear Sentence Stress Rule
     The head of the rightmost MaP in phonological representation receives maximal prominence

In Sato’s theory, Transfer domains correspond to Major Phrases (MaPs) in phonological representations. In order to determine the sentence stress of, e.g., John loves Mary, Sato argues that the rightmost constituent at the VP level and the leftmost constituent at the TP level (VP and TP being the relevant Transfer domains) are mapped onto MaPs, which entails that they are recorded as s(trong). This predicts that after Transfer targets the VP and TP, both Mary and John will end up being marked as s. Sato then assumes the Obligatory Contour Principle (Goldsmith 1976), which precludes the occurrence of two consecutive identical phonological features, and disallows the assignment of another s-label to the sister constituent. Consequently, the subject John receives the w(eak)-label:

(18) a. [vP v + V_i [VP t_i object ]] → (MaP object)_s
     b. [TP subj T [vP v + V_i [VP t_i object ]]] → (MaP subj)_w (MaP object)_s
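The labeling steps just described can be sketched procedurally. The sketch rests on assumptions that go beyond the text: Transfer domains are given as flat word lists with a stipulated edge, the function names are hypothetical, and the alternating s/w pattern for more than two MaPs is my own extrapolation from the two-MaP case discussed above.

```python
def map_heads(domains):
    """Pick the designated edge constituent of each Transfer domain.

    domains -- (constituents, edge) pairs in left-to-right order of the
               resulting MaPs; edge is "leftmost" or "rightmost".
    """
    heads = []
    for constituents, edge in domains:
        if edge == "leftmost":
            heads.append(constituents[0])
        elif edge == "rightmost":
            heads.append(constituents[-1])
        else:
            raise ValueError(f"unknown edge: {edge}")
    return heads

def label_maps(heads):
    """Rightmost MaP keeps s (rule (17)); the OCP bars s next to s."""
    labels = [None] * len(heads)
    for i in range(len(heads) - 1, -1, -1):
        if i == len(heads) - 1:
            labels[i] = "s"
        else:
            labels[i] = "w" if labels[i + 1] == "s" else "s"
    return list(zip(heads, labels))

# John loves Mary: leftmost constituent of the TP domain, rightmost of the VP domain.
domains = [(["John", "T"], "leftmost"), (["t_V", "Mary"], "rightmost")]
labeled = label_maps(map_heads(domains))
```

For the two-MaP case the result matches (18b): the subject ends up weak and the object strong.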

The outcome in (18b) correctly predicts that, in SVO English sentences, it is the object that receives more prominent stress than the subject in the focus-neutral, out-of-the-blue context. Importantly, Sato convincingly argues that languages may differ with respect to which phonological edge of a given Transfer domain is chosen for stress assignment. In order to show this, Sato offers evidence from Japanese, English, Italian, and other languages.

Let us now go back to (1a), and the role of the systems external to the FL to determine what the cyclic domains are. A key trait of minimalism is its interface concern, as stated in Chomsky’s (2000) strong minimalist thesis (SMT) that FL is an optimal solution to legibility conditions. In the case of phases, one could interpret ‘interface concern’ in the sense of having constraints on phases so that these computational objects have some conceptual-intentional or sensorimotor counterparts. We have already seen this in the case of the morphophonology, but what about the semantics? Wolfram Hinzen addresses this question, defending the idea that semantic objects emerge as a result of the phase-based nature of syntax. Following previous work of his, Hinzen sketches a proposal where semantic entities (proposition, reference, truth, judgment, etc.) are by-products of syntactic derivations. This perspective, which is even more ambitious than customary syntactico-centric approaches, minimizes the role played by the interfaces, up to the point of questioning the independent existence of the Conceptual-Intentional systems. Hinzen investigates what the units that syntax uses to give rise to dedicated semantic entities are. In line with Chomsky, this author takes vPs and CPs to be phasal objects, since these domains are responsible for argument structure and discourse / scope relations. Building on familiar structural parallelisms of the verbal and nominal domains (Chomsky 1970, Abney 1987, and subsequent work), Hinzen also includes DP in the list of phases (a possibility also considered, but elusively so, by Chomsky 2004, 2005, 2007).6 As Hinzen emphasizes, the independent (self-contained) nature of CPs, v*Ps, and DPs makes them count as phases. In Hinzen’s proposal, ‘independence’ aligns with ‘referentiality’, in the following sense:

(19) a. CP is the paradigmatically ‘independent’ structure, which alone, when occurring as a matrix clause, allows for a complete ‘move in the language game’ […] such as the assertion of some proposition as true: it exhibits finite Tense and Force and denotes a truth value.
     b. v*P denotes an event in time: it assembles the argument-structure of a verb with all thematic requirements and event participants satisfied, specified for Aspect but not Tense.
     c. DP can only denote an object in space, on a scale from maximal indefiniteness (generics, indefinite existential, etc.) to maximal definiteness (definite specific DPs).


Hinzen’s explorations are to be ascribed to Chomsky’s (2000:106) earliest characterization of phases, where these objects are defined as “the closest counterparts to a proposition”. Though interesting, it is not entirely clear that such a definition can provide an empirically coherent picture of phases, for different reasons. To begin with, propositionality is a semantic notion (rooted in truth-conditional semantics, in the sense of the Tarskian and Fregean tradition) that does not readily apply to v*Ps, DPs, and certain CPs (uninflected, subjunctives, and others). In other words, it is not clear whether propositionality, to the extent that this notion can be worked out in syntactic terms, is a valid property in the case of objects that have already been shown to meet independent cyclic criteria. Along with the propositionality argument offered in 2000, Chomsky has also suggested that phases have some counterparts at the Sensorimotor and, especially so, the Conceptual-Intentional systems.

[P]hases should have a natural characterization in terms of IC: they should be semantically and phonologically coherent and independent. At SEM, vP and CP (but not TP) are propositional constructions: vP has full argument structure and CP is the minimal construction that includes tense and event structure and (at the matrix, at least) force. At PHON, these categories are relatively isolable (in clefts, VP-movement, etc.). [from Chomsky 2004:124]

Although the argument here is different from the one based on propositionality, it is equally uninformative, as it presupposes a theory of what a ‘natural characterization’ is.7 Why should “argument structure” and “tense/event structure” count as natural characterizations? All other things being equal, several domains can be naturally defined with respect to certain evidence, so one should not rule them in or out as candidates for phasehood at once. To be sure, we would ultimately want all those domains to converge, but such a unification appears to be a pending issue of linguistic theory. A slightly more promising way, I believe, of interpreting Chomsky’s (2004) claim about phases and their counterparts in the C-I systems is to concentrate on the ‘independent’ status of these units (Hinzen’s goal). The devil is in the details here as well, nonetheless. The argument concerning phonetic independence comes from Chomsky (2001:43), who provides the minimal pair in (20):8

(20) a. It is [α to go home (every evening)]_i that John prefers t_i
     b. *It is [α to go home (every evening)]_i that John seems t_i


The data in (20) show that while CP can be fronted, TP cannot, which could be taken to indicate that only phases are phonologically independent. Though interesting, the asymmetry becomes much less clear in other cases. In fact, the test provided by Chomsky (2001) concerning the PF isolability of phases is not restrictive enough: As we can see below, virtually every maximal projection can be fronted.

(21) a. [α Those books ]_i , I didn’t buy t_i
     b. [α Peanuts ]_i , I ate some t_i
     c. [α In that hill ]_i is where John lived t_i
     d. [α More prepared ]_i , that is how I saw John t_i
     e. [α Exhausted ]_i is how I feel t_i
     f. [α Quickly ]_i is the way I work t_i
     (where α = DP, NP, PP, DegP, AP, AdvP)

In sum, what can be concluded from the preceding discussion is that trying to identify phases by building on isolability, be it semantic or phonetic, is a rather unclear strategy, which, as Noam Chomsky (this volume) points out, can easily yield conflicting results.9 Does this mean that the independence argument is to be discarded? I think the answer is negative, as there is a sense in which TP, VP, and other non-maximal projections, are defined through higher functional heads. It just means that this criterion is not restrictive enough to determine phasehood. To conclude, however plausible in and of itself, if phases were to be determined on the basis of interface criteria alone, it would not be too difficult to obtain an unrestricted scenario, as certainly most grammatical units (phrases, words, morphemes, etc.) have some sound-meaning correspondences. Under such an interface-based scenario, the notion of phase would be of no particular interest: Bluntly put, one’s ontology of interface units would determine the ontology of phases. There is no doubt that some version of the cycle must be postulated for morpho-phonology, but it is not immediately obvious that phases (in the narrow sense of Chomsky 2000 et seq.) are required to capture that: some version of the No Tampering Condition, precluding manipulation of already assembled structure, could do. If neither computational complexity nor interface conditions are good places to start our search for phases, then how can we determine the cyclic status of grammatical objects? In the next section, I discuss the view that Chomsky has favored over the years.


3. Vergnaud’s theory of abstract Case and the role of φ-features

We have just seen that Chomsky’s initial characterization of phases as propositional objects with some degree of independence at the interface levels results in a miscellaneous and evanescent scenario. Together with the interface-based arguments, Chomsky has also pointed out, perhaps less effectively, that those interface correlates should actually be taken as side effects, not motivations to determine cyclic mappings:

My feeling has been that Phase Theory should, and I think probably does, fall out from conditions of computational complexity, with interface motivation separate and ancillary, more a consequence than a cause [...] Phase Theory makes LF unstateable, like D- and S-structure. That’s an obvious desideratum for optimal design, and in itself I think motivates phase theory. It leaves open what the phases are. Then comes the ancillary question of whether the phases have significant interpretations at the CI and (secondarily) SM level. That consideration has often been taken as the motivation for phase theory, but I think that has the matter backwards. [Chomsky, cited in Gallego 2010:54-55]

The non-interface anchored criterion to determine phasehood that Chomsky has pursued concerns the existence of uninterpretable φ-features. This has been argued for by Chomsky in different passages:

As discussed elsewhere (Chomsky 2001), the size of phases is in part determined by uninterpretable features […] These observations provide further support for the conclusion that v*P and CP are phases, the locus of determination of structural Case and agreement for object and subject […] A stronger principle would be that phases are exactly the domains in which uninterpretable features are valued, as seems plausible. [from Chomsky 2008:154-155; my emphasis, AJG]

The idea that φ-features drive the derivation into the Transfer mode is not new at all. It goes back to the ‘featural’ conception of cyclicity entertained in Chomsky (1995:233). Such a view, which admittedly endorses a “more abstract notion of phase, based on the concept of valuation of features rather than just the size of the category” (Chomsky 2004:127), must also face different problems. Firstly, there is no worked out theory that predicts where uninterpretable features are located (universally). Secondly, since uninterpretable features are a source of parametric variation (the so-called Borer-Chomsky Conjecture), one could plausibly expect that phases be determined by their distribution, yielding cross-linguistic variation. Before exploring this hypothesis and its consequences, let me step back a little bit and elaborate on the role played by φ-features in Chomsky’s approach to derivations. In Chomsky’s (2000, 2001) system, C and v* are taken from the lexicon endowed with a bundle of uninterpretable φ-features. Chomsky (2001) makes the additional assumption that uninterpretable features are introduced into the derivation as unvalued, which turns these features into probes (i.e., seekers) that look for a goal in their c-command domain. In a standard configuration, phase heads match a DP that contains interpretable φ-features, which feeds valuation-and-deletion (of C’s and v*’s φ-features) and structural Case assignment (to the DPs). Crucially, for reasons pointed out by Epstein & Seely (2002), valuation and deletion of φ-features apply simultaneously, as part of the cyclic mapping.10 A hallmark of phases is precisely that they are related to cyclic Transfer; perhaps this is the sole idea that everyone working on Phase Theory subscribes to. Now, interestingly, Chomsky’s conception of Case allows for two approaches to Transfer:

(22) a. Transfer takes place to value (and delete) φ-features
     b. Transfer takes place at least to value (and delete) φ-features

Chomsky seems to favor (22a) on both conceptual and empirical grounds. (22a) is forced upon us if we accept the existence of uninterpretable φ-features and the necessity to delete them. (22b) is a weaker version, which increases the list of objects that qualify as phases. As Noam Chomsky observes through personal communication:

I think every approach must take as phases at least those syntactic objects with uninterpreted features; otherwise transfer will crash, transferring uninterpreted features. Those features must be on the phase head, if the features are valued by a goal below. The question, then, is whether there should be other phases as well. If the answer is NO, we have a start to an answer for the question why there are uninterpretable features altogether: to identify phases [...] There are independent reasons [...] to suspect that C [and] v* are unique. I presume [Internal Merge – IM] applies at the phase level. Then the question arises: is successive-cyclic movement necessary to carry it further? For v*, it is, and that can be at least partially (maybe completely) explained by the intervention effect of the raised element. For C, it is required unless there is matching of the IM and C (e.g., indirect questions); otherwise projection fails. I don’t think these arguments hold for other proposed phases. That also seems to me a significant argument for the conclusion that C and v* are the only phases.
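The probe-goal mechanics assumed throughout this discussion (a phase head enters the derivation with unvalued φ-features, finds a nominal in its c-command domain, is valued by it, and assigns it structural Case) can be given a minimal sketch. The class names, the flat closest-first list standing in for the c-command domain, and the Case labels are assumptions made here for illustration; the simultaneity of valuation and deletion at Transfer is not modeled.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Nominal:
    name: str
    phi: dict                    # interpretable, hence already valued
    case: Optional[str] = None   # structural Case, assigned under Agree

@dataclass
class PhaseHead:
    label: str                   # "C" or "v*"
    case_assigned: str           # illustrative label, e.g. "accusative" for v*
    phi: Optional[dict] = None   # uninterpretable: enters unvalued

def agree(probe, domain):
    """Value the probe against the closest nominal goal and assign Case.

    domain -- objects c-commanded by the probe, closest first (a flat
              stand-in for structure; non-nominals are plain strings).
    """
    for goal in domain:
        if isinstance(goal, Nominal):
            probe.phi = dict(goal.phi)       # valuation of the probe
            goal.case = probe.case_assigned  # structural Case on the goal
            return goal
    return None

v_star = PhaseHead("v*", case_assigned="accusative")
mary = Nominal("Mary", {"person": 3, "number": "sg"})
goal = agree(v_star, ["V", mary])
```

A head with no nominal in its domain simply stays unvalued in this sketch, which is the situation in which Transfer would ship off an unvalued feature.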

This restricted view of phases minimizes the weight of interface criteria, and focuses on φ-features as the one and only reason for the system to require cyclic mapping. As Chomsky (2000, 2001, 2007, 2008) has noted, v*P and CP appear to be unique with respect to the following phenomena:

(23) a. Application of Agree (valuation of uninterpretable φ-features)
     b. Application of Internal Merge (Move)

What (23) is saying is that all operations but EM apply at the phase level (as noted in Chomsky 2007:17). As just noted, the application of Agree in the v*P and CP is rather straightforward, since v* and C contain uninterpretable φ-features. The same logic does not carry over to DP and PP so naturally: Such a step would require D and P to be endowed with uninterpretable φ-features, but since these heads are related to inherent (semantic) Case, it is unlikely that they qualify as probes in the technical sense that Chomsky has in mind.11 In the case of IM, things are more controversial. That v*/v and C trigger movement to their edges is beyond question, given the existence of reconstruction effects (Abels 2003, Barss 1986, Chomsky 1977, Fox 2000, and Legate 2003). What about PPs and DPs (or nPs)? The idea that DPs and PPs may also have an escape hatch was the focus of much research in the late Seventies and Eighties (by Cinque 1980, Giorgi & Longobardi 1991, van Riemsdijk 1978, Torrego 1985 and others). Following that fruitful trend, Željko Bošković addresses the hypothesis that DP is a phase paying attention to left-branch extraction (LBE). Bošković argues for a parameter teasing apart languages with and without articles (say, English and Serbo-Croatian); according to him, the latter do not have a null D, but rather project a bare NP, as depicted in (24):

(24) a. [DP D [nP n [NP N . . . ]]]   English
     b. [nP n [NP N . . . ]]          Serbo-Croatian

The structural difference in (24) has a cluster of parametric consequences, which Bošković regards as a (macro-)parameter.

(25) Consequences of the NP/DP Parameter
     a. Only languages without articles may allow left-branch extraction
     b. Only languages without articles may allow adjunct extraction from [Traditional Noun Phrases]
     c. Only languages without articles may allow scrambling
     d. Multiple-wh fronting languages without articles do not show superiority effects
     e. Only languages with articles may allow clitic doubling
     f. Languages without articles do not allow transitive nominals with two genitives
     g. Head-internal relatives display island sensitivity in languages without articles, but not in languages with articles
     h. Polysynthetic languages do not have articles
     i. Only languages with articles allow the majority reading of MOST
     j. Article-less languages disallow negative raising (i.e., strict clausemate NPI licensing under negative raising); those with articles allow it

Crucially, Bošković attributes the possibility of LBE (and adjunct wh-phrases) to the presence / absence of the DP layer, which qualifies as a cyclic node, thus provided with an escape hatch.

(26) a. *Expensive_i he saw [t_i cars]
     b. Skupa_i je vidio [t_i kola]   (Serbo-Croatian)
        expensive is seen car

(27) a. *From which city_i did Peter meet [NP girls t_i]?
     b. Iz kojeg grada_i je Petar sreo [djevojke t_i]   (Serbo-Croatian)
        from which city is Peter met girls
        ‘From which city did Peter meet girls?’

Following Chomsky’s idea that CP is a phase, but TP is not, Bošković develops an analysis of extraction with different ingredients: First and foremost, he pushes Chomsky’s CP/TP distinction to DP/NP; second, he adopts anti-locality, a syntactic constraint whereby movement must cross at least one full phrasal boundary (not just a segment). Together, these assumptions predict that APs and adjuncts cannot be extracted out of a DP, since these are base-generated as adjoined to an NP segment. Since movement from this position would entail crossing a partial NP projection (violating anti-locality), movement is precluded. Although the idea that DPs have an escape hatch analogous to CPs is pretty much hegemonic nowadays, it seems to be threatened by some data that were already noted by Chomsky (1973). As he observed, unlike CPs, DPs block successive cyclicity scenarios: they just allow one-cycle extraction. It is unclear why this is so, if they have the ‘escape hatch’ property. This unexpected asymmetry is shown below, where extraction from one DP is fine, while extraction from two DPs is not.12, 13

(28) a. [CP Who_i did you see [DP a picture of t_i ]]?
     b. [CP Who_i did you hear [DP stories about t_i ]]?
     c. [CP What_i did you write [DP articles about t_i ]]?

(29) a. *[CP Who_i did you hear [DP stories about [DP a picture of t_i]]]?
     b. *[CP Who_i do you receive [DP requests for [DP articles about t_i]]]?
     [from Chomsky 1973:105]

Surprisingly, Chomsky (1973) argued that this behavior was due to the fact that DPs lack an escape hatch (a COMP node, in earlier terminology):

If we are correct in assuming the [Strict Cycle Condition], which restricts extraction to adjacent cycles, it follows that although wh-phrases can be extracted from such structures as a picture of__, stories about__, requests for__, as in [(28)], it will not be possible to extract a wh-phrase when one of these structures is embedded in another, as in [(29),] because of the absence of a COMP node in noun phrases. [from Chomsky 1973:105]
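The reasoning in the quotation can be turned into a small check. This is a deliberately narrow sketch: it encodes only the contrast between (28) and (29), under the assumption that a cyclic node with a COMP position offers a stopover while a noun phrase does not; it says nothing about other island effects. The names `HAS_ESCAPE_HATCH` and `extraction_ok` are hypothetical.

```python
# Assumption following the quoted passage: clauses have a COMP node,
# noun phrases do not.
HAS_ESCAPE_HATCH = {"CP": True, "DP": False}

def extraction_ok(crossed):
    """Check that no step of movement has to skip a cycle.

    crossed -- cyclic nodes between the gap and the final landing site,
               innermost first, e.g. ["DP", "DP"] for (29).
    """
    hatchless_in_a_row = 0
    for node in crossed:
        if HAS_ESCAPE_HATCH[node]:
            hatchless_in_a_row = 0       # the phrase can stop at this node
        else:
            hatchless_in_a_row += 1
            if hatchless_in_a_row > 1:   # a whole cycle would be skipped
                return False
    return True

one_dp = extraction_ok(["DP"])           # pattern of (28)
two_dps = extraction_ok(["DP", "DP"])    # pattern of (29)
```

Successive embedded clauses, by contrast, each provide a stopover in this sketch, which is the asymmetry between CPs and DPs that the text draws attention to.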

Bošković provides similar data from Serbo-Croatian, where “deep LBE” (i.e., LBE in a [NP . . . [NP . . .]] scenario) is also blocked. He accounts for this by suggesting that NP may be a phase even in D-less languages, so that movement to the higher NP is too local. Be that as it may (and Bošković provides interesting arguments to support this approach), the analysis requires parametrizing not only the presence or absence of the DP layer, but also the category to which phasehood is attributed (just like Abels 2003 did with prepositions to account for P-stranding facts). Another interesting asymmetry between v*P/CP and DP/PP concerns the fact that a moved element can stay only in the edge of the former. This is clear in the case of CP (e.g., wh-questions, relatives, and possibly others). Chomsky (2004:123) suggests that this is true in the case of v*P too, but the element must move again so that there is no intervention effect between C-T and the external argument:

(30) a. *[CP C [TP T_φ [v*P what books_i [v*P I v* bought t_i ]]
     b. John asked [CP what books_i [TP I_j T [v*P t_i [v*P t_j v* bought t_i ]]]]

As for DP, it seems that phrases cannot remain in [Spec, DP], if they ever move to that position to begin with, which is unclear, as we have seen. To be precise, it is not obvious why what cannot remain in the specifier position of either DP in (31a) and (31b):14, 15

(31) a. *John saw [DP what_i pictures of t_i]
     b. *John saw [DP what_i stories of [DP pictures of t_i ]]

The same seems to be found with prepositions. Evidence within the PP realm has highlighted the existence of forms like thereon, thereafter, thereof, therewith, whereby, and the like, which suggest the possibility that P has an escape hatch (this was van Riemsdijk’s 1978 influential proposal). However, the extraction data are not entirely conclusive either (or, at least, they are not as robust as they are with CPs): PPs typically behave as opaque domains, which casts doubt on the role played by their alleged escape hatch (as noted by van Riemsdijk himself). Furthermore, as in the case of DPs, wh-elements cannot move to the [Spec, PP] of a higher PP. Compare (32) and (33), which parallel (28) and (29):

(32) a. [PP where_i [P’ of t_i ]]
     b. [PP where_i [P’ by t_i ]]

(33) a. *[PP where_i [P’ from [PP above t_i ]]]
     b. *[PP where_i [P’ from [PP over t_i ]]]

Unless something else is relevant here, it is unexpected for the examples in (33) to be ruled out, given the availability of PP’s escape hatch.16 If wh-words are restricted to the ‘first Spec’ position (i.e., an adjacent position), then one might argue that the process is morpho-phonological, rather than syntactic. Needless to say, these observations do not necessarily lead us to reject the idea that DPs and PPs are phases (perhaps because, as Boeckx points out, phases are only indirectly related to locality), but they clearly signal non-trivial differences between these domains and v*Ps and CPs: differences that are truly syntactic and go beyond what is expected from a mere change in the category.

Introduction: A framework of phases for linguistic theory


The relevance of phases for determining extraction is also discussed by Hiroki Narita. This author outlines an approach to derivations where every application of Merge must have as one of its inputs a lexical item. Following Chomsky (2008), Narita assumes that the property that enables syntactic objects to be merged is an edge feature (EF), which is a prerogative of LIs. Since postulating an operation of feature percolation would entail an (apparently unnecessary) enrichment of UG (see Cable 2007 for empirical arguments in defense of this claim), Narita restricts the presence of EFs to LIs. This yields a very restrictive view of how Merge operates. In particular, application of Merge may yield structures like (34a) and (34b), any complex specifier situation like (34c) being ruled out (a conclusion reached by Uriagereka 1999a and Moro 2000 on independent grounds): (34)

a. {X, Y}

b. {X, YP}

c. *{XP, YP}

Situations like (34c) obtain whenever a successive application of EM creates a complex object that is to be combined with another complex object (created in a parallel workspace). Now: What happens when the two complex syntactic objects are to be merged in a system where Merge must take an LI as one of its inputs? Narita invokes phases to solve this problem. As this author argues, cyclic Transfer must affect either XP or YP (transferring its complement) for Merge to apply. In the case of, say, DP and v*P merger, the derivation may transfer the complement of D (i.e., NP) or the complement of v* (i.e., VP). Suppose that it is D’s complement that is transferred. (35)

[Diagram: (a) the DP, headed by D → TRANSFER → (b) the bare head D ... (c) D merged with v*P]
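The reasoning behind (34) and (35) can be approximated in a few lines of code. The sketch below models Merge as set formation that fails unless one of its inputs is a lexical item, and cyclic Transfer as the reduction of a phase to its bare head. The class `LI`, the functions `merge` and `transfer`, and the toy items `D`, `N`, `v`, `V`, `Obj` are assumptions introduced here for exposition; they are not Narita's own formalization.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LI:
    """Toy lexical item; by assumption, only LIs carry an edge feature (EF)."""
    label: str
    phase_head: bool = False


def merge(x, y):
    """Form {x, y}; fail in the *{XP, YP} case, where neither input is an LI."""
    if not (isinstance(x, LI) or isinstance(y, LI)):
        raise ValueError("*{XP, YP}: neither input is a lexical item")
    return frozenset({x, y})


def transfer(phrase):
    """Toy cyclic Transfer: hand over the phase head's complement and
    return the bare head, which is an LI and can therefore be merged."""
    heads = [m for m in phrase if isinstance(m, LI) and m.phase_head]
    if len(heads) != 1:
        raise ValueError("phrase is not headed by exactly one phase head")
    return heads[0]


# Hypothetical lexical items for the DP / v*P case discussed in the text.
D, N = LI("D", phase_head=True), LI("N")
v, V, Obj = LI("v*", phase_head=True), LI("V"), LI("Obj")

dp = merge(D, N)                # {D, N}: an instance of (34a)
vp = merge(v, merge(V, Obj))    # {v*, {V, Obj}}: an instance of (34b)

try:
    merge(dp, vp)               # (34c): two complex objects, so Merge fails
    blocked = False
except ValueError:
    blocked = True

repaired = merge(transfer(dp), vp)  # (35): reduce the DP to D, then Merge
```

In this toy model the only way to combine `dp` and `vp` is to reduce one of them first, which is the role the text attributes to cyclic Transfer.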

After cyclic reduction of the DP, the remnant is a D head, and Merge applies unproblematically—in accord with the idea that EF are only assigned to LIs. Narita argues that this approach may account for CED effects, assuming that languages may choose to apply cyclic Transfer to the complement of one phase head or the other. As can be seen, Narita’s work has interesting implications for how Merge operates: If his reliance on the


Ángel J. Gallego

role played by EF is on track, then the endocentric nature of phrases can be said to follow from the way Merge applies. As we have seen, Narita’s proposal must assume that D (in his terms, K(ase), and P too) is a phase head, so that it can force the application of cyclic Transfer in cases where no valuation of φ-features seems necessary (in line with (22b)).

The effects and intricacies of uninterpretable morphology in Chomsky’s system are the focus of the remaining papers of this volume. Samuel D. Epstein, Hisatsugu Kitahara and T. Daniel Seely (EKS) offer a series of arguments in order to derive the GB assumption that theta relations are established prior to Case assignment. EKS entertain Chomsky’s (2007, 2008) hypothesis that φ-features are generated in phase heads, and are later on downloaded to non-phase heads through a mechanism of feature inheritance.17 As Chomsky argues, building on observations by Marc D. Richards, the necessity to transmit φ-features is due to the way cyclic Transfer operates (it is restricted to the complement of phase heads). Assuming a configuration like (36) . . . (36)

T > . . . > DP where: “…” may contain other categories but contains no theta assigner of DP and therefore DP is not in a theta position

. . . EKS note that two prerequisite conditions would have to be met for T to value Case on DP prior to DP receiving a theta-role. First, no phase head may be sandwiched between T and DP, which follows from the PIC (such a phase head would induce Transfer and the DP would be gone, hence unavailable for Agree (T, DP)). Second, T must inherit φ-features from C. As EKS further observe, once cyclic Transfer applies, TP is cashed out to the interfaces. What the semantics receives is something like (37): (37)

[TP T > … > DP ]

where DP is not in a theta configuration

Consequently, the DP has no theta role in the CI representation of the transferred TP, which results in a “Theta Criterion” violation (which is to be understood as a violation of Full Interpretation): the DP itself can be interpreted, but it will have no computable semantic (i.e., theta) relation with anything else in the given local structure. EKS offer additional phase-dependent arguments (and reject potential counter-arguments) to reinforce the same conclusion, thus showing that phases may be relevant for more general interface conditions (Full Interpretation).


Still within the context of Chomsky’s φ-feature inheritance proposal, Marc D. Richards investigates the connection between movement and morphology, and the relevance of the PIC. Richards starts by considering the well-known existence of two versions of the PIC (Chomsky’s original strong version of 2000, and Chomsky’s weaker version of 2001): (38)

Phase Impenetrability Condition
a. Strong version (Chomsky 2000:108): In phase α with head H, the domain of H is not accessible to operations outside α; only H and its edge are accessible to such operations.
b. Weak version (Chomsky 2001:13): [Given structure [ZP Z … [HP α [H YP]]], with H and Z the heads of phases]: The domain of H is not accessible to operations at ZP; only H and its edge are accessible to such operations.
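The practical difference between the two formulations in (38) can be sketched with a toy clausal spine. The code below is a minimal illustration under simplifying assumptions: the clause is a flat top-down list of heads, C and v* are the only phase heads, and the names `SPINE`, `PHASE_HEADS` and `can_access_domain` are invented for this example.

```python
SPINE = ["C", "T", "v*", "V"]   # toy clause, listed from top to bottom
PHASE_HEADS = {"C", "v*"}


def can_access_domain(probe, phase_head, version):
    """Can `probe` reach into the complement domain of `phase_head`?"""
    if phase_head not in PHASE_HEADS:
        raise ValueError(f"{phase_head} is not a phase head")
    p, h = SPINE.index(probe), SPINE.index(phase_head)
    if p >= h:
        # The probe is H itself or sits inside H's domain.
        return True
    if version == "strong":
        # (38a): the domain is closed to every operation outside the phase.
        return False
    if version == "weak":
        # (38b): the domain is closed only at ZP, that is, once the next
        # higher phase head Z has been merged.
        higher = [i for i in range(h) if SPINE[i] in PHASE_HEADS]
        if not higher:
            return True
        z = max(higher)          # position of the closest higher phase head
        return p > z             # heads below Z can still probe
    raise ValueError(f"unknown PIC version: {version}")


t_strong = can_access_domain("T", "v*", "strong")
t_weak = can_access_domain("T", "v*", "weak")
c_weak = can_access_domain("C", "v*", "weak")
```

Under these assumptions T can look into the domain of v* only on the weak version, while C cannot do so on either version.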

As has been noted in the literature, the main consequence of the 2001 adjustment is that T can probe the interior of the v*P phase; in the strong version, this is impossible, since VP is already gone by the time T is merged. Although this slight modification may be needed for empirical reasons (in those cases where T agrees with the internal argument, as in DAT-NOM constructions), the resulting scenario is conceptually problematic (for reasons discussed by Richards, who also argues against Chomsky’s 2001 distinction between weak and strong phases, the former being incapable of triggering cyclic Transfer). The 2001 version of the PIC is in fact incompatible with the idea that φ-features are inherited from C: If T’s φ-features come from C, then it is impossible for T to probe before C is introduced into the derivation. Building on Chomsky (2000, 2001), Richards argues that phase heads may come from the lexicon in a complete or defective fashion. Richards proposes a refinement of this ontology by noting that defectiveness may be total (a phase head being feature-less) or partial (a phase head having gender, number, or both, but never person). (39)

a. Complete P: {[uPerson], [uNumber]}
b. Partially defective P: {([uNumber]), ([uGender])}
c. Completely defective P: no φ-features
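The typology in (39), together with the idea that a single uninterpretable feature is enough to force valuation, can be sketched as follows. This is only an illustrative encoding: the feature names and the functions `classify` and `forces_transfer` are assumptions made for the example, not part of Richards's proposal.

```python
PHI = frozenset({"Person", "Number", "Gender"})


def classify(unvalued):
    """Classify a phase head by its unvalued phi-features, following (39)."""
    feats = frozenset(unvalued)
    if not feats <= PHI:
        raise ValueError(f"unknown features: {sorted(feats - PHI)}")
    if not feats:
        return "completely defective"
    if "Person" in feats:
        return "complete"
    return "partially defective"


def forces_transfer(unvalued):
    """Valuation (hence cyclic Transfer) is triggered as soon as the head
    bears at least one unvalued feature, however many there are."""
    return bool(frozenset(unvalued))


participle = {"Number", "Gender"}   # hypothetical partially defective head
empty_head = set()                  # hypothetical completely defective head
```

On this encoding a partially defective head behaves like a complete one with respect to Transfer, and only the empty head leaves its interior visible to outside probes.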

Richards makes the key observation that it actually does not matter whether a phase head is φ-complete or φ-defective: As long as it contains some uninterpretable feature, valuation (hence cyclic Transfer) will be mandatory. The system will not be sensitive to how many uninterpretable features a given head has: As long as there is one, valuation will be activated. The only uninteresting phase heads will be those that are totally empty, since they do not require cyclic Transfer. This makes a prediction: Only partially defective phases will require a given DP to stop in their specifier position before receiving structural Case. Totally defective phase heads will not require cyclic Transfer, and their interior will be visible to outside probes.

With this technical background in mind, Richards concentrates on the GB empirical observation that strong agreement is found in Spec-Head configurations (Kayne 1989 and related literature). This can be seen with the following French data, where trois pommes (Eng. three apples) agrees in number with the participle mangées (Eng. eaten) when it is in [Spec, TP], a position reached after successive cyclic movement through [Spec, vP]: (40)

a. Jean a mangé / *mangées trois pommes (French)
   John has eaten.M.SG / eaten.F.PL three apples
b. Il a été mangé / *mangées trois pommes (French)
   it has been eaten.M.SG / eaten.F.PL three apples
c. Trois pommes ont été *mangé / mangées (French)
   three apples have been eaten.M.SG / eaten.F.PL

Since Chomsky’s Probe-Goal system cannot capture the connection between movement and agreement (there is no Spec-Head agreement), Richards suggests that data like (40c) are a consequence of the French participle being partially defective, so that trois pommes is forced to move to its specifier, as (41) shows (irrelevant structure and details are omitted). (41)

a. [vPrt ti Prt[uNumber][uGender] [VP V trois pommes ]] ]
b. [vPrt trois pommes Prt[Number:pl][Gender:fem]
c. [CP Cφ [TP trois pommes T[vPrt ti Prt[Number:pl][Gender:fem] ]]]

The interaction between Transfer and defectiveness is also inspected in Julie Legate’s paper. While sticking to Chomsky’s φ-feature-centered view of phases, this author explores the possibility that there can be transitive vPs with both external and internal arguments, but without φ-features. In practice, this situation (which is not discussed in Chomsky’s system, putting aside DAT-NOM vPs) gives us a totally φ-defective transitive vP, which, as Legate shows, is found in the object voice construction of Austronesian languages. As Legate argues, this construction is characterized by the external argument being first-merged in its theta-position, while the EPP feature of T is satisfied by the internal argument, which unexpectedly undergoes A-movement. Consider object voice in Indonesian and Acehnese, which contrasts with an English-style passive voice in which the external argument appears in an optional PP. (42)

a. Buku itu dia baca (Indonesian)
   book that he/she read
   The book, (s)he read
b. Ibrahim ka dokto peu-ubat (Acehnese)
   Ibrahim perf doctor cause-medicine
   Ibrahim was treated by the doctor

Legate analyzes this construction as depicted in (43), noting that object raising to [Spec, TP] suggests that these transitive vPs must not be phases. If this transitive vP were a phase (akin to Chomsky’s v*P), the internal argument in the VP complement would be transferred before C’s φ-feature could value its features and attract it to the EPP position. (43)

The rice Pat ate (meaning “Pat ate the rice”)

[TP [DP the rice] T [vP [DP Pat] v [VP [V ate] [DP t (the rice)]]]]

Miki Obata invites us to entertain the idea (suggested, in a somewhat different form, in Chomsky’s 1995 Attract F) that the features that lexical items contain can be split into two parts: A φ-feature bag (the A-part) and a remnant bag (the formal features, or A-bar part). Obata’s feature-split proposal has a series of interesting consequences, which become more salient


in phase-by-phase derivations. Assuming Chomsky’s (2008) φ-feature inheritance, Obata suggests that the φ-features of C are responsible for attracting the φ-feature component of the external argument (in bona fide transitive v*P), whereas C’s EFs match the featural remnant. This process can be seen in (44) in the case of a simple wh-question (the crucial stage is step 4, where the LI who splits in order to give rise to two parallel non-trivial chains, as first suggested in Chomsky 2008). (44)

Who left?
Step 1: [v*P who [VP left]]
Step 2: [CP C [TP Tφ [v*P who …]]]
Step 3: [CP C [TP Tφ [v*P who[φ][Case][Q] …]]]
Step 4: Feature Splitting Internal Merge
[CP who[Q] C [TP who[φ][Case] Tφ . . . [v*P who[φ][Case][Q] …]]]

Obata proposes a weak (phase-bounded) violation of the Lexical Integrity Hypothesis that raises different questions. As this author argues, an interesting consequence of feature-splitting concerns improper movement, which receives a straightforward explanation: Since the high copy of the A-bar chain contains no φ-features, and this is the only element visible to operations outside the phase, it cannot be the Goal of a φ-Probe. Consider, to see this, (45): (45)

*Who seems that left?

Obata’s feature-splitting predicts that who gives rise to two chains: an A-chain moving who’s φ-features to [Spec, TP] and an A-bar chain moving who’s formal features to [Spec, CP]. After cyclic Transfer applies, what is visible outside of the phase is the A-bar part of who, which is devoid of φ-features. Consequently, if a φ-Probe tries to match who, the derivation will crash, for valuation will be impossible: (46)

a. [v*P who v* [VP left ]]
b. [CP who[Q] C [TP Tφ [v*P v* [VP left ]]]]
c. [CP who[Q] C
d. [CP C [TP Tφ . . . [CP who[Q] C . . . ] ]
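The logic of (44) to (46) can be illustrated with a small sketch. It assumes, purely for exposition, that a lexical item is a set of feature names, that the A-part consists of φ and Case, and that after Transfer only the copy at the phase edge remains visible; `A_FEATURES`, `split` and `phi_probe_matches` are hypothetical names.

```python
A_FEATURES = frozenset({"phi", "Case"})


def split(features):
    """Feature-splitting Internal Merge: return (A-part, A-bar part)."""
    feats = frozenset(features)
    return feats & A_FEATURES, feats - A_FEATURES


def phi_probe_matches(visible_features):
    """A phi-Probe finds a Goal only if phi-features are visible on it."""
    return "phi" in visible_features


who = frozenset({"phi", "Case", "Q"})
a_part, abar_part = split(who)      # copies for [Spec, TP] and [Spec, CP]

# After cyclic Transfer of the embedded clause, only the edge copy
# (the A-bar part) is visible to a higher probe, as in (46d).
visible_after_transfer = abar_part
improper_movement_possible = phi_probe_matches(visible_after_transfer)
```

Since the visible copy lacks φ-features, the higher φ-Probe finds no Goal, which is how the text derives the ill-formedness of (45).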

Obata’s work is relevant for the definition and study of A and A-bar positions, and for cross-linguistic phenomena concerning the interaction of feature-splitting and movement (hyperraising, intervention effects, etc.). The volume ends with Ian G. Roberts’s contribution. Following previous work of his own, Roberts studies the nature of second-position (P2) effects within the CP field of Southern and Western Slavonic languages (see (47)), arguing that they are attributable to the nature of C as a phase head. (47)

a. Dao ga je Mariji (Croatian)
   given it has (he) to-Maria
   He has given it to Maria
b. Taj mi je pjesnik dao autogram (Croatian)
   this to-me has poet given autograph
   This poet has given me an autograph

In Roberts (2010), it is argued that a consequence of phase heads being endowed with φ-features is that they trigger cliticization. Assuming that clitics are defective Goals (DPs without a Case layer, or φPs), Roberts argues that after the Agree dependency between, e.g., v*’s φ-features and the relevant clitic is established, the system cannot distinguish it from a non-trivial chain, so, as usual with non-trivial chains, the high copy is spelled out: (48)

a. [v*P v*φ [VP V DP[φ][Case] ]]   Standard Agree
b. [v*P v*φ [VP V φP ]]   Cliticization

Roberts argues that the P2 effect on clitic clusters in Croatian (and elsewhere in South and West Slavic) requires the clitic to follow either one XP (which can be any type of constituent) or one head. Clitics in these languages move for the same reason clitics do in Romance: because there is a phase head, which contains a bundle of φ-features. However, as Roberts notes, something must be said about the fact that P2 has C (or some related head, e.g., Fin), and not v*, as its locus. To account for this, this author proposes that second-position clitics may escape from the v*P because they are Dmin/max,


rather than φmin/max. Given this size asymmetry, this variety of clitics is distinct from v, and in fact unable to incorporate to it, since v has no D-feature. If Roberts’ approach is on track, the fact that cliticization is crosslinguistically restricted to v* and C (or some related functional head) follows from these heads being endowed with φ-features, which predicts that D, P, and any other candidate for phasehood should not exhibit those effects.

4. Conclusions and open questions

Although minimalism is often thought of as a brand-new tendency, the spirit and motivations that drive it have been around since the origins of generative grammar. The same could be said about the notion of cycle, whose relevance was (like that of many other notions, plausibly as a side-effect of structuralism’s influence) first explored in phonology, from where soon enough it spread to morphology, semantics, and syntax. As I hope the previous pages have made clear, the rationale for having a cyclic organization in the grammar is far from controversial (at least if one entertains the idea that representations have to be transferred to the interface components), but the specific formulation of the cycles within minimalism certainly is. In the framework of phases outlined in Chomsky (2000 et seq.), cycles/phases are defined roughly by the following criteria: (49)

Characterization of phases
a. Phases are as small as possible
b. Phases have some reasonable interpretation (at least) at the C-I level
c. Phases are identifiable in a natural and straightforward way

Notice that the first property ascribed to phases would be consistent with a strict compositionality scenario whereby every syntactic rule (every application of Merge) would be paired with a corresponding semantic or phonological rule. Chomsky has nonetheless argued for a more representational option in which the relevant mappings procrastinate, giving rise to apparent mismatches between syntactic structures on the one hand and the phonological and semantic interpretations on the other. Such a tension is in fact increased by Chomsky’s claim that phases “should have a natural characterization in terms of [Interface Conditions]: they should be semantically and phonologically coherent and independent” (Chomsky 2004:124), which has led much literature to focus on (49b), hence identifying phases on the basis of interface effects. As a consequence, phases have been tracked down by looking at phenomena such as movement, reconstruction, phonological phrasing, linearization, binding, and so on (where the ‘so on’ part is vast; see Gallego 2010:ch.2, and Richards 2004:65 and ff. for ample discussion), a line of action that was motivated by Chomsky’s early emphasis on the propositional nature of phases. This view must certainly be taken into account (cyclic mappings must give rise to objects that the morpho-phonology and the semantics can read and manipulate), but it cannot be the leading criterion to define phasehood: if it were, then our chances of deriving the cyclic organization of the system would be reduced, not only because of the diversity of phenomena associated with the interfaces, but also because of our limited understanding of these border zones.

At the same time, it is fair to say that the campaign to define phases in terms of “computational efficiency” has failed as well or, to put it in less dramatic terms, it has not provided us with a good understanding of the relevant metrics yet. Chomsky (2000) first invoked phases to reduce complexity (restricting access to the lexicon through LAs), but sensible and appealing as this hypothesis may be, it is not clear how it is to be worked out in practice (see fn. 2). This seems to leave us in an uncomfortable situation, but I think there are reasons for optimism. Part of this optimism is due to the fact that the arguments to regard CP and v*P as cyclic objects are robust. Since C and v* are the loci of uninterpretable φ-features, Chomsky has endorsed the idea that such a design trait, rooted in Vergnaud’s theory of abstract Case, should be understood as a sign of perfection: Uninterpretable φ-features exist precisely to determine the phases.
This raises questions about other cyclic domains, most notably DPs (Bošković 2010, Svenonius 2004, Marantz 2001, and others), and, more generally, it casts doubt on the possibility of treating all the cycles in a unitary fashion (the ultimate goal of Chomsky’s 2000:131 Single Cycle Syntax). But the evidence that has been offered in the literature in order to unify all the cycles seems to be a bit contrived, and in fact contributes to the miscellaneous flavor of Phase Theory. That there is no unification of cycles (consider, despite Marantz’s 2001 observations, the non-trivial differences between morpho-phonology and syntax-semantics in the context of how the PIC operates) is, as a matter of fact, a quite reasonable working hypothesis, consistent with Chomsky’s (2007:15) suggestion that “language may be optimized relative to the CI interface, with mapping to SM an ancillary procedure”. Perhaps, then, phases should be restricted to capture the basic intuition behind the strict cycle, with Φ Transfer (Spell-Out) being parametrized and Σ Transfer (what Roger Martin called Interpret) being uniform, pretty much as in Chomsky’s (1993, 1995) model. This brings old, but still very relevant, questions to the


fore. One such question, e.g., is whether the mappings should apply to the same amount of structure. A natural alternative to the current conception of the PIC, suggested in passing by Chomsky (2007), is that Σ Transfer targets phases in full (say, v*P, not just VP), while Φ Transfer only targets the complement domain (basically, the phase minus its edge, which typically contains a complex specifier that creates an independent φ-phrase). These and many more questions concerning phases (about feature inheritance, (long distance) agreement, (successive cyclic) movement, stress assignment, grammatical categories, language variation, cliticization, phonological phrasing, complexity of dynamic systems, etc.) are on the minimalist agenda, and they are nothing but means to understand the FL and, if the SMT is correct, to what extent its nature can be deduced from considerations of computational efficiency and interface conditions.

Notes

*

1. 2.

For comments on previous versions of this paper I would like to thank Roberta D’Alessandro, Ignacio Bosque, Noam Chomsky, Luis Eguren, Antonio Fábregas, Adriana Fasanella, Olga Fernández-Soriano, Luis López, Dennis Ott, Henk van Riemsdijk, Ian G. Roberts, and Juan Uriagereka. Errors are mine. This research has been partially supported by grants from the Ministerio de Educación y Ciencia-FEDER (HUM2006-13295-C02-02), from the Generalitat de Catalunya (2009SGR-1079), and from the Ministerio de Ciencia e Innovación (FFI2011-29440-C03-01). An array contains tokens of lexical items. If more than one token of the same type is stored in an array, an LA is called Numeration (see below). For reasons that have not been made explicit, lexical arrays are dispensed with in Chomsky (2005, 2007, 2008). Tracking down phases boils down to locating the phase heads, which is actually the same strategy that was used in lexical array-based formulations of phases, as the following quotes make clear: “LAi can then be selected straightforwardly: LAi contains an occurrence of C or of v” (Chomsky 2000:106), “[A] subarray LAi must be easily identifiable; optimally, it should contain exactly one lexical item that will label the resulting phrase.” (Chomsky 2001:12). Noam Chomsky (p.c.) points out that the main goal of lexical arrays was to distinguish copies from independent choices from the lexicon, which can be done by taking all operations to apply at the phase level: at the stage where v* or C is merged, this head either triggers external Merge (of some element from the lexicon) or internal Merge (of some element within its domain), with no need to invoke lexical arrays (see Chomsky 2008:145 for discussion).

3. 4.

5.


I am putting aside the possibility that phases can be slided or extended. See Gallego (2010) and Den Dikken (2007) on this possibility. A strong interpretation of this view is that economy/efficiency is an inherent property of the system. A weaker (and more plausible, it seems to me) view is that efficiency considerations emerge in a derivative way. To be specific, imagine the computational system has a property X (for which we independently have evidence) and, as a consequence of X, efficiency obtains. This is, if I understand correctly, the route that Chomsky wants to pursue by emphasizing the role of uninterpretable morphology (and was also the rationale that made Uriagereka put forward his Multiple Spell-Out system). Chomsky (2008:143) seems to hint at a new conception of the PIC when he points out that “[i]t may be, then, that PIC holds only for the mappings to the interface, with the effects for narrow syntax automatic”. Chomsky elaborates on this idea through personal communication: My original proposal was that transfer eliminates all information relevant to SM and CI, but that is too strong. Take, say, a direct object complex enough so that there are internal phases, a NP with a relative clause, for example. If it then raises, the relative clause will be spelled out in the surface position, which means it can’t have “disappeared” at the transfer level […] The strict cycle—the computational intuition behind transfer—requires only something weaker: that no later operation applies to the interior of the phase that has been passed: no IM, for example.

6.

7.

This view makes it possible to treat DAT-NOM as the DP receiving nominative from C, assuming that the PIC allows for its φ-Probe to scan the complement of v*. This would be enough to value the unvalued features of C, but at the same time requires the DP to receive Case in a countercyclic way. Such a complication, Noam Chomsky suggests, could be solved if nominative is regarded as inherent Case assigned by V. The passages where Chomsky alludes to DPs as potential phases are revealing. In particular, he points out that “perhaps phase also include DP…” (Chomsky 2005:17; my emphasis) or “[phases include] possibly DP as well, but this raises many questions” (Chomsky 2004:125; my emphasis). Very tentatively, Chomsky (2007:25) speculates that nominal expressions “might sometimes also constitute phases”, depending on their referential nature, which he relates to the presence of D (the locus of the [±definite] property; see Ott 2008 for explorations of this idea). In the case of adpositions, Noam Chomsky (p.c.) is skeptical about there being a natural characterization holding at the C-I level for all PPs, as adpositions establish different types of relations, at least “spatial, temporal, or other,” as Hale & Keyser (1998:77) indicate, and sometimes locative notions are expressed by nominal-like elements (so-called axial parts, in the sense recently discussed by Peter Svenonius). Perhaps Hale’s (1986) ‘(central/terminal) coincidence’ is a common semantic notion that holds for all types of adpositions, but details remain to be filled in.

8.

9.

10.

11.

12.

13.

14.

The same argument should hold for semantic independence, but this brings us back to the murky question of what constitutes a ‘natural’ semantic object. As I see things, this very much depends on one’s ontology. The problem of these accounts is similar to the alternative considered in Chomsky (2000:107), namely that phases are convergent (assuming that convergence is an interface notion). Chomsky argued against this option (explored in Uriagereka 1999a, 1999b) on complexity grounds. As he noted, under a convergence-based version of phases, in an expression like (i), only α (the entire expression) is a phase, as only at α the element is frozen (assuming wh-phrases have a wh-feature to check, as in Rizzi 1997). (i) [α Which article is there some hope [β that John will read twh ]] Different aspects of Chomsky’s proposal remain to be studied in detail. For instance, it is left unexplained how and why φ-features are encoded in some LIs. Also obscure is the idea that uninterpretability is translated by UG as lack of value, as Chomsky (2001:5) argues; this requires some connection between the pre-syntactic lexicon and the C-I systems, putting aside how lack of value is to be implemented (see Boeckx 2010 and Adger 2010). The conclusion is the same in an analysis with nP (not DP), if n is the locus of interpretable φ-features. If n (or D) contained uninterpretable φ-features, like C and v*, then there would be no Goal to value those features (since roots are featureless), which would predict a crash. This leaves open what happens with concord phenomena within the DP, and it also has nothing to say with respect to those approaches where prepositions are φ-Probes (Kayne 2004). Perhaps the presence of abstract φ-features could explain why P (like v) must always have an internal argument, but this requires saying something about particles, and the status of applicative-like morphemes.
See McGinnis (2004) for the possibility that high applicatives are phases (McGinnis’ proposal builds on Pylkännen’s 2008 distinction between low and high applicatives, which has a semantic nature). As suggested in Bosque & Gallego (in progress), this would follow if there is no subextraction from DPs (as Bach & Horn 1976 already argued). A similar example, involving AP (not PP) extraction is provided by Ian Roberts (p.c.); as (i) shows, extraction of how big suggests that it has moved through [Spec, DP] from its DP-internal, post-article position. (i) How big a house did John buy? As Roberts notes, whatever AP-movement is going on in cases like this one is highly local: In (ii), how good cannot be taken to modify boy, but only story. (ii) How good a story about a boy did John tell? Wh-movement from a complex PP is also barred, which would reinforce the thesis that DPs and PPs lack the relevant escape hatches/edges. (i) *[CP Whoi did Obama talk [PP about articles [PP on pictures of ti ]]]? (ii) *[CP Whoi did you agree [PP with the father [PP of ti ]]]? The effect with the examples in (31) could be due to independent semantic reasons: It is not clear what the interpretation of such expression would be. (31)


could therefore be ruled out by saying that [Spec, DP] is not an appropriate scope taking position. Be that as it may, the question still arises why [Spec, CP] and [Spec, vP] are scope taking positions (Fox 2000, Nissenbaum 2000, and references therein), but [Spec, DP] and [Spec, PP] are not. 15. Wh-movement has been taken to occur DP internally at least in the case of possessive determiners, as (i)-(ii) shows (also demonstratives; see Bernstein 1997, Brugè 2002, and references therein). (i) [DP D [NP pictures of {whom/his} ]] (ii) [DP {Whose/His}i D [NP pictures ti ]] Although movement accounts of (ii) are plausible, it is not immediately obvious that direct base-generation of the wh-phrase in [Spec, DP] is to be dismissed. The hypothesis that there is movement in (i)-(ii) is required in order to capture the thematic dependency between the noun pictures and the pronoun whose/his (in the case at hand, a possessive relation); however, if nouns lack argument structure in the syntax (as argued for by Hale & Keyser 1998, Mateu 2002, and Kayne 2011), there is reason to doubt that such a dependency must be established in a configurational fashion. 16. Evidence of reconstruction effects in [Spec, PP] is, to the best of my knowledge, hard to find. Antonio Fábregas and Henk van Riemsdijk (p.c.) suggest that the examples in (i) (taken from May 1985) and (ii) could be relevant: (i) There is a flagi hanging [PP ___ [PP ti in front of every building ] ] (ii) Which of these students’ papersi was every teacher happy [PP ___ about their wording of ti ]? Putting aside the status of (ii), what is still problematic is that the relevant readings could be obtained as long as there is some intermediate position different from [Spec, PP]. In the case of (i), the QR every building could undergo (covert) QR to [Spec, TP], and in (ii), the relevant position of the wh-phrase could be [Spec, vP]. 
For similar problems in determining reconstruction effects in DPs, see Bosque & Gallego (in progress).
17. Chomsky (2011) argues that all the features of phase heads (tense, Q, and more), not just uninterpretable ones, must be inherited. This may follow from the fact that they form a bundle that cannot be disentangled (as in Chomsky’s 1995:262-263), suggesting that inheritance is actually some copying mechanism, not actually a mechanism to remove features from one head to hand them to another.

References

Abels, Klaus 2003 Successive-cyclicity, anti-locality, and adposition stranding. PhD dissertation, University of Connecticut.


Abney, Steven 1987 The English Noun Phrase in Its Sentential Aspects. PhD dissertation, MIT.
Adger, David 2010 A minimalist theory of feature structure. In Features: Perspectives on a Key Notion in Linguistics, A. Kibort and G. Corbett (eds.), 185-218. Oxford: Oxford University Press.
Bach, Emmon and George Horn 1976 Remarks on “Conditions on Transformations”. Linguistic Inquiry 7: 265-361.
Barss, Andrew 1986 Chains and Anaphoric Dependence. PhD dissertation, MIT.
Bernstein, Judy 1997 Demonstratives and reinforcers in Romance and Germanic languages. Lingua 102: 87-113.
Boeckx, Cedric 2010 Defeating Lexicocentrism. Ms., ICREA-UAB.
Boeckx, Cedric and Kleanthes Grohmann 2007 Putting Phases in Perspective. Syntax 10: 204-222.
Bošković, Željko 2010 Phases beyond clauses. Ms., University of Connecticut.
Bosque, Ignacio and Ángel J. Gallego In prog. Against Subextraction. Ms., UCM – UAB.
Brugè, Laura 2002 The positions of Demonstratives in the Extended Nominal Projection. In Functional Structure in DP and IP, G. Cinque (ed.), 2-53. Oxford: Oxford University Press.
Cable, Seth 2007 The grammar of Q. Q-particles and the nature of wh-fronting, as revealed by the wh-questions of Tlingit. PhD dissertation, MIT.
Chomsky, Noam 1970 Remarks on nominalization. In Readings in English transformational grammar, R. Jacobs and P. Rosenbaum (eds.), 184-221. Waltham, MA: Ginn and Co.
1973 Conditions on transformations. In A festschrift for Morris Halle, S. Anderson and P. Kiparsky (eds.), 232-286. New York: Holt, Rinehart and Winston.
1977 On wh-movement. In Formal Syntax, P. Culicover et al. (eds.), 71-132. New York: Academic Press.
1986 Barriers. Cambridge, MA: MIT Press.
1991 Some notes on economy of derivation and representation. In Principles and parameters in comparative grammar, R. Freidin (ed.), 417-454. Cambridge, MA: MIT Press.

1993 A minimalist program for linguistic theory. In K. Hale and S. J. Keyser (eds.), The view from Building 20. Cambridge, MA: MIT Press, 1-52. 1995 Categories and transformations. In The minimalist program, 219-394. Cambridge, MA: MIT Press. 2000 Minimalist inquiries: the framework. In R. Martin et al. (eds.), Step by Step. Cambridge, MA: MIT Press, 89-155. 2001 Derivation by phase. In M. Kenstowicz (ed.), Ken Hale: A Life in Language. Cambridge, MA: MIT Press, 1-50. 2004 Beyond explanatory adequacy. In A. Belletti (ed.), Structures and Beyond. Oxford: Oxford University Press, 104-131. 2005 Three factors in language design. Linguistic Inquiry 36: 1-22. 2007 Approaching UG from below. In Interfaces + recursion = language? Chomsky’s minimalism and the view from syntax-semantics, U. Sauerland and H-M. Gärtner (eds.), 1-30. Berlin: Mouton de Gruyter. 2008 On phases. In Foundational Issues in Linguistic Theory. Essays in Honor of Jean-Roger Vergnaud, C. Otero et al. (eds.), 134-166. 2011 Problems of Projection. Talk given at the University of Leiden. Chomsky, Noam; Morris Halle, and Fred Lukoff 1956 On accent and juncture in English. In For Roman Jakobson, M. Halle et al. (eds.), 65-80. The Hague: Mouton. Chomsky, Noam and Morris Halle 1968 The sound pattern of English. New York: Harper & Row. Cinque, Guglielmo 1980 Extraction from NP in Italian. Journal of Italian Linguistics 5: 47-99. Collins, Chris 1997 Local Economy. Cambridge, MA: MIT Press. Dikken, Marcel den 2006 Relators and linkers. The syntax of predication, predicate inversion, and copulas. Cambridge, MA: MIT Press. 2007 Phase extension. Contours of a theory of the role of head movement in phrasal extraction. Theoretical Linguistics 33: 1-41. Epstein, Samuel and Daniel Seely 2002 Rule applications as cycles in a level-free syntax. In Explanation and Derivation in the Minimalist Program, S. Epstein and D. Seely (eds.), 65-89. Oxford: Blackwell.
Folli, Raffaella and Heidi Harley 2004 Flavors of v: Consuming Results in Italian and English. In Aspectual Inquiries, R. Slabakova and P. Kempchinsky (eds.), 95-120. Dordrecht: Kluwer Fox, Danny 2000 Economy and Semantic Interpretation. Cambridge, MA: MIT Press. Fukui, Naoki 1996 On the Nature of Economy in Language. Cognitive Studies 3: 51-71.


Gallego, Ángel J. 2009 Phases and Variation: Exploring the Second Factor of Language. In Alternatives to Cartography, J. van Craenenbroeck (ed.), 109-152. Berlin: Mouton de Gruyter. 2010 Phase Theory. Amsterdam: John Benjamins. Giorgi, Alessandra and Giuseppe Longobardi 1991 The syntax of noun phrases. Cambridge: Cambridge University Press. Hale, Kenneth 1986 Notes on world view and semantic categories: some Warlpiri examples. In Features and Projections, Studies in Generative Grammar, P. Muysken and H. van Riemsdijk (eds.), 233-254. Dordrecht: Foris. Hale, Kenneth and Samuel J. Keyser 1998 The basic elements of argument structure. In MIT working papers in linguistics 32: Papers from the UPenn/MIT roundtable on argument structure, H. Harley (ed.), 73-118. Cambridge, MA: MIT Press. Hiraiwa, Ken 2005 Dimensions of symmetry in syntax: Agreement and clausal architecture. PhD dissertation, MIT. Johnson, David and Shalom Lappin 1997 A Critique of the Minimalist Program. Linguistics and Philosophy 20: 273-333. 1999 Local constraints vs. economy. Stanford, CA: CSLI. Kayne, Richard S. 1989 Facets of Romance past participial agreement. In Dialectal variation and the theory of grammar, P. Benincà (ed.), 85-103. Dordrecht: Foris. 2004 Prepositions as probes. In Structures and Beyond, A. Belletti (ed.), 192-212. Oxford: Oxford University Press. 2011 Antisymmetry and the Lexicon. In The Biolinguistic Enterprise, A.M. Di Sciullo and C. Boeckx (eds.). Oxford: Oxford University Press. Legate, Julie A. 2003 Some interface properties of the phase. Linguistic Inquiry 34: 506-516. Lasnik, Howard 2006 Conceptions of the Cycle. In Wh-movement: moving on, L. Cheng and N. Corver (eds.), 197-216. Cambridge, MA: MIT Press. Marantz, Alec 2001 Words. Ms., MIT. 2007 Phases and words. In Phases in the theory of grammar, S. H. Choe (ed.), 191-220. Seoul: Dong In. Mateu, Jaume 2002 Argument structure: Relational construal at the syntax-semantics interface. PhD dissertation, UAB. McGinnis, Martha 2004 Lethal ambiguity. Linguistic Inquiry 35: 47-95.


Moro, Andrea 2000 Dynamic antisymmetry. Cambridge, MA: MIT Press. Müller, Gereon 2010 On Deriving CED Effects from the PIC. Linguistic Inquiry 41: 35-82. Nissenbaum, Jon 2000 Investigations of covert phrase movement. PhD dissertation, MIT. Ott, Dennis 2008 Notes on noun ph(r)ases. Ms., Harvard University. 2011 A Note on Free Relative Clauses in the Theory of Phases. Linguistic Inquiry 42: 183-192. Pylkkänen, Liina 2008 Introducing Arguments. Cambridge, MA: MIT Press. Richards, Marc D. 2004 Object shift and scrambling in North and West Germanic: A case study in symmetrical syntax. PhD dissertation, University of Cambridge. Riemsdijk, Henk van 1978 A case study in syntactic markedness: The binding nature of prepositional phrases. Lisse: The Peter de Ridder Press. Rizzi, Luigi 1997 The fine structure of the left periphery. In Elements of grammar. Handbook in generative syntax, L. Haegeman (ed.), 281-337. Dordrecht: Kluwer. Roberts, Ian G. 2010 Agreement and head movement: Clitics, incorporation and defective goals. Cambridge, MA: MIT Press. Svenonius, Peter 2004 On the edge. In Peripheries: Syntactic edges and their effects, D. Adger et al. (eds.), 259-287. Dordrecht: Kluwer. 2008 Projections of P. In Syntax and Semantics of Spatial P, A. Asbury et al. (eds.), 63-84. Amsterdam: John Benjamins. Torrego, Esther 1985 On empty categories in nominals. Ms., UMass Boston. Uriagereka, Juan 1997 Formal and substantive elegance in the minimalist program. In The role of economy principles in linguistic theory, M. Bierwisch et al. (eds.), 170-204. Berlin: Akademie Verlag. 1998 Rhyme and Reason. Cambridge, MA: MIT Press. 1999a Multiple spell-out. In Working minimalism, N. Hornstein and S. Epstein (eds.), 251-282. Cambridge, MA: MIT Press. 1999b Minimal restrictions on Basque movements. Natural Language and Linguistic Theory 17: 403-444. 2011 Derivational cycles. In The handbook of linguistic minimalism, C. Boeckx (ed.), 239-259. Oxford: Oxford University Press.

Phases beyond explanatory adequacy Cedric Boeckx

1. Putting phases into perspective1

The goal of this exploratory paper2 is to examine the possibility of a new theoretical role for phases, one that is quite distinct from the traditional uses of the syntactic cycle, and one that I argue has a better chance of taking us beyond explanatory adequacy. In a nutshell, phases (equivalently, cyclic spell-out) will be argued to be only indirectly related to locality (much more indirectly than standard theoretical elaborations assume), but crucially and directly involved in the formation of basic grammatical categories, and indeed in the formation of all central grammatical asymmetries. Importantly, this new theoretical role for phases only makes sense if the appeal to syntactic features and attendant principles like Last Resort, so common in minimalist practice, is done away with.

In a recent reflective piece, Jan Koster (Koster 2010) begins as follows:

What follows is born out of dissatisfaction with current Minimalism, the received linguistic paradigm since Chomsky 1995. My concerns are not about Minimalism as a program. On the contrary, I subscribe to the overall goal to construct a theory that makes grammar look as perfect as possible and that relegates as much as it can to “third factor” principles. My dissatisfaction is about how this program is carried out in practice. Others disagree, but my personal feeling is that little theoretical progress has been made since the 1980s. I emphasize theoretical, because empirically speaking the progress has been impressive. One can hardly think of any topic nowadays of which it cannot be said that there is a wealth of literature about it. All of this progress, I claim, is mainly “cartographic” and therefore compatible with pre-minimalist generative grammar and even certain forms of pre-generative structuralism. Part of the theoretical stagnation is due to the fact that some key problems of earlier versions of generative grammar, as they arose for instance in the GB-period, are either unresolved or ignored. But there are deeper problems, it seems, that involve the very foundations of the field.

I fully subscribe to Koster’s assessment,3 and find the lack of theoretical progress very worrisome indeed in a framework whose goal is to go “beyond explanatory adequacy.” In my view, nowhere has the lack of theoretical progress been as clear as in areas where the notion of ‘phase’ has been appealed to in recent years. I expressed my dissatisfaction with ‘phases’ in Boeckx & Grohmann (2007). Like Koster, Grohmann and I made it clear right from the beginning of the article that “we could not agree more with the general vision and virtually all the arguments made by Chomsky over the years regarding the motivations behind the Minimalist Program” (p. 204). We even accepted the general intuition expressed by Chomsky over the years that phases play a role in reducing computational load. Our concern, similar to Koster’s, stemmed from how this intuition was being cashed out in practice. To us, it seemed (and it still seems to me) that “virtually all the properties ascribed to phases in the current literature have been recycled from the very first theoretical attempt to make sense of such phenomena as islands or successive cyclicity (Chomsky 1973)” (p. 205). We immediately pointed out that “[i]n and of itself, the fact that phases have theoretical antecedents is not a bad thing”. The problem was (as Koster’s remarks emphasize) at the theoretical level, the level where attempts to go beyond explanatory adequacy are evaluated. Grohmann and I noted (what is obvious to everybody) that “phases are to minimalism what bounding nodes and barriers were to the Extended Standard Theory and Government-and-Binding Theory, respectively” (p. 205). Our overall assessment was that “Like bounding nodes and barriers, phases beg questions that lead to persistent problems. Accordingly, phases do not enhance our understanding of syntactic phenomena like locality; they simply recode insights from the past” (p. 205).
I will not repeat all the arguments that Grohmann and I adduced in support of our assessment; nor will I try to provide new illustrations of our (and Koster’s) negative evaluation, despite the fact that some of them (e.g., a close comparison of Chomsky 1986 and Müller 2010) would be very revealing indeed.4 Instead I will explore a very different way of putting phases to theoretical use, thereby attempting to make the notion of ‘phase’ (and its properties) follow from virtual conceptual necessity, as minimalism demands.

2. What phases do(n’t do)

A quick survey of the literature reveals that phases, much like bounding nodes in the 1970s, have mainly been used to capture two ‘big’ facts about human language syntax: successive cyclic movement and island/subjacency effects.5 These are in fact the joint rationale for phases offered in textbooks on minimalist syntax (see, e.g., Adger 2003, ch. 10).6


Phases, according to Chomsky (2000), impose a “Phase Impenetrability Condition” on syntactic derivations, according to which at a given point (around which there is some debate; contrast Chomsky 2000 vs. Chomsky 2001) elements inside the complement of the phase head become inaccessible for further computation. To remain active, elements from inside the phase complement have the option to move to the edge of the phase. At this point we are in familiar territory: the Phase Impenetrability Condition ensures that long-distance dependencies will have to be formed via successive cyclic movement (“Comp-to-Comp” or “phase-edge-to-phase-edge”). At the same time, the Phase Impenetrability Condition offers the possibility of viewing the trapping effects of islands as instances where the escape hatch, the edge of the phase, becomes (for some reason or other) inaccessible to a given element. The trouble is (and, come to think of it, has always been) that it is far from easy to come up with a good reason why the edge of the phase becomes inaccessible. One can certainly code it featurally (witness Müller 2010), but it should be obvious that imposing conditions on edge accessibility is simply a way of getting the facts,7 not a way of going beyond explanatory adequacy. The move can hardly be characterized as ‘minimalist’. The fact that this way of getting the data has been used for close to 40 years indicates both how theoretically conservative linguists have been, and perhaps also how hard it is to come up with some alternative. As for the idea that phases capture successive cyclic movement, let me briefly touch on an issue that was first brought up in Abels (2003), and has since then been taken up in Boeckx (2008b), and Abels & Bentzen (2009).
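The mechanics of the Phase Impenetrability Condition as summarized above can be made concrete with a small sketch. It is only a toy model under simplifying assumptions: the class name `Phase`, its methods, and the flat list representation of the complement are hypothetical conveniences, not part of any formulation in the literature cited here.

```python
# Toy model of the Phase Impenetrability Condition (illustrative assumptions only).
class Phase:
    def __init__(self, head, complement):
        self.head = head
        self.edge = []                      # the "escape hatch" of the phase
        self.complement = list(complement)  # domain of the phase head
        self.transferred = False

    def move_to_edge(self, item):
        """Move an item from the complement to the edge (only before transfer)."""
        if self.transferred:
            raise ValueError("complement already transferred; item is inaccessible")
        self.complement.remove(item)        # raises ValueError if item is absent
        self.edge.append(item)

    def transfer(self):
        """Spell out the complement, freezing it for further computation."""
        self.transferred = True

    def accessible(self):
        """Elements still visible to operations outside the phase."""
        visible = [self.head] + self.edge
        if not self.transferred:
            visible += self.complement
        return visible


# A wh-phrase that reaches the edge before transfer stays accessible;
# everything left in the complement does not.
cp = Phase("C", ["what", "T", "bought"])
cp.move_to_edge("what")
cp.transfer()
print(cp.accessible())
```

On this toy model, an island would correspond to a phase whose `move_to_edge` step is unavailable to a given element, which is exactly the stipulation the text goes on to question.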
Abels contrasts two ways of conceiving of successive cyclic movement: a classic way, according to which an element moves only through some well-designated intermediate landing sites (forming “punctuated paths”), and an alternative way that takes successive cyclic movement to proceed through all the projections separating the moving element’s projection of origin from its final landing site (forming “(quasi) uniform paths”).8 To the extent that not all phrases are phases (an issue I return to below), phases appear to favor the view that paths are punctuated, which Abels & Bentzen (2009) argue is descriptively more adequate. Unfortunately, even if the evidence is taken at face value, it does not tell us whether movement proceeded in a punctuated or (quasi-)uniform manner, since the evidence is only indirect, dealing as it does with interface effects such as reconstruction. Such evidence merely points to the fact that syntactic derivations interface with the external systems in a punctuated fashion. It does not, indeed cannot, indicate the path taken by a moving element. In other words, such evidence is evidence for cyclic transfer (and concurrent interpretation),


not for how chains are formed. Put another way, it is evidence that spell-out is cyclic/punctuated, not that paths are.9 Accordingly, it is false to claim (as the textbooks do) that phases enforce successive cyclic movement. Phases, understood as cyclic (punctuated) spell-out, may provide a good way to capture the interface reflexes of successive cyclic movement, but they say nothing about the process of chain formation. The previous discussion begins to touch on a central issue in the minimalist program, that of Last Resort, and the role of economy conditions. The reason that phases are often said to motivate successive cyclic movement is because it has been assumed since Chomsky (1993) that movement only happens for a reason: be it a morpho-syntactic reason (“to check a feature”), or an interpretive reason (“to have an effect on outcome”; cf. Fox 2000, Reinhart 2006), or to avoid the trapping effects of the Phase Impenetrability Condition. But if movement is conceived as internal merge (as proposed in Chomsky (2004)), and internal merge is really just merge (‘internal’ being devoid of theoretical import; simply a descriptive term), then the question really should be whether merge ought to be subject to Last Resort. Here opinions differ. Subjecting merge to Last Resort opens the door to a whole range of unnatural syntacticization/lexicalization/featuralization of properties of the external systems—properties that ought to be viewed as the result of syntactic structures (configurations), not the cause (driving force) of syntactic structure formation. In its simplest formulation, Merge should just be a recursive set-forming operation, capable of combining any two lexical items. 
The simplest, most concise (i.e., lawful)10 way of achieving this result is to appeal to—in fact, define lexical items in terms of—an edge ‘feature’,11 which Chomsky defines as follows:12 For a L[exical] I[tem] to be able to enter into a computation, merging with some [syntactic object], it must have some property permitting this operation. A property of an LI is called a feature, so an LI has a feature that permits it to be merged. Call this the edge-feature (EF) of the LI. [from Chomsky 2008:139]
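The conception of Merge as a bare, recursive set-forming operation can be sketched in a few lines. This is a minimal illustration, assuming that lexical items can be modelled as plain strings and syntactic objects as immutable sets; the function names `merge` and `depth` are hypothetical and chosen only for this example.

```python
def merge(a, b):
    """Unrestricted Merge: return the unordered set {a, b}.

    No feature is checked and no selectional restriction is consulted;
    being mergeable is the only property the inputs need (the edge feature).
    """
    return frozenset((a, b))


def depth(so):
    """Depth of embedding of a syntactic object (0 for a lexical item)."""
    if isinstance(so, str):
        return 0
    return 1 + max(depth(member) for member in so)


# Merge applies to lexical items and, recursively, to its own output.
inner = merge("kiss", "Mary")
outer = merge("John", inner)
print(depth(outer))
```

Because the output is an unordered set, `merge("kiss", "Mary")` and `merge("Mary", "kiss")` are the same object: nothing in the operation itself introduces an asymmetry between the two members.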

In such an unrestricted Merge approach (an approach I like to call “Merge α”, as it harks back to some important characteristics of the “Move/Affect α” framework of early GB), intermediate landing sites for ‘movement’ can be formed in the absence of phases, because every lexical item, not only phase heads, has an edge property enabling further merge, and no condition on phase edge accessibility can be imposed.13 Accordingly, the motivation behind the two main roles of phases within narrow syntax (enabling successive cyclic movement and forming islands for movement) turns out to be theoretically very weak indeed. It is thus quite reasonable to wonder if in such a Merge-α framework, phases ought to be retained. In what follows I will argue that phases must in fact exist. More than that, I will try to indicate that not only should they exist, they receive a much better motivation under Merge α.14 Indeed, once the Last Resort nature of movement/merge is done away with, the existence of phases approximates virtual conceptual necessity, a conclusion that minimalist syntacticians should welcome, even if it means abandoning some of their most cherished assumptions about how narrow syntax works, and what sort of locality conditions it imposes. I take this to be a sign of theoretical progress.

3. Pressing phases into new theoretical service

Dispensing with the Last Resort condition imposed on Merge removes much of the theoretical weight assigned to features within narrow syntax in minimalism. (This is particularly important in the context of phases, which have grown into featurally “all-powerful” heads; see Chomsky 2008, Gallego 2010a.) Suppose now that we strengthen the definition of lexical items in terms of edge feature discussed in the previous section and claim that such a definition exhausts the (featural) content of lexical items as far as syntax is concerned. That is, the atoms of syntactic computations boil down to elements endowed with an edge feature: mergeable elements. (This is another way of maximizing the role of Merge in narrow syntax: all there is to syntax is (elements that can) merge. By claiming that nothing else in lexical items is accessible to the syntax, I am closing the door to an unconstrained use of features, extra operations, etc.) Think of lexical items as concepts whose contents have become opaque by lexicalization. Put another way, think of lexicalization as a process that puts all concepts on a par, making them all accessible in the workspace.
Concepts as different in meaning and adicity as GIVE, FOOD, JOHN, HAMMER, ARRIVE, KISS, once lexicalized, all become what Distributed Morphologists would call √ROOTS. Lexicalization, in this sense, creates a completely homogeneous pre-syntactic lexicon. Of course, such a lexicon is not a good model for language-specific vocabularies, but my claim is that it is quite an appropriate lexicon for a universal (narrow) syntax.15 Lexicalization, as conceived of here, has quite a liberating effect: it allows for a general Merge α model; it lifts all selectional restrictions on Merge. Lexical items, having become like billiard balls entering into random collisions, are now, for all syntactic purposes, completely flat, structureless, incapable of projecting. Feature percolation, projection, checking, and so on all become unformulable. All there is, is Merge. Merge becomes the only source of ((specifically) linguistic) structure. In such a super-syntactocentric model, the reigning notion is symmetry. Saussure’s assertion that everything in language is a system of differences cannot be true of language if the latter is used in the sense of the human language faculty. But we know that it must be true of specific languages, specific final / stable states of the human language faculty. So, how can we get from here to there? How can we get from symmetry to asymmetry, naturally (i.e., in accordance with the minimalist desideratum of virtual conceptual necessity)? How can we get from a homogeneous pre-syntactic linguistic lexicon to a heterogeneous post-syntactic language-specific dictionary, full of differences, distinctions, and idiosyncrasies? Short of ad hoc moves, I can only think of one way of achieving this, viz. letting the dynamics of syntax asymmetrize. That is, letting the syntactic derivations dictate which asymmetries one finds in natural languages. As I pointed out in Boeckx (2009b), approaching asymmetry in this way gives us hope that we may understand why the asymmetries that we find empirically exist. By locating their sources in the syntax (the realm of the lawful), as opposed to locating their sources in the lexicon (the realm of the lawless and the idiosyncratic), we hope to make sense of them, thereby taking a few steps beyond explanatory adequacy.
This is in accordance with Epstein & Seely’s (2006:7) dictum that “if you have not derived it, you have not explained it”.16 This attempt to go beyond explanatory adequacy may well fail, but it is important to note that this is virtually the only way we have to try to rationalize properties of the language faculty and its offspring (specific languages). A lexicalist, non-syntactocentric approach merely stipulates properties that are in need of explanation; and approaches pursuing a parallel architecture, such as the approach pursued by Ray Jackendoff (1997, 2002), multiply sources of generativity in a way that is both offensive to Occam, and biologically rather implausible in light of the short evolutionary window of time during which the human language faculty emerged, a fact that calls for a minimally specified Faculty of Language in the Narrow sense (cf. Hauser, Chomsky, & Fitch 2002). For us now, the key question is how to let the syntax generate the asymmetries that we need. This immediately raises the question of how many (kinds of) asymmetries we need, of course. It is here that the immense cartographic progress alluded to by Koster in the opening section of this paper can have a negative influence on theoretical progress. This is the familiar tension between descriptive and explanatory adequacy, now taken to the next level: the tension between explanatory adequacy and what lies beyond it (what some have called natural, biological, or evolutionary adequacy; cf. Boeckx & Uriagereka 2007, Narita 2010, Fujita 2007, 2009, Longobardi 2003). It clearly cannot be all the differences that the cartographic approaches posit, because we would be back to coding the differences as features (tellingly, recent cartographic/nanosyntactic approaches take a maximally decompositional approach to feature bundles/categories, and claim that single features project; cf. Kayne 2005, Starke 2010). Moreover, empirically, a few cartography-enthusiasts have begun to recognize the need for larger categories and domains (see Shlonsky (2006, To appear)). Failing to recognize these supercategories would miss important generalizations that suggest that there are indeed very few kinds of distinct elements in syntax (see Boeckx 2008a, ch.4 on fractality in language). Very interestingly, the same question arises in the context of phases: how many (kinds) are there? Building on Boeckx (2009b), I would like to suggest that all asymmetries (i.e., categories, a.k.a. labels/instructions for specialized interpretations) reduce to points of transfer that correspond to phase boundaries.17 In addition, I will claim here that the mini-max number of phase-types is two. (I emphasize phase-type, and not phase-token, as there will be many more than two phase-tokens in many derivations (even mono-clausal ones). In fact, the model advocated here predicts that the number of phase-tokens grows as the derivation grows, which I take to be a natural consequence of the view that phases play a role in reducing computational load.)
The logic behind locating, or, more accurately, anchoring all asymmetries at the phase level is that cyclic transfer (i.e., phase) by definition creates an asymmetry between what is transferred and what is not. Crucially, for the reasoning to go through, two things must be assumed. First, transfer must indeed be cyclic. If it were to take place only once, the giant Merge-set constructed in the syntax would be completely homogenous. Second, it must be the case that not everything built in the syntax gets transferred at the point of spell-out, otherwise, no asymmetry will be established, and moreover no hierarchy would be built, as syntax would always have to restart from scratch.18 Accordingly, phases must be decomposed into two complementary domains: a phase-complement (which gets transferred) and a phase-edge (what remains of the phase after transfer). Notice that none of this has to be built into lexical instructions/features: failure to proceed along these lines will be filtered out at the interfaces.


The next question is: how much gets transferred, and how often? As we will see, this will directly relate to the question of how many (kinds of) phases there are. The most desirable answer, given efficiency considerations internal to narrow syntax, is that as much as possible gets transferred upon spell-out. Only the bare minimum should remain to allow for the derivation to keep going. In technical language: minimize the size of the phase edge, and maximize the size of the phase complement. At the same time (a familiar tension in the context of optimization), maximizing the size of the complement means delaying the timing of transfer. Early phase edge formation (i.e., minimizing the size of the phase-complement) should thus be highly valued. On the face of it, the optimal compromise appears to be to spell out one of the members of each merge-set: Merge (α, β): {α, β}, and transfer (say) β, leaving α as the edge. This is certainly one possibility, but there are at least two reasons to allow for another possibility: (i) we certainly want to allow for a complex edge (say, head and specifier) to be formed, and later spelled out (at the next phase level), which means that we have to allow for a slight expansion of the complement domain of a phase, one that allows for minimally two members: {γ, {α, β}}. (ii) As it turns out, this slight expansion of the complement domain of a phase is also necessary to deal with agreement / valuation phenomena: the element controlling the agreement and the element functioning as the target must be part of the same transfer domain.19 The phase complement (i.e., transfer domain) must thus be allowed to consist of (minimally) two members. We are thus left with two options, or kinds of phases (more accurately, two amounts of transferred material): one where the phase complement consists of one element only, and the other where it consists of two elements.
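The two transfer options just described can be illustrated with a short sketch. It assumes, purely for exposition, that a phase can be modelled as an edge plus a tuple holding the members of its complement; the function name `spell_out` and the dictionary keys are hypothetical.

```python
def spell_out(edge, complement):
    """Transfer the phase complement and keep only the edge in the workspace.

    Only the two complement sizes discussed in the text are allowed:
    one member ({α, β}: transfer β) or two members ({γ, {α, β}}: transfer {α, β}).
    """
    if len(complement) not in (1, 2):
        raise ValueError("a phase complement must have one or two members")
    return {"remains": edge, "transferred": tuple(complement)}


# Option 1: the complement consists of a single element.
small = spell_out("α", ("β",))

# Option 2: the complement consists of two elements, leaving a minimal edge.
large = spell_out("γ", ("α", "β"))

print(small)
print(large)
```

In both cases only the edge survives in the workspace, which is what allows the derivation to continue while still creating an asymmetry between transferred and non-transferred material.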
Let me refer to the first option as the intransitive phase option, and to the other as the transitive phase option.20 I would like to contend that these two options are all that is needed to provide the fundamental categories that syntax needs to construct to meet the empirical demands known to hold of the interfaces. I take it as a given that conceptually speaking the mind is endowed with a basic two-category system, as has been independently identified in non-linguistic cognitive systems such as the visual system, where Pylyshyn (2007) has argued in favor of a WHAT category (a bare demonstrative, roughly ‘this’) and a WHERE category (a bare locative, roughly ‘there’) as perceptual primitives. These I interpret linguistically as a nominal category (n) and an adpositional category (p). Such categories will emerge at Transfer (the point of asymmetry), as follows: In an intransitive phase context ({α, β}), β will be assigned the interpretation N (following distributed morphologists, α will thus count as n). In a transitive phase context ({γ, {α, β}}), it must be the case that one of the two members of the phase complement is the edge of a former phase (i.e., is branching), otherwise the derivation will fail to converge, given the Anti-Identity (identity avoidance) condition holding at the interfaces, which prohibits two adjacent elements of the same kind in the same phase complement (Richards 2002, 2010, van Riemsdijk 2008, Boeckx 2008a). Say that β is that element, and is minimally branching (i.e., β is of the n-type). α will be interpreted as P (and γ as p).

4. Primitive (syntactic) and emergent (post-syntactic) linguistic categories

The reader will no doubt wonder how such a minimal(ist) system can produce the full range of categories that linguists have identified. If all that can be produced are Ns and Ps, where do categories like C, V, v, D, T, etc. come from, to say nothing about the various ‘flavors’ that all these categories take in cartographic approaches (CForce, CFin, PPath, PPlace ...)? The only natural answer to this question is, ‘configurationally’ (i.e., contextually), as the derivation unfolds. Finer-grained categories will be defined according to a schema familiar from phonological rules: p → v / {“T”, { _ }} (“T” is surrounded by quotation marks, as it too will be defined configurationally). This effectively means that many categories will be born alike, but become distinct, i.e., functionally specialized post-Spell-Out (at the point of recombination/vocabulary insertion). They emerge configurationally, much like theta-roles in Hale & Keyser’s (1993, 2002) theta-theory. Although this approach to the fine details of cartography may look unfamiliar, I would like to point out that many of the ideas needed to make it work are already available, and in some cases already quite standard.
Take, for example, Pylkkänen’s (2002, 2008) approach to applied arguments, where the same category (ApplP, a flavor of v/p) becomes specialized depending on its merge-site, hence the existence of low applicatives, high applicatives, super-high applicatives, etc. Likewise, as I will suggest below, the nominal category Person may be regarded as high n (low n corresponding to Gender/Class).21 Similarly, the distinction between PPlace and PPath plausibly reduces to high P and low P. Ditto for Force and Finiteness: high C and low C.22 The more analytic the pattern of category expression, the more specialized each occurrence of the category in question will become. One could in fact speak of descent/transfer with modification, a process not too different from the process of grammaticalization studied by typologists.
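The way categories are said to emerge at Transfer, and then to specialize by context, can be sketched as follows. This is a deliberately simplified illustration: the function names `categorize` and `specialize`, the tuple encoding of phase complements, and the lookup table are assumptions made for this example, and only the single contextual rule stated in the text (p becoming v under “T”) is encoded.

```python
def categorize(edge, complement):
    """Assign the primitive categories that emerge at Transfer."""
    if len(complement) == 1:
        # Intransitive phase {α, β}: β is interpreted as N, the edge α as n.
        (beta,) = complement
        return {edge: "n", beta: "N"}
    if len(complement) == 2:
        # Transitive phase {γ, {α, β}}: β is assumed to be the branching member
        # (the edge of an earlier n phase); α is interpreted as P, the edge γ as p.
        alpha, _beta = complement
        return {edge: "p", alpha: "P"}
    raise ValueError("a phase complement must have one or two members")


# Hypothetical rule table: (primitive category, context) -> emergent category.
CONTEXT_RULES = {("p", "T"): "v"}   # p → v / {"T", { _ }}


def specialize(category, context=None):
    """Return the emergent category in a given context, or the primitive one."""
    return CONTEXT_RULES.get((category, context), category)


print(categorize("α", ("β",)))
print(categorize("γ", ("α", "β")))
print(specialize("p", "T"))
```

Further distinctions of the kind mentioned in the text (high versus low n, P, or C) would, on this sketch, simply be additional entries in the rule table, keyed to the merge-site of the category.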


It stands to reason that the richer the language is morphologically, the more salient these functional specializations of category occurrences will become. In morphologically poor languages, the inventory of categories may reduce to the most basic one: N and P. A two-category distinction is all that is needed to bootstrap the grammaticalization process, as Heine & Kuteva (2007) show (although they argue that V is more primitive than P). The N/P distinction is also very reminiscent of Mateu’s (2005) distinction between relational and non-relational categories23 (which Mateu shows is enough to capture all the necessary distinctions in Hale and Keyser’s 1993, 2002 theta-theory). Finally, as I have argued in Boeckx (2009c), the two phase-kinds under discussion are both necessary and sufficient to construct the Neo-Davidsonian event representations that I take to form the basis of an adequate semantic theory along the lines pursued by Pietroski (2005): (1)

∃e [ . . . & șext (x, e) & . . . & șint (y, e) & . . . & ad (w, e) ]

It should indeed be obvious that each occurrence of the event argument (e) in (1) is introduced by an intransitive or a transitive predicate, each of which, I argue in Boeckx (2009c), corresponds to a phase boundary (with ‘&’ being the interpretive correlate of Merge). Still, the reader may well wonder if it is really the case that all attested categories (indeed, UG-possible categories) can reduce to just N and P (i.e., elements embedded under an intransitive/n phase or under a transitive/p phase). Here again, the impressive weight of cartographic data can be overwhelming. But how different, really, is V from P? Would we be able to know for sure if we forced ourselves not to rely on morphological Tense-marking (one of the criteria used by Svenonius 2007)?24 Isn’t it otherwise surprising that event/aspect structure and adpositional structures match so closely (see Ramchand 2008, Tungseth 2008)? Are light verbs really all that different from light adpositions?25 In the same vein, the idea that C is either nominal or adpositional is a recurring theme in studies focusing on categories, and would make sense of the morphology of complementizers (see already Emonds 1985). Moreover, if C is really what gives T its identity (much like n labels N), as Chomsky (2007) has argued, isn’t it quite natural to treat C as an element providing a location on a time line?26 Finally, are coordinators (say, and) really all that different from adpositions (say, comitative with)? And are adjectives really all that different from adpositions (isn’t angry simply with anger?; Mateu 2005, Amritavalli & Jayaseelan 2003, Kayne 2011)? These (admittedly, rhetorical) questions point to the fact that the (post-syntactic) morphological component of the human language faculty, like natural selection, constantly tinkers with the spare resources made available to it (by the generator, narrow syntax), recycling27 the same categories and adding morphophonological flavors to them, which have sidetracked linguists into thinking that these emergent classes of categories are primitives.

I would like to close this section on categories by briefly examining the nature of the functional category D, which, if the theory sketched here is correct, must be regarded as an instance of p (transitive phase). The present system takes N/n to be the core nominal category (“which is what we intuitively always wanted to say” Chomsky 2007:25-26). But what is the role of D in this system? Although widely adopted, and deeply implicated in matters such as argumenthood (see Longobardi 1994), the DP-hypothesis has always been problematic for selection and for concord (see Bruening 2008 for a careful survey of problems; see also Fukui & Zushi 2008). The selection problem is particularly salient (although not confined to the nominal domain; see Shlonsky 2006): we want the Verb / Preposition to select N, not D. But if D dominates N, as in the DP-hypothesis, why doesn’t it block selection? Being forced to view D as a special occurrence of P, and taking advantage of the established identity between P and C, I would like to argue that D is in fact a kind of C. Although taking D to be a complementizer is not completely unheard of (cf. Szabolcsi 1984, among many others), I would like to further argue that D is a relative complementizer, and adopt Kayne’s (1994) revival of the raising analysis of relative clauses to allow for N/n to raise from within the DP and reproject an NP/nP layer, rendering selection by V/P straightforward, as represented in (2).28

(2) [nP [DP D . . . ] [ n [√N ] ] ]
    where DP = [DP D . . . [nP n [√N ] ] ]

Note that the derivation in (2) not only solves the selection problem,29 but also allows us to treat concord (controlled by N/n, not by D) as a case of standard agreement, established in a manner similar to agreement at the clausal level. It also captures the oft-noted dependency between D and Person (see, e.g., Longobardi 2006 on DP = PersonP), if Person corresponds (as suggested above) to the higher occurrence of n, the one that reprojects out of the relative clause (i.e., the “person” flavor of n entails the presence of D).30 It is worth noting in the context of the present proposal that the representation in (2) captures the well-established fact (see Heine & Kuteva 2007) that determiners tend to grow out of (grammaticalize out of) demonstratives, which in turn tend to emerge from locatives. If D is C, and C is just a high occurrence of P, then D is ultimately P, a locative.31

Let us take stock. I have argued that the simplest take on Merge (Merge α) virtually forces upon us the need for a way to regulate this unconstrained operation, to ensure proper mapping to the external systems. Lexical features have been the method of choice within minimalism to achieve this. But I do not see how these features, given their high degree of specificity, can contribute to the attempt to go beyond explanatory adequacy. Instead, I have argued that cyclic spell-out (transfer by phase) can create the configurations that not only take over the essential work of features, but offer far more explanatory perspectives, and enhance prospects for biolinguistics, as cyclic spell-out, unlike features, can be related to third factor considerations (see Boeckx 2009a, 2010/In progress). In particular, phases provide, perhaps for the first time, the basis for a non-parochial, non-language-specific mode of categorization whose primitives are rooted in the Faculty of Language in the Broad Sense, and appear to mesh well with findings not only in syntax, but also at the interfaces (especially semantics), including findings in the field of typology (grammaticalization). I contend that a theory of categorization by phase is more explanatory than an appeal to categorial features (as in Chomsky 1970), or criteria like “able to project a specifier”/“carry a referential index” (Baker 2003), definitions that appeal to non-minimalist concepts (specifier, index). The theory proposed here recognizes the existence of very few basic categories, hence is highly constrained. It also provides the means to obviate thorny issues of selection that necessarily arise (without receiving a natural solution) in cartographic studies.
Finally, it turns the old chestnut “Which category is lexically specified as phasal?” into a non-question: in the present framework, there is no such thing as a phase-head listed in the lexicon. Phase heads emerge in the course of the syntactic derivation. It is a dynamic notion, not a feature. Because of the way the derivation unfolds, a Phase–non-Phase32 alternation emerges as the natural (optimal) rhythm within narrow syntax, a natural beat on which one can then try to anchor other, post-syntactic rhythms (prosodic notions, e.g.), as suggested in Samuels (this volume). Note, incidentally, that, if correct, the present approach gives us strong reasons not to take every instance of Merge to be phasal (contra Epstein & Seely 2002, 2006). It is interesting to note that the Phase–non-Phase rhythm (in fact, binary branching) is derived from deeper (interface) considerations (ultimately, the plausible bare output condition of Anti-identity, *[XX]), which prevents one from entertaining the possibility that some languages may vary in this respect (a possibility argued for in Gallego 2008, who claims that some languages allow for an enlarged Phase–non-Phase–non-Phase pattern). This last point illustrates a familiar discovery within minimalism: the better we motivate our syntactic constructs, the more they appear to be invariant, which leads one to the claim that narrow syntax is completely immune to variation (see Boeckx 2011), as one would expect if the pre-syntactic lexicon is featurally as impoverished as I have claimed it is and if parameters really are coded featurally (the hypothesis often attributed to Hagit Borer).33

5. Conclusions

Mark Baker begins his (2003) monograph on lexical categories by pointing out that “it is ironic that the first thing one learns can be the last thing one understands” (p. 1). Basic categories are often where fundamental syntactic theory starts, but also where it ends, as it too quickly appeals to constructs (features) that do not have a distinguished explanatory track record. In this paper I have advocated a minimalist model that seeks to minimize lexical/featural instructions so as to maximize explanation. I think that the overall strategy ought to be regarded as more promising than the standard alternative that puts so much explanatory weight on a component of the grammar (the lexicon) that everyone regards as the repository of idiosyncrasies. If the present attempt proves to be on the right track, it reinforces (indeed, radicalizes) the exoskeletal, syntactocentric, dynamically anti-symmetric approaches that have been explored in recent years (see, among others, Borer 2005, Moro 2000), and provides a new rationale for the existence of phases and cyclic spell-out.

Notes

1. I am grateful to Ángel Gallego for inviting me to contribute to this volume, and for numerous discussions about the very possibility of a theory of phases over the years. For inspiration, inadequately acknowledged in the text, I am indebted to Dennis Ott, Noam Chomsky, Paul Pietroski, Marc Richards, Hiroki Narita, and Bridget Samuels. I am also thankful to Adriana Fasanella and Carlos Rubio for conversations during the writing of this paper, and to two anonymous reviewers. Finally, I thank audiences at the University of Massachusetts at Amherst (Workshop on recursion, May 2009), Kyoto University, the Graduiertenkolleg at the Goethe Universität Frankfurt, the Centre de Lingüística Teòrica at the Universitat Autònoma de Barcelona, the linguistics program at Boğaziçi Üniversitesi, and
the Consejo Superior de Investigaciones Científicas in Madrid for valuable comments. The present work is supported by a Marie Curie International Reintegration Grant from the European Union (PIRG-GA-2009-256413), research funds from the Universitat Autònoma de Barcelona Vicerector for Research, as well as grants from the Spanish Ministry of Science and Innovation (FFI-201020634; PI: Boeckx), and from the Generalitat de Catalunya (Grant 2009SGR1079 to the Centre de Lingüística Teòrica).
2. For reasons of space, the present piece is only an outline. For a more comprehensive treatment, see Boeckx (2010/In progress).
3. Although I disagree with him regarding how to go about solving this dire situation. See Boeckx (2010/In progress).
4. For another revealing comparison, see Boeckx (2007) in the context of den Dikken (2007).
5. See, however, Samuels (this volume) for other, more interface-based uses of phases, linking them to the notion of the “phonological cycle”, the original notion of the cycle in generative grammar.
6. Hornstein et al. (2006, chap. 10) also discuss Chomsky’s original (2000) argument for phases partitioning the numeration into subnumerations. But given that the concept of (sub)numeration has fallen into disrepute since then, it is fair to say that this is no longer a major function of phases in current syntactic theorizing.
7. Here I should perhaps say ‘at best a way of getting the facts’, for I do not think that it even gets the facts when the data base is expanded. For discussion, see Boeckx (2003, 2008a, to appear).
8. Abels notes that the paths are not entirely uniform as movement does not target intermediate projections of the phrases it moves through, for reasons of chain uniformity.
9. I thus still endorse my earlier conclusion (Boeckx 2003, 2008b) that there is no reason to reject the idea that paths are quasi-uniform. In a framework such as the one pursued below, there cannot be any constraint on the way paths are formed.
10. I am here alluding to Murray Gell-Mann’s definition of a natural law as “a compressed description, available beforehand, of the regularities of a phenomenon”. (I owe this particular phrasing to Kauffman 2008:133; Gell-Mann’s view is well expressed in Gell-Mann 1994, e.g., 84).
11. The term ‘feature’, though legitimate, is an unfortunate terminological choice, as it invites completely unnatural theoretical moves (i.e., moves that obstruct the path to “beyond explanatory adequacy”), such as subjecting the edge feature to various conditions of insertion, checking, etc. (see, e.g., Müller 2010).
12. Elsewhere (Chomsky 2007:11n.16), Chomsky correctly notes that the edge feature must be taken to be unerasable, a constant trait of lexical items. Put more precisely, I take it that the edge feature (the trait enabling Merge) is removed upon Transfer, once the element is De-merged (in the sense of Fukui & Takano 1998, i.e., mapped onto a linear string).

13. Likewise, no Anti-locality condition can be imposed on merge (contra Bošković 1994, Abels 2003, among others); its effects, like those of any other locality condition, must be understood by making reference to the properties of the external systems with which syntax interfaces. It is interesting to note in this regard that Grohmann’s original formulation of the Anti-locality condition (Grohmann 2000, 2003) allows for violations of Anti-locality as long as they are tolerated (‘repairable’) at the interfaces, which effectively means that Anti-locality cannot be taken as a narrow syntactic, derivational condition, but must instead be viewed as a representational filter, as I have argued must be true of locality conditions more generally (see Boeckx 2003, 2008a, to appear).
14. Again, for details, see Boeckx (2010/In progress).
15. It is also quite appropriate for a universal semantics, judging from Pietroski’s (2005) argument in favor of an all-predicate-based, one-semantic-type-only model.
16. This dictum could form the basis of a strong version of syntactocentrism, according to which, ‘if you have not constructed it syntactically, you have not explained it’.
17. The present proposal thus differs from the one put forth by Moro (2000), according to which it is movement that asymmetrizes. I take it that Moro’s proposal is undesirable in light of Chomsky’s (2004) argument in favor of collapsing merge and move.
18. Notice that the topmost phase of non-selected/root constituents could be transferred in toto, an option that I explore in Boeckx (2010/In progress), in light of the results achieved by Ott (2011) in the context of free relatives. (For a similar proposal, within a different framework, see Obata 2010.) As I discuss in Boeckx (2010/In progress), in a Merge α framework, application of Transfer should be as free as merge is.
19. There are several issues relating to agreement that I am not discussing here, among others: the need for a Feature-Inheritance mechanism to ensure proper feature occurrence interpretation without invoking the interpretable/uninterpretable distinction (Chomsky 2004, Richards 2007). Also, I am setting aside the question of what such a Feature-Inheritance mechanism would amount to in a framework like the present one, which does not recognize the existence of any syntactic feature other than the edge feature. For discussion, see Boeckx (2010/In progress).
20. Note that, given the need to asymmetrize at the interfaces, these are the only two options that are available. Spelling out more than two elements would necessarily fail to provide an unambiguous category label for all the elements transferred. Hence, this option is filtered out.
21. For relevant material, see Picallo (2006, 2008).
22. For relevant material regarding categories being interpreted as high and low, see Boeckx (2008a, chap. 4).
23. Mateu maintains a distinction between P and V categories, which I do not see theoretical reasons for. I should also point out that the present theory of primitive categories relates in an obvious way to Kayne’s (2011) claim that there are two basic categories: functional/closed-class and lexical/open-class, with the latter consisting of roots, which Kayne takes to be Nouns. I disagree with Kayne regarding the nominal character of roots, and follow work in Distributed Morphology in regarding roots as a-categorial: without syntax, roots are conceptual stuff, non-linguistic elements. Assignment of a given category only makes sense in a system: if all roots are the same, we do not need to assign a category to them, since category implies contrast.
24. See Gallego (2010b), Masullo (2008) on participles and gerunds, respectively; see also Emonds (2008) for relevant discussion.
25. It is interesting to note that in highly analytic, morphologically isolating languages, it is hard, and perhaps irrelevant, to distinguish between serial verbs as V-V compounds and serial verbs as (light) P-V combinations; cf. Aboh (2009).
26. Similarly for Mood, which locates an utterance in a possible world.
27. The idea of morphological recycling is not new; see Longa, Lorenzo, & Rigau (1996, 1998). I would like to add that the tinkering character of morphology may have interesting implications for phylogenetic studies of the human language faculty. See Boeckx (2009d) for preliminary discussion of this intriguing theme.
28. For interesting material bearing on the present proposal, see Koopman (2005) on Noun-formation via relativization in Maasai; see also Arsenijević (2009), where the role of C as a relativizer is extended considerably.
29. It is tempting to solve other instances of the selection problem, such as the one that arises at the left periphery noted in Shlonsky (2006) (why don’t TopicP/FocusP block selection of FinitenessP by the higher V?), in the same manner: take the ForceP/FinitenessP split to be an instance of reprojection, with TopicP/FocusP forming a relative clause in the same way that DP does. On reprojection caused by (contrastive) Focus (Topic/Focus pair), see Irurtzun (2007).
30. It is also tempting in this regard to relate this dependency of Person on D to the presence of a P-element (such as Spanish a) in the context of [+person/animacy] objects.
31. Leu (2008) has recently formalized this link between demonstratives and locatives by taking D(em.) to contain a locative element (which Leu takes to bottom out into an adjective). Leu’s representation is reproduced here:
    (i) [D the HERE man]
    Leu in fact suggests that phrases like this man contain a two-DP layer (roughly: [[this HERE] THE man], with HERE and THE phonetically null in most, but not in all languages). One could straightforwardly adapt Leu’s proposal in the present framework, with the locative element forming a low occurrence of P/D, and the definite determiner forming a high occurrence of D.
32. To the best of my knowledge, the first work studying this type of alternation is Richards (2011), who however ends up motivating a Non-Phase–Phase pairing that ultimately conflicts with bottom-up derivations (see Boeckx 2010/In progress).
33. It is indeed quite remarkable to see that all too often it is only lack of understanding that leads one to claim that a certain property attributed to the language faculty is parametrizable. It is as if variation were the default.

References

Abels, Klaus
  2003 Successive cyclicity, anti-locality, and adposition stranding. Ph.D. Dissertation, University of Connecticut.
Abels, Klaus and Kristine Bentzen
  2009 Are movement paths punctuated or uniform? Catalan Journal of Linguistics 8: 19-40.
Aboh, Enoch
  2009 Clause structure and verb series. Linguistic Inquiry 40: 1-33.
Adger, David
  2003 Core syntax: a minimalist approach. Oxford: Oxford University Press.
Amritavalli, R., and K.A. Jayaseelan
  2003 The genesis of syntactic categories and parametric variation. In Generative Grammar in a Broader Perspective: Proceedings of the 4th GLOW in Asia, 19-41.
Arsenijević, Boban
  2009 Clausal complementation as relativization. Lingua 119: 39-50.
Baker, Mark C.
  2003 Lexical categories: Verbs, Nouns, and Adjectives. Cambridge: Cambridge University Press.
Boeckx, Cedric
  2003 Islands and chains. Amsterdam: John Benjamins.
  2007 Phases and explanatory adequacy: Contrasting two programs. Theoretical Linguistics 33: 43-48.
  2008a Bare syntax. Oxford: Oxford University Press.
  2008b Understanding Minimalist Syntax: Lessons from Locality in Long-distance Dependencies. Oxford: Blackwell.
  2009a How the language organ self-organizes. A dynamical system approach to linguistic complexity. Presented at the Theoretical Biology Research Group, Institute Cavanilles for Biodiversity and Evolutionary Biology, University of Valencia.
  2009b The locus of asymmetry in UG. Catalan Journal of Linguistics 8: 41-53.
  2009c Some notes on the syntax-thought interface. In Proceedings of the Sophia University Linguistic Society 24, 92-103. Sophia University Linguistic Society.
  2009d When syntax meets the sensorimotor systems, phylogenetically and ontogenetically. Presented at the pre-ConSOLE Workshop on the syntax-morphology-phonology interface. Universitat Autònoma de Barcelona, December 2009.
  2010/In progress Elementary syntactic structures. Ms., ICREA–UAB. [Part A, “Defeating lexiconcentrism” available at http://ling.auf.net/lingBuzz/001130]
  2011 Approaching parameters from below. In The biolinguistic enterprise: New perspectives on the evolution and nature of the human language faculty, A.-M. Di Sciullo and C. Boeckx (eds.), 205-221. Oxford: Oxford University Press.
  to appear Syntactic Islands. Cambridge: Cambridge University Press.
Boeckx, Cedric and Kleanthes K. Grohmann
  2007 Putting phases in perspective. Syntax 10: 204-222.
Boeckx, Cedric and Juan Uriagereka
  2007 Minimalism. In The Oxford Handbook of Linguistic Interfaces, G. Ramchand and C. Reiss (eds.), 541-573. Oxford: Oxford University Press.
Borer, Hagit
  2005 Structuring Sense (2 vols.). Oxford: Oxford University Press.
Bošković, Željko
  1994 D-structure, theta criterion, and movement into theta positions. Linguistic Analysis 24: 247-286.
Bruening, Benjamin
  2008 Selectional Asymmetries between CP and DP Suggest that the DP Hypothesis is Wrong. Ms., University of Delaware.
Chomsky, Noam
  1970 Remarks on nominalization. In Readings in English transformational grammar, R. Jacobs and P. Rosenbaum (eds.), 184-221. Waltham, Mass.: Ginn and Co.
  1973 Conditions on transformations. In A Festschrift for Morris Halle, S. Anderson and P. Kiparsky (eds.), 232-286. New York: Holt, Rinehart and Winston.
  1986 Barriers. Cambridge, Mass.: MIT Press.
  1993 A minimalist program for linguistic theory. In The view from Building 20, K. Hale and S. J. Keyser (eds.), 1-52. Cambridge, Mass.: MIT Press.
  2000 Minimalist inquiries: the framework. In Step by step: Essays on minimalist syntax in honor of Howard Lasnik, R. Martin et al. (eds.), 89-155. Cambridge, Mass.: MIT Press.
  2001 Derivation by phase. In Ken Hale: A Life in Language, M. Kenstowicz (ed.), 1-52. Cambridge, Mass.: MIT Press.
  2004 Beyond explanatory adequacy. In Structures and beyond, A. Belletti (ed.), 104-131. New York: Oxford University Press.
  2007 Approaching UG from below. In Interfaces + recursion = language? Chomsky’s minimalism and the view from semantics, U. Sauerland and H.-M. Gärtner (eds.), 1-30. Berlin: Mouton de Gruyter.
  2008 On phases. In Foundational issues in linguistics, R. Freidin et al. (eds.), 133-166. Cambridge, Mass.: MIT Press.
den Dikken, Marcel
  2007 Phase Extension: Contours of a theory of the role of head movement in phrasal extraction. Theoretical Linguistics 33: 1-41.
Emonds, Joseph
  1985 A unified theory of syntactic categories. Dordrecht: Foris.
  2008 Valuing v-features and n-features: what adjuncts tell us about case, agreement, and syntax in general. In Merging Features: Computation, Interpretation, and Acquisition, J.M. Brucart et al. (eds.), 194-214. Oxford: Oxford University Press.
Epstein, Samuel D. and T. Daniel Seely
  2002 Rule applications as cycles in a level-free syntax. In Derivation and explanation in the minimalist program, S. D. Epstein and T. D. Seely (eds.), 65-89. Oxford: Blackwell.
  2006 Derivations in Minimalism. Cambridge: Cambridge University Press.
Fox, Danny
  2000 Economy and semantic interpretation. Cambridge, Mass.: MIT Press.
Fujita, Koji
  2007 Facing the logical problem of language evolution. English Linguistics 24: 78-108.
  2009 A Prospect for Evolutionary Adequacy: Merge and the Evolution and Development of Human Language. Biolinguistics 3: 128-153.
Fukui, Naoki and Yuji Takano
  1998 Symmetry in syntax: Merge and Demerge. Journal of East Asian Linguistics 7: 27-86.
Fukui, Naoki and Mihoko Zushi
  2008 On Certain Differences between Noun Phrases and Clauses. In Essays on nominal determination, H. Høeg Müller and A. Klinge (eds.), 265-286. Amsterdam: John Benjamins.
Gallego, Ángel J.
  2008 Phases and variation. Ms., CLT-UAB.
  2010a Phase Theory. Amsterdam: John Benjamins.
  2010b On the prepositional nature of non-finite verbs. Catalan Journal of Linguistics 9: 81-204.
Gell-Mann, Murray
  1994 The Jaguar and the Quark: adventures in the simple and the complex. London: Little, Brown and Company.

Grohmann, Kleanthes K.
  2000 Prolific peripheries: A radical view from the left. Ph.D. Dissertation, University of Maryland.
  2003 Prolific Domains. On the Anti-Locality of Movement Dependencies. Amsterdam: John Benjamins.
Hale, Kenneth and Samuel J. Keyser
  1993 On argument structure and the lexical expression of grammatical relations. In The view from Building 20, K. Hale and S. J. Keyser (eds.), 53-110. Cambridge, Mass.: MIT Press.
  2002 Prolegomenon to a theory of argument structure. Cambridge, Mass.: MIT Press.
Hauser, Marc D., Noam Chomsky, and W. Tecumseh Fitch
  2002 The Faculty of Language: What is it, who has it, and how did it evolve? Science 298: 1569-1579.
Heine, Bernd and Tania Kuteva
  2007 The genesis of grammar: a reconstruction. Oxford: Oxford University Press.
Hornstein, Norbert, Jairo Nunes, and Kleanthes K. Grohmann
  2006 Understanding minimalism. Cambridge: Cambridge University Press.
Irurtzun, Aritz
  2007 The grammar of focus at the interfaces. Doctoral Dissertation, Euskal Herriko Unibertsitatea.
Jackendoff, Ray
  1997 The architecture of the language faculty. Cambridge, Mass.: MIT Press.
  2002 Foundations of language. Oxford/New York: Oxford University Press.
Kauffman, Stuart A.
  2008 Reinventing the sacred. New York: Basic Books.
Kayne, Richard S.
  1994 The antisymmetry of syntax. Cambridge, Mass.: MIT Press.
  2005 Some notes on comparative syntax, with special reference to English and French. In The Oxford Handbook of Comparative Syntax, G. Cinque and R.S. Kayne (eds.), 3-69. Oxford: Oxford University Press.
  2011 Antisymmetry and the lexicon. In The biolinguistic enterprise: New perspectives on the evolution and nature of the human language faculty, A.-M. Di Sciullo and C. Boeckx (eds.), 329-353. Oxford: Oxford University Press.
Koopman, Hilda
  2005 On the parallelism of DPs and clauses: Evidence from Kisongo Maasai. In Verb First: On the Syntax of Verb-Initial Languages, A. Carnie et al. (eds.), 281-301. Amsterdam: John Benjamins.
Koster, Jan
  2010 Language and tools. Ms., Universiteit Groningen.
Leu, Thomas
  2008 The Internal Syntax of Determiners. Ph.D. Dissertation, NYU.
Longa, Víctor M., Guillermo Lorenzo, and Gemma Rigau
  1996 Expressing Modality by Recycling Clitics. Catalan Working Papers in Linguistics 5: 67-79.
  1998 Subject clitics and clitic recycling: locative sentences in some Iberian Romance languages. Journal of Linguistics 34: 125-164.
Longobardi, Giuseppe
  1994 Reference and proper names: a theory of N-movement in syntax and logical form. Linguistic Inquiry 25: 609-665.
  2003 Methods in parametric linguistics and cognitive history. Linguistic Variation Yearbook 3: 101-138.
  2006 Reference to Individuals, Person, and the Variety of Mapping Parameters. In Essays on nominal determination, H.H. Müller and A. Klinge (eds.), 189-211. Amsterdam: John Benjamins.
Masullo, Pascual J.
  2008 The syntax-lexical semantics interface: prepositionalizing motion verbs in Spanish. Ms., University of Pittsburgh.
Mateu, Jaume
  2005 Impossible Primitives. In The compositionality of meaning and content, M. Werning et al. (eds.), 213-229. Heusenstamm: Ontos Verlag.
Moro, Andrea
  2000 Dynamic antisymmetry. Cambridge, Mass.: MIT Press.
Müller, Gereon
  2010 On deriving CED effects from the PIC. Linguistic Inquiry 41: 35-82.
Narita, Hiroki
  2010 The Tension between Explanatory and Biological Adequacy. A Review of Theoretical Comparative Syntax: Studies in Macroparameters, Naoki Fukui, Routledge, London and New York (2006). Lingua 120: 1313-1323.
Obata, Miki
  2010 Root, Successive-cyclic and Feature-Splitting Internal Merge: Implications for Feature-Inheritance and Transfer. Ph.D. Dissertation, University of Michigan.
Ott, Dennis
  2011 A note on free relative clauses in the theory of phases. Linguistic Inquiry 42: 183-192.
Picallo, M. Carme
  2006 On gender and number. Ms., Universitat Autònoma de Barcelona.
  2008 On Gender and Number in Romance. Lingue e Linguaggio 1: 44-66.
Pylkkänen, Liina
  2002 Introducing arguments. Ph.D. Dissertation, MIT.
  2008 Introducing arguments. Cambridge, Mass.: MIT Press.

Pylyshyn, Zenon W.
  2007 Things and Places: How the Mind Connects with the World. Cambridge, Mass.: MIT Press.
Ramchand, Gillian
  2008 Verb meaning and the lexicon: a first-phase syntax. Cambridge: Cambridge University Press.
Reinhart, Tanya
  2006 Interface strategies: optimal and costly computations. Cambridge, Mass.: MIT Press.
Richards, Marc D.
  2007 On feature inheritance: An argument from the phase impenetrability condition. Linguistic Inquiry 38: 563-572.
  2011 Deriving the edge: What’s in a phase? Syntax 14: 74-95.
Richards, Norvin
  2002 A distinctness condition on linearization. Ms., MIT.
  2010 Uttering trees. Cambridge, Mass.: MIT Press.
van Riemsdijk, Henk
  2008 Identity Avoidance: OCP-Effects in Swiss Relatives. In Foundational issues in linguistics, R. Freidin et al. (eds.), 227-250. Cambridge, Mass.: MIT Press.
Shlonsky, Ur
  2006 Extended projection and CP cartography. Nouveaux cahiers de linguistique française 27: 83-93.
  to appear The cartographic enterprise in syntax. Language and Linguistics Compass.
Starke, Michal
  2010 Nanosyntax: A short primer to a new approach to language. Nordlyd 36.
Svenonius, Peter
  2007 Adpositions, particles and the arguments they introduce. In Argument structure, K.V. Subbarao et al. (eds.), 63-103. Amsterdam: John Benjamins.
Szabolcsi, Anna
  1984 The possessor that ran away from home. The Linguistic Review 3: 89-102.
Tungseth, Mai Ellin
  2008 Verbal Prepositions and Argument Structure: Path, Place and Possession in Norwegian. Amsterdam: John Benjamins.

Phase periodicity*

Juan Uriagereka

1. Antecedents

The purpose of this paper is to discuss the periodicity of phases (to use the term in Boeckx 2008). Such objects were conceived by Chomsky (2000:99 and ff.) as a design feature to avoid computational complexity.1 Derivations were taken to select . . .

. . . [A] lexical array LA from Lex [the lexicon], then map LA to expressions, dispensing with further access to Lex. […] Derivations that map LA to expressions require lexical access only once, and thus reduce operative complexity in a way that might well matter for optimal design.

Then, on page 106, Chomsky suggests that, “taking the derivation more seriously”, LAs should be accessed cyclically, so that computational load is reduced:

[A]t each stage of the derivation a subset LAi is extracted, placed in active memory (the “workspace”), and submitted to the [derivational] procedure […] When LAi is exhausted, the computation may proceed if possible; or it may return to LA and extract LAj […] Operative complexity in some natural sense is reduced, with each stage of the derivation accessing only part of LA.

Assuming that much, let’s concentrate on the situations under which the computation returns to LAs, in order to extract a further portion of this array. As Chomsky notes, this should determine a natural syntactic object (and see fn. 2):

[E]ither a verb phrase in which all theta roles are assigned or a full clause including tense and force. LAi can then be selected straightforwardly: LAi contains an occurrence of C or of v, determining clause or verb phrase […] Take a phase of a derivation to be a syntactic object SO derived in this way by choice of LAi. A phase is CP or vP, but not TP or a verbal phrase ...

The question is then why specifically vP and CP should be the relevant phases. In his 2001 paper, Chomsky emphasizes that: “A subarray LAi must
be easily identifiable; optimally, it should contain exactly one lexical item that will label the resulting phase.” (p. 11). This, however, doesn’t tell us why v and C should determine phases, as opposed to T or V.2 Chomsky (2004:124) suggests that phases are domains that “have an [Extended Projection Principle, EPP] position as an escape hatch for movement and are, therefore, the smallest constructions that qualify for Spell-Out”. In his 2005 paper (p. 17), Chomsky wants phases to “at least include the domains in which uninterpretable features are valued”, a view that, so far as I know, Chomsky still favors. More recently, he asserts that phasal cyclicity has to do with the derivational need to deal with uninterpretable features right away:

Since these features have no semantic interpretation, they must be deleted before they reach the semantic interface for the derivation to converge. They must therefore be deleted either before Transfer [of syntactic material to the interfaces] or as part of Transfer. [from Chomsky 2008:154]

From this perspective, the question is why the v and C projections happen to be the locus of uninterpretable features.3 Several works deal with these matters. For Svenonius (2001), Fox & Pesetsky (2005), den Dikken (2007a, 2007b), or Gallego (2010), in different terms, what determines phase periodicity is some interface condition that the system is attempting to meet via phase domains. The general difficulty with this approach is to avoid circularity, in the absence of a general theory of what awaits at the interface. Other approaches effectively deny the relevant cyclicity of phases, either by embracing it to its limit (every phrase is a phase, see Manzini 1994, Takahashi 1994, Epstein et al. 1998, Fox 2000, Boeckx 2008, Bošković 2002, Richards 2002, Epstein & Seely 2002, Fox & Lasnik 2003, and see Abels 2003 for perspective) or by questioning this notion of cyclicity altogether, at least in the specific terms of phases (see Collins 1997, Grohmann 2003a, 2003b, Boeckx 2007, Boeckx & Grohmann 2007, Jeong 2006, and Chandra 2007). The putative absence of cyclicity effects in general, or at the very least the possibility that all cyclicity effects may in the end not reduce to the phase architecture, is certainly an empirical matter. But in this work I will explore the possibility that phases are real, and that, therefore, our task as linguists is not to deny their existence, but to understand why their periodicity is what it happens to be. Something, however, can arguably be surmised from all those interesting works: The matter of what constitutes a phase, and why, is pretty much unsettled. In the ensuing pages an attempt will be made to approach things
from an entirely different perspective: the emergence of stability domains within complex dynamical systems.

2. A different take: Dynamical frustration

The science of non-linear complex dynamical systems explores conditions whereby ordering forms emerge from understandable interactions when they act collectively. Relevant to us are systemic behaviors, technically called “frustrating”, in which overall forces or tendencies pull in opposite directions (hence the colorful term). The notion of a frustrated (spin) system comes from particle physics and the science of materials (see Diep 2005 for perspective), the frustration arising through a lack of alignment in atomic spins.4 Under certain conditions, the natural crystallization ordering in atoms can be in a frustrated state, giving rise to a spin glass, so-called because of the arrangement that the atom spins have with regard to one another. For example, a given atomic moment may align with respect to two neighbors in opposite interactions. That situation is easy to comprehend by imagining three children playing in a circle, each successively saying “yes” or “no” by opposing their right-hand-side neighbor. This system will stabilize in its instability, with no child saying the same word twice in a row. While a substance in these circumstances (substituting spin direction for the yes/no polarity, but still with a critically relevant number of atoms) acts as a crystal as a whole, it does not in terms of its atomic magnetic moments (determined by their spins).

Dynamical frustration can arise in instances without signs of atomic disorder in a compound (with no apparent magnetic interactions to be frustrated). The glassy behavior can still arise if temporal frustration obtains (Goremychkin et al. 2007).
This team has shown how, by fluctuating in magnitude, magnetic moments in the particular compound they studied can cause what we may think of as temporal cycles that appear and disappear, long enough to disrupt magnetic alignment. A parallel with phases may sound fanciful, but certainly the sorts of objects that interest us in a sense “appear and disappear” in time, or more precisely as the derivation unfolds. The proposals reviewed in the previous section blame that on (still obscure) interface conditions, whereas the approach now about to be explored, instead, focuses on an emergence that is to be blamed on the syntax itself. Needless to say, for this to make any sense beyond the metaphorical, we have to show (a) any indication that the sort of periodicity seen in syntax has anything to do with instances of
dynamical frustration, and (b) what the dynamics would be in our instance, that is, the opposing tendencies leading to the central frustration. A nice review of these issues, with much relevance to linguistics, is Binder (2008), where dynamical frustration is characterized as the unresolvable co-existence of opposite tendencies.5 The phenomenon certainly goes beyond the confines of materials. As Piekarewicz (2008) notes, this sort of complexity plays a crucial role in the emergence of topological shapes within neutron stars. Nerukh (2008) in turn shows dynamical frustration at the nanosecond time scale, within a protein environment. Yu et al. (2007) focus on the role of dynamical frustration within the biological clock of a bread mould, whose genetic timing network they explore. One should, of course, attempt to make a case for whether the specifics of dynamical frustration apply in the linguistic instance, but it should at least be clear that if a concept applies at such different scales, there is no particular reason why it shouldn’t apply at the levels that matter to us here. Be that as it may, to make the case in linguistic terms, we ought to see some signs of the sort of “rhythm” implied in biological clocks (controlled pulses).

So let’s turn to linguistics. Richards (2006, 2007) suggests that the right periodicity among phases is as indicated in (1a) (P = phase, N = non-phase); cf. the instantiation in (1b), corresponding to (1c):

(1) a. … [P [N [P [N [P [N … ]]]]]]
    b. … [CP [TP [vP [VP [DP [NP … ]]]]]]
    c. … [CP that [TP he [vP v [VP adored [DP the [NP country-side ]]]]]]

Successive categories in the syntactic skeleton (whatever they turn out to be: CP, TP and so on) stand in a phase/not-phase “rhythm” with regard to one another. This generalization is more interesting than merely stating a list of phases, but one still wonders why this particular rhythm should hold, as opposed to other imaginable ones. A putative worry before proceeding is that this view of things ignores the so-called fine structure of the functional skeleton. There could, of course, be many more projections within a phase than suggested in (1), messing up the “rhythm”. This is not the place to criticize the Cartographical Theory presupposed by such considerations, but in fairness to this whole discussion it should be kept in mind that this approach admittedly re-categorizes projection groupings in terms of “fields” (as in, literally, “the CP field”). So it may well be that there is, indeed, a finer structure to all of this, but in itself this doesn’t eliminate
the existence of domains which remain robustly observable, whether they manifest themselves in the traditional terms that (1) deploys, by way of functional categories, or more elaborate “fields” turn out to be necessary.6 From that perspective, and at a sufficient level of abstraction, those would be the ones defining a rhythm along the lines of what Richards observed. There is of course no reason to deny the higher-order observation on the basis of the existence of a finer one. One thing should be clear, if one attempts an approach along the lines being explored here: observables won’t come up easily, or they would have been observed already. I say this with utmost respect both for the difficult task of observation and the equally complex task of theorizing. I will, in other words, purposefully abstract away from all the putative finer structure in the CP and elsewhere. For all one knows at this point, there could indeed be further phases (including DP, PP and many other such possibilities tentatively argued for in recent works too numerous to mention). If that is the case, whether the present considerations survive the test of falsification will depend on the exact details of such phases, and whether they are as robust (in whatever sense turns out to be relevant) as the mostly undisputed ones.

All of that said, I will qualify Richards’s observation in terms of an equally abstract one. While admitting that the situation in (1b) is common across the world’s languages, Gallego (2009) suggests that in some the periodicity is slightly more elaborate. Concretely, in these languages (e.g. Spanish) there are more projected materials between the left-peripheral CP and the core TP that turn out to be relevant to how phases work in these languages.
This of course again speaks to the issue of a finer structure, but rather than multiplying the categories (to then group them back into “fields”), the idea here is to characterize linguistic diversity in terms of having a more or less intricate “non-phase” area. Gallego’s alternative to (1) is (2) for these languages:7

(2) a. … [P [N [N [P [N [P [N … ]]]]]]]
    b. … [CP [FP [TP [vP [VP [DP [NP … ]]]]]]]
    c. … [CP que [FP cuánto [TP él [vP v [VP adoraba [DP el [NP campo ]]]]]]]
       that how he adored the country-side
       ‘that how much he adored the country-side’

Now the rhythm in (2) emphasizes the question of whether there is any rhyme or reason to these periodicities. Although no decision is ever innocent, in order to study the matter in clean, abstract terms, we can assign a + representation to phase heads and a − representation to their domains.8 Then we clearly obtain these possible “rhythmic units”:

(3) a. + −
    b. + − −
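The step from the skeletons in (1b) and (2b) to the units in (3) can be made concrete with a minimal sketch. It assumes, purely for illustration, that CP, vP and DP are the phases, and it writes an ASCII hyphen for “−”; the function names `rhythm` and `rhythmic_units` are hypothetical labels introduced here, not part of any proposal in the text.

```python
# Assumption for illustration: CP, vP and DP count as phases, as in (1b)/(2b).
PHASES = {"CP", "vP", "DP"}

def rhythm(skeleton, phases=PHASES):
    """Map a top-down list of category labels to a string of '+' and '-'."""
    return "".join("+" if cat in phases else "-" for cat in skeleton)

def rhythmic_units(signs):
    """Split a +/- string into units, each one opened by a '+'."""
    units = []
    for s in signs:
        if s == "+" or not units:
            units.append(s)      # a '+' (or the very first symbol) opens a unit
        else:
            units[-1] += s       # a '-' extends the current unit
    return units

skeleton_1b = ["CP", "TP", "vP", "VP", "DP", "NP"]
skeleton_2b = ["CP", "FP", "TP", "vP", "VP", "DP", "NP"]

print(rhythmic_units(rhythm(skeleton_1b)))  # ['+-', '+-', '+-']
print(rhythmic_units(rhythm(skeleton_2b)))  # ['+--', '+-', '+-']
```

Under these assumptions, (1b) yields only units of type (3a), whereas (2b) adds one unit of type (3b), contributed by the extra FP layer.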

The structural distribution of these sorts of domains possibly doesn’t stop there. In many languages significant structural considerations arise for the right periphery as well:

(4) a. [[a rumor]i [ v [emerged ti]]]   (v: Probe)
    b. [[a rumor tj]i [ v [emerged ti]]] [about the candidate from Chicago]j   (v: Probe)

(4a) presents a simple phase, with its domain, [emerged ti], to be distinguished from the phase edge (the Probe head and its specifier). But activity is possible, also, in the opposite direction, where “stylistic” displacements involve pronunciation in an “extra-prosodic” domain, with characteristic intonation. As Ross (1967) observed, long-distance displacement is disfavored in that direction (the Right Roof Constraint), contrary to what happens with left-ward displacement, which is unbounded (pace Sabbagh 2007):

(5) a. [Kennedy will say [that there emerged [a rumor ti]] tomorrow in the senate] *[about the candidate from Chicago]i
    b. [About whom]i will [Kennedy say tomorrow [that Clinton believes as of today [that [there emerged [a rumor ti] yesterday]]]]

Once again, the abstract observation just made deserves some pause. An anonymous reviewer worries about the possibility that my observations above take for granted “that extraposition and the like involve rightward movement”. This is not correct, although the notation used may perhaps lead to such a conclusion. A trend initiated by Kayne (1994) questions the rightward character of extraposition. But the notion of phase that I will explore here, if it indeed involves right edges, is at right angles with whether these edges are obtained by rightward movement or, instead, they arise upon remnant moving everything else to the left. The bottom line is that the “right edge” exists, however it is we describe it: right dislocation or characteristic intonation drops attest to it. Now, this is obviously not to say that the putative right edge of phases is going to work the same way as the left edge; if that
were the case, there ought to exist generalized long-distance movement to the right, rightward agreement, or for that matter interaction between the right and left edges. However, none of that appears to happen, nor is it even clear that there is such a thing as a “rightward specifier”. A different reviewer also worries about the fact that notions like “right” or “left” cannot be expressed if a phase is just the output of successive applications of the Set Merge operation, which is by definition linearly insensitive. Now the plain fact is that linearization is a fact of language, and moreover an uncontroversial function of grammatical structure. Grammatical effects are simply not the same to the left and to the right. For example, the right edge has semantic effects of an informational sort, regardless of how this is implemented. Compare the answers to the A questions below:9

(6) A: And so what happened then?
    B: a. (Well, that) a rumor about the candidate from Chicago emerged
       b. (Well, that) a rumor emerged about the candidate from Chicago

(7) A: And so what emerged then?
    B: a. (Well,) a rumor about the candidate from Chicago emerged
       b. #(Well,) a rumor emerged about the candidate from Chicago

(8) A: And so a rumor about whom emerged then?
    B: a. #(Well,) a rumor about the candidate from Chicago emerged
       b. (Well,) a rumor emerged about the candidate from Chicago

The two possible orders under consideration (aside from having a different stylistic flavor associated with intonation) only constitute an eloquent answer to the neutral question in (6). In the other two instances, when the inquiry is either about the subject of the sentence (7), or a specification thereof (8), only one of the two orders is fully acceptable. Now whatever is the ultimate grammatical explanation for how given portions of structure linearize “to the left” or “to the right” (the latter possibly being a more complex process), the fact remains that they do, with clear interface conditions. The only issue is whether this significant grammatical cue enters the fine structure of phases.

To continue studying these patterns in abstract terms, we could make the notation in (3) precise, following the definitions in Chomsky (2004:108). In this work, and given structures such as (9), Chomsky explicitly characterizes the phase edge as the strange combination α-H, where PH is a phase and H its head (we return to this definition):

(9) PH = [α [H β]]

If this perspective is assumed, “+” should signal not just a phase head, but more concretely its edge as just defined (to be compared to the domain signaled with “−”). If we in turn also tentatively accept the right periphery as an edge of sorts, also external to the domain, then we could extend our patterns in (3) as in (10), assuming the right periphery is present both in languages of the sort in (3a) and also of the sort in (3b):

(10) a. + −
     b. + − −
     c. + − +
     d. + − − +

So an analysis of what falls into phasal domains and what falls “outside” (to the left and to the right), coupled with a nuanced consideration about the structural size of what a given phase comprises, suggests that the “rhythmic” possibilities that Richards sought are richer than one might have first imagined. I would like to suggest here, however, that far from being a problem for our characterization of phases, the ontology in (10) can actually be turned into an argument for their genuinely rhythmic elegance. This is particularly so if we demonstrate that a situation as in (10) involves dynamical frustration.

3. A Fibonacci approach and why it entails frustration

Curiously, the patterns in (10) are witnessed elsewhere in the language faculty. This is easily seen by translating back the abstract symbols into more concrete units, but this time elements of a phonological sort. Suppose, specifically, that we take “+” to mean a consonant (or consonant cluster, disregarding glides and secondary consonants for this exercise) and “−” to mean a vowel (more precisely, a vocalic mora). This is what we obtain:

(11) a. C V
     b. C V V
     c. C V C
     d. C V V C

In (11), one might worry about why consonant clusters are ignored (simplifying them to the primary consonant), but “double vowels” count. The reason is factual: secondary consonants do not change syllabic timing; moraic conditions, however, do. In other words, in terms of standard phonological conditions, a syllable is not more, say, closed because of ending in two consonants than because of ending in one; however, a bi-moraic syllable is heavy (e.g. for
the purposes of stress assignment) in ways that a mono-moraic syllable need not be. Phonological theory standardly lists the patterns in (11) (plus two more that I return to) as the observed ones across the world’s languages. For instance, as Blevins (1995) illustrates, these constitute the major syllabic patterns, in descending frequency (from left to right).10 I would like to suggest in what follows that, while the parallelism between phases and syllables is interesting, it is not miraculous, since the basic “forces” that underlie syllabification are decisive in carving out phases too. Even said that way, the connection may seem strange: Why should phonological rhythms have anything to do with syntactic ones? However, things start making more sense if evaluated more abstractly. Uriagereka (1998:485 and ff.) made such a move when arguing that syllables may emerge from:

. . . [T]wo factors: general repulsion forces that generate other Fibonacci patterns, and general ‘gluing’ forces that have as a result discrete units of various shapes. . . Perhaps these [patterns] are not even coded in the genes of different species, and only more fundamental matters really are (stuff like a growth function, the timing of its internal dynamics, et cetera) . . . If this under-determinacy holds, that basic systemic balance may show up elsewhere, granting you a very plastic system with patterns appearing in different domains, given internal dynamics.

The point is that there do not seem to be any syllable templates like CVVVC or VVC, and the question is why only the patterns in (11) emerge, and with that (descending) frequency. In particular, as is usefully summarized by Blevins, many unrelated languages exhibit all of these syllabic possibilities, varying mainly in consonant clusters or vocalic colorings that we are now setting aside.11 In a few languages only templates of the CV variety (stressed or unstressed) are present.12 Typically, conditions along these lines are blamed on sonority requirements. Interestingly, however, they have been shown to appear in signed languages as well.13 This makes it quite implausible that their origin should have anything to do with low-level sound-driven specifications. The approach I took in 1998 was very different. To see why it is specifically of the frustrated sort discussed above, suppose we let “+” or “−” representations interact according to two computational rules, (12i) and (12ii), to generate symbol strings:14

(12) F Game
     Starting with either a + or a −,
     (i) Go on to concatenate it to another + or a −, with one condition:
     (ii) Avoid combining identical symbols, unless they are adjacent to a different symbol

The results of this game are as in Figure I, starting with a space or a boundary and adding successive symbols.15 Possible combinations as various elements are added yield different arrays of spaces and boundaries, and the number of combinations as the added elements grow falls within the Fibonacci series:16

Figure I: Fibonacci patterns emerging from the F game, for 2, 3, 4, 5, 6, 7 and 8 symbols
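The growth pattern behind the F game can be checked mechanically. The following is a minimal sketch under one possible reading of condition (12ii), assumed here for illustration only: two identical symbols may be concatenated, but a third identical symbol in a row is blocked. The function name `f_game` is hypothetical, and an ASCII hyphen stands in for “−”.

```python
def f_game(n, start="+"):
    """All strings of length n over {'+', '-'} that begin with `start`
    and never contain three identical symbols in a row
    (an assumed reading of condition (12ii))."""
    strings = [start]
    for _ in range(n - 1):
        extended = []
        for s in strings:
            for sym in "+-":
                # Block a third identical symbol in a row.
                if len(s) >= 2 and s[-1] == s[-2] == sym:
                    continue
                extended.append(s + sym)
        strings = extended
    return strings

counts = [len(f_game(n)) for n in range(2, 9)]
print(counts)  # [2, 3, 5, 8, 13, 21, 34]
```

On this reading, each string of length n either extends a string of length n − 1 with a different symbol or extends a string of length n − 2 with a doubled one, which is the Fibonacci recurrence. Starting the game with “−” instead of “+” gives the same counts.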

Fibonacci patterns emerge here as a result of how the F Game in (12) is set up (and see fn. 15). But now suppose we next adjust these patterns to purely linguistic conditions:

(13) Linguistic Conditions
     (i) Nucleus Constraint: Look for a maximal space
     Then, (ii) Onset Constraint: Try to assign an onset boundary to that space
     Then, (iii) Coda Constraint: Try to assign a coda boundary to that space
(13) is an algorithm that, first, optimizes bounded spaces to make them as large as possible (i), and next (ii) as delimited as possible. This has the consequences in Figure II.

Figure II: Patterns emerging from adding linguistic conditions on the F game

The algorithm attempts to find maximal spaces (understood as combinations of “−” elements);17 next, it attempts to delimit that maximal space in terms of an onset boundary (if possible); finally, the algorithm tries to find a coda boundary for the delimited spaces. In some circumstances, after the algorithm applies, the remaining space is a single “−” (not a maximal space), and in fact without either an onset or a coda. Readers can verify that only six groupings emerge from applying the linguistic conditions in (13) to the F game in (12). As further combinations of successive symbols are attempted under these conditions, new associations within the F series emerge (twenty-one, thirty-four, etc.); but when the linguistic conditions in (13) are applied to these new objects, no more combinations emerge. The entire set of possibilities is as in (14):

(14) a. + −
     b. + − −
     c. + − +
     d. + − − +
     e. −
     f. − +
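The ordering of the conditions in (13) can be illustrated with a small parsing sketch. It rests on assumptions that are mine rather than the text’s: a unit is an optional onset “+”, a maximal run of “−”, and an optional coda “+”; and a “+” that could serve as the onset of a following space is never taken as a coda, which is how the priority of the Onset Constraint over the Coda Constraint is encoded. The names `UNIT`, `parse_units` and `to_cv` are hypothetical, and an ASCII hyphen stands in for “−”.

```python
import re

# Optional onset, maximal run of '-', then a coda only if that '+'
# is not immediately followed by another space (negative lookahead).
UNIT = re.compile(r"\+?-+(?:\+(?!-))?")

def parse_units(string):
    """Return the units found in a +/- string, left to right."""
    return [m.group() for m in UNIT.finditer(string)]

def to_cv(unit):
    """Rewrite a +/- unit in C/V notation."""
    return unit.replace("+", "C").replace("-", "V")

for s in ["+-+-", "+-++-", "+--+", "-+--", "-++-"]:
    print(s, [to_cv(u) for u in parse_units(s)])
```

Here “+-++-” comes out as CVC followed by CV, while “-+--” comes out as an onsetless V followed by CVV, the situation in which a single “−” is left without onset or coda. The sketch is not claimed to reproduce Figure II: a boundary left over in a “++” sequence at a string edge is simply skipped, and a string beginning with “--” would produce an onsetless double space that (14) does not list.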

Moreover, observe the number of occurrences for each combination type within the strings above (for clarity we are replacing the +/− notation with the phonologically substantive C/V, where the basic spaces are vocalic and their boundaries consonantal):18

(15) a. (a) grouping: CV 37 (times), CVC 21, CVV 11, CVVC 10, V 0, VC 0.
     b. (b) grouping: CV 37 (times), CVC 17, CVV 7, CVVC 8, V 19, VC 13.

The (a) grouping is generated by starting the game with a boundary (which can be rationalized as a punctuated element in an open space), while the (b) grouping emerges from starting the game with a space (which can be rationalized as an in-principle boundless topology; see fn. 18). In either instance, after applying the linguistic conditions in (13), the CV pattern emerges 37 times, the maximum. At the opposite extreme we have the V(C) pattern, which doesn’t emerge in the (a) grouping, and does moderately in the (b) grouping (of the two, the VC pattern is the less common, emerging only 13 times). In between is the CVC pattern. This roughly correlates with the frequency order of the objects in (11), which can thus be said to be predicted from a Fibonacci game along the lines just discussed.

It is hard to find a living creature some of whose structural properties do not deploy a Fibonacci pattern somewhere: either a number of features falling into the series 1, 1, 2, 3, 5, 8, … or logarithmic growth based on the limit of the ratio between successive terms in the Fibonacci series (1.618033…, the “golden expression” φ). That biolinguistic patterns too should exhibit this sort of regularity is perhaps not that surprising. These patterns present a characteristic optimality of the dynamically frustrated sort, involving two opposing forces that clash. A classic example is offered by Douady & Couder (1992), who let magnetized ferrofluid drops fall into an oil dish, repelling each other but constrained in velocity by the oil viscosity. As the dropping rate increases, a characteristic Fibonacci pattern emerges.19 The relevant equilibrium can be conceptualized as involving a local and a global force pulling in opposite directions, and the issue is how these opposing forces balance each other out, such that the largest number of repelling droplets can fit within the plate at any given time, as they fall onto it.
It turns out that an angle φ of divergence between each drop and the next achieves this dynamic equilibrium (see Medeiros 2008 and Piattelli-Palmarini & Uriagereka 2008 for relevant discussion).20 The Douady and Couder scenario was purely physical and is, thus, not a structure that emerges in traditional Darwinian conditions for living organisms.21 This is important, for if Fibonacci structures are present in an organism, we may not need to blame this presence on the usual adaptationist accounts. The relevant origin may be ultimately different, having more to do with the complex dynamical properties that lead to “opposing forces” in the
general sense described above (be they physical forces or any other relevant tendencies, as Binder 2008 emphasizes). Needless to say, it will not be easy to figure out the “ultimate” details of any of this, whether in biology or general psychology, but this is not so much a matter of principle as one of identifying relevant dynamics to interact.22 It is worth emphasizing that this interplay between opposing “forces” could be explored along three different dimensions (or a combination thereof): (i) a phylogenetic approach: the physics of the context where an evolutionary event took place may have resulted in the appropriate systemic interaction, which could end up genomically coded;23 (ii) an ontogenetic approach: the developmental pathway of an organism, channeled by the relevant physical constraints, coordinates the growth and allometric ratios of the different parts, requiring some epigenetic interplay between the genome and the proteome deployed in the developing individual; finally, (iii) it could even be that these dynamics continue to unfold throughout the organism’s (adult) life, regulating physiological processes.

Can syllables emerge as properties of two factors pulling the linguistic system in opposite directions: “repulsion” forces, of the sort in (12), that generate Fibonacci patterns more generally, and more specific “gluing” forces, as in (13), that result in discrete units of various shapes? In other words, can a syllable be a mini-max compromise, the “max” aspect being determined by general Fibonacci conditions, and the “mini” one by the linguistic specificities that something like (13) dictates? Formally, this is what the game above really shows. The substantive question is only whether it is sound to assume general “repulsion” forces as in (12) for language, or for that matter the “gluing” conditions in (13).
These are impossible questions to address a priori, but it is not senseless to ponder how plausible it is for the system to assume the relevant presuppositions, even if the operational result is patent. What makes the question all the more interesting is its abstractness in the linguistic instance. We have already noted that whatever dynamics are at play here cannot be simply of an acoustic sort, or they will not easily generalize to signed languages presenting comparable nuances.

4. Edge conditions in syntax

We can push matters even further, towards the conditions that interest us here, inasmuch as (14) clearly represents a superset of (10) (the new elements being (14e) and (14f)). Actually, it is not difficult to mechanically generate a version of (10) from the abstract Fibonacci-spaces in Figure I that the F game in (12)
describes. Above we generated such arrays both starting at a boundary (+) (interpreted in consonantal terms) and a space (−) (interpreted in vocalic terms), the higher and lower halves of Figure I. If we were to start in just the boundary, according to the F game the initial “cell” would be of the “+ −” sort, which is tantalizingly close to Chomsky’s phase if we interpret the “−” as the domain and the “+” as the edge. In turn, consider again the sorts of substantive conditions discussed in (13), repeated below as (16) for convenience, to uncover syllables within these contexts by way of seeking to maximize stable signaling spaces and boundaries thereof:

(16) Linguistic Conditions
     (i) Nucleus Constraint: Look for a maximal space
     Then, (ii) Onset Constraint: Try to assign an onset boundary to that space
     Then, (iii) Coda Constraint: Try to assign a coda boundary to that space

The Nucleus Constraint is abstract enough to manifest itself in phonology, syntax or even signaling more generally. In turn, that an open-space should be bounded may be a very broad requirement on symbolic systems, assuming they require discreteness. However, it is less obvious where the boundary in question should have to be. The linguistic signal is carried on a one-dimensional motor expression (whether via speech or gesture) that is deployed in time, which signalers and decoders are part of. It is sound for an information-exchange system to be symbolically focused on the beginning of a given signal (which can be established when tracking the emission in real time) in a way that would be impossible to fix, in full generality, for the end of the signal. That basic intuition alone suffices to justify the nature of condition (16ii). Condition (16iii), however, though clearly operative in word-level phonology, seems harder to justify in the domain of sentence grammar. It may make sense to help delimit word-endings if lexical units have to be parsed, rapidly and effectively, without “spilling over” into the next such unit, and that might justify coda conditions at the syllabic level. But matters seem less clear at the discourse level that sentences articulate. Suppose that means, then, that we lack condition (16iii) for syntactic discretization. If (16iii) is ignored to determine patterns within the possible arrays of the F game, we obtain a “syntactic variant” of this game. It allows for (syntactic) coda conditions in various instances, but it also permits a totally different parse of the relevant arrays: one for which two left edges are viable (instead
of assigning one of those as a right edge to a prior expression). This (bearing in mind, also, that we are constructing relevant spaces based on the “+ −” combination)24 creates a syntactic ontology (cf. Figure II above):

Figure III: Patterns emerging from the “syntactic variant” of F game

The objects that the system stabilizes into are as in (17) (the digit indicates the number of occurrences of each stable object within the array):

(17) a. + − 42
     b. + − − 14
     c. + + − 10
     d. + − + 6
     e. + + − + 4
     f. + − − + 4
     g. + + − − + 2
     h. + + − − 1

The question is whether, just as similar patterns correspond, in phonologically grounded parses, to abstractly analyzed syllables within the world’s languages, so too patterns as in (17) correspond to abstractly analyzed phases. Now recall (10), repeated as (18):

(18) a. + −
     b. + − −
     c. + − +
     d. + − − +

The most frequent objects generated by the syntactic variant of F-game as in (17) correspond to (18a) and (18b); (18c) and (18d) are also generated, though less frequently. It is easy to rationalize what is going on in (17) once we observe how, effectively, phase edges can be light (+) or heavy (++). In those terms, the ontology in (17) can be understood as in (19); the heavy edge is signaled as EH and the light as EL: (19)

a. EL − 42
b. EL − − 14
c. EH − 10
d. EL − + 6
e. EH − + 4
f. EL − − + 4
g. EH − − + 2
h. EH − − 1

And abstracting away the light/heavy distinction for edges, we obtain:

(20)
a. EL/H − 52
b. EL/H − − 15
c. EL/H − + 10
d. EL/H − − + 6
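The step from (17) to (20) is simple bookkeeping, and it can be checked mechanically. The following Python sketch is only an illustration: the counts are copied from (17), while the names `PATTERNS_17`, `split_edge` and `collapse_edges`, and the use of ASCII `-` in place of “−”, are assumptions introduced here and are not part of the original analysis.

```python
from collections import Counter

# Stable objects and their occurrence counts, copied from (17).
# ASCII "-" stands in for the minus sign used in the text.
PATTERNS_17 = {
    "+-": 42,
    "+--": 14,
    "++-": 10,
    "+-+": 6,
    "++-+": 4,
    "+--+": 4,
    "++--+": 2,
    "++--": 1,
}


def split_edge(pattern):
    """Split a pattern into its edge weight and the rest, as in (19).

    A pattern starting with "++" has a heavy edge (EH); any other
    pattern has a light edge (EL).
    """
    if pattern.startswith("++"):
        return "EH", pattern[2:]
    return "EL", pattern[1:]


def collapse_edges(patterns):
    """Sum the counts of patterns that differ only in edge weight, as in (20)."""
    totals = Counter()
    for pattern, count in patterns.items():
        _, rest = split_edge(pattern)
        totals["E" + rest] += count
    return dict(totals)


collapsed = collapse_edges(PATTERNS_17)
total = sum(collapsed.values())

# Print each collapsed shape with its count and its share of the array.
for shape, count in sorted(collapsed.items(), key=lambda kv: -kv[1]):
    print(f"{shape:5s} {count:3d} {count / total:.2f}")
```

Under these assumptions the four collapsed totals are the ones listed in (20); the printed shares merely restate those counts as proportions of this particular array.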

This, of course, is abstractly identical to (10)/(18). Heavy edges make good sense in so-called paratactic conditions, where it is possible to have sentences as in (21b) side-by-side with others as in (21a), which fit the pattern in (20a) in its two varieties:

(21)
a. I believe that he adored the countryside
b. I believe that, as for the countryside, he adored it

The heavy edge is impossible in hypotactic conditions, as discussed (with relevant changes in terminology) in Torrego & Uriagereka (1992, 2002). This makes the variant with the light edge the most frequent. That said, expressions as in (22) need to be accommodated: (22)

As for the President, what is being a lame duck if not a near-death experience?

This (recorded) example clearly exhibits some extra element to be added to the edge hosting the Wh-word what in standard terms, without starting a separate phase. Similar considerations apply to the other (less frequent) four variants in (20), involving a larger phase domain in Gallego’s (2009) sense (20b), as well as more exotic variants with right edges, however it is that such notions are to be precisely characterized. A connection between syllabic and syntactic constraints may sound somewhat exotic, but it was explicitly argued for in Carstairs-McCarthy (2000), which sees phrasal structure as an “exaptation” (in evolution) of earlier syllabification requirements. The idea for this sort of correlation is actually not new: it was defended in synchronic studies as far back as Kaye, Lowenstamm & Vergnaud (1985), and it is certainly sympathetic to intuitions that can be traced to Saussure (1916). But it remains to be seen why such a structural translation (from the realm of sound to that of structured meaning) is part of the fabric of language (see Medeiros 2010 on this general topic). That said, it should be kept in mind how these Fibonacci-patterns are adapted to various externalization conditions (e.g. (16i) and (16ii) in syntax vs. all three conditions in (16) in word-level phonology). In natural conditions, too, such patterns emerge in slightly different forms depending on


whether, for instance, a given growth function is (roughly) continuous (a mollusk shell) or discrete (florets on a corolla). Although all of these are ultimately abstract Fibonacci-patterns, a phase is obviously not a syllable, any more than a shell is a corolla.

5. The source of frustration in language: Two orthogonal computations

In section 1 we saw Chomsky’s argument that systemic phases make grammatical computations feasible. A difficulty emphasized by Townsend & Bever (2001) places the matter of feasibility around a curious orthogonality reigning internal to the language faculty. Syntactic conditions are thought to be bottom-up, as first-merging a head to its complement yields well-known syntactic relations. Complements are relevant to a variety of conditions reviewed below, and yet linguistic behavior in sentence parsing patently proceeds serially and incrementally from before to after. The question is how “before-to-after” can be matched with “bottom-up”. For many theorists the matter is ill-posed, since these are not computations of the same sort (one involves “competence” while the other deals with “performance”), or one of these processes is not even a linguistic computation (Townsend & Bever’s position). For others, this orthogonality of processes suggests that syntactic computations are ill-conceived as proceeding bottom-up (Lewis & Phillips 2009). But an interesting alternative exists that capitalizes on embracing the tension: “Commensurability” between two genuine computations that are orthogonal to one another becomes possible if we break down the structures for which the problem arises to a size that makes its workings not just computationally solvable, but effectively so, roughly in the sense of Berwick & Weinberg (1984), adapted to present concerns. Of course, this is asserting a form of cyclicity in order to resolve orthogonality (see Uriagereka forthcoming for discussion). The last point needs to be emphasized.
Without cycles of some sort, dynamical frustration would not make sense. It is within these very local domains that the frustration stabilizes. How local the domains are is a function of the system as a whole, including the dimensions it presents. Understanding this in full generality is beyond the scope of the present paper, but the message to take home should be clear: cycles (pockets of stability) in a dynamically frustrated system are definitional. They constitute the domain in which the systemic forces interact, or from a different perspective the only way in which such forces can interact. There is no need to extrinsically impose cyclicity in the


relevant systems, any more than one needs to impose, for instance, whirlpools in water rapids. Now while that view of things argues for systemic cycles of some sort, it does not, in itself, tell us much about what matters to us here: the cycle’s periodicity. But conditions of the form in (16) may help us understand this too, particularly if we rationalize them as broadly seeking to maximize stable signaling spaces and boundaries thereof. For conditions of this type to emerge it is immaterial whether we are dealing with phonological spaces (consonant/vowels or hand position/movements) or syntactic spaces (phasal periodicity). Moreover, this view of things may also give us a rationale for the opposing tendencies implicit in the F-game: the disparate computations of language can only hope to be measured against one another by breaking each unit into cycles. The bottom-up, syntactico-semantic, computation is the equivalent of the “repulsion” forces we studied for syllables: the associative procedure that makes the system grow. But something has to discretize associative thought into squeezable and then parseable speech, yielding phasal periodicity (more accurately, periodicities: syllables, phases, and probably more). At least in the case of syntactic periodicities, we can be more precise. In bottom-up terms of the sort explored in Chomsky (1994), the derivational “workspace” mentioned is articulated in a curious way.25 The head-complement relation is what the system captures most naturally, while the head-specifier relation is forced into a separate “derivational workspace”. Consider deriving the man saw a woman, as depicted in (23). There is no bottom-up way to merge the man directly to saw a woman: we must assemble the former in a separate workspace within the derivation (23b), place it on a “memory buffer”, and then assemble the results to the structure still active in (23a), as in (23c). Resorting to this buffer is a virtual definition of specifier.

(23)

a. {saw, {saw, {a, {a, woman}}}}
   saw ←↑→ {a, {a, woman}}
   a ←↑→ woman

b. {the, {the, man}}
   the ←↑→ man

c. {saw, {{the, {the, man}}, {saw, {saw, {a, {a, woman}}}}}}
   {the, {the, man}} ←↑→ {saw, {saw, {a, {a, woman}}}}
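The derivation in (23) can be mimicked with a few lines of Python. This is a minimal sketch under stated assumptions, not an implementation of any particular Merge formalism: labeled sets of the form {label, {X, Y}} are modeled as nested tuples, and the function names `first_merge` and `attach_specifier`, as well as the list `buffer` standing in for the “memory buffer”, are hypothetical names introduced here.

```python
def label(obj):
    """A lexical item labels itself; a complex object stores its label first."""
    return obj if isinstance(obj, str) else obj[0]


def first_merge(head, complement):
    """Merge along the spine: the head projects its label."""
    return (label(head), (head, complement))


def attach_specifier(specifier, target):
    """Merge an object that was built elsewhere: the target projects its label."""
    return (label(target), (specifier, target))


def show(obj):
    """Render an object in the brace notation used in (23)."""
    if isinstance(obj, str):
        return obj
    lab, (left, right) = obj
    return "{" + lab + ", {" + show(left) + ", " + show(right) + "}}"


# (23a): the spine, built bottom-up by successive first-merge steps.
spine = first_merge("saw", first_merge("a", "woman"))

# (23b): "the man" is assembled in a separate workspace and stored.
subject = first_merge("the", "man")
buffer = [subject]

# (23c): the stored object is retrieved and merged with the active spine.
clause = attach_specifier(buffer.pop(), spine)

for step in (spine, subject, clause):
    print(show(step))
```

The sketch only restates the point made above: the object in (23c) combines two separately assembled objects, so one of them has to be held somewhere until the spine is ready to receive it.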


So first-merge defines the “spine” of any given derivation (see fn. 25). Several grammatical conditions happen either within that “spine” or chunks thereof: (24)

a. Standard theme selection, measuring out event denotations (Tenny 1994)26, where quantifiers “live on” (Barwise & Cooper 1981)27 and yielding core idioms (Marantz 1984)28
b. Head-to-head dependencies (incorporation, affixation, light constructions) (Baker 1988, Grimshaw & Mester 1988)29
c. Basic linearization for Spell-out purposes (Uriagereka 1999 after Kayne 1994)
d. Standard Agree and related complex relations, such as temporal or negative concord or polarity licensing30

It is not clear how a unified characterization of these domains can be provided that is genuinely alternative to its definition of a bottom-up first Merge, i.e. that is not a mere list of the representations where these sorts of processes happen to take place. In contrast, subsequent instances of external or internal Merge, if they involve complex steps of their own, arise off of the spine. Many grammatical conditions arising in these terms require the “memory buffer”, suggesting that they are more complex:

(25)
a. External arguments, signaling mere event participation (and no idioms)
b. Clitic climbing, placement and reduplication
c. Derived linearization for Spell-out purposes
d. Standard Move, including successive cyclic displacement through the Edge

The conditions in (25) are structurally more complex than the corresponding ones in (24) in one other respect: the characterization of the domain where they obtain. Given a phase [α [H β]], as we saw Chomsky characterizes the phase edge as the complex α-H. Now this complex is not a configuration, for no configurational object can include α and H without also including β. A precise definition is provided in Chomsky (1993:11 and ff.), which is quite telling with respect to the sort of asymmetry that interests us:

(26)
Given a head α:
a. Max (α) is the least maximal projection dominating α
b. Domain (α) is the set of nodes contained in Max (α) that are distinct from and do not contain α
c. Complement Domain of α is the subset of Domain (α) that is reflexively dominated by α’s complement
d. Residue of α is the set-theoretic complement in Domain (α) of the Complement Domain of α
e. Given a set S of categories, Min (S) is the smallest subset K of S such that for any γ in S, there is some β in K that reflexively dominates γ
f. Internal domain of α is the Minimal complement domain of α
g. Checking domain of α is the Minimal residue of α
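The definitions in (26) are purely set-theoretic, so they can be tried out on a toy tree. The Python sketch below is an illustration under simplifying assumptions, not Chomsky’s formalization: trees are binary, adjunction to the head is ignored, “contain” is read as proper dominance, and the class `Node` and all function names are hypothetical. The toy tree has the shape [HP Spec [H' H Comp]], with H the head.

```python
class Node:
    def __init__(self, label, *children, head=None):
        self.label = label
        self.children = list(children)
        self.head = head      # the head this node is a projection of, if any
        self.parent = None
        for child in children:
            child.parent = self

    def descendants(self):
        """Yield every node properly dominated by this one (pre-order)."""
        for child in self.children:
            yield child
            yield from child.descendants()


def dominates(a, b):
    return any(d is b for d in a.descendants())


def reflexively_dominates(a, b):
    return a is b or dominates(a, b)


def max_projection(head):                       # (26a)
    node = head
    while node.parent is not None and node.parent.head is head:
        node = node.parent
    return node


def domain(head):                               # (26b)
    return [n for n in max_projection(head).descendants()
            if n is not head and not dominates(n, head)]


def complement(head):
    """The sister of the head (binary branching assumed)."""
    return next(c for c in head.parent.children if c is not head)


def complement_domain(head):                    # (26c)
    comp = complement(head)
    return [n for n in domain(head) if reflexively_dominates(comp, n)]


def residue(head):                              # (26d)
    inside = complement_domain(head)
    return [n for n in domain(head) if all(n is not m for m in inside)]


def minimal(nodes):                             # (26e)
    return [n for n in nodes if not any(dominates(m, n) for m in nodes)]


def internal_domain(head):                      # (26f)
    return minimal(complement_domain(head))


def checking_domain(head):                      # (26g)
    return minimal(residue(head))


# Toy tree: [HP Spec [H' H Comp]], with internal structure in Spec and Comp.
H = Node("H")
spec = Node("Spec", Node("s1"), Node("s2"))
comp = Node("Comp", Node("c1"), Node("c2"))
h_bar = Node("H'", H, comp, head=H)
hp = Node("HP", spec, h_bar, head=H)

print([n.label for n in internal_domain(H)])
print([n.label for n in checking_domain(H)])
```

On this toy tree the internal domain reduces to the complement and the checking domain to the specifier, which is the asymmetry at issue in the surrounding discussion.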

These set-theoretic conditions are needed for one purpose: the Checking domain is a heterogeneous set, including the specifier and elements adjoined to the head (the head itself in later formulations). (26a) isolates a maximal projection; (26b), the categories it includes; (26c) defines the complement domain; (26d) picks out the set-theoretic complement of that complement domain; finally, (26g) defines the checking domain in terms of the notion Min (S) as in (26e). Of course, once all this machinery is in place, it also allows us to define the internal domain. Now the domain so defined is just the element that first-merges to α. In other words, while conditions as in (24) can be directly defined in terms of configurational relations (what a “complement” boils down to), conditions of the sort in (25) require the higher-order set-theoretic paraphernalia in (26). To obtain the phasal periodicity we are after, it seems crucial to determine phasal edges (within an overall F-space). In other words, we need to understand how the various “+” and “−” conditions are systemically organized, in order to see their “rhythm”. Effectively, phase domains (the “−” conditions) are going to fall into first-Merge spines, characterized as just discussed; in turn, phase edges (the “+” conditions) will need to involve the extra systemic resources, with all the consequences that this brings with it. This state of affairs is actually quite consistent with the fact that the phasal edge is what carries movement long-distance, which we need to briefly consider next.


6. Syntax at a higher dimension

The present approach revisits old, yet very productive, ideas. Townsend & Bever (2001) are ultimately revamping considerations in Halle & Stevens (1962) and Chomsky & Miller (1963), the “Analysis by Synthesis” hypothesize-and-test method in parsing. Parsing is important not just for the processing of speech as it is produced, but also for language acquisition. As Poeppel et al. (2008) argue, this general approach to the stuff of language is biologically plausible:

Based on minimal sensory information, the perceptual system generates knowledge-based “guesses” (hypotheses) about possible targets and internally synthesizes these targets. Matching procedures between the synthesized candidate targets and the input signal ultimately select the best match; in other words, the analysis is guided by internally synthesized candidate representations. [from Poeppel et al. 2008: 1072]

Analysis by Synthesis was first understood as an “intelligent” method for speech recognition. Signals are mapped to messages through feed-back loops that build on an initial hypothesis. This “first pass” presupposes a device capable of generating the message it is attempting to decode. Effectively the device makes a guess at generating a chunk of perceived structure, and then matches it against the signal that justified that initial hypothesis. Depending on how good the match is, a second pass is initiated to refine the initial guess, again with the generative devices presupposed in the task, now slightly refined so as to better match the signal. The method was called “analysis by synthesis” because the active analysis was performed, internal to the system, by way of the hypothesized synthesis of signals-to-be-compared. Importantly for our purposes here, such a method would only work if the presupposed evaluation is done in a very local fashion, so that the comparison sets can be manageable and the calculation does not explode. So the idea is that what we see as “cyclicity effects” may simply be the systemic response of the grammar to Analysis by Synthesis considerations in parsing. That approach may lead to a theory of the sort in Townsend & Bever (2001), where a bottom-up grammar provides the core analysis of a structure, while a rough-and-ready analyzer establishes a rapid, left-to-right hypothesis, to be later on checked, in the feed-back loop, against the grammatical analysis. The explanatory power, in these terms, is based on the functional efficacy of structures, so as to meet Analysis by Synthesis conditions. A different take on these matters is attempted in Uriagereka (forthcoming), where rather than


“the grammar vs. the parser” being the opposing “forces” in a dynamically frustrated state of affairs, the clashing tendencies are both grammatical: a left-to-right Phonetic Form vs. a bottom-up Conceptual Structure. Regardless of which take on the problem turns out to be correct, the point that matters to us now is that, if this dynamically frustrated situation is central to the overall architecture of language, it might be determining, also, its fundamental structuring. From the “hypothesize-and-test” perspective, grammatically possible forms are creatively generated from a first-pass analysis, stemming from the local symbolic combinations identified upon lexical analysis, and fed back into the system for comparative accuracy. From the “two computations clashing” perspective, such forms arise as stability points between the orthogonal grammatical processes.31 Either way, two systemic tendencies are at play: the associative ones constructing structure and the looping ones, which in the process of feeding information back into the system determine characteristic periodicities. In the case of phases, we must understand (and presently we do not) how it is that, in general, specifiers emerge within the system (see Chomsky 2009 on this very point). In a sense the question boils down to how the system treats material within the “memory buffer”, vis-à-vis the rest of the structure. Such an elsewhere condition seems to be what underlies specifiers, which appear to be counterpoints of sorts to the more primordial head-complement phrasal spines. The distinction is just part of how the derivational dynamics work under Chomsky’s assumptions, but while the head-complement relation (or extensions of the Agree sort) is straightforward, how exactly other “super-structures” emerge under higher-order conditions of the sort in (26) is less clear.
This is not so much in terms of the relevant phenomenology (of successive cyclicity, reconstruction, construal and antecedence more generally), which has been thoroughly studied. What is missing is a simple analysis that can respond, without stipulations or notational tricks, to such questions as why edges allow displacement or what it means to reconstruct in any of the displacement points, but not all at a time. Ironically, what should be less problematic is what is often taken as mysterious within phase dynamics: that special conditions should hold of specifiers (e.g. their not being transferred to interpretation when the domain of the head that hosts them does). Since the very inception of these matters, it was patent that the edge does not go hand-in-hand with the rest of the computation, as is expected if the derivational “memory buffer” alluded to above is real: computationally, these elements occupy their own derivational dimension; so much so that apparently the system recognizes them as bona-fide edges to more basic syntactic spaces (of the sheer Merge sort, without any


additional conditions, including “memory buffer” ones). But what this means, precisely, in computational terms is still hard to understand, and we need not dwell on the matter here. Now why are CP and vP the canonical phases, and not the TP (dominated by CP) and VP (dominated by vP)? From the present perspective, the central assumption may be expressed in the “bottom-up” terms in (27):

(27)

Syntax is built from the first-merge of the verb to its theme argument

Once that assumption is made, the issue is what should count as a Fibonacci space for the purposes of the F-game (i.e., where the “−” signs should be anchored, as it were). Given (27), it is as natural to assign that space to the lexical space of a verb-object relation as it is to assign a phonological space to the domain of a vowel (in oral languages) or a hand-movement gesture (in signed languages). At that point, VP will not be a phase edge, by definition. Extended domains will be viable too, though statistically rare (possibly corresponding to language specific ditransitive expressions, absent in many languages). 32 In turn, once that much is assumed, the first available edge to satisfy the Fibonacci-pattern will come at the vP level, where the first relevant specifier is determined. That itself determines the next space, in counterpoint to what is already established: it will be in the TP domain (with possible variants including extended domains in languages where this is relevant, as per Gallego’s 2009, 2010 observations). The “rhythm”, in Richards’ (2007) terms, then goes on: the next available specifier, at a separate projection that determines its own dynamics, will be at CP. The structure should be symmetric to the one holding underneath, or for that matter above. Technically, the system just proposed turns out to be very similar to the one explored by Chomsky in his recent papers, although some serious differences need to be emphasized as well. Of course, the ontology in (20), adapted now as in (28ia/b) for clarity, is richer than the customary one that Chomsky assumes, as in (28ii): (28)

i. a. EdgeW/S DomainW/S
   b. EW/S DomainW/S Coda
ii. Edge, Domain

Obviously, (28ia) is more nuanced than (28ii), in that the former allows for both light and heavy edges, and as a matter of fact also domains (understanding Gallego’s extended domains as heavy in the relevant sense). In addition (28ib) allows for the possibility of codas. We have seen empirical


reasons to suspect that the “extra syntactic” structure is not out of the question, and in that sense (28i) can be seen as nothing but a beefed up version of the skeletal conditions in (28ii), albeit in a way that gets the relevant domains to be closer in nature to a putative Fibonacci base. Other than that, it would seem as if (28i) has all the virtues (or for that matter defects) that (28ii) carries. There is, however, a very significant conceptual difference between the sort of object in (28ii), which as we saw in section 1 Chomsky attempts to blame on all sorts of conditions with an interface twist (as have other authors), and the one in (28i). If the architecture suggested here is anywhere close to right, in any of its dynamically frustrated variants, (28i) describes stability points, given certain linguistic conditions as in (16) that an intrinsically Fibonacci syntax deploys at this level of abstraction. So while the end result is really very similar to customary phases, in fact purposefully so, the process that carries the system there has little to do with customary assumptions. While it is true that (16) presents “linguistic conditions”, we have seen how all of these conditions are presumably deployed only in that side of the grammar that needs to bound grammatical spaces on two different sides. In contrast, in open syntax (16iii) does not seem to obtain, effectively allowing a syntactic parse with both light and heavy edges. So it is far from clear that these sorts of conditions are simple-minded “interface specifications”, and future research ought to clarify what such substantive constraints amount to.33 Another issue that remains to be seen is what these structural dynamics entail, in terms of systemic computation beyond the obvious patterns. What happens, computationally, at those light edges for material to remain live in the computation?
It may seem as if this is a trivial consequence of “phase impenetrability”: whatever escapes a phase’s edge will continue to be active in the derivation, at least while merely light edges are invoked. However, there presently exists no insightful way to understand why material moves to successive edges to start with (as opposed to being ineffably trapped within a “phase guts”). Moreover, many troubling situations involving long-distance conditions remain,34 and it is unclear how mere transfer constraints of the sort studied here will clear all the problems. It is almost as if, within the “specifier dimension”, things could happen in their own terms, particularly when these edges are appropriately light (as opposed to the heavy edges present in paratactic conditions).


7. Conclusions

Different though syllables and phases are, they are also abstractly similar. The “double articulation of language” that has fascinated linguists for decades may turn out to be a single affair, albeit with different degrees of coarseness. Inside words, we have syllables; outside, phases. The space where these units emerge appears to be of a frustrated, Fibonacci sort, itself possibly a consequence of the fact that language exists as two processes going in opposite directions: bottom-up in terms of its semantic articulations, left-to-right in terms of its parsing in time, or perhaps more radically the overall phonetic structure of the system. This orthogonality is resolvable only in small chunks, which is generally the sort of situation emerging in Fibonacci conditions and frustration more generally. The specific “rhythm”, however, that we see for syllables or phases is a more delicate matter, which we have only touched upon here. Equally delicate is what all of this entails for relations holding across the relevant units. The explanatory interest of Fibonacci conditions in syllable structure stems from two facts: the ontology of syllable types that the F-game allows under certain linguistic (phonological) conditions and the frequency of relevant such types within the generating space of possibilities. These two correlate with observed facts in natural language. If this sort of explanation is to extend to syntactic phases, similar such correlations ought to be expected. To a large extent, this paper has attempted to show how, in fact, the ontology of phase types that an F-game allows under plausible linguistic (syntactic) conditions does correlate with nuanced observations about these domains in syntax. A harder point to make concerns the frequency of such domains. In the case of phonology, relevant samples are easy enough to bound, but the same is not true in the (by definition unbounded) syntactic samples.
That said, it is in principle possible to find relevant statistical regularities within specific corpuses. I do not have data available to me in this specific regard, but the prediction is clear: all other things being equal, given types of phases (in fact, the core vP and CP cases that syntacticians normally work with) ought to be statistically more significant than more exotic cases presented in this paper. The paper has attempted an explicit parallel between cycles in phonology and cycles in syntax. An interesting future exercise, once this general point is made, is to determine to what extent the parallel is complete. It was already argued that, because of the distinct effect of specific interface conditions for phonology and for syntax, “right-edges” play a different role in each domain. But it is also an open question to what extent the “left-edges” are truly comparable. By the mere fact that specific material can be parsed syntactically as


a “right-edge” or a “left-edge”, an obvious (significant) difference emerges. This ought to relate to the central fact, also, that unlike syntactic structures syllables are not recursive. The present paper has had nothing to say about what determines this fundamental difference.

Notes

* My appreciation to Cedric Boeckx, Atakan Ince, Ángel Gallego, Terje Lohndal, and three anonymous reviewers for their comments on various aspects of this work. I assume responsibility for all the errors. Much of the content of this paper has now been integrated into Uriagereka (forthcoming), in a larger context.
1. Also for empirical reasons, given an analysis of compared derivations that only works if the comparison is established between derivations sharing lexical tokens up to the domain of a phase, no more and no less. It would take us too far afield to go into this case, but see Uriagereka (2008:chapter 1) for discussion.
2. Considerations of the sort alluded to in the previous note, and several others of a simpler sort, would not yield the right empirical results if TP were a phase.
3. This is where structural Case/agreement for object and subject are determined, respectively.
4. A sub-atomic particle’s spin is its intrinsic angular momentum, even though this is a quantum theory notion with little relation to the classical notion of spinning. The dynamics we are interested in arise in terms other than atomic spins, although this is the domain where the notion of frustration was originally proposed. We see below several more abstract conditions leading to dynamical frustration.
5. I thank Philippe Binder for alerting me to the existence of this work and to the significance of dynamical frustration more generally.
6. I know what a field is in mathematics, but have not seen an explicit definition of a “field” in cartographic terms. The intuition seems to be that sets of successively embedding categories in the cartography constitute a unit of sorts for a variety of purposes, which go from interface conditions to syntactic nuances.
7. Where FP signals a functional category in the left-periphery, other than TP or CP.
8. I am separating the N instance Richards postulated, as in (1), from the double N instance that Gallego adds to the discussion, as in (2), in terms of a single “−” vs. double “−” representation.
9. The observation ought to proceed without any special focal conditions.
10. The descending frequency in (10) pertains, both, to the number of languages that present the various types of syllables, and also to the rough number of syllable tokens per type in any given language. Needless to say, we also have syllabic consonants to reckon with, a reduced class with continuant conditions (nasals, laterals, etc.). At the level of abstraction invoked here, it does not matter whether the “vocalic” space is relevantly broad, including other phonological specifications capable of sustaining continuant conditions (and see below for even broader specifications pertaining to signed expressions).
11. For instance, the Spanish va “goes” vs. the Dutch vee “cattle”, both instances of the broad CV(V) template; or the Spanish sin “without” vs. the Dutch baard “beard”, both instances of the broad CV(V)C template.
12. Arabela is natively referred to as Ta-pwe-yo-kwa-ka – without final consonants, the general case for this language.
13. Perlmutter (1992) argues that syllabic distributions obtain in signed languages too, where hand movements are the equivalent of open spaces and hand positions of boundaries thereof. According to him, certain impossible combinations in signed language are the equivalent of unacceptable combinations in syllabic structure (e.g. impossible consonantal clusters that would violate the Sonority Hierarchy). If the reason why such limitations emerge is not a demand associated to the speech organs, then the ontology of syllables should be understood in terms as abstract as the ones presented above. See van der Kooij & Crasborn (2008) for a recent perspective and references. A reviewer reminds us how Perlmutter (and also Brentari 1990, 1998, Corina 1990, and Sandler 1993) liken gestural movements in signed languages to a “sonority of a visual sort”. However, it is not the same to raise this point as it is to account for the ontology of syllables we are after. Would this “sonority” predict possible syllabic entities across signed languages, given its general conditions (whatever those are)? And even if we grant the putative result, would “sonority” restrictions be the ultimate explanation for the pattern, or would it be the case, then, that familiar sonority restrictions align with these hypothetical alternatives for mathematical reasons?
14. A reviewer correctly points out that (12ii) is an output condition on a representation of n symbols. However, a derivational approach to these matters is also straightforward. An equivalent way of generating the relevant patterns is in terms of a “Lindenmayer” version (rewriting all rewritable symbols in any derivational line) of the sort of a standard rewrite grammar, using the two rules below:
(i) a. + → −
    b. − → − +
(ii) +
     −
     − +
     − + −
     − + − − +
The successive lines in the derivation in (ii) produce a number of symbols in the Fibonacci sequence. Readers can check that the next derivational line will generate the string “− + − − + − + −”, with the number of symbols expected from adding those in the 4th and 5th derivational lines. The Fibonacci structure, in this instance, arises in the very derivational lines, not the possible representations as in Figure I. To be precise, one can parse successive derivational lines according to the conditions on the F-game imposed on (13), resulting in the same sorts of ontologies we have been discussing. In other words, how the F-game generates the space for the linguistic conditions to operate is not crucial, so long as it is an underlying Fibonacci space.
15. This poses questions similar to those raised on fn. 21.
16. One, two, three, etc. up to seven symbols in this concrete instance.
17. A reviewer is concerned about my way of calculating the symbolic combinations as their number grows, one by one (it should be easy to see that with 13 symbols the combinations are 21, with 14 they are 34, and so on). The reviewer wants me to explain “why it has to be the sum, and not, say, only the number of occurrences found in the last and biggest “step” (here, the 26 strings of 8 symbols), which leads to completely different results”. This is true, as it would be in any other Fibonacci observation in nature. For example, most plants follow an ideal pattern as in Figure I, where the initial steps could be taken to correspond to the trunk, a bifurcation starts in the third step, and so on, in an (in principle) unlimited branching fashion. We could demand that botanists ignore this mathematical fact, by insisting on counting just the leaves of the tree (or the nerves in the leaves, the cells in the nerves, etc.). That may have obscured the series (e.g., for a large number, it may have been hard to identify it as a Fibonacci number). In truth, these are not mathematical gimmicks, but ways to seek a deeper understanding of what the underlying structures of these interactions may be, and it seems pointless to cloud them on purpose.
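The derivational version of the pattern described in the note on the “Lindenmayer” grammar (rules + → − and − → − +, starting from +) is easy to check mechanically. The following Python sketch is only an illustration; `RULES`, `rewrite` and `derivation` are names introduced here, and ASCII `-` stands in for “−”.

```python
# Rewrite rules from the note: (i a) + -> -   and   (i b) - -> - +
RULES = {"+": "-", "-": "-+"}


def rewrite(line):
    """Rewrite every symbol of a derivational line in parallel."""
    return "".join(RULES[symbol] for symbol in line)


def derivation(axiom="+", steps=6):
    """Return the first `steps` derivational lines, starting from `axiom`."""
    lines = [axiom]
    for _ in range(steps - 1):
        lines.append(rewrite(lines[-1]))
    return lines


lines = derivation()
lengths = [len(line) for line in lines]

for line in lines:
    # Show each line with its symbols separated, plus its length.
    print(" ".join(line), len(line))

# From the third line on, each length is the sum of the two preceding ones.
assert all(lengths[i] == lengths[i - 1] + lengths[i - 2]
           for i in range(2, len(lengths)))
```

The sixth line produced this way is the eight-symbol string cited in that note, and its length is the sum of the lengths of the fourth and fifth lines.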
The assumption is that one of the two polar symbols, “+” and “−”, is an unbounded topological entity, while the other constitutes a boundary thereof. It is arbitrary whether “+” or “−” is chosen as the relevant boundary, so long as its polar opposite symbol is kept as the space, and so long as the linguistic conditions in (13) are applied consistently to whichever particular representation is assumed. Concerning (14), a reviewer asks why I do not add up the (a) and (b) results. The reason is that the combinations are generated differently, in one instance starting in a “+” (a boundary) and in the other, instead, in a “−” (a space). The substantive conditions capitalize on this distinction (attempting to maximize spaces, to begin with, see below), so it would cloud the picture not to make this distinction. Douady & Couder’s (1992: 2098-2099) describe their experimental set-up as follows: “The experimental system . . . consists of a horizontal dish filled with silicone oil and placed in a vertical magnetic field H(r) created by two coils near the Helmholtz position. Drops of ferrofluid of equal volume (v ≈ 10 mm3) fall with a tunable periodicity T at the center of the cell. The drops are polarized by the field and form small magnetic dipoles, which repel each other with a force proportional to d-4 (where d is their distance). These dipoles are advected by a

Phase periodicity

95

radial gradient of the magnetic field (from 2.4 x 104 A/m at the center to 2.48 x 104 A/m at the border of the dish), their velocity V(r) being limited by the viscous friction of the oil. In order to model the apex, the dish has a small truncated cone at its center, so that the drop introduced at its tip quickly falls to its periphery . . . The drops ultimately fall into a ditch at the periphery”. The following is a link to an article that, aside from providing a brief overview of “phylotaxis”, presents a video of the actual experiment: http://www.sciencenews.org/view/generic/id/8479. I thank a reviewer for several clarifications about the experimental set-up, which is incorrectly reported in Piattelli-Palmarini & Uriagereka (2008). 20. A rational division k of any given space S will only be optimal to fit k-1 features within the separate portions of that space. When the number of features we want to fit into S does not have a maximum, what we want for the ideal packing is an irrational division of S—If such a thing even exists, the most irrational such division. In fact ĳ is such a number, inasmuch as it satisfies the equation x2 - x - 1 = 0, so that its continued fraction expansion is the simplest of all: (i)

The convergents of this expansion are 1, 2, 3/2, 5/3, 8/5, ..., the ratios of consecutive Fibonacci numbers. How irrational a given number n is depends on how well relevant convergents approximate n. Famous irrational numbers like π or √2 have optimal rational approximations (22/7 and 7/5). However, Hurwitz’s Theorem indirectly guarantees that φ cannot have any such rational approximation (see Hardy et al. 2008 for technical discussion).

21. A reviewer is concerned about the fact that some Fibonacci patterns cannot be of this sort, for instance situations in which we count the number of ways that a “bin” of n cells can be covered by one or two-length “bricks”; the number of different possibilities in such circumstances is Fib (n). Now, of course, given continued fractions as in the previous footnote, any particular situation that happens to fall into that series will have Fibonacci properties, and these would not have arisen out of frustration. The issue, however, is whether there are natural situations for which such mathematical conditions arise. We know of one such situation: dynamical frustration, for reasons having to do with the dynamics of such systems. The fact that this is one such scenario does not preclude others, and for that matter does not uncover the elephant in the room: how nature figures out that these bits of mathematics are the ideal solution to its dynamical conditions. That is where the crux of the question is, and I have little to offer about resolving it. My goals as a linguist are more modest: I would be happy with being able to establish relevant (natural) dynamical conditions in language which result in the observable patterns.

22. An entire branch of bio-physics is concerned with these sorts of questions, although the matter in psychology is still in its infancy; see Medeiros (2008) for much useful discussion and references.

23. Perhaps specifying the timing of successive cell divisions and the cell-to-cell adhesion thresholds.

24. This may seem like an arbitrary decision, one that, in terms of the L-systems in fn. 14, forces us to choose the variant in (ib) as opposed to the more general variant in (ia):
(i) a. 0 → 1, 1 → {0, 1}
b. 0 → 1, 1 → 1, 0
It may well be, however, that (ib) is a linearized version of (ia), and that the linearization happens to be in the direction “specifier first”. Why that should be, of course, is an interesting matter in itself, as discussed in Uriagereka (forthcoming), where multiple references are provided too.

25. Toyoshima (1997) uses the term “process” to refer to a “separately built phrase marker”. The same point was raised in Uriagereka (1995/1997), citing the presentation in various venues of what was later to become Uriagereka (1997/1999). Along similar lines, Zwart (2004) separates between “current” and “previous” (or “auxiliary”) derivations, as does Johnson (2002) with different terminology but similar assumptions.

26. For instance, in a sentence like (i), the extension of the “complement” matters to determine the duration of the event denoted by the sentence, via the verb freed, while the extension of the “subject” does not:
(i) Lincoln freed the slaves
Lincoln was assassinated before completing his goal, but nonetheless the event is considered complete when there are no more slaves, whether or not Lincoln lived to tell the story.

27. So in a sentence like (i), the relevant logical dependencies for the determiner most are between the complex expression in (iib) and the simple quantifier restriction in (iia), containing the complement of the determiner; there are no similar simple quantificational dependencies with any other object in the sentence:
(i) I love most children
(ii) a. children
b. [I love x]

28. Many idioms exist with the format [V complement], as in (i), but putative idioms with the alternative format [subject V] do not seem to exist:
(i) a. John kicked the bucket
b. John hit the road
c. John broke the news

29. For example, an underlying representation with roughly the import of (i) yields incorporations as in (iia), but no other form of incorporation seems possible, beyond that of the “complement”:

(i) Basques traditionally hunted whales
(ii) a. Basques traditionally whale-hunted
b. *(there) traditionally Basque-hunted whales (cf. “there hit the stands a new journal”)
c. *Basques tradition-hunted whales

30. These typically involving not one, but a series of 1st merges.

31. Uriagereka (forthcoming) exploits Lasnik & Kupin’s (1977) “reduced phrasemarkers” for this point. A natural interpretation of “monostrings” within those formal objects (i.e., strings containing precisely one non-terminal element) is as hybrid elements arising from having Markovian strings of phonetic (terminal) symbols formally match more complex associations among non-terminal items.

32. See Gallego (2010) for a suggestion in this direction, the idea being that just as he unearthed heavier phases in the CP/TP domain, so too languages present such creatures in the vP/VP domain.

33. Including whether the underlying dynamics obtained just in the evolution of the language faculty, or is rehearsed every time a given language is acquired, let alone in actual parsing.

34. Long-distance anaphora and obviation, multiple question interactions, polarity licensing, etc.
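The Fibonacci facts that the notes above rely on can be checked mechanically. The following is only a minimal sketch, not part of the chapter's argument: it rewrites the ordered L-system variant 0 → 1, 1 → 1 0, lists the first convergents of the continued fraction for φ from fn. 20, and counts the brick coverings mentioned in fn. 21. The starting symbol "0", the function names, and the choice of Python are assumptions made for illustration.

```python
from fractions import Fraction
from math import sqrt

# Ordered variant of the L-system: 0 -> 1, 1 -> 1 0 (output read left to right).
RULES_IB = {"0": "1", "1": "10"}

def rewrite(line, rules=RULES_IB):
    """Rewrite every symbol of a derivational line once, in parallel."""
    return "".join(rules[symbol] for symbol in line)

def derivation(axiom, steps, rules=RULES_IB):
    """Return the axiom followed by `steps` successive rewritings."""
    lines = [axiom]
    for _ in range(steps):
        lines.append(rewrite(lines[-1], rules))
    return lines

def golden_convergents(count):
    """First `count` convergents of the continued fraction 1 + 1/(1 + 1/(1 + ...))."""
    convergents = []
    value = Fraction(1)
    for _ in range(count):
        convergents.append(value)
        value = 1 + 1 / value  # add one more level of nesting
    return convergents

def brick_coverings(n):
    """Ways to cover a bin of n cells with bricks of length one or two."""
    a, b = 1, 1  # coverings of a bin with 0 cells and with 1 cell
    for _ in range(n):
        a, b = b, a + b
    return a

# Assumption: the derivation starts from the single symbol "0".
line_lengths = [len(line) for line in derivation("0", 7)]
print(line_lengths)

print([str(c) for c in golden_convergents(5)])

# Approximation error of each convergent p/q, scaled by q squared.
phi = (1 + sqrt(5)) / 2
scaled_errors = [abs(phi - float(c)) * c.denominator ** 2
                 for c in golden_convergents(10)]
print(scaled_errors[-1], 1 / sqrt(5))

print([brick_coverings(n) for n in range(1, 9)])
```

The lengths of successive derivational lines and the covering counts both run through the Fibonacci series; whether the covering count for n cells is written Fib(n) or Fib(n+1) depends on where one starts indexing the series. The scaled errors settle near 1/√5, the constant that appears in Hurwitz's Theorem, which is one way of seeing why the convergents of φ never approximate it unusually well.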

References

Abels, Klaus
2003 Successive-cyclicity, anti-locality, and adposition stranding. Doctoral dissertation, University of Connecticut.
Baker, Mark C.
1988 Incorporation: A theory of grammatical function changing. Chicago: University of Chicago Press.
Barwise, Jon and Robin Cooper
1984 Generalized quantifiers and natural language. Linguistics and Philosophy 4: 159-219.
Berwick, Robert and Amy Weinberg
1984 The Grammatical Basis of Linguistic Performance. Cambridge: MIT Press.
Binder, Philippe
2008 Frustration in Complexity. Science 320-5874: 322-323.
Blevins, Juliette
1995 The Syllable in Phonological Theory, Handbook of phonological theory, ed. by John Goldsmith, Basil Blackwell, London, 206-44. [Reprinted in C. W. Kreidler 2001 (ed.), Phonology: Critical Concepts, Volume 3. London: Routledge, 75-113.]


Boeckx, Cedric
2007 Understanding minimalist syntax: Lessons from locality in long-distance dependencies. Oxford: Blackwell.
2008 Elementary syntactic structures. Ms., Harvard University.
Boeckx, Cedric and Kleanthes Grohmann
2007 Putting Phases in Perspective. Syntax 10: 204-222.
Bošković, Željko
2002 A-movement and the EPP. Syntax 5: 167-218.
Brentari, Diane
1990 Theoretical foundations of American Sign Language Phonology. Doctoral dissertation, University of Chicago.
1998 A prosodic model of sign language phonology. Cambridge, MA: MIT Press.
Carstairs-McCarthy, Andrew
2000 The Origins of Complex Language: An Inquiry into the Evolutionary Beginnings of Sentences, Syllables, and Truth. Oxford: Oxford University Press.
Chandra, Pritha
2007 (Dis)agree: Movement and agreement reconsidered. Doctoral dissertation, University of Maryland.
Chomsky, Noam
1993 A minimalist program for linguistic theory. In K. Hale and S. J. Keyser (eds.), The view from Building 20. Cambridge, MA: MIT Press, 1-52.
1994 Bare phrase structure. Ms., MIT. [Published in G. Webelhuth (ed.), 1995. Government and Binding and the Minimalist Program. Oxford: Blackwell, 385-439.]
2000 Minimalist inquiries: the framework. In R. Martin et al. (eds.), Step by Step. Cambridge, MA: MIT Press, 89-155.
2001 Derivation by phase. In M. Kenstowicz (ed.), Ken Hale: A Life in Language. Cambridge, MA: MIT Press, 1-50.
2004 Beyond explanatory adequacy. In A. Belletti (ed.), Structures and Beyond. Oxford: Oxford University Press, 104-131.
2005 Three factors in language design. Linguistic Inquiry 36: 1-22.
2008 On phases. In Foundational Issues in Linguistic Theory. Essays in Honor of Jean-Roger Vergnaud, C. Otero et al. (eds.), 134-166.
2009 Opening Remarks. In M. Piattelli-Palmarini, P. Salaburu and J. Uriagereka (eds.), Of Minds and Language: A Conversation with Noam Chomsky in the Basque Country. Oxford: Oxford University Press.
Chomsky, Noam and George Miller
1963 Introduction to the Formal Analysis of Natural Language. In R. Luce, R. Bush and E. Galanter (eds.), Handbook of Mathematical Psychology, Volume II. New York: Wiley.


Collins, Chris
1997 Local Economy. Cambridge, MA: MIT Press.
Corina, David
1990 Handshape assimilation in hierarchical phonological representations. In C. Lucas (ed.), Sign language research: Theoretical issues. 27-49. Washington, DC: Gallaudet University Press.
Diep, H.T.
2005 Frustrated Spin Systems. World Scientific Publishing Company.
Dikken, Marcel den
2007a Phase extension. Contours of a theory of the role of head movement in phrasal extraction. Theoretical Linguistics 33: 1-41.
2007b Phase extension: A reply. Theoretical Linguistics 33: 133-63.
Douady, Stéphane and Yves Couder
1992 Phyllotaxis as a Physical Self-organized Growth Process. Phys. Rev. Lett. 68: 2098-2101.
Epstein, Samuel, Eric Groat, Ruriko Kawashima, and Hisatsugu Kitahara
1998 The Derivation of Syntactic Relations. Oxford: Oxford University Press.
Epstein, Samuel and Daniel Seely
2002 Rule applications as cycles in a level-free syntax. In S. Epstein and D. Seely (eds.), Explanation and Derivation in the Minimalist Program. Oxford: Blackwell, 65-89.
Fox, Danny
2000 Economy and Semantic Interpretation. Cambridge, MA: MIT Press.
Fox, Danny and Howard Lasnik
2003 Successive cyclic movement and island repair: the difference between sluicing and VP-ellipsis. Linguistic Inquiry 34: 143-154.
Fox, Danny and David Pesetsky
2005 Cyclic linearization of syntactic structure. Theoretical Linguistics 31: 1-46.
Gallego, Ángel J.
2009 Phases and Variation. Exploring the Second Factor of Language. In Alternatives to Cartography, J. van Craenenbroeck (ed.), Berlin: Mouton de Gruyter, 109-152.
2010 Phase Theory. Amsterdam: John Benjamins.
Goremychkin, E. A. et al.
2008 Spin-glass order induced by Dynamic Frustration. Nature Physics 1-10: 766-771.
Grimshaw, Jane and Armin Mester
1988 Light Verbs and Theta-Marking. Linguistic Inquiry 19: 205-232.
Grohmann, Kleanthes
2003a Prolific Peripheries. Amsterdam: John Benjamins.

2003b Successive cyclicity under (anti-)local considerations. Syntax 6: 260-312.
Halle, M. and K. Stevens
1962 Speech recognition: a model and a program for research. IEEE Transactions on Information Theory 8: 155-160.
Hardy, G.H. et al.
2008 Theorem 193. An introduction to the Theory of Numbers (6th ed.). Oxford science publications. p. 209.
Jeong, Youngmi
2006 Multiple questions in Basque. University of Maryland Working Papers in Linguistics 15: 98-142.
Johnson, Kyle
2002 Towards an etiology of adjunct islands. Ms., University of Massachusetts, Amherst, MA.
Kaye, Jonathan, Jean Lowenstamm, and Jean-Roger Vergnaud
1985 The internal structure of phonological elements: A theory of Charm and Government. Phonology Yearbook 2: 305-328.
Kayne, Richard S.
1994 The antisymmetry of syntax. Cambridge, MA: MIT Press.
Lewis, Shevaun and Colin Phillips
2009 Derivational Order in Syntax: Evidence and Architectural Consequences. Ms., University of Maryland.
Manzini, Rita
1994 Locality, Minimalism and Parasitic Gaps. Linguistic Inquiry 25: 481-508.
Marantz, Alec
1984 On the nature of grammatical relations. Cambridge, MA: MIT Press.
Medeiros, David
2008 Optimal Growth in Phrase Structure. Biolinguistics 2: 152-195.
2009 Doctoral dissertation, University of Arizona.
Moro, Andrea
2000 Dynamic antisymmetry. Cambridge, MA: MIT Press.
Nerukh, Dmitry
2008 Dynamical frustration of protein’s environment at the nanoseconds time scale. Journal of Molecular Liquids 145: 139-144.
Perlmutter, David
1992 Sonority and syllable structure in American Sign Language. Linguistic Inquiry 23: 407-442.
Piattelli-Palmarini, Massimo and Juan Uriagereka
2008 Still a Bridge too Far? Biolinguistic questions for grounding language on brains. Physics of Life Reviews 5: 207-224.


Piekarewicz, J.
2008 The Nuclear Physics of Neutron Stars. 5th ANL/MSU/JINA/INT Workshop: Bulk Nuclear Properties.
Poeppel, David, William Idsardi, and Virginie van Wassenhove
2008 Speech perception at the interface of neurobiology and linguistics. Philosophical Transactions of the Royal Society, B: Biological Sciences 363: 1071-1086.
Richards, Marc D.
2006 Deriving the Edge: What’s in a Phase? Ms., University of Cambridge.
2007 On feature inheritance: an argument from the phase impenetrability condition. Linguistic Inquiry 38: 563-572.
Richards, Norvin
2002 Very local A-bar movement in a Root-First derivation. In S. D. Epstein and T. D. Seely (eds.), Explanation and Derivation in the Minimalist Program. Oxford: Blackwell, 227-248.
Ross, J. R.
1967 Constraints on variables in syntax. Doctoral dissertation, MIT. [Published in 1986 as Infinite Syntax! Norwood, NJ: Ablex.]
Sabbagh, Joseph
2007 Ordering and Linearizing Rightward Movement. Natural Language and Linguistic Theory 25: 349-401.
Sandler, Wendy
1993 A sonority cycle in American Sign Language. Phonology 10: 243-279.
Saussure, Ferdinand de
1916 Cours de linguistique générale. C. Bally and A. Sechehaye (eds.), Lausanne: Payot.
Svenonius, Peter
2001 Impersonal passives: A phase-based analysis. In A. Holmer, J. O. Svantesson, and Å. Viberg (eds.), Proceedings of the 18th Scandinavian Conference of Linguistics. Lund: Lund University.
Takahashi, Daiko
1994 Minimality of movement. Doctoral dissertation, University of Connecticut.
Tenny, Carol
1994 Aspectual Roles and the Syntax-Semantics Interface. Dordrecht: Kluwer Academic Publishers.
Torrego, Esther and Juan Uriagereka
1992 Indicative Dependents. Ms., UMass Boston/UMD.
2002 Parataxis. In J. Uriagereka (ed.), Derivations. Exploring the Dynamics of Syntax, London: Routledge, 253-265.
Townsend, David and Thomas Bever
2001 Sentence Comprehension: The Integration of Habits and Rules. Cambridge, MA: MIT Press.


Toyoshima, Takashi
1997 Derivational CED. In WCCFL 15, B. Agbayani and S.-W. Tang (eds.). CSLI, Stanford, 505-519.
Uriagereka, Juan
1995 Formal and Substantive Elegance in The Minimalist Program. Invited talk at the Workshop on Economy in Grammar, Arbeitsgruppe Strukturelle Grammatik, Max-Planck Institute.
1997 Formal and Substantive Elegance in the Minimalist Program. In C. Wilder et al. (eds.), The Role of Economy Principles in Linguistic Theory. Berlin: Akademie Verlag, 170-204.
1998 Rhyme and Reason: An Introduction to Minimalist Syntax. Cambridge, MA: MIT Press.
1999 Multiple spell-out. In Working minimalism, N. Hornstein and S. Epstein (eds.), 251-282. Cambridge, MA: MIT Press.
2008 Syntactic anchors. On semantic structuring. Cambridge: Cambridge University Press.
forth. Spell-out and the Minimalist Program. Oxford: Oxford University Press.
van der Kooij, Els and Onno Crasborn
1994 Syllables and the Word-Prosodic System in Sign Language of the Netherlands. Lingua 118, 1307-1327.
Yu, Y. et al.
2007 A genetic network for the clock of Neurospora crassa. PNAS 104-8: 2809-2814.
Zwart, Jan Wouter
2004 Case and Agreement as Dependent Marking. Groningen PIONIER Colloquium, Nijmegen, Netherlands, 14 April 2004.

Exploring phase based implications regarding clausal architecture. A case study: Why structural Case cannot precede theta*

Samuel D. Epstein, Hisatsugu Kitahara, T. Daniel Seely

1. Introduction

There exists in contemporary theories of syntax a widely adopted hypothesis that syntactic representations of sentential phenomena exhibit a hierarchically organized clausal architecture. Essentially, there is the theta domain, above that the Case domain, and finally the scope domain. Why should this be true? Is it to be stipulated, or can aspects of clausal architecture be derived, within a level-free phase-based framework? More generally, might aspects of clausal architecture in mental representations be deduced from the architecture of UG and 3rd factor principles? We suggest that (at least to some degree, and perhaps even more than we explore here) the answer is yes. The history of the development of syntax has witnessed different approaches to basic clausal architecture. For example, within Government and Binding theory an argument DP could not be assigned structural Case before it was assigned a theta role. Why? This was a result of a conspiracy of the following “GB architectural axioms”:

(1) The postulation of an “all at once” D-Structure (DS) level of representation

(2) The application of the Theta Criterion at DS

An argument DP in a non-theta position at DS (the point of its ‘syntactic birth’) violates the Theta Criterion at this level of representation (by definition) and the reception by DP of a theta role later on could have no effect (positive or otherwise) on this DS offense. At S-structure (SS) the argument DP might have both a theta role and structural Case,¹ but within GB, this is insufficient; it is simply ‘too late’: given the stipulated architecture of the GB system, the DP must get its theta role “first” at the initial level of DS and so could never possibly be assigned structural Case before receiving a theta role. But, within the more recent Minimalist Program, where syntax-internal levels of representation including D- and S-structure are eliminated, such an axiomatic description of clausal architectural hierarchy is impossible; indeed since there is no D- nor S-structure, the “conspiracy” indicated above, induced by ordering DS before SS and stipulating that certain filters (theta) do apply at DS while others apply at a subsequent level of representation (the Case Filter) can’t even be formulated in MP. For MP, there are no noninterface levels such as D- and S-structure, and hence there is no ordering of them, and no syntactic filters apply internal to the narrow syntax. What then becomes of properties of clausal architecture for the MP? Must we adopt stipulations, such as that from Chomsky (2000) that an argument must be first merged into a theta position (which implies the existence of special properties of first Merger and of argument merger, as well)? How much, if any, of the attested clausal architecture, theta and then Case (and then scope), can be explained by being deduced from independently motivated components of the framework? This paper explores this question, suggesting that at least some properties of clausal architecture may be deducible within the MP, specifically from the analysis of Chomsky (2007, 2008).² We explore the idea that Case before Theta invariably results in crashing and/or interface gibberish (a violation of Full Interpretation (FI) at the Conceptual-Intentional (CI) interface) even within the level-less, optimality-seeking, filter-free minimalist system of Chomsky (2007, 2008).
One central aspect of clausal architecture is shown to be deducible from Chomsky’s current analysis, following strict minimalist tenets, positing nothing beyond irreducible lexical features, natural (and independently motivated) Interface Conditions, and 3rd Factor considerations expressing optimality-seeking minimalist design. If on track, this provides another deep explanatory improvement over the level-ordered GB system, within which clausal architecture follows only from stipulated aspects of level ordering (DS precedes SS) and unexplained stipulations as to which filters apply at which levels and which do not (Theta Criterion, but not the Case Filter, applies at DS).


2. Under the phase-based, feature inheritance analysis structural Case cannot precede theta-marking

2.1. Phase Heads and Feature Inheritance

Within Chomsky’s (2007, 2008) analysis, the lexical entries for the phase heads (C and v) inherently contain unvalued φ-features.³ The valuation of a DP’s structural Case feature is achieved via φ-feature agreement (under the Probe-Goal analysis) with a phase head.⁴ But, there is no direct agreement between the Probing phase head (C or v) and the DP Goal that gets Case valued. Rather, the phase head must first transmit its φ-features to the head of its complement (see Chomsky 2007, 2008, and Richards 2007).⁵ The head of the phase complement, in turn, agrees with the Goal DP, valuing that DP’s Case. Why does the phase head C transmit its φ-features to its complement head T, as the phase head v does to its complement head V, thereby allowing T and V to induce Agree (i.e. feature valuation)? Such feature transmission is itself explicable, and need not be stipulated, since, on independent grounds, syntactically-valued features (such as valued φ) appearing on a phase edge will invariably induce crash at the Conceptual-Intentional (CI) interface, as insightfully explained by Chomsky (2007, 2008) and Richards (2007).⁶ Simply put: φ-feature transmission must occur since, if it doesn’t, convergence would be impossible. Chomsky (2001:5) proposed that “the uninterpretable features, and only these, enter the derivation without values, and are distinguished [in the narrow syntax, SE, TDS, HK] from interpretable features by virtue of this property.”⁷ This proposal predicts that the distinction between unvalued and valued features is lost in the eyes of Transfer at the moment such unvalued features change from unvalued to valued.
Once unvalued features get valued, they will be regarded just like inherently valued features, and thus Transfer cannot (“know” to) remove them since Transfer is a purely formal non-interpretive operation and hence knows nothing about interpretability at the (not yet reached) interface. However, even though non-interpretive and non-lookahead, Transfer can see (purely formal) feature values, and can make its decision regarding what to Transfer where, based on these detectable formal feature values. The valuation analysis however confronts the so-called “before-and-after” problem, a problem regarding the exact timing of Transfer application. Transferring features before valuation is too early (i.e. unvalued features cause crash at the interface) and transferring features after they are valued is too late (i.e. after valuation, Transfer cannot remove valued features, also leading to crash). For example, Transfer cannot distinguish a valued φ-feature on a DP from a valued φ-feature on Tense: they are identically valued φ-features in the eyes of Transfer, which, as a result, cannot discern that the valued φ-features in T will be uninterpretable at CI, while the valued φ on DP will be interpretable at CI (see Epstein & Seely 2002 for detailed discussion and possible solutions). To solve this problem, Transfer must remove unvalued features at the point of their valuation. Thus Chomsky (elegantly) seeks to explain cyclic phasal Transfer, not just stipulate it. As Chomsky (2008:19), echoing Chomsky (2001), states:

If transferred to the interface unvalued, uninterpretable features will cause the derivation to crash. Hence both interface conditions require that they cannot be valued after Transfer. Once valued, uninterpretable features may or may not be assigned a phonetic interpretation (and in either case are eliminated before the SM interface), but they still have no semantic interpretation. Therefore they must be removed when transferred to the CI interface. Furthermore, this operation cannot take place after the phase level at which they are valued, because once valued, they are indistinguishable at the next phase level from interpretable features, hence will not be deleted [by Transfer, SDE HK TDS] before reaching the CI interface. It follows that they must be valued at the phase level where they are transferred, …
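The “before-and-after” problem can be pictured with a toy model. This is only an illustrative sketch under stated assumptions, not the authors' formalization: the `Feature` record, the `transfer_view` and `strip_at_valuation` helpers, and the value "3sg" are hypothetical names and values. The one substantive assumption, taken from the text, is that Transfer sees formal feature values but not interpretability.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class Feature:
    name: str                   # e.g. "phi"
    value: Optional[str]        # None stands for "unvalued"
    interpretable_at_ci: bool   # visible to the analyst, hidden from Transfer

def transfer_view(feature):
    """What the purely formal operation Transfer can inspect: name and value."""
    return (feature.name, feature.value)

def strip_at_valuation(before, after):
    """Transfer applying at the phase where valuation happens: it can compare
    the two snapshots and drop every feature that entered the phase unvalued."""
    return {host: feat for host, feat in after.items()
            if before[host].value is not None}

# Snapshots of one phase, before and after Agree (hypothetical values).
before = {"DP": Feature("phi", "3sg", True), "T": Feature("phi", None, False)}
after = {"DP": Feature("phi", "3sg", True), "T": Feature("phi", "3sg", False)}

# Too late: once valued, phi on T looks exactly like phi on DP.
too_late_indistinguishable = transfer_view(after["DP"]) == transfer_view(after["T"])
print(too_late_indistinguishable)

# At the point of valuation: only the inherently valued phi on DP goes to CI.
sent_to_ci = strip_at_valuation(before, after)
print(sorted(sent_to_ci))
```

In this sketch the only way for Transfer to remove the right features is to apply while the “before” snapshot is still available, which is the sense in which valuation and Transfer must coincide at the phase level.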

As illustration of this motivation for feature inheritance whereby φ-features must be moved from a phase head to the head of the phase head complement, consider (3).

(3) [CP C_φ [TP T [vP Sue [v v* [VP jumped ] ] ] ] ]

The phase head C (inherently) bears φ-features, while T does not. If C does not transmit its φ-features to T, there could (in principle) be direct Agree (C, Sue) valuing φ of C and Case of Sue.⁸ However, since C is a phase head, the TP complement of C will be transferred given the Phase Impenetrability Condition (PIC),⁹ only C (the phase edge) remains. The problem, however, is that now the “derivational history” of the valuation of C’s φ-features (which happened via agreement with Sue) is representationally unrecoverable since the TP containing Sue has been Transferred (under the PIC). Consequently, at the next phase level, Transfer will only “see” the now valued φ-features of C and will not have access to the information it needs, namely that these features were previously unvalued and became valued via Agree with a DP goal within TP (now gone). Consequently, Transfer cannot then ‘know’ to remove these φ-features from the CI-bound object; i.e., in the eyes of the non-interpretive operation Transfer, the valued φ on C (which need to be removed for CI convergence, since they are uninterpretable) are identical to the valued φ-features on N, which are NOT to be removed. It follows, then, that C must transmit its φ-features to T, allowing for Agree (T, Sue).¹⁰ Internal to the TP (the phase head complement), Transfer can see that φ of T and Case of Sue went from unvalued to valued and thus must be removed from the CI-bound object, avoiding (correctly in this instance) CI crash. To summarize, for Chomsky (2007, 2008) the phase heads C and v inherently bear φ-features as a lexical property. Given the nature of Valuation and PIC, it follows that C/v must transmit these φ-features to T/V, and T/V in turn Agree with a minimally searchable nominal. Failure to transmit φ yields crash at the next phase. This analysis has a further, and unnoted, consequence, namely, theta “assignment” must take place before structural Case valuation, the details of which we reveal in the next section.

2.2. Feature-transmission + Agree + PIC entails: Structural Case cannot precede Theta-Marking

Minimalist analysis seeks to eliminate non-interface, syntax-internal levels of representation, such as GB Theory’s D-Structure and S-Structure, deriving their properties from (i) irreducible lexical features, (ii) Interface Conditions, the conditions that narrow syntax must satisfy to be “usable at all” by the systems with which it interacts (see Epstein 2007 for discussion of this Internalist Functionalism approach), and (iii) general properties, including those associated with optimal computation (3rd factor considerations, Chomsky 2005).
As noted, it has long been stipulated that structural Case assignment cannot precede theta role assignment, i.e. the clausal architecture is: Theta, then Case, then scope.¹¹ Interestingly, the same conclusion is (attractively) deducible from postulates of the minimalist analysis reviewed above, which has no syntax-internal levels, hence no level-ordering, and no syntax-internal filters either. That structural Case cannot precede theta marking is deducible from principles motivated on entirely independent grounds unrelated to Theta Theory. The short story is this: any instance of a DP first merged into a non-theta, structural Case position by External Merge (EM), and then “moved into” a theta position by Internal Merge (IM) will necessarily result in that DP being problematic for the CI-interface, i.e. a convergent, nongibberish expression can be derived only if theta precedes Case.


Here we provide a reductio argument, i.e. we will provide the necessary properties of any derivation in which structural Case is assigned before theta, and will show that it either crashes or else yields convergent gibberish at the CI interface. To illustrate, consider the DP in (4):¹²

(4) DP_{vlφ / uCase}
where: DP is an argument, and DP bears inherently valued φ-features, and DP bears an unvalued Case feature

In order to value the Case of this DP, before assigning it a theta role, we must merge in a Case-valuing category and we must do so before any theta assigner is externally merged in such a way as to assign the DP a theta role. Suppose, for purposes of illustration, that the potential Case-valuer T is externally merged in:¹³

(5)¹⁴ T > … > DP
where: “…” may contain other categories but contains no theta assigner of DP and therefore DP is not in a theta position

Now, for T to actually value Case on DP (prior to DP receiving theta), two prerequisite conditions must be met:

(6) In order for Agree (T, DP) to take place,
a. there must be no phase head intervening between T and DP. This follows from the PIC: such a phase head would induce Transfer and (given the PIC) DP would in effect be “gone”, and hence unavailable for Probe by T, and hence unavailable for Agree (T, DP), and,
b. T must inherit φ-features from C¹⁵
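The two prerequisites in (6) amount to a simple conjunction, which can be written as a check. The sketch below is only an illustration under assumed encodings: the spine between T and DP is represented as a flat list of head labels, and the names `PHASE_HEADS` and `can_agree` are hypothetical, not part of the analysis.

```python
# Assumption: phase heads are identified by their labels, C and v, as in the text.
PHASE_HEADS = {"C", "v"}

def can_agree(t_has_inherited_phi, intervening_heads):
    """Return True only if both conditions in (6) hold for Agree (T, DP)."""
    # (6a) no phase head between T and DP, otherwise DP is already transferred
    no_phase_head = not any(head in PHASE_HEADS for head in intervening_heads)
    # (6b) T carries phi-features inherited from C
    return no_phase_head and t_has_inherited_phi

print(can_agree(True, []))        # both conditions met
print(can_agree(True, ["v"]))     # a phase head intervenes: (6a) fails
print(can_agree(False, ["V"]))    # T has not inherited phi: (6b) fails
```

Nothing in this check mentions theta roles, which is why the configuration in question can satisfy (6) while the DP is still outside any theta position.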

Thus, C must be externally merged with the T-projection so that C (lexically bearing unvalued φ-features) can transmit φ to T. Thus, the relevant configuration for structural Case valuation before theta assignment would (have to) be of the following form:

(7) C_{uφ} + [TP T > … > DP_{vlφ / uCase} ]
where “…” contains no phase head and no theta marker of DP, and, therefore, DP is not in a theta configuration

Exploring phase based implications regarding clausal architecture


As discussed above, C must transmit its φ-features to T (otherwise the φ-features would be stranded on the edge of C, causing a crash at the next phase). Thus:

(8) φ-Feature Inheritance, from C to T: C + [TP Tuφ > … > DPvlφ / uCase ]
where “…” contains no phase head and no theta marker of DP, and, therefore, DP is not in a theta configuration

Once T receives φ-features from C, T can φ-probe DP and (since, by assumption, all relevant conditions are met) Agree (T, DP) can take place, valuing the φ-features of T and nominative Case of DP:

(9) Result of Agree (T, DP): C + [TP Tvlφ > … > DPvlφ / NOM ]
where “…” contains no phase head and no theta marker of DP, and, therefore, DP is not in a theta configuration

We have thus reached the first step in our reductio argument, i.e. we have performed a derivation in which structural Case on an argument DP is valued before DP is assigned a theta role. But what happens next? By the Phase Impenetrability Condition, Transfer of TP applies to (9).16 Thus the TP complement of the phase head C is transferred to the phonological component, and transferred (minus CI-illegitimate features like structural Case) to the semantic component. What enters the semantic component, therefore, is

(10) [ T > … > DP ]
where, crucially, DP is not in a theta configuration
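The derivational steps in (4) through (10) can be restated as a small procedural sketch. This is only an illustration under simplifying assumptions, not part of the analysis itself: the class and function names (DP, T, inherit_phi, agree, interpret_transferred) and the string labels are hypothetical, and the interface is reduced to a single check on the transferred DP.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DP:
    """Argument DP as in (4): valued phi, unvalued Case, no theta role yet."""
    phi_valued: bool = True
    case: Optional[str] = None    # None means the Case feature is unvalued
    theta: Optional[str] = None   # None means DP is in no theta configuration


@dataclass
class T:
    phi: Optional[str] = None     # None: no phi; "u": unvalued; "vl": valued


def inherit_phi(t: T) -> None:
    """(8): C transmits its unvalued phi-features to T."""
    t.phi = "u"


def agree(t: T, dp: DP) -> None:
    """(9): T probes DP; T's phi and DP's Case are valued together."""
    if t.phi != "u" or not dp.phi_valued:
        raise ValueError("Agree (T, DP) is not possible")
    t.phi = "vl"
    dp.case = "NOM"


def interpret_transferred(dp: DP) -> str:
    """(10): status at CI of a transferred complement containing dp."""
    if dp.case is None:
        return "crash"                  # an unvalued feature reaches the interface
    if dp.theta is None:
        return "convergent gibberish"   # Full Interpretation is violated
    return "convergent"


dp, t = DP(), T()
inherit_phi(t)
agree(t, dp)
print(interpret_transferred(dp))  # convergent gibberish
```

In this toy rendering, Case valuation succeeds, but because no theta assigner was merged before Transfer, the transferred object converges only as gibberish.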

That is, the DP has no theta role in the CI representation of the transferred TP. Thus, a “Theta Criterion” violation at CI results. But what exactly becomes of “The Theta Criterion”, a GB postulate, within Minimalism? The relevant and more general CI interface condition in this instance is the arguably ineliminable principle of Full Interpretation (Chomsky 1986); the DP itself can be interpreted, but it will have no computable semantic (i.e. theta) relation with anything else in the given local structure; no coherent compositional semantic relation between DP and the rest of the structure can be established. Thus, CI representations such as (10) will have essentially the same status as, say, John seems that Sue sleeps (called “theta criterion violations” in GB parlance) wherein John has its Case checked, but has no semantic/theta relation to the rest of the sentence, yielding convergent gibberish at CI. The kind of strings receiving the analysis in (10) above include, for example

(11) [CP C = that   [TP T = is   …   DP = John ]
         for            to            John
         if             is            John
     that/for/if . . . is/to . . . John

Structural Case (NOM) is valued, but no theta role is assigned, and the result is that the transferred TP is (correctly) predicted to be convergent gibberish at CI, the only level of representation at which it “makes sense” to apply interpretive conditions like “the theta criterion” (re-construed as a facet of the far more general principle of Full Interpretation). The prediction is obtained with no theta criterion stipulated to apply at a syntax-internal, all-at-once, derivation-initiating, deep structure level of representation. It thus seems to be a property of the system that if any gibberish (i.e. a single transferred gibberish Phase Head Complement (henceforth, PHC)) reaches the CI interface, the entire subsequent derivation is semantically doomed to gibberish. That is, “gibberish of the part entails gibberish of the whole”. Indeed, to say that PHCs are cyclically transferred to the interface is to say that they periodically do undergo interpretation at the interface. (Thus semantic compositional recycling at LF, i.e. doing bottom-up compositional interpretation on an entire already built-up LF representation, as in GB, is avoided (Epstein et al. 1998).) Given this, the appearance of any gibberish at the interface naturally entails a gibberish final complete sentential CI representation.

Every phase head complement (each a syntactic object) is transferred to both PHON and SEM for interpretation. In SEM, the object is converted into a semantic representation. If the object contains a “free” argument, unrelated semantically to the rest of the local representation, i.e. thematically unrelated to the rest of the transferred PHC, then that structure violates FI as gibberish. Crucially, once interpreted at SEM, the object is given over to CI, and now no further semantic operations apply to its internal (gibberish) semantics. It doesn’t matter that the transferred PHCs will be combined at CI in assembling the complete, multi-phasal sentential CI representation. The internal semantic interpretation of the PHC is done and so no combination of the entire PHC with some other PHC(s) can salvage the PHC-internal gibberish. Thus once a PHC is gibberish, it is impossible to continue the derivation and overcome this gibberish. To illustrate the immutability of a gibberish PHC, let’s reconsider our core case where an argument DP begins in a non-theta position, and has its Case checked; but now suppose it “escapes” the PHC and moves into a theta position,

(12) DP > … > PH [PHC … DP … ]

Although the argument DP has itself acquired a theta role (by assumption), the PHC now containing the copy of the argument DP (i.e. [PHC … DP[-theta] … ] in (12)) will be transferred to PHON and SEM. In SEM, this syntactic object will be converted into a semantic representation that will be gibberish in that the argument DP (copy) has no theta relation to the rest of the transferred PHC of which it is a part. This semantic representation is then given over to CI. Later, other PHCs of this syntactic derivation (and the root phase) will get transferred to SEM and their internal semantics will be computed individually. Ultimately, these semantic representations will be connected together at CI, yielding the entire sentential CI representation (Root + PHC + PHC + PHC…). But the local PHC-internal gibberish could not be “corrected” or “repaired” by assembly of this entire PHC with another PHC or root. If this line of reasoning is on track, any gibberish within a transferred PHC entails gibberish in the final multi-phasal complete sentential CI representation. If this logic is correct, the central argument of our paper is complete. Given general semantic considerations, phase by phase derivation with phase by phase semantic compositional interpretation at the interfaces yields the desired result: Structural Case valuation before Theta-“marking” cannot possibly yield convergent, non-gibberish; and there need be no appeal to syntax-internal levels of representation, or constraints on external first merge of arguments. While we believe this argument to be very much on track, we could, of course, have it wrong. A counter-argument might run as follows:


2.3. A possible counter argument against “once gibberish, always gibberish”

It might be argued that the PHC-internal gibberish can be repaired, so that even though a single transferred PHC has internal gibberish, subsequent derivation can nonetheless assemble a complete, multiphasal sentential CI representation in which there is no gibberish whatsoever. For example, an argument DP first merged in a non-theta position could move to a theta position and in so doing it could “count” as having satisfied Full Interpretation (since in the final multiphasal CI representation of the entire sentence, the argument has a theta role, i.e. appears in a theta configuration). To illustrate, consider again the situation where a transferred PHC is indeed gibberish. Suppose in the derivational continuation, the theta-less DP inside this PHC were to subsequently move to acquire a theta role (via the edge of the phase, as in (12)) or acquire a theta role phase-internally. In the final complete sentential CI representation reassembling all the cyclically transferred individual PHCs, the DP argument has a theta role, i.e. is in a theta position, and this “negates” or “cancels out” or “repairs” the prior gibberish internal to the individually transferred PHC. In the complete CI representation, although there is a copy of the argument DP in non-theta position, the argument has moved to a theta position and so the chain in the complete CI representation does contain a theta position.17 The status of this counterargument to our above claim that “If a PHC contains gibberish, then the entire CI representation containing this PHC will necessarily contain gibberish” is not entirely clear to us. And since we are unsure of this counterargument’s status, let us tentatively assume it is right, thereby creating the “most trouble” for our proposal that we can.
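The two competing views can be set side by side in a minimal sketch. It is an illustrative simplification rather than a formal statement of either position: the function name, its parameters, and the boolean encoding of “gibberish” are all hypothetical.

```python
from typing import Iterable


def whole_is_gibberish(phc_is_gibberish: Iterable[bool],
                       chain_has_theta_position: bool,
                       repair_allowed: bool) -> bool:
    """Status of the assembled multi-phasal CI representation.

    repair_allowed=False models "once gibberish, always gibberish";
    repair_allowed=True models the counterargument, on which a chain
    containing a theta position cancels PHC-internal gibberish.
    """
    if not any(phc_is_gibberish):
        return False
    return not (repair_allowed and chain_has_theta_position)


# One gibberish PHC, and the DP later moves to a theta position:
print(whole_is_gibberish([True, False], True, repair_allowed=False))  # True
print(whole_is_gibberish([True, False], True, repair_allowed=True))   # False
```

The two calls differ only in whether repair is permitted, which is exactly the point on which the two views diverge.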
In the following sections, we explore just such derivations, ones in which the argument DP, first merged into a non-theta position, moves to get a theta role. There are two general types of cases: movement through the edge to get theta and within-phase movement to a theta position. Interestingly, we will argue that neither can possibly yield convergent non-gibberish. Thus, the overall logic is as follows: If “once gibberish, always gibberish” is right, and the counterargument is wrong, then we have explained what we set out to explain. If “once gibberish, always gibberish” is wrong, and the counterargument is right, we hope to show in the following sections that any attempt to repair gibberish by movement of an argument to a theta position is independently excluded. If “once gibberish, always gibberish” is true of the system and in addition movement to overcome gibberish is indeed barred by the system, then we may be revealing in the following sections a redundancy in the system whereby first merge of an argument in [–theta] BOTH condemns the complete derivation to immutable gibberish, AND any movement of the argument to acquire theta is also excluded.

3. Movement to acquire theta

Suppose a DP has checked its Case but has no theta role: could the DP move into a theta configuration,18, 19 thereby yielding Case before theta, with no gibberish in the final reassembled multiphasal CI representation? A number of independently motivated mechanisms within current theory militate against this possibility. Again, the situation we are considering is depicted in (13),

(13) … > PH > ... > DP

where DP has its Case checked by the phase head (PH) but that DP has no theta relation. Our question is: can the DP move to the edge of the phase to be available for movement into a theta position, ultimately yielding convergent non-gibberish?

3.1. Theta configurations

In Chomsky (1995) and his subsequent work (including Chomsky 2007, 2008), Hale & Keyser’s (1993) configurational approach to theta relations is adopted. Under this approach, a “theta role” is “a relation between two syntactic objects, a configuration, and an expression selected by its head” (Chomsky 2000:103). Basically, an argument DP receives a theta role by being in a configuration with a theta assigner X only if DP is in a sister relation to X.20

3.1.1. Theta configurations and chains

But what counts as a syntactic object in a sister relation to X? More generally, just what counts as an argument? Chomsky (1995:313) states that only a trivial chain (i.e. a one-membered chain) can be in a sister relation. Thus, if an argument DP is externally merged (EM) with X, forming a one-membered chain CH (DP1), then CH (DP1) is in a sister relation to X; the DP would count as an argument in a theta configuration. However, for Chomsky, if internal merge (IM) subsequently merges DP with Y, forming a two-membered chain CH (DP2, DP1), then neither the entire chain CH (DP2, DP1) nor the head of the chain, DP2, is in a sister relation to any configuration (including Y). In effect, it is only the tail of a movement chain that counts as an argument, i.e. as an object in a sister relation to X. Thus, the tail of CH (DP2, DP1) is (still) in a sister relation to X, thereby being capable of receiving a theta role from X. In short, it is a theorem of this approach that only EM (forming a one-membered chain) can form a configuration for thematic interpretation, thereby excluding movement (IM) of a theta-less argument DP into a theta relation; if the DP is to have a theta role, it must be EM’ed (i.e. first merged) into a theta relation. The status of a chain-theoretic principle like the above is not clear; it is stipulative to the extent that it is stated rather than derived. Why, on a deeper level, should such a principle hold?

3.2. Theta configurations and Movement to Complement position

Interestingly, certain independently motivated aspects of Chomsky (2007, 2008) prohibit movement of a theta-less argument from its non-theta position into a theta-marked complement position. It is thus unnecessary to adopt/stipulate the chain-theoretic principle regarding argument positions reviewed above to exclude such cases. We will argue here that, in fact, there is no movement to a complement position, and hence there is no chance for a DP to move from a non-theta position to a theta position. First, assume with Chomsky (2007, 2008) that each lexical array must contain a PH, v or C.21 Let’s also assume with Chomsky that all EM takes place before any IM, this too being motivated on efficiency grounds. Finally, assume Chomsky’s label accessibility condition (LAC), also independently motivated, which states that only the label of a syntactic object is accessible to the computational component (since only the label is available with minimal search). With these three (arguably 3rd factor) independently motivated assumptions, we can derive that “there can be no movement to a complement theta position”. How does this follow?
Recall the configuration we’re considering:

(14) [ … DP PH … [ … DP … ] ]

where the argument DP is first merged into a position P in which its Case is valued but P is not a theta configuration; DP then moves to the edge of the phase to be accessible for further computation (given the PIC). Suppose we now externally merge in the theta assigner head e.g. V (or N or Adj).


(15) V + [ DP [ PH [ … DP … ] ] ]

At this point in the derivation, there is simply no way to create a structure where DP becomes the complement (sister/co-member) of V. By virtue of V externally merging to the object containing DP, V already has a sister, viz. the object containing the DP. No new sister relation involving just DP and V can now arise.22 Thus, without stipulation, it follows from the independently motivated principles (i) each array contains a phase head, (ii) EM before IM, and (iii) LAC, that the sister to a theta marking head can never be created by Internal Merge. One class of cases of an argument first merged into a (Case valued) non-theta configuration but subsequently IM’ed to acquire theta is not possible; namely, movement of an argument from a non-theta configuration into a complement theta position.

3.3. Movement to spec of v’

What about movement of a theta-less Case valued argument to a non-complement theta position, e.g. [Spec, v’]? Consider first the spec of v’ position, i.e. the “agentive subject position”. Again, suppose an argument DP has its Case valued, that the DP has no theta role, and the DP has moved to the edge of the phase and hence is available, in principle, for movement to a theta position:

(16) DP + PH [ … DP … ] ( = CP or vP)
where DP had its Case checked and has moved to the edge of the phase to escape PIC, but DP has no theta role.

Suppose we Transfer the PHC and then merge in V,

(17) V + [DP PH]

and then Merge in the phase head (e.g. v):

(18) v + [V [DP PH]]

The phase head v transmits its φ-features to V, allowing V to now function as a φ-probe. V then tries to agree with DP, the only visible matching goal, given that the PHC has been transferred. But notice, if there is an Activity Condition, then since DP is already inactive (due to its Case feature being valued prior to its moving to the edge), the derivation crashes since v/V can’t check φ-features with a visible active Goal, i.e. the previously Case-valued DP on the edge is the only visible Goal in the current representation, but it is inactive.23 But what if there is no Activity Condition (as argued in Nevins 2005)? Without Activity, v/V could in principle φ-agree with the inactivated DP, thereby valuing V’s unvalued φ (inherited from v) and the DP could then simultaneously raise (following Chomsky) to both [Spec, VP] and [Spec, vP], yielding the following DP movement configuration

(19) [vP DP v [VP DP V [CP DP (on edge) C [TP DP(EPP) T DPNOM/-THETA

Now, at last, the (leftmost) DP would “get” a theta role from the v’ (in spec v’ position = external argument theta role position). Thus although an earlier transferred phase head complement (TP) contained a theta-less argument, with this transferred PHC therefore violating FI at CI, why is this “still” an “offense” if, in the final completely reassembled multi-phasal CI representation of the entire sentence, Case is valued and the DP argument is in a theta configuration, as in (19)? As we detailed above (section 2), it could well be that gibberish of the part in fact yields gibberish of the whole, this being a natural consequence of cyclic transfer to the interfaces with cyclic interpretation. But note that even if local PHC gibberish could in principle be salvaged or corrected, such derivations can be independently excluded: the argument DP has moved “improperly” from an A’-position (phase edge = [Spec, CP]) and then to an A-position, namely spec of V. This is indeed A’ to A movement if we adopt

(20) Chomsky (2007:24): “A movement is IM (internal merge) contingent on probe by uninterpretable inflectional features, while Abar movement is IM driven by EF”.

By this definition, [Spec, VP] is an A-position (since V φ-probed DP which then moved to [Spec, VP]) and [Spec, vP] is A-bar, since this movement involved only the edge feature of v. But why/how is improper movement excluded? Obata & Epstein (2008, 2011) argue that improper movement is an agreement phenomenon, hence parameterized. In their analysis this derivation crashes because the φ-features of v-V find no matching goal in the DP occupying the spec of the embedded CP. This is because the φ-features of DP are moved to the A-position, Spec TP, under their “Feature Splitting Internal Merge” analysis, hence DP at the edge lacks φ-features. Thus improper movement crashes, since DP at the edge has no φ, yet the theta marker introduced to assign DP theta in (19), namely v, inherently bears φ-features, but they cannot be valued since DP at the edge of CP is φ-less (and, of course, the embedded TP is gone, so DP on the edge is the only visible/present Goal). Thus, this kind of derivation, in which the argument is first merged into a Case valued theta-less position and then moves to a non-complement position to get theta, crashes. Regardless, it has been assumed since 1973 that such derivations are in fact blocked (Chomsky 1973) or the representation resulting from such movement is independently excluded (by Binding theory condition C, May 1979, or Fukui 1993). Thus interphasal movement of an argument DP so as to acquire its first theta role, as in (19), is “improper” and by assumption excluded by independently motivated principles seemingly unrelated to “First merge of an argument must be into a theta position”. Thus to summarize, perhaps “gibberish of the part (a transferred PHC) entails gibberish of the whole”. But if not, we have now excluded central cases in which the theta-less DP copy is transferred to CI, yet the moved DP tries to get a theta role via IM; one case was movement to complement position, which is independently excluded, another is movement to non-complement position ([Spec, vP]) which, as just shown, is also excluded. Thus, even if gibberish of the part does not entail gibberish of the whole, a theta-less DP cannot “escape via the edge” in order to acquire its first theta role.

3.4. ECM

But now, what about intra-phasal movement of a DP, within a single phase, to acquire a theta role?
In such cases, movement via an edge position would not be required and, as a result, the movement would not be improper. Such cases are exemplified by ECM constructions, as in, for example:

(21) [vP John v [ Vexpect [TP tJohn to seem [CP that Bill left ] ] ] ] = *John expect to seem that Bill left

Interestingly, there seems to be nothing wrong with this string “as a vP”, i.e. if the derivation is now complete (and notice expect has no agreement suffix since no φ-agreeing T appears above vP) and will not continue on, this vP converges. However, without the mood marker C, the object is still gibberish in that it will have no interpretation as “declarative”, “imperative”, etc. Suppose we continue the derivation so that we generate a full CP representation of the string “John expects to seem that Bill left”. This entails externally merging T and then C.

(22) C . . . T + (21)

But now, assuming the feature-splitting analysis of Obata and Epstein traced above, recall John at the edge (in this case, the edge of vP in (18)) no longer bears φ-features, since these features were split off and moved to the A-position, [Spec, VP]. Thus the φ-features on the matrix T, inherited from the matrix C, find no matching Goal on the edge of v, and since VP and its contents are “gone” by PIC, the derivation crashes with unvalued φ on the higher T.

4. Conclusion and speculation concerning scope positions in clausal architecture

We argued that certain aspects of clausal architecture are deducible from independently motivated mechanisms of Chomsky’s phase-based analysis. To summarize, if you first merge an argument DP in a non-theta position and are able to check that DP’s Case, the transferred PHC containing the DP is gibberish. We argued that “once gibberish, always gibberish” follows from the architectural design whereby phase-head complements are periodically transferred to and interpreted by the interfaces; specifically, once a PHC is transferred to the interface, its internal interpretation is immutable and simply can’t be “repaired” by assembly of the entire PHC with other PHCs. But we’ve also argued that even if gibberish can (somehow) be salvaged, independently motivated components of Chomsky’s analysis still correctly disallow the relevant structures. Thus, theta before Case (and not Case before theta) is deduced within Chomsky’s system with no appeal to syntax-internal ordered levels of representation. If on track, this is just a case study of one aspect of clausal architecture that might be explicable by minimalist 3rd factor postulates (phasal, episodic Transfer, CI is the only level of representation undergoing semantic interpretation, including the Theta Criterion being reduced to FI), which were motivated on grounds having nothing to do with the phenomena examined here. A much larger project consists of systematically determining which aspects of articulated clausal architecture can and cannot be deduced (beyond Case can’t precede theta). Toward this end, we note that Case and scope are both non-thematic, so why is scope above Case in the clausal architecture and not the other way around? One answer is that if scope is related to the Phase Head C (left periphery is headed by a phase head), then if we did theta first, and then did scope (IM to edge CP), the next thing that happens is Transfer of TP by PIC. But this crashes since DP has unvalued Case. So if we have successive IM of a single DP as follows, with theta before Case (as argued above), but scope below Case,

(23) DP (CASE)   DP (scope)   PH [PHC DP (theta) . . .
     3rd         2nd                  1st

we would end up transferring the PHC, but the transferred DP within the PHC still has unvalued Case. If a crashed PHC that has been transferred to the interface cannot be salvaged (de-crashed) by subsequent derivation (since computational (local) efficiency dictates crash ends the derivation, i.e. we don’t continue the derivation ad infinitum to “see” if we can overcome the crash), then we have an explanation of why scope is above Case in the clausal architecture and not the other way around. If on track, then we have gone some distance toward explaining clausal architecture and operational ordering without appealing to descriptive rule ordering and without ordered level of syntax internal representation (DS< SS can, can > only)

b. Taro-ga migime-dake-ga tumur-e-ru
Taro-Nom right.eye-only-Nom close-can-Pres
Taro can only close his right eye (only > can, can > only)


Željko Bošković

In both examples in (52), the verb tumur ‘close’ is accompanied by the potential affix -e ‘can’. While the accusative object in (52a) must scope under the potential affix, the nominative object in (52b) can take wide scope with respect to the potential affix. Koizumi (1994), Tada (1992), Ura (1996), and Nomura (2003, 2005), among others, argue for a case-driven A-movement analysis of these data based on the following derivations:

(53) a. [TP SubjiNOM [canP ti [vP OBJkACC PRO [ tk V ] v ] can ] T] (=(52a))
b. [TP SubjiNOM OBJkNOM [canP ti [vP (PRO)[ tk V] v ] can ] T] (=(52b))

In (52a), the object is case-valued by v and moves to SpecvP. Since the object is lower than the potential verb, it must take scope under the potential verb. In (52b), the nominative object and the nominative subject move to Spec TP. Since the potential verb is below TP, the object takes scope over it. Takahashi (in press a) shows that this analysis cannot capture the full paradigm. One of the problems with the analysis concerns the fact that elements that do not bear structural case show a scope contrast correlating with the case of the object.

(54) a. Taro-ga sakana-o koshou-dake-de taber-are-ru
Taro-Nom fish-Acc pepper-only-with eat-can-Pres
It is only pepper that Taro can eat fish with (*only > can)
Taro can eat fish with only pepper (can > only)
b. Taro-ga sakana-ga koshou-dake-de taber-are-ru
Taro-Nom fish-Nom pepper-only-with eat-can-Pres
It is only pepper that Taro can eat fish with (only > can)
Taro can eat fish with only pepper (?can > only)

In (54b), which contains a nominative object, dake ‘only’ in a PP can take either wide or narrow scope with respect to the potential affix. On the other hand, in (54a), with an accusative object, dake ‘only’ can only take narrow scope. Since the PPs in question do not have structural case, and hence do not undergo case-driven movement, the case-movement analysis cannot account for these data. A number of authors have argued that dake ‘only’ undergoes QR (see Bobaljik & Wurmbrand 2007, Futagi 2004, Harada & Noguchi 1992, and Shoji 1986). Assuming with these authors that dake undergoes QR, Takahashi proposes an account of the scope puzzle where both case and QR of dake play a role, based on the following condition.16

Phases in NPs and DPs

(55) QR dake ‘only’ is bound to domains of case-valuation

Consider the structure for the basic data from (52), where bold letters indicate a phase.

(56) a. [TP SubjiNOM [canP ti [vP PRO [OBJACC V ] v ] can] T] (=(52a))
        [ACC] [NOM]
b. [TP SubjiNOM [canP ti [vP (PRO) [OBJNOM V ] v ] can ] T] (=(52b))
        [ACC] [NOM]

Following Ura (1996), Takahashi assumes that the potential morpheme optionally absorbs the case-feature of v. In (56a), rare ‘can’ does not absorb the accusative case-feature of v and v values the case of the object. vP then works as a bounding domain for the QR of dake given (55). On the other hand, in (56b), rare absorbs the case-feature of v; the object is then case-valued by T. Since v does not value case, vP does not constitute a bounding domain for QR. This is why dake can take wide scope with respect to the potential morpheme. The analysis straightforwardly extends to (54), as the reader can verify. What is important here is Takahashi’s proposal that case valuation determines phases. Notice that Takahashi’s proposal regarding the role of case valuation in phasehood, which Takahashi discussed with respect to vP, can be extended to CPs, since under Chomsky’s (2008) C-T association analysis C is involved in case valuation. Takahashi’s claim that case valuation determines phasehood may then be extendable to all phases. In this respect, it is worth noting that Bošković (2007b) argues that ECM/raising infinitives are CPs (for empirical evidence, see Bošković 2007b:605-606 and references therein). However, under Takahashi's proposal they still correctly do not count as phases since the C here is not involved in case valuation.17 We then have a complete parallelism between vP and CP; some vPs and CPs are phases, and some are not. Whether or not a vP or a CP is a phase is determined by case valuation. At any rate, what is important for our purposes is that case valuation determines phasehood. I now turn to inherent case, which Takahashi did not discuss with respect to the scope of dake. Like SC, Japanese has verbs that do not assign accusative to their complement NP, which I assume then bears inherent case. Significantly, inherent case patterns with nominative, not accusative, in the relevant respect: like nominative objects and in contrast to accusative objects, inherently case-marked dake objects can take wide scope. This confirms that there is no case valuation with inherent case (as discussed above), hence vP does not count as a phase and does not block QR of dake.

(57) a. Taroo-wa daitooryoo-dake-ni a-e-ru
Taro-Top president-only-Dat meet-can-pres
Taro can meet only with the president (only > can, can > only)
b. Taro-wa daitooryoo-ni suutu-dake-de a-e-ru
Taro-Top president-Dat suit-only-with meet-can-pres
Taroo can meet with the president only in a suit (only > can, can > only)

The crucial conclusions of the above discussion are that case valuation determines phases and that inherent case does not involve valuation. Returning now to the phasehood of TNP and the structural/inherent case distinction with respect to the diagnostic tools employed here to determine TNP phasehood, we now have exactly what we need: Since inherent case is not licensed through regular case valuation, nouns that license inherent case then do not determine phases, independent evidence for which comes from the scope of dake in Japanese. The upshot of all of this regarding SC is that only NPs headed by genitive case-licensing nouns are phases. As a result, PIC and anti-locality conspire to block deep LBE, deep adjunct extraction, and movement of the complement of such nouns. Given that the NP is a phase here, any movement out of the NP has to proceed via the Spec of the NP. However, in the case of deep LBE, deep adjunct extraction, and movement of the complement of such nouns, movement to SpecNP violates anti-locality, as discussed above. On the other hand, nouns like pretnja ‘threat’, which license inherent case, do not head phases. As a result, nothing goes wrong with deep LBE, deep adjunct extraction, and movement of the complement of such nouns. Since the NP here is not a phase, the movements in question do not need to proceed via SpecNP, hence they do not violate anti-locality. It is worth noting here that in addition to the unification of the three phenomena in SC, the phase-based analysis unifies the SC phenomena in question with the facts regarding the scope of dake in Japanese. The above analysis makes an interesting prediction. Since NP functions as a phase only when its head is involved in case assignment the prediction is that, in contrast to genitive complements of nouns, PP complements of nouns will be able to undergo extraction in SC. 
Since a noun that takes a PP complement is not involved in case assignment, its maximal projection should not be a phase, which means that a PP complement can move out of an NP dominating it without stopping in SpecNP, avoiding an anti-locality violation.

(58) [O kojem novinaru]i si pročitao [članak ti]?
     about which journalist are read article
     About which journalist did you read an article?
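The prediction just stated can be restated as a small decision procedure. The following is only an illustrative sketch, not part of the analysis itself: the class `Noun`, the three case labels, and the function names are hypothetical, and the anti-locality violation incurred by a stop in SpecNP is simply taken over from the discussion above rather than derived.

```python
from dataclasses import dataclass

# Hypothetical labels for what the head noun licenses on its complement.
STRUCTURAL = "structural"  # valued case, e.g. adnominal genitive in SC
INHERENT = "inherent"      # licensed together with the theta-role, no valuation
NO_CASE = "none"           # the noun takes a PP complement

@dataclass(frozen=True)
class Noun:
    label: str     # illustrative label only
    licenses: str  # one of the three constants above

def np_is_phase(noun: Noun) -> bool:
    """NP counts as a phase only if its head takes part in case valuation."""
    return noun.licenses == STRUCTURAL

def extraction_allowed(noun: Noun) -> bool:
    """One verdict for deep LBE, deep adjunct extraction, and complement movement."""
    if np_is_phase(noun):
        # The PIC forces a stop in SpecNP, and that step violates anti-locality.
        return False
    # No phase: no stop in SpecNP is required, so anti-locality is not at issue.
    return True

for noun in (
    Noun("genitive-licensing noun", STRUCTURAL),
    Noun("pretnja 'threat'", INHERENT),
    Noun("članak 'article' with a PP complement", NO_CASE),
):
    print(noun.label, "->", "extraction allowed" if extraction_allowed(noun) else "extraction blocked")
```

The single Boolean return value reflects the observation that the three SC phenomena pattern together; a finer-grained model would have to represent the individual movement paths.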

The analysis on which case assignment determines phasehood, as proposed by Takahashi (in press a), thus makes exactly the right cut: it allows extraction of PP and inherently case-marked nominal complements, but not of genitive case-marked nominal complements, in addition to unifying the patterns of extraction out of nominal domains in SC with the scope patterns of dake in Japanese. There is, however, one important consequence of the discussion in this chapter that needs to be clarified. Notice first that due to the presence of vP, which functions as a phase because its head is involved in accusative case assignment, structurally case marked verbal complements can move. As should be clear from (59), such movement does not violate anti-locality. The same holds for left branch and adjunct extraction out of a complement of a verb (in languages where such extraction is in principle possible).

(59) NPi [vP ti [v' [VP V ti]]]

This means that there should be no small n in TNPs; otherwise, nominal domains would pattern with verbal domains in the relevant respect. This is an important conclusion in light of the fact that nP is often posited for TNPs, mostly to achieve a parallelism with VP. However, we have seen above that the phrases in question simply do not display uniform behavior with respect to phenomena that are sensitive to the presence of vP/nP. To see the issue more clearly, consider what would happen if an nP is posited for SC.18 Given that nP is generally posited to obtain a parallelism with vP, under this analysis it would be natural to assume that nP, rather than NP, functions as a phase in SC (this means that n should be involved in structural case assignment, just like v is). To allow left-branch and adjunct extraction in SC in any context it would then be necessary to assume that APs and adjuncts are nP rather than NP adjoined. (If they were NP adjoined even simple left-branch extractions like (13) and adjunct extractions like (19) would be ruled out.) However, if APs and adjuncts were nP adjoined deep left-branch extraction and deep adjunct extraction would be incorrectly ruled in even where they are unacceptable: the elements undergoing deep left-branch extraction and deep adjunct extraction in (42b) and (44) respectively would cross a full phrase (higher NP) on their way to the Spec of the nP dominating the higher NP. As a result, movement to the phasal edge, SpecnP under this analysis, would not violate anti-locality.

(60) NPi [nP ti [NP [nP ti [nP [NP
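The reasoning against nP in SC can be made concrete by counting the full phrases a moving element crosses on its way to the phasal edge. The sketch below is a simplification under stated assumptions: anti-locality is modeled as the requirement that a movement step cross at least one full phrase (segments of an adjunction structure not counting), and the dictionary keys and function names are hypothetical labels for the two analyses compared in the text.

```python
from typing import Sequence

def violates_anti_locality(full_phrases_crossed: Sequence[str]) -> bool:
    """Assumption: a step is too short if it crosses no full phrase."""
    return len(full_phrases_crossed) == 0

# Full phrases crossed by deep left-branch extraction on its way to the
# phasal edge, under each analysis (hypothetical labels).
DEEP_LBE_PATHS = {
    # AP adjoined to the lower NP moves to Spec of the higher NP:
    # no full phrase is crossed.
    "NP is the phase, AP is NP-adjoined": [],
    # AP adjoined to the lower nP moves to Spec of the higher nP:
    # the higher NP is crossed.
    "nP is the phase, AP is nP-adjoined": ["higher NP"],
}

def predicted_acceptable(analysis: str) -> bool:
    """Deep LBE is predicted to be acceptable if the step to the edge is long enough."""
    return not violates_anti_locality(DEEP_LBE_PATHS[analysis])

for analysis in DEEP_LBE_PATHS:
    print(analysis, "->", "ruled in" if predicted_acceptable(analysis) else "ruled out")
```

On these assumptions only the NP-as-a-phase analysis rules out deep left-branch extraction with genitive-licensing nouns, which is the pattern the SC data require.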

4.4. Back to DP languages

Let us now consider how English (and DP languages more generally) fit into the system developed above with respect to SC, an NP language. Let us first consider whether NP functions as a phase in English. Given that DP is a phase in English (see the discussion below and section 3.1), the only test that can be run is complement movement (recall that DP phasehood has a blocking effect on LBE and adjunct extraction). In English, nouns can only take PP complements. PP nominal complements can be extracted in English, both those that are headed by of and those that are headed by other prepositions. ((62) is taken from Huang 1982, Chomsky 1986b.)19

(61) ?To which problem did you discover (the) solutions?

(62) Of which city did you witness the destruction?

This, however, does not tell us much: Since there is no case assignment with PP complements the NP would not be expected to function as a phase here anyway. One could conceivably treat of-genitives as NPs with a case-marker (of). However, if such genitive is an inherent case, as Chomsky (1986b) argued, we still would not expect any locality problems here. Another possibility here could be that the of-genitive is a structural case but that nP is present in English, in contrast to SC (as a result of a more general difference in the richness of nominal structure between English and SC, see footnote 18). nP would then function as a phase in (62). No problem with respect to locality would arise in (62) under this analysis either; (62) would in fact be treated in the same way as (59) under this analysis. The fact that in German, which has genitive NP complements, such complements cannot be extracted, as illustrated by (63), argues against the last analysis.20

(63) a. Ich habe Bilder der Pyramiden gesehen
        I have pictures the-gen.pl pyramids seen
        I have seen pictures of the pyramids
     b. *Wessen hast du Bilder gesehen?
        whose-gen have you pictures seen
     c. *Der Pyramiden habe ich Bilder gesehen
     d. Du hast Bilder Berlins gesehen
        you have pictures Berlin-gen seen
     e. *Berlins hast du Bilder gesehen
     (Klaus Abels, p.c.)

I then assume that nP does not exist in DP languages either, hence will not consider it in the discussion below.21 Notice furthermore that the ungrammaticality of the extractions in (63) provides evidence that NP is a phase in German: if NP is a phase in (63) (and there is no nP) the unacceptable constructions can be ruled out in the same way as the corresponding SC examples discussed in section 4.2. On the other hand, if the only phase in (63) were a DP it is not clear how the unacceptable examples would be ruled out. Movement of the genitive complement to SpecDP would obey both the PIC and anti-locality. We would then need to assume that a factor independent from those considered here is responsible for the ungrammaticality of the constructions. Since under the NP-as-a-phase analysis we get an account of the ungrammaticality of (63b,c,e) for free (i.e. from what is already present in the system), and the analysis does not require positing any crosslinguistic variation with respect to phasehood, I will endorse the NP-as-a-phase analysis here. Notice also that a prepositional (i.e. von) genitive can be extracted, which is expected given that, as discussed above with respect to SC, NP does not function as a phase when it takes a PP complement. (Additional examples, involving non-von PPs, are given in (65)).22

(64) a. Von Berlin hast du Bilder gesehen
        of Berlin have you pictures seen
     b. Wovon hast du Bilder gesehen?
        where-of have you pictures seen

(65) a. Für dieses Problem hat er (die) Lösungen gefunden/entdeckt
        for this problem has he the solutions found/discovered
     b. Auf alle Fragen hat er eine Antwort gegeben
        to all questions has he an answer given
     c. Zu ihrem Konto wurde ihr der Zugriff verweigert
        to her bank-account was her.dat the access denied
     (Klaus Abels, p.c.)

To conclude, we have seen that there are several options for accounting for the grammaticality of (61)-(62), which cannot tell us anything conclusive about the phasehood of NP in English. On the other hand, there is evidence that NP is a phase in German, another DP language, which also provides evidence that there is no nP in the TNP of DP languages, which again pattern with NP languages in this respect.

How about DP? We clearly want DP to function as a phase in English, in fact not only when the ’s genitive is assigned (in such cases DP is clearly involved in case assignment) but in all contexts; otherwise we could not account for the unacceptability of left-branch extraction and adjunct extraction in examples like (12) and (18b). The assumption was also invoked above in the account of (7) (it made it possible to subsume (7) under Abels’s generalization). How can the phase status of DP be achieved within the approach to phases adopted in this chapter? There are several possibilities here. Takahashi (in press a) does not explicitly assume that case valuation is relevant to all phases; i.e. he does not take the step regarding the extension of the relevance of case valuation to the phasehood of CP suggested above (see, however, Takahashi in press b), and tacitly assumes that CP is a phase regardless of case valuation. Suppose DP is treated in the same way, which means it is a phase regardless of case valuation. Given that DP is always a phase in English, LBE and adjunct extraction out of DPs in English will be blocked regardless of what assumptions are made about English genitive case or case assignment to English DPs. However, it is not necessary to give up the extension of the relevance of case valuation to all phases to accommodate the phase status of DP. In other words, there is a way of involving DP in case assignment, in which case we can assume that case assignment determines all phasehood.
Legate (2005) (see also Rodríguez-Mondoñedo 2007) considers cases of long-distance agreement across a CP boundary, and argues that agreement in such cases proceeds through a series of local steps, like v-C-NP (C agrees with the NP, v agrees with C, with the verb ending up agreeing with the NP as a result of all of this). Consider now from this perspective a rather serious issue that arose with the advent of the DP hypothesis. If NPs are subject to a case requirement, how can they be case licensed if they are dominated by DP? One way of doing it would be by adopting Legate-style cyclic agreement. This can be implemented as follows: Let us assume that, as suggested in section 2, D with an unvalued case feature enters into an Agree relation with its NP complement that also has an unvalued case feature. When the case feature of D is valued by a DP external case licensor, the NP which has undergone Agree with the D receives the same case value as D. Under this analysis D is crucially involved in the case licensing of the NP; it basically assigns case to the NP by passing it along from the DP external case assigner. The DP should then function as a phase any time it takes an NP complement in Takahashi’s system.

5. Conclusions

This chapter has applied the phasal approach to the locality of movement to the extraction out of TNPs. We have seen that left-branch extraction and several related constructions can be used as a powerful test for phasehood. More precisely, we have seen that deep left-branch extraction, deep adjunct extraction, and NP complement movement pattern together: they are either all allowed or all disallowed with different types of SC TNPs. The phenomena in question are all crucially affected by the structural/inherent case distinction, since whether or not they are allowed depends on this distinction; more precisely, whether the relevant NP bears inherent or structural case, the three phenomena in question being disallowed only with the latter. The inherent/structural case distinction is also relevant to the scope of dake in Japanese, inherently case-marked objects being able to take wider scope than accusative objects. The general pattern here is that inherent case is less constrained than structural case both with respect to extraction out of NPs (SC extraction facts) and with respect to QR of NPs (Japanese scope facts). I have provided a unified phase account of all these facts based on Takahashi’s (in press a) proposal that case valuation determines phasehood. The difference between structural and inherent case is that only structural case involves valuation, inherent case being licensed independently through theta-role assignment. In Takahashi’s system, only NPs assigning structural case then project phases, blocking processes such as deep left-branch extraction, deep adjunct extraction, and nominal complement movement via the Phase Impenetrability Condition, given the anti-locality hypothesis.
Similarly, only vPs assigning structural accusative case have a blocking effect on the phase-sensitive QR of dake in Japanese. In addition to providing evidence for Takahashi's approach to phases, where case valuation plays a crucial role in determining phasehood, the current analysis has important consequences for the phasehood of TNP. A phase-based account of the ban on left-branch extraction and adjunct extraction out of English TNPs was provided that was crucially based on the assumption that DP is a phase. Since (putting aside the case of nominals that assign inherent case) NP has the same blocking effect on these movements in
SC, an article-less language which lacks DP, as DP does in English, it then follows that NP works as a phase in SC. This was confirmed by the impossibility of movement of genitive nominal complements in SC. There is, however, no need to posit any parametric variation with respect to phasehood in TNPs. The effects of DP phasehood are not observed in NP languages like SC for a trivial reason, given that such languages lack DP. As for English, nothing actually goes wrong if we assume that NP is a phase in English. The phasehood of DP and the type of complement that nouns take in English make it impossible to observe any effects of NP phasehood in English. However, the effects of NP phasehood can be observed at least to some extent in other DP languages, e.g. German. The general conclusion is then that both DP and NP function as phases with no crosslinguistic variation in this respect apart from the trivial point that NP languages lack DP, hence the effects of DP phasehood cannot be observed in such languages. However, the lack of DP is what has made it possible to conduct several phasehood tests for NPs in NP languages which could not be conducted in DP languages due to the presence of DP (more precisely, due to DP phasehood). Finally, I have provided evidence that movement out of DP in DP languages like English must proceed via SpecDP, a step of movement which was crucial in the account of the ban on left branch extraction, adjunct extraction, and D-complement movement out of DP in English. Providing arguments that movement out of DP must proceed via SpecDP is important since cases that are standardly offered as arguments for successive cyclic movement via SpecDP in the literature, like the impossibility of extraction across possessors, involve interfering factors, namely the Specificity Condition and the stipulation that, in contrast to other phasal heads, which allow multiple Specs, multiple Specs are not available for D. 
There are two additional consequences of the discussion in this chapter: there is no nP counterpart of vP in the nominal domain. nP is often posited merely to achieve parallelism with VP. However, we have seen that the phrases in question do not display uniform behavior with respect to phenomena that are sensitive to the presence of vP/nP.23 Another, more general conclusion of the current work is that the inherent/structural case distinction has very significant consequences for uncontroversially syntactic phenomena (i.e. locality of syntactic movement24). Since the distinction then has to be reflected in the syntax, we also have evidence here that case cannot be pushed outside of the syntax.

Notes *

1. 2.

3. 4.

5.

For helpful comments and suggestions I thank two anonymous reviewers and the participants of my 2009 syntax seminar at the University of Connecticut, Moscow Student Conference on Linguistics 5, GLOW 33, and Syntax Fest 2010 at Indiana University. This material is based upon work supported by the National Science Foundation under grant BCS-0920888. I use the term TNP to refer to noun phrases without committing myself to their categorial status, i.e. the functional structure that may be present above NP. The presence of additional projections between DP and nP would not affect the discussion. I am ignoring here issues concerning the linear order of the relevant elements; an analysis along the lines of (5) requires positing rightward Specs or stylistic/PF movement, both of which have been proposed in the literature (see, e.g. Torrego 1987 for the former and Ticio 2003 for the latter). Notice that there are many accounts of the DP extraction hierarchy in Spanish, see, e.g. Torrego (1987), Ormazabal (1991), Sánchez (1996), and Ticio (2003) (see also Riqueros in preparation for an application of the phase system developed in Bošković 2010b to extraction out of various kinds of Spanish TNPs). Note, however, that the Specificity Condition is weaker in Spanish than in English, i.e. it is voided in some cases where it still holds in English (see Ticio 2003 and references therein). German allows some examples of this type. (The process in question is referred to as split topicalization.)
(i) Bücher hat Hans einige gelesen
    books has Hans some read
In principle, this could be accounted for if German TNPs have a bit more structure than English TNPs, in which case the relevant German examples would not have to involve movement of the complement of D, but a lower phrase.
I will not explore this analysis here since the subextraction analysis of split topicalization faces numerous problems (the main problem is that the fronted element corresponding to books in (i) seems to be an independent TNP which can even have its own article). There are in fact many analyses of such constructions in German that do not involve subextraction from DP (for a survey of such analyses, see van Hoof 2006; see also Roehrs 2006 for another non-subextraction analysis). At any rate, it is beyond the scope of this paper to determine the proper analysis of German split topicalization, which does not seem to involve subextraction, or to conduct a detailed examination of examples like (7) crosslinguistically, which may have various sources that are irrelevant for our purposes in different languages (e.g. NP ellipsis in the in-situ “remnant” DP or an analysis in terms of a quantifier float construction (with some a floating quantifier)). See Bošković (2008a, 2010a) for illustrations of the generalizations in (10)-(11), as well as the precise definitions of the phenomena referred to in these generalizations (e.g. what is meant by scrambling in (10c) is long-distance scrambling of the kind found in Japanese). Notice also that what matters for these generalizations is the presence of a definite article in a language since Slovenian, which has indefinite but not definite article, patterns with article-less languages with respect to these generalizations, see Bošković (2009b).

6. 7. 8. 9. 10. 11. 12. 13.

Like most of the generalizations in (10)-(11), (15) is a one-way generalization; it does not require all article-less languages to allow LBE. There are other requirements on AP LBE, in addition to the lack of articles, one of them being agreement between the adjective and the noun (see Bošković 2009d). The lack of such agreement is the reason why, e.g., Chinese disallows LBE. I focus on adjectival LBE and ignore possessor extraction because several accounts of the impossibility of adjectival LBE in article languages leave a loophole for possessor extraction to occur in some languages of this type (see Bošković 2005:4). In fact, Hungarian, which has articles, allows possessor extraction. However, it disallows adjectival LBE, which is what is important for our purposes (den Dikken 1999, however, analyzes Hungarian possessive extraction as involving a left dislocation-type configuration with a resumptive pronoun). (18b) is actually acceptable in Spanish. However, Ticio (2003) shows that the relevant phrase is an argument in Spanish (see Ticio 2003 for relevant tests). With clear adjuncts, such as the one in (22a), such extraction is disallowed (see Ticio 2003 for additional adjunct examples). See also footnote 22 regarding the status of (23) in German. Note that the above account readily extends to non-restrictive adjectives under Morzycki’s (2008) analysis, where non-restrictive adjectives are also treated as having type and required to be interpreted inside the determiners. Chinese and Japanese behave like SC in the relevant respect (see Bošković 2010a and Cheng in preparation), which provides strong evidence for the no-DP analysis of these languages.
It should be noted that it is important that the pronoun in (37)-(38) is not contrastively focused, since contrastive focus affects binding relations (the pronoun in (37a) is a clitic, hence cannot be contrastively focused; this, however, weakens the binding violation somewhat, since binding violations are often a bit weaker with clitics). As for the order of adjectives with respect to each other under this analysis, see Bošković (2009c), where it is argued that the order follows from semantic considerations. See the discussion of SC in the previous section. Adjectives are also assumed to be NP adjoined in English under the analysis summarized here. (Note also that Bošković 2005 does not assume a separate DP and PossP for English, the possessor being located in SpecDP and ’s in D in English under Bošković’s 2005 assumptions.) If there were an additional functional projection in the object TNP the adjective would need to be adjoined to that projection. (One could then in principle allow for the possibility of a mixed DP language that would disallow AP LBE and allow adjunct extraction if the adjunct is still assumed to be NP adjoined (see the discussion of adjunct extraction below).)

14. Note that, as in the case of verbs, nouns do not all assign the same inherent case. Thus, in contrast to the noun pretnja ‘threat’ in (48), which assigns instrumental case, the noun pomoć ‘help’ assigns dative case, which means that this information needs to be lexically specified.

15. The general conclusions of this work regarding phasehood are in fact quite different from Bošković (2010b).

16. More generally, Takahashi (in press a) argues that QR is phase-bound, but only vPs that are involved in case valuation count as phases (see the discussion below; see also Takahashi in press b for arguments (based on the Japanese causative construction) against an alternative QR analysis proposed by Bobaljik and Wurmbrand 2007, where the categorical status of the complement of rare plays a crucial role).

17. Movement out of ECM/raising infinitives then does not have to proceed via SpecCP, which could be what makes A-movement out of such CPs possible (but see Bošković 2007b, where it is suggested that the issue of improper movement would not arise here anyway; see also Obata 2010 for discussion of improper movement in general).

18. The problem about to be noted arises in SC, but not in English. The following discussion then does not completely rule out the possibility that an nP could be posited in English, but not SC, perhaps as part of a more general difference in the structural richness of the TNP between article and article-less languages (see, however, section 4.4 for relevant discussion).

19. See Rodman (1977) for arguments that examples like (61) involve genuine extraction from DP (Rodman shows that a Bach & Horn (1976) style analysis, where the PP directly modifies the verb, does not work for such examples).

20. Thanks are due to Klaus Abels and Susi Wurmbrand for very helpful discussions of German.

21. The conclusion is somewhat tentative pending further crosslinguistic investigation but it should be noted that Icelandic and Romanian pattern with German in the relevant respect; Romanian examples are given in (i).
(i) a. Ai văzut poze ale piramidelor
       have seen pictures gen.pl pyramids-the.gen.pl
       You have seen pictures of the pyramids
    b. *Ale cui / piramidelor ai văzut poze?
       gen.pl who pyramids-the.gen.pl have seen pictures
    (Simona Herdan, p.c.)
The question here is whether there are DP languages that allow extraction of clear genitive NP complements of nouns, where we can furthermore show that the adnominal genitive is a structural case. (Note in this respect that there is independent evidence that adnominal genitive in SC is a structural case. Thus, as shown in Bošković 2010b, like structural accusative assigned by verbs and structural nominative assigned by Tense, adnominal genitive can be overridden by genitive of quantification in SC, which is not possible with inherent cases assigned by verbs and nouns; see Bošković 2010b for a more detailed discussion). If there are such languages, they could be accommodated within the current system in one of these two ways: (a) such languages would have nP, and nP rather than NP would function as a phase on a par with the situation found with vP and VP (see also section 4.3.); (b) there would be no phase in the NP/nP domain of such languages; to minimize variation, under (b) analysis it would be more natural to assume that such languages lack nP, like SC and German. (It should, however, be emphasized that I have not yet found languages of this type, which in fact may not exist.)

22. It should be noted, however, that there are a number of different views regarding the proper analysis of examples like (64)-(65), see, e.g., Fanselow (1987, 1991), Fortmann (1996), Grewendorf (1989), de Kuthy (2002), Müller (1998), Pafel (1993). Note that Fortmann (1996) argues that only PP arguments of nouns can be extracted in German; this is why von-PPs, which are arguments, are quite freely extractable. In this respect, notice the ungrammaticality of (i), where a clear adjunct is extracted.
(i) a. Ich habe Männer mit langen Bärten getroffen
       I have men with long beards met
    b. *Mit was/womit hast du Männer getroffen?
       with what/where.with have you men met
    (Susi Wurmbrand, p.c.)

23. It is also often noted that nouns do not theta-mark in the same way that verbs do, which also undermines the argument based on a putative parallelism with VP.

24. See also Starke (2001) for relevant discussion.

References

Abels, Klaus
2003 Successive cyclicity, anti-locality, and adposition stranding. Doctoral dissertation, University of Connecticut, Storrs.

Bach, Emmon, and George M. Horn
1976 Remarks on ‘Conditions on Transformations’. Linguistic Inquiry 7: 265-299.

Baker, Mark
1996 The Polysynthesis parameter. New York: Oxford University Press.
2003 Lexical categories: Verbs, nouns, and adjectives. Cambridge: Cambridge University Press.

Bašić, Monika
2004 Nominal subextractions and the structure of NPs in Serbian and English. MA Thesis, University of Tromsø.

Bobaljik, Jonathan, and Susi Wurmbrand
2007 Complex predicates, aspect, and anti-reconstruction. Journal of East Asian Linguistics 16: 27-42.

Boeckx, Cedric
2003 Islands and chains: Resumption as stranding. Amsterdam: John Benjamins.
2005 Some notes on bounding. Ms., Harvard University, Cambridge.

Bošković, Željko
1994 D-structure, θ-criterion, and movement into θ-positions. Linguistic Analysis 24: 247-286.
1997 The syntax of nonfinite complementation: An economy approach. Cambridge: MIT Press.
2004 Be careful where you float your quantifiers. Natural Language and Linguistic Theory 22: 681-742.
2005 On the locality of left branch extraction and the structure of NP. Studia Linguistica 59: 1-45.
2007a On the Clausal and NP Structure of Serbo-Croatian. In Proceedings of the Annual Workshop on Formal Approaches to Slavic Linguistics: The Toronto Meeting 2006, ed. by Richard Compton, Magdalena Goledzinowska, Ulyana Savchenko, 42-75. Ann Arbor: Michigan Slavic Publications.
2007b On the locality and motivation of Move and Agree: An even more minimal theory. Linguistic Inquiry 38: 589-644.
2008a What will you have, DP or NP? In Proceedings of 37th Conference of the North-Eastern Linguistic Society, 101-114. Amherst: GLSA, University of Massachusetts.
2008b On the operator freezing effect. Natural Language and Linguistic Theory 26: 249-87.
2009a On relativization strategies and resumptive pronouns. In Studies in Formal Slavic Phonology, Morphology, Syntax, Semantics and Information Structure. Proceedings of FDSL 7, Leipzig, ed. by Gerhild Zybatow, Uwe Junghanns, Denisa Lenertová, and Petr Biskup, 79-93. Frankfurt am Main: Peter Lang.
2009b In Proceeding of the University of Novi Sad Workshop on Generative Syntax 1, 53-73.
2009c More on the No-DP Analysis of article-less languages. Studia Linguistica 63: 187-203.
2009d On Leo Tolstoy, its structure, Case, left-branch extraction, and prosodic inversion. In Studies in South Slavic Linguistics in Honor of E. Wayles Browne, ed. by Steven Franks, Vrinda Chidambaram, and Brian Joseph, 99-122. Bloomington: Slavica.
2010a On NPs and clauses. Ms., University of Connecticut, Storrs.
2010b Phases beyond clauses. Ms., University of Connecticut, Storrs.

Bošković, Željko, and Jon Gajewski
in press Semantic correlates of the DP/NP parameter. In Proceedings of 39th Conference of the North-Eastern Linguistic Society. Amherst: GLSA, University of Massachusetts.

Cheng, H-T. Johnny
in prep. Null arguments in Mandarin Chinese and the DP/NP parameter. Doctoral dissertation, University of Connecticut, Storrs.

Cheng, Lisa L.-S., and Rint Sybesma
1999 Bare and not-so-bare nouns and the structure of NP. Linguistic Inquiry 30: 509-542.

Chierchia, Gennaro
1998 Reference to kinds across languages. Natural Language Semantics 6: 339-405.

Chomsky, Noam
1986a Barriers. Cambridge, Mass.: MIT Press.
1986b Knowledge of Language: Its Nature, Origins, and Use. New York: Praeger Publishers.
2000 Minimalist inquiries. In Step by step: Essays on minimalist syntax in honor of Howard Lasnik, ed. by Roger Martin, David Michaels, and Juan Uriagereka, 89-155. Cambridge, Mass.: MIT Press.
2001 Derivation by phase. In Ken Hale: A life in language, ed. by Michael Kenstowicz, 1-52. Cambridge, Mass.: MIT Press.
2008 On Phases. In Foundational Issues in Linguistic Theory. Essays in Honor of Jean-Roger Vergnaud, ed. by Robert Freidin, Carlos Peregrín Otero, and Maria Luisa Zubizarreta, 133-166. Cambridge, Mass.: MIT Press.

Cinque, Guglielmo
1980 On extraction from NP in Italian. Journal of Italian Linguistics 5: 47-99.

Compton, Richard, and Christine Pittman
2007 Affixation by phase: Inuktitut word-formation. Presented at the 2007 annual meeting of the Linguistic Society of America, Anaheim.

Corver, Norbert
1992 Left branch extraction. In Proceedings of 22nd Conference of the North-Eastern Linguistic Society, 67-84. Amherst: GLSA, University of Massachusetts.

Culicover, Peter, and Michael S. Rochemont
1992 Adjunct extraction from NP and the ECP. Linguistic Inquiry 23: 496-501.

Despić, Miloje
2009 On the Structure of the Serbo-Croatian Noun Phrase - Evidence from Binding. Proceedings of Formal Approaches to Slavic Linguistics: The Yale Meeting, 17-32. Ann Arbor: Michigan Slavic Publications.
in press On two types of pronouns and so-called “movement to D” in Serbo-Croatian. In Proceedings of 39th Conference of the North-Eastern Linguistic Society. Amherst: GLSA, University of Massachusetts.

Phases in NPs and DPs

377

Dikken, Marcel den 1999 On the structural representation of possession and agreement: The case of (anti-)agreement in Hungarian possessed nominal phrases. In Crossing boundaries: Advances in the theory of central and eastern European languages, ed. by István Kenesei, 137-178. Amsterdam: John Benjamins. 2007 Phase extension: Contours of a theory of the role of head movement in phrasal extraction. Theoretical Linguistics 33:1-41. Epstein, Samuel David 1992 Derivational constraints on A'-chain formation. Linguistic Inquiry 23: 235-259. Fanselow, Gisbert 1987 Konfigurationalität. Untersuchungen zur Universalgrammatik am Beispiel des Deutschen. Tübingen: Gunter Narr Verlag. 1991 Minimale Syntax-Untersuchungen zur Sprachfähigkeit. Habilitationsschrift, Universität Passau, Passau. Fortmann, Christian 1996 Konstituentenbewegung in der DP-Struktur. Linguistische Arbeiten 347. Tübingen: Max Niemeyer Verlag. Frampton, John, and Sam Gutmann 2000 Agreement is feature sharing. Ms., Northeastern University, Boston. Franks, Steven 2007 Deriving discontinuity. In Studies in Formal Slavic Linguistics, ed. by Frank Marušiþ and Rok Žaucer, 103-120. Frankfurt am Main: Peter Lang. Fukui, Naoki 1988 Deriving the differences between English and Japanese. English Linguistics 5: 249-270. Futagi, Yoko 2004 Japanese focus particles at the syntax-semantics interface. Doctoral dissertation, Rutgers University. Gavruseva, Elena 2000 On the syntax of possessor extraction. Lingua 110: 743-772. Giorgi, Alessandra, and Giuseppe Longobardi 1991 The syntax of noun phrases: Configuration, parameters, and empty categories. Cambridge: Cambridge University Press. Grewendorf, Günther 1989 Ergativity in German. Dordrecht: Foris. Grohmann, Kleanthes 2003 Prolific domains: on the anti-locality of movement dependencies. Amsterdam: John Benjamins. Gutiérrez-Rexach, Javier, and Enrique Mallén 2001 NP movement and adjective position in the DP phases. In Features and Interfaces in Romance, ed. 
by Julia Herschensohn, Enrique Mallén,

Željko Bošković

and Karen Zagona, 107-132. Amsterdam and Philadelphia: John Benjamins. Harada, Yasunari, and Naohiko Noguchi 1992 On the semantics and pragmatics of dake (and only). In Proceedings of the 2nd conference on semantics and linguistic theory, eds. Chris Barker and David Dowty, 125-144. Columbus: Ohio State University. Heck, Fabian, Gereon Müller, and Jochen Trommer 2008 A phase-based approach to Scandinavian definiteness marking. In Proceedings of the 26th West Coast Conference on Formal Linguistics, ed. by Charles B. Chang and Hannah J. Haynie, 226-233. Somerville, Mass.: Cascadilla Press. Herdan, Simona 2008 Degrees and amounts in relative clauses. Doctoral dissertation, University of Connecticut, Storrs. Hoof, Hanneke van 2006 Split topicalization. In The Blackwell companion to syntax, ed. by Martin Everaert and Henk van Riemsdijk, 408-462. Oxford: Blackwell. Huang, C.-T. James 1982 Logical relations in Chinese and the theory of grammar. Doctoral dissertation, MIT, Cambridge, Mass. Ishii, Toru 1999 Cyclic spell-out and the that-trace effect. In Proceedings of the 18th West Coast Conference on Formal Linguistics, ed. by Sonya Bird, Andrew Carnie, Jason D. Haugen, and Peter Norquest, 220-231. Somerville, Mass.: Cascadilla Press. Jeong, Youngmi 2006 The landscape of applicatives. Doctoral dissertation, University of Maryland, College Park. Kaplan, David 1977/1989 Demonstratives. In Themes from Kaplan, ed. by Joseph Almog, John Perry, and Howard Wettstein, 481-563. Oxford: Oxford University Press. Kayne, Richard 1994 The antisymmetry of syntax. Cambridge, Mass.: MIT Press. Koizumi, Masatoshi 1994 Phrase Structure in Minimalist Syntax. Doctoral Dissertation, MIT, Cambridge, Mass. Kramer, Ruth 2009 Definite Markers, Phi Features and Agreement: A Morphosyntactic Investigation of the Amharic DP. Doctoral dissertation, University of California, Santa Cruz. Kuno, Susumu 1973 The structure of the Japanese language. Cambridge, Mass: MIT Press.

Kuroda, Shige-Yuki 1965 Generative grammatical studies in the Japanese language. Doctoral Dissertation, MIT, Cambridge, Mass. Kuthy, Kordula de 2002 Discontinuous NPs in German: A Case Study of the Interaction of Syntax, Semantics, and Pragmatics. Stanford: CSLI Publications. Larson, Richard, and Sungeun Cho 1999 Temporal adjectives and the structure of possessive DPs. In Proceedings of the 18th West Coast Conference on Formal Linguistics, ed. by Sonya Bird, Andrew Carnie, Jason D. Haugen, and Peter Norquest, 299-311. Somerville, Mass.: Cascadilla Press. Lasnik, Howard, and Mamoru Saito 1992 Move α: Conditions on its application and output. Cambridge, Mass.: MIT Press. Legate, Julie Anne 2005 Phases and cyclic agreement. In Perspectives on Phases: MIT Working Papers in Linguistics 49, ed. Martha McGinnis and Norvin Richards, 147-156. MITWPL, Department of Linguistics and Philosophy, MIT, Cambridge, Mass. Marelj, Marijana 2008 Probing the relation between binding and movement: Left branch extraction and pronoun-insertion strategy. In Proceedings of 37th Conference of the North-Eastern Linguistic Society, 73-86. Amherst: GLSA, University of Massachusetts. Matushansky, Ora 2005 Going through a phase. In Perspectives on Phases: MIT Working Papers in Linguistics 49, ed. Martha McGinnis and Norvin Richards, 157-181. MITWPL, Department of Linguistics and Philosophy, MIT, Cambridge, Mass. Morzycki, Marcin 2008 Nonrestrictive modifiers in nonparenthetical positions. In Adjectives and adverbs: Syntax, semantics and discourse, ed. by Chris Kennedy and Louise McNally, 101-122. Oxford: Oxford University Press. Müller, Gereon 1998 Incomplete category fronting: A derivational approach to remnant movement in German. Dordrecht: Kluwer. Müller, Gereon, and Wolfgang Sternefeld 1993 Improper movement and unambiguous binding. Linguistic Inquiry 24: 461-507. Murasugi, Keiko, and Tomoko Hashimoto 2005 Three pieces of acquisition evidence for the v-VP Frame. Nanzan Linguistics 1: 1-19.

Nomura, Masashi 2003 The true nature of nominative objects in Japanese. In Proceedings of the 26th Annual Penn Linguistics Colloquium, ed. by Elsi Kaiser and Sudha Arunachalam, 169-183. Philadelphia: University of Pennsylvania, Penn Linguistics Club. Nomura, Masashi 2005 Nominative Case and AGREE(ment). Doctoral dissertation, University of Connecticut, Storrs. Obata, Miki 2010 Root, Successive-Cyclic and Feature-Splitting Internal Merge: Implications for Feature-Inheritance and Transfer. Doctoral dissertation, University of Michigan. Ormazabal, Javier 1991 Asymmetries on wh-movement and some theoretical consequences. Ms., University of Connecticut, Storrs. Pafel, Jürgen 1993 Ein Überblick über die Extraktion aus Nominalphrasen im Deutschen. In Extraktion im Deutschen I, vol. 34 of Arbeitspapiere des SFB 340, ed. by Franz-Josef d'Avis, Sigrid Beck, Uli Lutz, Jürgen Pafel, and Susanne Trissler, 191-245. Universität Tübingen. Partee, Barbara H., and Vladimir Borschev 1998 Integrating lexical and formal semantics: Genitives, relational nouns, and type-shifting. In Proceedings of the Second Tbilisi Symposium on Language, Logic, and Computation, ed. by Robin Cooper and Thomas Gamkrelidze, 229-241. Tbilisi: Center on Language, Logic, Speech, Tbilisi State University. Pereltsvaig, Asya 2007 On the universality of DP: A view from Russian. Studia Linguistica 61: 59-94. Pesetsky, David, and Esther Torrego 2007 The syntax of valuation and the interpretability of features. In Phrasal and clausal architecture: Syntactic derivation and interpretation. In honor of Joseph E. Emonds, ed. by Simin Karimi, Vida Samiian, and Wendy K. Wilkins, 262-294. Amsterdam: John Benjamins. Rappaport, Gilbert 2000 Extraction from Nominal Phrases in Polish and the theory of determiners. Journal of Slavic Linguistics 8: 159-198. Reintges, Chris, and Anikó Lipták 2006 Have = be + prep: New Evidence for the preposition incorporation analysis of clausal possession. 
In Phases of Interpretation Studies in Generative Grammar 91, ed. by Mara Frascarelli, 107-132. Berlin: Mouton.

Riqueros, Jose in prep. Spanish nominal extraction and the concept of phase. Ms., University of Connecticut, Storrs. Rizzi, Luigi 2006 On the form of chains: Criterial positions and ECP effects. In Wh-movement: Moving on, ed. by Lisa Lai-Shen Cheng and Norbert Corver, 97-133. Cambridge, MA: MIT Press. Rodman, Robert 1977 Concerning the NP Constraint. Linguistic Inquiry 8: 181-184. Rodríguez-Mondoñedo, Miguel 2007 The syntax of objects: Agree and differential object marking. Doctoral dissertation, University of Connecticut, Storrs. Roehrs, Dorian 2006 The morpho-syntax of the Germanic Noun Phrase: Determiners move into the Determiner Phrase. Doctoral dissertation, Indiana University, Bloomington. Runić, Jelena 2011 Clitic doubling in non-standard Serbian and Slovenian dialects. Ms., University of Connecticut, Storrs. Saito, Mamoru 2006 Subjects of complex predicates: A preliminary study. In Stony Brook Occasional Papers in Linguistics 1, ed. by Tomoko Kawamura, Yunju Suh, and Richard K. Larson, 172-188. Department of Linguistics, Stony Brook University, New York. Saito, Mamoru, and Keiko Murasugi 1999 Subject predication within IP and DP. In Beyond principles and parameters, ed. by Kyle Johnson and Ian G. Roberts, 167-188. Dordrecht: Kluwer. Sánchez, Liliana 1996 Syntactic structure in nominals: A comparative study of Spanish and Southern Quechua. Doctoral dissertation, University of Southern California, Los Angeles. Schoorlemmer, Eric 2009 Agreement, dominance, and doubling: The morphosyntax of DP. Doctoral dissertation, University of Leiden. Shoji, Atsuko 1986 Dake and sika in Japanese: Syntax, semantics and pragmatics. Doctoral dissertation, Cornell University. Starke, Michal 2001 Move dissolves into Merge: A theory of locality. Doctoral dissertation, University of Geneva. Stjepanović, Sandra 1998 Extraction of adjuncts out of NPs. Presented at the Comparative Slavic Morphosyntax Workshop, Spencer, Indiana.

Stowell, Timothy 1989 Subjects, specifiers, and X-bar theory. In Alternative conceptions of phrase structure, ed. by Mark R. Baltin and Anthony S. Kroch, 232-262. Chicago: University of Chicago Press. Svenonius, Peter 2004 On the edge. In Peripheries: Syntactic edges and their effect, ed. by David Adger, Cécile De Cat, and George Tsoulas, 259-287. Dordrecht: Kluwer. Szabolcsi, Anna 1994 The noun phrase. In Syntax and Semantics 27, ed. by Ferenc Kiefer and Katalin É. Kiss, 179-274. New York: Academic Press. Tada, Hiroaki 1992 Nominative objects in Japanese. Journal of Japanese Linguistics 14: 91-108. Takahashi, Masahiko in press a. Case-valuation, phasehood, and nominative/accusative conversion in Japanese. In Proceedings of 39th Conference of the North-Eastern Linguistic Society. Amherst: GLSA, University of Massachusetts. in press b. Case, Phases, and Nominative/Accusative Conversion in Japanese. Journal of East Asian Linguistics 19. Ticio, Emma 2003 On the structure of DPs. Doctoral dissertation, University of Connecticut, Storrs. Torrego, Esther 1987 On empty categories in nominals. Ms., University of Massachusetts, Boston. Uchishiba, Shin’ya 2006 The enhancement/repression of phasehood. In KLS 27: Proceedings of the Thirty-First Annual Meeting, 195-205. Kansai Linguistic Society. Ura, Hiroyuki 1996 Multiple Feature Checking: A Theory of Grammatical Function Splitting. Doctoral Dissertation, MIT, Cambridge, Mass. Uriagereka, Juan 1988 On government. Doctoral dissertation, University of Connecticut, Storrs. Willim, Ewa 2000 On the grammar of Polish nominals. In Step by step: Essays on minimalist syntax in honor of Howard Lasnik, ed. by Roger Martin, David Michaels, and Juan Uriagereka, 319-346. Cambridge, Mass.: MIT Press. Zlatić, Larisa 1994 An asymmetry in extraction from noun phrases in Serbian. Indiana Linguistic Studies 7: 207-216.

1997 The structure of the Serbian Noun Phrase. Doctoral dissertation, University of Texas, Austin. 1998 Slavic Noun Phrases are NPs, not DPs. Presented at the Comparative Slavic Morphosyntax Workshop, Spencer, Indiana.

Phases, head movement and second-position effects*

Ian G. Roberts

1. Introduction

This paper is an investigation into the nature of second-position (P2) effects in syntax and the nature of the ‘C-field’. I argue that P2 effects are largely attributable to the nature of C as a phase head (PH). P2 effects are very widespread cross-linguistically; languages showing such effects are found in all the European branches of Indo-European as well as in all the ancient Indo-European languages (see Wackernagel 1892, Fortson 2004:146-7, and §3.2 below). Such effects are also found in Basque (Laka 1990), Warlpiri (Hale 1983) and many other non-Indo-European languages (see the papers in Halpern & Zwicky 1996). Although one should be wary of drawing inferences about UG from the frequency with which a phenomenon is attested, the cross-linguistic evidence nonetheless suggests that there is something special about the second position. This paper proposes an account of that ‘specialness’, exploiting the essential idea that C is a phase head.

The central proposal of this paper develops ideas sketched in Roberts (2010a:65ff.) and is as follows: (i) if, as I have argued extensively elsewhere (Roberts 2010a), narrow-syntactic head-movement exists and is a reflex of the Agree relation under certain highly specific conditions, and (ii) if, as proposed in Chomsky (2008), phase heads drive all narrow-syntactic operations in virtue of their uninterpretable formal features and their Edge Features (EF), then (iii) head-movement to PH-positions should exist and, most importantly, owing to the nature of head-movement as a purely Agree-based operation, this operation cannot satisfy PH’s EF. In the case of C, this gives rise to verb- and clitic-second effects, where C’s uninterpretable formal features attract the verb or clitic, with its EF attracting some XP to its edge, subject to a discourse interpretation. The discourse-relevant interpretations include topic, focus and Wh; in fact, where verb- and clitic-movement interact, we are led to propose a version of the ‘extended left periphery’ of the kind proposed in Rizzi (1997). In terms of Rizzi’s approach, the head I am here identifying as C would be Fin.1

Each section of the paper is devoted to a different type of P2 effect: Serbian/Croatian P2 clitics are dealt with in §2, Germanic V2 in §3, and European Portuguese enclisis in §4.

2. Serbian / Croatian

It is well known that many (but not all: Bulgarian and Macedonian are exceptions) Southern and Western Slavonic languages show P2 effects. This is true of Slovenian, Croatian, Serbian, Czech and Slovak (see Franks & King 2000 for an overview). In each of these languages there is a clitic cluster which has very rigid internal and external ordering requirements. The external requirement is that the cluster must come exactly second in the clause: it cannot come first and it cannot come third or later. This, of course, is the P2 effect. The internal ordering inside the cluster is also highly constrained. (1) gives the order in Serbian/Croatian, Czech and Slovak (Franks & King 2000:28, 105, 128):

(1) a. Serbian/Croatian: li - aux - dat - acc - gen - reflexive - je
    b. Czech: li - conditional/aux - non-agr dat/refl - dat - gen/acc
    c. Slovak: by (conditional) - aux - refl - dat - acc - gen
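The templates in (1) can be read as ordered lists of slots. As a rough illustration only (it is not part of Franks & King's description), the following Python sketch checks whether a sequence of clitic categories respects the Serbian/Croatian template in (1a). The slot labels are copied from (1a); the function name `respects_template` and the encoding of a cluster as a list of slot labels are assumptions made purely for this sketch.

```python
# Slot order for Serbian/Croatian, copied from (1a).
SC_TEMPLATE = ["li", "aux", "dat", "acc", "gen", "reflexive", "je"]

def respects_template(cluster, template=SC_TEMPLATE):
    """Return True if the slot labels in `cluster` occur in template order.

    Slots may be left empty, but no slot may be repeated or occur out of
    order. An unknown label raises ValueError (hypothetical design choice).
    """
    positions = [template.index(slot) for slot in cluster]
    return all(a < b for a, b in zip(positions, positions[1:]))

print(respects_template(["dat", "acc"]))  # True: dat precedes acc in (1a)
print(respects_template(["acc", "dat"]))  # False: order reversed
```

The check is purely linear; it says nothing about where the cluster as a whole sits in the clause, which is the separate, external P2 requirement.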

(Li is an interrogative morpheme (more below); Croatian je is the 3sg form of ‘be’). From (1), we can extrapolate the common order (Q) - Aux - Pronoun. Within the pronominal cluster, we can note the consistent order dat - acc - gen. In fact, the order Aux - dat - acc - (gen) is also found in all the Slavic languages with non-P2 clitics: Slovenian, Bulgarian, Macedonian, Polish and Serbian.

It has often been claimed that the P2 effect in these languages should not be defined in terms of phrasal constituents but rather in terms of words. This proposal has been made for two main reasons: first, a verb may precede the cluster (the most striking case of this is when a non-finite verb precedes a finite auxiliary);2 second, there are cases where it seems as though the clitic cluster interrupts a phrasal constituent. These phenomena are illustrated in (2a) and (2b) respectively (this data is Croatian, again from Ćavar & Wilder (1992) unless otherwise indicated):

(2) a. Dao ga je Mariji
       given it has (he) to-Maria
       He has given it to Maria
    b. Taj mi je pjesnik dao autogram
       this to-me has poet given autogram
       This poet has given me an autogram

We will discuss (2a) in some detail below. The main thing to note here is that this is a genuine case of V-fronting: it is not remnant topicalisation of the kind found in German (cf. Gelesen hat er das Buch nicht ‘Read has he the book not’; den Besten & Webelhuth 1989 and the discussion in §3). This point is demonstrated in detail in Ćavar & Wilder (1992). Consequently, the Slavonic P2 effect cannot be stated purely in terms of phrasal constituents, as it can in Germanic. We have here an important difference between the two types of language. Moreover, in addition to verbs, adjectives can precede the clitic cluster in copular clauses:

(3) Pametan je / *je pametan
    intelligent is (he) / is (he) intelligent
    He is intelligent

So there are genuine cases where a single word, in fact a head, precedes the clitic cluster. On the other hand, the putative cases where the clitic cluster breaks up a phrasal constituent can be shown to involve subextraction of part of the constituent in question.3 They thus reduce to the general kind of fronting operation which places material in front of the clitic cluster. This kind of fronting violates Ross’ (1967) Left Branch Condition (LBC), but it is well known that Slavic languages typically violate this constraint in any case; Bošković (2008, 2010, this volume) argues that this is connected to the general absence of D-positions in these languages. Thus it is possible to establish a straightforward correlation between wh-extraction off a left branch and extraction of a subconstituent of a DP off a left branch:

(4) a. Tatino Ivan razbija [ (tatino) auto ]
       father’s Ivan ruins car
       Ivan is ruining his father’s car
    b. Čije Ivan razbija [ (čije) auto ] ?
       whose Ivan ruins car
       Whose car is Ivan ruining?

We are then justified in positing a similar subextraction in an example like (2b):

(5) Taj mi je [ (taj) pjesnik ] …

We see that the apparent breaking up of an initial constituent is purely illusory.4 An extension of this argument is provided in Borsley & Rivero (1994), who point out that if one analyses examples comparable to (2b) as involving a clitic breaking up a constituent, one is obliged to say that this can happen only where the constituent can be broken up by movement. For example, in Polish, prepositions cannot be separated from their complements. Similarly, clitics cannot intervene between prepositions and their complements:

(6) *Do-ś Poznania pojechal
    to-2.sg Poznań gone
    You went to Poznań

Following the general approach to clitic placement instigated in Kayne (1991), I regard clitics as occupying a fixed position, with certain constituents able to move leftward over them. The possibility of leftward movement extends to a wide range of subextractions.5 (Polish clitics are not, however, subject to P2 effects; cf. Borsley & Rivero 1994.) I conclude that the P2 effect on clitic clusters in Croatian, and elsewhere in South and West Slavic as mentioned above, requires that the clitic be preceded by either one XP or one head.6 The XP in question can be essentially anything, subject to the standard constraints on movement, etc.:

(7)

a. Anina sestra im nudi čokoladu
   Ana’s sister them offers chocolate
a’. Anina im sestra nudi čokoladu
   Ana’s them sister offers chocolate
   Ana’s sister offers them chocolate
b. Na sto ga ostavi
   on table it leave
   Leave it on the table [from Progovac 1995:414-6]
c. Dao knjigu mi Ivan nije
   given book to-me Ivan isn’t
c’. Dao mi knjigu Ivan nije
   given me book Ivan is-not
   Ivan hasn’t given me a/the book
d. Koliko im ko daje?
   how-much them who gives
   Who gives them how much?
e. Vidio ga je
   seen him is
   He has seen him
f. Da li ga Ivan vidi?
   C Q him Ivan see
   Does Ivan see him? [from Dimitrova-Vulchanova 1999:109]
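The descriptive generalization that the cluster is preceded by exactly one XP or one head can be stated as a simple check over a pre-segmented clause. The sketch below is only an illustration of that surface generalization, not of the analysis developed in this paper: the segmentation into constituents is supplied by hand (and is itself an analytical decision), and the name `satisfies_p2` and the pair encoding are assumptions of the sketch.

```python
def satisfies_p2(clause):
    """Check the surface P2 generalization on a pre-segmented clause.

    `clause` is a list of (form, is_clitic) pairs, one pair per constituent
    (an XP or a head). P2 holds iff the clitics form one contiguous cluster
    that starts immediately after the first constituent.
    """
    flags = [is_clitic for _, is_clitic in clause]
    if True not in flags:
        return True  # no clitics, so nothing to check
    first = flags.index(True)
    last = len(flags) - 1 - flags[::-1].index(True)
    contiguous = all(flags[first:last + 1])
    return first == 1 and contiguous

# (7a): one XP, then the clitic
ex_7a = [("Anina sestra", False), ("im", True), ("nudi", False), ("čokoladu", False)]
# (7e): one head, then the cluster
ex_7e = [("Vidio", False), ("ga", True), ("je", True)]
# Constructed ill-formed order with the clitic in third position
third = [("Anina sestra", False), ("nudi", False), ("im", True), ("čokoladu", False)]

print(satisfies_p2(ex_7a), satisfies_p2(ex_7e), satisfies_p2(third))  # True True False
```

The function deliberately treats ‘one XP’ and ‘one head’ alike; distinguishing them is exactly what the structural account in the text is for.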

The example in (7a’) shows that the clitic can also follow material subextracted from a left branch (see the discussion of (2b) and (5) above). (7c’), like (7e), is a case of long verb-movement; see below. (Thanks to Moreno Mitrovic for providing these examples.) In embedded finite clauses, the clitic cluster immediately follows the complementizer. No material may intervene between them:

(8) a. ... da mu ga Ivan daje
       ... that him it Ivan gives
    b. *... da rado mu ga Ivan daje
       ... that gladly him it Ivan gives
    c. *... da Ivan mu ga daje
       ... that Ivan him it gives
       ... that Ivan gives it to him (gladly)

The data from embedded clauses shows two things. First, the clitic cluster occupies a position above the normal position of the subject and below the normal position of the complementizer. I thus take the clitic cluster to occupy a position between C and T.7 This position is similar to that occupied by clitics/weak pronouns in Germanic, as we shall see in detail below. We thus have the following structure in the relevant parts of embedded clauses:

(9) [CP [C da ] [XP [X clitics X ] [TP [DP subject ] T’ ]]]

Since we want the head attracting clitics to be a phase head, it must be a further variety of C. The most natural move is then to identify the C in (9) as Force and X as Fin, using the terminology of Rizzi (1997). This analysis implies that movement to a position preceding the clitic in main clauses is movement to one of the specifier positions in the C-field, SpecForce or SpecFin. This approach is compatible with the idea that C is a phase head and functions as the attractor for movement if we take it that certain feature combinations in C may be ‘split’ to form separate heads which are associated with particular discourse-related interpretations, very much along the lines of Rizzi (1997). We can understand this idea in the following way: each ‘core functional category’ of the clausal hierarchy, v, T, and C, can be seen as adding its own formal feature, as indicated in (10):

(10) C[+Clause type, +T, +V], T[+T, +V], v[+V], V[]
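The feature bundles in (10) lend themselves to a small computational reading. In the following Python sketch the ‘formal weight’ of a head is simply the number of features in its bundle, and a head counts as a phase head iff that weight is odd, which is the classification the discussion that follows arrives at. The bundles are copied from (10); the dictionary layout and the function names are assumptions introduced only for illustration.

```python
# Feature bundles copied from (10).
FEATURES = {
    "C": {"Clause type", "T", "V"},
    "T": {"T", "V"},
    "v": {"V"},
    "V": set(),
}

def formal_weight(head):
    """Formal weight: the number of formal features in the head's bundle."""
    return len(FEATURES[head])

def is_phase_head(head):
    """A head counts as a phase head iff its formal weight is odd."""
    return formal_weight(head) % 2 == 1

for head in ["C", "T", "v", "V"]:
    print(head, formal_weight(head), is_phase_head(head))
# C 3 True
# T 2 False
# v 1 True
# V 0 False
```

On this reading the alternation of phase heads and non-phase heads falls out from parity alone; heads produced by ‘splitting’ C (Force, Focus, Fin) are not modelled here, since their exact bundles are not given in (10).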

We can thus define each category in terms of its intrinsic ‘formal weight’: C=3, T=2, v=1 and V=0. Lexical categories can be defined as having a formal weight of 0. More interestingly, since we know that limiting the structure of the clause to the three core functional categories is empirically inadequate (this paper adds to the evidence that this is true of the C-system, and the entire ‘cartographic tradition’ stemming from Cinque 1999 clearly indicates this), we can now define a cartographic field: a cartographic field consists of a sequence of structurally adjacent heads of equal formal weight (where α is structurally adjacent to β iff α either minimally asymmetrically c-commands β or is minimally asymmetrically c-commanded by β). Furthermore, we can extend Richards’ (2007) notion that phase heads must alternate with non-phase heads, and obtain the result that H is a phase head iff H has an odd-numbered formal weight. We then adopt Chomsky’s (2008) idea that only phase heads may be probes, and assume that EF and uninterpretable features can only be associated with heads of odd-numbered formal weight. This gives us the result that C (as well as v, of course) can be a distributed phase head.

To the extent that the phase heads act as a single head, we might think that the structurally lowest head will bear probing features, as this always creates a smaller distance between probe and goal than otherwise, and that only the highest head has EF, since the highest edge can be thought of as the edge of the entire field of equally heavy heads. Tempting though this line of thought is, it cannot be true in all cases, as we shall see instances in which lower heads have EF and higher heads have probing features; it remains possible that this is a preference, though. As we shall see, in clitic-second languages, C’s phase-head properties are distributed across Fin (the probing features) and Force (the Edge Feature).

We are now faced with two questions.
First, what causes clitics to move to this position in the C-system? Second, how are complement clitics in these languages able to avoid cliticisation to v, a phase head which is also capable of attracting clitics, and indeed does so in many languages, including most of the Romance languages (see Roberts 2010a)?

Regarding the first question, I follow the general approach in Roberts (2010a) and therefore take it that clitic-incorporation takes place where (a) an Agree relation holds between the host (probe) and the clitic (goal), and (b) the formal features of the clitic-goal are properly included in those of the host-probe; in this case the goal is defective. For example, Romance object clitics are taken to be φ-elements, i.e. a bundle of (interpretable) φ-features lacking a D-feature. The probe v has (uninterpretable) φ-features, in addition to its intrinsic V-feature (see (10)) and other features (perhaps EF, for example). The important point is that the features of the clitic are included in those of v; hence the clitic is a defective goal (see Roberts 2010a for more details). This means that the Match relation holding in virtue of Agree causes the goal to become a featural copy of the probing features of the host. The standard chain-reduction algorithm (e.g. as in Nunes 2004) then has the effect of treating the goal as the copy of the host and so spells out the features of the host rather than those of the goal. This gives the effect of movement, with no direct trigger for movement at all. In fact, an EF or EPP feature associated with the probe arguably could not be checked by Agree with a defective goal since the movement effect is independently guaranteed.8 In these terms, second-position clitics in S/C and similar languages are attracted to Fin by the φ-features of Fin.

Concerning the second question, in Roberts (2010a) I propose that second-position clitics may ‘escape’ vP because they are Dmin/max, rather than φmin/max. Because of this they are distinct from v and so unable to incorporate to it, since v has no D-feature.
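The incorporation condition just described (Agree plus proper inclusion of the goal's formal features in the probe's) can be sketched with sets. This is a deliberately crude model: it ignores the interpretable/uninterpretable distinction and treats each feature as an atomic label. The bundles for the φ-clitic and for v follow the text; the bundle for a D-type clitic and the bundle for Fin (φ, D and T) are assumptions of the sketch that anticipate the discussion immediately below, and all names are hypothetical.

```python
def is_defective_goal(goal, probe):
    """A goal is defective relative to a probe iff its formal features
    are properly included in the probe's (proper subset)."""
    return goal < probe

def incorporates(goal, probe, agree=True):
    """Incorporation requires an Agree relation and a defective goal."""
    return agree and is_defective_goal(goal, probe)

PHI_CLITIC = {"phi"}            # Romance-type object clitic: phi-features, no D
D_CLITIC = {"phi", "D"}         # assumed bundle for a D-type (second-position) clitic
V_PROBE = {"phi", "V"}          # v: phi-features plus its intrinsic V-feature
FIN_PROBE = {"phi", "D", "T"}   # assumed bundle for Fin in S/C-type languages

print(incorporates(PHI_CLITIC, V_PROBE))   # True: phi-clitic cliticises to v
print(incorporates(D_CLITIC, V_PROBE))     # False: v has no D-feature
print(incorporates(D_CLITIC, FIN_PROBE))   # True: D-clitic is attracted to Fin
```

The second and third calls together show why, on these assumptions, a D-type clitic bypasses v but is picked up by Fin; how it gets out of vP is a separate matter taken up in the text.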
I further assume that Fin in languages like S/C has D-features in addition to φ-features, so Dmin/max will be attracted to this position following the mechanisms outlined in the previous section. It is likely that pronominal clitics are the only Dmin/max elements in the language: proper names arguably have more internal structure, involving the ‘lexical’ n/N phase (see Longobardi 1994), as do bare quantifiers, in which the n/N phase can be seen as a structural manifestation of their restrictor. (It is also possible that there are no Ds inside nominals in S/C and similar languages; see Bošković 2008, 2010.) The auxiliary clitics also bear a D-feature to the extent that their person-agreement specification is sufficiently ‘rich’ to license a null subject (recall that S/C, like all South Slavic languages, is a consistent null-subject language; see Holmberg 2005, 2010 on licensing null subjects). These elements, which I am taking to be first-merged in T, have T-features, but so does Fin. Main verbs presumably lack T-features and therefore cannot raise to Fin.

This much allows for D-cliticisation to C, but we also need to allow Dmin/max to escape the vP phase, since the Phase Impenetrability Condition will not allow movement in one step from a complement position inside VP to a C-position (this is true on either of the formulations in Chomsky 2001, as well as that in Rackowski & Richards 2005). Clearly, vP must have an Edge Feature allowing Dmin/max to move through a Spec,vP. It is highly unlikely that such a feature would be specialized for clitic-movement, however. Instead, I propose that this feature is one formal correlate of a generalized scrambling/‘free-word-order’ type of system. It seems to be an empirically correct observation that languages with second-position clitics have scrambling: this is true of Southern Slavic languages generally (except for Bulgarian/Macedonian, which have scrambling but adverbal clitics; see Bošković 2001), Latin, Old Spanish and perhaps other Old Romance languages (Fontana 1993, Ledgeway to appear, Rivero 1986, 1992), German and Classical Greek to my knowledge (note that the implication is one-way: scrambling does not imply second-position clitics, or indeed any clitics at all). So we see that second-position cliticisation arises from the combination of the availability of an attractor in the C-system (a point which is clearly subject to parametric variation), the fact that the clitics are D-elements, and a generally available Edge Feature on v, giving rise to free movement of internal arguments into the Mittelfeld, one kind of scrambling.9

The data in (8) further shows that no fronting can take place where a complementizer is present in Force. This recalls the well-known situation in relation to Germanic verb-second (den Besten 1983), and indicates that the target position for V-movement in V2 and XP-movement with P2 clitics is SpecForce. Matrix Force has an EF which triggers movement. Embedded Force also has an EF, but this feature is ‘checked’ by means of selection by, and therefore Merge with, the higher predicate. Thus the structure of examples in (7a-d) is as in (11), where some XP is fronted to SpecForceP:

(11) [ForceP XP [Force’ Force [FinP [Fin clitics Fin ] [TP DP T’ ]]]]

Again, the parallel with V2 is clear: XP-fronting is triggered by the EF of Force. This triggers XP-movement to SpecForceP, and a special discourse interpretation (generally topicalisation) for the fronted XP. (7b,c) show that XP does not have to be a DP. (7d) shows that the same operations trigger wh-fronting. However, here a little more needs to be said, since the languages in question are all multiple wh-fronting languages, as (12) illustrates for Serbian/Croatian:

(12) Ko koga vidi?
     who whom sees
     Who sees who?

The clitic cluster, including li, must follow the first wh-phrase in S/C (examples from Rudin 1988):

(13) a. Ko mu je šta dao?
        who him is what given
        Who has given him what?
     b. *Ko šta mu je dao?
        who what him is given

(14) a. Ko je šta kome dao?
        who is what to-whom given
        Who gave what to whom?
     b. *Ko šta je kome dao?
        who what is to-whom given
     c. *Ko šta kome je dao?
        who what to-whom is given

(15) Ko li koga voli?
     who Q whom loves
     Who on earth loves whom? [from Bošković 2002]

Bošković (2002) argues convincingly that multiple wh-fronting results from a general requirement for focus-fronting of wh-phrases, with ‘true’ wh-movement to SpecCP being a further possibility, whose instantiation varies across languages. He shows that, once focus-fronting is controlled for, the variation in ‘true’ wh-movement in multiple-fronting languages parallels that in wh-movement in non-multiple-fronting languages. In terms of the proposals in Rizzi (1997) (which Bošković does not explicitly adopt), we can treat ‘true’ wh-movement as movement to SpecForceP and focus-fronting as movement to SpecFocus. Since Rizzi posits that Focus is structurally higher than Fin, this motivates the postulation of Fin-to-Force movement in S/C wh-questions (Fin’s φ- and D-features are no longer active after clitic-incorporation, and so it has just a T-feature; Force can be postulated to have a T-feature when [+wh], rather as it does in English where it thereby triggers subject-aux inversion). In that case, assuming with Bošković that S/C is a language which allows just one wh-phrase to undergo ‘true’ wh-movement to SpecCP, we derive the order Wh-phrase - clitics - wh-phrases seen in (13-15) with the structure in (16):10

(16) [ForceP WhP [Force’ [Force[WH, FOC, T] [FinT clitics ] Force ] [FocusP WhP* [Focus’ Focus [FinP (Fin) TP ]]]]]

One consequence of adopting Fin-to-Force movement concerns the first-merged position of li. If this element is merged in Force, as suggested in fn. 9, then it should follow the rest of the clitic cluster after left-adjunction of Fin to Force. However, this is clearly not the case. Instead, I suggest that li is first-merged in Focus and raises independently of the clitic cluster to Force. Since Fin must raise before Focus, by the Strict Cycle, Focus containing li will left-adjoin to Force and so precede Fin, the derived structure being [Force [Focus li ] [Force [Fin clitics ] Force ]].11, 12

In languages of the S/C type, the finite verb does not raise to second position. In this respect, of course, this type of language seems quite different from V2 languages. The notion of a ‘clitic-second’ requirement arises from mechanisms similar to those at work in V2 languages (as we shall see in more detail in the next section when we look at V2) and, like V2, is in fact epiphenomenal: it is a consequence of the features of Fin and those of Force.
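The ordering argument for li can be made concrete with a toy model of left-adjunction. The sketch below assumes, purely for illustration, that a complex head is a nested list, that a null head is represented by `None`, and that linear order is read off left to right; the cluster mu je is borrowed from (13a) and li from (15), but their combination here is a constructed illustration rather than an attested example.

```python
def left_adjoin(adjunct, host):
    """Left-adjoin `adjunct` to `host`, returning the derived complex head."""
    return [adjunct, host]

def terminals(node):
    """Left-to-right list of overt terminals; None stands for a null head."""
    if node is None:
        return []
    if isinstance(node, str):
        return [node]
    return [t for child in node for t in terminals(child)]

fin = left_adjoin(["mu", "je"], None)   # clitic cluster incorporated into a null Fin
focus = left_adjoin("li", None)         # li first-merged in a null Focus
force = None                            # null Force head

force = left_adjoin(fin, force)         # Fin raises first (Strict Cycle)
force = left_adjoin(focus, force)       # Focus raises next and lands to the left of Fin

print(terminals(force))                 # ['li', 'mu', 'je']
```

Reversing the two adjunction steps yields mu je li, which is the order the text rules out; the model thus captures only the point that the later adjunct ends up leftmost.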

This analysis leads us to the view that declarative main SVO clauses with no focussed constituent feature fronting of the subject to SpecForce. Hence the structure of (17) is as in (18), assuming Focus is not obligatorily present:13

(17) Ivan čita knjigu
     Ivan reads book

(18) [ForceP [DP Ivan ] [Force’ Force [FinP Fin [TP čita knjigu ]]]]

This analysis may suffer from the same conceptual drawbacks as comparable approaches to SV orders in matrix declaratives in V2 languages (on which see Zwart 1997). However, I know of no empirical drawbacks to this approach, and the other advantages of the system being proposed here lead me to maintain this analysis.

In (7e), the participle raises over the clitic cluster. Fronting a verb is generally possible with finite main verbs whether the clitics are present or not. Where clitics are not present, what Ćavar & Wilder (1992) call a ‘topicalized V reading’ arises. Where clitics are present, V-fronting gives the unmarked reading. Participles, on the other hand, only front where clitics are present:

(19) a. Čita Ivan knjigu (topicalized V reading)
        reads Ivan book
     b. Čita ga Ivan
        reads it Ivan
     c. Čitao sam knjigu (clitic auxiliary)
        read I-am book
     d. *Čitao jesam knjigu (full-form auxiliary)
        read I-am book

Ian G. Roberts

Clearly, no XP has fronted over the auxiliary in any of these examples. (19b,c) show that V can front over the clitic cluster where no XP does. As noted above, this is a major difference with V2: in V2 clauses, both V and some XP must be fronted, while in S/C-type languages it seems that either V or XP must be fronted. I attribute the ill-formedness of (19d) to the fact that the full-form auxiliary must be fronted; this element cannot cliticize or undergo any form of movement to Fin (assuming verbs raise to T in SC, which is plausible since it is a ‘rich-agreement’ null-subject language like the majority of the Romance languages; this amounts to saying that Fin fails to attract T, presumably because it lacks the relevant V-features). Therefore the full-form auxiliary can only be fronted by Force’s EF, but in (19d) two elements would then be fronted in this way. EF-driven fronting of more than one element is impossible, however. The ban on EF-driven fronting of more than one element can be attributed to relativized minimality: EF triggers movement of any category, but that category then counts as an intervener for further EF-driven movement to a higher position, including a higher specifier of the same head (see Roberts 2004). Given this, in order to allow for the combination of scrambling and wh-movement to SpecvP, we must assume that Cs which attract wh-features have an extra feature in addition to EF, so that we have the configuration C[Wh, EF] … v[EF] … D[Wh], in which D-movement over v does not violate relativized minimality (see Starke 2001 for further relevant discussion and examples). Multiple wh-movement to Foc of the kind advocated in Bošković (2002) may be preceded by formation of a ‘cluster’ at SpecvP, as suggested for clitic clusters above. In (19d), both the auxiliary and the verb are attracted by an EF feature, and hence a violation of relativized minimality ensues.

As we pointed out in fn. 2, V-movement to SpecForceP skips the clitic cluster, seemingly in violation of the Head Movement Constraint. Two questions arise in this connection: first, why is V-movement necessary? Second, why is V-movement possible? We have seen part of the reason that V-movement is necessary: to check Force’s EF feature. However, the requirement is simply that some category raise to SpecForceP, so why V? Long head movement is unproblematic in terms of the general theory of head movement assumed here, since the only locality constraints operative in the case of head movement are those operative for phrasal movement: relativized minimality and the PIC. There are no phase boundaries or heads of the same kind as Force (EF-bearing heads, in this case) intervening between Force and T (recall that we are assuming that finite verbs systematically raise to T in S/C). There is therefore no locality problem here, so we can see why V-movement is possible (Raposo 2000:280 proposes V-to-Spec movement for enclisis in

European Portuguese; in §4 below we will analyse the relevant facts but in fact conclude that the landing site of V in enclisis configurations in European Portuguese is not SpecForceP, but the Force head). Other heads are too distant, and prevented from moving given the availability of V. Where there is no V available, as in copular sentences where the copula is a clitic, e.g. in (3) above, some other head can and does raise. The only other contender for movement to SpecForce is the clitic cluster itself, i.e. the Fin head, but this is clearly ruled out where Fin incorporates with Force, as this would entail a part of a head moving to its own specifier. This kind of head-movement is impossible since head-movement, like other cases of movement, is attracted by a feature of the target (indirectly, in the case of Agree-driven head-movement; directly, when attracted by an EF), but movement of a head to its own specifier would require the head to be attracted by a feature of itself, which is surely not possible (to put it another way, attraction is an irreflexive relation). This kind of movement may be ruled out by extension where Fin does not incorporate into Force, if Fin and Force together make up a single phase head as suggested above. In that case, Fin-movement to SpecForce would again be a case of a part of a head moving to its own specifier. Auxiliaries therefore cannot front to SpecForceP if they are cliticized, but can if they appear in their non-clitic form, e.g. jesam in (19d) or its negative counterpart nisam (nisam cannot cliticize to Fin since it has a negative feature, and is therefore not a defective goal; the full-form positive auxiliaries like jesam may, like stressed do in English, bear an Assertion, or positive polarity, feature, again making them non-defective goals for Fin).
In this case V-raising is blocked by relativized minimality, since the auxiliary occupies T before raising.14 Before leaving S/C, we should say a brief word about the syntax of yes/no questions and the morpheme li. As we saw above, li always appears first in the clitic cluster. We have suggested that it is merged in Focus and may raise to Force (independently of the clitic cluster in Fin; see (16)). Li appears in matrix yes/no questions, preceded by one of three elements: a moved main verb, a full-form auxiliary or the unmarked complementizer da: (20)

a. Da li ga Ivan i Marija jesu čitali? (da – li)
   that Q it Ivan and Maria are read
   Have Ivan and Maria read it?
b. Jesu li ga Ivan i Marija čitali? (aux – li)
   are Q it Ivan and Maria read?
   Have Ivan and Maria read it?

c. Daješ li joj poklone? (V – li)
   give-you Q her presents?
   Do you give her presents?
   [from Browne 1975:108, cited in Rivero 1993b:567]

These facts are amenable to an extension of the analysis of ‘residual V2’ in Rizzi (1996). The order {da/aux/V} - li - clitics straightforwardly reflects the order of heads in the left periphery, as shown in (21):

(21) [ForceP OP [Force’ [Force {da/aux/V} ] [FocusP [Focus li ] [FinP [Fin clitics ] [TP ... T ... ]]]]]

As is common in yes/no questions cross-linguistically, there is no overt constituent in the highest specifier. I take it, following a long line of analysis of V1 yes/no questions in V2 languages, that there is a null operator in SpecForceP. This operator satisfies Force’s EF and prevents ‘long verb movement’ to a higher specifier of Force. Force either attracts V, and hence the finite verb or a non-clitic auxiliary, or is realized by da. We take these to be the possible realisations of the feature combination [V, Q] on Force. Once again we observe that the complementizer blocks movement. In this connection, we can also account for the impossibility of fronting of a participle (as opposed to a finite main verb) with a clitic auxiliary in Fin and li present, as in: (22)

*Čitao li sam knjigu?
read Q I-am book
Have I read the book?

Here the participle does not satisfy the V-feature of Force, which is unsurprising if we take participles to be not fully verbal; instead, one of the

elements in (21) is needed in first position. Similarly, participle-raising is impossible in wh-questions, but for a different reason: here a wh-phrase must appear in SpecForceP, blocking all other elements (since a head bearing both WH and EF blocks movement of elements with EF alone, but not vice versa; see above and Starke 2001), and Force triggers incorporation of Fin, i.e. of the clitic cluster, and of Focus (although see fn. 11 on li in wh-questions). Participle-raising is therefore only possible in non-interrogatives, i.e. (19c), the equivalent of (22) without li. Here the participle raises to SpecForceP, Focus is arguably absent, and the clitic cluster remains in Fin. To summarize our account of SC-type languages: (23)

a. Force has EF, which triggers movement of maximal or minimal and maximal category to its specifier; in wh-interrogatives, it attracts exactly one wh-phrase to its specifier and attracts both Fin and Focus; in yes/no interrogatives it hosts a null operator in its specifier and attracts T if da is not merged there.
b. Focus hosts li and attracts all wh-phrases but one to its specifier.
c. Fin attracts clitics in virtue of its D- and φ-features, except je, which is directly merged there. The ordering of clitic-movement reflects the cyclic derivation, so the operations are the reverse of the resulting surface order: nom (=aux) > dat > acc (> gen) > je, giving the surface order which is the same as the first-merged order, by iterated left-adjunction.15
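The relation between the order of operations and the surface order in (23c) can be made concrete with a small sketch. It is only an illustration under simplifying assumptions: the Fin head is modelled as a flat list, left-adjunction as prepending to that list, and the function names `left_adjoin` and `build_cluster` are hypothetical labels introduced here, not part of the analysis itself.

```python
# Illustrative sketch only: Fin is modelled as a flat list of clitic labels,
# and left-adjunction as prepending to that list (an assumption made here
# for exposition, not a claim about the full syntactic structure).

def left_adjoin(head, clitic):
    """Return a new head with `clitic` adjoined at its left edge."""
    return [clitic] + head


def build_cluster(merged_in_fin, movement_order):
    """Apply left-adjunction once per moved clitic, in derivational order."""
    head = list(merged_in_fin)
    for clitic in movement_order:
        head = left_adjoin(head, clitic)
    return head


# je is first-merged in Fin; the other clitics move in the reverse of the
# surface order, so the last element to move ends up leftmost.
cluster = build_cluster(["je"], ["gen", "acc", "dat", "nom"])
print(" > ".join(cluster))  # nom > dat > acc > gen > je
```

On this toy model, each later operation places its clitic to the left of everything already adjoined, which is why the derivational sequence is the mirror image of the surface string.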

We see that the clitic-second constraint is epiphenomenal, given these specifications of the C-heads (the same conclusion is reached for European Portuguese by Raposo 2000:280; see §4). It also has no direct connection with phonological constraints, although the fact that clitics will always be preceded by a distinct category naturally feeds phonological enclisis, i.e. restructuring of X + clitic(s) as a single prosodic word. But in principle the second-position effect and enclisis are distinct properties: one a syntactic epiphenomenon, the other a phonological property. This seems correct, since we know that enclisis and second-position effects can be separated crosslinguistically.

3. Germanic V2

3.1. German and other contemporary varieties

The well-known phenomenon of verb-second in the Germanic languages is illustrated by the following examples (from Tomaselli 1989):

(24) a. Ich las schon letztes Jahr diesen Roman
        I read already last year this novel
     b. Ich habe schon letztes Jahr diesen Roman gelesen
        I have already last year this novel read

(25) a. Diesen Roman las ich schon letztes Jahr
        this novel read I already last year
     b. Diesen Roman habe ich schon letztes Jahr gelesen
        this novel have I already last year read

(26) a. Schon letztes Jahr las ich diesen Roman
        already last year read I this novel
     b. Schon letztes Jahr habe ich diesen Roman gelesen
        already last year have I this novel read

(27) a. *Schon letztes Jahr ich las diesen Roman
        already last year I read this novel
     b. *Schon letztes Jahr ich habe diesen Roman gelesen
        already last year I have this novel read

(28) Du weißt wohl, ..
     you know well
     a. ... daß ich schon letztes Jahr diesen Roman las
        that I already last year this novel read
     b. ... daß ich schon letztes Jahr diesen Roman gelesen habe
        that I already last year this novel read have

These examples illustrate the basic facts about Germanic V2: in matrix declaratives, the inflected verb or auxiliary must occupy second position, being preceded by exactly one XP. (24-26) show that this XP can be of any category. (27) shows that two XPs cannot precede V, and (28) illustrates the basically root nature of the phenomenon, most clearly seen in German since the non-second position of V in this language is generally final. This pattern is essentially the same in all the Modern Germanic languages aside, of course, from Modern English. Icelandic and Yiddish have been reported not to show the root-embedded distinction (see Thráinsson 2007, Hrafnbjargarsson & Wiklund 2009 on Icelandic, and Santorini 1992 on Yiddish). I will have little to say about these languages here, except to note that they might exemplify the availability of an extra landing site between the traditional C and T; a natural candidate for this would be Fin (see Roberts 2004). Old English, and to a lesser degree some of the other older Germanic languages, allow the inflected verb to appear third where it is preceded, canonically, by one XP and a clitic or clitic cluster (van Kemenade 1987, Kiparsky 1995, Haeberli 1999, Roberts 1996, Walkden 2009); I return to this very interesting phenomenon below. Otherwise, the Germanic languages largely pattern together and I will treat them as a single entity for the purposes of the discussion, using German for illustration (unless there is some special reason to do otherwise); see Vikner (1995, Chapter 2), Biberauer (2003) for very thorough discussion of further similarities and differences among the Germanic languages regarding the manifestations of V2. One major difference between Germanic V2 and typical Slavic P2 effects of the kind analysed in the previous section is that the Germanic languages do not allow a head to satisfy the V2 constraint; the category preceding the inflected verb in the matrix clause must be an XP. There are, however, well-known examples with the linear order V[-finite] > V[+finite] > subject > ..., as in: (29)

Gelesen habe ich dieses Buch nicht read have I this book not I haven’t read this book

However, since den Besten & Webelhuth (1989) the standard analysis of this order treats it as featuring object-movement out of VP combined with remnant VP-fronting. This analysis predicts that wherever the order in (29) appears, the order with the object contained in the fronted VP is possible. This is clearly the case in German, as is well known. The basic difference between Slavic P2, as illustrated in S/C, and Germanic V2 is thus that the former languages require fronting of just one element (either a head or an XP), while the latter require fronting of both a head and an XP. We can account for this by making a very standard assumption about Germanic: namely that, in main clauses, the inflected verb moves all the way up to Force. The crucial difference with S/C is that V-movement into the C-system always takes place in main clauses in Germanic, while in S/C, as we saw, this only happens in main yes/no questions. On the other hand, Germanic and S/C are alike as regards the nature of the root-embedded asymmetries, and we can thus account for them in the same way: the EF associated with embedded Force is satisfied by selection by the predicate above ForceP, while in the matrix clause this is not possible. The fact that a verb or auxiliary always moves into the C-system in main clauses in Germanic has the consequence that the orders where a finite verb or auxiliary appears in SpecForceP are not found in Germanic. This is because the finite verb or auxiliary must move to Force, and from there cannot move to SpecForceP, as we have already pointed out. However, it is also generally recognized that orders like (29), but with just fronting of the participle rather than the participle heading a remnant VP of some kind, are not found in German (Scandinavian Stylistic Fronting may instantiate this possibility; see below). In other words, why is a structure like (30), directly comparable to the structure of an SC example like (19c), not allowed?

(30) [ForceP Prt [Force’ [Force [Fin [T Aux ] Fin ] Force ] [FinP (Fin) [TP … (T) … ]]]]

The answer may lie in the fact that German is head-final in TP. The analysis of this developed by Biberauer (2003), Richards & Biberauer (2005), Biberauer & Roberts (2005) involves a combination of head-movement and leftward-movement of certain clausal categories. In particular, a compound tense, which in a subordinate clause shows the order Object > V > Aux, is derived as follows: (31)

a. [VP gelesen das Buch ]
   Merger of verb and object
b. [PrtP [VP (gelesen) das Buch ] [Prt gelesen ] (VP) ]
   Merger of Prt, movement of V to Prt and of VP to SpecPrtP
c. [vP [v hat ] [PrtP [VP (gelesen) das Buch ] [Prt gelesen ] (VP) ]]
   Merger of hat (v)
d. [vP [PrtP [VP (gelesen) das Buch ] [Prt gelesen ] (VP) ] [v hat ] (PrtP) ]
   Movement of PrtP to SpecvP

Here, we can prevent the participle from moving by a generalisation of freezing: movement from a moved category is impossible (see Müller 1998). In that case, gelesen cannot move in this structure.16 On the other hand, the entire PrtP can move. Of course, scrambling of das Buch out of VP must take place before VP-movement (suggesting that there is an intermediate step at which the scrambled category is adjoined to VP, from where it can move to SpecvP); it is the derivation with scrambling of das Buch which gives rise to the string in (29), as originally observed by den Besten & Webelhuth. If the impossibility of participle-fronting alone in German is ultimately due to its surface OV nature, then we might expect to find it in VO V2 Germanic, i.e. in the Scandinavian languages (or possibly Middle English). And we arguably do, only here it is known as Stylistic Fronting (SF). SF is best known from Icelandic, as illustrated in (32): (32)

Keypt hafa þessa bók margir stúdentar bought have this book many students Many students have bought this book [from Rögnvaldsson & Thráinsson 1990, cited in Holmberg 2006:537]

Three features of SF are worth briefly commenting on here: (i) the ‘subject-gap’ condition, (ii) the accessibility hierarchy, and (iii) its absence in the Mainland Scandinavian (MSc) languages. In main clauses, as we see in (32), the fronted element precedes the finite verb/auxiliary, but note that there is also a subject gap in the TP following the auxiliary here. The subject-gap requirement is illustrated in (33): (33)

a. afleiðslan sem hún var fyrst til að lýsa the derivation that she was first to investigate b. *afleiðslan sem hún fyrst var ... c. *afleiðslan sem fyrst hún var ... d. *afleiðslan sem fyrst var hún ... [from Holmberg 2006:535]

Holmberg (2000) analyses the subject-gap condition as a consequence of the part of the EPP-requirement which requires SpecTP to be filled. Here I follow this view, although I differ from Holmberg in that I do not take head-movement to SpecTP to be banned; therefore I am not committed to a remnant-movement analysis of SF (although this may be correct in some cases; see Biberauer & Roberts 2006 on Late Middle English). The accessibility hierarchy relates to the strong locality conditions on SF. First, only one element can undergo SF; this constraint can be captured by the postulation of Force’s EF combined with the strict relativized-minimality interpretation of movement to SpecForceP induced by this feature which we discussed above (note that the elements that undergo SF are disjoint from those which undergo wh-movement). Second, which element moves is subject to what Maling (1980) described as an accessibility hierarchy, as follows: (34)

negation > predicative adjective > past participle/verbal particle

If an element higher in the hierarchy is present in the clause, it blocks SF of a lower element: (35)

a. Það fór að rigna, þegar búið var að borða it began to rain, when finished was to eat It began to rain when we had finished eating b. ... þegar ekki var búið að borða ... when not was finished to eat c. *... þegar búið var ekki að borða ... when finished was not to eat
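The blocking pattern in (34) and (35) can be sketched as a simple check. This is a deliberately reduced illustration: it replaces structural intervention with an integer rank taken from the hierarchy in (34), and the names `HIERARCHY`, `rank` and `can_front` are hypothetical, introduced here only for exposition.

```python
# Illustrative sketch: Maling's (1980) hierarchy from (34), highest first.
# Categories separated by a slash in (34) are placed on the same rank.
# Using integer ranks in place of structural intervention is a
# simplification assumed for this example.
HIERARCHY = [
    {"negation"},
    {"predicative adjective"},
    {"past participle", "verbal particle"},
]


def rank(category):
    """Return the position of a category in the hierarchy (0 = highest)."""
    for position, level in enumerate(HIERARCHY):
        if category in level:
            return position
    raise ValueError(f"{category!r} is not on the hierarchy")


def can_front(target, clause):
    """True if no category present in the clause outranks the target."""
    return all(rank(other) >= rank(target) for other in clause)


print(can_front("past participle", ["past participle"]))              # (35a)
print(can_front("negation", ["negation", "past participle"]))         # (35b)
print(can_front("past participle", ["negation", "past participle"]))  # (35c)
```

The three calls print True, True and False, mirroring the judgments in (35a-c): the participle can front only when no higher-ranked element such as negation is present.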

Holmberg (2000, 2006) points out that this hierarchy can be understood in terms of relativized minimality: the accessibility hierarchy reflects the asymmetric c-command relations among the first-merged positions of these elements, and therefore each element in the hierarchy is an intervener for all elements lower in the hierarchy. Relativized minimality then guarantees that the hierarchy is obeyed.17 This again is consistent with what we observed for fronting triggered by Force’s EF in SC in the previous section. The third interesting and relevant property of SF is that it is not found in Mainland Scandinavian languages (MSc). Here I follow the proposals in Holmberg (2000, 2006): as just mentioned, SF is primarily a way of satisfying the EPP requirement imposed by T, and as such involves movement to or (in the case of V2 clauses) through SpecTP. The MSc languages, however, as ‘weak agreement’ languages, impose a further requirement on SpecTP, namely that it must be filled by a nominal category (see Holmberg & Platzack 1995 on this important difference between MSc, on the one hand, and Insular Scandinavian (Icelandic and Faroese), on the other). The categories subject to SF are not nominal, and hence SF cannot satisfy MSc’s more stringent requirement on SpecTP. Hence there is no SF in these languages. Thus, in main-clause contexts of SF, an expletive is always found in MSc.18

Zwart (1997) points out that subject clitics in Standard Dutch precede the verb only in SV main clauses but follow V in VS clauses, while they follow C in embedded clauses: (36)

a. ‘k eet vandaag appels
   I(cl) eat today apples
   I eat apples today
b. Natuurlijk eet ‘k vandaag appels
   of-course eat I(cl) today apples
   Of course I eat apples today
c. ... dat ‘k vandaag appels eet
   ... that I today apples eat

(A similar point was made by Travis 1984 based on weak forms of certain German pronouns, but see Haider 2010:77-78 for critical discussion). Here too it is natural to place the ‘subject clitic’ in SpecFin. This raises the general question of the nature of clitics, or weak pronouns, in Germanic. In North Germanic, there is nothing to say, as there are no pronominal clitics. But in German, weak/clitic pronouns obligatorily appear immediately following the finite verb in main clauses and the complementizer in embedded clauses: (37)

a. *… daß leider man es hier übersieht … that unfortunately one this here overlooks b. … daß man es hier leider übersieht … that one this here unfortunately overlooks that unfortunately one overlooks this here c. Leider übersieht man es hier unfortunately overlooks one this here Unfortunately this is overlooked here [based on Haider 2010:137]

The pronouns appear to cluster in a fixed position. They also appear in a fixed order: nom > acc > dat:

(38) a. … daß er sie ihr ja ausgesetzt hat
        … that he them to-her PRT exposed has
        that he has exposed them to her
     b. ??... daß er ihr sie ja ausgesetzt hat
        … that he to-her them PRT exposed has
     [from Haider 2010:137-8]

These two properties are highly reminiscent of the S/C clitic cluster, except that the internal order is different in showing acc > dat rather than the S/C (in fact, general Slavic) order dat > acc. Looking first at the similarity in positioning of the clitic/pronoun cluster, we cannot place the German pronouns in Fin, since here they would be proclitic on the verb;19 in that case, Fin-to-Force movement would preserve the proclisis, giving the order XP > clitics > V, contrary to what we observe in German. Excorporation of [Fin V ] from the complex Fin formed by proclisis is impossible: in the system in Roberts (2010a), excorporation is possible, but only from the edge of the complex. So the clitics could move on, but not the verb (see in particular Roberts 2010a:206-8). We are thus led to conclude that the German weak pronouns do not cliticize to Fin, unlike their S/C counterparts. Recall that we have seen that the S/C clitic-cluster (i.e. Fin) moves to Force in wh-questions in S/C, so S/C is like German in having Fin-to-Force movement (although this operation is restricted to matrix wh-questions), but, while in S/C that movement affects the clitic cluster, in German it affects the verb (or the complementizer in subordinate clauses). But we nonetheless want to capture the fact that weak pronouns in German are always in the vicinity of the left periphery, since they appear immediately to the right of the finite verb or complementizer. I propose that these elements are not clitics and hence do not incorporate to Fin, but are instead attracted to SpecFin. Placing these elements in SpecFin gets the order right, given the general existence of Fin-to-Force movement. The sequence XP > V > pronouns > subject thus has the following structure in German:

(39) [ForceP XP [Force’ [Force [Fin v Fin ] Force ] [FinP pronouns [Fin’ (Fin) [TP Subject … ]]]]]
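To check that the structure in (39) yields the string XP > V > pronouns > subject, one can linearize it mechanically. The sketch below makes several assumptions for the sake of illustration: the tree is encoded as nested Python lists of the form [label, child, ...], only pronounced terminals are listed (the null heads Fin and Force are omitted and the verb is written simply as "V"), and a parenthesised string stands for the unpronounced copy left by movement. The function name `linearize` and the variable `tree_39` are hypothetical.

```python
# Illustrative sketch: each node is a list [label, child, child, ...];
# each terminal is a string. These encoding choices are assumptions
# made for this example only.

def linearize(node):
    """Return the pronounced terminals of a tree, left to right."""
    if isinstance(node, str):
        # Parenthesised strings stand for unpronounced copies of moved items.
        return [] if node.startswith("(") else [node]
    _label, *children = node
    words = []
    for child in children:
        words.extend(linearize(child))
    return words


# A simplified encoding of (39): the verb has raised through Fin to Force,
# leaving a silent copy of Fin below the weak pronouns in SpecFinP.
tree_39 = [
    "ForceP", "XP",
    ["Force'",
     ["Force", ["Fin", "V"]],
     ["FinP", "pronouns",
      ["Fin'", "(Fin)", ["TP", "subject"]]]],
]

print(" > ".join(linearize(tree_39)))  # XP > V > pronouns > subject
```

Because the verb is pronounced inside the complex Force head while the weak pronouns sit in SpecFinP, the pronouns surface immediately to the right of the verb, as observed in (37c).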

The elements in question are weak pronouns rather than clitics in Cardinaletti & Starke’s (1999:170) terms: weak elements are deficient phrasal categories, clitics are heads. Thus, weak pronouns are not minimal elements; they have some internal structure. Clitics, on the other hand, are simple feature bundles, structurally minimal and maximal. Since incorporation must involve a minimal category merging with another minimal category, weak pronouns cannot incorporate, while clitics can. Further, weak pronouns do not obligatorily form a phonological unit with the host (unlike clitics); instead, they move to designated specifier positions. The trigger for the movement of these elements is, however, unclear; what is clear is that it is not a general EF- or EPP-type feature (in this, weak-pronoun movement resembles clitic-movement). Weak pronouns resemble clitics in being attracted to phasal categories like Fin.20 As already mentioned, weak pronouns are distinct from clitics; other differences are that weak pronouns never allow doubling, weak pronouns do not show morphologically conditioned allomorphy in clusters (cf. the Italian 1sg clitic mi, which becomes me in a cluster), and weak pronouns do not show person-case effects21 (Cardinaletti & Starke 1999:169-70). By all these criteria, the German pronouns in (37, 38) can be seen as weak pronouns. Regarding the internal ordering of the weak-pronoun cluster in German, Haider points out that this cannot be derived from the first-merged order of arguments, which is what I proposed in the previous section for S/C. The reason for this is that German ditransitive verbs vary in the first-merged order of arguments: vorstellen (‘introduce’) takes dat > acc order, while aussetzen (‘expose’) takes acc > dat order. One possibility is that acc > dat is formed at the vP-level, and then this cluster raises to SpecFin as a unit, followed by raising of the subject pronoun. This will guarantee the attested order (in multiple specifiers of Fin), but does not explain the obligatory acc > dat order.

A final point to complete the comparison with S/C: there is little to say about interrogatives in German. German does not allow multiple wh-fronting, and so there is no generalized movement of wh-phrases to SpecFoc of the type proposed by Bošković (2002) for S/C and assumed here. German wh-movement is rather like English (except for the possibility of partial wh-movement in some varieties, a matter I will leave aside here), in that a single wh-phrase must raise into the C-system in direct wh-interrogatives. In German, this is movement to SpecForceP. V-movement proceeds in the usual way (from v to Fin to Force); any weak pronouns in SpecFinP will thus surface to the right of the verb. In yes-no questions, a null operator occupies SpecForceP; everything else is as in wh-questions. We can now compare Germanic and S/C more systematically. (40)

Germanic:
a. Force has EF which triggers movement of exactly one category, which may in principle be maximal or minimal and maximal, to its specifier; also has a V-feature, systematically attracting Fin in all types of main clause; in wh-interrogatives, it attracts exactly one wh-phrase to its specifier; in yes/no interrogatives it hosts a null operator in its specifier.
b. Focus may attract exactly one wh-phrase to its specifier and V.22
c. Fin attracts V and weak pronouns to its specifier (West Germanic only). The ordering of pronoun-movement partially reflects the cyclic derivation, so the operations are the reverse of the resulting surface order: nom > {acc, dat}; also attracts the subject to its specifier (at least in certain varieties); complementizers are first-merged in Fin and raise to Force (except in Yiddish and Icelandic).

These properties should be compared with those of S/C enumerated in (23), repeated here: (23)

a. Force has EF, which triggers movement of maximal or minimal and maximal category to its specifier; in wh-interrogatives, it attracts exactly one wh-phrase to its specifier and attracts both Fin and Focus; in yes/no interrogatives it hosts a null operator in its specifier and attracts T if da is not merged there.
b. Focus hosts li and attracts all wh-phrases but one to its specifier.
c. Fin attracts clitics in virtue of its D- and φ-features, except je, which is directly merged there. The ordering of clitic-movement reflects the cyclic derivation, so the operations are the reverse of the resulting surface order: nom (=aux) > dat > acc (> gen) > je.

3.2. Old English and other older varieties

Here we briefly outline an account of V2 in Old English (OE) and, very sketchily, certain other older Germanic varieties. As mentioned above, OE, although it is often thought of as a V2 language, has a pattern of apparent exceptions to V2.23 These arise in positive main-clause declaratives where at least one clitic or weak pronoun is present. In such cases, the order is XP > pronoun > V. The following examples illustrate: (41)

a. hiora untrymnesse he sceal ðrowian on his heortan their weakness he shall atone in his heart He shall atone for their weakness in his heart [CP 60.17, cited in Pintzuk 1993:6] b. þin agen geleafa þe hæfþ gehæledne thy own faith thee has healed Thine own faith has healed thee [BlHom 15.24-25, cited in Pintzuk 1993:9]

On the other hand, wh-questions, negative clauses introduced by the preverbal negator ne and clauses introduced by various adverbs, mainly þa (‘then’), but more sporadically þonne (‘then’) and a few others, are rigidly V2: in these types of clauses the verb precedes the clitic. The following examples (from van Kemenade 1987:138-40) thus contrast with those in (41) as far as clitic-placement is concerned: (42)

a. Hwæt sægest þu, yrþlincg? [AColl., 22]
   what sayest thou earthling?
   What do you say, ploughman?
b. Þa weard he to deofle awend [AHTh, I, 12]
   then was he to devil changed
   Then he changed into a devil
c. Ne worhte he þeah nane wundre openlice [AHTh, I, 26]
   nor wrought he yet any miracles openly
   Nor did he work any miracles openly

A further important fact is observed by Pintzuk (1991:103ff.). It seems that two XPs can precede the pronoun - verb sequence, while just one XP precedes ne/þa/wh - verb - pronoun: (43)

a. eft æfter lytlum fyrste on þisre ylcan . . . afterwards after little time in this same . . . Romana byri he wearð forbærned Roman borough he was burned Afterwards a little while later he was burned in the same Roman town [GD(H) 30.20-23; Pintzuk (1991:104)] b. þa under þæm þa bestal he hine on niht onweg then meanwhile then stole he him in night away Then, meanwhile, he stole away in the night [ChronA 92.9-10(901); Pintzuk 1991:105]

The evidence in (43) suggests that wh/neg/þa are in SpecFocP, with V-movement to Focus, with a further EF in Force (which can only allow first-merged elements such as ‘scene-setting’ adverbials of the type seen in (43), as the lower EF blocks movement of all lower material by relativized minimality). If V has moved to Focus in examples like (42), the simplest thing to say about (41) is that V has moved to a lower position. We can then maintain the general view that clitics and weak pronouns are attracted to Fin. The difference between OE and Modern German lies in the fact that Force lacks a V-feature in OE and only optionally bears EF, while Focus has a V-feature and EF when the appropriate elements are present (wh, negation, þa), and Fin always has a V-feature. So we have the two configurations in (44) in the OE left periphery (Roberts 1996:164): (44)

a. [ForceP XP [FocP YP [FinP Pronouns [Fin V ]]]] b. [ForceP XP [FocP þa [Foc V ] [FinP Pronouns [Fin V ]]]]

OE weak pronouns can appear in one of two positions in embedded clauses:

(45) a. ... þæt him his fiend were æfterfylgende
        ... that him his enemies were following
        that his enemies were following him
        [Oros., 48, 12; van Kemenade 1987:113]
     b. ... þæt þa Deniscan him ne mehton þæs ripes forwiernan
        ... that the Danes them not could the harvest refuse
        that the Danes could not refuse them the harvest
        [ChronA 89.10(896); Pintzuk 1993:188]

Examples like (45a) are straightforward, given the above account: the weak pronoun is in SpecFinP and the subject in SpecTP. The complementizer þæt raises from Fin to Force, giving rise to root-embedded asymmetries as described for German in the previous section. (45b) is trickier since, if we want to hold the position of the weak pronouns constant as we have done up to now, we need to give an account of the position of the subject. It is possible that the subject could optionally raise to SpecFinP in OE (note that, by the Strict Cycle, it would occupy a higher position than the object weak pronoun since it would raise after it). On the other hand, it may be on the edge of vP, but this would imply a rather low position for clausal ne. I leave this question open. Similar orders are found in Old High German (OHG), as originally pointed out by Tomaselli (1995), although they are restricted to the Isidor text (see Fuß 2008 for a critique of Tomaselli’s interpretation of the data). However, Fuß does show that, in OHG, topics may occur to the left of fronted wh-phrases: (46)

[ Uuexsal dhes nemin] huuazs bauhnida? changing-nom of-the name what meant Lt. Mutatio nominis quid significabat? The changing of the name, what did it mean? [Isidor, 532; Axel 2007:209]

Also, there is some multiple XP-fronting (most frequent in the Isidor, cf. Robinson 1997, Axel 2007): (47)

[ Dhea uuehhun][ auur ] [ in heilegim quhidim] arfullant sibun iaar the weeks however in sacred language fulfil seven years Lt. Ebdomada namque in sacris eloquiis septem annis terminatur The weeks, however, take seven years in sacred language [Isidor, 457; Robinson 1997:26]


Ian G. Roberts

Fuß (2008) proposes the following structure for the OHG clause: (48)

[CP XPtop [C’ XPwh [C’ [C C Vfin ] TP ]]]

Fuß also observes that OHG shows consistent inversion of the verb and pronouns. Given this, we can restate the structure in (48) as in (49), in line with our assumptions here and Rizzi’s (1997) structure for the left periphery:24 (49)

[ForceP (XPtop) [FocP XPwh [Foc V ] [FinP Pronouns Fin [TP …

So in OHG we have consistent V-movement to Foc, an EF on Foc (presumably just when the appropriate element, e.g. a wh-phrase, is available in SpecFocP), and a further EF, but no V-feature, on Force. Once again, except where a scene-setting topic is in SpecForceP, Force and Focus cannot both have EF due to minimality (Focus is more richly specified, having extra Wh, Neg, etc. features, and EF, so can block anything trying to move to Force). Ferraresi (1991, 2005) and Longobardi (1994) show that Gothic was not a V2 language (see Walkden 2009). In this language, V-movement into the left periphery is triggered by the Q-marker -u, by negation and by wh-questions:25 (50)

a. wileid -u nu ei fraleletau izwis thana . . . want -you-Q now that I-release you the . . . thiudan Iudiae? (J, 18, 39) king of-the-Jews Do you want me to release the King of the Jews for you now? b. unte nist unmahteig guda ainhun waurde (C 3, 19) for not-is impossible God any thing For nothing is impossible for God [from Ferraresi 1991:88f.]


c. ƕa skuli þata barn wairþan? what shall that child become? What will that child become? [from Luke 1:66, Eythórsson 1995:25, cited in Harbert 2007:406]

Eythórsson also points out that the verb appears to front in Gothic when a pronoun would otherwise be in clause-initial position, on the basis of minimal pairs such as the following:

(51)

a. ushaihah sik hanged self = Greek apḗgxato [from Matt. 27:5, Eythórsson 1995:29, cited in Harbert 2007:410] b. jabai mik frijoþ if me you-love = Greek eàn agapâté me [from John 14:15, Eythórsson 1995:31, cited in Harbert 2007:410]

Eythórsson (1995:20f) shows that Gothic had OV order otherwise, cf: (52)

ik in watin izwis dauthja I in water you baptise

[from M 3, 11; Ferraresi 1991:76]

See also Lehmann (1993:34). Gothic appears to show V-movement to Foc only, and as such we see it only where there is a wh-phrase, the Q-morpheme -u (which, like SC li, occupies Foc) or negation (a min/max morpheme in SpecFocP). Finally, in (51a) it is possible that V moves to Foc where there is a pronoun in SpecFinP. Walkden (2009:57) surveys the distribution of verb-movement into the left-periphery in the oldest Germanic languages and arrives at the following results:

(53)
                       Run.   ON     Goth   OE     OHG
yes/no Q               ?      yes    yes    yes    yes
wh-Q                   ?      yes    yes    yes    yes
Neg-init               ?      yes    yes    yes    yes
Imp.                   yes    yes    yes    yes    yes
Narrative inversion    ?      yes    yes    yes    yes
XP-fronting            yes    yes    yes*   yes?   yes
Matrix declaratives    yes    yes    no     yes    yes
Sub. clauses           ?      yes    no     no     no

*only where a definite subject has been fronted
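The generalisations read off (53) can be checked mechanically. The following is only an illustrative sketch: the dictionary transcribes the cells of (53), while the names `TABLE_53`, `languages_with` and `fronts_in_all_contexts` are hypothetical helpers introduced here and the row labels are normalised spellings. A "?" cell is stored as `None`, and qualified cells ("yes*", "yes?") are deliberately not counted as a plain "yes".

```python
# Table (53) transcribed as data. None stands for "?" (no clear evidence);
# qualified cells such as "yes*" and "yes?" are kept verbatim, not collapsed.
LANGUAGES = ["Run.", "ON", "Goth", "OE", "OHG"]

TABLE_53 = {
    "yes/no Q":            [None,  "yes", "yes",  "yes",  "yes"],
    "wh-Q":                [None,  "yes", "yes",  "yes",  "yes"],
    "Neg-init":            [None,  "yes", "yes",  "yes",  "yes"],
    "Imp.":                ["yes", "yes", "yes",  "yes",  "yes"],
    "Narrative inversion": [None,  "yes", "yes",  "yes",  "yes"],
    "XP-fronting":         ["yes", "yes", "yes*", "yes?", "yes"],
    "Matrix declaratives": ["yes", "yes", "no",   "yes",  "yes"],
    "Sub. clauses":        [None,  "yes", "no",   "no",   "no"],
}


def languages_with(context, value="yes"):
    """Languages whose cell for `context` is exactly `value`."""
    cells = TABLE_53[context]
    return [lang for lang, cell in zip(LANGUAGES, cells) if cell == value]


def fronts_in_all_contexts(language):
    """True only if every cell for `language` is an unqualified "yes"."""
    col = LANGUAGES.index(language)
    return all(row[col] == "yes" for row in TABLE_53.values())


print(languages_with("Sub. clauses"))                          # ['ON']
print([l for l in LANGUAGES if fronts_in_all_contexts(l)])     # ['ON']
```

On this encoding, the only variety with unqualified V-fronting in every context is Old Norse, and Gothic is the only one lacking it in matrix declaratives.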


In addition to the languages we have just looked at, here there is evidence from two ancient varieties of North Germanic: Runic and Old Norse. The evidence from Runic is extremely scanty, and very little can be concluded from it except that there was some V-fronting (it was fairly clearly OV, see Faarlund 1994:66). The notable thing about Old Norse is that it is the only variety to show V-fronting in all contexts, including subordinate clauses. It appears to have been a ‘symmetrical’ V2 language, then, just like Modern Icelandic. Although Modern Icelandic has no pronominal clitics or weak pronouns, there is some evidence that weak pronouns followed the verb in V2 clauses in Old Norse (Faarlund 1994:65). It is plausible to think, then, that the Proto-Germanic starting point showed the following order: (54)

Topic > (V) > focus > (V) > clitic/topic (V) [TP …. (V) ]

There are three XP positions in the left-periphery, and the verb may move to a position following each one. Such movement may have been optional at the earliest stages. This picture fits well with what has been observed elsewhere in Indo-European. For example, Hale (1995) gives the structure in (55) for the Vedic Sanskrit clause: (55)

[TopP Top [CP C [FocP Foc IP ]]]

Hale shows that V-movement to C was an option here. This structure closely approximates the Force-Foc-Fin structure we have been assuming. Very similar structures have been proposed by Garrett (1990) for Anatolian, Newton (2006) and Carnie, Pyatt & Harley (2004) for Old Irish and Celtic, and Salvi (2004), Ledgeway (forthcoming) and Devine & Stephens (2006) for Latin.26 Fortson (2004:144-7) gives evidence from a range of archaic Indo-European languages (Hittite, Old Avestan, Greek, Latin, Armenian, Vedic and Gothic) for ‘verb-topicalisation’, placement of the verb in initial position along the lines proposed for S/C in Section 2, wh-movement, a pre-wh topic position and second-position clitics. See also Clackson (2007:165-171) for very similar observations, and a very interesting discussion of the differences between Anatolian and Sanskrit/Greek/Latin enclisis.

3.3. Conclusion

This concludes our discussion of Germanic V2. We have seen that the basic difference between S/C-type Slavic languages and typical V2 Germanic languages lies in the fact that the latter have generalized V-movement into the left-periphery, while the former restrict this to certain contexts (basically yes/no questions), and that this movement involves Fin in Germanic, but not in S/C. On the other hand, Germanic shows weak pronouns in SpecFinP while S/C has clitic-movement to Fin. Both have a generalized EF attracting some XP to a higher position. Earlier stages of Germanic reveal different landing sites for V, but consistent landing sites for weak pronouns, and a constant EF associated with some head in the extended CP. What we observe is that the ‘distributed C phase’, with probing features in Fin and EF in a higher head, underlies the various second-position phenomena. This supports the idea put forward earlier of a ‘distributed phase head’ and the formal characterisation of that notion that we suggested. The variables we have seen up to now are listed in (56):

(56)

a. Clitics (in Fin) or weak pronouns (in SpecFin). b. Position of V (Force, Foc, Fin, T, v). c. Position of EF (Force, Foc, Fin).
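To make the size of the space defined by (56) concrete, the three variables can be enumerated. This is a deliberately simplified sketch: it assumes each variable takes exactly one value per clause type, whereas the discussion above allows, for instance, EF on more than one head in the same language. The names `CLITIC_OPTIONS`, `V_POSITIONS`, `EF_POSITIONS` and `parameter_space` are hypothetical, and no claim is made that every combination is attested.

```python
from itertools import product

# The three variables of (56), with the values listed there.
CLITIC_OPTIONS = ("clitics in Fin", "weak pronouns in SpecFin")  # (56a)
V_POSITIONS = ("Force", "Foc", "Fin", "T", "v")                  # (56b)
EF_POSITIONS = ("Force", "Foc", "Fin")                           # (56c)


def parameter_space():
    """All logically possible settings, one value per variable (a simplification)."""
    return list(product(CLITIC_OPTIONS, V_POSITIONS, EF_POSITIONS))


space = parameter_space()
print(len(space))  # 30
print(space[0])    # ('clitics in Fin', 'Force', 'Force')
```

The point of the exercise is only that the variation is bounded: a single constant structure plus a small, finite set of choices.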

In the next section, we will show how exactly the same constant structure, coupled with the variables in (56), can account for the range of facts regarding enclisis in contemporary European Portuguese.

4. Enclisis in European Portuguese

Modern European Portuguese, alone among the present-day Romance standards, preserves a variant of what is traditionally known as the Tobler-Mussafia Law in Romance philology: the ban on clitics in first position.27 The basic paradigm (cf. Madeira 1993, 1995, Rouveret 1992, Martins 1994, Uriagereka 1995, Duarte & Matos 2000, Raposo 2000, 2001, Costa 2000, Shlonsky 2004, Raposo & Uriagereka 2005) involves enclisis with a definite subject in positive declarative matrix clauses and proclisis with negation, an initial negatively-quantified XP, certain initial quantified XPs, and initial focussed XPs. The data are illustrated below:

(57)

Positive matrix declarative - enclisis:
a. O Pedro encontrou-a
   the Peter met-her
   Peter met her
b. *O Pedro a encontrou
   the Peter her met

(58)

Embedded declarative - proclisis:
a. *Dizem que o Pedro encontrou-a
   they-say that the Peter met-her
b. Dizem que o Pedro a encontrou
   they-say that the Peter her met
   They say that Peter met her

(59)

Matrix negative clauses - proclisis:28
a. *O Pedro não encontrou-a
   the Peter not met-her
b. O Pedro não a encontrou
   the Peter not her met
   Peter didn’t meet her

(60)

Initial negative quantifier - proclisis:
a. *Ninguem ajudou-me
   nobody helped-me
b. Ninguem me ajudou
   nobody me helped
   Nobody helped me
c. *Nada deram-me
   nothing they-gave.me
d. Nada me deram
   nothing me they-gave
   They didn’t give me anything

(61)

Wh-questions - proclisis:
a. *Onde encontrou-a o Pedro?
   where met-her the Peter
b. Onde a encontrou o Pedro?
   where her met the Peter?
   Where did Peter meet her?


(62)

Initial focussed subject - proclisis: ATÉ O PEDRO me deu uma prenda EVEN THE PETER me gave a present EVEN PETER gave me a present

(63)

Quantified subjects: a. *Todos os rapazes ajudaram-me all the boys helped- me b. Todos os rapazes me ajudaram all the boys me helped All the boys helped me c. Alguns rapazes ajudaram-me some boys helped- me d. *Alguns rapazes me ajudaram some boys me helped Some boys helped me


Leaving aside for the moment the data in (63), this pattern is entirely explicable in terms of our system. In essence, the EF feature is never associated with Force, but only with Foc or Fin. As in OE, OHG and elsewhere, Foc has an EF feature which attracts wh-phrases, negation, and negative quantifiers. In addition, Foc can attract focussed subjects and non-referentially quantified subjects; all the main-clause proclisis contexts above except clausal negation thus involve XP-movement to SpecFocP. The clitics are in Fin and V is in T, as in S/C (see §2; note that EP is a fully null-subject language, and we have been assuming throughout that V moves to T in such languages). Where Foc has EF, it attracts both T and Fin (in that order; see fn. 9), giving proclisis at the Foc level.29 In the case of clausal negation in (70), Foc has an EF feature, but the negative element não incorporates into Foc, possibly from a lower head position. Hence this element cannot satisfy EF, and so the subject does. A non-quantified subject is unable to move to SpecFocP unless either Foc is negative or the subject is focussed; this is a consequence of the inherent, discourse-sensitive properties of Foc. Where Foc does not bear EF, either Fin or Force does. When Fin has EF, it attracts the subject. This gives the pattern of embedded declaratives as in (58). Finally, in matrix declarative clauses with enclisis and a definite subject, Force has an EF, which attracts the subject; Force also attracts the verb (again, directly from T). This gives the enclisis configuration. Force can also attract non-subjects, which receive a topic interpretation, and sentential adverbs. In both cases the verb is also attracted and so enclisis results:

(64)

a. Esses livros, dei-os/ *os dei à Maria
   these books I-gave.them/ them.I-gave to-the Maria
   These books, I gave them to Maria
b. Geralmente vejo-a/ *a vejo de manhã
   generally I-see.her/ her.I-see in the morning
   Generally I see her in the morning
[from Barbosa 2000:35-36]

The account of the root nature of this phenomenon is as in SC and German: Force’s EF is satisfied by selection by the higher predicate in subordinate clauses. Force’s EF is optional, since no overt subject or topic has to be present, and, as we shall see below, the subject does not have to raise. However, in main clauses, where Foc is inactive, Force always attracts V. Below I give the relevant parts of the structure for the different configurations: (65)

Positive matrix declarative - enclisis:
[ForceP XP [Force[EF,V] V ] [FocP [FinP [Fin[φ] Cl ] [TP SU [T (V) ] . . .

(66)

Embedded declarative - proclisis:
[ForceP [Force[EF] C ] [FocP [FinP SU [Fin[EF, φ] Cl ] [TP (SU) [T V ] . . .

(67)

Matrix negative clauses - proclisis:
[ForceP Force [FocP SU [Foc não [Foc [Fin[EF, φ] Cl ] [Foc [T V ] Foc ]]] [FinP (Fin) [TP (SU) (T) . . .

(68)

a. Initial negative subject quantifier - proclisis:
[ForceP Force [FocP Neg-QP [Foc [Fin[EF, φ] Cl ] [Foc [T V ] Foc ]] [FinP (Fin) [TP (Neg-QP) (T) . . .
b. Initial negative non-subject quantifier - proclisis:
[ForceP Force [FocP Neg-QP [Foc [Fin[EF, φ] Cl ] [Foc [T V ] Foc ]] [FinP (Fin) [TP SU (T)

(69)

a. Subject Wh-questions - proclisis:
[ForceP Force [FocP WhP [Foc [Fin[EF, φ] Cl ] [Foc [T V ] Foc ]] [FinP (Fin) [TP (WhP) [T V ] . . .
b. Non-Subject Wh-questions - proclisis:
[ForceP Force [FocP WhP [Foc [Fin[EF, φ] Cl ] [Foc [T V ] Foc ]] [FinP (Fin) [TP SU [T V ] . . .

(70)

Initial focussed subject - proclisis:
[ForceP Force [FocP SU [Foc [Fin[EF, φ] Cl ] [Foc [T V ] Foc ]] [FinP (Fin) [TP (SU) [T V ] . . .
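The descriptive pattern in (57)-(63) can be summarised as a small decision procedure. This is a sketch of the surface generalisation only, not of the derivations above: the class `Clause`, the function `clitic_placement` and the string labels for the clause-initial element are hypothetical names introduced for illustration, and the two quantifier labels simply stand for the todos-type subject of (63a,b) and the alguns-type subject of (63c,d).

```python
from dataclasses import dataclass

# Hypothetical labels for the clause-initial element that goes with proclisis.
PROCLISIS_INITIALS = {
    "negative-quantifier",     # (60)
    "wh-phrase",               # (61)
    "focussed-subject",        # (62)
    "all-quantified-subject",  # (63a,b), todos os rapazes
}


@dataclass
class Clause:
    initial: str = "definite-subject"  # label of the clause-initial element
    embedded: bool = False             # embedded declarative, as in (58)
    negated: bool = False              # clausal negation with não, as in (59)


def clitic_placement(clause: Clause) -> str:
    """Return 'proclisis' or 'enclisis' following the paradigm in (57)-(63)."""
    if clause.embedded or clause.negated:
        return "proclisis"
    if clause.initial in PROCLISIS_INITIALS:
        return "proclisis"
    return "enclisis"


print(clitic_placement(Clause()))                                   # enclisis
print(clitic_placement(Clause(embedded=True)))                      # proclisis
print(clitic_placement(Clause(initial="some-quantified-subject")))  # enclisis
```

Enclisis is the elsewhere case here, which mirrors the way the analysis treats it: it arises only when neither Foc nor embedding intervenes.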

Regarding the quantified subjects in (63), we can note that those quantifiers which can topicalize are able to trigger enclisis. These quantifiers are referential, both in the sense that they can topicalize and in the sense that they can be coreferential with a pronoun they do not c-command:

(71)

a. Some kinds of beans, I don’t like b. Some boysi went to the party and theyi had fun

On the other hand, quantifiers like every are non-referential (or inherent in the sense of Haïk 1984: they do not on their own pick out entities in the domain of discourse), in that they fail both tests: (72)

a. ?*Every kind of bean, I don’t like b. *Every boyi went to the party and he/theyi had fun

In European Portuguese, the former type of quantified expression raises to SpecForceP just like any subject, giving the structure in (65). The latter type is unable to do this, because the interpretation the interface imposes on elements in this position is incompatible with the intrinsic semantic nature of the quantifier; hence it remains in SpecFinP, giving the structure in (66) (without the complementizer); see Barbosa (2000:33) for a similar idea. Non-specific indefinites and bare alguém (‘someone’) pattern with non-referential quantifiers in triggering proclisis. If a referentially-quantified subject is focussed it will be attracted to SpecFocP with ensuing proclisis (the same is true if it is marked with a focus particle like só (‘only’); see Barbosa 2000:36):

(73)

a. Alguns rapazes ajudaram-me
   some boys helped-me
   Some boys helped me
b. ALGUNS RAPAZES me ajudaram
   SOME BOYS me helped
   SOME BOYS helped me

It appears that the subject is able to stay in SpecTP if Force lacks EF and Foc is inactive in a main clause. In that case, the clitics move to Fin and the finite V moves to Fin:

(74)

Tinha-as a Teresa acabado de comprar had- them the Teresa finished to buy Teresa had just bought them

The pre-participial position of the subject here indicates that this is not ‘free inversion’ of the Italian type (a somewhat restricted option in Portuguese anyway). Enclisis is also required in main clauses with null subjects: (75)

a. Encontrou-a s/he-met- her S/he met her b. *A encontrou her s/he-met

Presumably the null subject is in SpecTP here (but see Barbosa 1995, 2000, 2009 for a different account of null subjects in European Portuguese). For completeness, let us briefly look at yes/no questions. These show the same order as the corresponding declaratives, with interrogative force marked by intonation (as in Spanish and Italian): (76)

O Pedro encontrou-a no cinema?
the Peter met-her at-the cinema
Did Peter meet her at the cinema?

Again, we observe a different pattern in wh-questions and yes/no-questions. It is not clear whether a null operator is present with yes/no questions marked purely by intonation.


5. Conclusions

Here I have proposed a single system for the left periphery, assuming just three categories and three variables: the position of the clitics/weak pronouns, and which heads bear EF and V-features. Following the approach in Roberts (2010a) and making otherwise standard assumptions about movement, Agree, attraction and locality (with one exception; see below), I have been able to derive accounts of ‘second-position’ phenomena in S/C, Germanic past and present, as well as European Portuguese. In every case, the second-position itself is epiphenomenal, in the sense that no category has a special marking for this, no reference is made to properties extrinsic to syntax such as obligatory enclisis, and there is no counting to two. The main theoretical innovation in this paper, which I have used throughout, is the notion of ‘distributed phase head.’ This can be thought of as follows: first, we assign a specific set of intrinsic formal features to each of the core functional heads, as in (10), repeated here:

(10)

C[+Clause type, +T, +V], T[+T, +V], v[+V], V[ ]

Each core head has its own ‘weight’, measured in formal features: C=3, T=2, v=1 and V=0. We can then define a ‘field’ in the clausal cartography as a set of heads with the same formal weight. Extending and adapting the ideas in Richards (2007), phase heads always and only have an odd-numbered formal weight. But we can observe a C-field as a set of structurally adjacent heads of weight 3 (recall that α is structurally adjacent to β iff either α minimally asymmetrically c-commands β or is minimally asymmetrically c-commanded by β). We distinguish the heads in a cartographic field with essentially arbitrary semantically-based labels (Force, Foc, etc). The features which are unique to phase heads (EF and active probing features) can be ‘distributed’ among various heads. Hence, in the C-field, Force may have EF and Fin unvalued φ-features. I suggested in §3 that this is the unmarked state of affairs, since it is natural to think of the highest edge in the phasal field as the ‘true’ edge, and it is most economical if the lowest head in the phasal field is the probe. We have seen empirical reasons to depart from these maximally simple assumptions in certain cases, however. But the 2P case reflects this situation, which I take to be a significant result, given the widespread cross-linguistic incidence of this phenomenon (see the references given in the Introduction).
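The arithmetic behind ‘formal weight’ can be spelled out in a few lines. The sketch below takes the feature sets from (10) and the odd-weight criterion from the text; the names `CORE_HEADS`, `C_FIELD`, `weight` and `is_phase_weight` are hypothetical, and treating Force, Foc and Fin as carrying exactly the feature set of C is an assumption made here to model ‘heads of weight 3’.

```python
# Intrinsic formal features of the core heads, as in (10).
CORE_HEADS = {
    "C": ("Clause type", "T", "V"),
    "T": ("T", "V"),
    "v": ("V",),
    "V": (),
}

# Assumption for illustration: the cartographic C-field labels all carry
# the feature set of C, so they share its weight.
C_FIELD = {label: CORE_HEADS["C"] for label in ("Force", "Foc", "Fin")}


def weight(features):
    """Formal weight = number of intrinsic formal features."""
    return len(features)


def is_phase_weight(features):
    """Phase heads always and only have an odd-numbered formal weight."""
    return weight(features) % 2 == 1


weights = {head: weight(f) for head, f in CORE_HEADS.items()}
phase_heads = [h for h, f in CORE_HEADS.items() if is_phase_weight(f)]

print(weights)      # {'C': 3, 'T': 2, 'v': 1, 'V': 0}
print(phase_heads)  # ['C', 'v']
print({label: weight(f) for label, f in C_FIELD.items()})
# {'Force': 3, 'Foc': 3, 'Fin': 3}
```

Since every member of the C-field has weight 3, each is in principle eligible to host the phase-head properties, which is what allows EF and the probing features to sit on different heads.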


In general, then, we can see that a rather slight extension of the notion of phase head, in the direction of ‘cartographic’ thinking, gives us a simple and natural account of well-known and typologically common second-position effects. If phases behaved differently, we would not see second-position effects in the way that we do.30

Appendix

In this Appendix, I address the question of a phonological motivation for 2P cliticisation. An influential theory is the Prosodic Inversion theory of Halpern (1992). There are three ideas in this account: (i) P2 clitics are prosodically subcategorized to appear right-adjacent to a prosodic word; (ii) clitics adjoin to IP; (iii) where no element with a phonological matrix appears to the left of the IP-adjoined clitic then Prosodic Inversion (PI) must apply:

(A1)

Prosodic Inversion: Clitic*X*Y → X*clitic*Y
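Read as a string operation, the rule in (A1) amounts to swapping a domain-initial clitic with the unit that follows it. The sketch below is an illustration of that reading only, not of Halpern's formalisation: `prosodic_inversion` and the predicate `is_clitic` are hypothetical names, and the prosodic domain is modelled simply as a list of units.

```python
def prosodic_inversion(domain, is_clitic):
    """Apply Clitic*X*Y -> X*clitic*Y within one prosodic domain.

    `domain` is a list of prosodic units; `is_clitic` is a predicate on units.
    The rule fires only when a clitic is domain-initial and a following unit X
    is available as host; otherwise a copy of the input is returned unchanged.
    """
    if len(domain) >= 2 and is_clitic(domain[0]):
        return [domain[1], domain[0], *domain[2:]]
    return list(domain)


def is_cl(unit):
    # Abstract placeholder: the unit written "CL" is the clitic.
    return unit == "CL"


print(prosodic_inversion(["CL", "X", "Y"], is_cl))  # ['X', 'CL', 'Y']
print(prosodic_inversion(["X", "CL", "Y"], is_cl))  # ['X', 'CL', 'Y']
```

Note how local the operation is: it inspects only the first two units of the domain, a property that becomes relevant in the arguments that follow.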

The clitic is now in second position, following X, a word or a phrasal constituent. Material may precede X on the surface as long as it is part of a distinct prosodic domain (i.e. PI only applies within a given prosodic domain, possibly the Phonological Phrase). I have three arguments against the PI approach. First, it is not necessary, as the orders can be derived by using purely standard syntactic mechanisms, as we have seen. There is therefore no need for the extraneous and unmotivated complexities that (i-iii) introduce. It is unclear that we need to stipulate that a given weak or clitic pronoun is enclitic: this may simply follow from the positions of the clitics in relation to other elements and the algorithm for assigning phonological structure. Moreover, Halpern offers no account of why clitics should adjoin to IP, and it is very hard to see what that would be in terms of current versions of minimalist theory, especially since IP is not a phase. Here, on the other hand, clitic-placement derives from Agree, as briefly described in §2 and discussed in detail in Roberts (2010a). Second, PI is not sufficient. Thus, Labelle & Hirschbuhler (2005:62) state that ‘in Old French, prosodic requirements play no role in the Tobler-Mussafia effects displayed by object clitics’. They show that object clitics can be first in their prosodic domain:

(A2)

Jo, qui voldreie parler a tei, le recevrai
I who would-like to-talk to you, him I-will-receive
I, who would like to talk to you, will receive him
[from QLR, de Kok 1985:173, Labelle & Hirschbuhler 2005:63]

Starke (1993) gives examples similar to (A2) from Slovak. Furthermore, as we saw above, Bošković (to appear) has shown that the P2 clitic-placement in SC cannot be accounted for by PI. In addition to the argument given in fn. 4, we can note that where a forename is left-branch extracted to first position in SC there must be morphological case on both parts of the name, whereas this is not required without extraction:

(A3)

a. Lava Tolstoj čitam
   Leo-Acc Tolstoy I-read
b. Lav Tolstoja čitam
   Leo Tolstoy-ACC I-read
   I read Leo Tolstoy

(A4)

a. Lava sam Tolstoja čitao
   Leo-Acc am Tolstoy-ACC read
   I have read Leo Tolstoy
b. *Lav sam Tolstoja čitao
   Leo am Tolstoy-Acc read
c. *Lava sam Tolstoj čitao
   Leo-Acc am Tolstoy read
   I have read Leo Tolstoy

As Bošković points out, there is no way to prevent PI from applying in (A4b,c); there is no reason for this operation to be sensitive to the case-marking of the forename. Left-branch extraction, on the other hand, may be. Moreover, there are examples where the fronted left-branch element cannot have case in situ, but must have it when fronted:

(A5)

a. Čičinu je on Tominu kolibu srušio
   Uncle’s-Acc is he Tom’s-Acc cabin torn-down
b. *Čiča je on Tominu kolibu srušio
   Uncle-Nom is he Tom’s-Acc cabin torn-down
c. *On je srušio čičinu Tominu kolibu
   he is torn-down Uncle’s-Acc Tom’s-Acc cabin
d. On je srušio čiča Tominu kolibu
   he is torn-down Uncle-Nom Tom’s-Acc cabin
   He has torn down Uncle Tom’s cabin

Again the case-requirement on the fronted constituent cannot be accounted for by PI. Furthermore, the local nature of PI predicts that neither of the following examples should be grammatical: (A6)

a. *Čiča Tominu je kolibu srušio
   Uncle-Nom Tom’s-Acc is cabin torn-down
b. Čičinu Tominu je kolibu srušio
   Uncle-Acc’s Tom’s-Acc is cabin torn-down
   S/he has torn down Uncle Tom’s cabin


If PI is maximally local, why is (A6a) ruled out? Moreover, what allows the entire DP čičinu Tominu to front in (A6b)? These observations cast severe doubt on PI’s ability to account for the P2-clitic facts of S/C. Furthermore, V2 certainly cannot be accounted for by PI (without giving every matrix finite verb a special phonological subcategorisation frame, and even this would not account for V1 orders), and so the important interactions between this phenomenon and clitic-placement that we saw in §§3 and 4 cannot be dealt with by PI. It is clear that the assumptions in (i-iii) need to be elaborated by further syntactic assumptions, mostly regarding clitic, V- and XP-movement into the left periphery. In that case, if we can account for all aspects of P2 placement with movement into the left periphery just by clitic, V- and XP-movement, then we should. This is exactly what we have done here.

Finally, PI is unformulable on standard minimalist assumptions. Merge creates structure and derives the nature of movement operations, and is confined to the core syntax. As such, we do not expect to find movement, i.e. Internal Merge, in the phonological component, but PI is exactly a case of this. If Internal Merge is available in phonology, then we expect to see many other cases of displacement, and perhaps apparently unbounded displacement, in phonology, as we do in syntax. But we do not. Therefore, PI must come at a very high cost indeed, in that the general architecture of the grammar with respect to the role of the core operation Merge must be rethought. A very major empirical gain would perhaps justify this, but, as we have seen, PI offers no empirical gain at all since all the facts it describes can be handled, and better handled, by standard syntactic operations. I conclude that, whatever the correct account of P2 phenomena may turn out to be, PI has no role to play there or elsewhere.

Notes * 1.

2.

Thanks to Ángel Gallego, two anonymous reviewers, and especially Moreno Mitrovic for help and comments on this paper. All errors are mine. For v, I assume that EF is generally satisfied by the merger of the external argument (EA) in its specifier. This category may also attract the lexical V; Chomsky (2001) assumes this is general, but see Biberauer & Roberts (2010), Huang (2007), for the proposal that this is not the case in Modern English and Mandarin Chinese respectively. Where there is no EA lexically available, if SpecvP is an A-position then the only possible argument that can appear there is the IA. Thus, where there is no EA, the IA cannot be frozen in place by Case/φ-feature licensing; in other words there can be no Accusative Case. This proposal derives part of Burzio’s Generalisation (if accusative Case, then an external argument). I leave DPs aside, although 2P effects have been observed there (see Avram & Coene 2008). This is the ‘Long Head Movement’ construction discussed by Borsley, Rivero & Stephens (1996), Jouitteau (2005), Ćavar & Wilder (1994), Bošković (2001), Lema & Rivero (1990, 1991), Rivero (1991, 1993a,b, 1994a,b, 1997), Rivero &


3.

4.

5.

6.


Terzi (1995), Roberts (1994); in terms of the approach to ‘head movement’ in Roberts (2010a) there is nothing problematic here since there is no Head Movement Constraint either as a derived or primitive principle. This idea was first put forward, to my knowledge, by Ćavar & Wilder (1992), and has been much debated since; see, among others, Schütze (1994), Progovac (1995), Franks & King (2000:217-222, 303ff., 358-60), Bošković (2000:101-102, 2001: 11ff.), Franks (2000:5-14). Ćavar & Wilder (1999:442) note that examples like (i) present a problem for this approach, since the material the clitic follows does not obviously form a constituent: (i) U zelenoj je kući stanovao in green is house stayed He stayed in the green house Franks & King (2000:360) observe that prepositions like u seem to be independently able to undergo left-branch extraction with a following wh-word, supporting the idea that these prepositions may be able to procliticize to a left-branch-extracted element. It is well-known that even names can be broken up: (i) Lava sam Tolstoja čitao Leo-Acc am Tolstoy-Acc read I have read Tolstoy Bošković (to appear) argues that this construction too involves left-branch extraction, and that the condition on clitic-placement is not phonological, as it is possible to have non-clitic material between the two parts of the name, as in (ii): (ii) Lava čitam Tolstoja Leo-Acc read Tolstoy-Acc I read Tolstoy See Bošković (to appear) for details regarding the structure and case properties of various types of names in Serbian/Croatian. For more on the possibility that clitic-placement is phonological see the Appendix below. This argument does not extend straightforwardly to Croatian, since we observed in fn. 2 that the combination of preposition and adjective can apparently be subextracted from a nominal. 
Aside from the possibility of preposition-cliticisation entertained there, however, prepositions cannot be separated from their objects by the clitic cluster in this language either, as far as I am aware. Mark Hale (p.c.) informs me that the kind of argument just made about Slavic languages does not obviously extend to older Indo-European languages, many of which show similar looking P2 effects. For example, Old Persian, Mycenean Greek and Sanskrit show cases where clitics interrupt phrasal constituents which cannot be readily handled in subextraction terms. Hale (1990) suggests a prosodic solution. It is a plausible speculation that phonologically governed P2 systems develop into syntactic ones of the type considered here. See Galves & Sândalo (forthcoming) for similar ideas regarding the history of European Portuguese. There is also some evidence that some languages show both types of P2 effect: this may be


true of Dutch (Zwart 1993), Latin (Wanner 1987), and Gothic (Eythórssen 1994). On the history of the South Slavic clitics, see Pancheva (2005).
7. Bošković (2001:40ff.) argues that the clitic cluster can be realized lower than the canonical subject position, on the edge of vP; see Roberts (2010:70-74) for critical discussion, concluding that the case that clitics are able to appear below the subject is not established on the basis of Bošković’s evidence.
8. This entails that the central idea in Alexiadou & Anagnostopoulou (1998), that D-features on the ‘richly inflected’ verbs in null-subject languages could satisfy the EPP by V-movement to T thereby obviating the application of the EPP to SpecTP, cannot be entertained here. However, it can surface in a slightly different form in terms of the deletion-based approach to null subjects put forward in Holmberg (2010) and Roberts (2010b). On this view, ‘rich’ verbal morphology, realized as a non-impoverished φ-feature matrix on T, serves to recover the content of a deleted pronominal subject T Agrees with. The two approaches are empirically very similar, although both Holmberg (2005) and Roberts (2010b) give evidence that favours the latter.
9. As we saw above, the clitic cluster has a rigidly determined internal order Q – aux – dat – acc – (gen) – (je). It is natural to place the Q-morpheme in a higher head than Fin, which immediately explains why this element is always first. For the moment, we take this to be Force, but we will question this assumption below. If we assimilate the auxiliary clitics, which bear subject agreement, to subject clitics, then we can think that the order inside the cluster reflects the order of cliticisation, assuming, given the Strict Cycle, that the deepest-embedded clitic moves first, then the next deepest, etc. This implies that the direct object is raised from a lower position than the indirect object, following Kayne (1984), Aoun & Li (1989) and Emonds & Whitney (2006:121f.). 
In this way, the derived structure for the maximal clitic complex would be as follows: (i) [Fin Subj-cl [Fin IO-cl [Fin DO-cl Fin ]]] By this reasoning, genitive clitics must originate from a still more embedded position than direct objects. These elements either originate from within a DP inside TP, or as inherently-Case-marked obliques. It is reasonable to think that either of these positions is more embedded than the first-merged position of the direct object. As we have seen, 2P complement clitics must move through the edge of vP. This movement must be order-maintaining, possibly by the formation of a cluster at this point in the derivation, permitting [ IO [ DO ]] to move on together (and simplifying slightly the derived structure in (i)). To account for the fact that je is always last in the clitic sequence, we could assume that it is merged directly in Fin rather

than moving there. In that case all the other clitics left-adjoin to it and so precede it. The fact that it bears default 3sg present features may be relevant to this.
10. Bošković (2002) actually argues that single-pair answers to multiple questions are restricted to cases where there is no ‘true’ wh-movement (hence in English only pair-list answers are allowed for multiple questions like Who bought what?, since English requires true wh-movement of exactly one wh-phrase in all true wh-questions). The exactly parallel S/C example Ko je šta kupio? therefore does not feature ‘true’ wh-movement. This leads Bošković to the conclusion that the clitic cluster must be able to occupy varying positions in the clause, a position he develops in full in Bošković (2001). One possibility for reconciling the analysis sketched in the text with Bošković’s proposals is to suggest, following Roberts (2010:72), that there are two Focus positions in S/C, and that the clitic position is sited in between the upper and the lower one. The other possibility may be to question the validity of Bošković’s generalisation concerning ‘true’ wh-movement and single-pair readings, but this would take us too far afield here.
11. In fact, it is unclear how felicitous li is in wh-questions. Ćavar & Wilder (1992:33) give just one example of an overt wh-phrase followed by li. Bošković (2002, fn. 9) observes that “the li-construction is not a ‘neutral’ question semantically”, but does not explain further.
12. A note of clarification is in order here regarding cyclicity. Consider the following two configurations:
(i) A … B … C
(ii) A … B … C … D
(where ‘…’ means asymmetric c-command). In (i), if A is a probe able to probe both B and C (for distinct features, or B would block the A-C relation by relativized minimality), I assume that this is possible, and that, if A also triggers movement of B and C, C must move first since it is lower in the structure. Moving B before C would violate the (strict) cycle, but moving C before B does not, and, other things being equal (notably minimality; see below on the blocking effects of EF), is allowed. On the other hand, in (ii), if A and B are probes and C is the goal of A and D is the goal of B, again as long as the features are distinct, D can move to B and C to A, in that order only, giving a crossed relation. D cannot be the goal of A and C the goal of B, as this would require an anti-cyclic derivation. In the case in point, Force corresponds to A in (i), Foc to B and Fin to C. Therefore Fin moves first, followed by Foc. The result, given left-adjunction every time, is that Foc (containing li) precedes all the other clitics, contained in Fin. A similar derivation is at work in fn. 9, deriving the observed order in the clitic cluster as the same as the first-merged order of the arguments. Other cases include multiple wh-movement to Foc and wh-movement combined with scrambling to SpecvP. See the discussion in Roberts (2010a, §4.2).
13. This raises the question of the status of Fin’s D-feature in a cliticless clause. One possibility would be to say that the feature can only appear when clitics do, or the derivation would crash. Another possibility is to say that this feature may play a role in licensing a null subject in SpecTP in an example like (17), perhaps by feature-sharing with T in this case. In that case, the subject in (17) would effectively be clitic left-dislocated. This second alternative appears more interesting, although I do not have empirical evidence that the subject behaves as though clitic-left-dislocated.
14. Argumental categories, including VP, can front independently of RM; see Rizzi (1990, 2001).
15. Although there is a complication in that complement-clitic clusters may be formed on the edge of the v-phase; see fn. 9.
16. Smuggling derivations of the kind put forward by Collins (2005) for the English passive massively violate any generalized freezing constraint, since, by definition, they involve moving out of a moved category. In this context, it is noteworthy that Baker (1988) proposes head-movement from a moved category in his analysis of causatives, another construction that may well involve smuggling in many languages (see Kayne 1975, Roberts 2010a). Smuggling derivations, then, appear to be able to avoid generalized freezing, for reasons that are unclear.
17. There are some complications with the auxiliaries vera (‘be’) and hafa (‘have’) in Icelandic, but the general point is that wherever an auxiliary cannot undergo SF it is ‘transparent’ for SF of elements lower in the hierarchy. So relativized minimality holds as a general condition. See Holmberg (2006:547-9) for discussion.
18. Presumably the availability of a closer expletive (in SpecTP) would always block participle-fronting directly to SpecCP in MSc by relativized minimality.
19. Recall that the verb moves directly from v to Fin in Germanic V2. The pronoun/cluster appears in Spec,vP (either as a cluster or as separate elements, see the discussion of S/C in the previous section, and further remarks below). Hence the pronoun/clitic cluster asymmetrically c-commands the verb immediately prior to movement of either to Fin. By the definition of the Strict Cycle in Roberts (2010a:53), the verb would have to move first, and hence proclisis ensues by left-adjunction of the pronouns/clitics to Fin.
20. It may be that weak pronouns do not have to raise, or at least do not have to raise to SpecFinP, in German, cf.:
(i) daß endlich wer sie uns vorstellen/zeigen sollte
    that after-all someone them-Acc us-Dat introduce/show should
    [from Haider 2010:139]
Here the weak pronouns may be on the edge of vP (in fact either the matrix or embedded vP, assuming sollen takes a vP-complement). Note that they are nonetheless attracted to a phase edge.
21. Although Anagnostopoulou (2008) claims the German weak pronouns do.
22. This has not been discussed in the foregoing, because the empirical effects are masked by the fact that Force always attracts one element (which, if there is a wh-phrase in Force, will be that phrase, by relativized minimality).
23. These are not the only exceptions to the basic pattern. There is a significant minority of verb-final main-clause declaratives, especially in the second conjunct of coordinations, and there are fairly frequent cases of the order adverb > subject > V, where the adverb is usually a time adverb. I leave these cases aside here, however.
24. Of course, one might want to identify ForceP with Rizzi’s higher TopicP. I stick to the ForceP notation for simplicity, in order to show that it is possible to account for all the attested variations in order in the left periphery with just three categories.
25. Elements like -u attached not only to verbs, but also to ‘preverbs’ such as the aspect-marker ga:
(i) a. ga-u-laubjats?
       Asp-Q-believe.2dual
       Do you two believe? [from M 9, 28; Longobardi 1994:361]
    b. thu ga-u-laubeis du sunam gudis?
       you Asp-Q-believe.2sg in son of-God
       Do you believe in the Son of God? [from J 9, 35; Ferraresi 1991:xi]
The similarity to Old Irish here is striking, cf.:
(ii) a. aton.cí (Preverb > clitic > verb)
        Asp+us.see(3sg)
        He sees us
     b. Ní.m*accai (‘conjunct’ > clitic > preverb > verb)
        Neg.me*see.3sg
        He doesn’t see me
(See Carnie, Pyatt & Harley 2000, Newton 2006.) In (i) and (iia) we see the preverb moving over the clitic with the verb following the clitic. We can account for this with the following structure:
(iii) [FocP/ForceP Preverb [FinP clitics [Fin V ] ...
This entails that the clitics are really weak pronouns in SpecFin. Taking the preverbs to be aspectual heads, incapable of bearing tense or agreement marking, and taking V to raise from T to Fin (V-to-T raising is justified by the ‘rich’ subject-agreement marking and the null-subject nature of both of these languages), we have a nested head-movement dependency here. This can be accounted for if Fin attracts T and T attracts V, hence the finite verb, while the preverb moves independently to satisfy an EF associated with either Force or Foc. In (iib) we see a ‘conjunct particle’ (negation, question particle or complementizer). Negation, as we have seen in OE and Gothic, can occupy SpecFocP and satisfy Foc’s EF.
In this case and only this, the preverb incorporates with the verb and together these elements move to Fin. The optimal analysis would allow preverb-incorporation in all cases, but with excorporation where it satisfies EF (this is allowed by the approach to incorporation in Roberts 2010a, as described above). Structures similar to (i) and (ii), showing the order preverb > clitic > verb, are also found in Archaic Latin and in Greek (Fortson 2004:140, Vincent 1999).
26. Adams (1994) is a detailed survey of ‘Wackernagel pronouns’ in Latin. His conclusion is that these pronouns are enclitic to the focussed constituent on the left edge of the colon (a prosodic unit that Ledgeway (forthcoming) identifies with the phase). This is consistent with the general proposals in the text: Latin Foc attracts a variety

of elements (via its EF) and the weak pronouns are in SpecFin, in a position right adjacent to the focussed category (Foc is empty: Latin does not have a focus particle or other morphological mark of focus), in a position to feed PF enclisis.
27. Brazilian Portuguese does not show these effects, at least not in colloquial registers. The historical development of the Portuguese restrictions on enclisis and the loss of enclisis, indeed of 3rd-person clitics, in Brazilian Portuguese, are dealt with by Galves (2000), Galves, Ribeiro & Torres-Morais (2005), Galves & Paixão de Sousa (2010) and Kato, Cyrino & Reche-Corrêa (2009).
28. Certain aspectual adverbs show the same pattern, and so must be treated as incorporating with Foc when they front:
(i) O Pedro já/nunca o viu
    the Peter already/never him saw
    [from Barbosa 2000:36]
29. Raposo (2000:271) gives the following pair of examples:
(i) a. Muito whisky o capitão me tem servido!
       too-much whisky the captain me has served
    b. Muito whisky me tem o capitão servido!
       too-much whisky me has the captain served
       The captain has served me too much whisky!
Here it seems that Foc fails to attract Fin and V. It is unclear what constraints govern this apparent optionality.
30. Thanks to a reviewer for clarifying this point.

References
Adams, James
1994 Wackernagel’s Law and the Position of Unstressed Personal Pronouns in Classical Latin. Transactions of the Philological Society 92: 103-178.
Alexiadou, Artemis and Elena Anagnostopoulou
1998 Parametrizing Agr: word order, verb-movement and EPP-checking. Natural Language and Linguistic Theory 16: 491-539.
Anagnostopoulou, Elena
2008 Weak Pronouns and the Person Case Constraint: A Case Study of German. Talk given at the 23rd Comparative Germanic Syntax Workshop, University of Edinburgh.
Aoun, Joseph and Audrey Li
1989 Scope and Constituency. Linguistic Inquiry 20: 141-172.
Avram, L. and M. Coene
2008 Romanian possessor clitics revisited. In Clitic Doubling in the Balkan Languages, D. Kallulli and L. Tasmowski (eds.), 361-388. Amsterdam: John Benjamins.
Axel, Klaus
2007 Studies on Old High German Syntax: Left Sentence Periphery, Verb Placement and Verb-Second. Amsterdam: John Benjamins.
Baker, Mark
1988 Incorporation: A Theory of Grammatical Function Changing. Chicago: Chicago University Press.
1996 The Polysynthesis Parameter. Oxford: Oxford University Press.
Barbosa, Pilar
1995 Null Subjects. MIT PhD dissertation.
2000 Clitics: A window into the null subject property. In Portuguese Syntax: New Comparative Studies, João Costa (ed.), 31-92. New York/Oxford: Oxford University Press.
2009 Two kinds of subject pro. In A. Holmberg (ed.). Studia Linguistica special edition on partial null subjects 63: 2-58.
den Besten, Hans
1983 On the interaction of root transformations and lexical deletive rules. In On the Formal Syntax of the Westgermania, Werner Abraham (ed.), 47-131. Amsterdam: John Benjamins.
den Besten, Hans and Gerd Webelhuth
1989 Stranding. In Scrambling and Barriers, G. Grewendorf and W. Sternefeld (eds.), 77-92. Amsterdam: John Benjamins.
Biberauer, Theresa
2003 Verb Second (V2) in Afrikaans: a minimalist investigation of word-order variation. Ph.D. dissertation, University of Cambridge.
Biberauer, Theresa and Ian Roberts
2005 Changing EPP-parameters in the history of English: accounting for variation and change. English Language and Linguistics 9: 5-46.
2006 Loss of ‘Head-final’ Orders and Remnant Fronting in Late Middle English: Causes and Consequences. In Comparative Studies in Germanic Syntax: From A(frikaans) to Z(ürich German), J. Hartmann and L. Molnárfi (eds.), 263-297. Amsterdam: John Benjamins.
2010 Subjects, Tense and Verb Movement. In Parametric Variation: Null Subjects in Minimalist Theory, T. Biberauer, A. Holmberg, I. Roberts and M. Sheehan. Cambridge: Cambridge University Press.
Borsley, Robert and María-Luisa Rivero
1994 Clitic Auxiliaries and Incorporation in Polish. Natural Language and Linguistic Theory 12: 373-422.
Borsley, Robert, María-Luisa Rivero, and Janig Stephens
1996 Long Head Movement in Breton. In The Syntax of the Celtic Languages, I. Roberts and R. Borsley (eds.), 53-74. Cambridge: Cambridge University Press.


Bošković, Željko
2000 Second-position clitics: syntax and/or phonology? In Clitic Phenomena in European Languages, F. Beukema and M. den Dikken (eds.), 71-119. Amsterdam: Benjamins.
2001 On the Nature of the Syntax-Phonology Interface: Cliticization and Related Phenomena. Amsterdam: Elsevier Science.
2002 On multiple wh-fronting. Linguistic Inquiry 33: 351-383.
2008 What will you have, DP or NP? In Proceedings of NELS 37.
2010 On NPs and Clauses. Ms., University of Connecticut.
to app. On Leo Tolstoy, its structure, Case, left-branch extraction, and Prosodic Inversion. In Studies in South Slavic Linguistics in Honor of E. Wayles Browne, S. Franks et al. (eds.), 99-122. Slavica.
Browne, Wayles
1975 Serbo-Croatian enclitics for English-speaking learners. In The Zagreb English-Serbo-Croatian Project. Contrastive analysis of English and Serbo-Croatian I, R. Filipović (ed.), 105-134. Zagreb: Institute of Linguistics. [Reprinted in Journal of Slavic Linguistics 12(1-2): 255-289].
Cardinaletti, Anna and Michal Starke
1999 The typology of structural deficiency: on the three grammatical classes. In Clitics in the Languages of Europe, H. van Riemsdijk (ed.), 145-233. Berlin: Mouton de Gruyter.
Carnie, Andrew, Elizabeth Pyatt, and Heidi Harley
2000 VSO order as raising out of IP? Some evidence from Old Irish. In The Syntax of Verb-Initial Languages, A. Carnie and E. Guilfoyle (eds.), 39-60. Oxford/New York: Oxford University Press.
Ćavar, Damir and Chris Wilder
1992 Long Head Movement? Verb-Movement and Cliticization in Croatian. Sprachwissenschaft in Frankfurt 7.
1994 Long Head Movement? Verb-Movement and Cliticization in Croatian. Lingua 93: 1-58.
1999 ‘Clitic-third’ in Croatian. In Clitics in the Languages of Europe, H. van Riemsdijk (ed.), 429-468. Berlin: Mouton de Gruyter.
Chomsky, Noam
2001 Derivation by Phase. In Ken Hale: A Life in Language, M. Kenstowicz (ed.), 1-52. Cambridge, Mass.: MIT Press.
2008 On Phases. In Foundational Issues in Linguistic Theory: Essays in Honor of Jean-Roger Vergnaud, R. Freidin et al. (eds.), 133-165. Cambridge, Mass.: MIT Press.
Cinque, Guglielmo
1999 Adverbs and Functional Heads. Oxford: Oxford University Press.
Clackson, James
2007 Indo-European Linguistics. Cambridge: Cambridge University Press.


Collins, Christopher
2005 A smuggling approach to the passive in English. Syntax 8: 81-120.
Costa, João
2000 Portuguese Syntax: New Comparative Studies. Oxford: Oxford University Press.
Devine, A.M. and L.D. Stephens
2006 Latin Word Order: Structured Meaning and Information. Oxford: Oxford University Press.
Dimitrova-Vulchanova, M.
1999 Clitics in the Slavic languages. In Clitics in the Languages of Europe, H. van Riemsdijk (ed.), 83-122. Berlin: Mouton de Gruyter.
Duarte, Ines and Gabriela Matos
2000 Romance Clitics and the Minimalist Program. In Portuguese Syntax: New Comparative Studies, J. Costa (ed.), 94-115. Oxford: Oxford University Press.
Emonds, Joseph and R. Whitney
2006 Double Object Constructions. In The Blackwell Companion to Syntax, Volume 1, M. Everaert and H. van Riemsdijk (eds.), 73-144. Oxford: Blackwell.
Eythórsson, Thorhallur
1994 Functional Categories, Cliticization and Verb Movement in the Early Germanic Languages. Paper presented at the 9th Comparative Germanic Syntax Workshop, Harvard University.
1995 Verb Position and Verb Movement in Early Germanic. Cornell University PhD dissertation.
Faarlund, Jan Terje
1994 Old and Middle Scandinavian. In The Germanic Languages, E. König and J. van der Auwera (eds.), 38-71. London: Routledge.
Ferraresi, Gisella
1991 Die Stellung des gotischen Verbs im Lichte eines Vergleichs mit dem Althochdeutschen. M.A. Thesis, University of Venice.
2005 Word order and phrase structure in Gothic. Leuven: Peeters.
Fontana, Josep Maria
1993 Phrase Structure and the Syntax of Clitics in the History of Spanish. PhD Dissertation, University of Pennsylvania.
Fortson, Benjamin
2004 Indo-European Language and Culture. Oxford: Blackwell.
Franks, Steve
2000 Clitics at the Interface: an Introduction to Clitic Phenomena in European Languages. In Clitic Phenomena in European Languages, F. Beukema and M. den Dikken (eds.), 1-46. Amsterdam: Benjamins.
Franks, Steve and T. King
2000 A handbook of Slavic Clitics. Oxford: Oxford University Press.


Fuß, Eric
2008 Word order and language change. On the interface between syntax and morphology. Habilitationsschrift, Goethe-Universität Frankfurt.
Galves, Charlotte
2000 Agreement, Predication and Pronouns in the History of Portuguese. In Portuguese Syntax: New Comparative Studies, João Costa (ed.), 143-168. Oxford: Oxford University Press.
Galves, Charlotte and M.-C. Paixão de Sousa
2010 The loss of V2 in the history of Portuguese: subject position, clitic placement and prosody. Talk given at the 12th Diachronic Generative Syntax Conference, Cambridge.
Galves, Charlotte, Ilza Ribeiro and Aparecida Torres-Morais
2005 Syntax and morphology in the placement of clitics in European and Brazilian Portuguese. Journal of Portuguese Linguistics 4: 143-177.
Galves, Charlotte and M. Sândalo
forth. From Intonational Phrase to Syntactic Phase: the grammaticalization of enclisis in the history of Portuguese. To appear in Lingua.
Garrett, Andrew
1990 The syntax of Anatolian pronominal clitics. PhD Dissertation, Harvard University.
Gianollo, Chiara, Cristina Guardiano and Giuseppe Longobardi
2008 Three fundamental issues in parametric linguistics. In The Limits of Syntactic Variation, T. Biberauer (ed.), 109-142. Amsterdam: Benjamins.
Haeberli, Eric
1999 On the word order ‘XP-subject’ in the Germanic languages. Journal of Comparative Germanic Linguistics 3: 1-36.
Haider, Hubert
2010 The Syntax of German. Cambridge: Cambridge University Press.
Haïk, Isabelle
1984 Indirect Binding. Linguistic Inquiry 15: 185-224.
Hale, Mark
1995 Wackernagel’s Law in the Rigveda. Ms., University of Concordia.
Halpern, Aaron
1992 Topics in the Placement and Morphology of Clitics. PhD Dissertation, Stanford University.
Halpern, Aaron and Arnold Zwicky
1996 Approaching Second: Second Position Clitics and Related Phenomena. Stanford: CSLI.
Harbert, Wayne
2007 The Germanic Languages. Cambridge: Cambridge University Press.


Holmberg, Anders
2000 Scandinavian stylistic fronting: How any category can become an expletive. Linguistic Inquiry 31: 445-83.
2005 Is There a Little Pro? Evidence from Finnish. Linguistic Inquiry 36: 533-564.
2006 Stylistic Fronting. In The Blackwell Companion to Syntax, M. Everaert and H. van Riemsdijk (eds.), 530-63. Oxford: Blackwell.
2010 Null subject parameters. In Parametric Variation: Null Subjects in Minimalist Theory, T. Biberauer et al., 88-124. Cambridge: Cambridge University Press.
Holmberg, Anders and Christer Platzack
1995 The Role of Inflection in Scandinavian Syntax. Oxford: Oxford University Press.
Holmberg, Anders and Ian Roberts
2010 Introduction: parameters in minimalist theory. In Parametric Variation: Null Subjects in Minimalist Theory, T. Biberauer et al., 1-57. Cambridge: Cambridge University Press.
Hrafnbjargarsson, Gunnar and A.-L. Wiklund
2009 General embedded V2: Icelandic A, B, C, etc. Working Papers in Scandinavian Syntax 84: 21-51.
Huang, C.-T. James
2007 The macro-history of Chinese syntax and the theory of language change. Talk given at the University of Chicago.
Jouitteau, Mélanie
2005 La syntaxe comparée du breton. PhD dissertation, University of Nantes.
Julien, Marit
2000 Syntactic heads and word formation: A Study of Verbal Inflection. PhD Dissertation, University of Tromsø.
Kato, Mary, Sonia Cyrino and V. Reche-Corrêa
2009 Brazilian Portuguese and the recovery of lost clitics through schooling. In Minimalist inquiries into child and adult language acquisition: case studies across Portuguese, A. Pires and J. Rothman (eds.), 245-272. New York: Mouton de Gruyter.
Kayne, Richard
1975 French Syntax. Cambridge, Mass.: MIT Press.
1984 Connectedness and Binary Branching. Dordrecht: Foris.
1991 Romance clitics, verb movement and PRO. Linguistic Inquiry 22: 647-86. [Reprinted in Parameters and universals. R. Kayne (ed. 2000a), 60-97. Oxford: Oxford University Press].
van Kemenade, Ans
1987 Syntactic Case and Morphological Case in the History of English. Dordrecht: Foris.


Kiparsky, Paul
1995 Indo-European Origins of Germanic Syntax. In Clause Structure and Language Change, A. Battye and I. Roberts (eds.), 140-169. Oxford: Oxford University Press.
de Kok, Ans
1985 La place du pronom personnel regime conjoint en français. Amsterdam: Rodopi.
Labelle, Anne and Paul Hirschbühler
2005 Changes in Clausal Organisation and the Position of Clitics in Old French. In Grammaticalization and Parametric Variation, M. Batllori et al. (eds.), 60-71. Oxford: Oxford University Press.
Laka, Itziar
1990 Negation in syntax: on the nature of functional categories and projections. Ph.D. Dissertation, MIT.
Ledgeway, Adam
forth. From Latin to Romance: Morphosyntactic typology and change. Oxford: Oxford University Press.
Legate, Julie
2008 Warlpiri and the theory of second position clitics. Natural Language and Linguistic Theory 26: 3-60.
Lehmann, Winfried
1993 Theoretical Bases of Indo-European Linguistics. London: Routledge.
Lema, José and María Luisa Rivero
1990 Long head-movement: ECP vs. HMC. Proceedings of NELS 20, 333-347.
1991 Types of verbal movement in Old Spanish: modals, futures and perfects. Probus 3: 237-78.
Longobardi, Giuseppe
1994 Reference and proper names: a theory of N-Movement in syntax and Logical Form. Linguistic Inquiry 25: 609-665.
Madeira, Anna-Maria
1993 Clitic-Second in European Portuguese. Probus 5: 127-154.
1995 Topics in Portuguese Syntax: The Licensing of T and D. Ph.D. Dissertation, University College London.
Maling, Joan
1980 Inversion in embedded clauses in Modern Icelandic. Íslenskt mál og almenn málfræði 2: 175-193. [Reprinted in Modern Icelandic syntax, J. Maling and A. Zaenen (eds., 1990), 71-91. San Diego: Academic Press].
Martins, Anna-Maria
1994 Cliticos na história do português. PhD dissertation, University of Lisbon.
2010 Ordem de palavras e interpretação: o Português Antigo. Talk given at UNICAMP, Brazil.


Newton, Glenda
2006 The development and loss of the Old Irish double system of verbal inflection. Ph.D. Dissertation, University of Cambridge.
Nunes, Jairo
2004 Linearization of Chains and Sideward Movement. Cambridge, Mass.: MIT Press.
Pancheva, Roumyana
2005 The Rise and Fall of Second-position Clitics. Natural Language and Linguistic Theory 23: 103-167.
Pintzuk, Susan
1991 Phrase Structure in Competition: variation and change in Old English word order. Ph.D. Dissertation, University of Pennsylvania.
1993 Verb seconding in Old English: Verb Movement to Infl. The Linguistic Review 10: 5-35.
Progovac, Liliana
1995 Clitics in Serbian/Croatian: Comp as the Second Position. In Approaching Second: second position clitics and related phenomena, A. Halpern and A. Zwicky (eds.), 411-428. Stanford: CSLI.
Rackowski, Andrea and Norvin Richards
2005 Phase Edge and Extraction: A Tagalog Case Study. Linguistic Inquiry 36: 565-599.
Raposo, Eduardo
2000 Clitic positions and verb movement. In Portuguese Syntax, João Costa (ed.), 266-297. Oxford: Oxford University Press.
2001 Objetos nulos e CLLD. Uma teoria unificada. Ms., University of California at Santa Barbara.
Raposo, Eduardo and Juan Uriagereka
2005 Clitic placement in Western Iberian: A Minimalist view. In The Oxford handbook of comparative syntax, G. Cinque and R. Kayne (eds.), 639-697. Oxford/NY: Oxford University Press.
Richards, Marc
2007 On feature-inheritance: an argument from the Phase Impenetrability Condition. Linguistic Inquiry 38: 563-72.
Richards, Marc and Theresa Biberauer
2005 Explaining Expl. In The Function of Function Words and Functional Categories, M. den Dikken and C. Tortora (eds.), 115-154. Amsterdam: John Benjamins.
Rivero, María Luisa
1986 Parameters in the Typology of Clitics in Romance and Old Spanish. Language 62: 774-807.
1991 Long Head Movement and Negation: Serbo-Croatian versus Slovak and Czech. The Linguistic Review 8: 319-351.

1992 Clitic and NP Climbing in Old Spanish. In Current Studies in Spanish Linguistics, H. Campos and F. Martínez Gil (eds.), 241-282. Washington, D.C.: Georgetown.
1993a Long Head Movement vs V2 and Null Subjects in Old Romance. Lingua 89: 113-141.
1993b Finiteness and Second Position in Long Head Movement Languages. Ms., University of Ottawa.
1994a Negation, Imperatives and Wackernagel Effects. Rivista di Linguistica 1: 91-118.
1994b Clause structure and V-movement in the languages of the Balkans. Natural Language and Linguistic Theory 12: 63-120.
1997 On two positions for complement clitic pronouns: Serbo-Croatian, Bulgarian and Old Spanish. In Parameters of Morphosyntactic Change, A. van Kemenade and N. Vincent (eds.), 170-206. Cambridge: Cambridge University Press.
Rivero, María Luisa and Arhonto Terzi
1995 Imperatives, V-Movement and Logical Mood. Journal of Linguistics 31: 301-332.
Rizzi, Luigi
1990 Relativized Minimality. Cambridge, Mass.: MIT Press.
1996 Residual Verb Second and the WH-Criterion. In Parameters and Functional Heads, A. Belletti and L. Rizzi (eds.), 63-90. Oxford: Oxford University Press.
1997 On the fine structure of the left periphery. In Elements of grammar, L. Haegeman (ed.), 281-337. Dordrecht: Kluwer.
2001 Relativized Minimality Effects. In Handbook of Syntactic Theory, M. Baltin and C. Collins (eds.), 89-110. Oxford: Blackwell.
Roberts, Ian
1993 Verbs and diachronic syntax: a comparative history of English and French. Dordrecht: Kluwer.
1994 Two Types of Head Movement in Romance. In Verb Movement, N. Hornstein and D. Lightfoot (eds.), 207-242. Cambridge: Cambridge University Press.
1996 Remarks on the Old English C-system and Diachrony of V2. In E. Brandner and G. Ferraresi (eds.). Linguistische Berichte 7: 154-167.
2001 Head Movement. In Handbook of Contemporary Syntactic Theory, M. Baltin and C. Collins (eds.), 113-147. Oxford: Blackwell.
2004 The C-system in Brythonic Celtic. In The Structure of IP and CP, L. Rizzi (ed.), 297-328. Oxford: Oxford University Press.
2005 Principles and Parameters in a VSO Language: A Case Study in Welsh. Oxford: Oxford University Press.
2010a Agreement and Head Movement: Clitics and Defective Goals. Cambridge, Mass.: MIT Press.

2010b A deletion analysis of null subjects. In Parametric Variation: Null Subjects in Minimalist Theory, T. Biberauer et al. (eds.). Cambridge: Cambridge University Press.
Robinson, Orrin W.
1997 Clause Subordination and Verb Placement in the Old High German ‘Isidor’ Translation. Heidelberg: Winter.
Rögnvaldsson, Eirikur and Höskuldur Thráinsson
1990 On Icelandic word order once more. In Modern Icelandic syntax, J. Maling and A. Zaenen (eds.). New York: Academic Press.
Ross, John
1967 Constraints on variables in Syntax. Ph.D. Dissertation, MIT.
Rouveret, Alain
1992 Clitic-placement, Focus and the Wackernagel Position in European Portuguese. Paper presented at the European Science Foundation Workshop on Clitics, Donostia/San Sebastian.
Rudin, C.
1988 On multiple questions and multiple WH fronting. Natural Language and Linguistic Theory 6: 445-502.
Salvi, Giampaolo
2004 La formazione di struttura di frase romanza. Tübingen: Niemeyer.
Santorini, Beatrice
1992 Variation and change in Yiddish subordinate clause word order. Natural Language and Linguistic Theory 10: 595-640.
Schütze, Carsten
1994 Serbo-Croatian Second Position Clitic Placement and the Phonology-Syntax Interface. MIT Working Papers in Linguistics 21: 373-473.
Shlonsky, Ur
2004 Enclisis and proclisis. In The Structure of CP and IP, L. Rizzi (ed.), 329-354. Oxford: Oxford University Press.
Starke, Michal
1993 En deuxième position en Europe Central. B.A. Dissertation, University of Geneva.
2001 Move reduces to Merge: a theory of locality. Ph.D. Dissertation, University of Geneva.
Thráinsson, Höskuldur
2007 The syntax of Icelandic. Cambridge: Cambridge University Press.
Tomaselli, Alessandra
1989 La sintassi del verbo finito nelle lingue germaniche. Ph.D. Dissertation, University of Pavia.
1995 Cases of verb third in Old High German. In Clause Structure and Language Change, A. Battye and I. Roberts (eds.), 345-369. Oxford: Oxford University Press.


Travis, Lisa
1984 Parameters and effects of word order variation. Ph.D. Dissertation, MIT.
Uriagereka, Juan
1995 Aspects of the Syntax of Clitic Placement in Western Romance. Linguistic Inquiry 26: 79-124.
Vikner, Sten
1995 Verb movement and expletive subjects in the Germanic languages. Oxford: Oxford University Press.
Vincent, Nigel
1999 The evolution of c-structure: prepositions and PPs from Indo-European to Romance. Linguistics 37: 1.111-1.154.
Wackernagel, Jakob
1892 Über ein Gesetz der Indogermanischen Wortstellung. Indogermanische Forschungen 1: 333-436.
Walkden, George
2009 The comparative method in syntactic reconstruction. M.Phil, University of Cambridge.
Wanner, Dieter
1987 The development of Romance clitic pronouns from Latin to Old Romance. Berlin: Mouton de Gruyter.
Zwart, Jan-Wouter
1993 Dutch Syntax: A Minimalist Approach. Ph.D. Dissertation, University of Groningen.

Index of subjects

Affix
  Inner (Class 1/Stem level) ~, 16, 261
  Outer (Class 2/Word level) ~, 16, 261
Agree, 23, 28, 33, 85, 105, 106-109, 120, 137, 138, 140, 153, 173, 177, 178, 183, 194-200, 202, 205-207, 210, 212, 219, 220, 222, 224, 226, 235, 245, 247, 256, 275, 345, 368, 385, 391, 421, 422
  Long Distance ~ (LDA), 197, 221
  Multiple ~, 207, 220
Agreement, 4, 21, 30, 36, 52, 55, 59, 73, 92, 105, 106, 116, 117, 120, 159, 176-178, 181, 183, 184, 186, 195, 196, 198, 205-213, 215-219, 221-227, 235, 237, 329, 368, 372, 391, 396, 404, 426, 429
  Spec-Head ~, 30, 205, 212, 226
Allomorphy, 15, 259, 260, 266-268, 407
Case, 22, 28, 32, 33, 37, 104-111, 113-120, 132, 139, 140, 149, 150, 151, 153, 163, 173, 174, 176, 177, 181, 184-186, 188, 189, 191, 207, 210, 213, 214, 218-220, 222-224, 236, 237, 247, 300, 328, 329, 330, 345, 424
  Abstract ~, 21, 35
  Structural ~, 4, 21, 22, 30, 92, 103-105, 107-109, 111, 120, 219, 323
  Inherent ~, 23, 37, 120, 219
  ~ Filter, 104, 119
CED effects, 27, 144, 147-150, 161, 164
  Adjunct Condition, 147, 151, 152
  Subject Condition, 141-143, 152, 173, 174
Cliticization, 33, 34, 36, 390-392, 422, 425, 426
  Enclisis, 385-397, 399, 414, 415, 417-421, 430
  Proclisis, 406, 415-419, 428
Convergence, 38, 105, 107, 175, 179, 234, 236, 238
Cycle, 9, 10, 16, 17, 20, 25, 34, 35, 45, 58, 83, 84, 91, 125, 131, 155, 251, 252, 254, 256, 257, 264, 265, 276, 312, 323, 327, 331, 333, 334, 338, 339
  Phonological ~, 9, 58
  Single ~ Syntax, 35
  Strict ~ (Condition), 25, 35, 37, 251, 253, 254, 256, 257, 394, 411, 423, 427, 428
Cyclicity, 1, 2, 4, 68, 83, 87, 120, 251, 252, 427
  Featural ~, 21
  Morpho-phonological ~, 254
  Phase ~, 68, 130, 139, 152, 155, 161
  Phonological ~, 264
  Strict ~, 4, 254
  Successive ~, 25, 46, 88, 152, 202, 217, 234, 329
Compositionality, 2, 4, 9, 309, 319, 320, 322, 329, 331, 333, 336-338
  Principle of Phasal Composition, 327, 334, 336, 338

  Semantic ~, 312, 338
  Strict ~, 34
Computational complexity, 10, 12, 20, 21, 67
Deep Structure (D-Structure), 103, 107, 110, 323
Defective (Probe, Goal, etc.), 29, 33, 196, 198, 199, 201, 202, 204, 205, 211-213, 215, 216, 220-224, 226, 391, 397
  Partially ~, 30, 213-215, 217, 218, 220-224, 227, 228
  Totally (completely) ~, 30, 213, 214-218, 221-224, 228
Dynamical frustration, 69, 70, 74, 83, 92, 95
Endocentricity, 2, 4, 126, 127, 155, 156, 157
Exceptional Case Marking (ECM), 117, 150, 151, 213, 219, 222, 330, 373
Economy, 10, 37, 48
Escape hatch, 23-26, 47, 68, 144, 145, 329
Edge (of a phase, head, etc.), 5, 6, 13, 14, 17, 25, 29, 36, 47, 51-53, 72-74, 79-82, 85, 86, 88-90, 92, 105, 106, 109, 112-120, 131, 132, 134, 135, 139, 145, 147, 149, 150, 151, 153, 158, 160, 165, 173, 175-178, 181, 183-186, 188-192, 196-199, 201, 204, 213, 214, 216, 221, 222, 224, 233, 235-238, 244-247, 254, 260, 266, 267, 269, 270, 283, 284, 286, 289, 295, 296, 297-302, 323, 324, 327, 329, 333, 334, 338, 344, 359, 366, 390, 406, 411, 421, 426, 428, 429
Extraction, 24-27, 38, 141, 145, 153, 154, 174, 217, 221, 226, 329, 343, 347, 348, 350-352, 357-361, 364-366, 369-373, 387, 423
  Left Branch ~, 23, 343, 344, 348, 349, 356, 357, 359, 365, 366, 368, 369, 370, 372, 423, 425
F Game, 80, 81, 89, 91, 94
Feature (Case, Wh-, agreement, V, etc.), 38, 48, 56, 58, 59, 105, 106, 108, 116, 119, 120, 127, 128, 132, 137, 140, 149-151, 158, 163, 179, 185, 190, 200, 201, 206, 213, 214, 217, 219, 220, 236-239, 245-247, 313, 315, 344, 345, 347, 363, 390-392, 394, 396-398, 404, 408, 410, 412, 424, 426, 427
  Edge ~ (EF), 27, 48, 49, 58, 59, 127, 128, 177, 180, 236, 238, 247, 248, 385, 390, 392, 393, 396, 417
  EPP ~ (EPP position), 31, 68, 116, 120, 121, 128, 136, 140, 141, 142, 157, 163, 164, 165, 197, 206, 209-211, 218, 227, 239, 300, 391, 403, 404, 426
  ~ bundle, 51, 398, 407
  ~ inheritance, 28, 29, 32, 36, 59, 105, 106, 109, 174, 176, 195, 196, 198, 200-203, 205, 207, 209, 210, 212, 216, 223, 225-227
  ~ percolation, 27, 50, 126, 137, 139, 162
  ~ sharing, 246, 331, 428
  ~-splitting, 31-33, 117, 118, 173, 174, 176-179, 181, 183, 184, 186-191
  ~ valuation, 15, 105, 174, 175, 234-236, 245-247
  uninterpretable ~, 6, 29, 140, 147, 150, 177, 237, 247
Fibonacci, 74-76, 78, 79, 82, 83, 89-91, 93-95
Freezing effects, 141-143, 153, 161, 346
Full Interpretation, 28, 104, 109, 110, 112, 155, 156, 200, 235
H-α schema, 127, 129-132, 134-136, 138, 139, 141, 143-145, 147-157, 159, 160, 161, 164
Head movement, 179, 180, 182, 191, 234, 246, 301, 385, 396, 397, 425, 428, 429
  Long ~, 396, 424
  ~ constraint, 396
Interface conditions, 9, 10, 15, 20, 28, 34, 36, 69, 73, 91, 92, 104, 106, 107, 109, 175, 200, 283
Intervention (see also Minimality), 22, 25, 33, 152, 182, 187, 189, 225
Label, 4, 36, 59, 68, 114, 130, 155, 162, 164, 234, 272, 316, 323
  s-~, 17, 285, 286, 296, 297
  w-~, 17, 285, 296
Labeling algorithm, 5, 6, 126, 130, 156
Last Resort, 45, 48, 49, 140
Lexical Array, 11, 12, 67, 114
Lexical Phonology, 252, 253, 256, 259, 261, 263, 264
Lexicon, 2, 11, 16, 22, 29, 35-38, 49, 50, 57, 67, 120, 127, 158, 159, 179, 251, 260, 263, 316, 338, 360
Memory, 11, 12, 67, 84, 85, 88, 89, 199, 200
Merge
  External ~ (EM), 3, 36, 107, 132-134, 137, 140, 145, 164, 210, 323
  Internal ~ (see also Movement), 3, 22, 23, 32, 36, 48, 85, 107, 113, 115, 116, 121, 134-141, 143, 153, 155, 157, 158, 163, 173, 174, 176-178, 180, 181, 190, 191, 210, 234, 238, 256, 323, 424
Maximal φ Condition, 274
Minimality, 174, 182-184, 187, 247, 396, 397, 404, 410, 412, 427, 428
Minimal (minimize) computation, 120, 152, 207
  ~ search, 4, 114, 126, 156, 157, 207, 218
Movement
  A ~, 135, 136, 163, 173, 178, 180, 182-186, 188, 189, 191, 196, 202-205, 216, 234, 237, 238, 241, 362, 373
  A-bar (A’) ~, 136, 146, 160, 163, 178-184, 191, 192, 234, 238, 245, 247, 273, 324, 338, 344, 346
  Improper ~, 32, 116, 117, 173, 178, 179, 181, 185, 186, 188, 189, 191, 373
Natural bracketing hypothesis, 264

NP/DP Parameter, 23, 343, 348, 350, 357
NP language, 352, 357, 358, 359, 366, 368, 370
Numeration (see also Lexical Array), 11, 36, 58, 238, 239, 246
Obligatory Contour Principle, 17, 285, 296
Phase, 5, 6, 10, 12-15, 18, 20, 23-25, 27, 29, 31, 32, 36, 37, 46, 51-54, 56, 57, 59, 67, 68, 70, 73, 80, 82, 83, 85, 88, 91, 92, 103, 105, 107, 109, 111-113, 115, 118, 120, 130-135, 138-140, 145, 148, 149, 155, 161, 163, 174, 175, 177, 178, 185, 189, 195-198, 201, 209, 210, 212-216, 218, 219, 221-223, 233-239, 244-247, 251, 252, 254, 256, 259, 260, 263, 264, 267, 269, 274, 275, 283, 284, 287, 295, 299, 300, 310-312, 318, 320, 322-333, 335, 336, 338, 343-347, 356-359, 360, 361, 363-369, 373, 374, 385, 391, 397, 415, 422, 428, 429
  strong ~, 198, 199, 202, 204, 205, 209, 211, 213, 220, 227, 298, 301
  weak (defective) ~, 198-200, 202, 211, 216, 220, 226
  ~ extension/sliding, 37
  ~ head, 14-16, 22, 28, 29, 30, 33, 36, 39, 47, 48, 56, 72, 74, 105-107, 109, 110, 113, 115, 116, 118-121, 130, 131, 133, 135, 139-141, 147, 149-154, 160, 161, 163, 176, 181, 191, 195-197, 199, 201, 202, 209, 210-212, 214, 224, 226, 234-236, 238, 239, 244, 245, 247, 248, 255, 256, 258, 259, 263, 268, 273-275, 284, 323, 334, 345, 347, 348, 385, 389, 390, 415, 421, 422
  ~ Impenetrability Condition (PIC), 5, 15, 17, 28, 29, 35, 36, 37, 47, 48, 90, 106, 107, 108, 109, 114, 115, 118-120, 131, 141, 148, 154, 163, 178, 195, 196, 197-200, 201, 202, 205, 207, 233, 234, 235, 238, 244-246, 252, 254, 256-259, 261, 263-268, 275, 284, 344-346, 356, 358, 359, 367, 369, 391, 396
  ~ level, 5, 22, 23, 36, 51, 52, 106, 149, 158, 164, 175, 201, 235, 244, 344
  ~ periodicity, 68
φ-domain, 269, 274
Phrasal Sister Condition, 152, 153
Phrase
  φ ~, 36
  Minor ~, 295
  Major ~ (MaP), 17, 284-286, 288, 295-299, 301
  ~ level phonology, 259, 268
Position
  A ~, 116, 117, 173, 174, 178-182, 184-188, 190, 191, 247, 424
  A-bar (A’) ~, 116, 173, 174, 179, 180-184, 187, 188, 190, 191, 247, 274
Probe, 4, 6, 30, 32, 37, 72, 105, 108, 109, 115, 116, 120, 126, 174, 177, 178, 180, 186, 189, 195, 196, 199, 201, 205-207, 209-211, 213-216, 219, 220, 222-224, 226, 238, 245, 344, 390, 391, 421, 427
Projection, 4, 20, 23, 24, 47, 50, 68, 70, 86, 89, 108, 125, 126, 127, 129, 130, 139, 155-157, 160-164, 190, 206, 269, 290, 295, 296, 328, 331, 339, 344, 354, 355, 364, 372
Propositional, 19, 21, 35, 328, 332, 338
Raising, 5, 24, 31, 55, 141, 151, 153, 154, 164, 185, 189, 202, 217, 219, 221-223, 227, 233, 234, 238, 239, 246, 273, 301, 324, 330, 349, 363, 373, 397, 399, 407, 429
  Super ~, 184, 185
Reconstruction, 23, 35, 39, 47, 88, 198, 234, 240, 241, 248, 318
Root, 16, 17, 258, 259, 260, 261, 262-268, 275, 276, 315, 325, 328, 331
Root (non-embedded) domain, 14, 59, 111, 190, 401, 411, 418
Second-position (effects), 33, 385, 392, 399, 414, 415, 421, 422
Spell-Out (see also Transfer), 17, 35, 45, 48, 51-53, 56, 57, 68, 85, 144, 162, 175, 195, 199, 201, 210, 211-213, 227, 256, 273, 275, 284, 295-301
  Multiple Spell-Out, 37, 195, 256
Stress, 1, 9, 16, 17, 18, 36, 75, 256, 263, 264, 265, 267, 275, 284, 285-287, 289, 290-293, 296-298, 301-303
  Nuclear Sentence ~, 17, 284, 285, 302
  Phrasal ~, 17, 283, 285, 289, 296, 300-302
  Sentential ~, 268, 283, 286, 287, 302
Transfer, 5, 6, 14, 15, 17, 18, 21, 22, 27-30, 32, 35, 36, 37, 47, 51-53, 56, 59, 68, 90, 105-109, 115, 116, 118, 119, 130-136, 138-141, 144, 145, 147-151, 153, 154, 158, 160-163, 174-176, 185, 189, 191, 195, 199, 201, 203, 212-214, 216, 218, 220, 226, 235-238, 245-247, 254, 256, 273, 323, 333, 334
Valuation, 4, 6, 15, 21-23, 28-30, 32, 52, 105-108, 111, 120, 150, 174, 175, 200, 201, 206, 210, 213, 214, 218, 220, 226, 234-237, 245-247, 343, 345, 361, 363, 364, 368, 369, 373