Eurotyp: 1 Constituent Order in the Languages of Europe [Reprint 2010 ed.] 9783110812206, 9783110151527

 


English Pages 845 [848] Year 1997


Table of contents :
Contributors
Abbreviations
Introduction
Part I — Word order surveys
Word order in Celtic
An overview of the main word order characteristics of Romance
Word order in the Germanic languages
An overview of word order in Slavic languages
Basic characteristics of Modern Greek word order
Word order in European Uralic
Word order in Kartvelian languages
Word order in Daghestanian languages
Part II — Parameters of word order variation
Aspects of word order in the languages of Europe
Order in the noun phrase of the languages of Europe
Flexibility and consistency in word order patterns in the languages of Europe
The relative order of recipient and patient in the languages of Europe
Variation in major constituent order; a global and a European perspective
Word order variation in some European SVO languages: a parametric approach
Celtic word order: some theoretical issues
Word order variation in some SOV languages of Europe
Discourse-configurationality in the languages of Europe
Some issues in a performance theory of word order
Appendix – 12 word order variables in the languages of Europe
Index of authors
Index of languages
Index of subjects

Constituent Order in the Languages of Europe


Empirical Approaches to Language Typology EUROTYP 20-1

Editors Georg Bossong Bernard Comrie

Mouton de Gruyter Berlin · New York


edited by Anna Siewierska


1998

Mouton de Gruyter (formerly Mouton, The Hague) is a Division of Walter de Gruyter & Co., Berlin.

® Printed on acid-free paper which falls within the guidelines of the ANSI to ensure permanence and durability.

Library of Congress Cataloging-in-Publication-Data Constituent order in the languages of Europe / edited by Anna Siewierska. p. cm. — (Empirical approaches to language typology ; 20-1) "The present volume is one of a series of nine volumes in which the results of the European research project "Typology of Languages in Europe" (EUROTYP) are published" — General pref. Includes bibliographical references and index. ISBN 3-11-015152-9 (cloth ; alk. paper) 1. Europe — Languages — Word order. 2. Typology (Linguistics) I. Siewierska, Anna. II. Typology of Languages in Europe (Project) III. Series. P380.C56 1997 415-dc21 97-33231 CIP

Die Deutsche Bibliothek — Cataloging-in-Publication-Data Constituent order in the languages of Europe / ed. by Anna Siewierska. — Berlin ; New York : Mouton de Gruyter, 1998 (Empirical approaches to language typology ; 20 : EUROTYP ; 1) ISBN 3-11-015152-9

© Copyright 1997 by Walter de Gruyter & Co., D-10785 Berlin. All rights reserved, including those of translation into foreign languages. No part of this book may be reproduced or transmitted in any form or by any means, electronic or mechanical, including photocopy, recording or any information storage and retrieval system, without permission in writing from the publisher. Typesetting and printing: Arthur Collignon GmbH, Berlin. Binding: Lüderitz & Bauer, Berlin. Printed in Germany.

General preface

The present volume is one of a series of nine volumes in which the results of the European research project "Typology of Languages in Europe" (EUROTYP) are published. The initiative for a European project on language typology came from a proposal jointly submitted to the European Science Foundation (ESF) by Johannes Bechert (University of Bremen), Claude Buridant (University of Strasbourg), Martin Harris (University of Salford, now University of Manchester) and Paolo Ramat (University of Pavia). On the basis of this proposal and following consultations with six experts, the Standing Committee for the Humanities of the ESF decided to organize a workshop (Rome, January 1988), in which this idea was further explored and developed. The results of this workshop (published by Mouton, 1990) were sufficiently encouraging for the Standing Committee to appoint a preparatory committee and entrust it with the tasks of drawing up a preliminary proposal, of securing interest and participation from a sufficiently large number of scholars and of finding a suitable programme director. The project proposal formulated and sent out by Simon Dik (University of Amsterdam) as chair of this committee met with very supportive and enthusiastic reactions, so that the Standing Committee for the Humanities recommended the funding of a planning stage and the General Assembly of the ESF approved a year zero (1989) for an ESF Programme in Language Typology. During this planning phase all major decisions concerning the management structure and the organisation of the work were taken, i.e. the selection of a programme director, the selection of nine focal areas around which the research was to be organized, the selection of a theme coordinator for each theme and the selection of the advisory committee.
The first task of the programme director was to draw up a definitive project proposal, which was supplemented with individual proposals for each theme formulated by the theme coordinators, and this new proposal became the basis of a decision by the ESF to fund the Programme for a period of five years (1990-1994).

Language typology is the study of regularities, patterns and limits in crosslinguistic variation. The major goal of EUROTYP was to study the patterns and limits of variation in nine focal areas: pragmatic organization of discourse, constituent order, subordination and complementation, adverbial constructions, tense and aspect, noun phrase structure, clitics and word prosodic systems in the languages of Europe. The decision to restrict the investigation to the languages of Europe was imposed for purely practical and pragmatic reasons. In the course of the project an attempt was made, however, to make as much sense of this restriction as possible, by characterizing the specific features of European languages against the background of non-European languages and by identifying areal phenomena (Sprachbünde) within Europe. More specifically, the goals of the EUROTYP project included the following:

- to contribute to the analysis of the nine domains singled out as focal areas, to assess patterns and limits of cross-linguistic variation and to offer explanations of the patterns observed.
- to bring linguists from various European countries and from different schools or traditions of linguistics together within a major international project on language typology and in doing so create a new basis for future cooperative ventures within the field of linguistics. More than 100 linguists from more than 20 European countries and the United States participated in the project.
- to promote the field of language typology inside and outside of Europe. More specifically, an attempt was made to subject to typological analysis a large number of new aspects and domains of language which were uncharted territory before.
- to provide new insights into the specific properties of European languages and thus contribute to the characterization of Europe as a linguistic area (Sprachbund).
- to make a contribution to the methodology and the theoretical foundations of typology by developing new forms of cooperation and by assessing the role of inductive generalization and the role of theory construction in language typology.

We had a further, more ambitious goal, namely to make a contribution to linguistic theory by uncovering major patterns of variation across an important subset of languages, by providing a large testing ground for theoretical controversies and by further developing certain theories in connection with a variety of languages.
The results of our work are documented in the nine final volumes:

- Pragmatic Organization of Discourse in the Languages of Europe (edited by G. Bernini)
- Constituent Order in the Languages of Europe (edited by A. Siewierska)
- Subordination and Complementation in the Languages of Europe (edited by N. Vincent)
- Actance et Valence dans les langues de l'Europe (edited by J. Feuillet)
- Adverbial Constructions in the Languages of Europe (edited by J. van der Auwera)
- Tense and Aspect in the Languages of Europe (edited by Ö. Dahl)
- Noun Phrase Structure in the Languages of Europe (edited by F. Plank)
- Clitics in the Languages of Europe (edited by H. van Riemsdijk)
- Word Prosodic Systems in the Languages of Europe (edited by H. van der Hulst)

In addition, the EUROTYP Project led to a large number of related activities and publications, too numerous to be listed here. At the end of this preface, I would like to express my profound appreciation to all organizations and individuals who made this project possible. First and foremost, I must mention the European Science Foundation, who funded and supported the Programme. More specifically, I would like to express my appreciation to Christoph Mühlberg, Max Sparreboom and Genevieve Schauinger for their constant and efficient support, without which we would not have been able to concentrate on our work. I would, furthermore, like to thank my colleague and assistant, Martin Haspelmath, and indeed all the participants in the Programme for their dedication and hard work. I finally acknowledge with gratitude the crucial role played by Johannes Bechert and Simon Dik in getting this project off the ground. Their illness and untimely deaths deprived us all of two of the project's major instigators.

Berlin, September 1995

Ekkehard König, Programme Director

Preface

The present volume is the product of five years of collaborative research on the topic of constituent order carried out from the beginning of 1990 to the end of 1994 by the members of the Constituent Order group of the European Science Foundation Programme in Language Typology (EUROTYP). The group was formed by eleven linguists from nine countries, namely: Dik Bakker, Matthew Dryer, John Hawkins, Anders Holmberg, Katalin Kiss, Beatrice Primus, Jan Rijkhoff, Anna Siewierska, Maggie Tallerman, Yakov Testelec and Maria Vilkuna. These linguists were brought together by their common interest in the factors underlying constituent order variation rather than by an allegiance to a single theoretical orientation or research methodology. They include formalists and functionalists of various persuasions, experts on individual languages or groups of languages and language typologists. Given the diversity of backgrounds of the members of the group, instead of engaging in fruitless efforts to define a single line of research agreeable to all, from the outset we opted to pursue our own individual research interests while drawing on each other's expertise, exchanging ideas, sharing data and testing hypotheses. Though this volume does not reflect the full range of issues that were discussed by the group in the course of the project, it does present several different interpretations of the factors underlying constituent order variation and divergent conceptions of the role and nature of language typology. In conducting our research we have benefitted from the knowledge of many scholars, a large number of whom were also directly involved in the ESF project, but by no means all. Much of the language data for our research originated from a 26-page word order questionnaire compiled by the members of the group. We are particularly indebted to the linguists who found the time to fill in this rather extensive questionnaire, namely: Tor A. Åfarli, Jurij Anduganov, G. M.
Awbery, Emanuel Banfi, Giovanni M. G. Belluscio, Giuliano Bernini, Vit Bubenik, Ines Loi Corvetto, Helma Dik, Karen Ebert, Anna Gavarro, Inge Genee, Riho Grünthal, Martin Haspelmath, Toomas Help, Mateja Hocevar, Lisbeth Falster Jakobsen, Jan de Jong, Johannes Gisli Jonsson, Paula Kokkonen, Chryssoula Lascaratou, Ruta Marcinkeviciene, Belen Lopez Meirama, Juan Carlos Moreno Cabrera, Christian T. Petersen, Donall P. O Baoill, Eusebio Osa, Bernard Oyharcabal, Sirkka Saarinen, Tapani Salminen, Merja Salo, Pekka Sammallahti, Gerjan van Schaaik, Suzanne Schlyter, Irja Seurujärvi-Kari, H. A. Sigurðsson, Svillen Stanchev, Janig Stephens, Pirkko Suihkonen, Ingrid Thelin, Ludmila Uhlířová, Enric Vallduvi, Martina Vanhove, Bibinur Zaguljajeva, Tomaso Zorzutti and Bostjan Zupanicic. The individual members of the group also made use of other questionnaires geared to elicit more fine-tuned word order data of relevance to their individual research. The respondents to these questionnaires are gratefully acknowledged in the individual articles. A considerable amount of the data obtained from the various questionnaires is stored in computerized form in the Amsterdam Data Base, the development of which has been a major preoccupation of Dik Bakker. Though some of the data that we have amassed still needs to be processed, we hope to make all the information that the linguistic community has shared with us generally available, in computerized form, in the years to come.

Working with a group of highly motivated and dedicated scholars for the past five years has been a highly rewarding experience. I would like to take this opportunity to express my thanks to the members of the Constituent Order group for five years of stimulating discussions, cooperation, support and friendship. Two of our meetings were conducted jointly with the members of the theme group 'The pragmatic organization of discourse' and several sessions were enlivened by guests from other theme groups and linguists from outside the project, namely: Kersti Börjars, Bill Croft, David Gil, Chryssoula Lascaratou, Christian Lehmann, Johanna Nichols, Frans Plank and H. A. Sigurðsson. Their contributions to our discussions are gratefully acknowledged. I would also like to extend my thanks to the Scientific Advisory Committee of the ESF project headed by the late Simon Dik and, especially, to the project director Ekkehard König and his assistant Martin Haspelmath, who have been particularly generous with their help and advice both in professional and personal matters. Much of the pleasure that we have derived from the project was due to the organizational skills and understanding of the ESF personnel in Strasbourg, most notably Max Sparreboom and his secretary Genevieve Schauinger, whom I would also like to warmly thank.
Lancaster, 28th August 1995

Anna Siewierska

Table of contents

Contributors
Abbreviations
Anna Siewierska
  Introduction

Part I: Word order surveys

Maggie Tallerman
  Word order in Celtic
Alfredo R. Arnaiz
  An overview of the main word order characteristics of Romance
Anders Holmberg and Jan Rijkhoff
  Word order in the Germanic languages
Anna Siewierska and Ludmila Uhlířová
  An overview of word order in Slavic languages
Chryssoula Lascaratou
  Basic characteristics of Modern Greek word order
Maria Vilkuna
  Word order in European Uralic
Yakov G. Testelec
  Word order in Kartvelian languages
Yakov G. Testelec
  Word order in Daghestanian languages

Part II: Parameters of word order variation

Matthew S. Dryer
  Aspects of word order in the languages of Europe
Jan Rijkhoff
  Order in the noun phrase of the languages of Europe
Dik Bakker
  Flexibility and consistency in word order patterns in the languages of Europe
Beatrice Primus
  The relative order of recipient and patient in the languages of Europe
Anna Siewierska
  Variation in major constituent order; a global and a European perspective
Anders Holmberg
  Word order variation in some European SVO languages: a parametric approach
Maggie Tallerman
  Celtic word order: some theoretical issues
Yakov G. Testelec
  Word order variation in some SOV languages of Europe
Katalin É. Kiss
  Discourse-configurationality in the languages of Europe
John A. Hawkins
  Some issues in a performance theory of word order
Anna Siewierska, Jan Rijkhoff and Dik Bakker
  Appendix: 12 word order variables in the languages of Europe

Index of authors
Index of languages
Index of subjects

Contributors

Dr. Alfredo R. Arnaiz, Departamento de Humanidades, Pontificia Universidad Católica del Perú, Av. Universitaria cuadra 18 s/n, San Miguel, Lima, Peru
Dr. Dik Bakker, Dept of Computational Linguistics, University of Amsterdam, Spuistraat 134, 1012 VB Amsterdam, The Netherlands
Prof. Matthew Dryer, Dept of Linguistics, Faculty of Social Sciences, 685 Baldy Hall, North Campus, Buffalo, New York 14260, U.S.A.
Prof. John A. Hawkins, Dept of Linguistics, University of Southern California, Los Angeles, California 90089, U.S.A.
Prof. Anders Holmberg, Dept of Linguistics, University of Tromsø, 9037 Tromsø, Norway
Prof. Katalin É. Kiss, Linguistic Institute of the Hungarian Academy of Sciences, Szentháromság u. 3, H-1014 Budapest, Pf. 19, Hungary
Prof. Chryssoula Lascaratou, Dept. of English Studies, The University of Athens, University Campus Zografou, GR 157 84 Athens, Greece
Prof. Beatrice Primus, Institut für Deutsch als Fremdsprachenphilologie, Universität Heidelberg, Plöck 55, D-6900 Heidelberg, BRD
Dr. Jan Rijkhoff, Department of General Linguistics, The University of Amsterdam, Spuistraat 210, 1012 VT Amsterdam, The Netherlands
Prof. Anna Siewierska, Department of Linguistics and Modern English Language, Lancaster University, Lancaster LA1 4YT, Great Britain
Dr. Maggie Tallerman, School of English and Linguistics, University of Durham, Elvet Riverside 1, New Elvet, Durham DH1 3JT, Great Britain
Dr. Yakov Testelec, Institut Jazykoznanija AR, Sektor Jazyki Mira, Bol'shoj Kislovskij per., d. 1/12, 103 009 Moscow, Russia
Dr. Ludmila Uhlířová, The Prague Academy of Sciences, Institute of Czech Language, Valentinská 1, 116 46 Prague, The Czech Republic
Dr. Maria Vilkuna, Research Institute for the Languages of Finland, Sörnäisten rantatie 25, Fin-00500 Helsinki, Finland

Abbreviations

ABE     abessive case
ABL     ablative case
ABS     absolutive case
ACC     accusative
ADESS   adessive case
ADP     adposition
ADJ     adjective
ADJP    adjective phrase
AGR     agreement
AFF     affective
ALL     allative case
AOR     aorist
ART     article
ASP     aspect
AUX     auxiliary

CL      class marker
CLF     classifier
CLT     clitic
CMP     completive
CMPR    comparative
COM     comitative case
COMP    complementizer, complement
COND    conditional
CNJ     conjunction
CONV    converb
COP     copula
CORR    correlative
CP      complementizer phrase
CTR     contrast

DAT     dative
DEC     declarative
DEF     definite
DEM     demonstrative
DES     desiderative
DESCR   descriptive
DET     determiner
DIST    distal

ELA     elative case
EMPH    emphatic
ERG     ergative (case)

F       feminine gender
FIN     finite
FOC     focus
FUT     future

GEN     genitive case, possessed
GER     gerund

I       inflection
ILL     illative
IMP     imperative
IMPF    imperfective
IMPR    impersonal
IND     indicative
INDEF   indefinite
INESS   inessive
INF     infinitive
INT     intensifier
INST    instrumental
IP      inflection phrase
IPFV    imperfective
IPS     impersonal passive
IO      indirect object

LAT     lative
LOC     locative

M       masculine gender

N       noun or nominal
NEG     negative element
NFIN    nonfinite
NOM     nominative
NP      noun phrase
NT      neuter
NUM     numeral

O       object
OBL     oblique
OBJ     object
OBV     obviative
OPT     optative

P       patient
PART    participle
PASS    passive
PFV     perfective
PL      plural
POSS    possessive
POST    postposed
PNCT    punctual
PP      prepositional phrase
PRED    predicative
PREDP   predicative phrase
PREV    preverb
PREP    preposition
PRF     perfective
PRO     pronoun
PROG    progressive
PROX    proximate
PRS     present
PRTV    partitive
PST     past
PTL     particle

Q       question particle, question word
QUEST   question

R       recipient
REL     relative
RESTR   restrictive
RFL     reflexive

S       subject
SBJ     subject
SG      singular
SPEC    specifier
SUB     subordinator
SUBESS  subessive
SUBJ    subjunctive

T       tense
TNS     tense
TOP     topic
TRNSF   transformative

V       verb
VP      verb phrase
1       1st person
2       2nd person
3       3rd person

Anna Siewierska

Introduction

1. The structure of the volume

This volume has been compiled with two aims in mind. On the one hand, it is intended as a source of information on constituent order in the languages of Europe. On the other hand, it seeks to provide a variety of perspectives on the factors underlying word order variation.¹ Though de facto all the papers in the volume pursue both goals, Part One of the volume, entitled Word order surveys, is essentially descriptive, while Part Two, Parameters of variation, focuses on the analysis of aspects of word order variation.

1.1. Word order surveys

Part One features descriptive word order surveys of eight groups of European languages. The surveys focus on the order of the verb and its arguments and that of the noun and its modifiers, drawing particular attention to word order variation both within individual languages and among the languages of the relevant group. The described word order characteristics are richly illustrated with language data, wherever possible taken from the less commonly discussed languages of a particular group. For ease of reference all the surveys are structured around the following headings: 1. Introduction, 2. Inflectional and other functional categories, 3. The word order type, 4. Order in declarative clauses, 5. Subordinate clauses, 6. Other sentence types, 7. The noun phrase. The range of data discussed under each of these headings differs somewhat from survey to survey, depending on the word order properties of the languages involved and the availability of information. The language groups surveyed are Celtic, Romance, Germanic, Slavic, Modern Greek, Uralic, Kartvelian and Daghestanian. We had hoped to be able to present overviews of the word order of all the major branches of European languages, but this proved to be impossible. Contrary to what might be supposed, detailed information on constituent order is not readily available for a large number of the European languages. And needless to say, the language expertise of the members of the group, though wide, does not encompass all the sub-groups of the languages of Europe.

Five of the eight word order surveys are of the most commonly discussed branches of Indo-European languages. Consequently, much of the word order data contained in the overviews of Celtic, Germanic, Modern Greek, Romance and Slavic are well known, though not necessarily easily accessible. There are few publications which offer word order overviews of more than one language and, to the best of my knowledge, none which contain summaries of word order data of several groups of languages. The presentation of the word order characteristics of the above language groups in a single volume fills an evident gap in the linguistic literature and provides a convenient point of departure for a word order investigation of any of the individual languages, relevant branches of Indo-European or Indo-European languages in general.

In comparison to Indo-European languages, the Uralic, Kartvelian and Daghestanian languages have received little attention in the linguistic literature. Of the Uralic languages only two, Hungarian and Finnish, are likely to be familiar to the general linguistic public. Maria Vilkuna's overview of the word order characteristics of Uralic is therefore of special interest. Much of the data that she presents is new, the result of her own research with informants and the analyses of corpora. The Uralic languages are spoken in the north-eastern fringes of Europe and manifest word order properties both of the SVO languages dominant in central and western Europe and of the SOV type prevalent in central Eurasia.
As such, they constitute a rich source of information on the spread of linguistic features via areal contact and in particular on word order change, both distant and ongoing. Vilkuna offers an illuminating discussion of the SVO/SOV alternation in several of the Uralic languages including the poorly studied Erzya (Mordvin), Komi Zyrian, Mari and Udmurt. The statistical data that she cites reveal significant differences in the distribution of word order patterns in both transitive and intransitive clauses across the European Uralic languages, which are subsequently related to differences in the pragmatic structure of the clause in the respective languages. Also deserving mention is Vilkuna's discussion of postmodification at the level of the NP, a phenomenon which, though not typical of the Uralic languages, in the light of her data emerges as far less alien to them than is generally thought.

The indigenous languages of the Caucasus have been well investigated from the point of view of their morphological properties, especially their morphological ergativity, but not in regard to their word order characteristics. Little is known about the word order of many of these languages apart from the fact that they tend to be head-final. In this light, the word order data presented by Yakov Testelec in his overviews of the Kartvelian and Daghestanian languages are all the more impressive. Both overviews are the product of intensive consultations with specialists of Kartvelian and Daghestanian languages and several field trips undertaken by the author. Most of the investigated languages are shown to exhibit considerable word order variation. Particularly worthy of note are the numerous departures from the head-final norm which Yakov Testelec elaborates on in the second part of the volume.

1.2. Parameters of variation

The explanations offered for the patterns of constituent order variation observed include semiotic and semantic principles, processing principles, discourse factors as well as principles and parameters of Universal Grammar. The ten papers comprising part two of this volume deal primarily with the syntactico-semantic determinants of constituent order rather than with the discourse-pragmatic ones. This does not mean, however, that the members of the group have paid no heed to the effect of the discourse-pragmatic statuses of constituents on linear order. On the contrary, several working papers of the group have been devoted to this topic. These do not feature in this volume since they have been published elsewhere. Nonetheless, some indication of the group's discussions of the effect of discourse-pragmatic factors on linearization is given in the paper by John Hawkins, whose performance theory of order and constituency has been a major source of inspiration, and in the contribution of Katalin Kiss, who presents a syntactico-semantic analysis of the discourse-pragmatic notions of topic and focus.

Of the ten papers, five, those of Dik Bakker, Matthew Dryer, Jan Rijkhoff, Beatrice Primus and Anna Siewierska, are firmly couched in the Greenbergian tradition in that they consider the word order characteristics of European languages from the point of view of the distribution of the well known Greenbergian word order variables, i.e. the order of the verb and its arguments and/or of the noun and its modifiers. Dryer addresses the issue of the factors underlying the distribution of noun/modifier and modifier/noun order among the VO languages of Europe in the context of the global distribution of NP-internal order in VO and OV languages. Rijkhoff considers the ordering possibilities of the modifiers of the noun both relative to the noun and relative to each other in simplex and complex NPs. The basic order patterns and the occurring alternatives are analyzed in relation to three principles of order, namely: the Principle of Domain Integrity (PDI), which specifies that discontinuity is a marked phenomenon; the Principle of Head Proximity (PHP), which states that heads tend to be placed at the closest possible distance from each other; and the Principle of Scope (PoS), which specifies an iconic relationship between the positioning of a modifier relative to the head in semantic and surface structure. Bakker, on the basis of the word order data contained in the appendix to this volume, offers a typology of word order of the languages of Europe using three parameters: flexibility, consistency and consequence. Flexibility is measured in terms of the number of word order variables in each language that have both head/modifier and modifier/head order, consistency in terms of the number of variables that display a dominant modifier/head or head/modifier order, and consequence in terms of the number of alternatives to the basic orders that comply with the dominant modifier/head or head/modifier patterns. Primus analyzes the placement of the patient and recipient in ditransitive clauses, taking as her point of departure the position of the two on the thematic and case hierarchies in (1) and (2) respectively.

(1) Proto-Agent controller causer experiencer possessor


genitive > relative clause > numeral > demonstrative

The adjective can be placed both pre- and postnominally in just over half of the languages of Europe, the genitive in a little over a third, the relative clause in about a quarter, the numeral in about a fifth and the demonstrative in only 13%. The distribution of alternative orders of the demonstrative and numeral in the VO languages of Europe has no bearing on Dryer's hierarchy, but that of the adjective, genitive and relative clause does. Dryer's hypothesis of relative geographical and chronological distance from the Eurasian OV type would lead us to expect that the further removed a language is from this Eurasian OV type the less relics of this type and more deviation in the direction of NAdj and NG order it should display. This is indeed the case. If we look at the possibilities of alternative adjective order among the VO languages of Europe we find that the languages on the far left of Dryer's hierarchy, the Celtic, are strongly NAdj and those on the far right, the Germanic, Baltic and Finnic, are strongly AdjN, while those in the middle more readily allow departures from their basic NAdj (Albanian, Maltese, Romance) or AdjN (Greek and Slavic) order. In Celtic AdjN order is only found with a very small class of adjectives which form semi-compounds with the noun. In Romance, though only some adjectives favour AdjN as opposed to NAdj order, AdjN order can be used with most adjectives for purposes of highlighting. The same more or less holds for NAdj order in Greek and Slavic. In Germanic and Baltic NAdj order is ungrammatical and in Finnic it is found only in Finnish, Mordvin and Vepsian, though arguably in such cases the adjective is actually in apposition to the noun. Thus the possibilities of the use of AdjN order decrease the further the language is geographically and arguably chronologically removed from the Eurasian OV type. The alternative orders of the genitive pattern in a similar


way. The Celtic languages, Albanian and the Romance languages permit no alternatives to the basic NG order, Greek and Slavic allow GN under various circumstances, the Germanic NG languages all have NG and the Germanic and Baltic GN languages have NG and the majority of the Finnic languages are strictly GN. As for the relative clause, only the Finnic languages have a regular RelN alternative to the basic NRel, a clear relic of their relatively recent OV past. Turning to the issue of word order flexibility, Siewierska measures the degree of word order flexibility in a sample of 48 European languages on the basis of the number and type of word order variants that a language displays, where by a word order variant is meant any permutation of a nominal subject, object and finite verb, found in main, positive, declarative clauses featuring no morphological marking other than that present in the basic order and no disjunctures. If one considers the number of occurring word order variants relative to the logically possible ones, the Northwest Caucasian languages emerge as manifesting the highest average level of flexibility 90%, followed by the Uralic 84%, Nakh-Daghestanian 80%, Kartvelian 63%, then the Indo-European 56% and finally the Altaic 33%. Among the Indo-European languages the most flexible are the Balto-Slavic (100%) all of which display maximum variation, i. e. all six of the logically possible permutations of the subject, object and verb, and also Greek. The least flexible are the Celtic which have no word order variants, in the sense of the term used by Siewierska. Of the remaining branches of Indo-European, the Germanic score the lowest with an average level of flexibility of 37%, followed by the Iranian 45%, then the Romance 52% and Albanian and Armenian 60%. 
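The flexibility measure just described can be made concrete with a small sketch. The function name, the order labels and, in particular, the decision to leave the basic order out of the count (so that a language with only its basic order scores 0% and one with all six permutations scores 100%) are assumptions made here for illustration; the chapter itself may operationalize the measure differently.

```python
from itertools import permutations

# The six logically possible orders of subject (S), object (O) and verb (V).
ALL_ORDERS = {"".join(p) for p in permutations("SOV")}


def flexibility(attested, basic):
    """Share of non-basic orders that a language attests (0.0 to 1.0).

    Assumption: a "variant" is any attested order other than the basic one,
    so the denominator is the five logically possible alternatives.
    """
    attested = set(attested)
    if basic not in ALL_ORDERS or not attested <= ALL_ORDERS:
        raise ValueError("orders must be permutations of S, O and V")
    variants = attested - {basic}
    return len(variants) / (len(ALL_ORDERS) - 1)


# Hypothetical inputs: one language attesting every permutation,
# and one rigidly verb-initial language with no variants at all.
print(flexibility(ALL_ORDERS, "SVO"))  # 1.0
print(flexibility({"VSO"}, "VSO"))     # 0.0
```

A group average such as the figures quoted for the individual families would then be the mean of these per-language scores.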
Thus the degree of word order flexibility increases from the Celtic languages in the west to the Balto-Slavic and Uralic languages in the east and the Northwest Caucasian, Nakh-Daghestanian and Kartvelian in the south-east. This west-to-east increase in word order flexibility is actually evinced by the European VO languages, since the OV languages, with the exception of Basque, Latin, Gothic and Armenian, are all spoken in the south-east. The low level of word order flexibility among the languages in western as compared to eastern Europe is typically attributed to the absence of affixal case marking and also restricted or no agreement marking in the former as opposed to the latter. However, Siewierska's analysis of a global sample of 171 languages suggests that neither the presence of agreement nor of case marking is a sufficient condition for flexible order, nor does rigid order entail the absence of either form of morphological marking. 46% of the languages with case marking in Siewierska's global sample and also 46% of those with agreement marking have either no or only a single word order variant. And conversely, of the languages with no or only a single word order variant, 43% have case marking and 56% have agreement marking. Therefore the presence vs absence of morphological marking is unlikely to be the sole factor determining the difference in word order flexibility among the western as compared to the eastern VO languages of Europe. A more promising source of this difference is suggested by Kiss's discourse configurational typology. Kiss's analysis of topic and (identificational) focus encoding among 35 European languages reveals that 69% (24/35) of the languages of Europe are discourse configurational in that they are topic prominent and have a structural position for the identificational focus. And potentially only three languages, Danish, Norwegian and Swedish, display no discourse configurationality, i. e. subject prominence and no structural focus position. Of the remaining eight languages in her sample, five are topic prominent but have no focus position (Dutch, English, German, Icelandic and Bezhta) and three (Welsh, Irish and Scottish Gaelic) are neither topic- nor subject-prominent but do have a focus position. In all, ten of the eleven languages which are either not discourse configurational or only encode the topic or the identificational focus belong to the least flexible Celtic and Germanic branches of Indo-European. Nonetheless, on closer inspection there does not appear to be a necessary relationship between Kiss's discourse configurationality and high word order flexibility, at least in terms of Siewierska's measure of flexibility. Most of the discourse configurational languages in Kiss's sample do indeed exhibit all the possible permutations of the subject, object and verb, for instance, Basque, Georgian, Hungarian, Estonian, Finnish, Greek, Polish, Slovak and Russian. However, among her discourse configurational languages we also find Breton, Italian, Spanish, Turkish, Laz and Lezgian, which display considerably less word order variation.
While structural encoding of the topic coupled with the structural encoding of the identificational focus undoubtedly entails some degree of word order variation in a language, one may hypothesize that the high degree of word order flexibility among most of the discourse configurational languages in Kiss's sample is actually due to the existence of a structural position for the INFORMATION focus or at least the existence of a favoured position for the information focus. After all, contexts in which exhaustive interpretations are required are rare while arguably every utterance has an information focus. At present the nature of the relationship between a structural position for the identificational focus and one for the information focus is not clear. In some of the languages that Kiss discusses the two coincide (e. g. Basque, Hungarian) but in others they do not (e. g. Polish, Welsh). And presumably a language may express identificational focus purely phonologically while nonetheless favouring a certain clausal position for the information focus.


The fact that neither morphological marking nor, most probably, discourse configurationality fully accounts for the difference in word order flexibility among the languages of Europe has led Holmberg to develop his own typology in terms of the more abstract and theory-specific parameters enumerated above. Whether the differences in the word order flexibility of at least the SVO languages of Europe can be accounted for by means of Holmberg's typology remains to be determined. The four parameters established by Holmberg successfully distinguish the Germanic SVO languages from each other, the Slavic from the SVO Uralic, and among the SVO Uralic, Finnish and Northern Saami from, for example, Estonian. But, so far, the Romance languages have not been considered, nor Greek and Albanian. It is also not clear how, in terms of Holmberg's typology, one could distinguish, for example, among SVO languages manifesting a single but distinct word order variant. SVO languages in which each of the five OSV, OVS, VSO, VOS and SOV orders occurs as the sole alternative to the basic SVO order are all attested in Siewierska's sample, though admittedly not all in Europe. In terms of the flexibility in the ordering of the noun and its modifiers, the European languages are more uniform than in relation to the permutations of the subject, object and verb that they permit. Since currently we have no comprehensive information on all the possible linearizations of the constituents of the NP among the languages of Europe, we must content ourselves with a measure of flexibility based on the ordering of noun/modifier pairs. The appendix to this volume contains information on the basic and alternative orders of: the definite and indefinite article, the demonstrative, numeral, nominal and pronominal possessor, the adjective and the relative clause.
In view of the fact that not all languages have articles, and that the ordering of pronominal possessors poses several problems for analysis, let us restrict our attention to the remaining five modifiers. In his contribution to this volume Bakker measures word order flexibility on the basis of the number of noun/modifier pairs in a language which exhibit both pre- and post-head order irrespective of whether the pre- and post-head placement involves the same or different subcategories of a given modifier. In terms of such a measure of flexibility, the indigenous languages of the Caucasus, Indo-European and Uralic languages display very similar average levels of flexibility 34%, 32% and 30% respectively. The Altaic languages trail far behind with an average of only 3%. Among the Indo-European languages the highest average flexibility is manifested by the Hellenic branch (60%), then the Armenian (50%), Balto-Slavic (42%), Celtic (33%), Romance (28%), Germanic (25%), Albanian (20%) and Iranian (16%). As one would predict, word order flexibility in the NP is considerably lower than at the clause level in all the phyla and branches, with one exception, i. e. of the


Celtic languages. The relatively high level of flexibility of the Celtic languages at the NP level, is, however, not a direct reflection of variation in the ordering of certain modifiers, but rather due to the AdjN order of the very restricted class of adjectives mentioned earlier above and the circumlocution of numerals above ten. Unlike at the clause level, there is no clear increase in flexibility from west to east. The Balto-Slavic languages are indeed notably more flexible than the Celtic, Germanic and Romance, but the Uralic, as discussed by Vilkuna, exhibit relatively little flexibility in the ordering of nominal modifiers; the only modifier which regularly manifests alternative positioning relative to the noun is the relative clause. In relation to the OV/VO typology, the average level of flexibility in the ordering of the noun and its modifiers is slightly higher in the VO than in the OV languages of Europe. Only three of the VO languages (Saami, Assyrian and potentially Old Prussian) show no flexibility in the positioning of any of the nominal modifiers, as compared to 29 OV languages which are in the main Turkic and Nakh-Daghestanian. Using a modifier/head vs head/modifier typology rather than the OV/VO one based on the number of modifiers in a language with dominant modifier/head and head/modifier order respectively, Bakker observes that the languages which exhibit a high degree of consistent modifier/head or head/modifier order in their basic word order patterns tend to have less flexible word order than those that are less consistently modifier/ head or head/modifier. At the NP level this tendency is most evident in the modifier/head languages. Of the 41 language which have fully consistent modifier/head basic order at the level of the NP, 63% (26) display no flexibility in the ordering of the five noun/modifier pairs. Recall that there are no languages in Europe which have consistent dominant head/modifier order for all the five nominal modifiers. 
However, the most consistent ones, i. e. the Celtic (which have four posthead modifiers) and Assyrian, Maltese, Albanian and the Romance languages (which have three postnominal modifiers) manifest no (Assyrian) or low levels of flexibility.

2.2. Europe vs. the rest of the world

Having briefly reviewed the distribution of certain word order properties within Europe, we may now turn to the issue of how the word order characteristics of the languages of Europe relate to those found in other areas of the world. If we consider this issue from the point of view of the principles that have been formulated to account for linearization, then there is no reason to expect word order in Europe to be governed by different principles from those


found to be operative elsewhere. And indeed while there is considerable controversy in relation to the nature of the principles that underlie linearization and how they should be captured in a grammar, none of the word order principles discussed in this volume are specific to European languages. For example, Hawkins's EIC hypothesis which predicts a tendency both in grammar and performance for short > long order in VO languages and long > short order in OV, though to date most extensively tested on European languages, finds strong support in the global distribution of basic orders and more detailed analyses of word order variation in languages such as Korean and Japanese. To give another example, Primus's analysis of the basic order of the verb and its arguments in terms of the thematic and case hierarchies provides an account not only of the lack of consistency in the ordering of the patient and recipient in ditransitive clauses in European languages, but also globally. Moreover, it also captures the positional tendencies for the placement of the agent relative to the patient and recipient. Primus argues that since proto-agents and nominative arguments are ranked higher than recipients and patients on both the thematic and case hierarchies given earlier in (1) and (2), the two hierarchies lead us to expect that proto-agents should not be placed in between proto-patients and proto-recipients in the basic order. Thus, using Greenberg's S, O, V typology and omitting the statistically marginal OSV and OVS languages, the hierarchies exclude as possible basic orders: RSVO, RSOV, RVSO, RVOS, VRSO, VROS and VOSR. As shown by Primus, this is borne out not only by the basic orders attested in her European sample of 54 languages, but also by those in Blansitt's (1973) global sample of 107 languages. And significantly all the remaining nine possibilities (SVOR, SVRO, SRVO, SROV, SORV, SOVR, VSOR, VSRO, VORS) are attested. 
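The count of seven excluded and nine remaining orders can be checked mechanically. The sketch below is only a bookkeeping aid, not a rendering of Primus's analysis: it inserts the recipient (R) at every position in the four basic orders that remain once OSV and OVS are set aside, and confirms that the two lists quoted above partition the resulting sixteen candidates. The variable and function names are illustrative.

```python
# Basic S/O/V orders left after omitting the marginal OSV and OVS types.
BASE_ORDERS = ["SVO", "SOV", "VSO", "VOS"]

def insert_recipient(order):
    """Return every order obtained by placing R at one position in `order`."""
    return [order[:i] + "R" + order[i:] for i in range(len(order) + 1)]

candidates = {o for base in BASE_ORDERS for o in insert_recipient(base)}

# The two lists as given in the text.
excluded = {"RSVO", "RSOV", "RVSO", "RVOS", "VRSO", "VROS", "VOSR"}
remaining = {"SVOR", "SVRO", "SRVO", "SROV", "SORV",
             "SOVR", "VSOR", "VSRO", "VORS"}

print(len(candidates))                      # 16
print(excluded | remaining == candidates)   # True
print(len(excluded & remaining))            # 0
```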
Yet another set of word order principles shown to be equally valid for European and non-European languages are Rijkhoff's Principle of Domain Integrity, Principle of Head Proximity and Principle of Scope which account for the ordering of the modifiers of the noun not only relative to the noun but also relative to each other. For simple NPs (i. e. consisting of a single NP) of the 24 logically possible linearizations of the demonstrative, numeral, adjective and noun, the three principles define the eight word order patterns in (7) as potential basic orders.

(7) a. Dem Num Adj N
       Dem Num N Adj
       Dem N Adj Num
       Num N Adj Dem

    b. Adj N Num Dem
       Num Adj N Dem
       Dem Adj N Num
       N Adj Num Dem

Rijkhoff's investigation reveals that both in Europe and globally all the basic word order patterns conform to his three word order principles.
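The reduction from 24 logically possible linearizations to eight can be illustrated with a short enumeration. The nesting condition used below (the adjective adjacent to the noun, the numeral adjacent to that pair, the demonstrative outermost) is a simplification adopted here to stand in for the combined effect of the three principles; it is not Rijkhoff's own formulation, and the function names are hypothetical.

```python
from itertools import permutations

ELEMENTS = ("Dem", "Num", "Adj", "N")

def contiguous(order, members):
    """True if the given members occupy adjacent positions in the order."""
    positions = sorted(order.index(m) for m in members)
    return positions[-1] - positions[0] == len(members) - 1

def is_nested(order):
    # Adj must be adjacent to N, and Num adjacent to the Adj+N block;
    # Dem then necessarily sits at the outer edge.
    return (contiguous(order, ("Adj", "N"))
            and contiguous(order, ("Num", "Adj", "N")))

all_orders = list(permutations(ELEMENTS))
licensed = [" ".join(o) for o in all_orders if is_nested(o)]

print(len(all_orders))   # 24
print(len(licensed))     # 8
for pattern in licensed:
    print(pattern)
```

Under this assumption the eight surviving orders are exactly the patterns listed in (7).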


While the factors underlying word order in Europe are unlikely to diverge from those found elsewhere, we may expect there to be some differences between Europe and other areas in regard to the range of linearization patterns found, their distribution and frequency of occurrence. Most trivially there are gaps in the basic orders manifested in Europe relative to those encountered cross-linguistically. For example, there are no languages in Europe which favour the placement of the object before the subject, or in ditransitive clauses position the nominal recipient on the opposite side of the verb from the patient in the basic order, i. e. no SRVO or SOVR languages. However, since OS languages and also SRVO and SOVR languages are cross-linguistically rare, the absence of such basic orders can hardly be seen as a feature peculiar to European languages. There are also gaps in the basic orders of the noun and its modifiers, but these too coincide with the cross-linguistically uncommon orders. For instance, Rijkhoff observes that of the eight basic orders of the demonstrative, numeral, adjective and noun specified in (7) the four in (7b) do not occur in Europe. But of these four patterns only NAdjNumDem is attested in more than one language (Selepet and Yoruba) in his global sample. Turning from the word order patterns that do not occur in Europe to those that do, the only potential basic order characteristic specific to European languages that I am aware of is that identified by Rijkhoff, namely basic DemNumNAdj order. This order occurs in the Romance languages, Albanian, Maltese, Assyrian (all VO) and also in the OV Abkhaz. It is the second most common order in Europe, after DemNumAdjN, which is the dominant order both in Europe and globally. Rijkhoff's investigation suggests that DemNumNAdj order is not attested in any other area of the world. Whether this is indeed so requires verification by means of a more extensive sample.
The other basic order characteristic associated with European languages, though, as shown by Dryer, actually characteristic of Eurasia, is the previously mentioned prenominal placement of adjectives and relative clauses in OV languages. Though OV&AdjN and OV&RelN orders are attested as basic orders in all the six macro-areas distinguished by Dryer, i. e. Africa, Eurasia, South-East Asia & Oceania, Australia & New Guinea, North America and South America, only in Eurasia is this the dominant basic order. Moreover, as shown in the Appendix to this volume, of the eleven OV languages in Europe which display NAdj order, atypical for Eurasia, nine have the more common AdjN as an alternative. And of the eight European OV languages with basic NRel order, five also display RelN, the order more characteristic of Eurasia. The atypicality of postposed modifiers in European OV languages is underscored by Testelec's claim that in most of the OV languages which allow prenominal adjectives, relative clauses and also nominal possessors to be postposed (virtually all of which are Daghestanian languages), the head-initial orders differ from the head-final in terms of constituent structure. Testelec argues that in the Daghestanian languages the constructions with postposed nominal modifiers involve two NPs in an appositional relation, with the second NP having a modifier as head or a deleted head. Furthermore, in the case of binary branching structures consisting of two NPs as in possessive constructions corresponding to 'the house of the boy's father' or 'the horse of the boy who came from the city yesterday', a modifier can be postposed only if its own modifier is not postposed. In other words double postposing is ungrammatical. This, however, holds specifically for NP categories, not for postposing in general. For instance, in doubly embedded infinitival constructions double head-initial order is grammatical. At the level of the clause, somewhat unusual from the cross-linguistic perspective is the rarity in Europe and also Eurasia of languages with dominant verb-initial order. Though VI order is far less common cross-linguistically than SOV or SVO, the only other of the six macro-areas distinguished by Dryer that features fewer VI languages than Eurasia is Australia & New Guinea. The typological status of the VI languages of Europe has elicited much comment, most recently, especially in the GB literature. Since none of the Celtic languages actually exhibit verb-initial order in all clause types and some only rarely in main clauses, within GB theory they are considered to be atypical of VI languages. This view is strongly contested in Tallerman's paper which questions not only the current GB diagnostics of true verb-initial languages but also the notion of a single V-initial type and the assumption of structural homogeneity which such a notion suggests. Tallerman argues that the three structural properties taken to be indicative of true verb-initial languages by Ouhalla (1991), i. e.
the placement of tense affixes further away from the verb stem than subject agreement affixes, the presence of inflected infinitives and the existence of alternative SVO order are not specific to verb-initial languages cross-linguistically and even if they were, do not identify the Celtic languages as a cross-linguistic oddity. Since the Celtic languages actually manifest fusional rather than agglutinative morphology, affix order is not at issue, and they do in fact have inflected infinitives and surface SVO order, albeit morphologically marked. Tallerman is equally sceptical of other potential diagnostics of true VSO-hood such as the existence of inflected prepositions proposed by Kayne (1994). And indeed Gensler (1993) shows that there are no conjugated prepositions in, for instance, Hawaiian or Maasai; Squamish and Chumash have no category of adpositions at all; and Yagua has inflected postpositions rather than prepositions. Moreover, there are various SVO languages with inflected prepositions (e. g. Gbeya, Lango, Nkore-Kigaare). Though the Celtic languages do display several structural properties which appear to co-occur only in the Hamito-Semitic languages of Africa and are potentially indicative of substratal or areal influences (see Gensler 1993), the only truly unusual word order property found in Celtic is the clause-final placement of pronominal objects. As captured in Hawkins's EIC, short > long order is the unquestionable norm in VO languages, yet in Irish and to a lesser extent Scottish Gaelic pronominal objects follow even nonvalency adverbials. Also cross-linguistically unusual are the verb-second characteristics in main clauses of the Germanic languages and some Rhaeto-Romance dialects which have received so much attention, again, especially in the GB literature. I know of no language outside Europe that displays comparably strong V2 properties. Whether the virtual lack of subject-prominence in Europe, in the sense of Kiss, a feature indirectly associated with order, is another potentially distinctive, if not exclusively European, characteristic remains to be determined. Kiss (1995) notes that topic- as opposed to subject-prominent languages are attested in Eurasia (e. g. Nepali, Hindi, Korean), Africa (e. g. Somali, Tangale, Aghem, Yoruba), the Americas (e. g. Haida, the Mayan languages, Quechua) and Austronesia (e. g. Ilonggo), but to date the only area which has been investigated in detail with respect to this parameter is Europe. With respect to word order flexibility the position of the languages of Europe vis-à-vis the rest of the world is rather difficult to assess due to lack of comprehensive data on word order in most areas of the world.
If Bakker's observation in regard to the existence of an inverse relationship between flexibility and consistency were to hold cross-linguistically, then given the unusually high level of consistency in the modifier/head ordering of nominal modifiers in the OV languages of Europe noted by Dryer, we would expect more flexibility in the ordering of the noun/modifier pairs in the OV languages in other areas of the world than that found in Europe. Whether this is indeed so will, however, prove difficult to test, since as Rijkhoff shows, many languages have no category of numerals or adjectives. For example, of the 45 languages in Rijkhoff's global sample, less than half (22) evince the two categories. Moreover, while in Europe the norm is for the noun and its modifiers to form an integral NP, in other areas of the world the relation of apposition, if not preferred, is much more common. This leads Rijkhoff to suggest that we may well find more variation in the sequencing of demonstratives, numerals, adjectives, genitives and relative clauses outside Europe than in Europe, but that this variation may not be at the NP level. A comparison of the flexibility in the ordering of the subject, object and verb in Europe with that found in other parts of the world is in turn impeded by the relative infrequency in many languages of clauses with more than one


nominal argument. In view of this, it is perhaps not surprising that Siewierska's investigation of word order flexibility reveals that Europe evinces higher levels of word order variation in terms of the number of permutations of S, O and V than that manifested globally. Of the 48 languages in her European sample, nearly half (48%) exhibit all six possible permutations of the subject, object and verb as compared to only 14% in her global sample of 171 languages. And only five European languages (10%) have no word order variants, whereas the corresponding figure for the global sample is 22%. Of the three dominant word order types, in Europe SVO languages display the highest overall level of variation (78%) followed by SOV (64%), while the VSO have no word order variants. Globally, on the other hand, there are no significant differences in the levels of word order variation among the three dominant word order types, but crucially VSO languages display the highest level of variation, i. e. 37% as compared to 30% for SVO and 27% for SOV. If we consider the global distribution of word order flexibility relative to the six macro-areas distinguished by Dryer, the only macro-areas which approach Eurasia (containing the European languages) in regard to word order flexibility are Australia & New Guinea and North America, both of which, however, possess on average one word order variant less than Eurasia. The macro-areas which most radically diverge from Eurasia in regard to word order flexibility are Africa and South-East Asia & Oceania; only 3% of the African languages in the global sample and 4% of those from South-East Asia & Oceania display highly flexible word order and 49% and 30% respectively manifest rigid order. These two macro-areas also exhibit a markedly different distribution of word order flexibility from Eurasia relative to word order type.
Whereas in Eurasia the SVO languages have the highest average level of word order flexibility and the VI the lowest, in Africa and South-East Asia & Oceania the reverse is the case. In the remaining three areas, North America, South America and Australia & New Guinea, as in Eurasia, the highest average levels of word order flexibility are exhibited by the SVO languages, but, unlike in Eurasia, the VI are just as flexible or more flexible than the SOV. Significantly, however, the distribution of word order flexibility relative to word order type differs not only across the macro-areas, but also within the macro-areas, just as in Europe. This may be viewed as confirming Nichols's (1992) findings, based on the cross-linguistic distribution of basic order, as to the strong, small-scale areality of word order characteristics.

3. Concluding remarks

The above comments on the word order properties of the languages of Europe do not pretend to do justice to the research results contained in this volume.


In concentrating on the overall patterns of distribution of a few well-known typological word order features I have sought to provide a convenient background for the discussion of the many detailed differences in constituent order between various languages and groups of languages that are elaborated in the respective articles, the actual analyses of the word order data that are proposed and the important theoretical issues that several of these analyses raise. I refer the reader to the articles in this volume with the belief that they present a truly unique collection of word order studies both in regard to the range of data considered and the diversity of theoretical standpoints adopted.

Notes

1. In this volume the terms 'constituent order' and 'word order' are used as synonyms.

References

Dryer, Matthew 1993 "The Greenbergian word order correlations", Language 63: 81–138.
Gensler, Orin David 1993 A typological evaluation of Celtic/Hamito-Semitic syntactic parallels. Ph.D. diss., University of California at Berkeley. (To appear in revised version, Oxford University Press.)
Hawkins, John A. 1991 "A parsing theory of word order universals", Linguistic Inquiry 21: 223–261.
Hawkins, John A. 1994 A performance theory of order and constituency. Cambridge: Cambridge University Press.
Kayne, Richard 1994 The antisymmetry of syntax. Cambridge, MA: MIT Press.
Kiss, Katalin 1995 "Introduction", in: Katalin E. Kiss (ed.), Discourse configurational languages. New York/Oxford: Oxford University Press, 3–27.
Nichols, Johanna 1992 Language diversity in space and time. Chicago: The University of Chicago Press.
Ouhalla, Jamal 1991 Functional categories and parametric variation. London: Routledge.

Part I

Word order surveys

Maggie Tallerman

Word order in Celtic

1. Introduction

More on geographical than linguistic grounds, a distinction is often made between Insular Celtic and Continental Celtic. Continental Celtic is a cover term for the collection of Celtic dialects spoken in continental Europe before 500 AD, and about which relatively little is known. The languages discussed in this paper form the branch known as Insular Celtic, that is, the Celtic languages of the British Isles and Brittany, four of which survive into the twenty-first century. Insular Celtic in turn is regarded as having two branches: P-Celtic (or Brythonic, Brittonic) consisting of Welsh, Breton and the extinct language Cornish; and Q-Celtic (or Goidelic, Gaelic) consisting of Irish, Scottish Gaelic and the extinct Manx. There are around half a million first language speakers of Welsh and Breton, but only around 80,000 speakers of Scottish Gaelic, and somewhere between 25,000 and 70,000 speakers of Irish. Cornish died out as a first language in the eighteenth century, and Manx became extinct in the 1970s.

2. Inflection and other functional categories

In general the Celtic languages do not exhibit fully productive morphological case. The P-Celtic languages have no case distinctions, whilst the Q-Celtic languages display distinct forms for genitive, but do not have a nominative/accusative distinction in lexical nouns. Subject-verb agreement in Celtic follows an exceptional pattern amongst European languages: verbs do not agree in number with a full lexical subject, so a plural subject co-occurs with a verb form identical to the third person singular. There are limited forms of agreement between a noun and an attributive or predicative adjective, both in terms of number and gender (confined to masculine/feminine in all the languages).


The Celtic languages are strongly head-initial, with heads typically preceding both their complements and optional modifiers. Adpositions are in all cases prepositions. The members of the Celtic family all display the phenomenon known as pro-drop. Not only are there null subjects, but in fact all pronominal arguments may (or must) be null in the context of agreement morphology: this includes the objects of inflected prepositions (a feature of all the languages) and the objects of non-finite verbs, which have agreement proclitics. The remaining distinctive property of the Celtic languages is the feature known as initial consonantal mutation: sets of morphophonological changes which occur in a variety of lexical, morphosyntactic and syntactic contexts. In what follows, mutations will not be specifically noted.

3. Word order typology

Although the Celtic languages can be classified as rigid VSO in type, this statement must immediately be qualified with reference to the situation in Breton and Cornish. Embedded finite clauses in all the languages are VSO, but this pattern is not in general licit in main clauses in Breton and Cornish (see § 4). Breton and Cornish exhibit main clause verb-second effects, although in all other respects they are typologically very similar to the remaining languages in the family. Unlike many other VSO languages (such as Arabic or Berber) Celtic languages do not have an alternative SVO word order: SVO does occur in topicalizations and focalizations, but not as an unmarked word order.

4. Major constituents in declarative clauses

4.1. The verb and its arguments

There are two major word order patterns in finite clauses in Celtic. Leaving the main clauses of Breton and Cornish aside for the moment, the first unmarked pattern has the finite lexical verb in initial position, followed by the subject, the object, any further arguments of the verb, and finally any optional modifiers, giving the order VSOX, as shown in (1):

(1) Verb < Subject < Object < Oblique arguments < Adverbials

(2) and (3) illustrate this pattern from each side of the family:


(2) Welsh
    Rhoddais i afal i'r bachgen ddoe.
    give:PST:1SG I apple to-the boy yesterday
    'I gave an apple to the boy yesterday.'

(3) Irish (McCloskey 1983: 10)
    Thug mé úll don ghasúr inné.
    give:PST I apple to:the boy yesterday
    'I gave an apple to the boy yesterday.'
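The template in (1) amounts to a fixed ranking of grammatical functions. As a minimal sketch, assuming each constituent has already been labelled with its function, the Welsh clause in (2) can be linearized by sorting on that ranking. The slot labels and the function name are illustrative and make no claim about the underlying syntactic analysis.

```python
# Slot order taken from the template in (1).
TEMPLATE = ["verb", "subject", "object", "oblique", "adverbial"]

def linearize(constituents):
    """Order a clause's constituents according to TEMPLATE, skipping empty slots."""
    unknown = set(constituents) - set(TEMPLATE)
    if unknown:
        raise ValueError(f"unknown slots: {sorted(unknown)}")
    return " ".join(constituents[slot] for slot in TEMPLATE if slot in constituents)

# The constituents of Welsh example (2), deliberately given out of order.
welsh = {
    "adverbial": "ddoe",
    "object": "afal",
    "subject": "i",
    "oblique": "i'r bachgen",
    "verb": "Rhoddais",
}

print(linearize(welsh))  # Rhoddais i afal i'r bachgen ddoe
```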


The second unmarked clause type is a periphrastic construction, consisting of an inflected auxiliary verb in initial position, followed by the subject, and then a verb phrase containing the infinitival main verb¹ and its complement(s) and optional modifiers. Where the auxiliary is a form of the verb 'be', aspect markers occur before the lexical verb:

(4) Welsh
    Mae o'n adeiladu tai ym Mangor.
    be:PRS he-PROG build house:PL in Bangor
    'He's building houses in Bangor.'

(5) Irish (McCloskey 1983: 12)
    Tá sé ag tógáil tithe i nDoire.
    be:PRS he PROG build house:PL in Derry
    'He's building houses in Derry.'

(6) Scottish Gaelic
    Tha Alasdair a' togail an taighe an Alba.
    be:PRS Alasdair PROG-build the house:GEN in Scotland
    'Alasdair is building the house in Scotland.'

(7) Manx (Gregor 1980: 167)
    Va eh giarey yn faiyr.
    be:PST he (PROG-)cut the grass
    'He was cutting the grass.'

This pattern is often known in the generative literature as the I-SVO pattern (where I signifies the inflected verb); it occurs in both P-Celtic and Q-Celtic, but Irish and Scottish Gaelic also exhibit I-SOV word order in finite clauses. I-SVO appears with the progressive aspect, as in (5) and (6),² but the perfective aspect, as in (8) and (9), gives rise to I-SOV order:³

Maggie Tallerman

(8)

Irish (Ó Dochartaigh 1992: 46)
Tá Máire tar éis an litir a scríobhadh.
be:PRS Máire PFV the letter to write
'Mary has written the letter.'

(9)

Scottish Gaelic (MacAulay 1992 a: 171)
Bha Iain air an leabhar a cheannach.
be:PST Iain PFV the book to buy
'Iain had bought the book.'

The prospective aspect in both languages also gives OV word order, and OV is the standard order in infinitival clauses in Q-Celtic (see § 6). As the foregoing examples illustrate, what the two main clause types have in common is that the inflected verb is in clause-initial position. This verb can, however, be preceded by various clause-initial particles, denoting such grammatical categories as polarity (negative/affirmative), tense, interrogative, etc., although the particles may be phonologically reduced so that only their various mutation effects appear (triggered on the following verb) in the spoken languages:

(10)

Irish (Ó Dochartaigh 1992: 60)
D'iarr mé air sin a dhéanamh.
PST-ask I on:3MSG that INF do
'I asked him to do that.'

(11)

Welsh
Mi welais i ddraig.
AFFMT see:PST:1SG I dragon
'I saw a dragon.'

(12)

Breton (Borsley 1990: 82)
Ne lenn ket Anna al levr.
NEG read:PRS NEG Anna the book
'Anna doesn't read the book.'

Since the verb is not in absolute initial position in (12), this word order is permissible in Breton, even though the negator ne is typically non-overt in the spoken language. In the VSO clause pattern, there is obviously no surface verb phrase constituent, since the verb is separated from the object by the subject. However, the periphrastic I-SVO/I-SOV pattern does contain a VP. The phrase headed by the non-finite verb is clearly a constituent: see Anderson & Chung (1977: 22), Stephens (1982: 132-3) and Timm (1989) on Breton, McCloskey (1983) on Irish, and Sproat (1985: section 1.1) on Welsh; the VP can, for example, be the focus of a cleft construction:⁴

(13)

Welsh
[VP Adeiladu tai ym Mangor] a wnaeth o.
build house:PL in Bangor PTL do:PST:3SG he
'He built houses in Bangor.'

(14)

Irish (Ó Siadhail 1989: 236)
[VP Ag péinteáil cathaoir] a bhí an fear.
PROG paint chair PTL be:PST the man
'The man was painting a chair.'

The data in this section have shown that in the neutral word order in all the Celtic languages, apart from Breton and Cornish, the finite verb precedes all of its arguments and other modifiers. Such a verb is typically, though not invariably, in clause-initial position, and may be either an auxiliary or a lexical verb. In the following section I briefly review the facts of Breton and Cornish.

4.2. Clause types in Breton and Cornish

Concentrating on Breton, since informant data is available for that language, we find that both clause types presented in § 4.1 occur, although neither has the verb in initial position:

(15)

Breton (Press 1986: 192)
a. *Zibab Yann ul levr.
   choose:PRS Yann a book
   ('Yann chooses a book.')
b. Yann a zibab ul levr.
   Yann PTL choose:PRS a book
   'Yann chooses a book.'
c. Ul levr a zibab Yann.
   a book PTL choose:PRS Yann
   'Yann chooses a book.'

(15) shows that in a typical transitive clause with a finite main verb, either the subject (15 b) or the object (15 c) must be fronted (see Stephens 1982, Press 1986 for discussion of the pragmatics of such constructions). (15 a), following the usual Celtic VSO pattern, is absolutely ungrammatical. The two orders permitted in (15) result in either SVO or OVS order, but no other order is found; both OSV and SOV orders would also avoid an initial finite verb, but are ungrammatical. The same situation obtains in Cornish, to which Breton is closely related. Breton also exhibits the periphrastic construction, but once again the finite verb is impossible in sentence-initial position:

(16)

Breton (Press 1986: 191)
a. *Ra Yann klask ul levr.
   do:PRS Yann seek a book
   ('Yann seeks a book.')
b. Klask a ra Yann ul levr.
   seek PTL do:PRS Yann a book
   'Yann seeks a book.'
c. Klask ul levr a ra Yann.
   seek a book PTL do:PRS Yann
   'Yann seeks a book.'

Although (16 b) and (16 c) both have an initial verb, this is the non-finite form in each case. The pattern in (16 b), where a non-finite transitive verb minus its complement(s) appears in initial position, is widely attested in both Breton and Cornish (see Tallerman, this volume, § 2). (16 c) displays fronting of the entire VP, a pattern also found in the other Celtic languages: see (13) and (14). (17) illustrates typical Cornish word orders, again with some constituent other than the finite verb in initial position:

(17)

Cornish (George 1993: 455)
a. An den a wel an gath.
   the man PTL see:PRS the cat
   'The man sees the cat.'
b. Ny wel an den an gath.
   NEG see:PRS the man the cat
   'The man doesn't see the cat.'

Breton permits certain finite forms of bezañ 'be' in absolute initial position:

(18)

Breton (Borsley 1990: 86)
Emaon o komz eus da vreur.
be:PRS:1SG PROG talk about your brother
'I am talking about your brother.'

Cornish seems also to have allowed this word order, as shown in (19):

(19)

Cornish (adapted from Gregor 1980: 168)
Yma orth ow hnoukya.
PTL:be:PRS ASP 1SG strike
'He is striking me.'

However, George (1991: 246) notes that there are no examples of the construction in one complete text from Middle Cornish, the period for which the best data exists. Surprisingly, there are a few examples in the same text with a sentence-initial finite lexical verb:

(20)

Cornish (from Beunans Meriasek, cited by George 1991: 238)
Y leferys offeren.
PTL say:PST:3SG mass
'He said mass.'

Finite declarative lexical verbs cannot occur initially in Breton independent and main clauses. Imperative verbs appear initially in both Breton and Cornish:

(21)

a. Breton (Press 1986: 195)
   Diskouezit din an ti.
   show:IMP:2PL to:1SG the house
   'Show me the house.'
b. Cornish (adapted from George 1991: 241)
   Kemerens pup y arvov.
   take:IMP:1PL everyone 3PL arm:PL
   'Let each take up his arms!'

The status of the construction illustrated in (16 b) is discussed in detail in Tallerman (this volume), as well as the question of what can be considered the unmarked word order in Breton.

5. Other major constituents in the clause

5.1. The position of object clitics and pronominal objects

The pronominal objects of non-finite verbs in Celtic appear as an agreement proclitic on the verb, with optional clitic doubling in Welsh. Compare (4) through (7), where a lexical NP object occurs in postverbal position, to (22) through (24):

(22)

Welsh
Mae o'n eu adeiladu (nhw).
be:PRS:3SG he-PROG 3PL build them
'He's building them.'

(23)

Manx (Broderick 1993: 272)
V'eh dyn dilgey shiu.
be:PST-he ASP:2PL throw you
'He was throwing you.'

(24)

Irish (Ó Siadhail 1989: 295)
Tá sé (do) mo thóraíocht (*mé).
be:PRS he ASP 1SG search me
'He is searching me out.'

Overt agreement typically occurs in complementary distribution with overt nominals in Celtic: see for example Stump (1984) on Breton. However, as (22) illustrates, this restriction does not apply to pronominals in Welsh, which can co-occur with overt agreement; nor does it apply to Manx pronominals in examples such as (23): the plural forms of the proclitic agreement marker are identical in form and in mutation effects, so the doubled pronoun distinguishes between persons, and is actually required. The Irish example in (24), however, displays no overt postverbal object. In Colloquial Welsh, the proclitic is often absent, sometimes leaving its mutation effects on the non-finite verb; the 'doubled' clitic is then obligatory in postverbal position. This also occurred in later stages of Manx: see Broderick (1993: 272-3). Apart from in Breton and Cornish, the pronominal objects of finite verbs occur in the postverbal position, i.e. as independent pronouns, like lexical object NPs:

(25)

Scottish Gaelic
Chunnaic mi thu.
see:PST I you
'I saw you.'

(26)

Welsh
Gwelais i ti.
see:PST:1SG I you
'I saw you.'

In Cornish and in traditional Breton, an independent object pronoun may occur as the initial constituent in the verb-second construction, in which case it is focalized, or alternatively a clitic pronoun occurs in preverbal position, as in (27) and (28):

(27)

Cornish (George 1993: 455)
My a's gwel.
I PTL-3FSG see:PRS
'I see her.'

(28)

Breton (Press 1986: 101)
Me az kwel.
1SG 2SG see:PRES
'I see you.'

However, in modern Breton the pronominal object typically occurs as an inflected form of the preposition a 'of' in postverbal position:

(29)

Breton (Press 1986: 101)
Me a wel ac'hanout.
1SG PTL see:PRS of:2SG
'I see you.'

In Irish, there is a strong tendency for the pronominal object of a finite verb to appear in clause-final position, even following (a string of) non-valency adverbials: see Tallerman (this volume, § 4) for further discussion:

(30)

Irish (Stenson 1981: 42)
Chonaic sé i mBaile Átha Cliath aréir í.
see:PST he in Dublin last.night her
'He saw her in Dublin last night.'

5.2. Adjuncts and adverbials

I turn now to constituents other than the main verb, subject and object. The majority of adjuncts and adverbials occur clause-finally, as noted in § 4.1. (31) illustrates the typical Celtic situation:

(31)

Scottish Gaelic (Mackinnon 1971: 53)
Am bi a' chlann a' tighinn dhachaidh as an sgoil aig ceithir uairean?
INT be the children PROG come home from the school at four o'clock
'Will the children be coming home from school at four o'clock?'

However, as well as the VSOX pattern, a secondary XVSO pattern also exists. Sentence-initial position is used as an optional position for sentential adverbials and temporal adverbials throughout the Celtic family, without necessarily adding focus to the adverbial:

(32)

Welsh
Ddydd a nos gweithiodd ef yn galed.
day and night work:PST:3SG he PRED hard
'Day and night he worked hard.'

(33)

Irish (Ó Siadhail 1989: 215)
Ar ndóigh, bhí Mícheál sásta.
of course be:PST Mícheál content
'Of course, Mícheál was content.'

Strings of adverbials can also occur in clause-initial position, even in Breton, where they appear as it were outside the verb-second construction, apparently adjoined to the full clause:

(34)

Breton (Press 1986: 204)
Disul, goude an oferenn, ar beleg a zo aet da c'hoari c'hartou.
Sunday after the mass the priest be:PRS go:PST.PART to play cards
'Last Sunday, after Mass, the priest went to play cards.'

In (34) the main clause has a fronted subject, but two adverbials appear to the left of this constituent.

Irish also exhibits this construction, but interestingly, when (strings of) adverbials occur in embedded clauses, they must precede the complementizer which introduces the embedded clause: see McCloskey (1996). More rarely, adverbials occur in a clause-medial rather than peripheral position. Irish has a minor pattern in which the adverbial follows the verb and precedes the subject:

(35)

Irish (Ó Siadhail 1989: 217)
Tá ar ndóighe saighdiúirí ar a' bhealach ón Chlochán Glas.
be:PRS of course soldiers on the road from Clochán Glas
'There are, of course, soldiers on the road from Clochán Glas.'

In the periphrastic construction, Welsh VP-adverbials may occur immediately outside the verb phrase, where they precede all aspect markers:

(36)

Welsh
Mae hi eisoes wedi bod yn socian am dridiau.
be:PRS 3FSG already PFV be PROG soak for three.days
'It's already been soaking for three days.'

5.3. Topicalization and focalization

A salient feature of Celtic word order is that "fronting" constructions occur frequently in both the spoken and the written languages; that is, constructions in which some phrasal category occurs in clause-initial position, for purposes of focalization or topicalization. The degree of emphasis is usually slight or nonexistent compared to, say, clefting in English. Fronting constructions in Celtic have an initial XP (some phrasal category) followed by a particle, or complementizer, which precedes the verb.⁵ These particles, usually no more than a reduced vowel, are typically absent from the spoken languages, although they often have mutation effects which remain. Focalizations and topicalizations have the same or very similar syntax: consider for example topicalization versus contrastive fronting in Modern Welsh, distinguished solely by the intonation pattern.

(37)

Welsh (Fife & King 1991: 145-6)
a. Dydd Sul a ddaeth.
   Sunday PTL came
   'Sunday came.' (topic)
b. DYDD SUL a ddaeth.
   Sunday PTL came
   'SUNDAY came.' (i.e. not Saturday; contrastive)

Most constituents can be fronted in the Celtic languages. Fronted NPs, especially subjects, are very common, and probably gave rise to an unmarked SVO order in Cornish:

(38)

Cornish (George 1993: 455)
My a wel an gath.
I PTL see:PRS the cat
'I see the cat.'

Some scholars also consider a fronted subject to represent the unmarked word order in Breton: see Tallerman (this volume, § 2) for further discussion. (13) and (14) illustrated fronted VPs in Welsh and Irish. PPs and adverbials can also be focalized in each of the Celtic languages:

(39)

Welsh
[PP I'r bachgen] y rhoddais i afal.
to-the boy PTL give:PST:1SG I apple
'It was to the boy that I gave an apple.'

(40)

Manx (Broderick 1993: 276)
[PP Da'n dooinney] hug eh yn skian.
to-the man give:PST he the knife
'It was to the man that he gave the knife.'

(41)

Welsh
Eleni y maen nhw'n mynd i Lydaw.
this year PTL be:PRS:3PL they-PROG go to Brittany
'It's this year that they're going to Brittany.'

(42)

Breton (Ternes 1992: 388)
Breman e tigoran an nor.
now PTL open:PRS:1SG the door
'I open the door now.'

(43)

Manx (Thomson 1992: 111)
Fastyr Jycrean haink ad dy chur shilley orrin.
evening Wednesday come:PST they to put sight on:1PL
'It was Wednesday evening they came to see us.'

APs can be fronted in the Irish copula construction, as (44) illustrates:

(44)

Irish (McCloskey 1979: 91)
[AP Dubh] a bhí sé.
black PTL be:PST it
'?It's black that it was.'

But fronted bare APs are much more marginal in Welsh:

(45)

Welsh (Jones & Thomas 1977: 290)
?[AP Del] mae Mair yn edrych.
pretty be:PRS Mair PROG look
'?It's pretty that Mair's looking.'

'Long movement' of focalized constituents, i.e. from an embedded clause to the main clause, seems to be generally possible, as illustrated from Irish:

(46)

Irish (McCloskey 1979: 154)
[PP i mBethlehem] a dúirt na targaireachtaí a béarfaí an Slánaitheoir.
in Bethlehem PTL say:PST the prophecy:PL PTL be:born:COND the Saviour
'It was in Bethlehem that the prophecies said that the Saviour would be born.'

Nor is the landing site of a fronted constituent necessarily in the root clause: at least in Welsh and Breton, embedded clauses can also have a focalized or topicalized phrase, as (47) and (48) show. Note that Welsh has a special clefting complementizer, mai or taw (see Tallerman 1996):

(47)

Welsh
Dywedodd hi [mai [PP i'r bachgen] y rhoddais i afal].
say:PST:3SG she COMP to-the boy PTL give:PST:1SG I apple
'She said that it was to the boy that I gave an apple.'

(48)

Breton (Press 1986: 210)
Me 'lavar deoc'h [ar marc'h-se a oa re gozh].
I tell:PRS to:2PL the horse-DEM PTL be:PST too old
'I tell you that horse was too old.'

Press (1986: 210) in fact notes that such constructions do not necessarily topicalize or focalize the fronted NP in Breton. As expected, the pragmatic effects vary according to what constituent is fronted: as in many other languages, fronting of a subject NP is not highly marked, but fronting of a direct object, indirect object, and of non-valency constituents is much more marked. This applies to Breton (see Press 1986: ch. 4, Stephens 1982: ch. 3) and Cornish just as much as the remaining languages.

6. Subordinate clauses

Finite subordinate clauses are unremarkable in all the Celtic languages, including Breton and Cornish, since the word order is VSO:

(49)

Breton (Borsley 1990: 85)
Gouzout a ran [e glaskas Yann Per].
know PTL do:PRES:1SG PTL look.for:PST Yann Per
'I know that Yann looked for Per.'

(50)

Cornish (George 1993: 459)
Ev a grys [y hwelav an gath].
he PTL believe:PRS PTL see:PRS:1SG the cat
'He believes that I see the cat.'

The fact that embedded clauses exhibit VSO order in these two languages suggests strongly that they should indeed be classified as typologically VSO, although with a general constraint against initial finite verbs in main clauses. Embedded infinitival clauses display much more variation in word order both between the different languages, and within languages. Although all the Celtic languages have SV word order in infinitival clauses, the position of the direct object varies. The P-Celtic languages, Welsh, Breton and Cornish, display SVO order:

(51)

Welsh
'Dw i'n gwybod [i Mair ddarllen y llyfr].
be:PRS:1SG I-PROG know to Mair read the book
'I know that Mair read the book.'

(52)

Breton (Stephens 1990: 155)
Krenv awalc'h eo Lomm [evit Yann da spontañ dirazañ].
strong enough be:PRS Lomm for Yann to frighten before:3MSG
'Lomm is sufficiently strong to frighten Yann.'

(53)

Cornish (George 1993: 460)
Ev a grys [my dhe weles an gath].
he PTL believe:PRS I to see the cat
'He believes that I see the cat.'

See Tallerman (this volume, § 5) for further discussion of the Celtic infinitival clause. In the Q-Celtic languages SOV order is more usual than SVO order. In Irish, both word orders are found, the availability of each being dialectally determined, but SOV is prevalent: see Ó Siadhail (1989: 255-6), McCloskey (1980), and Chung & McCloskey (1987: section 5.1). (54) illustrates both alternatives:

(54)

Irish (Bobaljik & Carnie 1996: 228)
a. Ba mhaith liom [é an teach a thógáil].
   be:PRS good with:1SG him:ACC the house build
   'I would like him to build the house.'
b. Ba mhaith liom [é a thógáil an tí].
   be:PRS good with:1SG him:ACC build the house:GEN
   'I would like him to build the house.'

SOV order is usual in Scottish Gaelic:⁶

(55)

Scottish Gaelic (MacAulay 1992 a: 188)
Thug Iain air Anna [(i) an leabhar a thoirt do Mhàiri].
give:PST Iain on Anna she the book to give to Màiri
'Iain made Anna give the book to Màiri.'

VO word order does occur in Scottish Gaelic in infinitival clauses which are aspectually prospective, parallel to the English expressions I'm going to do X, I've come to do X:

(56)

Scottish Gaelic (Gillies 1993: 205)
Thainig mi [a charadh an doruis].
come:PST I to fix the door
'I came to fix the door.'

Manx appears to have had (S)VO order in infinitival clauses:

(57)

Manx (adapted from Gregor 1980: 199)
The object and purpose of this Act is ...
[dy chur er-ash ny ooryn-doonee jeh'n thieyn-oast]
to restore the hour:PL-closing of-the house:PL-public
'... to restore the closing-hours of public houses.'

Note that even in SOV clauses in the Q-Celtic languages, the infinitival verb is not in absolute final position. In (55), the direct object an leabhar 'the book' precedes the non-finite verb, but the indirect object do Mhàiri follows it. This means that SOV clauses in Celtic have a different status than SOV clauses in such languages as Turkish or Japanese, where the verb is in absolute final position, and where the languages exhibit other head-final word order characteristics such as pre-head complements to adjectives and nouns. The Celtic languages, on the contrary, are all strongly head-initial. For this reason recent transformational accounts of SOV infinitival clauses have proposed that the objects move from underlying post-verbal to pre-verbal position: see Bobaljik & Carnie (1996).

7. Different sentence types

Yes/no questions and negations retain the VSO word order of finite clauses, both main and embedded, and utilize a clause-initial particle or complementizer, as these examples illustrate:

(58)

Irish (Ó Siadhail 1989: 321)
An raibh tú sásta?
INT be:PST you content
'Were you content?'

(59)

Welsh
Ni welais i ddim draig.
NEG see:PST:1SG I NEG dragon
'I didn't see a dragon.'

As already noted in § 4.1, in the colloquial languages the various particles (interrogative, negative and negative interrogative) are often non-overt, although their mutation effects are typically retained. Like French and other European languages, Welsh and Breton have both a preverbal and a postverbal negative marker, and, also like French, tend to drop the preverbal marker. The Celtic languages all form wh-questions via fronting of the wh-element to clause-initial position, as illustrated in (60) and (61):

(60)

Breton (Stephens 1993: 401)
Piv en deus prenet an ti ruz?
who 3MSG have:PRS buy:PART the house red
'Who has bought the red house?'

(61)

Irish (Ó Siadhail 1989: 250)
Cé a bhris an chathaoir?
who PTL break:PST the chair
'Who broke the chair?'

In fact, non-polar interrogatives are formed in the same manner as focalizations and topicalizations in the Celtic languages (see § 5.3 above). As (61) shows, the questioned constituent is followed by a (frequently covert) particle or complementizer, just as in other fronting constructions.

8. The noun phrase

8.1. Order of elements modifying the head noun

The general order of elements within the noun phrase in the Celtic languages follows the pattern in (62) (where Det includes articles, (some) quantifiers and possessive proclitics):

(62)

Det < numeral < N < Adj < Demonst. < Gen. < Rel. Clause

Only Breton has both a definite and an indefinite article, the remaining languages having the definite article only. Most adjectives follow the head noun in all the languages, but a small subset precede the noun, or take either position:

(63)

Breton
a. an den kozh
   the man old
   'the old man'
b. ur gwall hent
   a bad road

(64)

Scottish Gaelic
a. am bàta mòr
   the boat big
   'the big boat'
b. seann chù
   old dog
   'an old dog'

Numerals, both cardinal and ordinal, precede the noun, but the Celtic languages all have a vigesimal or partially vigesimal counting system, and various constructions are used with numbers higher than ten. One system in use on both sides of the Celtic family places the units in pre-head position, and the tens in post-head position:

(65)

Welsh
tair merch ar ddeg
three:F girl on ten
'thirteen girls'

(66)

Scottish Gaelic
coig craobhan deug
five tree:PL ten
'fifteen trees'

A more usual system for counting items over 'eleven' is illustrated in (67) and (68), where an enumerator head is followed by a partitive phrase containing the noun being enumerated:

(67)

Welsh
un deg tair o ferched
one ten three:F of girl:PL
'thirteen girls'

(68)

Irish (Dillon & Ó Cróinín 1961: 138)
cúig cinn déag ar fhichid de mhucaibh
five head ten on twenty of pig:PL
'thirty-five pigs'

In all the Celtic languages, demonstratives appear in post-head position, obligatorily accompanied by a definite article:

(69)

Irish (Ó Dochartaigh 1992: 54)
a. an lámh seo
   the hand PROX
   'this hand'
b. an fear sin
   the man OBV
   'that man'
c. an leabhar úd
   the book DIST
   'yon book'

All the Celtic languages have proximate, obviative and distal demonstratives. In each language, where one or more than one adjective follows the head noun, the demonstrative element follows the (string of) adjectives:

(70)

Breton (Ternes 1992: 402)
a. an den-se
   the man-OBV
   'that man'
b. an den fall-se
   the man bad-OBV
   'that bad man'

Possessive proclitics occur in all the Celtic languages, in complementary distribution with other determiners such as articles. If there is a pre-head adjective, then the possessive precedes the adjective too, just as in the case of Det-Adj-N strings (see (63 b) above):

(71)

Welsh
f'annwyl gyfaill
1SG-dear friend
'my dear friend'

Welsh differs from the other Celtic languages in that overt pronouns do co-occur with agreement morphology, so a post-head enclitic pronoun also occurs (provided there is no coreferential subject NP in the clause):

(72)

Welsh (Awbery 1994: 7)
On' ty to o'dd 'y ngartre' i.
but house roof be:PRS:3SG 1SG home I
'But my home was a thatched cottage.'

As expected in head-initial languages, genitives are in post-head position, giving the order Possessum-Possessor. In the Q-Celtic languages, which have a productive genitive marking, the possessor is in the genitive case:

(73)

Scottish Gaelic (MacAulay 1992 a: 199)
taigh Chaluim
house Calum:GEN
'Calum's house'

In each of the languages, only the last noun in a sequence of genitives can be modified by a definite article or other determiner:

(74)

Welsh
stafell pennaeth yr adran
room head the department
'the room of the head of the department'

(75)

Breton (Stephens 1993: 393)
ti bihan ma zud-kozh
house little 1SG people-old
'my grandparents' little house'

Relative clauses are also uniformly in post-head position, and are formed in the same manner as clefts or topicalizations and wh-questions: there are no relative pronouns, and a particle (often non-overt in the spoken languages) follows the head noun:

(76)

Welsh
y ddynes a welaist ti ddoe
the woman PTL see:PST:2SG you yesterday
'the woman you saw yesterday'

(77)

Manx (Broderick 1993: 280)
yn baatey nagh row mee ayn
the boat NEG be:PST I in:3MSG
'the boat I was not in'

8.2. The adjectival phrase

Adjectival modifiers are not uniformly in post-head position, as the head-initial character of the Celtic languages would lead us to expect. In Welsh, although the textually-frequent modifier iawn 'very' follows the adjective, as in (78 a), the majority of modifiers precede the adjective, as in (78 b) and (c):

(78)

Welsh
a. dyn balch iawn
   man proud very
   'a very proud man'
b. dyn eithaf balch
   man quite proud
   'a quite proud man'
c. llyfr rhy ddrud
   book too expensive
   'a too expensive book'

Breton also has both pre- and post-head adjectival modifiers:

(79)

Breton
a. un ti re vras
   a house too big
   'a too big house'
b. klanv kaer
   ill very
   'very ill'

The majority of adjectival modifiers in the Q-Celtic languages appear in pre-head position. Irish ana- 'very' and ro- 'too' are prefixes, and free forms are also pre-head:

(80)

Irish
a. ana-bheag
   very small
b. réasúnta maith
   reasonably good

Adjectival modifiers in Scottish Gaelic also precede the head:

(81)

Scottish Gaelic (Gillies 1993: 202)
a. duine caran bodhar
   man slightly deaf
   'a slightly deaf man'
b. gle mhor
   very big

Although degree modifiers typically precede the adjectival head in Manx too, "[modifiers] derived from other adjectives generally follow" (Thomson 1992: 113):

(82)

Manx (Thomson 1992: 113)
mie yindyssagh
good wonderful
'wonderfully good'

9. Conclusion

This descriptive overview raises various issues which merit further discussion: the word order in Breton and Cornish, which is superficially so different from that of the remaining Celtic languages; the postposed object pronouns of Irish, which unexpectedly migrate to the end of the clause; the word orders in infinitival clauses, and their relationship with finite clauses. Such matters are of theoretical interest both to typologists and generative linguists, and in Tallerman (this volume) these and other issues are taken up in more detail.

Notes

1. The main verb is in the non-finite form known in traditional Celtic grammar as a verb(al)-noun.
2. The direct object of a progressive verb is traditionally in the genitive case in both Irish and Scottish Gaelic, although as McCloskey (1983: 13) observes for Irish, this rule is typically not maintained in the colloquial language.
3. Manx appears to have standardized I-SVO order in finite clauses, including those in the perfective aspect:
   (i) Manx (Broderick 1993: 273)
       T'ad er tilgey yn shleiy.
       be:PRS:they PFV throw the spear
       'They have thrown the spear.'
4. The status of the pre-verbal particles in clefts and relative clauses is controversial, and in what follows I will gloss them simply as "particle".
5. See also the discussion of the copular construction in Tallerman, this volume: § 3.
6. MacAulay (1992 a: 172) notes that the embedded subject is optional when it is coreferential to the matrix indirect object.

References

Anderson, Stephen & Sandra Chung
  1977 "On grammatical relations and clause structure in verb-initial languages", in: Peter Cole & Jerry Sadock (eds.), Syntax and Semantics 8. New York: Academic Press, 1-25.
Awbery, Gwenllian
  1994 "Echo pronouns in a Welsh dialect: A system in crisis?", Bangor Research Papers in Linguistics, Volume 5: Research papers in Welsh syntax, 1-29.
Ball, Martin J. (ed.)
  1993 The Celtic languages. London: Routledge.
Bobaljik, Jonathan & Andrew Carnie
  1996 "A minimalist approach to some problems of Irish word order", in: Robert D. Borsley & Ian Roberts (eds.), 223-240.
Borsley, Robert
  1990 "A GPSG approach to Breton word order", in: Randall Hendrick (ed.), Syntax and Semantics 23. San Diego: Academic Press, 81-95.
Borsley, Robert D. & Ian Roberts (eds.)
  1996 The syntax of the Celtic languages: a comparative perspective. Cambridge: Cambridge University Press.
Broderick, George
  1993 "Manx", in: Martin J. Ball (ed.), 228-285.
Chung, Sandra & James McCloskey
  1987 "Government, barriers, and small clauses in Irish", Linguistic Inquiry 18: 173-237.
Dillon, Myles & Donncha Ó Cróinín
  1961 Teach yourself Irish. Sevenoaks: Hodder and Stoughton.
Fife, James & Gareth King
  1991 "Focus and the Welsh 'abnormal sentence': a cross-linguistic perspective", in: James Fife & Erich Poppe (eds.), 81-153.
Fife, James & Erich Poppe (eds.)
  1991 Studies in Brythonic word order. Amsterdam: John Benjamins.
George, Ken
  1991 "Notes on word order in Beunans Meriasek", in: James Fife & Erich Poppe (eds.), 205-250.
  1993 "Cornish", in: Martin J. Ball (ed.), 410-468.
Gillies, William
  1993 "Scottish Gaelic", in: Martin J. Ball (ed.), 145-227.
Gregor, Douglas
  1980 Celtic. A comparative study. Cambridge: The Oleander Press.
Jones, B. M. & Alan Thomas
  1977 The Welsh language: studies in its syntax and semantics. Cardiff: University of Wales Press.
MacAulay, Donald
  1992 a "The Scottish Gaelic language", in: Donald MacAulay (ed.) (1992 b), 137-248.
MacAulay, Donald (ed.)
  1992 b The Celtic languages. Cambridge: Cambridge University Press.
Mackinnon, Roderick
  1971 Teach yourself Gaelic. Sevenoaks: Hodder and Stoughton.
McCloskey, James
  1979 Transformational syntax and model theoretic semantics. A case study in Modern Irish. Dordrecht: Reidel.
  1980 "Is there raising in Modern Irish?", Ériu 31: 59-99.
  1983 "A VP in a VSO language?", in: Gerald Gazdar, Ewan Klein & Geoffrey Pullum (eds.), Order, concord and constituency. Dordrecht: Foris, 9-55.
  1996 "On the scope of verb movement in Irish", Natural Language and Linguistic Theory 14: 47-104.
Ó Dochartaigh, Cathair
  1992 "The Irish language", in: Donald MacAulay (ed.), 11-99.
Ó Siadhail, Mícheál
  1989 Modern Irish. Grammatical structure and dialectal variation. Cambridge: Cambridge University Press.
Press, Ian
  1986 A grammar of Modern Breton. Berlin: Mouton de Gruyter.
Sproat, Richard
  1985 "Welsh syntax and VSO structure", Natural Language and Linguistic Theory 3: 173-216.
Stenson, Nancy
  1981 Studies in Irish syntax. Tübingen: Gunter Narr Verlag.
Stephens, Janig
  1982 Word Order in Breton. PhD dissertation, School of Oriental and African Studies, University of London.
  1990 "Non-finite clauses in Breton", in: Martin Ball, James Fife, Erich Poppe & Jenny Rowland (eds.), Celtic linguistics: Readings in the Brythonic languages, a Festschrift for T. Arwyn Watkins. Amsterdam: John Benjamins, 151-165.
  1993 "Breton", in: Martin J. Ball (ed.), 349-409.
Stump, Gregory
  1984 "Agreement vs. incorporation in Breton", Natural Language and Linguistic Theory 2: 289-348.
Tallerman, Maggie
  this volume "Celtic word order: some theoretical issues".
  1996 "Fronting constructions in Welsh", in: Robert D. Borsley & Ian Roberts (eds.), 97-124.
Ternes, Elmar
  1992 "The Breton language", in: Donald MacAulay (ed.), 371-452.
Thomson, Robert
  1992 "The Manx language", in: Donald MacAulay (ed.), 100-136.
Timm, Lenora
  1989 "The problem of a VP in a VSO language: evidence from Breton", General Linguistics 29: 247-271.

Alfredo R. Arnaiz

An overview of the main word order characteristics of Romance

1. Introduction

The term "Romance languages" refers to languages which derive from the Italic branch of Indo-European via Latin. The main Romance languages are Catalan, Franco-Provençal, French, Galician, Italian, Occitan, Portuguese, Rhaeto-Romance, Rumanian, Sardinian and Spanish.1 There are also a number of creoles derived from some of the Romance languages (especially from Portuguese, French, Spanish and, to a lesser degree, Italian). Although originally spoken in Europe, these languages are now spoken all around the globe as a result of immigration and/or colonization.2 See Table 1 for an approximate number of speakers by language.

Table 1. Approximate number of speakers by language.3

Language           Number of speakers (millions)
Catalan*           9
Franco-Provençal   4
French*            72 [124]
Galician           4
Italian*           58 [63]
Occitan            3
Portuguese*        167 [179]
Rumanian*          26
Rhaeto-Romance     1
Sardinian*         1
Spanish*           326 [371]

The Romance languages are often divided into Ibero-Romance (Catalan, Galician, Portuguese, Spanish), Gallo-Romance (Franco-Provençal, French, Occitan), Italo-Romance (Italian, Sardinian, Rhaeto-Romance) and Balkan Romance (Dalmatian†, Rumanian). Also, a division between Southern and Northern, and West and East Romance is often used based on certain common/distinct features.

2. Inflection and other functional categories

In general, the Romance languages do not exhibit Case morphology. Case is only marked in (clitic) pronouns (nominative, accusative, dative); some of these languages also mark genitive/partitive and oblique (mainly, locative), see sec. 4.3. The exception is Rumanian, which has retained three distinct case forms (nominative/accusative, genitive/dative and vocative).4 These languages show Subject-Verb agreement (person, number and some instances of gender in certain constructions). They also show noun-adjective agreement and subject-adjective agreement in predicative adjective constructions. Inflexions are uniformly suffixes. There is a grammatical gender distinction in these languages (in general, masculine/feminine; plus neuter in Rumanian). Romance languages present a morphological subjunctive mood. With the exception of French, Romance languages are Null Subject languages (Pro-drop). Romance languages are consistently prepositional, with no preposition stranding. Concerning negation, the whole set of possibilities known as the Jespersen cycle is observed in this linguistic family: there are languages that have a preverbal negative marker, a postverbal one or both. Romance languages are, in general, negative concord languages (i.e. they show negative agreement).

3. Word Order Type

In general, all Romance languages belong to the type SVO (or VO).5 This does not mean that other orders are not possible but, as a general rule, these languages comply with a number of characteristics usually associated with the SVO language type. That is, they have prepositions rather than postpositions; objects or complements follow the verb; auxiliaries are frequent and always precede the main verb; the genitive phrase takes the form of a PP, invariably following the noun; attributive phrases and relative clauses follow the head noun; the comparative precedes the standard of comparison; adjectives usually follow the noun; quantifiers, determiners and negatives precede the element they modify. The following sections will illustrate these patterns and point out differences and potential discrepancies.

4. Major constituents in declarative clauses

4.1. The verb and its arguments

As mentioned above, the Romance languages are SVO. It is often assumed that this is the unmarked (typical) order in declarative clauses and in most cases of embedded clauses (see sec. 5). But other orders are possible and sometimes required (or preferred) in certain contexts or with certain classes of verbs. Two languages differ somewhat from the general pattern: on the one hand, French, in which postverbal subjects are extremely restricted in comparison to the other members of the family; on the other, Rumanian, which allows certain orders that are not as common in the other related languages (e.g. OVS, cf. sec. 4.4). This is often seen as related to the fact that Rumanian has a case marking system.

Transitive clauses typically show an SVO order.6 Other orders are possible (VSO, VOS, SOV, OSV, OVS), but their occurrence is restricted by special conditions, see sec. 4.4. Subject Inversion is required in wh-questions in all the languages under discussion here with the exception of French and Brazilian Portuguese, see sec. 6.1.2. In declarative clauses, postverbal subjects are allowed, in particular VSO, again with the exception of these two languages. VOS appears to be marked in European Portuguese, requiring a "focus" element at the beginning of the sentence (see (1), from Ambar (1992)) and also in Italian, where there seems to be an adjacency requirement between the V and the postverbal S (see (2), from Rizzi (1991: 19)). Similarly, in Sardinian subject inversion (VS orders without a dislocated subject) is not possible in the presence of a postverbal complement (as evidenced by the ungrammaticality of (3b, c), from Jones (1988: 339)). On the other hand, in Catalan, Rumanian and Spanish the equivalents of (2a) and (3b, c) are fully acceptable.

(1) Portuguese
    Ontem comeu a sopa a Joana.
    FE ate the soup the Joana
    'Joana ate the soup.'

(2) Italian
    a. ?Ha risolto il problema Gianni.
       has solved the problem Gianni
       'Gianni has solved the problem.'
    b. Lo ha risolto Gianni.
       Cl(O) has solved Gianni
       'Gianni has solved it.'

(3) Sardinian
    a. Karkunu a' tunkatu su barkone.
       someone has shut the window
       'Someone has shut the window.'
    b. *A' tunkatu karkunu su barkone.
       has shut someone the window
    c. *A' tunkatu su barkone karkunu.
       has shut the window someone

Intransitive clauses allow for both orders: SV and VS. VS is common in presentational sentences and is the unmarked option with the so-called unaccusative verbs, see (4).

(4) a. Italian (Burzio 1986)
       Viene Giovanni.
       comes Giovanni
       'Giovanni comes.'
    b. Spanish
       Viene Juan.
       comes Juan
       'Juan comes.'

French, which in general does not allow postverbal subjects, has a construction (restricted to unaccusative verbs) where the subject appears postverbally and an expletive pronoun il is in preverbal position, as illustrated in (5).

(5) French (Belletti 1988: 4)
    Il est arrivé trois filles.
    it is arrived three girls
    'There arrived three girls.'

The order OVS is common in experiencer predicates (O = experiencer, S = stimulus) in most of these languages. This is illustrated in (6a) for Spanish and in (6b) for Italian (from Belletti and Rizzi (1988: 29)).

(6) a. Spanish
       A Juan le gusta esto.
       to Juan Cl(IO) pleases this
       'Juan likes this.'
    b. Italian
       A Gianni piace questo.
       to Gianni pleases this
       'Gianni likes this.'

With ditransitive verbs the order is S-V-DO-IO; S-V-IO-DO is also possible. Other orders are allowed following the restrictions mentioned above, see also sec. 4.4. Existential clauses show the order V-NP.

4.2. Adjuncts and adverbials

Consider the following schema:

(7) 1 S 2 Aux 3 V 4 O 5

The possibilities for the positioning of adverbs in Romance vary across languages. Among the languages being considered here, only Italian and Sardinian allow for all the possibilities in (7). Catalan excludes position 2. An adverb may only occur between S and V provided that S is topicalized. French also excludes this position. Rumanian and Spanish exclude position 3. As a general tendency, sentential adverbs commonly occupy position 1, but they are not limited to that position. VP-adverbs tend to immediately follow the V or precede it following the restrictions mentioned above.

4.3. Clitics

The Romance languages have a rich system of clitic pronouns. All languages have at least accusative and dative clitics, see Table 2 (some examples are given in (8)).7

Table 2. Clitics.

Language     Nom.  Acc.  Dat.  Gen.  Loc.
Catalan      -     X     X     X     X
French       X     X     X     X     X
Italian      -     X     X     X     X
Portuguese   -     X     X     -     -
Rumanian     -     X     X     -     -
Sardinian    -     X     X     -     -
Spanish      -     X     X     -     -

(8) a. French (Rizzi & Roberts 1989: 3)
       Où est-il allé?
       where has-Cl(S) gone
       'Where has he gone?'
    b. Spanish
       José la vio ayer en la mañana.
       José Cl(O) saw yesterday in the morning
       'José saw her yesterday morning.'
    c. Sardinian (Jones 1988: 337-338)
       Li dao su libru.
       Cl(IO) give:1SG the book
       '(I) give the book to him.'
    d. Italian
       Ne offrono a Carlo.
       Cl(GEN) offer:3PL to Carlo
       '(They) offer some (of it) to Carlo.'
    e. Catalan
       Hi érem.
       Cl(LOC) be:1PL
       '(We) were there.'

The positioning of clitics in general adheres to the following pattern: they precede finite verbal forms and follow non-finite ones (particularly, infinitivals and imperatives). But there are some exceptions. In Portuguese there are some differences in clitic placement between the European and Brazilian varieties: in the former they are usually enclitic to the verb, while in the latter they are proclitic; see (9), from Parkinson (1988: 158). Sentence-initial clitics are excluded in EP and in written BP. Moreover, they invariably precede the verb when an element other than a lexical NP subject precedes them: this applies to both varieties. In Sardinian, these elements precede infinitives and optionally follow the past participle.

(9) a. European Portuguese
       O pai deu-me um bolo.
       the father gave-Cl(IO) a cake
    b. Brazilian Portuguese
       O pai me deu um bolo.
       the father Cl(IO) gave a cake
       'Father gave me a cake.'

These languages show, to different degrees, a phenomenon often referred to as clitic climbing. This phenomenon involves a clitic related to an embedded infinitival verb surfacing immediately before the matrix verb, as shown in (10b).

(10) Spanish
     a. Juan quiere comprarlo.
        Juan wants to-buy-Cl(O)
     b. Juan lo quiere comprar.
        Juan Cl(O) wants to-buy
        'Juan wants to buy it.'

4.4. Dislocation and focalization

As mentioned in sec. 4.1, most variants of SVO order are attested in Romance with the proviso that certain specific conditions concerning grammar, pragmatics and prosody are met. Even in the case of French, which is often considered limited with regard to word order alternation, almost all the possible variations are attested when processes such as dislocation and focalization are considered. One of these processes is the so-called Clitic Left Dislocation. In these cases, a constituent is dislocated to the left and a matching (resumptive) clitic pronoun appears with the verb. This is illustrated in (11)-(13).

(11) Italian (Cinque 1990: 57-58)
     a. Al mare, ci siamo già stati.
        to the seaside Cl we-have already been
        'To the seaside, we have already been.'
     b. Tutti, non li ho visti ancora.
        all not Cl have seen yet
        'All of them, I have not seen yet.'

(12) Spanish
     a. El libro, yo se lo di a María.
        the book, I Cl(IO) Cl(O) gave to María
        'The book, I gave to María.'
     b. A María, yo le di el libro.
        to María, I Cl(IO) gave the book
        'To María, I gave the book.'

(13) French (Harris 1988: 235)
     Marie, je la déteste.
     Marie, I Cl(O) detest
     'Marie, I detest.'

In this type of construction more than one constituent may be fronted, and the fronted element normally has a topic or non-focal interpretation. There is also a right counterpart of this construction. Another such process is focalization. In this case, a constituent may be fronted with no resumptive pronoun, and this then has a strong focusing effect. Some examples are given in (14).

(14) a. Portuguese (Ambar 1992)
        A sopa comeu a Joana.
        the soup ate Joana
        'THE SOUP, Joana ate.'
     b. Sardinian (Jones 1988: 338)
        A domo semus girande.
        to home are:1PL returning
        'TO HOME, we are returning.'

5. Subordinate clauses

In Romance subordinate clauses, word order is in general similar to that of main clauses.8 In Rumanian, the most frequent order is VSO; this is often related to the fact that subordinate clauses lack, in general, a topic (see sec. 3). Finite subordinate clauses are invariably introduced by a clause-initial complementizer or subordinating conjunction.9 In all Romance languages there is a distinction between the indicative and subjunctive mood in complement clauses. Generally speaking, irrealis embedded clauses require the subjunctive, while realis clauses present the indicative mood. As an illustration, consider the examples in (15) and (16) from Italian and Portuguese, respectively.

(15) Italian
     a. So che Davide parla italiano.
        I-know that Davide speaks:IND Italian
        'I know that Davide speaks Italian.'
     b. Voglio che Davide parli italiano.
        I-wish that Davide speaks:SUBJ Italian
        'I wish Davide speaks Italian.'

(16) Portuguese
     a. Afirmo que David estuda português.
        I-affirm that David studies:IND Portuguese
        'I affirm that David studies Portuguese.'
     b. Duvido que David estude português.
        I-doubt that David studies:SUBJ Portuguese
        'I doubt that David studies Portuguese.'

In the case of the majority of verbs that require a subjunctive complement, a phenomenon often referred to as the "Disjoint Reference Effect" prevents both subjects (matrix and embedded) from being coreferential. To overcome this restriction, infinitival complement clauses are used instead, as in the Spanish (17b).

(17) Spanish
     a. *Juan espera que él gane la carrera.
        Juan expects that he win:SUBJ the race
        'Juan expects to win the race.'
     b. Juan espera ganar la carrera.
        Juan expects to-win the race
        'Juan expects to win the race.'

Infinitival complements are a common option in Romance, and may be introduced by a preposition. In Portuguese, this type of complement is extensively used, particularly since this language presents inflected or personal infinitives (that is, infinitival forms inflected for person and number). This is illustrated in (18), from Raposo (1987: 87-88).

(18) Portuguese
     a. Eu penso [os deputados terem trabalhado pouco].
        I think the deputies have:3PL worked little
        'I think the deputies have worked little.'
     b. Eu entrei em casa [sem [os meninos verem]].
        I entered in house without the children see:3PL
        'I entered the house without the children seeing.'

Other non-finite verbal forms are found in subordinate clauses as instances of two special constructions: absolute or participial clauses and the gerundival construction. In these constructions, the order VS is required, e.g.:

(19) Italian (Belletti 1990: 108, 98)
     a. Salutata Maria, Gianni se ne andò.
        greeted Maria, Gianni Cl went away
        'Maria having been greeted, Gianni left.'
     b. Avendo Gianni chiuso il dibattito, la riunione è finita prima.
        having Gianni closed the debate, the meeting ended early
        'Gianni having closed the debate, the meeting ended early.'

6. Different sentence types

6.1. Questions

6.1.1. Yes/No questions

In all the languages under consideration here, a common strategy for Yes/No questions is the use of intonation. All of the examples in (20) have statement word order and a question intonation pattern.10

(20) a. Catalan (Badia 1962: 135)
        El president ha dimitit?
        the president has resigned
        'Has the president resigned?'
     b. French
        Marie vient demain?
        Marie is-coming tomorrow
        'Is Marie coming tomorrow?'
     c. Spanish
        ¿María viene mañana?
        María is-coming tomorrow
        'Is María coming tomorrow?'
     d. Italian
        Maria ha detto che sì?
        Maria has said that yes
        'Has Maria said yes?'
     e. Portuguese
        Maria está aqui?
        Maria is here
        'Is Maria here?'
     f. Rumanian
        Copilul se duce la teatru?
        child-def-nom Cl go the theatre
        'Is the child going to the theatre?'
     g. Sardinian
        Maria ari pigau su trenu?
        Maria has caught the train
        'Has Maria caught the train?'

However, this is not the only possibility (not even the preferred one in some cases). Tag questions are also used in this context. Inversion in yes/no questions is optional in all of these languages, with the following qualifications:

• In French, this option is available only with subject clitics (or conjunctive pronouns; see example (21)) and in a construction known as fausse inversion.
• In Portuguese, this is a seldom used option.

(21) French
     Est-il parti?
     has-he left
     'Has he left?'

In Catalan, French and Portuguese, it is common to form yes/no questions by the addition of a certain element such as que in Catalan, é que in Portuguese and est-ce que in French, as illustrated below:

(22) a. Catalan
        Que no vens?
        COMP NEG you-come
        'Aren't you coming?'
     b. Portuguese (Parkinson 1988: 158)
        É que o seu pai está aqui?
        it-is that the his father is here
        'Is his father here?'
     c. French (Harris 1988: 237)
        Est-ce que le président vient?
        is-this that the president comes
        'Is the president coming?'

Sardinian presents an interrogative particle that requires the subject to be phonologically empty, dislocated or inverted, as shown in (23), from Jones (1988: 341).

(23) a. *A Juanne venit?
        PART Juanne comes
        'Is Juanne coming?'
     b. A venit (Juanne)?
        PART comes (Juanne)
        'Is Juanne coming?'

Indirect yes/no questions do not require a particular word order, and they present a "particle" that takes the form of an interrogative complementizer.

6.1.2. Wh-questions

In general, the Romance languages are wh-fronting languages; that is, in non-polar interrogatives the wh-element being questioned is placed at the beginning of the sentence. This is illustrated in (24).

(24) a. Catalan
        Quan ha mort el conductor?
        when has died the driver
        'When did the driver die?'
     b. French
        Qui Marie a rencontré?
        who Marie has met
        'Who has Marie met?'
     c. Italian
        Che ha detto Mario?
        what has said Mario
        'What did Mario say?'
     d. Portuguese
        Que comprou a Maria?
        what bought the Maria
        'What did Maria buy?'
     e. Rumanian
        Unde se duce copilul?
        where Cl goes child-the
        'Where is the child going?'
     f. Sardinian
        Chini esti sa picciocchedda?
        who is the little-girl
        'Who is the little girl?'
     g. Spanish
        ¿A quién visitó Rosa?
        whom visited Rosa
        'Who did Rosa visit?'

Of these languages, only French and Brazilian Portuguese allow single questions to be formed by leaving the wh-element in situ.11 See the French example in (25):12

(25) Elle a rencontré qui?
     she has met who
     'Whom has she met?'

A point of interest with regard to this type of question is the VS inversion phenomenon. Notice in (24) that, with the exception of French, the typical word order in Romance wh-questions is Wh-V-S. This order is strict, with the following qualifications:

• In Catalan, the order VSO in questions is disfavored. To avoid this order, speakers often resort to topicalization and/or dislocation of the "most 'given' information" (see Wheeler (1988)).
• In Brazilian Portuguese, the typical order in this context is SVO. European Portuguese may also present this order by the use of the é que periphrasis inserted after the wh-phrase.
• Regarding Spanish, certain Caribbean varieties allow for the order SV. In long-distance questions, inversion is also mandatory in the clause from which the wh-phrase is extracted. Adjunct wh-phrases such as por qué and cómo do not induce mandatory inversion (this last point holds also for Italian).

Also, these languages allow multiple interrogation (where only one wh-element is fronted; the other wh-elements must stay in situ). Pied-piping is the rule for wh-phrases accompanied by a preposition (i.e. no preposition stranding). And in echo questions the wh-element stays in situ. In indirect wh-questions, the word order pattern is in general similar to the one observed in direct wh-questions.

6.2. Imperatives

The most common word order in imperatives in Romance is V(S). Some languages show a true imperative verb form in 2nd person singular affirmative commands (Castilian Spanish and Portuguese also have a true imperative form in the 2nd person plural). In negative and plural (1st and 2nd person) commands, a suppletive or surrogate form from the indicative or subjunctive is used (the infinitive form is also used in commands in some of these languages). In general, clitics follow the verb in affirmative commands, but precede it in negative ones.13 This is shown in (26) on the basis of Spanish.

(26) a. ¡Habla!
        'Talk!' (true imperative)
     b. ¡No hables!
        NEG talk:SUBJ
        'Don't talk!' (subjunctive)
     c. ¡Háblale!
        talk-Cl(IO)
        'Talk to her/him!'
     d. ¡No le hables!
        neg. Cl(IO) talk(subj.)
        'Don't talk to her/him!'

6.3. Negation

All of the Romance languages under consideration here express negation by inserting a negative marker immediately before the verbal group (clitics included).14 French has in addition a mandatory postverbal negative marker (pas) (in spoken language there is a strong tendency to drop the preverbal marker ne) and Catalan has a similar postverbal marker, but it is optional. See the examples in (27) and (28).

(27) a. Italian
        Gianni non ha telefonato a sua madre.
        Gianni neg. has called his mother
     b. Portuguese
        João não ligou para sua mãe.
        João neg. called his mother
     c. Spanish
        Juan no ha llamado a su madre.
        Juan neg. has called his mother
        'John hasn't called his mother.'15
     d. Rumanian
        Ion n-a venit.
        Ion [nu+a] = neg. + has come
        'Ion didn't come.'
     e. Sardinian
        Maria no' ari pigau su trenu.
        Maria neg. has caught the train
        'Maria didn't catch the train.'

(28) a. Catalan
        La Maria no vindrà (pas).
        the Maria neg. will-come (neg.)
        'Maria won't come.'
     b. French
        Pierre ne marche pas.
        Pierre neg. walk neg.
        'Pierre doesn't walk.'

While the positioning of the preverbal negative marker is strict (always preceding the verbal group, with the exception noted), the positioning of the postverbal one exhibits certain variation (variation that is heterogeneous across languages), often resembling that of sentential adverbs. Negative markers of the pas class always follow the main verb, provided that it is finite and in a simple form: in the case of other verbal forms the abovementioned variation arises. For instance, in French, pas normally precedes the infinitive, while in Catalan it may follow an infinitival form.16 Another point concerning negation worthy of mention is that the languages outlined here are "negative concord" languages. That is, the sentential negative marker plus one or more negative elements constitutes only one instance of sentential negation (i.e. no double negation effect). This is exemplified in (29) below.

(29) a. Italian
        Non ho detto niente a nessuno.
        NEG have said nothing to nobody
     b. Spanish
        No le he dicho nada a nadie.
        NEG Cl(IO) have said nothing to nobody
        'I haven't said anything to anybody.'

7. The Noun Phrase

7.1. The nominal group

The order of constituents within the Noun Phrase in the Romance languages tends to follow the pattern in (30) (where Det includes articles, demonstratives, quantifiers and possessive adjective/determiner):

(30) Det - N - Adj - PP(Gen.) - Rel. Clause

There are some exceptions to this schema. First, consider the case of the definite article. In all Romance languages except Rumanian the definite article is a free form preceding the head noun (see (31)).

(31) a. Catalan
        la mà
        'the hand'
     b. French
        le garçon
        'the boy'
     c. Italian
        il ragazzo
        'the boy'
     d. Portuguese
        a mão
        'the hand'
     e. Sardinian
        su piccioccu
        'the boy'
     f. Spanish
        la mano
        'the hand'

In Rumanian, the definite article takes the form of a suffix, as shown in (32):

(32) om-ul
     man-the
     'the man'

In all Romance languages the indefinite article is a free form (similar to the numeral "one") and always precedes the noun. The demonstrative determiner in some of the languages under discussion may follow the head noun. This seems to be a marked option in most cases. See the Rumanian example in (33). A similar pattern is observed in Portuguese and Spanish.

(33) a. acest creion
        'this pencil'
     b. creion-ul acesta
        pencil-the this
        'this pencil'

French has two optional enclitic demonstrative suffixes (-ci and -là), illustrated in (34) (from Agard (1984)).

(34) a. cet animal / cet animal-ci
        'this animal'
     b. cet animal / cet animal-là
        'that animal'

As stated above, quantifiers17 and numerals always precede the noun. The only exception to this pattern is found in Rumanian, where the ordinal întîi "first" may precede or follow the head noun. The adjective generally follows the noun:

(35) a. Catalan
        home orgullós
        man proud
     b. French
        homme orgueilleux
        man proud
     c. Italian
        uomo orgoglioso
        man proud
     d. Sardinian
        omini orgogliosu
        man proud
     e. Spanish
        hombre orgulloso
        man proud
        'proud man'
     f. Portuguese
        homem simples
        man simple
        'simple man'
     g. Rumanian
        om bun
        man good
        'good man'

But there is some well-known variation in this area. In most Romance languages, adjectives may precede the noun, as exemplified in (36).18

(36) a. Catalan
        llunyanes terres
        'far-away lands'
     b. French
        belle dame
        'beautiful lady'
     c. Portuguese
        tranquilas montanhas
        'quiet mountains'
     d. Rumanian
        frumoasa fata
        'beautiful girl'
     e. Spanish
        altas montañas
        'high mountains'

Some adjectives (a small class) may precede or follow the noun with a difference in meaning. In prenominal position they have a descriptive meaning, while in postnominal position they are restrictive. See (37).

(37) a. Catalan
        pobre home / home pobre
        poor man / man poor
     b. French
        pauvre homme / homme pauvre
        poor man / man poor
        'wretched man'/'needy man'
     c. Italian
        vecchio amico / amico vecchio
        old friend / friend old
        'long-standing friend'/'aged friend'
     d. Portuguese
        grande homem / homem grande
        big man / man big
        'great man'/'big man'
     e. Spanish
        viejo amigo / amigo viejo
        old friend / friend old
        'long-standing friend'/'aged friend'

Another class of adjectives can function as quantifiers when they precede the noun, as illustrated in (38):

(38) a. Italian
        notizie certe / certe notizie
     b. Spanish
        noticias ciertas / ciertas noticias
        news sure/true / certain news
        'sure/true news'/'certain news'
     c. Italian
        libro unico / unico libro
     d. Spanish
        libro único / único libro
        book unique / only book
        'unique book'/'only book'

Regarding genitives, possession with nouns is indicated in all the languages (except for Rumanian) by means of a "preposition":

(39) Noun da Genitive / Possessed da Possessor

This is illustrated in (40):

(40) a. Catalan
        la mare del nen
        the mother of-the child
        'the child's mother'
     b. French
        le père de l'enfant
        the father of the-child
     c. Italian
        il padre del ragazzo
        the father of-the child
        'the child's father'
     d. Portuguese
        o pai da Maria
        the father of Maria
        'Maria's father'
     e. Sardinian
        su libru de Giuanni
        the book of Giuanni
        'Giuanni's book'
     f. Spanish
        la madre del niño
        the mother of-the child
        'the child's mother'

In Rumanian, genitives are expressed by a simple juxtaposition of Noun and Genitive without additional grammatical marking (possessed-possessor), as illustrated in (41).19

(41) cartea fetei
     book girl
     'the girl's book'

Most of the Romance languages present a genitive pronoun as well as a genitive determiner, see Agard (1984). As for their order in relation to the possessed noun, the genitive "pronoun" follows the head noun, as exemplified in (42), from Agard (1984).20

(42) a. Italian
        il cugino suo
        the cousin his/her
     b. Portuguese
        o primo seu
        the cousin his/her
     c. Rumanian
        vărul său
        cousin-the his/her
     d. Spanish
        el primo suyo
        the cousin his/her
        'her cousin'

The genitive determiner always precedes the possessed noun, but there is some variation regarding the co-occurrence with the definite determiner:

(43) a. French
        mon cousin
        my cousin
     b. Spanish
        mi primo
        my cousin
        'my cousin'

(44) a. Catalan
        el meu cotxe
        the my car
        'my car'
     b. Italian
        (il) mio cugino
        the my cousin
     c. Portuguese
        (o) meu primo
        the my cousin
     d. Rumanian
        al meu văr
        the my cousin
        'my cousin'

As illustrated in (43), in French and Spanish the genitive determiner cannot co-occur with the definite determiner. Nevertheless, the NP is definite. On the other hand, (44) shows that some languages allow these determiners to co-occur. In some of these languages, there is a restriction concerning the possessive determiner in cases of inalienably possessed nouns. As shown in (45), the possessive determiner does not co-occur with these nouns:

(45) a. Catalan
        Em rento les mans.
        I wash the hands
        'I wash my hands.'
     b. French (Vergnaud & Zubizarreta 1992: 636)
        Les enfants ont levé la main.
        the children have raised the hand
        'The children raised the/their hand.'
     c. Spanish
        Me duele la/?*mi cabeza.
        Cl(IO) hurts the/my head
        'My head hurts.'

The relative clause in all Romance languages invariably follows the head noun and its modifiers. It is introduced by a relative complementizer or a relative pronoun, as illustrated in (46) and (47), respectively.

(46) a. Catalan
        la noia que no sabem d'on és
        the girl that NEG we-know from-where she-is
        'the girl that we do not know where she comes from'
     b. French
        un chien que j'ai vu
        a dog that I-have seen
        'a dog (that) I saw'
     c. Italian
        la ragazza che studia a Perugia
        the girl that studies at Perugia
        'the girl that studies at Perugia'
     d. Portuguese
        os passageiros que desembarcaram
        the passengers that will-come-ashore
        'the passengers that will come ashore'

(47) a. Italian
        il negozio dove lavoro
        the business where I-work
        'the store where I work'
     b. Rumanian (Agard 1984)
        copilul ai cărui părinți au murit
        child-the whose parents are dead
        'the child whose parents are dead'
     c. Spanish
        la ciudad en la cual vivimos
        the city in the(SG:F) which we-live
        'the city in which we live'

7.2. The adjectival phrase

The unmarked order within the adjectival phrase is as in (48). Some examples are provided in (49).

(48) Deg - A - PP

(49) a. Catalan
        home extremadament orgullós del seu fill
     b. French
        homme très orgueilleux de son fils
     c. Italian
        uomo estremamente orgoglioso di suo figlio
     d. Sardinian
        omini meda orgogliosu de su fillu
     e. Spanish
        hombre muy orgulloso de su hijo
        man very/extremely proud of his son
        'man very/extremely proud of his son'

Notes

1. Some of these languages include a number of dialects that may well be considered languages in their own right (see Harris and Vincent (1988)). Of these languages we will be discussing only the ones that appear in Table 1 marked with an asterisk (with the exception of Portuguese, those are the languages for which we were provided with a completed questionnaire). This paper is based in part on the answers to a questionnaire circulated by the constituent order group (Eurotyp). I wish to acknowledge those persons who completed these questionnaires: Anna Gavarro (Catalan), Suzanne Schlyter (French), Giuliano Bernini (Italian), Beatrice Primus (Rumanian), Erster Zini and Tomaso Zorzutti (Sardinian), Ingrid Thelin and Juan Carlos Moreno Cabrera (Spanish).
2. For a detailed presentation of the geographical distribution of the Romance languages, see Harris and Vincent (1988).
3. The number in brackets indicates the total number including non-native speakers.
4. See volume 4 of this collection (Actance et valence), particularly Bossong's article.
5. Some of these languages are sometimes considered or perceived as SVO/free; that is the case for Italian, Rumanian (which sometimes is also regarded as a free language), Sardinian and Spanish. Also, it has been argued that Rumanian is better described as a Topic-V-O language. It should be mentioned that Harris (1988) points out that in spoken French the "typical" SVO order is becoming rare and that popular spoken French shows "a highly flexible word order of the kind often called free" (p. 236) which appears to rely on the pronominal "clitic" system, see also Bossong (1981).
6. Here, O refers to full NP complements; for clitic complements see sec. 4.3.
7. Some Italian "dialects" have a subject clitic pronoun, for example Fiorentino and Trentino.
8. Cf. sec. 6.1 for indirect questions.
9. Rumanian is the only language that presents two different complementizers (ca for subjunctive non-relative complements and că for indicatives) and a mandatory particle that precedes the subjunctive verb (să), see Kempchinsky (1986). This is illustrated in (i):
   (i) a. Vreau ca Ana să meargă la Bucharest.
          I-want that Ana subj. go to Bucharest
          'I want Ana to go to Bucharest.'
       b. Știu că Ana merge la Bucharest.
          I-know that Ana goes to Bucharest
          'I know that Ana goes to Bucharest.'
10. The point of these examples is to illustrate that no special or particular word order is required in this type of question.
11. This process is restricted to unembedded clauses, see Rizzi (1991).
12. This is possible only in modern spoken colloquial French.
13. Actually, the picture is more complex: in some cases clitic placement relates to the type of imperative (true vs. suppletive) and/or to person and number. See Zanuttini (1991) and Rivero (1988).
14. French requires the subject clitic to precede the preverbal negative marker.
15. For a detailed discussion of the different strategies for negation in Romance see Zanuttini (1991); examples (27a-c) are taken from there.
16. See previous note, and Espinal (1991) for Catalan.
17. The order NP-Quantifier is attested in most of these languages, apparently as the result of the phenomenon of floating quantifiers.
18. It should be noted that with the majority of adjectives anteposition is stylistically marked; however, certain adjectives (a small class) usually occur prenominally. For example, in French certain adjectives such as belle and bon normally precede the noun, but others such as orgueilleux and intelligente usually follow the noun.
19. This is the prevalent option, but not the only one; (i) shows a case where a possessive inflected particle appears between N and Gen.:
   (i) cartea mare a fetei
       book big of girl
       'the big book of the girl'
20. This option is not available in French.

References

Agard, Frederick 1984 A Course in Romance Linguistics, Georgetown University Press, Washington, D. C.
Ambar, Manuela 1992 Para Uma Sintaxe Da Inversão Sujeito-Verbo Em Português. Colecção Estudios Linguisticos, Edições Colibri, Lisbon.
Badia Margarit, Antoni 1962 Gramatica Catalana. Editorial Gredos, Madrid.
Belletti, Adriana 1988 "The Case of Unaccusatives," Linguistic Inquiry 19: 1-34.
Belletti, Adriana 1990 Generalized Verb Movement, Rosenberg & Sellier, Turin.
Belletti, Adriana and Luigi Rizzi 1988 "Psych-Verbs and θ-Theory," Natural Language and Linguistic Theory 6: 291-352.
Bossong, Georg 1981 "Séquence et visée. L'expression positionnelle du thème et du rhème en français parlé," Folia Linguistica 15: 237-252.
Burzio, Luigi 1986 Italian Syntax: A Government-Binding Approach, Reidel, Dordrecht.
Cinque, Guglielmo 1990 Types of A'-Dependencies. MIT Press, Cambridge, MA.
Comrie, Bernard (ed.) 1987 The Major Languages of the World, Oxford University Press, Oxford.
Cunha, Celso and Luis Cintra 1986 Nova Gramática Do Português Contemporâneo, Edições João Sá da Costa, Lisbon.
Fabra, Pompeu 1956 Gramatica Catalana, Teide, Barcelona.
Farkas, Donka 1980 "Word Order in Rumanian Main Clauses," Folia Slavica 4: 245-262.
Espinal, Maria Teresa 1991 "Negation in Catalan. Some Remarks with regard to no pas," in: Catalan Working Papers in Linguistics. Universitat Autònoma de Barcelona.
Haiman, John and Paola Benincà 1992 The Rhaeto-Romance Languages. Routledge, London.
Harris, Martin 1988 "French," in: Martin Harris and Nigel Vincent (eds.), 209-455.
Harris, Martin and Nigel Vincent (eds.) 1988 The Romance Languages, London: Routledge.
Jones, Michael 1988 "Sardinian," in: Martin Harris and Nigel Vincent (eds.), 314-350.
Kempchinsky, Paula 1986 Romance Subjunctive Clauses and Logical Form, Diss. UCLA, Los Angeles, CA.
Mallinson, Graham 1986 Rumanian. Descriptive Grammars. Croom Helm, London.
Parkinson, Stephen 1988 "Portuguese," in: Martin Harris and Nigel Vincent (eds.), 131-169.
Raposo, Eduardo 1987 "Case Theory and Infl-to-Comp: The Inflected Infinitive in European Portuguese," Linguistic Inquiry 18: 85-110.
Rigau, Gemma 1987 "Sobre el Caracter Cuantificador de los Pronombres Tonicos en Catalan," in: Violeta Demonte and Marina Fernandez Lagunilla (eds.), Sintaxis de las Lenguas Romanicas, Ediciones El Arquero, Madrid, 390-407.
Rivero, Maria Luisa 1988 "The Structure of IP and V-Movement in the Languages of the Balkans," ms., U. of Ottawa.
Rizzi, Luigi 1991 "Residual Verb Second and the Wh-Criterion," in: Technical Reports in Formal and Computational Linguistics No. 2. Université de Genève.
Rizzi, Luigi and Ian Roberts 1989 "Complex Inversion in French," Probus 1: 1-30.
Vergnaud, Jean-Roger and Maria Luisa Zubizarreta 1992 "The Definite Determiner and the Inalienable Constructions in French and in English," Linguistic Inquiry 23: 595-652.
Wheeler, Max 1988 "Catalan," in: Martin Harris and Nigel Vincent (eds.), 246-278.
Zanuttini, Raffaella 1991 Syntactic Properties of Sentential Negation: A Comparative Study of Romance Languages. Diss. UPenn, Philadelphia, PA.

Anders Holmberg and Jan Rijkhoff

Word order in the Germanic languages

1. Introduction

The Germanic branch of Indo-European consists of three main groups (Ruhlen 1987: 327):
- East Germanic: Gothic, Vandalic, Burgundian (all extinct);
- North Germanic (or: Scandinavian): Danish, Swedish, Norwegian, Icelandic, Faroese;
- West Germanic: German, Yiddish, Luxembourgeois, Dutch, Afrikaans, Frisian, English.1

Here we will only consider the languages that are currently spoken in geographical Europe. Thus Afrikaans, which is spoken in South Africa, and the extinct East Germanic languages will not be taken into account (but see e. g. König & van der Auwera 1994). The division between North Germanic and West Germanic is useful also as a syntactic typological classification, except that typologically English clearly forms a group of its own, on a par with Scandinavian and the rest of West Germanic. We will therefore occasionally use the term 'Continental Germanic' to refer to West Germanic excluding English (but including Frisian, thus departing from Ruhlen's use of the term).

2. Inflection and other functional categories

2.1. Verbal inflection All the Germanic languages except Yiddish have two inflectionally expressed tenses, present and preterite. In addition they have a periphrastic perfect tense, formed by the auxiliary have or be and a participle. There is some variation among the languages as to what extent past time reference can be expressed by the periphrastic form. Yiddish lacks the preterite altogether and only has a synthetic paradigm for the present tense. Since the past tense has been replaced by the perfect the former is expressed by hobn 'have' or zayn 'be' plus past participle (Jacobs et al. 1994: 406-407).


The future is expressed by an auxiliary or, in some of the languages, by the present tense form. English is the only Germanic language which has a special progressive form (be -ing). Apart from the English progressive, no Germanic language has inflectionally expressed aspect. Only Icelandic and German retain a productive inflectional subjunctive mood. In the other languages its use is mainly restricted to certain standard expressions. Passive is expressed by a participle in combination with an auxiliary meaning 'be' or 'become'. In the Mainland Scandinavian languages (Swedish, Danish, and Norwegian) passive can also be expressed by an inflectional affix -s(t), the use of which varies among the languages. All Germanic languages require a copula in the case of a non-verbal sentence predicate. Finally, as regards subject-verb agreement there is a good deal of variation, ranging from no agreement in the Mainland Scandinavian languages to the relatively rich system of Icelandic, with usually five distinct forms (distributed over three persons and two numbers), in all tenses and moods.

2.2. Nominal inflection

Among the Germanic languages German, Icelandic and Yiddish have the richest case systems, distinguishing nominative, accusative, dative, and genitive. In Luxembourgeois the accusative has replaced the nominative (Schmitt 1984: 55, 57, 177). The Faroese case system is similar to the Icelandic one but lacks genitive case. In Icelandic and Faroese case is realized on the head noun as well as on determiners of the noun. In German and Yiddish case is realized mainly on determiners. All languages retain a system of cases with pronouns, distinguishing minimally subjective, objective, and (attributive) possessive forms (e.g. English 'I', 'me', 'my'). Alongside the set of full forms, there is also a set of unemphatic, reduced pronouns (in e. g. Dutch, Frisian, and Luxembourgeois) which have a special syntax. For instance, in Dutch the reduced (clitic) object forms cannot occur in clause-initial position; they only occur after a finite verb or a subordinating conjunction (me = reduced form; see also the volume in this series on clitics): (1)

a. Mij (*Me) heb je niet gezien.
   me have you:SG not seen
   'You didn't see ME.'

Nouns in all the Germanic languages except English have grammatical gender. German, Luxembourgeois, Yiddish, Icelandic, Faroese, and most dialects of Norwegian distinguish three genders: masculine, feminine, and neuter. The remaining languages distinguish only a neuter and a common gender. The Scandinavian languages have an inflectional (suffixal) definite article. This is one morphological feature which sets North Germanic (i. e. Scandinavian) off from the rest of Germanic.

2.3. Adjectival inflection In all the Germanic languages except English attributive adjectives are inflected for gender and number concord with the noun, as well as displaying some variety of the typically Germanic 'strong or weak (W)' distinction. Among the Scandinavian languages the broad generalization is that the strong form (which is morphologically unmarked in Mainland Scandinavian) is used in indefinite noun phrases, and the weak form in definite noun phrases. (2)

Swedish
a. den stora bilen
   the big(W) car:DEF
b. en stor bil
   a big car

The Continental Germanic languages present a slightly more varied picture in this area. In addition to the weak and the strong paradigm, German has a third ("mixed") adjectival paradigm. Eisenberg (1994: 376) sums it up as follows: "The inflectional behaviour of the attributive adjective is governed by one single principle: the adjective marks the noun phrase for case, number and gender according to the pronominal inflection if no other constituent of the noun phrase does so". According to Jacobs et al. (1994: 405), the Yiddish adjectival paradigm is a mixture of the German strong and weak endings, but it also has endings which fit either system. Dutch and Frisian attributive adjectives take an -e suffix, except in indefinite noun phrases with singular neuter nouns. A major morphological difference concerning adjectival inflection is that the predicative adjective is inflected for gender and number in Scandinavian but not in the Continental Germanic languages.

3. Word order type

All the Germanic languages except English are V2 languages, that is to say, in declarative main clauses the finite verb is the second constituent of the clause (see next section for examples). The Scandinavian languages and English are unquestionably SVO, OV order being almost totally nonexistent. Yiddish is predominantly SVO, but SOV order is fairly common as well (Santorini 1993). If we separate out main clause VO sentences as results of the V2 rule, the other Continental Germanic languages are predominantly SOV, since when the verb is not affected by V2, a nominal object precedes the verb. According to another terminology the Continental Germanic languages are "mixed VO/OV".2 All the Germanic languages have prepositions rather than postpositions, the comparative precedes the standard of comparison, wh-questions are formed by fronting the wh-phrase, and yes-no questions are formed by fronting the finite verb (main or auxiliary). In the noun phrase (free) determiners, numerals and adjectives precede the noun, whereas relative clauses appear in postnominal position. However, there is considerable variation in the placement of adnominal possessor phrases (nominal and pronominal). In the following sections these generalizations will be discussed in more detail, pointing out various exceptions.

4. Main declarative clauses With regard to word order, the Germanic languages are distinguished primarily along two parameters: Verb Second (V2) and VO/OV.

4.1. Verb Second All the Germanic languages except English are V 2, that is to say, in declarative main clauses the finite verb, main or auxiliary, typically appears in the second position of the clause. In other words, whatever category is preposed, the finite verb will immediately follow that category. In English the verbal cluster can be preceded by more than one constituent, but English retains a form of V 2 in main clause wh-questions, where the finite auxiliary must invert with the subject. (3)

a. Dutch
   Jan zag een vogel.
a'. Jan heeft een vogel gezien.
    Jan has a bird seen
b. Swedish
   Jan såg en fågel.
b'. Jan har sett en fågel.
c. John saw a bird.
c'. John has seen a bird.

(4)

a. Dutch
   Gisteren zag Jan een vogel.
b. Swedish
   I går såg Jan en fågel.
   yesterday saw Jan a bird
c. Yesterday John saw a bird.

(5)

a. Dutch
   Wat zag Jan?
b. Swedish
   Vad såg Jan?
   what saw Jan
c. What did John see?

(6)

a. Dutch
   Jan zag waarschijnlijk een vogel.
b. Swedish
   Jan såg troligen en fågel.
   Jan saw probably a bird
c. John probably saw a bird.

V2 is perhaps the most salient "special" typological feature of the Germanic languages, distinguishing Germanic from all the other modern European languages.3
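The V2 generalization can be made concrete with a small sketch. The Python below is an illustration only, assuming a simplified clause that consists of one preposed constituent, one finite verb, and the remaining constituents in a fixed order; the function names `v2_order` and `non_v2_order` are hypothetical and do not correspond to any formal analysis in this chapter.

```python
def v2_order(first, finite_verb, others):
    """Toy V2 linearisation: exactly one constituent precedes the finite verb."""
    return [first, finite_verb] + list(others)


def non_v2_order(fronted, subject, finite_verb, others):
    """Toy English-style order: fronted material does not displace the subject."""
    return list(fronted) + [subject, finite_verb] + list(others)


# Dutch (3a) and (4a): the finite verb stays second whatever is preposed.
print(" ".join(v2_order("Jan", "zag", ["een vogel"])))
print(" ".join(v2_order("Gisteren", "zag", ["Jan", "een vogel"])))
# English (4c): the adverbial and the subject both precede the verb.
print(" ".join(non_v2_order(["Yesterday"], "John", "saw", ["a bird"])))
```

The sketch deliberately ignores auxiliaries, clause-final verbs, and English wh-questions, where a residual form of V2 applies.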

4.2. VO or OV

While English, Yiddish (but cf. section 4.4) and the Scandinavian languages are "strictly SVO", the other Germanic languages (Dutch, Frisian, German and Luxembourgeois) have two major verb positions in independent (main) declarative clauses: the second position is usually taken by the finite verb and all other verbs appear in clause-final position. That is to say, the object precedes the verb in embedded clauses, and in main clauses which contain one or more auxiliary verbs. Furthermore when a single auxiliary is combined with a main verb in a subordinate clause, the (finite) auxiliary always occurs after the main verb in German (SOV Aux), but it may either precede or follow the main verb in Dutch subclauses (SOAuxV/SOVAux). (7)

Dutch Hij zei dat hij een vogel had gezien / gezien had he said that he a bird had seen / seen had 'He said that he had seen a bird.'
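The two subordinate-clause patterns just described (German SOVAux only; Dutch SOAuxV or SOVAux) can be sketched as a lookup table. This is a minimal illustration, assuming a clause with exactly one auxiliary and one main verb; the table `CLUSTER_ORDERS` and the function `subclause_orders` are hypothetical names, and the German string is a constructed example rather than one taken from the text.

```python
# Permitted orders of finite auxiliary (Aux) and main verb (V) at the end
# of a subordinate clause, as stated in the paragraph above.
CLUSTER_ORDERS = {
    "German": [("V", "Aux")],
    "Dutch": [("Aux", "V"), ("V", "Aux")],
}


def subclause_orders(language, subject, obj, aux, verb):
    """Return every subject-object-verb-cluster string the toy table allows."""
    forms = {"Aux": aux, "V": verb}
    return [
        " ".join([subject, obj] + [forms[slot] for slot in order])
        for order in CLUSTER_ORDERS[language]
    ]


# Dutch (7): both cluster orders are generated.
print(subclause_orders("Dutch", "hij", "een vogel", "had", "gezien"))
# German (constructed illustration): only the verb-final-auxiliary order.
print(subclause_orders("German", "er", "einen Vogel", "hatte", "gesehen"))
```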

Both orders are also attested in Luxembourgeois (Schmitt 1984: 176) and Frisian, but in Frisian this only occurs in the speech of younger speakers (Tiersma 1985: 123). The Germanic languages with OV patterns are not consistently verb-final, since they allow, or require, certain verb complements to follow the nonfinite main verb. (8)

Luxembourgeois
E blouf beim Patt setzen bis d' Sonn op-goung.
he stayed at+the inn sit until the sun up-went
'He stayed (sitting) in the inn until sunrise.'

(9)

Dutch Hij heeft gezegd dat hij een vogel zag. he has said that he a bird saw 'he (has) said that he saw a bird'

4.3. Adpositions On the whole Germanic languages are prepositional but almost all of them (English is an exception) also employ a small number of postpositions, which occur either by themselves or (in some Continental Germanic languages) in combination with a preposition ("circumpositions"). (10)

German
a. auf den Berg
b. den Berg hinauf
c. auf den Berg herauf
'up (onto) the mountain'

English and the Scandinavian languages allow preposition stranding in connection with wh-questions, topicalization, relativization, and other constructions of the "unbounded dependencies" class. Some of the languages, notably English and Norwegian, allow preposition stranding also in passives ("pseudo-passives"). (11)

Icelandic
Hann spurði hvern ég hefði talað við.
he asked who I had spoken with

(12)

Norwegian
Reven ble skutt på.
fox:Def was shot at

The Continental Germanic languages have no preposition stranding or only a restricted form of it.4

4.4. Double objects Many Germanic languages exhibit a "Dative Shift" alternation (e.g. Dutch, Frisian, English, and the Mainland Scandinavian languages). That is, if the indirect object is expressed as a prepositional phrase it normally follows the direct object. However, when the indirect object is expressed without the preposition, it must precede the direct object: (13)

Dutch a. Ik gaf het boek aan Anna. b. Ik gaf Anna het boek.

(14)

a. I gave the book to Anna.
b. I gave Anna the book.

(15)

Danish a. Jeg gav bogen til Anna. b. Jeg gav Anna bogen.
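The "Dative Shift" alternation in (13) to (15) amounts to a simple ordering rule, sketched below. This is an illustration only, assuming full nominal objects and a single recipient; the function name `double_object_order` and its parameters are hypothetical.

```python
def double_object_order(direct_object, recipient, preposition=None):
    """Order the two objects after the verb.

    With a preposition the recipient follows the direct object;
    without one it must precede the direct object.
    """
    if preposition:
        return [direct_object, f"{preposition} {recipient}"]
    return [recipient, direct_object]


# Dutch (13a, b)
print(" ".join(["Ik", "gaf"] + double_object_order("het boek", "Anna", "aan")))
print(" ".join(["Ik", "gaf"] + double_object_order("het boek", "Anna")))
# Danish (15a, b)
print(" ".join(["Jeg", "gav"] + double_object_order("bogen", "Anna", "til")))
print(" ".join(["Jeg", "gav"] + double_object_order("bogen", "Anna")))
```

The rule as coded does not cover Icelandic and German, which lack the prepositional variant, or the pronominal patterns discussed later in this section.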


Icelandic and German do not have the construction with a PP as indirect object, although both languages do allow inversion of two nominal objects. (16)

Icelandic
a. Ég lána Maríu bækurnar.
   I lent Maria(DAT) books:DEF(ACC)
   'I lent Maria the books.' / 'I lent the books to Maria.'
b. Ég lána bækurnar Maríu.
   I lent books:DEF(ACC) Maria(DAT)
   'I lent Maria the books.' / 'I lent the books to Maria.'

(17)

German
a. Ich gab Anna das Buch.
   I gave Anna(DAT) the book(ACC)
   'I gave Anna the book.' / 'I gave the book to Anna.'
b. Ich gab das Buch Anna.
   I gave the book(ACC) Anna(DAT)
   'I gave Anna the book.' / 'I gave the book to Anna.'

The construction in a. is unmarked, while the construction in b. requires focus (indicated by stress) on the indirect object (Dat). Inversion of two nominal objects presumably requires a certain amount of case morphology, as found in Icelandic and German (see Primus, this volume).5 According to Birnbaum (1979: 295) object NPs in Yiddish appear in the order IO+DO: (18)

Zi git der snjjer dus pekl.
she gives her daughter-in-law the parcel
'She gives her daughter-in-law the parcel.'

Compare next the following examples with pronominal objects. In German the direct object typically precedes the indirect object; if the order is reversed the indirect object must be stressed (focus). In (British) English either order is possible, but only with the order DO + IO can the preposition be omitted; in Dutch the order is DO + IO, with an optional preposition before the IO. (19)

German
a. Ich gab es ihm.
   I gave it him
b. Ich gab ihm es.
   I gave him it

(20)

a. I gave it him. b. I gave it to him. c. I gave him it.

(21)

Dutch
a. Ik gaf het aan hem.
   I gave it to him
   'I gave it to him.'
b. Ik gaf het hem.
   I gave it him
   'I gave it him.'

In Yiddish, when both objects are pronouns the order DO + IO is preferred: (22)

Er darf ys ir geibn. he must it her give 'He must give it to her.'

Recall that Yiddish is not strictly SVO in that it has "significant relics of earlier SOV order: the syntax of passive, of periphrastic verbs, and of separable prefixes, and clitic floating/climbing" (Jacobs et al. 1994: 411). Consider also these examples, where the verb follows rather than precedes its complement: (23)

ober dos hot dem rebn zaynem shtark fardrosn. but this has the rabbi his strong annoyed 'but this annoyed his rabbi a lot'

(24)

Ir megt zikh oyf mir farlozn. you can Refl on me depend 'You can depend on me.'

4.5. Object positions

In the Germanic OV languages objects and adverbials have no fixed position in the space between the subject and the main verb, i. e. their relative order is determined by pragmatic factors (topic, focus) and relative scope. The Scandinavian languages also have some alternative object positions. Thus, in Icelandic a definite (or better, specific) NP object may precede or follow sentence adverbials, including the negation adverb, a phenomenon known as "object shift"; see Holmberg & Platzack (1995). (25)

a. Jón las sennilega ekki þessa bók.
   Jon read probably not this book
   'Jon probably did not read this book.'
b. Jón las þessa bók sennilega ekki.
   Jon read this book probably not
   'Jon probably did not read this book.'

If the object is a weak (i. e. simple and unstressed) pronoun, it must precede the adverbs. In the other modern Scandinavian languages a lexical NP cannot precede the adverbs, in the corresponding construction, while a weak pronoun can do so, or must do so, subject to dialectal variation; see the volume in this series on clitics. (26)

Norwegian a. Jon leste sannsynligvis ikke denne boken. Jon read probably not this book 'Jon probably did not read this book.' b. Jon leste den sannsynligvis ikke. Jon read it probably not 'Jon probably did not read it.'

In all the Scandinavian languages the object can precede the sentence adverbs only in constructions where the main verb is in "V2 position", preceding the adverbs. If not, the object strictly follows the verb. (27)

a. Icelandic
   Jón hefur sennilega ekki lesið þessa bók.
   Jon has probably not read this book
   'Probably Jon has not read this book.'
b. Norwegian
   Jeg trur at Jon sannsynligvis ikke leste den.
   I think that Jon probably not read it
   'I think that Jon probably did not read it.'
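The conditions on object shift stated in this section can be summarized as a small decision function. This is a rough sketch under simplifying assumptions: it only says whether the object is able to precede the sentence adverbs, not whether it must, and it ignores the dialectal variation mentioned above. The function name `object_may_shift` and the labels `"weak_pronoun"` and `"lexical_np"` are hypothetical.

```python
def object_may_shift(language, object_kind, main_verb_in_v2):
    """Can the object precede sentence adverbs such as negation?

    object_kind is "weak_pronoun" or "lexical_np" (a definite/specific NP).
    """
    if not main_verb_in_v2:
        # Auxiliary present or embedded clause: object follows the verb.
        return False
    if object_kind == "weak_pronoun":
        # Possible (or obligatory) in all the Scandinavian languages.
        return True
    # Lexical NP objects shift only in Icelandic.
    return language == "Icelandic"


print(object_may_shift("Icelandic", "lexical_np", True))    # cf. (25b)
print(object_may_shift("Norwegian", "lexical_np", True))    # cf. (26a)
print(object_may_shift("Norwegian", "weak_pronoun", True))  # cf. (26b)
print(object_may_shift("Norwegian", "weak_pronoun", False)) # cf. (27b)
```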


4.6. The verb-particle construction A characteristic feature of the Germanic languages is the verb-particle construction (give up, turn off, etc.): a verb with a general meaning is combined with a locative or directional adverb (a verb particle), to express a more specific meaning. All the Germanic languages have a wide variety of verb-particle combinations, some of which are lexicalized and idiomatic, while others are formed according to a productive pattern. There is plenty of variation among the Germanic languages regarding the syntactic form of the verb-particle construction. The variation is in part determined by the VO-OV parameter.6 Thus in the OV languages the particle is preverbal whenever the verb in question does not appear in clause-second position. In the VO languages the particle follows the verb.7 (28)

Dutch Jan heeft zijn moeder op-gebeld. Jan has his mother up-called 'Jan phoned his mother.'

(29)

Swedish Jan har ringt upp sin mor. Jan has called up his mother 'Jan phoned his mother.'

The analysis of the verb-particle construction is a highly controversial issue (see for instance den Dikken 1994). The construction is word-like in some respects (especially semantically), but phrase-like in other respects. Note, for instance, that in the V 2 languages the verb and the particle are always separated when the (finite) main verb appears in clause-second position. (30)

Dutch Jan belt zijn moeder vaak op. Jan calls his mother often up 'Jan often phones his mother.'

(31)

Swedish Jan ringde tydligen inte upp sin mor. Jan called apparently not up his mother 'Apparently Jan did not phone his mother.'


Among the Germanic VO languages there is variation regarding the position of the object in relation to the verb particle: In English, Norwegian, Icelandic, and Faroese a lexical NP object may precede or follow the particle, but a pronominal object must precede the particle. The following examples all mean "We let the dog out" or "We let it out". (32)

Norwegian a. Vi slapp ut hunden/*den. we let out dog:DEF/it b. Vi slapp hunden/den ut. we let dog:DEF/it out

In Danish the object always precedes the particle. (33)

a. *Vi lod ud hunden/den. we let out dog:DEF/it b. Vi lod hunden/den ud. we let dog:DEF/it out

In Swedish the object always follows the particle. (34)

a. Vi släppte ut hunden/den. we let out dog:DEF/it b. *Vi släppte hunden/den ut. we let dog:DEF/it out
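The three-way split among the VO languages described above can be sketched as follows. The code is illustrative only: it assumes a clause with one object and one particle, the names `FLEXIBLE`, `particle_orders` and `realise` are hypothetical, and the rules are simply those stated in the text for (32) to (34).

```python
# Languages where a lexical NP object may precede or follow the particle.
FLEXIBLE = {"English", "Norwegian", "Icelandic", "Faroese"}


def particle_orders(language, is_pronoun):
    """Return the permitted orders of object (O) and particle (Prt)."""
    if language == "Danish":
        return [("O", "Prt")]
    if language == "Swedish":
        return [("Prt", "O")]
    if language in FLEXIBLE:
        if is_pronoun:
            return [("O", "Prt")]
        return [("Prt", "O"), ("O", "Prt")]
    raise ValueError(f"no rule recorded for {language}")


def realise(prefix, obj, particle, language, is_pronoun):
    """Spell out every permitted order as a string."""
    forms = {"O": obj, "Prt": particle}
    return [
        " ".join([prefix] + [forms[slot] for slot in order])
        for order in particle_orders(language, is_pronoun)
    ]


print(realise("Vi slapp", "hunden", "ut", "Norwegian", False))
print(realise("Vi slapp", "den", "ut", "Norwegian", True))
print(realise("Vi lod", "hunden", "ud", "Danish", False))
print(realise("Vi släppte", "hunden", "ut", "Swedish", False))
```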

4.7. Raising

Raising is one of the characteristic features of Germanic languages. In this section we will briefly deal with (i) Subject-to-Subject raising, (ii) Object-to-Subject raising, (iii) Subject-to-Object raising, and (iv) verb raising.8 Consider first the following Dutch sentence, in which the clausal Subject occurs in clause final position ("extraposition", "heavy shift"). (35)

Het is niet belangrijk wat jij denkt.
it is not important what you[SG] think
'It is not important what you think.'

Observe that the 'normal' subject position is now filled by the anticipatory pronoun het 'it' (Frisian it, German es).9 With certain verbs, such as 'appear' and 'seem', extraposition is compulsory. (36)

a. Dutch
   Het schijnt dat hij hees is.
   it seems that he hoarse is
b. It seems that he is hoarse.

There is, however, an alternative way to express sentences of the type it+V + Complement, in which the Subject of the complement appears as the Subject of V, i.e. Subject-to-Subject raising (König 1971; see also Hawkins 1986: 75 f.). (37)

a. Dutch
   Hij schijnt hees te zijn.
b. Frisian
   Hy liket heas te wezen.
c. German
   Er scheint heiser zu sein.
   he seems hoarse to be
   'He seems to be hoarse.'

(38)

Swedish Han verkar vara hes. he seems be hoarse 'He seems to be hoarse.'

Object-to-Subject raising ("Tough Movement") is found in most of the Germanic languages (Icelandic is an exception), but is most productive in English. For instance in German this is only possible with five adjectives (leicht 'easy', einfach 'simple', schwer 'hard', schwierig 'difficult', interessant 'interesting'). (39)

Es ist leicht, ihn zu überzeugen.
it is easy him to convince
'It is easy to convince him.'

(40)

Er ist leicht zu überzeugen.
he is easy to convince
'He is easy to convince.'


Furthermore, in all the Germanic languages the verbs of perception (see, hear, etc.) allow the subject of the Object Clause to be expressed as the object of the verb in the main clause (Subject-to-Object raising, also called "Exceptional Case Marking" (ECM), or "Accusative-with-infinitive"; cf. Dik 1987: 237 f.). (41)

Dutch
Ik hoorde dat hij een lied zong.
I heard that he a song sang
'I heard that he sang a song.'

(42)

Swedish
Jag hörde att han sjöng en sång.
I heard that he sang a song
'I heard that he sang a song.'

(43)

Dutch
Ik hoorde hem een lied zingen.
I heard him a song sing
'I heard him sing(ing) a song.'

(44)

Swedish
Jag hörde honom sjunga en sång.
I heard him sing a song
'I heard him sing(ing) a song.'

Notice that the hearing in (43) and (44) takes place while the song is being sung, but that this is not necessarily the case in the non-raised construction (which can be paraphrased as: 'He heard from X that ...'). Some Germanic languages exhibit a similar pattern with at least some verbs of cognition. English is by far the most productive language in this regard. Among the Scandinavian languages there are at best one or two such verbs in each language. (45)

I believe John to be intelligent.

(46)

Icelandic
Ég tel Jón vera gáfaðan.
I believe Jon be intelligent
'I believe Jon to be intelligent.'


Finally, in some West (Continental) Germanic languages, notably Dutch and German, the non-finite auxiliary may appear in the form of an infinitive (verb raising; Kooij 1987 b: 153): (47)

Dutch
Hij zegt dat ie het boek heeft kunnen lezen.
he says that he[cl] the book has be.able:Inf read:Inf
'He says that he has been able to read the book.'

With respect to the position of modals relative to the main verb, Dutch and Luxembourgeois display mirror image orderings as compared to German and Frisian: (48)

Dutch Wij denken dat ie het boek moet kunnen lezen. we think that he[cl] the book must can:Inf read:Inf 'We think that he must be able to read the book.'

(49)

German Wir denken daß er das Buch lesen können muß. we think that he the book read:Inf can:Inf must 'We think that he must be able to read the book.'

Luxembourgeois allows for considerable variation in the placement of the modal infinitive (Schmitt 1984: 175). (50)

a. D' Kanner hu vill an der Schoul misse leieren. the children have a lot in the school must learn b. D'Kanner hu vill missen an der Schoul leieren. c. D'Kanner hu vill missen leieren an der Schoul. d. D'Kanner hu misse vill an der Schoul leieren. 'The children have had a lot to learn in school.'

5.

Subordinate clauses

5.1. Word order in finite subordinate clauses

One of the most striking syntactic characteristics of the Germanic languages is the contrast between main and subordinate clause word order (as already touched upon above), which is closely connected with the V2 phenomenon. The word order contrast is most evident in the Germanic OV languages. (51)

Dutch a. Jan zag een vogel. Jan saw a bird 'Jan saw a bird.' b. Jan zei dat hij een vogel zag. Jan said that he a bird saw 'Jan said that he saw a bird.'

But it can be observed in the Germanic VO languages, too, if the sentence includes a sentence-medial adverb, as in the following Swedish examples (see Platzack 1986, Holmberg & Platzack 1995). (52)

a. Jan såg inte fågeln.
   Jan saw not bird:DEF
   'Jan did not see the bird.'
b. *Jan inte såg fågeln.
   Jan not saw bird:DEF

(53)

a. *Det är märkligt att Jan såg inte fågeln.
   It is strange that Jan saw not bird:DEF
b. Det är märkligt att Jan inte såg fågeln.
   It is strange that Jan not saw bird:DEF
   'It is strange that Jan did not see the bird.'
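The contrast in (52) and (53) can be sketched as a single switch on clause type. The snippet is a minimal illustration, assuming a subject-initial clause with one sentence adverb; `swedish_clause` is a hypothetical helper, not an established analysis.

```python
def swedish_clause(subject, finite_verb, adverb, rest, embedded=False):
    """Toy Swedish linearisation of subject, finite verb and sentence adverb.

    Main clause:     subject - finite verb - adverb - rest   (V2)
    Embedded clause: subject - adverb - finite verb - rest
    """
    if embedded:
        return [subject, adverb, finite_verb] + list(rest)
    return [subject, finite_verb, adverb] + list(rest)


# (52a): main clause order
print(" ".join(swedish_clause("Jan", "såg", "inte", ["fågeln"])))
# (53b): embedded clause order after the complementizer att
print("att " + " ".join(swedish_clause("Jan", "såg", "inte", ["fågeln"], embedded=True)))
```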

In the main clause the finite verb precedes the sentence adverb (here the negation adverb), due to V2. In embedded clauses the sentence adverb precedes the finite verb. The contrast is less salient in Icelandic and Yiddish. Both of these languages observe V2 order in embedded clauses, too, in the sense that the finite verb immediately follows the subject; see Diesing (1990), Rögnvaldsson & Thráinsson (1990), Sigurðsson (1989). (54)

Icelandic
Það er skrýtið að Jón sá ekki fuglinn.
it is strange that Jon saw not bird:DEF
'It is strange that Jon did not see the bird.'

However, in Icelandic V2 order is not obligatory in adverbial clauses or relative clauses, so the main-embedded clause word order contrast can be observed in Icelandic, too; see Sigurðsson (1989: 44f.). (55)

a. ...þegar María loksins keypti bókina.
   when Maria finally bought the-book
b. Það er nú það sem ég ekki veit.
   that is now it that I not know
   'Now that is what I don't know.'

5.2. Subordinating conjunctions Subordinate clauses are typically introduced by a complementizer (subordinating conjunction), but some of the West Germanic languages do not always require such a constituent. For instance, in German the complementizer need not appear with complements of verba dicendi and cognitive verbs. In such cases the finite verb is in the subjunctive form and occurs in second position (Eisenberg 1994: 377; in this example 'hear' functions as a cognitive verb). (56)

Ich höre, du seist gekommen.
I hear:Pres:1SG you[SG] are:Pres:SUBJ:2SG come:PAST.PA
'I hear that you have come.'

In (informal) English the complementizer that is often omitted with object clauses (Quirk et al. 1985: 1049-1050). The same holds true of the Scandinavian languages, in particular the Mainland Scandinavian languages (see Holmberg 1990). (57)

I believe (that) he will win it tomorrow.

(58)

Swedish
Jag tror (att) han inte vinner i morgon.
I think (that) he not wins tomorrow
'I think that he will not win tomorrow.'

In Scandinavian the complementizer cannot be omitted when the embedded clause has main clause word order, which is an alternative with, especially, verba dicendi. (59)

Jan säger *(att) han såg minsann inte någon fågel.
Jan says that he saw certainly not any bird
'Jan says that he certainly did not see any bird.'

6. Sentence types All the Germanic languages also have V-initial order in yes/no questions, imperatives/requests, exclamations (wishes), and certain conditional subclauses. However, in English only auxiliary verbs can occur initially. When there is no other auxiliary verb, English makes use of a dummy auxiliary do in interrogative (Did he buy the book?) and negative sentences (He did not buy the book). (60)

Frisian Wolle jo moarn reedride? want you tomorrow skate 'Do you want to go skating tomorrow?'

(61)

German Komm her! come here 'Come here!'

(62)

Luxembourgeois Versoen d' Bremsen, dann as es eriwer fail the brakes then is it over 'If the brakes fail, it's over.'

Furthermore at least some languages (e. g. Dutch, Icelandic) can also have the verb in first position in 'dramatic' narrative style (notice that Dutch may use the present tense here to describe a situation in the past): (63)

Dutch Komen we thuis, staat Peter voor de deur! come we home stands Peter in front of the door 'When we came home, we found Peter standing in front of the door!'

(64)

Icelandic Komu þeir þá að stórum helli. came they then to big cave 'They came then to a big cave.'

Question word phrases always occur in the clause-initial position. (65)

Dutch Wat heb je gekocht? what have you[Sg] bought 'What did you buy?'

(66)

Swedish Vems förslag kommer dom att stöda? whose proposal come they to support 'Whose proposal will they support?'

(67)

Luxembourgeois Weini kommt der dann? when comes he then 'When does he come?'

Only one wh-phrase can, however, occur clause-initially; in multiple questions other wh-phrases are left in situ. (68)

Norwegian Hvem har sagt hva til hvem? who has said what to whom 'Who has said what to whom?'

In existential/locative sentences there is a general tendency to avoid having an indefinite NP in the first position and one of the strategies employed is to use a repletive (expletive) element. In English this is possible mainly in construction with be, but in all the other Germanic languages it is possible with a wide range of verbs, sometimes even "unergative" verbs, as in (74). (69)

Dutch Er stond een vaas op tafel. there stood a vase on table 'There was a vase standing on the table.'

(70)

Frisian Der wenne in tsjoender yn dat wâld. there lived a sorcerer in that forest 'A sorcerer used to live in the forest.' ('There used to live a sorcerer in that forest.')

(71)

German Es fiel ein Stein vom Dach. it fell a stone from-the roof 'A stone fell off the roof.'

(72)

Yiddish Es zenen faran oyf der velt gazlonim. it are available on the world robbers 'There are robbers in the world.'

(73)

Icelandic Það hafði sokkið bátur um nóttina. there had sunk a.boat in night:DEF 'A boat had sunk during the night.'

(74)

Swedish Det arbetade kvinnor ute på fältet. there worked women out in field:DEF 'There were women working out in the field.'

The construction is possible mainly when the logical subject is indefinite. Dutch er has many uses (repletive, partitive, pronominal, locative; see e. g. Donaldson 1987: ch. 15) and it occurs in a number of contexts where a corresponding particle is absent in e. g. English and German; for instance, when the first position is taken by a question word (Geerts et al. 1985: 395 f., 816—823): (75)

Wie komt er vanavond? (Dutch; repletive 'er') Wer kommt Ø heute abend? (German) who comes there/Ø tonight 'Who is coming tonight?'

Finally, repletive elements are also frequently employed in impersonal passives.10 (76)

Dutch Er werd hier altijd gedanst.

(77)

Frisian Der waard hjir altyd dûnse. there became here always danced 'There was always dancing done here.'

(78)

Luxembourgeois Et gouf gebaut. it gave built 'There was building done here.'

(79)

Swedish Det dansades på bryggan. there dance:Pass on jetty:DEF 'People were dancing on the jetty.'

Such passives are impossible in English. Yiddish also lacks the impersonal passive; instead an active form with men 'one' or a reflexive pronoun is used (Jacobs et al. 1994: 409):

(80)

ven men darf hobn moyekh, helft nit keyn koyekh when one needs have brains help not no brawn 'when brains are needed, brawn won't help'

(81)

es brot zikh a katshke it roast Refl a duck 'a duck is being roasted'

7. The noun phrase

Noun phrase internal word order is generally characterized by the following pattern:

(82)

determiner — numeral — adjective — N — possessor NP — relative clause

where determiner subsumes articles and demonstratives (but note that Icelandic lacks an indefinite article), and numeral includes both quantifiers and cardinal and ordinal numerals. There are, however, several differences between the individual languages. First of all, the North Germanic languages also have a postnominal element (a suffix) to express definiteness. In Swedish, Norwegian, and Faroese both a bound article suffix and a free (prenominal) definite article are used when the NP contains an attributive adjective (see Börjars 1994, Delsing 1993).11

(83)

Faroese tann svarti kettlingurinn the black kitten:DEF 'the black kitten'

Free articles (both definite and indefinite), demonstratives, numerals, quantifiers and adjectives typically precede the noun in all Germanic languages. (84)

Faroese hesir triggir gomlu menninir

(85)

Dutch deze drie oude mannen 'these three old men'

Interestingly, only in Yiddish may adjectival modifiers also occur after the noun as a kind of appositional constituent (i. e. "in an NP-NP structure" (Jacobs et al. 1994: 408); they mention that such a construction is also attested in Semitic):

(86)

a. a sheyn meydl a pretty girl b. a meydl a sheyne a girl a pretty

The relative clause follows the noun in all Germanic languages and is introduced by a relative pronoun or a relative clause complementizer (English that, Scandinavian som (sem)). (87)

Dutch Dat is de man die mijn fiets gestolen heeft. that is the man who my bike stolen has 'That is the man who stole my bike.'

A head noun can also be modified by a present or past participle in all languages (the writing man, the written letter), but at least in Dutch, Frisian, and German the participle can be preceded by its complements. The relativised constituent is always the subject of a modifying participle; notice that the participle is inflected like an attributive adjective.

(88)

Frisian de in boek lezende frou the a book reading woman 'the woman who is/was reading a book'

A modifying past participle has a passive meaning: (89)

German der gestern geschriebene Brief the yesterday written letter 'the letter that was written yesterday'12

The same construction is also attested in North Germanic, but here (with the possible exception of Danish) the "syntactic expandability is heavily constrained. When complements are added in accordance with the valency requirements of the verbs in question, the result is stylistically marked or even deviant" (Askedal 1994: 249). Finally, at least Dutch and Frisian can also have an infinitival verb (plus te 'to') serving as a modifier, which always has a passive meaning as well as a modal nuance:

(90)

Dutch het duur te verkopen huis

(91)

Frisian it djûr te ferkeapjen hûs 'the house that must be/is going to be sold for a high price'

German, on the other hand, must employ an inflected present participle:

(92)

das teuer zu verkaufende Haus the expensive to selling house 'the house that must be/is going to be sold for a high price'

There is considerable variation as to the position of the possessor NP in the Germanic languages. In all languages the adnominal possessor NP can occur on either side of the head noun, although in many languages there is often a preference for one or the other (depending e. g. on the nature of the possessing entity and the internal complexity of the possessor phrase). For instance, possessor NPs generally follow the noun in West Germanic, but if the possessor consists of a proper name or a relationship term it will often precede. If it precedes it will carry the genitive -s (the so-called Saxon genitive; typically with proper names); if it follows it will be preceded by a preposition (but not necessarily in German, see below).

(93)

Dutch de hond van de leraar the dog of the teacher 'the teacher's dog'

(94)

a. Annas kleren b. de kleren van Anna

(95)

a. Anna's clothes b. the clothes of Anna

In Yiddish the order is usually possessor-possessed, but the reverse order is found in constructions expressing certain kinship relations and in the prepositional variant; compare (Birnbaum 1979: 299): (96)

zaan svester-s ziindl his sister-S little.son 'his sister's little son'

(97)

Iser Braandl's Iser Braandl-GEN 'Braandl's son Isser'

(98)

der klimat fjn Kanady the climate of Canada 'the climate of Canada'

The widest range of possible constructions is perhaps attested in German; all examples can be translated as 'Adenauer's speeches'. (99)

a. Adenauer-s Rede-n Adenauer-GEN speech-PL b. die Rede-n Adenauer-s the:NOM.PL speech-PL Adenauer-GEN c. die Rede-n von Adenauer the:NOM.PL speech-PL of Adenauer d. dem Adenauer seine Rede-n the:DAT.M.SG Adenauer his speech-PL

The (substandard) variant with a possessive pronoun cross-referencing the possessor NP (as exemplified in (99 d), where both seine 'his' and the possessor NP Adenauer refer to the same entity) is also attested in other West Germanic languages, particularly in the more colloquial speech variants. In Dutch this involves the reduced form of the possessive pronoun. Observe that the possessor NP in Luxembourgeois (as in German) has dative case.13

(100)

Dutch dat meisje d'r fiets that girl her bike 'that girl's bike'

(101)

Luxembourgeois dem Papp seng Vokanz the:M.SG.DAT father his vacation 'father's vacation'

This construction is also widely used in Norwegian (where it is assumed to be originally a loan from Low German), alongside the more standard constructions (all: 'the Englishman's boat'); see Fiva (1987).

(102)

a. engelskmannen sin båt Englishman:DEF his boat b. engelskmannens båt Englishman:DEF:GEN boat c. båten til engelskmannen boat:DEF to Englishman:DEF

In Faroese the prepositional construction is typically used for personal relationships, but it is also possible for the possessor NP to appear in the accusative (Barnes & Weyhe 1994: 208):

(103)

a. mamma til Kjartan mother to Kjartan 'Kjartan's mother' b. papi drong-in father boy-ACC.SG.DEF 'the boy's father'

Compare also these examples from Faroese (Barnes & Weyhe 1994: 208):

(104)

a. Jogvan-sar bátur Jogvan-GEN boat 'Jogvan's boat' b. bátur Jogvan-sar (less common) boat Jogvan-GEN 'Jogvan's boat'

Note finally that Frisian possessor NPs have the suffix -e, which is mainly used with kinship terms (Tiersma 1985: 55):

(105)

ús heit-e piip our father- pipe 'Our father's pipe'

Attributive possessive pronouns generally precede the noun in the West Germanic languages (my book [En], mein Buch [Ge], mijn boek [Du]), except when the noun is also preceded by a demonstrative or an article, as in this Dutch example:

(106)

een artikel van jou an article of you:OBJ 'an article of yours'

In such a construction the pronoun is in the object form and follows the preposition van 'of'. In Yiddish, however, an indefinite possessed NP may be preceded by an article (cf. Jacobs et al. 1994: 406; Birnbaum 1979: 299).

(107)

mayne a shvester my a sister 'a sister of mine'

Notice that the possessive pronoun can also follow the noun (Jacobs et al. 1994: 406):

(108)

der Bankrot zeyerer the bankruptcy their 'their bankruptcy'

Among the Scandinavian languages Danish, Swedish, and Faroese have prenominal possessive pronouns, while Icelandic, Norwegian, and Northern Swedish have postnominal possessive pronouns. In the former case the head noun has the bare, indefinite form; in the latter case it obligatorily has the definite form.

(109)

a. Danish min bog my book b. Norwegian boka mi book.DEF my

Icelandic and many dialects of Norwegian and Northern Swedish also take a proper name in construction with a pronoun as a postnominal possessor, but only with a definite head noun.

(110)

Icelandic bókin hans Jóns book.DEF his Jon (GEN) 'Jon's book'

Notes

1. These are the standardized North and West Germanic languages. Which dialects become standardized is of course mainly a matter of politics. For instance, German contains dialects, spoken by millions of people, which are as different from standard German as Frisian is from standard Dutch. Luxembourgeois is based on one of these dialects, namely Moselfränkisch. Norwegian has two standard forms.
2. For a recent discussion, see Zwart (1994).
3. Old French was a V2 language (see Vance 1989). Estonian has a form of V2 rule, which, however, is not obligatory the way it is in Germanic (see Vilkuna, this volume).
4. Thus Dutch and various German dialects allow preposition stranding by "R-pronouns":

(i) Waar heeft hij een prijs mee gewonnen? [Du] what has he a prize with won 'What did he win a prize with?'

5. Note, however, that Faroese does not allow inversion of two nominal objects although Faroese has as clear a morphological distinction between accusative and dative as Icelandic; see Holmberg (1994).
6. See also Jacobs et al. (1994: 411) on separable verb prefixes in Yiddish, whose behaviour can be explained in terms of an earlier SOV order.
7. In the Scandinavian languages the particle may also be incorporated in the verb.

(i) Jag har lagt fram / framlagt ett nytt förslag [Sw] I have put forth / forth.put a new proposal

In most cases the form with an incorporated particle is lexicalized, having a more restricted meaning than the corresponding phrasal construct.
8. There is a considerable amount of literature on raising and other infinitival constructions in Germanic. We could mention Postal (1974), Koster (1987: 119), Sigurðsson (1989: 49ff.), Rutten (1991).
9. The same pronouns are used in impersonal constructions: Het regent [Du], 'It is raining' [En], It reint [Fr], Es regnet [Ge], Et reent [Lu]. See Bennis (1986).
10. There is also the alternative construction without the repletive element in e. g. Dutch and German, as when a locative adverbial appears in clause-initial position.

(i) Hier wordt vandaag gedanst [Du] here becomes today danced 'Here is dancing done today'

11. On NP structure in Scandinavian, see especially Delsing (1993) and the articles in Studia Linguistica 47,2; on Icelandic, see Sigurðsson (1993).
12. Compare also Luxembourgeois: dat gebrodent Fleesch 'the roasted meat'. The prenominal participle plus complements is also found in Yiddish, but this is said to be a loan from Modern German (Birnbaum 1979: 81); the 'correct' position of this modifier is after the head noun.
13. The dative case is also used on Yiddish determiners and adjectives, which have no separate genitive form (Jacobs et al. 1994: 405):

(i) dem altn yidns bukh the:M.Sg.Dat old:Dat jew:Gen book 'the book of the old jew'

References

Askedal, John Ole 1994 "Norwegian", in: Ekkehard König & Johan van der Auwera (eds.), 219-270.
Barnes, Michael P. & Eivind Weyhe 1994 "Faroese", in: Ekkehard König & Johan van der Auwera (eds.), 190-218.
Bennis, Hans 1986 Gaps and dummies. Dordrecht: Foris.
Birnbaum, Solomon A. 1979 Yiddish: a survey and a grammar. Toronto: University of Toronto Press.
Börjars, Kersti 1994 "Swedish double determination in a European typological perspective", Nordic Journal of Linguistics 17: 219-252.
Comrie, Bernard (ed.) 1987 The major languages of the world. London: Croom Helm.
Delsing, Lars-Olof 1993 The internal structure of noun phrases in the Scandinavian languages. Doctoral dissertation, Department of Scandinavian Languages, University of Lund.
Diesing, Molly 1990 "Verb movement and the subject position in Yiddish", Natural Language & Linguistic Theory 8: 41-79.
Dik, Simon C. 1980 Studies in Functional Grammar. New York: Academic Press.
Dik, Simon C. 1987 The theory of Functional Grammar. Part I: The structure of the clause. Dordrecht: Foris.
Dikken, Marcel den 1994 Particles. Oxford and New York: Oxford University Press.
Donaldson, Bruce C. 1987 Dutch reference grammar. Leiden: Nijhoff.
Eisenberg, Peter 1994 "German", in: Ekkehard König & Johan van der Auwera (eds.), 349-387.
Finegan, Edward 1987 "English", in: Bernard Comrie (ed.), 77-109.
Fiva, Toril 1987 Possessor chains in Norwegian. Oslo: Novus.
Geerts, G., W. Haeseryn, J. de Rooij & M. C. van den Toorn 1984 Algemene Nederlandse Spraakkunst. Groningen: Wolters-Noordhoff.
Geilfuß, Jochen 1991 "Jiddisch als SOV-Sprache", in: Verb- und Verbphrasensyntax. Arbeitspapiere des Sonderforschungsbereichs 340 "Sprachtheoretische Grundlagen für die Computerlinguistik", Bericht Nr. 11. Universität Stuttgart, Universität Tübingen, IBM Deutschland GmbH (Stuttgart).
Haider, Hubert & Martin Prinzhorn (eds.) 1986 Verb second phenomena in Germanic languages. Dordrecht: Foris.
Hawkins, John A. 1986 A comparative typology of English and German: unifying the contrasts. London: Croom Helm.
Hawkins, John A. 1987 "Germanic", in: Bernard Comrie (ed.), 110-138.
Holmberg, Anders 1990 "Bare infinitivals in Swedish", in: Joan Mascaro & Marina Nespor (eds.), 234-237.
Holmberg, Anders 1994 "Morphological parameters in syntax: the case of Faroese". Report 35, Department of Linguistics, University of Umeå.
Holmberg, Anders & Christer Platzack 1995 The role of inflection in Scandinavian syntax. Oxford and New York: Oxford University Press.
Jacobs, Neil G., Ellen F. Prince & Johan van der Auwera 1994 "Yiddish", in: Ekkehard König & Johan van der Auwera (eds.), 388-419.
König, Ekkehard 1971 Adjectival constructions in English and German: a contrastive analysis. Heidelberg: Julius Groos Verlag.
König, Ekkehard & Johan van der Auwera (eds.) 1994 The Germanic languages. London: Routledge.
Koster, Jan 1987 Domains and dynasties. Dordrecht: Foris.
Kooij, Jan 1987b "Dutch", in: Bernard Comrie (ed.), 139-156.
Mascaro, J. & Marina Nespor (eds.) 1990 Grammar in progress. Dordrecht: Foris.
Quirk, Randolph, Sidney Greenbaum, Geoffrey Leech & Jan Svartvik 1985 A comprehensive grammar of the English language. London: Longman.
Platzack, Christer 1986 "COMP, INFL and Germanic word order", in: Lars Hellan & Kirsti Koch Christensen (eds.), Topics in Scandinavian syntax. Dordrecht: Kluwer, 185-234.
Postal, Paul 1974 On Raising. Cambridge MA: MIT Press.
Primus, Beatrice this volume "The relative order of recipient and patient in the languages of Europe".
Rögnvaldsson, Eiríkur & Höskuldur Thráinsson 1990 "On Icelandic word order once more", in: Joan Maling & Annie Zaenen (eds.), Syntax and Semantics 24: Modern Icelandic syntax. New York: Academic Press, 3-40.
Rutten, Jean 1991 Infinitival complements and auxiliaries. Doctoral dissertation, University of Amsterdam.
Santorini, Beatrice 1989 The generalization of the verb-second constraint in the history of Yiddish. Doctoral dissertation, University of Pennsylvania.
Schmitt, Pierre 1984 Untersuchungen zur luxemburgischen Syntax. Marburg: Elwert.
Sigurðsson, Halldór Á. 1989 Verbal syntax and case in Icelandic. Doctoral dissertation, Department of Scandinavian Languages, University of Lund.
Sigurðsson, Halldór Á. 1993 "The structure of the Icelandic NP", Studia Linguistica 47: 177-197.
Tiersma, Pieter M. 1985 Frisian reference grammar. Dordrecht: Foris.
Vance, B. 1989 Null subjects and syntactic change in Medieval French. Doctoral dissertation, Cornell University, Ithaca, N. Y.
Vilkuna, Maria this volume "Word order in European Uralic".
Zwart, Jan Wouter 1994 Dutch syntax: a minimalist approach. Doctoral dissertation, University of Groningen.

Anna Siewierska and Ludmila Uhlirova

An overview of word order in Slavic languages1

1. Introduction

There are 14 extant Slavic languages which are typically grouped into three branches: west, east and south. The languages of the west branch are: Czech, Kashubian, Lower Sorbian, Polish, Slovak and Upper Sorbian. The two Sorbian languages, also referred to as Upper and Lower Lusatian, are spoken by about 120 thousand people east of a line from Berlin to Dresden in the upper reaches of the river Spree, mainly around the towns of Cottbus and Bautzen. Kashubian, whose speakers number about 200 thousand, is spoken on the left bank of the lower Vistula, west of Gdansk. The east branch of Slavic comprises Byelorussian, Russian and Ukrainian. To the south branch belong Bulgarian, Macedonian, Slovene, Serbian and Croatian (the last two often grouped together as Serbo-Croatian).

2. Inflection and other functional categories2

The Slavic languages with the exception of Bulgarian and Macedonian all display case marking on nouns and pronouns. There are seven cases, namely: nominative, accusative, genitive, dative, instrumental, locative and vocative. In the East Slavic languages and in Slovak and Slovene the vocative case is either obsolete or obsolescent (Byelorussian, Ukrainian). The Sorbian languages and Slovene use the instrumental and locative cases only with prepositions. In Bulgarian and Macedonian vestiges of case marking on nouns are found with masculine nouns denoting proper names and close relatives. Pronouns in these two languages are inflected for nominative, accusative and dative case.

The Slavic verb is inflected for the verbal categories of person, number, mood, aspect, tense and voice. With the exception of Bulgarian and Macedonian, both of which have an additional evidential (renarrated) mood, the mood categories are: indicative, conditional and imperative. The indicative mood has no special morphological markers. No regular, all-embracing rules can be given for the formation of the aspects. The most common means of forming the perfective is by prefixation, and of the imperfective via vowel lengthening or a change in the present or infinitive endings. There are many suppletive forms. The only synthetic tense manifested in all the Slavic languages is the present. Bulgarian, Macedonian, Serbo-Croatian and Upper and Lower Sorbian have also retained the aorist and imperfect. In Serbo-Croatian, however, the latter two tenses tend to be used only in writing.

All of the Slavic languages have a periphrastic passive, restricted to transitive verbs, formed by means of the auxiliary be and the past participle of the verb. The periphrastic passive is used mainly in the written language. By contrast, constructions with the reflexive morpheme se or się and the 3rd person singular of the verb, which are often analyzed either as passive or impersonal, are widespread in both speech and writing.

In all the Slavic languages the verb agrees with the subject in person (though not in the East Slavic preterite) and number, and in compound tenses formed on a participle also in gender. Slovene and Sorbian have retained the dual number.

All the Slavic languages manifest null subjects (pro-drop), though the conditions under which pro-drop applies differ from language to language. For example, whereas in Western Slavic a pronominal subject coreferential with a newly introduced rhematic referent in the preceding clause does not require expression via an overt subject pronoun, in East Slavic it does (Bily 1981; Adamec 1985, 1987, 1988; Uhlirova 1988). Compare the second clause in the Polish (1), which is grammatical both with and without the overt subject pronoun ona, with the Russian (2), in which the subject pronoun ona is obligatory.

(1)

Polish Piotr spotkał Ewę. Ona/Ø powiedziała mu, że .... Peter met Eve. She told him that ....

(2)

Russian (Bily 1981: 113) Petr vstretil Natašu. Ona/*Ø skazala jemu, čto... Peter met Nataša. She told him that...

Bulgarian and Macedonian are the only two Slavic languages which have a definite article (derived from the demonstrative). In both languages the article, inflected for gender and number, takes the form of a suffix on the noun. If the noun is preceded by a modifier, for example, a numeral or adjective, the definite article is suffixed to the first stressed word of the NP as in (3 b). (3)

Macedonian (De Bray 1980: 268) a. vol-ot ox-the

b. beli-ot vol white-the ox

In both Bulgarian and Macedonian the polysemic lexeme edin (Bulgarian) and eden (Macedonian) is acquiring the status of an indefinite article, used with specific indefinites. The indefinite article is obligatory only with initial, bare, concrete, countable nouns in the singular, which do not bear sentence stress and function as the theme of the clause (in the Functional Sentence Perspective sense). Thus whereas (4b) without edno is ungrammatical, both (4c), in which dete bears sentence stress, and (4d), in which dete is clause final, require no article.

(4)

Bulgarian a. Edno dete vleze v staja-ta. a/one child entered in room-the 'A child entered the room.' b. *Dete vleze v staja-ta. child entered in room-the c. DETE vleze v stajata. 'It was a child who entered the room.' d. V stajata vleze dete. 'A child entered the room.'

Some of the other Slavic languages, namely Upper Sorbian, Czech and Serbo-Croatian, evince an article-like usage of the definite determiners 'this' and 'that' as well as that of the numeral 'one', though mainly in the spoken language. The expression of definiteness in these languages is not, however, grammaticalized.

3. The word order type

In terms of word order typology the Slavic languages are generally classified as free SVO languages. The SVO classification is motivated both functionally and on statistical grounds. Functionally, SVO is the basic, unmarked order in the sense that a sentence with this order has the widest contextual applicability; it may be found in any position in a text, at the beginning, in the middle or at the end, and each of the constituents may be contextually bound or unbound. The SVO order occurs in isolated sentences and in answers to questions such as 'What happened?'. Moreover, SVO is the order used in cases of nondistinctiveness of the two arguments of the verb (syncretism of nominative and accusative case forms; identity of person, number and gender) as in the examples below from Polish and Bulgarian. (There is no gender distinction in the 3rd person present tense form of the verb in Polish or in the aorist in Bulgarian.)

(5)

Polish Byt określa świadomość. existence determines awareness

(6)

Bulgarian a. Tanja vidja Marija. Tanja saw Marija

In the case of Bulgarian and Macedonian, there is yet another piece of evidence for the basic SVO nature of these languages, namely SVO order hardly ever occurs with clitic doubling. A clause such as (6 a) with clitic doubling will normally be interpreted as manifesting OVS order as in (6 b) rather than SVO order. (6)

b. Tanja ja vidja Marija. Tanja her saw Marija 'Marija saw Tanja.'

In clauses with two full NP participants SVO order is the statistically dominant order. The available text counts for various languages suggest that in such clauses SVO order occurs in about two thirds of the cases. For example, for Czech Uhlirova gives the figure 63% and for Polish Siewierska cites the figure of 69% (see § 4.2).

Upper and Lower Sorbian, in contrast to all the other extant Slavic languages, are sometimes claimed to exhibit basic SOV as opposed to SVO order. However, though SOV order with nominal participants, as in (7), is much more common than in the other Slavic languages, there is currently no statistical data available to suggest that SOV order is in fact the dominant main clause order.

(7)

Upper Sorbian (Sewc-Schuster 1976: 52) a. Mefko knihu cita. Mark book reads 'Mark is reading a book.'

b. Kombajn zito syce. combine wheat harvests 'The combine harvests the wheat.'

Only in subordinate clauses as in (8) is SOV order unquestionably dominant.

(8)

(Polanski 1967: 70) Dyrbju sej nowu drastu kupic. must:1SG RFL new suit buy 'I must buy myself a new suit.'

The frequency of subordinate SOV order in the Sorbian languages is attributed to the influence of German, from which these languages have also borrowed the Vaux...Vparticiple frame (with compound verbal forms) illustrated in (9).

(9)

(De Bray 1980: 460) Sym hižo wupowědź dał. be:PRS:1SG already notice given 'I have already given notice.'

Significantly, however, neither the frame construction nor verb-final subordinate order is obligatory. For instance in the subordinate clause in (10) the verb is in second position. (10)

Pafka praji, zo wuznaje so lepje na stawiznach hac ja. Pafka told that is:3SG RFL better at the thing than me

The choice between the "second" and final position of the non-rhematic verb in the Sorbian clause is influenced by a number of syntactic, stylistic and contextual factors.

Turning to the qualification of Slavic languages as "free" word order languages, apart from the location of clitics there are virtually no syntactic constraints on the ordering of phrases in main declarative clauses. Thus in each of the Slavic languages all twenty-four possible combinations of a subject, direct object, indirect object and verb occur as grammatical declarative orders. However, it should be stressed that the variants are NOT freely interchangeable in texts: they are not pragmatically, communicatively free and also their frequencies vary, from very common ones to rare ones (see § 4.2). The label free denotes nothing more than that the sentence position of an element is not directly determined (defined) by its syntactic function (Mathesius 1942: 182 and Firbas 1992: 118).

Considerable freedom of order is also manifest at the phrasal level. Though the basic location of demonstratives, quantifiers, numerals and most adjectives is prenominal, they can also be positioned postnominally (see § 7). The same applies to the modifiers of the adjective and adverb. Moreover, all Slavic languages exhibit cases of discontinuous NPs, AdjPs and AdvPs. The example in (11) is from spoken Russian, that in (12) from a Polish literary text.

(11)

Russian Ja sovsem prišla nedavno. I quite came recently 'I came quite recently.'

(12)

Polish Znowu nowych wprowadzi do wyobraźni znajomych. again new introduce to imagination acquaintances 'Again his imagination will be stimulated by new acquaintances.'
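The figure of twenty-four declarative orders cited earlier in this section follows from simple counting: four constituents (subject, verb, direct object, indirect object) can be arranged in 4! = 24 linear orders. A minimal sketch of that count, in which the labels S, V, DO and IO and the function name are illustrative assumptions rather than notation taken from the chapter:

```python
from itertools import permutations

# Illustrative labels (an assumption, not the chapter's notation):
# subject, verb, direct object, indirect object.
CONSTITUENTS = ("S", "V", "DO", "IO")


def constituent_orders(constituents=CONSTITUENTS):
    """Return every linear order of the given constituents as a list of tuples."""
    return list(permutations(constituents))


if __name__ == "__main__":
    orders = constituent_orders()
    print(len(orders))  # number of logically possible orders
    for order in orders:
        print(" ".join(order))
```

The enumeration only lists the logically possible orders; it says nothing about their pragmatic status or their text frequency, which the chapter stresses differ considerably from one order to another.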

PPs can be discontinuous only if the prepositional object is itself modified; there is no preposition stranding.3 Moreover, no part of the prepositional object may precede the preposition, as evinced by the ungrammaticality of (13c).

(13)

Polish a. O tej mówiliśmy dziewczynie. about this spoke:1PL girl 'We spoke about this girl.' b. O dziewczynie mówiliśmy tej. c. *Tej mówiliśmy o dziewczynie.

It needs to be also mentioned that PPs within the NP cannot be discontinuous, since PPs as units are not marked according to whether they occur in NP or S.

4. Order in declarative clauses

Given the virtual lack of syntactic restrictions on the placement of major clausal constituents in declarative clauses, the order of these constituents tends to adhere to the theme-rheme principle. The preference for theme-rheme articulation has been documented by the analysis of most of the Slavic languages; see, for instance, Firbas (1992), Danes (1959), Uhlirova (1987) for Czech, Mistrik (1966) for Slovak, Adamec (1966), Sirotinina (1961) or Kovtunova (1976) for Russian, Toporisic (1967) for Slovene, Silic (1978) for Serbo-Croatian, Georgieva (1974) and Pencev (1984) for Bulgarian, Michalk (1970) and Fasske (1981) for Upper Sorbian, Huszcza (1980), Krucka (1982) or Siewierska (1993) for Polish and also some of the papers in Bernini (to appear).

The tendency to linearize constituents in accordance with a gradual rise in communicative dynamism is not an absolute one. Rheme > theme order (after Mathesius called the "subjective" order) is also possible. It is considered to be a "counterpart, or rather a complement to, the FSP linearity principle" (Firbas 1992: 120). The speaker uses the rheme > theme order to express his/her individual, i. e. subjective, attitude to the conveyed information: the information is evaluated as unexpected, surprising, striking, conspicuous, remarkable, etc. Thus whereas according to the theme > rheme principle the most natural response to the question in (14) would be the OVS clause in (14 a) with the information focus expressed in clause final position, in speech one may well encounter the SOV (14 b), or the OSV (14c), or, also, the SVO (14d).

(14)

(15)

Russian Kto napisal Evgenija Onegina? who wrote Eugene Onegin:ACC 'Who wrote Eugene Onegin?' a. Evgenija Onegina napisal Puškin. Evgenija Onegina:ACC wrote Pushkin:NOM 'Pushkin wrote Eugene Onegin.' b. Puškin Evgenija Onegina napisal. c. Evgenija Onegina Puškin napisal. d. Puškin napisal Evgenija Onegina.

It is important to note that the intonation contour of (14 a) is quite different from that of (14 b, c, d): in the latter, the intonation centre (focus) falls on the word in initial position. Note also that, as illustrated in (14 d), even the "basic" word order may be communicatively marked. The rheme-theme order is characteristic of optative and exclamative clauses, of narrative genres, poetry and spoken genres, and is also common in journalistic writing, particularly in short news items. If the initial rhematic position is occupied by a non-subject, the subject is placed either in the middle or at the end of the clause. The latter, shown in (17), is particularly common in the East Slavic languages.

(16)

Cto napisal Puskin?
what wrote Pushkin
'What did Pushkin write?'

(17)

Evgenija Onegina napisal Puskin.
Evgenija Onegina wrote Pushkin
'Eugene Onegin is what Pushkin wrote.'

The "pure" rheme > theme order as given in (17) is only one type (one pole) of the "subjective" communicative perspective. The order theme1 > rheme > theme2, with the rheme in the middle of the clause, is much more frequent. It is quite common in spoken genres of all the Slavic languages; it is motivated by the speaker's desire to foreground the rheme, i. e. to present the most important information before the less important, contextually bound (presupposed) information. The advantage of having the rheme in the middle position (which often coincides with the penultimate position in the clause) is that it is much less emotive than the initial position. The rheme > theme order is most widely used in the East Slavic languages. In fact, together with the theme1 > rheme > theme2 order, it is claimed to constitute the norm in speech (Lapteva 1976; Sirotinina 1961; Zemskaja 1973; Keijsper 1985 and Yokoyama 1986). Adherence to rheme > theme order in spoken varieties of the East Slavic languages is a frequent source of phrasal discontinuity, as in (18).

(18)

Russian
Strasno! Byt' nicego ne mozet strasneje!
terrible be nothing NEG can terrible:CMPR
'Terrible! Nothing can be more terrible than that!'

Probably, only in East Slavic is it possible to postpose a subordinate conjunction after the rheme as in (19). (19)

Russian (Barnetová et al. 1979)
Tak ja spal, tak spal! U sebja doma potomu cto.
so I slept so slept in my place because
'I slept so comfortably because I was at home.'


Among the South Slavic languages, rheme > theme order is more common in Bulgarian and Macedonian than in Slovene. And in West Slavic, the rheme is placed in the middle of the clauses more often in Polish and Slovak than in Czech.

4.1. The location of clitics

4.1.1. Auxiliary and pronominal clitics

A number of differences in the ordering of clausal constituents in Slavic stem from the placement of auxiliary and pronominal clitics. In the Slavic languages that possess such clitics, i. e. in West and South Slavic (as a consequence of historical development, the East Slavic languages have lost them), the position of clitics is determined by their stress quality in the flow of speech, i. e. by rhythmical factors. In languages with fixed accent, namely in West Slavic, as well as in Serbo-Croatian and Slovene, there is a tendency to locate clitics early in the utterance, preferably in Wackernagel's position, i. e. after the first stressed constituent in the utterance. On the other hand, in Bulgarian and Macedonian, i. e. the languages with movable accent, Wackernagel's "law" operates only inside the NP. In all Slavic languages, including East Slavic, Wackernagel's law operates in the domain of inflexible clitics. There is no Slavic language in which Wackernagel's "law" holds categorically rather than as a preference. In the West Slavic languages, Serbo-Croatian and Slovene this preference is quite strong: all clitics are regularly placed in Wackernagel's position. We see this in the Slovene and Czech examples below on the basis of the placement of, respectively, the reflexive particle and the auxiliary be of the compound past (20 a), the object personal clitics and the be of the compound past (20 b), the conditional particle and the personal pronouns (21 a), and the reflexive particle and a personal pronoun (21 b).

(20)

Slovene (questionnaire)
a. Pojavil se je problem.
   emerged RFL be:PRS:3SG problem
   'There emerged a problem.'
b. Oce mu jo je dal.
   father he:DAT it:ACC be:PRS:3SG given
   'Father gave/has given it to him.'

(21)

Czech (Toman 1986: 124)
a. Dnes by ti je jiste prodali levneji.
   today would:3PL you them certainly sold cheaper
   'They would sell them to you cheaper today.'
b. Tohle stare kolo se ti jednou rozpadne.
   this old wheel itself you once falls-apart
   'This old bicycle will fall apart on you one day.'

Nonetheless, under certain conditions, clitics may leave their "regular" Wackernagel's position. For instance, in Slovak, especially in fiction (Uhlířová 1988), the conditional auxiliary clitic is frequently placed in "third" position (22 a) or, when enclitic to the verb, even later (22 b).4

(22)

Slovak
a. Maria na stôl by polozila knihu.
   Mary on table would put book
   'Mary would put a/the book on the table.'
b. Maria jablká priniesla by zajtra.
   Mary apples bring would tomorrow
   'Mary would bring the apples tomorrow.'

In Upper Sorbian, the postposition of a clitic after the verb is often found in clauses with V-second order (23); the reflexive clitic may even occupy the last position in the clause (24).

(23)

Upper Sorbian
Njedzelu zwoblece so wucer belu koslu.
Sunday took RFL teacher white shirt
'On Sunday the teacher put on a white shirt.'

(24)

Wsitcy posmewkowachu so.
all laughed RFL
'All people laughed.'

Serbo-Croatian is the only Slavic language in which a clitic may 'interrupt' an NP.5 A clitic may be located either after the first stressed word, as in (25 a), or later in the clause, i. e. after the first constituent, as in (25 b), or, if the subject of the clause is long (with a postmodifier, or coordinated), it may be placed after the first stressed element of the VP (25 c).

(25)

Serbo-Croatian (Browne 1986: 25; Sedlacek 1989: 259)
a. Nasi su uvazeni clanovi postavili jedno pitanje.
   our be:PRS:3PL respected members posed one question
   'Our respected members posed a question.'
b. Nasi uvazeni clanovi su postavili jedno pitanje.
   our respected members be:PRS:3PL posed one question
c. Rankovic, Lazarevic, Glisic i Domanovic poznati
   Rankovic, Lazarevic, Glisic and Domanovic famous
   su srpski knjizevnici.
   PRS:3PL Serbian writers
   'Rankovic, Lazarevic, Glisic and Domanovic are famous Serbian writers.'
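The two Serbo-Croatian placements in (25 a) and (25 b) can be made concrete with a small sketch. It is only an illustration under simplifying assumptions: the clause is supplied as a hand-segmented list of constituents, every word is treated as stressed, and the function names `after_first_word` and `after_first_constituent` are invented here for the purpose of the example rather than taken from the literature.

```python
from typing import List

# A clause is a list of constituents; each constituent is a list of words.
Clause = List[List[str]]


def after_first_word(clause: Clause, clitic: str) -> str:
    """Place the clitic after the first word, even if this splits a constituent."""
    words = [w for constituent in clause for w in constituent]
    return " ".join(words[:1] + [clitic] + words[1:])


def after_first_constituent(clause: Clause, clitic: str) -> str:
    """Place the clitic after the whole first constituent."""
    first, rest = clause[0], clause[1:]
    words = first + [clitic] + [w for constituent in rest for w in constituent]
    return " ".join(words)


# Example (25), segmented by hand; diacritics omitted as in the running text.
clause = [["Nasi", "uvazeni", "clanovi"], ["postavili"], ["jedno", "pitanje"]]

print(after_first_word(clause, "su"))         # pattern of (25 a): the NP is interrupted
print(after_first_constituent(clause, "su"))  # pattern of (25 b): the NP stays intact
```

The sketch says nothing about which of the two placements a speaker chooses, nor about the third option in (25 c); it merely shows that "second position" can be counted in words or in constituents.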

In Upper Sorbian the auxiliary verbs of the compound past, pluperfect and the conditional mood tend to be proclitic, even orthotonic. They often precede the active past participle and frequently occur initially, as in (26).

(26)

Upper Sorbian (De Bray 1980: 460)
a. Sym hizo wupowedz dał.
   be:PRS:1SG already notice given
   'I have already given notice.'
b. Bych rad to widzial.
   would pleased it seen
   'I would gladly see it.'

On the other hand, the object personal clitics and the reflexive clitics favour the second position in the clause unless there is an auxiliary verb present, in which case they tend to follow the auxiliary verb. Reflexive clitics, however, can also be placed at the beginning of a main or subordinate clause that is not the first (initial) clause of a complex or compound sentence (see ex. below). Polish is perhaps the most liberal of the Slavic languages in the placement of clitics. The remnants of the present auxiliary of the compound past (personal endings) are always enclitic, typically to the verb, as are the clitics of the imperative. However, if a sentence contains both the past tense personal endings and the conditional clitic by, the former necessarily attach to the latter (Mikos & Moravcsik 1986). Note the ungrammaticality of (27 c, d), where the two clitics are separated from each other.

(27)

Polish
a. Ja powiedziala-by-m.
   I said:F-would-1SG
   'I would say.'
b. Ja-by-m powiedziala.
   I-would-1SG said:F
c. *Ja-by powiedziala-m.
   I-would said:F-1SG
d. *Ja-m powiedziala-by.
   I-1SG said:F-would

Presumably due to the above, the personal endings of the past tense cannot be encliticized to the verb in the presence of a few words which contain by, namely gdyby 'if', jakby 'if', oby 'I wish', zeby 'in order to'. These words are always clause initial, and the personal clitics are consequently in second position. With the object and reflexive clitics early placement in the clause is favoured and final placement avoided, but there are no absolute requirements. Thus both (28 a) and (28 b), for example, are possible.

(28)

Polish
a. Bardzo mi się podobal.
   very I:DAT RFL liked:3SG
   'I liked the look of him very much.'
b. To Mozarta, objasnia Karol, kiedy serenada skonczyla się.
   PTL Mozarta explains Charles when serenada finished RFL
   'It was Mozart, Charles says, when the serenade was over.'

The final placement of się is a new, frequent tendency in present-day Polish (Jodlowski 1976; Misz 1966; Orlos 1976). In Kashubian the ordering characteristics of the object and reflexive clitics are similar to those of Polish. An object clitic may, however, be placed in absolute clause initial position (29), a phenomenon which in Polish is encountered only sporadically in colloquial speech.

(29)

Kashubian (Breza & Treder 1981: 153)
Mu sg zachcaio ti zloti szable.
he:DAT RFL want that gold sword
'He developed a desire for that gold sword.'

In contrast to Polish, the conditional auxiliary be (inflected or uninflected) is virtually always preverbal, though not necessarily second. E. g.

(30)

a. Jo bem przeszedl do dom, ale ...
   I would:1SG came to home, but
   'I would come home, but ...'
b. Ocze eszcze be chcale, jeno z ladek ni moze.
   eyes still would:1SG want but stomach NEG can
   'I would very much like to have more, but I can't.'

The auxiliary of the compound past occurs in second position. It must be mentioned, though, that the compound past with be is becoming obsolescent. Younger speakers tend to use either the synthetic past or the perfect with have. Unlike in the languages discussed above, the placement of clitics in Bulgarian and Macedonian is governed by a grammatical principle (sometimes called the principle of coherence of members): a clitic must either immediately precede or follow its 'host'. An important difference between the two languages is that in Macedonian, but not in Bulgarian, a pronominal clitic may be placed at the beginning of a simple sentence. Note the ungrammaticality of the Bulgarian (31 b) as compared to the Macedonian (32).6

(31)

Bulgarian
a. Pozasmja se tja.
   laughed RFL she
   'She laughed.'
b. *Se pozasmja tja.

(32)

Macedonian
Si storil nesto?
RFL done:2SG anything
'Have you done anything?'

Both in Macedonian (33) and in Bulgarian the conditional morpheme bi and the future morpheme (Bulg. ste, Mac. k'e) are proclitic.

(33)

Macedonian (Kramer 1986: 82, 106)
a. K'e pojdes kaj nego i k'e mu reces.
   FUT go:2SG to him and FUT he:DAT say
   'You will go to him and you will tell him.'
b. Bi sakala.
   would like
   'I would like.'


When used as an auxiliary or copula, be can be either pro- or enclitic, but it cannot stand at the beginning of a sentence. Cf.

(34)

Macedonian
Ocigledno e deka...
evident is that...
'It is evident that...'

Postposition of clitics is obligatory after participles and after deverbal adjectives: (35)

Bulgarian
strämno spuskastite se vitoski sklonove
sheer running-down RFL Vitocha slopes
'sheer slopes of Vitocha'

It needs to be stressed that although clitic placement in Bulgarian and Macedonian may in certain instances parallel clitic placement in West Slavic and the other South Slavic languages, the factors underlying this placement are not the same. Thus, for example, whereas in both the Bulgarian (36) and the Czech (37) the reflexive clitic occurs in second position, this placement is motivated in Bulgarian by the grammatical "coherence" principle, which requires the clitic to be adjacent to its host, and in Czech by Wackernagel's law.

(36)

Bulgarian
Bratovcedät se värna na drugija den.
cousin RFL returned on other day
'The cousin returned the other day.'

(37)

Czech
Bratranec se vratil druhy den.
cousin RFL returned other day
'The cousin returned the other day.'

This different motivation becomes evident as soon as the word order in (36) and (37) is changed into that in (38) and (39) respectively. (38)

Bulgarian
Na drugija den bratovcedät se värna.

(39)

Czech
Druhy den se bratranec vratil.

Since in Czech, Slovak, Serbo-Croatian and Slovene the auxiliaries are enclitic to the first stressed constituent, when the lexical verb is clause initial they follow it, and when it is not, they precede it. Compare the examples in (40) and (41) below adapted from Rivero (1990). (40)

Slovak
a. Napisal som list.
   written be:PRS:1SG letter
   'I have written a letter.'
Serbo-Croatian
b. Citao sam knjigu.
   read be:PRS:1SG book
   'I have read a book.'

(41)

Slovak
a. Ja som napisal list.
b. List som napisal.
   'I have written a letter.'
Serbo-Croatian
c. Ja sam citao knjigu.
d. Knjigu sam citao.
   'I have read a book.'

4.1.2. Inflexible clitics

Perhaps the most common (but not universal!) Slavic inflexible clitic is li. In Czech, Polish, Serbo-Croatian, as in Bulgarian and Russian, it regularly takes the position after the first stressed constituent. The "second" position of the interrogative li is regular if the rheme of the question or at least its first stressed component stands at the beginning of the clause (see § 6.1 for more details). The word order of the South Slavic polysemic clitic da differs from that of li. In Bulgarian and Macedonian da is placed immediately before the verb; it may be separated from the verb only by another clitic, as in (42).


(42)

Bulgarian
Grechota e chljabät da vi caka.
sin is bread to you wait
'It is a sin to let the bread wait for you.'

The adjacency of da and the verb is so strict that some compound conjunctions, the second component of which is da, such as predi da 'before' or vmesto da 'instead' become discontinuous: (43)

Bulgarian
Predi ulickata da me izvede na plostada,...
before street to me:ACC lead:3SG on square
'Before I reach the square, walking through the street ...'

On the other hand, in Serbo-Croatian and Slovene da is placed at the beginning of the clause (it behaves like any other conjunction): (44)

Serbo-Croatian
Ja sam primijetio da se brod prilicno ljulja.
I observed:1SG to RFL boat pleasantly pitches
'I observed that the boat pitches pleasantly.'

4.1.3. Order in the clitic cluster

The following constitutes a summary of the major ordering tendencies noted in the individual languages.

(45)

Serbo-Croatian (Browne 1975)
li > aux (except je) > dat > acc > gen > rfl > je

(46)

Slovene (Bennet 1986)
particles > pst aux (except je) > rfl > dat > acc > gen > fut aux > je

(47)

Bulgarian (Englund 1977: 109-119)
li > aux (except je) > dat > acc > rfl > je

(48)

Macedonian (De Bray 1980)
neg > aux > li > dat > acc

(49)

Czech (Toman 1986: 125)
li > aux > rfl > dat > acc

(50)

Slovak (De Bray 1980: 222)
conditional by > aux > rfl > dat > acc

(51)

Polish (Rappaport 1988: 310)
dat > się > acc > gen (nonrigid)

(52)

Upper Sorbian (De Bray 1980: 461)
li > rfl > dat > acc

(53)

Kashubian (Breza & Treder 1981)
aux > dat > acc (nonrigid)
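The templates in (45) to (53) can be read as ordered lists of slots against which an attested clitic sequence is checked. The following is a minimal sketch of such a check, not a procedure proposed in the sources cited: the dictionary `TEMPLATES`, the function `follows_template` and the slot labels are illustrative assumptions, "acc" stands for the accusative slot, and the Slovene entry assumes that the reflexive slot follows the past auxiliary slot. The nonrigid Polish and Kashubian templates are left out.

```python
from typing import Dict, List

# Slot orders transcribed from the rigid templates in (45)-(52).
# Assumption: in Slovene the "rfl" slot follows the past auxiliary slot.
TEMPLATES: Dict[str, List[str]] = {
    "Serbo-Croatian": ["li", "aux", "dat", "acc", "gen", "rfl", "je"],
    "Slovene": ["particles", "pst aux", "rfl", "dat", "acc", "gen", "fut aux", "je"],
    "Bulgarian": ["li", "aux", "dat", "acc", "rfl", "je"],
    "Macedonian": ["neg", "aux", "li", "dat", "acc"],
    "Czech": ["li", "aux", "rfl", "dat", "acc"],
    "Slovak": ["conditional by", "aux", "rfl", "dat", "acc"],
    "Upper Sorbian": ["li", "rfl", "dat", "acc"],
}


def follows_template(language: str, slots: List[str]) -> bool:
    """Return True if the slot labels occur in the order given by the template.

    Raises ValueError if a label is not part of the language's template.
    """
    template = TEMPLATES[language]
    positions = [template.index(slot) for slot in slots]
    return all(a < b for a, b in zip(positions, positions[1:]))


# Slovene (20 b): mu jo je = dat > acc > je
print(follows_template("Slovene", ["dat", "acc", "je"]))
# Czech (21 b): se ti = rfl > dat
print(follows_template("Czech", ["rfl", "dat"]))
# The reverse order is not licensed by the Czech template.
print(follows_template("Czech", ["dat", "rfl"]))
```

Since the templates summarize tendencies, a sequence rejected by the check is better read as departing from the dominant pattern than as ungrammatical.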

4.2. The order of the verb and its arguments

Concern with the pragmatic underpinnings of word order in Slavic has resulted in the relative neglect of characterizations in terms of the location of the subject, object(s) and verb. The observations that have been made will be briefly summarized below.

4.2.1. Transitive clauses

As mentioned in § 3, SVO is the statistically dominant transitive order in all Slavic languages. To give some idea of the textual frequency of the other transitive orders, in Table 1 we cite the distribution of the six transitive patterns in Polish. The data is taken from a small written corpus of 1450 clauses.

Table 1. Textual frequency of the six transitive word order patterns in written Polish

Order   Nr of clauses   % relative to total of 308   % relative to total of 1450
SOV       7              2.2%                         0.48%
SVO     213             69.1%                        14.6%
VSO      22              7.1%                         1.5%
VOS      33             10.7%                         2.3%
OVS      28              9.0%                         1.9%
OSV       5              1.6%                         0.34%
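The two percentage columns of Table 1 follow directly from the raw counts. The short sketch below recomputes them and also gives the ratio of SVO clauses to all other transitive clauses taken together. The names `COUNTS`, `CORPUS_SIZE`, `shares` and `svo_ratio` are chosen here for illustration only; the recomputed figures agree with the published ones to within 0.1 of a percentage point, the small differences presumably reflecting how the original figures were rounded.

```python
from typing import Dict, Tuple

# Raw clause counts from Table 1 (written Polish).
COUNTS: Dict[str, int] = {"SOV": 7, "SVO": 213, "VSO": 22, "VOS": 33, "OVS": 28, "OSV": 5}
CORPUS_SIZE = 1450  # total number of clauses in the corpus


def shares(counts: Dict[str, int], corpus_size: int) -> Dict[str, Tuple[float, float]]:
    """Map each order to (% of transitive clauses, % of all clauses in the corpus)."""
    total = sum(counts.values())
    return {order: (100 * n / total, 100 * n / corpus_size) for order, n in counts.items()}


def svo_ratio(counts: Dict[str, int]) -> float:
    """Ratio of SVO clauses to all non-SVO transitive clauses taken together."""
    others = sum(counts.values()) - counts["SVO"]
    return counts["SVO"] / others


if __name__ == "__main__":
    print("transitive clauses:", sum(COUNTS.values()))
    for order, (of_transitive, of_corpus) in shares(COUNTS, CORPUS_SIZE).items():
        print(f"{order}: {of_transitive:.2f}% of transitive, {of_corpus:.2f}% of corpus")
    print(f"SVO : non-SVO = {svo_ratio(COUNTS):.2f} : 1")
```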

We see that SVO is not only the statistically dominant order in written Polish, but that it also outnumbers all of the other word order patterns as a group by 2 to 1. Of the non-SVO orders, OSV is clearly the most disfavoured. The occurrence of SOV and VOS orders is conditioned in part by the categorial status of the object. In Polish clause initial and clause final placement of clitic or unstressed object pronouns is avoided. Therefore, if the object is a clitic or unstressed pronoun and there is no clausal material other than the subject, object and verb, either SOV or VOS is used. The former is favoured if the subject conveys given information, as in (54), the latter if the subject represents new information, as in (55).

(54)

Polish
[Andrzej is making a film with all those present and also someone else. An actor-friend, adored, but unmanageable, leading a chaotic life, and who fails to show up for the shooting and]
Rezyser sam go dubluje.
director himself him doubles
'The director himself plays him.'

(55)

[The inscription by an unknown author was made in the 13th century.]
Odkryl go w wieku XVIII uczony
discovered it in century XVIII learned:NOM
benedyktyn, Anzelm Banduri (1671-1743).
Benedictine:NOM Anzelm:NOM Banduri:NOM
'It was discovered by a learned Benedictine, Anzelm Banduri.'

The most common alternative to SVO with a nominal subject and object is OVS. E.g. (56)

[The power of Magadha in the second half of the 6th century was undoubtedly due to its ruler Bimbisar (545-493 b.c.).]
Po Bimbisarze wladze objal syn jego
after Bimbisar power:ACC took over son:NOM his
Adzatasiatru (...)
Adzatasiatru:NOM
'After Bimbisar his son Adzatasiatru took over power.'

As for VSO order, most declarative VSO clauses have an initial adverbial or PP. The subject is virtually invariably given, while the object is new. E.g.

(57)

[Soon Wyspianski's Parisian dream came to an end and the sad reality of Poland returned.]
W roku 1897, w liscie do Rydla, napisal Wyspianski
in year 1897 in letter to Rydel wrote W:NOM
slowa ktore staly się odtad haslem
words:ACC which became RFL from then motto
jego dalszej tworczosci.
his further creativity
'In 1897, in a letter to Rydel, Wyspianski wrote the words which became the motto for his further creative work.'

The occurrence of this declarative word order pattern in Polish is to a large extent restricted to expository and journalistic texts. Of the patterns of distribution shown in Table 1, the dispreference for OSV and relatively frequent use of OVS with nominal participants appear to hold for all the Slavic languages. By way of comparison consider the distribution of transitive orders in Czech (Table 2) established by Uhlířová on the basis of 6101 transitive clauses taken from a corpus consisting of approximately 30000 clauses (540000 words) of written and spoken journalistic, scientific and administrative texts.

Table 2. Textual frequency of the six transitive word order patterns in written and spoken Czech

Order   % relative to 6101
SOV      6.1%
SVO     63.1%
VSO      9.8%
VOS      3.7%
OVS     14.6%
OSV      2.6%

Though the figures for the distribution of the six transitive patterns in the two languages are not identical, they define the same preference for OVS and dispreference for OSV. Also found throughout Slavic is the use of SOV order when the object is a clitic or unstressed pronoun. Little is known about the distribution of the other transitive orders. One commonly cited difference is the relatively frequent occurrence of subject postposing, i. e. XVSO or XVSX order, in Czech, Slovak, Serbo-Croatian and Slovene. A Polish XSV(O)X clause is likely to be rendered in these languages as XVS(O)X. Compare, for instance, the Polish (58) with the Czech (59).

(58)

Polish
[The road was empty. There was no one about. On the other side there was a low brick wall.]
Pod nim czterech malych obdartusow gralo w orla i
under it four little rascals played in heads and
reszke
tails heads
'Under it four little rascals were playing heads and tails.'

(59)

Czech
U zdi hráli ctyri mali trhánci hlava - orel.
under brick played four little rascals tails heads

The relative frequency of subject postposing in these languages is undoubtedly in part due to the strength of Wackernagel's position for clitics (cf. 4.1). As shown in (59), it is not, however, confined to clauses with clitic auxiliaries. In the case of Czech the occurrence of subject postposing may also have been influenced by the impact of German (a tendency for the "V-second" position). The East Slavic languages, in turn, are seen to display a stronger tendency for verb initial orders, both VSO and VOS, than the West or South Slavic languages. Though typically associated with folkloric or poetic style, according to Belicova & Uhlířová (1966) and Yokoyama (1986), such orders are in fact quite common, particularly in speech. Also connected primarily with the spoken language is the relatively frequent use of SOV with nominal participants in Russian (see in particular Keijsper 1985: 127).

4.2.2. Intransitive clauses

In intransitive clauses, as in transitives, preverbal subjects are more common than postverbal ones. The main reason is that the subjects of such clauses usually express, semantically, bearers of a "quality" expressed by V, and as such, they are thematized. Normally, the "quality" expressed by the verb is new.

In existential and locative clauses there is a strong preference to place the entity whose existence is being asserted after both the locative and the existential verb (in the East Slavic languages there is no copula in the present tense), typically in clause final position. E. g.

(60)

Slovene (questionnaire)
V sobi je zenska.
in room is woman
'There is a woman in the room.'

If the existential noun is preposed, it is generally accompanied either by the numeral one (61) or an indefinite pronoun (62); otherwise it would receive a definite interpretation. (61)

Bulgarian
Edna zena e v stajata.
one woman is in room
'A woman is in the room.'

(62)

Czech
Nejaka zena je v pokoji.
some woman is in room
'A woman is in the room.'

In presentative clauses such as the one in (63) VS order is the norm. (63)

Serbo-Croatian
Pojavilo se covek na vratina.
appeared RFL man at door
'A man appeared at the door.'

As in existential clauses, a preposed subject (particularly when human or animate) occurs with an indefinite pronoun or one. If there is an adverbial of setting (expressed by an adverb or PP), it is typically placed initially: (64)

Russian
V 1949 godu tjazelo zabolel staryj master.
in 1949 year seriously fell-ill old master
'In 1949 the old master fell seriously ill.'


In addition to the AdvVS order, VAdvS order is common in East Slavic, most of the South Slavic languages and also in Polish. Czech, Slovak and Slovene exhibit this latter order rather infrequently. Presentative clauses with preverbal subjects, such as the one in (65), are also found in all of the Slavic languages, though only sporadically; they are stylistically marked. (65)

Bulgarian
[Somewhere in the distant Bohemia there were green fields and meadows.]
Star zamäk se izdigase sred zelenite livadi.
old castle stood among green meadows
'An old castle stood among the green meadows.'

4.2.3. Ditransitive clauses

With nominal participants there is no fixed order for the patient and recipient. The case marking languages exhibit a preference for recipient > patient order (in accordance with the hierarchy discussed by Primus, this volume), while in Bulgarian and Macedonian, in which nominal recipients are marked by a preposition, the preferred order is patient > recipient (e. g., for Bulgarian, see Georgieva 1974: 45). In these languages, the word order variants are closely correlated with the distribution of the (definite) articles with NP patients and NP recipients. By contrast, the order of pronominal object clitics is fixed; in all the Slavic languages that have object clitics the dative clitic obligatorily precedes the accusative, as illustrated in (66) on the basis of Macedonian.

(66)

Macedonian
Daj mu go.
give:2SG he:DAT it:ACC
'Give it to him.'

4.3. Adverbials

As one would expect, there are no clear restrictions on the placement of adverbials. Valency adverbials favour postverbal placement. Manner adverbials tend not to occur clause initially; they are often placed either immediately before (67 a) or after the verb (67 b) (Georgieva 1974).

(67)

Czech
a. Petr vesele vypravel o svych dobrodruzstvich.
   Peter merrily talked about his adventures
   'Peter talked merrily about his adventures.'
b. Petr vypravel vesele o svych dobrodruzstvich.

Czech favours preverbal placement of manner adverbs, as in (67 a), while Russian favours postverbal placement. Adverbials of setting are typically either clause initial or clause final. Attitudinal adverbs in clause final position are avoided. Sequences of adverbs are generally clause initial or final. In some of the Slavic languages, the sequence of adverbials is likely to be interrupted by an auxiliary, if the auxiliary is a clitic, as in the Czech (68).

(68)

Czech
Vcera jsme se v Praze sesli s predstaviteli obou
yesterday AUX RFL in Prague met with representatives of both
stran.
parties
'Yesterday we met with the representatives of both parties in Prague.'

5. Order in subordinate clauses

Subordinate clauses in Slavic languages do not exhibit any evident word order characteristics divergent from those found in main clauses. Complementizers occur clause initially. As mentioned in § 3, in Sorbian SOV order is clearly favoured over SVO in subordinate clauses.

6. Other sentence types

6.1. Questions

The location of question particles in yes/no questions depends on whether or not the particles are enclitic. In Russian and in South Slavic (with the exception of Slovene) the basic particle is the enclitic li. In the unmarked case, it is placed in Wackernagel's position, i. e. after the initial verb.


(69)

Russian
Znajet li on eto?
knows Q he it
'Does he know it?'

It is important to note that in Bulgarian and Macedonian li does not necessarily form part of the clitic cluster. Thus, for example, in (70) below, unlike the clitics mi and go, it occurs in postverbal position and not after the first constituent.

(70)

Macedonian (Englund 1977: 110)
K'e mi go ispolnis li toa vjetuvan'e, Veti?
FUT I:DAT it:ACC keep Q that promise Vetia
'Will you keep that promise to me, Veti?'

Other Slavic particles in yes/no questions are either proclitic or orthotonic, and as such, they occupy the first position in the clause. Among them are Pol. czy, Ukr. cy, Byelorussian ci, Slovene ali, Bulgarian dali, nima, Macedonian dali, Czech jestlipak, zdalipak, Kashubian abo or e, all of which are obligatorily clause initial. E. g.

(71)

Byelorussian (Pashkievich 1978: 12)
Ci ty znajes jago?
Q you know him
'Do you know him?'

(72)

Slovene (questionnaire)
Ali si mu ga dal?
Q be:PRS:2SG he:DAT it:ACC given
'Have you given it to him?'

In the Slavic languages yes/no questions may be formed by intonation alone without any special particles. The word order in such questions is often the same as in declaratives, but it may be inverted. Inversion is widely used in Czech, Slovak and Polish (Krizkova 1972). Question words in wh-questions are typically clause initial. In East Slavic, Macedonian, Bulgarian, Slovak and Czech some variation in the placement of wh-words is permitted as shown by the examples in (73) through (75).

(73)

Russian
Tebja kak zvat'?
you:ACC how call:INF
'What's your name?'

(74)

Bulgarian
A na tebe kakvo ti e?
and to you what you:DAT be:PRS:3SG
'And what's the matter with you?'

(75)

Czech
A s praci mam byt hotov dokdy?
and with work shall:1SG finish when
'When shall I finish the work?'

In multiple wh-questions (when several items are questioned in the same clause) all the question words are usually fronted. E. g. (76)

Serbo-Croatian (Rudin 1988: 449)
Ko koga vidi?
who whom sees
'Who sees whom?'

(77)

Russian (Wachowicz 1974: 158)
Kto cto kogda skazal?
who what when said
'Who said what when?'

There are, however, language specific differences in the interaction between the placement of wh-words and clitics which have been brought to light by Rudin (1988). Rudin observes that in Bulgarian an initial sequence of wh-words cannot be interrupted by auxiliary or pronominal clitics; the clitics must follow the last wh-word. Compare (78 a) and the ungrammatical (78 b). (78)

Bulgarian (Rudin 1988: 461)
a. Koj kakvo ti e kazal?
   who what you is said
   'Who told you what?'
b. *Koj ti e kakvo kazal?


In Czech and Serbo-Croatian, on the other hand, a clitic comes after the first of a series of wh-words. Thus, (79 b), for example, is ungrammatical. (79)

Czech (Rudin 1988: 461)
a. Kdo ho kde videl?
   who he:DAT where seen
   'Who saw him where?'
b. *Kdo kde ho videl?

Polish, in turn, allows for both of the above possibilities. A clitic may come either after the first wh-word, as in (80 a), or after the whole wh-sequence, as in (80 b).

(80)

Polish
a. Kto się komu podlizal?
   who RFL whom ingratiated
   'Who ingratiated himself to whom?'
b. Kto komu się podlizal?

6.2. Imperatives

Imperatives in Slavic do not display any peculiarities of order. A point worth mentioning is that in Polish the 1st person and 2nd person plural enclitics of the imperative, unlike those used in the compound past, form a phonological unit with the verb. This is evidenced by the fact that they are sensitive to the penultimate syllable stress rule characteristic of Polish. Thus the 2nd person plural czytajcie 'you(pl) read!' is pronounced with the stress on taj, while the corresponding past participle czytaliscie 'you(pl) read' takes the stress not on the penultimate syllable li but again on the ta.

6.3. Negatives

The Slavic grammatical particle 'not' may have the status of a prefix or of a proclitic. In compound verb forms it is prefixed either to the auxiliary, as in the Czech and Slovak future (81), or alternatively to the participle, as in the Czech and Slovak past tense (82).

(81)

Czech
Nebudu vahat.
not-AUX:FUT:1SG hesitate
'I will not hesitate.'

(82)

Czech
Nedelal jsem to.
not-did be:1SG it
'I was not doing that.'

In rare cases, for example, in the Czech past conditional (83), either of the above is possible. (83)

a. Byli bychom ne-vyhrali.
   be COND:1PL not-won
   'We would have not won.'
b. Nebyli bychom vyhrali.
   not-be COND:1PL won

The negative proclitic is typically proclitic to the auxiliary (in compound verb forms) as in (84). (84)

Russian
Ja ne budu citat'.
I not be:FUT:1SG read
'I shall not read.'

In Bulgarian and Macedonian the negative morpheme can be separated from the finite verb or auxiliary by enclitic personal and reflexive pronouns, or by other enclitics such as the future and conditional ones. This is illustrated in (85) and also further below in (86). (85)

Macedonian (Kramer 1986: 101)
Kako ne k'e sum go cekala?
how NEG FUT be:PRS:1SG he:ACC waited
'How could I possibly not wait for him?'

It is worth noting that in Bulgarian, the negative particle ne influences the orthotonisation of pronominal and reflexive clitics. The pronominal and reflexive clitics, both in declaratives and in questions, become stressed if postposed after the initial proclitic negative particle:

(86) Bulgarian
Ne 'ti li charesva tova?
not you:DAT Q like it
'Don't you like that?'

This is important because the stress difference may be semantically distinctive in speech. Note the difference in meaning between (87 a) and (87 b).

(87) Bulgarian
a. Ne 'go vidjachte?
   not 'him saw:2PL
   'Didn't you see him?'
b. 'Nego vidjachte?
   'him saw:2PL
   'Was it him that you saw?'

7. The NP

In all of the Slavic languages the basic location of the demonstrative, quantifier, numeral and adjective is prenominal. Relative clauses, by contrast, are obligatorily postnominal, and nominal possessors, other nominal modifiers and participial modifiers tend to be postnominal. There are, however, some language-specific differences in the type of word order variants permitted with various subtypes of the above modifiers.

7.1. The demonstrative

Although free demonstratives are regularly placed before the noun, in Polish and East Slavic proximate demonstratives may be postposed as in (88).

(88) Polish
a. slowa te
   words these
b. dziwne slowa te
   strange words these

Such postposing tends to occur when the noun precedes the verb and when it is not further modified (Fontanski 1986). If there is an adjectival modifier it cannot be placed between the noun and demonstrative:

c. *slowa dziwne te
   words strange these

The favourite position of an NP with a postpositive demonstrative is the beginning of the clause. The postposed demonstrative fulfils an anaphoric function. Typically the referent of the NP is a discourse entity introduced in the immediately preceding clause, as in the example below.

(89) [That created a big shock and initiated a long intellectual process (...)]
W procesie tym glowna role odegrali prorocy.
in process this main role played prophets
'Prophets played the major role in this process.'

Generally, as mentioned above, any postposed demonstrative is anaphoric, but the converse does not hold: a demonstrative with an anaphoric function need not be postposed. In Polish the postposing of the demonstrative is characteristic of the written language, particularly of expository and journalistic texts. In East Slavic it appears to be more widespread, as evinced by the example in (90) (Wojcik 1983: 230).

(90) Russian
On uveren, cto stichi eti - jeho sobstvennaja improvizacija.
he sure that poems those his own improvisation
'He believes that those poems are his own improvisation.'

7.2. The numeral

Both cardinal and ordinal numerals normally precede the noun. In East Slavic postposition is accompanied by a semantic effect, namely an approximate reading, as in the Byelorussian example (91).

(91) Byelorussian
nevjalickaja vjosacka, chat tryccac ci sorak
small village houses thirty or forty
'a small village, of about thirty or forty houses'


7.3. The adjective

Polish is the only Slavic language in which qualitative or evaluative adjectives favour prenominal placement, while relational adjectives (often denominal adjectives) reflecting some intrinsic or type feature of the object denoted favour postnominal placement. Compare (92) and (93).

(92) Polish
a. piekna kobieta
   beautiful woman
b. dlugie wlosy
   long hair

(93)
a. fizyka nuklearna
   physics nuclear
b. bilet tramwajowy
   ticket tram

For reasons of focus or emphasis, the location of the two types of adjectives relative to the noun may be reversed, e.g. kobieta piekna, tramwajowy bilet. The same motivation is seen to underlie the postnominal placement of the adjectives in the other Slavic languages, most of which allow for such an option. E.g.

(94) Slovak
nove hodinky / hodinky nove
new watch / watch new

(95) Ukrainian
sine morje / morje sine
blue sea / sea blue

The above type of adjective/noun inversion is most typical of the West and East Slavic languages. It is much less common in the South Slavic languages where it occurs mainly in fiction, poetry, or the spoken language and is almost entirely absent in academic writing.7 Adjective postposing is least common in Bulgarian in which postposition of a focused adjective occurs only in indefinite NPs without any prenominal modifier. Moreover only evaluative as opposed to relational adjectives may be postposed (Pechlivanova & Burov 1989).8


In all of the Slavic languages, postposing of adjectives often occurs in exclamations, in forms of address and abuse, and in various expressive contexts of use, in which the adjective (and in some cases, also a demonstrative or indefinite pronominal) has a positive, or in other cases, negative intensifying or evaluative meaning. Adjective postposing is also found in proper names of historical persons (96) and in the names of certain institutions (97).

(96) Polish
Boleslaw Smialy
Boleslaw the Brave

(97) Czech
Vysoká skola ekonomická
university school economical
'Higher School of Economics'

As in English, postposition is obligatory with indefinite pronouns. E.g.

(98) Serbo-Croatian
nesto neobicno
something odd

When an adjective takes a PP (or other) complement, as in (99) and (100), postnominal placement of the adjective phrase, though not obligatory, is the norm.

(99) Czech
muz pysny na sveho syna
man proud of his son

(100) Polish
czlowiek bogaty w doswiadczenia
man rich in experience

In the East Slavic languages, Polish, Bulgarian and Macedonian, when an adjective with a PP complement is placed pre- rather than postnominally, the PP complement regularly occurs between the adjective and noun; the noun and the adjective form a "frame", within which the PP (or a complement) to the adjective is placed. E.g.

(101) Polish
bogaty w doswiadczenia czlowiek
rich in experience man
'a human being rich in experience'

(102) Ukrainian
smisna w svojij zuchval'osti divcyna
funny in her arrogance girl
'a girl amusing in her arrogance'

It is this "frame" that enables the preposing of long and syntactically complex adjectival phrases. On the other hand, in Czech, Slovak, Sorbian, Serbo-Croatian and Slovene the PP complement is obligatorily placed before the adjective. Thus whereas in Czech, for example, both (99) above and (103 a) below are grammatical, (103 b) is not.

(103) Czech
a. na sveho syna pysny muz
   of his son proud man
   'a man proud of his son'
b. *pysny na sveho syna muz

The just mentioned languages do not allow the preposing of an NP which begins with a PP complement. Grammarians and stylists warn against sequences of prepositions heading different nouns (see e.g. for Czech Belicova & Uhlířová 1996; for Upper Sorbian Fasske 1981; for Slovene Toporisic 1982); still, one comes across examples such as the one in (104).

(104) Czech
Dozvedeli jsme se o na dnesek plánovanem závode.
learned:1PL AUX RFL about for today planned race
'We learned about a race planned for today.'

In all of the Slavic languages the qualifier (often an intensifier) of an adjective normally precedes the adjective. E.g.

(105) Polish
nadzwyczaj doniosle mechanizmy
exceptionally important mechanisms

(106) Russian
ocen' dlinnoje nazvanije
very long title

Discontinuous constructions with an adjective preceding the head noun and a complement of the adjective following the head noun are also to be found, but under various special conditions. For example, in Polish, they occur primarily when the adjective is qualified by an intensifier (as in English). E.g.

(107) Polish
a. zbyt duze mieszkanie dla jednej osoby
   too large apartment for one person
b. bardzo latwe imie do zapamietania
   very easy name to remember

In Czech, similar constructions are found only with a limited number of adjectives, which are usually listed in Czech grammars. Serbo-Croatian (Ivir 1983: 34) also allows for the preposing of the adjective without the PP complement, as in (108).

(108) Serbo-Croatian
karakteristicna kretanja za nasu zemlju
peculiar trends to our country
'trends peculiar to our country'

7.4. The adnominal genitive

The basic location of the adnominal genitive or of other adnominal NPs expressed by means of case or prepositional marking (Bulgarian and Macedonian) is postnominal. E.g.

(109) Czech
vůne růzi
smell of roses

(110) Kashubian
plakanie od molech dzecy
crying of small children


Prenominal placement is also possible, but subject to different restrictions in different languages. Polish and East Slavic allow prenominal placement, as shown in (111) and (112), if the adnominal construction expresses the shape, colour, size, material, or other qualities of an entity, a product, or a psychical feature of a person, or if it is a classifying noun.

(111) Polish
tego typu opis
this:GEN type:GEN description
'the description of this type / this type of description'

(112) Ukrainian
seredn'oho zrostu junacok
middle:GEN height:GEN boy
'a boy of average height'

In the other Slavic languages this type of preposing of modifiers occurs only occasionally. In all the Slavic languages with the exception of Czech, if there is more than one postnominal modifier of the noun, the order of the modifiers relative to each other is free. In Czech a postnominal genitive cannot be separated from the head noun by another modifier of the noun. Note the ungrammaticality of (113 b).

(113) Czech
a. preklad kolektivni monografie do rustiny
   translation collective:GEN monograph:GEN into Russian
   'the/a translation of a collective monograph into Russian'
b. *preklad do rustiny kolektivni monografie

The other Slavic languages manifest constructions corresponding to both (113 a) and (113 b). The order of constituents is motivated by the communicative importance of the postmodifiers. The postmodifier which is less communicatively important, i.e. which refers to an entity already mentioned or to the setting, comes first, and the postmodifier which is communicatively more important, informationally and syntactically heavier and longer, follows. E.g.

(114) Serbo-Croatian
pustanje u promet praskoga metroa
putting in action Praguean metro
'the activation of the Prague metro'

(115) Upper Sorbian
rjadowanje po alfabece rodnych mjenow
ordering by alphabet first names:GEN:PL
'alphabetical ordering of first names'

In all of the Slavic languages, the possessive genitive is in competition with the possessive adjective. Possessive adjectives are derived by special possessive suffixes, but only from a few classes of nouns (from proper names, names of relatives, general names of persons). They differ from all other adjectives in possessing special features, which they share with nouns; particularly worthy of note is the ability of the possessive adjective to function as the antecedent in an anaphoric sequence in the text. With the exception of Upper Sorbian (116), and some rare, more or less fossilized cases in Slovak (117), Bulgarian and Macedonian, such adjectival possessors cannot be themselves modified.

(116) Upper Sorbian (Sewc-Schuster 1976: 97)
mojeje sotřine dzeci
my:GEN:F:SG sister (ADJ):NOM child:NOM:PL
'my sister's children'

(117) Slovak
stareho otcova fajka
old:GEN:M:SG father (ADJ) pipe:NOM:SG
'Old father's pipe (grandfather's pipe)'

Consequently, when further modification is required, the noun in the genitive is used, as in (118) and (119).

(118) Byelorussian
dzjadzjka nasaj Very
uncle our Vera:GEN
'Our Vera's uncle'

(119) Czech
pole pana Novaka
field Mr Novak
'Mr Novak's field'

Possessive adjectives are most common in Upper Sorbian, Serbo-Croatian and Slovene. They are also preferred in Czech and Slovak. East Slavic and Polish, on the other hand, strongly favour possessive genitives over possessive adjectives. In Bulgarian and Macedonian possessive adjectives are less frequent than in the other South Slavic languages, but more frequent than in East Slavic. Whereas the possessive genitive is normally postposed (see (118) and (119) above), the possessive adjective, like other adjectival modifiers, is normally preposed. E.g.

(120) Czech
Novakovo pole
Novak's (ADJ) field

(121) Byelorussian (Pashkievich 1978: 60)
Veryn dzjadzjka
Vera's (ADJ) uncle

When the possessive adjective occurs in a sequence of premodifiers, its placement is freer than the placement of other adjectives, see § 7.7.

7.5. The relative clause

Relative clauses follow the nouns they modify. Relative pronouns, both the inflected (e.g. Polish ktory, ktora, ktore) and uninflected (e.g. Czech and Polish co, Russian cto, Upper Sorbian kiz, Bulgarian deto, etc.) normally head the relative clause. E.g.

(122) Russian
kniga, kotoruju ja citaju
book which I read

(123) Bulgarian
knigata, kojato si kupich
book-the which RFL bought:1SG

In the East Slavic languages, relative pronouns with a "possessive" meaning are regularly postposed after the head noun of the relative clause. Such postposing of a "possessive" relative pronoun is also found in Bulgarian and Serbo-Croatian but, with the exception of some rare instances in Slovak, not in the West Slavic languages. Compare (123) with (124).

(124) Bulgarian
knigata, avtorăt na kojato
book-the author-the of which

7.6. Participial phrases

Like relative clauses, participial phrases tend to be postnominal, but prenominal placement is also to be found. In those Slavic languages which do not have the so-called "frame" construction (see § 7.3.2 above), the placement of the participial construction before or after the head noun has an effect on the word order of the participial phrase. If the participial phrase precedes the head noun, then the complement to the participle is placed before the participle (125 a). On the other hand, if the participial phrase is placed after the head noun, the complement is, preferably, placed after the participle (125 b).

(125) Slovak
a. slnkom osvetlene schody
   sun:INST lit staircase
   'a staircase lit by the sun'
b. schody osvetlene slnkom

7.7. Sequences of modifiers

When the demonstrative, numeral and adjective(s) are all prenominal, the most typical order is Dem Num Adj(Qualif) Adj(Rel) N, as in the examples below.

(126) Slovene (questionnaire)
tiste prve gledaliske predstave
that first theatrical performance

(127) Kashubian (Breza & Treder 1981: 152)
te dwie madre corczi
those two wise daughters

In some cases, the demonstrative (128) and the possessive may be placed between the adjective and noun, and so may a possessive pronoun (129); however, the inversion is always stylistically marked. It features only in some genres and it is not too frequent in texts.

(128) Serbo-Croatian (questionnaire)
strasni onaj covek
terrible that man
'that terrible man'

(129) Upper Sorbian (Sewc-Schuster 1976: 103)
starsi moj bratřik
older my brother
'my older brother'

The order of a demonstrative and possessive pronoun is obligatorily demonstrative > possessive pronoun (130), whereas the order of demonstrative > numeral is not so strict (131 a, b). There may be small semantic differences between the two variants (for Czech see Uhlířová 1993).

(130) Upper Sorbian (Sewc-Schuster 1976: 102)
tuto moje rjane doziwjenje
this my morning eating
'this morning eating of mine'

(131) Czech
a. tyto dve knihy
   these two books
b. dve tyto knihy

In Bulgarian, the clitics with possessive meaning are obligatorily placed after the first stressed word of the NP; the variant in (132 c) is ungrammatical. (See also Scatton 1983.)

(132) Bulgarian
a. negovoto malko verno kucence
   his-the small faithful dog
b. malkoto mu verno kucence
   small-the him:POS:DAT faithful dog
c. *malkoto verno mu kucence
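The placement rule illustrated in (132) can be stated procedurally. The following is only a minimal sketch under simplifying assumptions: the NP is given as a list of transliterated words, every word counts as stressed unless it is listed in the hypothetical `unstressed` set, and the function name `place_possessive_clitic` is introduced here purely for illustration. It is not a model of Bulgarian prosody.

```python
def place_possessive_clitic(np_words, clitic, unstressed=frozenset()):
    """Insert `clitic` directly after the first stressed word of the NP.

    np_words   -- the NP without the clitic, as a list of words
    clitic     -- the possessive clitic to be placed (e.g. "mu")
    unstressed -- hypothetical set of words that bear no stress of their
                  own and therefore cannot host the clitic
    """
    for i, word in enumerate(np_words):
        if word not in unstressed:
            # The clitic follows the first stressed word, as in (132 b).
            return np_words[: i + 1] + [clitic] + np_words[i + 1 :]
    raise ValueError("the NP contains no stressed word to host the clitic")


# Reproduces the grammatical order of (132 b), not the starred (132 c).
print(" ".join(place_possessive_clitic(["malkoto", "verno", "kucence"], "mu")))
```

Under these assumptions the sketch yields the order of (132 b) and never the order of (132 c), since the clitic cannot land after the second adjective.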


A possessive adjective typically precedes both the numeral and adjective (133 a), but other variants such as the ones in (133 b, c) are quite common, too.

(133) Czech
a. sousedovy tri male deti
   neighbour's three small children
   'my neighbour's three small children'
b. tri sousedovy male deti
c. tri male sousedovy deti

8. Final remarks

In the above overview of some of the word order characteristics of Slavic languages, we hope to have documented the high degree of word order flexibility found in this branch of Indo-European. As we have seen, there are some syntactic restrictions on this word order flexibility. But undoubtedly of greater interest are the pragmatic and semantic restrictions which in the main determine the actual use of the particular word order variants and which may be interpreted in terms of the theory of functional sentence perspective. The functional sentence perspective principles are the same for all the Slavic languages, though their interplay and relative weight differ from language to language. These differences have been barely touched upon in this sketch. They constitute a rich area for future research.9

Notes

1. We would like to thank the following scholars for supplying information and data on various aspects of word order in the individual Slavic languages: Vit Bubenik and Anna Stunova for Czech, Jadranka Gvozdanovic for Serbo-Croatian, Yakov Testelec for Russian, Bostjan Zupanicic and Mateja Hocevar for Slovene, Svillen Stanchev for Bulgarian and Jack Feuillet for Macedonian. Needless to say, the responsibility for any potential misinterpretations is ours.
2. For more detailed information on the morphology of the Slavic languages see the articles in Comrie & Corbett (1993).
3. It has been suggested that in Polish some prepositions should be treated as proclitic to the NP. For some discussion of this issue and clear evidence that some prepositions may function as heads see Borsley & Jaworska (1989).
4. The placement of clitics in other than second position is much more common in Slovak than in Czech.
5. Only idiomatic phrases cannot be "interrupted" by a clitic in Serbo-Croatian.
6. In Bulgarian, a clitic may, however, stand at the beginning of the noninitial clause in a complex sentence.
7. The investigations of the current tendencies in some Slavic languages, e.g. in Czech (Uhlířová 1989) or Serbo-Croatian (Jonke 1963), suggest that postnominal placement of adjectives in these languages is on the wane.
8. Another reason for the postposing of adjectives is historical and terminological. In most Slavic languages, terminological systems of a number of branches of the sciences and the humanities were previously modelled on Latin, which favours postposed adjectives.
9. For some discussion of these issues see the papers in Bernini (to appear).

References

Adamec, Přemysl
1966 Porjadok slov v sovremennom russkom jazyke. Praha: Academia.
1985 "K voprosu vyrazenii referencial'noj sootnesennosti v cesskom i russkom jazykach", in: Novoje v zarubeznoj lingvistike 15. Moskva, 487-497.
1987 "Nulevyje ekvivalenty mestoimenij i ich referencial'naja sootnesennost", Ceskoslovenska rusistika 32: 108-113.
1988 "K vyjadrovani a rozpoznávani koreference v rustine a v cestine", in: Ceskoslovenska slavistika: 167-177.
Barnetova, Vilma, Helena Belicova, Oldřich Leska, Zdena Skoumalova & Vlasta Strakova
1979 Russkaja grammatika. Praha: Academia.
Belicova, Helena & Jan Sedlácek
1990 Slovanske souveti. Studie a prace lingvisticke 25. Praha: Academia.
Belicova, Helena & Ludmila Uhlířová
1996 Slovanska veta. Praha: Euroslavica.
Bennet, David C.
1986 "Towards an explanation of word order differences between Slovene and Serbo-Croatian", The Slavonic and East European Review 64: 1-24.
Bernini, Giuliano (ed.)
to appear Pragmatic organization of discourse in the languages of Europe. Berlin: Mouton de Gruyter.
Bily, M.
1981 Intrasentential pronominalization and functional sentence perspective in Czech, Russian and English. Lund.
Brecht, Richard D. & James S. Levine (eds.)
1986 Case in Slavic. Columbus, Ohio: Slavic Publishers, Inc.
Breza, Edward & Jerzy Treder
1981 Gramatyka kaszubska. Gdansk: Zrzeszenie Kaszubsko-Pomorskie.
De Bray, R. G. A.
1980 Guide to the Slavonic languages, 3rd edition, vols 1-3. (1st edition 1951). London, New York: Dent, Dutton.
Borsley, Robert & Ewa Jaworska
1989 "On Polish PPs", Linguistics 27: 245-256.


Browne, Wayles
1975 "Serbo-Croatian enclitics for English-speaking learners", in: Kontrastivna analiza engleskog i hrvatskog ili srpskog jezika, vol 1. Zagreb: Institute of Linguistics, 105-34.
1986 Relative clauses in Serbo-Croatian in comparison with English. Zagreb: Institute of Linguistics.
Comrie, Bernard & Greville G. Corbett
1993 The Slavonic languages. London: Routledge.
Danes, Frantisek
1959 "K otázce poradku slov v slovanskych jazycich", Slovo a slovesnost 20: 1-10.
1974 "Functional sentence perspective and the organization of the text", in: Frantisek Danes (ed.), Papers on functional sentence perspective. The Hague: Mouton, 106-128.
1985 Veta a text. Studie a prace lingvisticke 21. Praha: Academia.
Englund, Brigitta
1977 Yes/No questions in Bulgarian and Macedonian. Stockholm: Almqvist & Wiksell.
Fasske, Helmut
1981 Grammatik der obersorbischen Schriftsprache der Gegenwart, Morphologie. Bautzen: Unter Mitarbeit von S. Michalk.
Firbas, Jan
1981 "Scene and perspective", Brno Studies in English 4: 37-79.
1992 Functional sentence perspective in written and spoken communication. Cambridge: Cambridge University Press.
Fontanski, Henryk
1986 Anaforyczne przymiotniki wskazujace w jezyku polskim i rosyjskim. Katowice: Uniwersytet Slaski.
Friedman, Victor A.
1977 The grammatical categories of the Macedonian indicative. Columbus, Ohio: Slavica.
Georgieva, Elena
1974 Slovored na prostoto izrecenie v bălgarskija knizoven ezik. Sofija: Izdatelstvo na Bălgarskata akademia na naukite.
Gramatika na savremennija bălgarski knizoven ezik 3, Sintaksis
1983 Sofija: Izdatelstvo na Bălgarskata akademia na naukite.
Grzegorek, Maria
1984 Thematization in English and Polish. Poznan: Wydawnictwo Naukowe Uniwersytetu Adama Mickiewicza.
Gvozdanovic, Jadranka
1981 "Word order and displacement in Serbo-Croatian", in: A. M. Bolkestein et al., Predication and expression in Functional Grammar. London: Academic Press, 125-141.
Hlebec, Boris
1986 "Serbo-Croatian correspondents of the articles in English", Folia Slavica 8: 29-49.
Huszcza, Roman
1980 "Tematyczno-rematyczna struktura zdania w jezyku polskim", Polonica 6.


Ivancev, Svetomir
1978 Prinosi v bălgarskoto i slavjanskoto ezikoznanie. Sofija: Nauka i izkustvo.
Ivic, Milka
1989 "O slovenackom en (neki) i problemima referencije", in: Zbornik razprav iz slovanskega jezikoslovja. Ljubljana, 111-116.
Ivir, Vladimir
1983 A contrastive analysis of English adjectives and their Serbo-Croatian correspondents. Zagreb: Institute of Linguistics.
Jakobson, Roman
1971 Les enclitiques slaves. Selected writings II, Word and language. The Hague/Paris: Mouton.
Jodlowski, Stanislaw
1976 Podstawy polskiej skladni. Warszawa: Panstwowe Wydawnictwo Naukowe.
Jonke, L.
1963 "O redu rijeci sa sintaktickog i stilistickog gledista u hrvatskosrpskom jeziku", in: Zbornik u cas Stjepana Ivsica. Zagreb.
Karolak, Stanislaw
1990 Kwantyfikacja i determinacja w jezykach naturalnych. Warszawa: Panstwowe Wydawnictwo Naukowe.
Keijsper, Cornelia E.
1985 Information structure with examples from Russian, English and Dutch. Amsterdam: Rodopi.
Kovtunova, I. I.
1976 Sovremennyj russkij jazyk. Porjadok slov i aktual'noje clenenie predlozenia. Moskva: Prosvescenie.
Kramer, Christina E.
1986 Analytic modality in Macedonian. München: Verlag Otto Sagner.
Krucka, Barbara
1982 "Problem szyku wyrazow w jezyku polskim", Biuletyn Polskiego Towarzystwa Jezykoznawczego 39: 109-124.
Krizkova, Helena
1972 "Kontextove cleneni a typy tázacich vet v soucasnych slovanskych jazycich", Slavica Slovaca: 228-231; Slavia 41: 241-262.
Lapteva, O. A.
1976 Russkij razgovornyj sintaksis. Moskva: Nauka.
Lunt, Horace G.
1952 Grammar of the Macedonian literary language. Skopje.
Mathesius, Vilem
1942 "Ze srovnávacich studii slovoslednych", Casopis pro moderni filologii 28: 181-190, 302-307.
1947 K pořadku slov v hovorove cestine. Cestina a obecny jazykozpyt. Praha: Melantrich.
Michalk, Frido
1970 "K prasenjam slowosleda w serbskich dialektach", Letopis A 17: 1-29.
Mikos, Michael J. & Edith A. Moravcsik
1986 "Moving clitics in Polish and some cross-linguistic generalizations", Studia Slavica Hung 32: 327-335.


Mistrik, Jozef
1966 Moderna slovencina. Bratislava: Slovenske pedagogicke nakladatelstvo.
1984 Slovosled a vetosled v slovencine. Bratislava: Veda.
Misz, H.
1966 "Szyk sie w dzisiejszej polszczyznie pisanej", Jezyk Polski 46: 102-110.
Moravcsik, Edith A., Charles A. Ward & Jessica A. Wirth
1978 "Towards a typology of second position clitics; the case of Serbo-Croatian", Paper presented at the Minnesota Regional Meeting of Linguistics, May 12 1978, 1-8.
Naylor, Kenneth E.
1983 "On expressing "definiteness" in the Slavic languages and English", in: M. S. Flier (ed.), American Contributions to the 9th International Congress of Slavists. Columbus, Ohio: Slavica, 203-220.
Nilsson, Barbro
1980 "Szyk zaimkow osobowych w jezyku rosyjskim i polskim", Studia gramatyczne III. Wroclaw: Ossolineum, 47-64.
Orlos, T. Z.
1976 "O szyku polskiego sie, czeskiego se", Jezyk polski, 36 L
Pashkievich, Valentyna
1978 Fundamental Byelorussian. Toronto: Harmony Printing LTD.
Pechlivanova, P. & S. Burov
1989 "Ocenăcno znacenie na zadpostavenoto săglasuvanoto opredelenie", Bălgarski ezik, 79-83.
Pencev, Jordan
1984 Stroez na bălgarskoto izrecenie. Sofia: Nauka i izkustvo.
Polanski, Kazimierz
1967 Zdania zlozone w jezyku gornoluzyckim. Wroclaw: Ossolineum.
Primus, Beatrice
this volume "The relative order of the patient and recipient in the languages of Europe".
Rappaport, Gilbert C.
1988 "On the relationship between prosodic and syntactic properties of pronouns in the Slavic languages", in: A. M. Schenker (ed.), American contributions to the 10th international congress of slavists. Columbus, Ohio: Slavica, 301-327.
Rivero, Maria-Luisa
1990 "Long head movement and negation: Serbo-Croatian vs Slovak and Czech", Linguistic Review 8: 319-351.
Rittel, Teodozja
1975 Szyk czlonow w obrebie form czasu przeszlego i trybu przypuszczajacego. Wroclaw: Ossolineum.
Rusanivskij, V. M. et al.
1986 Ukrainskaja grammatika. Kijev: Naukova dumka.
Rudin, Catherine
1986 Aspects of Bulgarian syntax. Complementizer and Wh-constructions. Columbus, Ohio: Slavica.
1988 "On multiple questions and multiple wh fronting", Natural language and Linguistic Theory 6: 445-502.


Scatton, Ernest
1983 A reference grammar of Modern Bulgarian. Columbus, Ohio: Slavica.
Sedlácek, Jan
1989 Strucna mluvnice srbocharvatstiny. Praha: Academia.
Schaarschmidt, Gunter
1988 "Word order and the article in Serbian: a textual analysis", in: M. Basal et al. (eds.), Wokol jezyka. Rozprawy i studia poswiecone Prof. M. Szymczaka. Wroclaw: Ossolineum, 353-361.
Sewc-Schuster, H.
1976 Gramatika hornjoserbskeje rece. Syntaksa. Bautzen: VEB Domowina.
Siewierska, Anna
1993 "Syntactic weight versus information structure and word order variation in Polish", Journal of Linguistics 29: 233-265.
Silic, Josip
1978 "The basic, or grammatical word order in the Croatian Literary language", in: R. Filipovic (ed.), Contrastive analysis of English and Serbo-Croatian Vol 2. Verbal aspect. Word order. Zagreb: Institute of Linguistics, 356-393.
Sirotinina, . .
1961 "O porjadke slov v russkom jazyke. Voprosy teorii i metodiki izuceniji russkogo jazyka", Trudy vtoroj naucnoj konferencii kafedr russkogo jazyka pedagogiceskich institutov Povolz'ja, Kujbysev, 195-212.
Stankiewicz, Edward
1986 The Slavic languages, unity in diversity. Amsterdam. Mouton de Gruyter.
Stankov, Valentin
1987 "Za semanticnija invariant na opredelitelnija clen v bălgarskija ezik", Bălgarski ezik, 70-76.
Stojanov, Stojan
1977 Gramatika na bălgarski knizoven ezik. Sofia: Nauka i izkustvo.
Toman, Jindrich
1986 "Cliticization from NPs in Czech and comparable phenomena in French and Italian", in: H. Borer (ed.), Syntax and semantics 19. The syntax of pronominal clitics. New York: Academic Press, 123-145.
Topolinska, Zuzanna (ed.)
1984 Gramatyka wspolczesnego jezyka polskiego. Skladnia. Warszawa: Panstwowe Wydawnictwo Naukowe.
Toporisic, Joze
1967 "Besedni red v slovenskem knjiznem jeziku", Slavisticna Revija, 251-274.
1976 Slovenska Slovnica. Maribor.
1982 Nova slovenska skladnja. Ljubljana.
Uhlířová, Ludmila
1987 "Determinace a aktualni cleneni ve slovanskych jazycich", Acta Universitatis Carolinae - Philologica 2-3: 105-123.
1987 Knizka o slovosledu. Praha: Academia.
1988 "Aktualni cleneni a slovosled ve slovenstine a cestine", Slavica Slovaca 23: 267-227.
1989 "Slovanske slovosledne izoglosy", in: Problemy teoretyczno-metodologiczne badan konfrontatywnych jezykow slowianskich. Warszawa, 101-115.
1992 "Slovosledne principy v jazykovem systemu a v reci", Slovo a slovesnost 53: 292-299.
1993 "Ten nejaky/nejaky ten a připady podobne", Nase řec 75: 247-254.
Wachowicz, Krystyna
1974 "Against the universality of a single wh-movement", Foundations of Language 11: 155-166.
Wojcik, Tomasz
1973 Gramatyka jezyka rosyjskiego. Warszawa: Panstwowe Wydawnictwo Naukowe.
Wright, T. & Talmy Givon
1987 "The pragmatics of indefinite reference: quantified text-based studies", Studies in language 11: 1-33.
Yokoyama, Olga T.
1986 Discourse and word order. Amsterdam: John Benjamins.
Zemskaja, E. A.
1973 Russkaja razgovornaja rec. Moskva.

Chryssoula Lascaratou

Basic characteristics of Modern Greek word order1

1. Introduction

Modern Greek, the living language which represents the Hellenic branch of Indo-European, is spoken by approximately eleven million speakers in Greece and Cyprus and some three million others in the 'Hellenic diaspora' in America, Canada, Australia, Britain and other parts of the world. This overview presents the main word order characteristics of the standard form of the language, which is Athenian Dimotiki.

2. Inflection and other functional categories

Modern Greek is a predominantly suffixing inflectional language. Articles, nouns, pronouns, demonstratives, and adjectives are marked for case. Four cases are distinguished, namely nominative, genitive, accusative and vocative, though there is some case syncretism. The three-way gender system, i.e. masculine, feminine and neuter, applies to both animate and inanimate entities. Number distinctions are singular and plural. Within the noun phrase all inflectable modifiers obligatorily agree in number, gender and case with the head noun. The verb is morphologically marked for the verbal categories of tense, person, aspect, mood, and voice. The basic formal tense distinction is between past and non-past, with further distinctions between imperfect and aorist for past, and present and future for non-past. With the exception of the past tense augment prefix e- accompanying monosyllabic and disyllabic forms, tense distinctions are coded by suffixes on the verb, while the proclitic particle θα is used to mark the future. Verb stems are also formally marked to express the perfective versus imperfective aspectual distinction, the perfect being periphrastically formed with the auxiliary exo 'have'. As for mood, Greek formally distinguishes between indicative, subjunctive, and imperative. The formal difference between indicative and subjunctive is basically marked by the selection of negators (δe(n) and mi(n), respectively) and particles (na and as for the subjunctive). Moreover, imperative and nonimperative forms are mainly distinguished by their inflectional suffixes, though there are also other formal distinctive characteristics, such as differences in clitic placement and negation, and the presence of a syncope rule in the imperative. Finally, two voice distinctions are morphologically coded on the Greek verb, namely active and mediopassive. Modern Greek verbal inflectional morphemes are often fused, i.e. 'portmanteau' (in Hockett's sense of the term). However, in those cases where they are segmentable they follow a specific relative order (see Ralli 1988). Finite verb forms obligatorily agree in person and number with their subjects; nonfinite forms never code person, though adjectival participles code number, case and gender. Modern Greek has a personal inflectional passive which presupposes a transitive verb, though there are some transitive verbs that do not passivize at all or at least not freely (Lascaratou 1984: 68-75). It is also possible, though very rare, to form passives from ditransitive verbs. There are also many periphrastic constructions with the auxiliary ime 'be' + passive participles derived from active, mostly intransitive, verbs having no other passive forms. These participles usually denote some natural, physical or mental state. (See Triantafyllidis 1941: 374; Tzartzanos 1946: 330-331; and Lascaratou & Philippaki-Warburton 1984: 105-106.) Mediopassive verb forms can also express reflexivity or reciprocity either on their own or by prefixation with afto- 'self' and alilo- 'each other', respectively. Finally, Greek displays cross-reference with (direct and indirect) objects by means of clitic pronouns. (For a discussion of clitics and their location see 4.2.1 below.)

3. The word order type Modern Greek has traditionally been characterized as having flexible order, SVO being the dominant order in main declarative clauses and VSO the normal order in most other cases, e.g., questions and subordinate clauses (Tzartzanos 1963: 273–277). Moreover, M. Greek was also classified as an SVO language in Greenberg's typology (1963), which, however, was based on main declarative sentences with nominal subject and object only. In Lascaratou (1989: 42 and 47–48), on the basis of the analysis of 2530 active declarative clauses, it was shown that SVO is by far the most frequent active transitive order (49.2%), as well as the overwhelmingly most frequent main clause order (62.8%). In other

Modern Greek word order

words, SVO is the main clause order par excellence, which justifies the widely held view that M. Greek is an SVO language.2 On the other hand, the equal distribution of SV and VS intransitive orders as well as passive ones, observed in Lascaratou (1989), constitutes evidence in support of Philippaki-Warburton's (1980 a, 1980 b, 1982, 1985 b, 1987) view that the clause initial position is a theme rather than a basic subject position. Thus, Lascaratou argues that the tendency of subjects (active or passive) to move to clause initial position (P1 in Dik's 1978: 175–176 and 178 terminology) is determined by their characteristics, e. g., definiteness, anaphoricity, categorial complexity (Dik's 1978: 190–212 LIPOC hypothesis), length (Lascaratou's 1989: 67–73 and 107 FIPROC hypothesis), etc., and the needs of communication. In conclusion, Lascaratou proposes that Greek should be reinterpreted as having a P1 VSO basic order, despite the attested statistical prevalence of SVO. Therefore, M. Greek could plausibly be classified as having free word order with SVO as its dominant active transitive order.

4. Order in declarative clauses

4.1. General remarks Tzartzanos (1963: 273–277) suggests that in an active independent declarative clause when 'speech is calm and free of passion' the normal order is SVO. Given that, as Tzartzanos claims, emphatic or contrastive constituents usually occur clause initially, he suggests that SVO is also used when the subject is stressed, either emphatically or contrastively. On the other hand, if it is the object or the verb which carries emphatic stress, then SVO may be replaced by OVS, or by VSO and VOS, respectively. In the light of the Prague School theory of Functional Sentence Perspective, Tzartzanos' claims could be interpreted as conforming with the distinction between the 'objective order' used in normal relaxed speech where 'theme' precedes 'rheme', and the 'subjective order' used in excited speech where 'rheme' precedes 'theme'. The same pragmatic principles underlie Philippaki-Warburton's (1980 b) claim that SVO is the result of the application of the 'Subject Thematization' rule, which serves clearly pragmatic needs. The first systematic, corpus-based,3 investigation of Greek constituent order realizations is to be found in Lascaratou (1989), a follow-up study of Lascaratou (1984), where a functional account of the flexibility of Greek order is offered in terms of specific variables determining order realizations. Thus, the main factors affecting the operation of Greek constituent order were found to be: (a) categorial complexity of constituents, i.e. Dik's (1978) Language-Independent Preferred Order of Constituents (LIPOC) hypothesis, (b) NP definiteness, (c) the distinction between main and subordinate clauses, and, most importantly, (d) NP size as expressed in Lascaratou's (1984, 1989) Function-Independent Preferred Relative Order of Constituents (FIPROC) linearization principle.

4.1.1. Categorial complexity (LIPOC), NP definiteness, subordination, NP size (FIPROC) and constituent order Dik's LIPOC hypothesis that the preferred position of constituents is from left to right in increasing order of categorial complexity was borne out for M. Greek by Lascaratou's (1984, 1989) findings on the positioning of (active and passive) subjects and objects. The effect of LIPOC on active nominal and pronominal subjects, in particular, is seen as a 'force' pushing them to occur preverbally. Moreover, NP definiteness is shown to operate in the same direction, i. e. favouring the positioning of definite NPs clause initially. The definite character of NPs itself is seen to be encouraged by two functions, namely the syntactic function of subject and the semantic function of agent, the former being stronger. The most important observation concerning the differences between main and subordinate clause orders is that in the latter there is a clear preference for the verb to occur clause initially, whereas in the former it is the subject which tends to occupy the initial position. Finally, Lascaratou's FIPROC hypothesis, i. e. the preference for longer (in number of words) NPs to occupy later positions in the clause than shorter ones, irrespective of their syntactic, semantic and pragmatic functions, was observed to prevail and even override the definiteness variable. What is more, it was shown that it is the ratio of the larger to the smaller constituent, rather than their absolute difference in size, which determines their relative positioning. In conclusion, despite its great flexibility, Greek order is not arbitrary, but rather it is determined by (and to some extent predictable from) certain tendencies, tension being caused when conflicting forces are at play.
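The FIPROC preference can be illustrated with a small Python sketch. It is only an illustration under stated assumptions: the function names are hypothetical, constituent size is approximated by counting orthographic words in the transliteration, and no threshold on the size ratio is implied, since none is given here. The test strings are the postverbal object and subject of example (14) in section 4.2.2.

```python
def word_count(constituent: str) -> int:
    """Size of a constituent, approximated as its number of orthographic words."""
    return len(constituent.split())


def size_ratio(a: str, b: str) -> float:
    """Ratio of the larger constituent to the smaller one (always >= 1)."""
    larger, smaller = sorted((word_count(a), word_count(b)), reverse=True)
    return larger / smaller


def fiproc_order(constituents: list[str]) -> list[str]:
    """Shorter constituents first; sorted() is stable, so ties keep input order."""
    return sorted(constituents, key=word_count)


# Object and subject of example (14), in transliteration.
obj = "aftokinito"
subj = "i fili mu pu γnorises xtes"

print(fiproc_order([subj, obj]))  # the one-word object precedes the six-word subject
print(size_ratio(subj, obj))      # 6.0
```

The ratio, rather than the difference, is what the sketch exposes: a one-word and a three-word NP differ by two words, as do a ten-word and a twelve-word NP, but the ratios are 3.0 and 1.2, so on the account above only the first pair would be expected to show a strong ordering preference.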

4.2. The order of the verb and its arguments Since in Greek the syntactic functions of NPs are designated solely by morphological case markers rather than by order, it follows that all linear arrangements of the verb and its arguments are logically possible in all clause types. Moreover, subjectless orderings are also possible due to the pro-drop nature of Greek. Thus, personal pronoun subjects are dropped unless used emphatically or contrastively. Finally, the existence of personal clitic pronouns, which may either occur on their own or co-occur with their referent NP (clitic doubling), adds to the multiplicity of possible orders.

4.2.1. Clitics and their location Of the many clitic elements (mostly proclitic but also enclitic) which exist in Modern Greek, personal clitic pronouns4 deserve special attention in a discussion of word order in general and the order of the verb and its arguments in particular. Personal clitics generally appear in two cases, namely (a) in the accusative, if they function as a direct object, or (b) in the genitive, if they function as an indirect object, an experiencer or an 'ethical dative'.5 With finite forms, clitic pronouns occur preverbally. With imperative or participial forms, however, clitics occur postverbally.

(1) Se γnoriz-i.
    you-ACC:CLT:2SG knows-PRS:3SG
    'He knows you.'

(2) Mu eδo-se to γrama.
    me-GEN:CLT:1SG gave-PST:3SG the-ACC:SG:NT letter-ACC
    'He gave me the letter.'

(3) δό-se mu tin efimeriδ-a!
    give-IMP:2SG me-GEN:CLT:1SG the-ACC:SG:F newspaper-ACC
    'Give me the newspaper!'

(4) Vlep-ondas to
    seeing-PART it-ACC:CLT:3SG:NT
    'Seeing it'

There are some restrictions on the combination and the order of clitic pronouns. Thus, the genitive clitic obligatorily precedes the accusative clitic, whether they attach proclitically or enclitically. With monosyllabic imperatives, however, it is possible, but less preferable, to reverse this order. Further restrictions on the combination and the order of two clitics in a clause are summarized in Joseph & Philippaki-Warburton (1987: 214). More specifically, there cannot be a combination of two clitics one of which is first person and the other second person. Moreover, in any combination of two clitics the second clitic, which must be in the accusative, must be 3rd person. If there is only one clitic, however, there is no restriction on either person or case.

(5) a. Mu to estil-e.
       me-GEN:CLT:1SG it-ACC:CLT:3SG:NT sent-PST:3SG
       'He sent me it.'
    b. *To mu estil-e.

(6) a. Pes mu to!
       say-IMP:2SG me-GEN:CLT:1SG it-ACC:CLT:3SG:NT
       'Say it to me!'
    b. Pes to mu!
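The restrictions just listed can be restated as a small checker. This is a minimal sketch, not an analysis from the literature: the class and function names are hypothetical, and the requirement that 'the second clitic' be a third person accusative is read here as applying to the accusative member of the pair, so that the reversed order in (6b) is still covered.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Clitic:
    form: str    # transliterated form, e.g. "mu"
    case: str    # "GEN" or "ACC"
    person: int  # 1, 2 or 3


def clitic_sequence_ok(clitics, monosyllabic_imperative=False):
    """Check a sequence of one or two object clitics against the stated restrictions."""
    if len(clitics) == 1:
        return True  # a single clitic is unrestricted in person and case
    if len(clitics) != 2:
        raise ValueError("only one- and two-clitic sequences are described")
    first, second = clitics
    # A two-clitic sequence pairs one genitive with one accusative.
    if {first.case, second.case} != {"GEN", "ACC"}:
        return False
    # First and second person clitics cannot combine.
    if {first.person, second.person} == {1, 2}:
        return False
    # The accusative member must be third person (assumed reading, see above).
    accusative = first if first.case == "ACC" else second
    if accusative.person != 3:
        return False
    # Genitive precedes accusative; reversal only with monosyllabic imperatives.
    return first.case == "GEN" or monosyllabic_imperative


mu = Clitic("mu", "GEN", 1)
to = Clitic("to", "ACC", 3)

print(clitic_sequence_ok([mu, to]))                                # True, cf. (5a)
print(clitic_sequence_ok([to, mu]))                                # False, cf. (5b)
print(clitic_sequence_ok([to, mu], monosyllabic_imperative=True))  # True, cf. (6b)
```

The checker returns a plain yes/no and therefore does not capture the remark that the reversed order in (6b) is possible but less preferable.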

4.2.2. Transitive clauses Besides the statistically dominant SVO order, all the other combinations of V, S and O are theoretically possible because of the flexible nature of Greek order already discussed in section 3 and section 4.1. The distribution of the eight transitive orders in Lascaratou's (1989) written corpus mentioned earlier (section 4.1) is presented in Table 1 as an indication of their textual frequency. As mentioned in section 3 above, SVO is also the overwhelmingly most prevalent order in written texts, while VSO, which Philippaki-Warburton (1982) has proposed to be the basic order in Greek, is extremely infrequent (1.1%). This, however, does not invalidate her claims since they are based on the criteria of (a) the least marked order and (b) the simplification of grammar, rather than on any frequency considerations.6 Greek offers a very wide range of sentence perspectives mapped onto an equally wide range of ordering possibilities. The most natural answer to the question 'What happened?', i. e. when all information is new, is the VSO order, which is morphologically, intonationally and pragmatically least marked, as in

(7) Xtipi-se o petr-os tin elen-i.
    hit-PST:3SG the-NOM:SG:M Peter the-ACC Helen-ACC
    'Peter hit Helen.'

Table 1. Textual frequency of transitive orders in written M. G.

ORDER    Nr      % relative to 2530
SVO      1246    49.2
SOV        19     0.7
VSO        27     1.1
VOS        19     0.7
OSV        10     0.4
OVS       219     8.7
VO        877    34.7
OV        108     4.3
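The figures in Table 1 can be cross-checked against the proportions quoted in note 6 (83.9% subject-before-object, and 88.3% OVS among object-before-subject clauses). The following Python sketch recomputes them from the printed counts; the variable and function names are illustrative only, and recomputed percentages for the smallest cells may differ from the printed ones in the last digit, depending on rounding.

```python
# Counts as printed in Table 1 (Lascaratou 1989, written corpus).
COUNTS = {
    "SVO": 1246, "SOV": 19, "VSO": 27, "VOS": 19,
    "OSV": 10, "OVS": 219, "VO": 877, "OV": 108,
}
CORPUS_SIZE = 2530  # the base used for the percentage column


def percent(count: int, total: int) -> float:
    """Share of `count` in `total`, as a percentage rounded to one decimal."""
    return round(100 * count / total, 1)


# Clauses with both S and O, split by which argument comes first.
subject_first = sum(COUNTS[o] for o in ("SVO", "SOV", "VSO"))
object_first = sum(COUNTS[o] for o in ("VOS", "OSV", "OVS"))

print(percent(COUNTS["SVO"], CORPUS_SIZE))                   # 49.2
print(percent(subject_first, subject_first + object_first))  # 83.9
print(percent(COUNTS["OVS"], object_first))                  # 88.3
```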

There is, however, a strong tendency for arguments to occur clause initially when they are positively marked for definiteness and agentivity, these features being closely related to the pragmatic function of theme, on the one hand, and the syntactic function of subject, on the other. It is not surprising, therefore, that SVO is the most frequent transitive order. Moreover, SVO is also used when only the subject conveys new information, in which case it takes emphatic or contrastive stress. Therefore, SVO can express both the theme > rheme and the rheme > theme perspectives in an utterance. On the other hand, OVS is preferred when it is the object which represents new information and carries emphatic or contrastive stress. Consider:

(8) a. o petros xtipise tin eleni.
    b. O PETROS xtipise tin eleni.

(9) TIN ELENI xtipise o petros.

Finally, the six possible combinations of V, S and O as well as the two subjectless orders (see Table 1) can also appear with a preverbal object clitic co-occurring with the full object NP to which it refers (clitic doubling).7 As a result the object is derhematized or defocalized, which means that either the subject or the verb can function as rheme. E.g.:

S PROcl V O:
(10) a. o petros ti XTIPISE tin eleni.
                 her-ACC:CLT:3SG:F
        'What Peter did to Helen was to hit her.'
     b. O PETROS ti xtipise tin eleni.
        'It was Peter who hit Helen.'

PROcl V S O:
(11) Ti XTIPISE o petros tin eleni.
     'What Peter did to Helen was to hit her.'

O S PROcl V:
(12) Tin eleni, O PETROS ti xtipise.
     'As for Helen, it was Peter who hit her.'

As mentioned in section 4.1.1, categorial complexity and, even more so, constituent length have a strong effect on Greek order. The following are examples of their effect on the positioning of subject constituents.

Nominalized clausal subject in final position:
(13) Me foviz-i to
     me-ACC:CLT:1SG scares-PRS:3SG the-NOM:SG:NT
     na oδiγ-ó ti nixt-a.
     to drive-PRS:1SG the-ACC night-ACC
     'It scares me to drive at night.'

Lengthy subject in final position:
(14) Aγora-se aftokinit-o i fil-i
     bought-PST:3SG car-ACC the-NOM:SG:F friend-NOM
     mu pu γnori-ses xtes.
     my-POSS:GEN:CLT:1SG whom-ACC met-PST:2SG yesterday
     'My friend whom you met yesterday bought a car.'

4.2.3. Intransitive clauses Preverbal and postverbal subjects are equally distributed in intransitive clauses, which means that there is no 'dominant', 'typical' intransitive order. In existential clauses, however, VS is more natural (whether followed or preceded by the locative), SV being acceptable only if the locative occurs in clause final position. E.g.:

(15) VS: a. Ine/iparx-i mj-a jinek-a sto δomati-o.
            is/exists-PRS:3SG a-NOM woman-NOM in the-ACC room
            'There is/exists a woman in the room.'
         b. Sto δomatio ine mja jineka.
     SV: c. Mja jineka ine sto δomatio.

In presentative clauses both SV and VS are possible, though again VS is more natural.

(16) a. Sinev-i en-a fover-o atixima.
        happened-PST:3SG a-NOM terrible-NOM accident-NOM:SG:NT
        'A terrible accident happened.'
     b. Ena fovero atixima sinevi.

Overall, the strikingly smaller proportion of preverbal intransitive subjects (49.7%) as opposed to preverbal transitive ones (82.0%) (Lascaratou 1989: 105–106) should be related to the significantly weaker tendency of intransitive subjects to be definite. This characteristic of intransitive subjects can plausibly be interpreted in terms of the absence of two central properties of prototypical agents, namely 'causation' and 'saliency of cause', both assumed to encourage the definiteness of the entity involved. Finally, the strong effect of length and complexity on the positioning of intransitive subjects overrides that of NP definiteness.

4.2.4. Ditransitive clauses In ditransitive clauses there is no fixed order for patient and recipient/benefactive/malefactive arguments, the latter being marked either by a preposition or the genitive case. Despite the multiplicity of theoretically potential orders, it was observed in Lascaratou (1994: 79–85) that a very restricted number of orders actually occur, the most frequent being (S) V NPac PP (37.4%), (S) PROcl gen V NPac (32.8%), and (S) V PP NPac (19.2%). There is, therefore, a very strong preference for the patient to occur immediately after the verb and be followed by the recipient realized as a prepositional phrase rather than as a genitive. The avoidance of genitive indirect objects should be attributed to the ambiguity between a possessive reading and an indirect object reading potentially present in NPac [± animate] NPgen [+animate] sequences. The statistical dominance of order (S) V NPac PP confirms Lascaratou's (1994: 79) hypothesis that on the analogy of the existence of a typical transitive order, i. e. SVO, there should also exist a typical ditransitive one which best approximates to SVO. Moreover, on the basis of statistical tests based either on Lascaratou's FIPROC hypothesis or Hawkins's (1990, 1991 a, 1991 b, 1994) Early Immediate Constituents (EIC) principle, it was shown that the primary function of order (S) V PP NPac is to facilitate communication by postposing a heavy direct object. Finally, as mentioned earlier in section 4.2.1, the relative order of pronominal object clitics is fixed, namely genitive > accusative, the reverse order being possible only with monosyllabic imperative forms.

(17) O jan-is eδo-se to vivli-o
     the-NOM John-NOM gave-PST:3SG the-ACC book-ACC
     sti mari-a.
     to the-ACC Mary-ACC
     'John gave the book to Mary.'

(18) O janis tis to eδose.
             her-GEN:CLT:3SG:F it-ACC:CLT:3SG:NT
     'John gave it to her.'

4.3. Adverbials In line with the overall flexibility of Greek order, there is great freedom in the positioning of adverbials. Thus, generally speaking, adverbials and sequences of adverbials can appear in any clause position, the final position being the most natural. Moreover, when used emphatically or contrastively, they tend to occur clause initially. There is, however, a restriction on the placement of valency adverbials, namely that they cannot be placed between an auxiliary and a verb. Also, illocutionary adverbials prefer the clause initial position, followed by a comma intonational break. Finally, there is no clear preference for the relative ordering of adverbials within sequences of adverbials. (See Joseph & Philippaki-Warburton 1987: 40.)

5. Order in subordinate clauses Generally speaking, the order of major clausal constituents does not necessarily differ from that in main clauses. There is, however, a very strong tendency for the verb to be placed at the beginning of the clause, right after the complementizer. Tzartzanos (1963: 276), in fact, considers VSO order as the normal order in dependent clauses. Manner clauses introduced by sa(n) na 'as if', purpose clauses introduced by ja na 'in order that' and nominalized clauses introduced by to na permit only verb-initial order. Moreover, relative clauses with other than verb-initial order are rare.

6. Other sentence types

6.1. Questions Since yes/no questions are primarily marked and distinguished from declaratives by intonation, the same orders can be used as in declaratives. However, there is a strong preference for the verb to appear clause initially, the normal order being VSO in transitive clauses and VS in intransitive ones. (See Tzartzanos 1963: 275.) Emphatic constituents are also placed clause initially. The particles mipos, mi(n), taxa and araje can be used in yes/no questions, their typical position being at the beginning of the clause, though mipos, taxa and araje can also occupy later positions.

(19) (Mipos) irθ-e o taxiδrom-os?
     q-PART came-PST:3SG the-NOM postman-NOM
     'Did the postman come?'

Question words in wh-questions are typically placed clause initially with the verb immediately following. When more than one element of the clause is questioned simultaneously, all the respective q-words are preposed. The relative ordering in a sequence of adverbial q-words is not strict, the preferred order being where > when > why > how. In multiple wh-questions comprising also a subject or object question, the subject or object q-word appears first in the sequence of q-words. If both the S and O of the clause are questioned, then SVO is the most natural order. (See also Joseph & Philippaki-Warburton 1987: 10.)

(20) Pj-on xtipise o petros?
     whom-ACC:SG:M
     'Whom did Peter hit?'

(21) Pj-os, pu, pote, jati ke pos xtipise tin eleni?
     who-NOM
     'Who, where, when, why and how hit Helen?'

(22) Pj-os zilev-i pj-on?
     who-NOM envies-PRS:3SG whom-ACC
     'Who envies whom?'

6.2. Imperatives It is only the second person singular and plural that have special imperative forms, which, moreover, are exclusively positive. All the other person-number combinations are realized periphrastically with the particle as. With respect to order, what makes this morphological distinction significant is that object clitics attach enclitically with second person imperative forms rather than proclitically, as is the case with all other verb forms. Moreover, only monosyllabic second person positive imperatives allow the genitive > accusative relative order of clitic objects to be reversed. (See sections 4.2.1 and 4.2.4 above.) Furthermore, the verb typically appears clause initially. If an overt subject is also present, then it occurs either at the beginning or at the end of the clause.

(23) a. Fij-e apo 'δo!
        go away-IMP:2SG from here
        'Go away from here!'
     b. ESI fije apo 'δo!
        you-NOM
        'YOU go away from here!'
     c. Fije apo 'δo esi!

(24) a. Γra-pse to!
        write-IMP:2SG it-ACC:CLT:3SG
        'Write it!'
     b. To γrapse!

6.3. Negatives Sentence negation in Greek is realized with the use of negative particles, among which δe(n) and mi(n) are by far the most widely used. Though they differ considerably with respect to distribution (see Joseph & Philippaki-Warburton 1987: 63–66), there are significant points of similarity in the placement of δe(n) and mi(n). More specifically, both δe(n) and mi(n) are proclitic on the verb. Only other clitic forms, namely the genitive and accusative pronominal clitics, can intervene between δe(n)/mi(n) and the verb. In the case of δe(n), the future particle θα may also intervene. Both δe(n) and mi(n) occur with monolectic and periphrastic (perfect system) forms. In the case of periphrastic forms, both particles attach proclitically on the auxiliary verb exo 'have' or ime 'be'.

(25) δe θa fiγ-o tora.
     not FUT PART leave-1SG now
     'I will not leave/am not leaving now.'

(26) As min aku-si o angel-os!
     HORT PART not hear-3SG the-NOM Angelos-NOM
     'Let/May Angelos not hear!'

(27) Isos na min ex-un ksipni-si akoma.
     perhaps MOD PART not have-3PL woken-PERF yet
     'Perhaps they have not woken up yet.'

7. The noun phrase The various classes of modifiers of the Modern Greek noun phrase exhibit a considerable degree of flexibility with respect to their location before or after the head noun. Thus, only articles are obligatorily prenominal, though the definite article can also be repeated postnominally with a postnominal modifier (see section 7.1 below). Demonstratives, numerals, adjectives and adjectival participles are normally but not necessarily prenominal. On the other hand, relative clauses and possessive clitics are obligatorily postnominal while nominal possessors are preferably located postnominally, their prenominal placement also being possible.

7.1. The article Both the definite and the indefinite articles are always prenominal. Moreover, when there is an adjective pre- or postmodifying the noun, the definite article may occur twice, i. e. both before the adjective and before the noun. The definite article, therefore, may be repeated postnominally with a postnominal modifier, but it never occurs postnominally on its own.

(28) enas eksipn-os andr-as
     a-IND ART:NOM:SG:M clever-NOM man-NOM
     'a clever man'

(29) a. i omorf-i jinek-a
        the-DEF ART:NOM:SG:F beautiful-NOM woman-NOM
        'the beautiful woman'
     b. i omorfi i jineka
     c. i jineka i omorfi

7.2. The demonstrative Free demonstratives, i. e. demonstrative pronouns serving as modifiers, obligatorily co-occur with the definite article. They are normally placed prenominally, immediately followed by the definite article. When there are adjectives in the NP, the demonstratives occur before the whole NP. For reasons of emphasis, it is also possible to place the demonstratives postnominally or postadjectivally, i. e. between the adjective and the noun. Consider:

(30) a. aft-i i musik-i
        this-DEM:NOM:SG:F the-NOM music-NOM
        'this music'
     b. afti i apal-i musiki
               soft-NOM
        'this soft music'
     c. i musiki afti
     d. i apali afti musiki

7.3. The adjective Attributive adjectival modifiers are prenominal. With definite NPs the adjective can also occur postnominally if the definite article is repeated with it. This repetition of the article is also possible with prenominal adjectives, its effect being emphatic. For reasons of emphasis the adjective can also be placed postnominally in indefinite NPs.

(31) a. Latrev-o tis kitrin-es frezi-es.
        adore-PRS:1SG the-ACC:PL:F yellow-ACC freesias-ACC
        'I adore yellow freesias.'
     b. Latrevo tis frezies tis kitrines.
     c. Latrevo tis kitrines tis frezies.

(32) a. I an-a θel-i mj-a ksanθ-ja
        the-NOM Anna-NOM wants-PRS:3SG a-ACC blond-ACC
        kukl-a me γalan-a mat-ja.
        doll-ACC with blue-ACC eyes-ACC
        'Anna wants a blond doll with blue eyes.'
     b. I ana θeli mja kukla ksanθja me matja γalana.

On the other hand, with predicative adjectival modifiers the order can be either Adj Art N or Art N Adj.

(33) a. Anisix-i i γon-is
        worried-NOM the-NOM:PL:M parents-NOM
        akuγ-an to jatr-o.
        listened-IMPERF:3PL the-ACC doctor-ACC
     b. I γonis anisixi akuγan to jatro.
        'The parents, worried, were listening to the doctor.'

The adjective is regularly postposed when it occurs with its own PP complement. Prenominal placement of the adjective and its PP complement is also possible, the effect, however, being quite stilted.

(34) a. enas andras iperifanos ja to jo
        a-NOM man-NOM proud-NOM for the-ACC son-ACC
        tu
        his-POSS:GEN:CLT:3SG:M
        'a man proud of his son'
     b. enas iperifanos ja to jo tu andras

Both prenominal and postnominal placement of an adverbially modified adjective is possible, the former being more natural, e.g.:

(35) enas ekseretika iperifanos andras
          extremely
     'an extremely proud man'

7.4. The numeral Ordinal numerals behave like common adjectives with respect to order. As for cardinal numerals, they are normally prenominal, though for reasons of emphasis they may, rarely, be placed postnominally. E. g.:

(36) δeka mil-a
     ten apples-NOM
     'ten apples'

(37) andr-es ikosi, jinek-es okto
     men-NOM twenty, women-NOM eight
     'twenty men, eight women'

7.5. Participial modifiers Of Modern Greek participles only passive ones can function as NP modifiers. With respect to order, passive participles, which are adjectival in nature (see Lascaratou & Philippaki-Warburton 1984, and Lascaratou 1989 a), behave like common adjectives.

7.6. The relative clause Relative clause modifiers (introduced either by the relative pronoun o opios/i opia/to opio or the relative complementizer pu) always occur postnominally, no variation being possible. Relative pronouns are obligatorily placed at the beginning of the clause. If the pronoun o opios is governed by a preposition, then the preposition must precede the relative pronoun. Nominals modified by genitive relative pronouns having possessive or comparative function may be positioned either before or after o opios.

(38) a. i irin-i, i opia/pu / tin opia/pu ...
        the-NOM Irene-NOM, who / whom
     b. i irini, ja/apo/me tin opia ...
                 for/from/with whom
     c. i irini, tis opias to vivli-o ...
                 whose the-NOM book-NOM
     d. i irini, to vivlio tis opias ...
     e. i irini, tis opias meγaliter-i ime, ...
                 older am-1SG
     f. i irini, meγaliteri tis opias ime, ...

7.7. Possessive genitives When both the possessor and the possessed are nouns, they are simply juxtaposed. The genitive can either follow or precede the noun, its postnominal placement being more common and natural. When the possessive genitive is a clitic pronoun, it obligatorily occurs enclitically after the noun or a nominal modifier.

(39) a. ta mat-ja tis xristin-as
        the-NOM:PL:NT eyes-NOM the-GEN:SG:F Christine-GEN
     b. tis xristinas ta matja
        'Christine's eyes'

(40) a. aft-i i δjakritik-i simberifor-a
        this-NOM the-NOM tactful-NOM behaviour-NOM
        tis
        her-POSS:GEN:CLT:3SG:F
        'this tactful behaviour of hers'
     b. afti i δjakritiki tis simberifora
     c. afti tis i δjakritiki simberifora

7.8. Sequence of NP modifiers When the nominal modifiers which are not tied to specific NP slots occupy their preferred positions, the most typical order is Dem Art Num Adj N Gen Rel. This linearization pattern largely conforms with the intuitions expressed by various linguists (e.g. Dik 1978; Hawkins 1983, 1990; Lascaratou 1989), according to which 'heavier', i. e. categorially more complex and/or lengthier constituents tend to occupy later positions in a string of elements. Some variation is possible, namely, for reasons of emphasis the demonstrative can occur after the first modifier. Also the possessive nominal can be placed at the beginning of the NP. Compare:

(41) a. aft-a ta pende mikr-a vivli-a
        these-NOM the-NOM five small-NOM books-NOM
        tu angel-u pu ...
        the-GEN:SG:M Angelos-GEN that-RELAT COMP
        'these five small books of Angelos that ...'
     b. ta pende afta mikra vivlia tu angelu pu ...
     c. tu angelu, afta ta pende mikra vivlia pu ...

8. Final remarks Despite its space limitations, the above descriptive account of the main word order characteristics of Modern Greek was intended to bring out its very flexible, though not arbitrary, nature. The first systematic attempt to investigate and assess the effect of syntactic versus pragmatic factors on Greek word order variation is to be found in Lascaratou (1994), where Hawkins' (1990, 1991 a, 1991 b, 1994) syntactically motivated Early Immediate Constituents (EIC) principle is evaluated against Givon's (1983, 1988) Task Urgency pragmatic theory, on the one hand, and the Prague School principles, on the other.

Notes

1. I would like to thank Irene Philippaki-Warburton and Mary Sifianou for their constructive comments on earlier versions of this survey.

2. It is worth noting here some controversial theoretical discussion on the status of the SVO order in M. Greek. Philippaki-Warburton (1980 a, 1980 b, 1982, 1985, 1987) has been concerned with the question of establishing the basic order in M. Greek within the framework of Chomsky's EST (her earlier account) or Government and Binding Theory (1981) and subsequent work (her latest analysis). The essence of Philippaki-Warburton's claims is that SVO is not the basic Greek order but a derivative one. SVO, she argues, is the result of the application of the Subject Thematization rule on VSO, which she claims to be the basic Greek order as it is morphologically, intonationally and pragmatically the least marked order. In establishing VSO as her basic order, Philippaki-Warburton has not been concerned with the criterion of frequency. She does, nevertheless, point out that SVO being the most natural thematic structure, it is not surprising that it is the most frequently occurring order. Philippaki-Warburton's claims, however, are not shared by Horrocks (1983) and Drachman (1985). The former, within the framework of Generalized Phrase Structure Grammar (GPSG), argues that SVO is not a derivative order but a basic one. The latter claims that in the light of pure T. G. criteria 'the unmarked syntactic order of constituents for Greek is indeed SVO'.

3. The empirical evidence of this study is drawn from two corpora, one consisting of 2530 active transitive clauses and 3525 passive ones, and a smaller one consisting of 300 active intransitive clauses used as a control group for the reconfirmation of assumptions and conclusions reached at an earlier stage of the study, i. e. Lascaratou (1984).

4. The function and distribution of M. Greek clitic pronouns have been dealt with in a number of studies, e.g., Anagnostopoulou (1992), Drachman (1970, 1985, 1991), Joseph (1978, 1980, 1983), Joseph & Philippaki-Warburton (1987), Warburton (1977), Philippaki-Warburton (1981, 1982, 1985 a), Theophanopoulou-Kontou (1986).

5. It is also possible to have nominative clitic pronouns, though their occurrence is strictly limited to two constructions. See Joseph (1981) and Joseph & Philippaki-Warburton (1987: 214–215).

6. With respect to the relative positions of S and O, in 83.9% of the cases the subject precedes the object, which indicates that Greek conforms with Dik's (1978: 176) claim that 'in the functional patterns of the vast majority of languages the Subj normally precedes the Obj'. It is, however, very significant to observe that in 88.3% of the clauses where it is the object that precedes the subject the order is OVS, while VOS and OSV are very marginal. This can be interpreted as evidence in support of Lascaratou's (1989) hypothesis that Greek is a language with a dominant SVO order which, however, could be interpreted as the result of the operation of P1 rules (in Dik's terminology) on a basic VSO order.

7. It should be noted, however, that clitic doubling is almost restricted to informal language and speech in particular. This explains its marginal occurrence in the written corpus of Lascaratou (1989).

References

Anagnostopoulou, E.
  1992 "Exartiseis tou A-Tonoumenou kai Klitika sta Nea Ellinika: O Paragontas tis Exoprotasiakis Syndesis" ("A dependent clitics in Modern Greek: the discourse-linking variable"). Studies in Greek Linguistics. Proceedings of the 13th Annual Meeting of the Department of Linguistics, University of Thessaloniki. Thessaloniki: Kyriakidis, 275–293.
Dik, Simon C.
  1978 Functional Grammar. Amsterdam: North-Holland.
Drachman, Gaberell
  1970 "Copying and order changing transformations in Modern Greek", Ohio State University Working Papers in Linguistics 4: 1–30.
  1985 "Language universals - two approaches", in: Pieper and Stickel (eds.), Studia Linguistica Diachronica et Synchronica. Amsterdam: Mouton, 175–201.
  1991 "On the order of clitics in Greek", Unpublished paper, University of Salzburg.
Greenberg, Joseph H.
  1963 "Some universals of grammar with particular reference to the order of meaningful elements", in: Joseph H. Greenberg (ed.), Universals of Language. Cambridge, Mass.: MIT Press, 73–113.
Hawkins, John A.
  1983 Word Order Universals. New York: Academic Press.
  1990 "A parsing theory of word order universals", Linguistic Inquiry 21: 223–261.
  1991 a "Syntactic weight versus information structure in word order variation", Linguistische Berichte. Special issue on Information Structure and Grammar.
  1995 A Performance Theory of Order and Constituency. Cambridge: Cambridge University Press.
Hawkins, John A., K. Horie, and Stephen Matthews
  1991 "On the interaction between performance principles of word order", in: John A. Hawkins and Anna Siewierska (eds.), Performance principles of word order. EUROTYP Working Paper II/2, 141–188.
Horrocks, G.
  1983 "The order of constituents in Modern Greek", in: Gerald Gazdar, Ewan Klein, and Geoffrey K. Pullum (eds.), Order, Concord and Constituency. Dordrecht: Foris Publications, 95–111.
Joseph, B. D.
  1978 Morphology and Universals in Syntactic Change: Evidence from Medieval and Modern Greek. Ph. D. thesis. Indiana University Linguistics Club.
  1980 "Recovery of information in relative clauses: evidence from Greek and Hebrew", Journal of Linguistics 16: 237–244.
  1981 "On the synchrony and diachrony of Modern Greek na", Byzantine and Modern Greek Studies 7: 139–154.
  1983 "Relativization in Modern Greek: another look at the accessibility hierarchy constraints", Lingua 60: 1–24.
Joseph, B. D. and Irene Philippaki-Warburton
  1987 Modern Greek. London: Croom Helm.
Lascaratou, Chryssoula
  1984 The Passive Voice in Modern Greek. Unpublished Ph. D. thesis, Reading University.
  1989 A Functional Approach to Constituent Order with Particular Reference to Modern Greek. Implications for Language Learning and Language Teaching. Parousia Journal Monograph Series, Vol. 5. Athens.
  1989 a "How 'adjectival' are adjectival passive participles in Modern Greek and English?", Glossologia 7–8: 87–97.
  1994 Performance Principles in Word Order Variation. Parousia Journal Monograph Series, Vol. 29. Athens.
Lascaratou, Chryssoula and Irene Philippaki-Warburton
  1984 "Lexical versus Transformational Passives in Modern Greek", Glossologia 2–3: 99–109.
Philippaki-Warburton, Irene
  1980 a "Some aspects of Functional Sentence Perspective in Modern Greek." Paper presented at the English Department, Athens University.
  1980 b "Provlimata Schetika me ti Seira ton Oron stis Ellinikes Protaseis" ("Problems related to constituent order in Greek sentences"). Paper presented at the Conference on Modern Greek in English Speaking Countries, held in Athens in 1980. Published in Glossologia 1 (1982), 99–107.
  1981 "Katholikoi Periorismoi kai Neoelliniki Syntaxi" ("Universal constraints and Modern Greek syntax"). Studies in Greek Linguistics. Proceedings of the 2nd Annual Meeting of the Department of Linguistics, University of Thessaloniki. Thessaloniki: Kyriakidis, 179–212.
  1982 "I Simasia tis Seiras Rima Ypokeimeno Antikeimeno sta Nea Ellinika" ("The significance of the verb subject object order in Modern Greek"). Studies in Greek Linguistics. Proceedings of the 3rd Annual Meeting of the Department of Linguistics, University of Thessaloniki. Thessaloniki: Kyriakidis, 135–158.
  1985 a "I Theoria ton Kenon Katigorion: to Elleipon Ypokeimeno kai oi Klitikes Antonymies sti Nea Elliniki" ("On the theory of empty categories: the missing subject and clitic pronouns in M. Greek"). Studies in Greek Linguistics. Proceedings of the 6th Annual Meeting of the Department of Linguistics, University of Thessaloniki. Thessaloniki: Kyriakidis, 131–153.
  1985 b "Word order in Modern Greek", Transactions of the Philological Society 2: 113–143.
  1987 "The theory of empty categories and the pro-drop-parameter in Modern Greek", Journal of Linguistics 23: 289–318.
Ralli, A.
  1988 Elements de la Morphologie du Grecque Moderne: La Structure du Verbe. Unpublished Ph. D. dissertation, University of Montreal.
Theophanopoulou-Kontou, D. 1986 "Kenes Katigories kai Klitika stin N. E. I Periptosi tou Amesou Antikeimenou" ("Empty categories and clitics in M. G. The case of the direct object"), Glossologia 5-6: 41-68. Triantafyllidis, M. 1978 Neoelliniki Grammatiki (tis Dimotikis) (Modern Greek Grammar (of Dimotiki). Reprint of the 1941 edition with corrections. Aristotelian University of Thessaloniki, Modern Greek Studies Institute, Thessaloniki. Tzartzanos, A. 1946 Neoelliniki Syntaxis (Modern Greek Syntax), Vol. I. Athens: Organismos Ekdoseos Scholikon Vivlion. 1963 Neoelliniki Syntaxis (Modern Greek Syntax), Vol. II. Second edition. Athens: Organismos Ekdoseos Didaktikon Vivlion. Warburton, Irene P. 1977 "Modern Greek clitic pronouns and the surface structure constraints hypothesis," Journal of Linguistics 13: 259—281.

Maria Vilkuna

Word order in European Uralic

1. Introduction¹

The purpose of this paper is to present a descriptive, data-oriented but typologically informed survey of word order in Uralic languages spoken in Europe, with special attention to the less well known languages.² The classification of the Uralic languages with estimates of current numbers of speakers and areas of distribution is presented in Table 1.³ In more traditional classifications, the label "Volgaic" is used to cover Mordvin and Mari, but in the absence of comparative evidence for a common protolanguage (for typological differences, see Saarinen 1991), both Mordvin and Mari are best seen, along with the more closely knit Permian group, as independent intermediate links in the chain of affinities ranging from Hungarian and Samoyed to Finnic and Sami. Another departure from the traditional classification is that the labels Uralic and Finno-Ugrian are treated as synonymous; there is little substantial evidence for a fundamental split between Samoyed and the rest of the languages as a group.

Outside the scope of the present survey are the languages spoken in Siberia: the Ob-Ugrian languages Mansi and Khanty as well as most Samoyed languages. However, the speakers of the largest language of this branch, Tundra Nenets, inhabit both sides of Northern Ural. The inclusion of Nenets in this survey is also motivated by its strict SOV character that illustrates one extreme of the rather many-faceted Uralic word order. 'Nenets' in this paper refers to Tundra Nenets only.

As is often the case with lists of languages, the language as opposed to dialect status of some of the languages enumerated in Table 1 is not uncontroversial. For example, Erzya and Moksha, both of which have a written standard, are sometimes presented in the literature as one language, Mordvin, while other sources compare their distance to that between Finnish and Estonian (Mosin and Majushkin 1983).
With Mari and Komi, the language-dialect question is less settled, since both have two separate written standards (Hill and Meadow Mari, Komi proper and Permyak; only Meadow Mari and Komi are used here). The Sami continuum containing up to eleven languages was earlier often treated as one language. Similar vagueness can be found in the Karelian area. The relevant dialect or language isoglosses are typically lexical, morphological and phonological, and little is known about syntactic differences between the subgroups.

Table 1. Classification of the Uralic (Finno-Ugrian) languages, with present numbers of speakers and areas of distribution. Based on the map Geographical distribution of the Uralic languages (Finno-Ugrian Society, Helsinki 1993). The languages that do not fall into the scope of the present survey are prefixed with '+'.

Language                              Speakers               Area

Sami languages ("Lappish")
  Southern Sami                       500 ?                  Norway, Sweden
  Ume Sami                            20 ?                   Norway (extinct), Sweden
  Pite Sami                           20 ?                   Norway (extinct), Sweden
  Lule Sami                           2,000                  Norway, Sweden
  Northern Sami                       30,000                 Norway, Sweden, Finland
  Inari Sami                          400                    Finland
  Kemi Sami                           extinct since 1800s    Finland
  Skolt Sami                          300                    Finland, Russia
  Akkala (Babinsk) Sami               8                      Russia
  Kildin Sami                         800                    Russia
  Ter Sami                            6                      Russia

Finnic ("Baltic-Finnic") languages
  Livonian                            very few               Latvia
  Estonian                            1,000,000              Estonia and adjacent areas
  Votian                              very few               Russia
  Finnish                             5,000,000              Finland and adjacent areas
  Ingrian                             300                    Russia
  Karelian, incl. Olonetsian          70,000                 Russia
  Ludian                              5,000                  Russia
  Vepsian                             6,000                  Russia

Mordvin languages
  Erzya                               500,000                Russia
  Moksha                              250,000                Russia

Mari ("Cheremis")                     550,000                Russia

Permian languages
  Udmurt ("Votyak")                   500,000                Russia
  Komi ("Zyryan"), incl. Permyak      350,000                Russia

Ugrian languages
  Hungarian                           14,000,000             Hungary and adjacent areas
  + Mansi ("Vogul")                   3,000                  Russia (extinct), Siberia
  + Khanty ("Ostyak")                 13,000                 Siberia

Samoyed languages
  + Nganasan ("Tavgi")                600                    Siberia
  + Enets ("Yenisei Samoyed")         very few               Siberia
  + Yurats                            extinct since 1800s    Siberia
  Nenets ("Yurak")                    27,000                 Russia, Siberia
  + Selkup ("Ostyak Samoyed")         1,500                  Siberia
  + Kamas                             extinct since 1989     Siberia
  + Mator                             extinct since 1800s    Siberia

As Comrie (1981: 91) notes, while the genetic and historical relationships inside the Uralic family are reasonably well-established, the structural and typological divergence of the family is considerable. A rather transparent way of grouping the languages is according to the most conspicuous sources of influence. Thus, Mari, Udmurt, and the Mordvin languages (especially Moksha) have had long and intensive contacts with Turkic languages, mainly Chuvash, Tatar and their ancestors. The structural traces of these contacts are clearly more transparent in the first two than in Mordvin. In Hungarian, Turkic influence is combined with Slavonic and Germanic. The Sami and Finnic languages display marked similarities with Scandinavian, and Baltic influence has been important in the formation of the Finnic group; German has left its mark on Estonian. Needless to say, at present the strongest influence on all languages spoken in Russia comes from Russian. This is naturally most apparent in the smallest groups, such as the easternmost Finnic languages, Karelian (see e.g., Filppula and Sarhimaa 1994), and especially Vepsian. All the languages treated here, except perhaps the rather rigidly verb-final Nenets, have "free" word order in the sense that none of them excludes any permutation of major clausal constituents (subject, object, verb). Different permutations are said to reflect subtle pragmatic distinctions, but at the same time, two alternative orders, SOV and SVO, are often perceived to be unmarked. This variability of order coupled with the minority (often endangered) status of many of the Uralic languages is not exactly favorable to obtaining reliable information on word order.
Potential interference can be expected from the dominating languages, including both normative Russian grammar and spoken Russian with its extensive freedom of order. Thus, clear native judgements concerning the pragmatic value of any particular permutation can only be expected from people who have investigated the issue. Word order information in the existing grammars is scarce, pending further research. Extensive research has, of course, been conducted on the three national languages, Hungarian (É. Kiss 1987 and subsequent work; see the references in É. Kiss 1994), Estonian (Remmel 1963, Tael 1988, 1990) and Finnish (Hakulinen 1976, Vilkuna 1989, Vainikka 1989). In order to broaden the empirical basis of this survey, its data base has been supplemented by text corpora in the case of Erzya, Estonian, Komi and Udmurt.⁴ Although the objective of the present survey is not to present exhaustive statistical information, frequency observations from the corpora will occasionally be cited to give some idea of the existing differences in the distribution of the various word order phenomena in the discussed languages.

2. Inflection and other functional categories

All the languages in the family have rich inflectional (suffixal) morphology with a predominantly agglutinative character. Fusional tendencies are found as well, most notably in Estonian and Sami. All the languages have rich case systems, from six or seven cases in Sami and Nenets to well over ten in the other languages. All the languages are accusative, with nominative subjects and subject agreement in number and person; Mordvin, Hungarian and Nenets (as well as Ob-Ugrian and other Samoyed languages) have object agreement systems. Intricate mechanisms of object case marking are common (see Wickman 1955). Invariable accusative marking of objects obtains only in Hungarian and Sami as well as, at least with finite verbs, in Mari; Nenets uses accusative except in imperatives. The choice between accusative and unmarked object in Permian languages is conditioned by definiteness (Baker 1985: 116–118), and Mordvin exhibits alternation between nominative (indefinite), genitive (definite) and local cases (Alhoniemi 1991). Finnic has a semantically distinctive contrast between partitive and "accusative" or non-partitive objects and, among the latter, a syntactically conditioned choice as to marking (genitive) vs. non-marking (nominative). Null pronouns (pro-drop) are typical in subject, object and possessor positions, albeit in a restricted fashion in Western Finnic (Estonian and Finnish). Expletive subjects are generally not used except in colloquial Finnish and, to judge from Nickel (1990: 397), Scandinavian-influenced Sami. A full-fledged personal passive is rare (existing as an alternative in Estonian), but impersonal passive-like constructions are found, a conspicuous feature in Finnish. Third person plural forms are widely used to refer to unspecified agents. More or less productive passive or passive-reflexive verb derivation also seems common, along with other valency-changing derivations, such as causatives.
Present-tense nominal, adjectival and locational predications are widely constructed without a copula. The Mordvin languages and Nenets inflect nominal predicates in person and number, and Mordvin, even in tense. Finnic and (most of) Sami is the only group inside the Uralic family to use a present-tensed copula as a rule; still, nominal sentences are common in Karelian and Vepsian, perhaps due to Russian support for the pattern, as well as in Southern and Ter Sami (Korhonen 1981: 343; Bergsland 1994). As for verb categories, the tense and mood systems vary considerably in complexity and functions. Auxiliaries are used in most languages (not in Mordvin and only marginally in Hungarian) to construct compound tenses, but the uses of the tenses vary; for example, evidentiality plays a particularly important role in Mari and the Permian languages, but the Finnish tense system is remarkably similar to the Scandinavian and English systems. Negation is usually expressed by an auxiliary, which indicates person and number agreement and sometimes tense, but in most languages a non-inflected particle (and in Western Mari, a negative suffix) completes the negative paradigm. Hungarian uses a negative particle, number-inflected negation appearing only in existential sentences. The standard Estonian negation element has completely lost its inflection outside the imperative. In NP structure, agreement of head and adjective or demonstrative is predominantly a Finnic phenomenon. The Finnish-Karelian area exhibits full agreement in number and case, but the Estonian NP-internal agreement is only partial. Sami has no adjective agreement but uses a special attributive form. In Hungarian, the demonstrative (but not the adjective) agrees; number agreement of adjectives is an option in Udmurt and Nenets. Typical of the Uralic NP is the use of possessive suffixes on a possessed head noun. These agree with the (generally optional) genitive possessor phrase. While the possessive suffix system is alive and well elsewhere, most Finnic languages except Finnish have lost or, in the case of colloquial Finnish, are in the process of losing it. Finnish and Sami share a system where a 3rd person suffix must be clause-internally bound (that is, by the genitive or by a higher constituent, typically the subject), while the other languages allow textual antecedents to the suffix.
In addition to indicating possession, the suffixes are used for various pragmatic purposes, conveying definiteness or familiarity, especially in Permian, Mari and Nenets; according to Rédei (1978: 78), this is the actual main function of the possessive suffix in present-day Komi. Hungarian and Mordvin are the only languages in the group to have developed articles. The former has both a definite and an indefinite article, stemming from the demonstrative and the numeral 'one', respectively, but Mordvin has a suffixal definite declension. Colloquial Finnish greatly favours the use of demonstrative and indefinite pronouns. Finally, an often repeated characteristic of the Uralic languages is the extensive use of inflected non-finite forms instead of finite subordinate clauses. Finite subordinate clauses are fully integrated in Finnic and Hungarian syntax but practically nonexistent in Nenets. Conjunctions are relatively new in the family; the languages spoken in Russia use quite a few transparently Russian conjunctions and complementizers, but other origins are possible as well, such as the gerund suysa 'saying; that' in Udmurt.

3. Word order type

The original basic order of major constituents in the Uralic family is generally assumed to have been SOV, but few of the present-day languages can be straightforwardly characterized as SOV languages. Nevertheless, the customary Greenbergian correlations of general head-final order prevail below sentence level. First of all, the languages are either absolutely or predominantly postpositional. A number of prepositions have evolved in Finnic and Sami, but postpositions clearly outnumber prepositions; for Estonian, Remmel (1963: 299–304) lists 64 postpositions, 15 prepositions, and 12 items with both organizations (sometimes with meaning differences). For Northern Sami, Nickel (1990: 167–188) reports 4 prepositions along with 61 postpositions and 24 items with ambivalent order, free or conditioned. Comparable figures can be cited for Finnish. All of the languages under consideration are suffixing, and given the tendency of postpositions to lose their independent word status, the line between case suffix and postposition is not always clear. The predominantly head-final NP structure, which will be discussed in section 7.1, further supports the Greenbergian picture. In many cases, postpositional phrases derive from NPs with a possessor premodifier and are in fact transparently similar to NPs. It might be helpful to think of the languages discussed here in terms of the following simplistic grouping:

• SOV: fairly rigid (Nenets), or consistent head-final characteristics but without rigid adherence to them at the clause level (Mari, Udmurt and Southern Sami).
• Topic-Focus (Hungarian): organization according to discourse function, immediately preverbal focus position.
• "Eastern" SVO (Komi, Mordvin, Karelian, Vepsian): extensive ordering variation, spontaneous verb or object focusing SOV occurs.
• "Western" SVO (Finnish, Estonian, Northern and Inari Sami): occurrence of OV is restricted to specific constructions; discourse-configurational nodes; V2 tendencies.
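The adposition counts cited in this section can be turned into proportions with a few lines of Python. This is only a sketch: the figures are those reported from Remmel (1963) and Nickel (1990), while the dictionary layout and the function name `adposition_shares` are assumptions made for illustration.

```python
# Adposition inventories as cited in the text (Remmel 1963; Nickel 1990).
# The data layout and the function name are illustrative assumptions.
ADPOSITIONS = {
    "Remmel 1963": {"postpositions": 64, "prepositions": 15, "both": 12},
    "Northern Sami (Nickel 1990)": {"postpositions": 61, "prepositions": 4, "both": 24},
}


def adposition_shares(counts):
    """Return each category's share of the whole inventory, in percent."""
    total = sum(counts.values())
    return {kind: round(100 * n / total, 1) for kind, n in counts.items()}


for source, counts in ADPOSITIONS.items():
    print(source, adposition_shares(counts))
```

In both inventories postpositions make up roughly two thirds or more of the items, which is the sense in which the languages are called predominantly postpositional.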

4. Major constituents in declarative clauses

In the following, section 4.1 concentrates on the essential properties of declarative word order in the investigated languages. Although these properties are not always most revealingly formulated in terms of S, V and O, section 4.2.1 presents some corpus-based data on the ordering of the verb and its arguments in Erzya, Komi and Udmurt, and section 4.2.2 makes some observations on ordering variation and marked orders in some of the less well known languages with special reference to Udmurt. Section 4.3 surveys the available information on focus and its relation to word order.

4.1. Some problems regarding word order type

Hungarian is not easily classified in terms of major clausal constituent order but rather conforms to the pattern Topic–Focus–Verb–Neutral. The Focus position may be occupied by a focus proper, defined as a type of exhaustiveness operator in the relevant literature, or in its absence, by an incorporated constituent such as a verbal particle, a predicate complement, or a non-individuated nominal complement (É. Kiss 1987, 1994; this volume). Under this analysis, the quite frequent unmarked SVO order is interpreted as a Topic–Verb–Neutral pattern with no special focusing or incorporated constituent, whereas the unmarked SOV is actually a Topic–Focus–Verb organization with an incorporated, non-individuated object instead of a focus in the above sense. In both cases, the combination of object and verb can be interpreted as "new information".

(1) Hungarian⁵
    a. Laci_T szereti Esztert.
       (name) love:OBJ-3SG (name):ACC
       'Laci loves Eszter.'
    b. Laci_T újságot_F olvas.
       (name) newspaper:ACC read.3SG
       'Laci is reading a newspaper.'

Grammatical function is not directly relevant to this organization, although statistical connections naturally exist. The subject of (1a), with an appropriate intonation, could be interpreted as occupying the F position, but the subject might as well appear in the Neutral position, as in (2):

(2) Hungarian
    Tegnap_T újságot_F olvasott Laci.
    yesterday newspaper:ACC read:PST.3SG (name)
    'Yesterday, Laci was reading a newspaper.'
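The Topic–Focus–Verb–Neutral pattern can be pictured as a fixed sequence of slots that are filled according to discourse function rather than grammatical function. The following Python sketch is an illustration only, not an implementation of the analysis cited above: the slot names come from the text, whereas the `linearize` function and the dictionary representation of a clause are hypothetical.

```python
# Slot order taken from the Topic-Focus-Verb-Neutral pattern in the text.
# The function and the clause representation are hypothetical illustrations.
SLOTS = ("topic", "focus", "verb", "neutral")


def linearize(clause):
    """Join the fillers of each slot in template order; empty slots are skipped."""
    words = []
    for slot in SLOTS:
        words.extend(clause.get(slot, []))
    return " ".join(words)


# (1a): no focused or incorporated element, object in Neutral (surface SVO).
ex_1a = {"topic": ["Laci"], "verb": ["szereti"], "neutral": ["Esztert"]}
# (1b): non-individuated object in the Focus slot (surface SOV).
ex_1b = {"topic": ["Laci"], "focus": ["újságot"], "verb": ["olvas"]}
# (2): adverbial topic, object in Focus, subject in Neutral.
ex_2 = {"topic": ["Tegnap"], "focus": ["újságot"], "verb": ["olvasott"], "neutral": ["Laci"]}

for example in (ex_1a, ex_1b, ex_2):
    print(linearize(example))
```

The point of the sketch is that the same template yields surface SVO, SOV and an order with a postverbal subject, depending only on which slots are filled.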

SVO is the prevalent order in Finnic, Sami (with the exception of Southern Sami), Mordvin and Komi, but its details differ. First of all, Estonian has a clear Verb-second (V2) character, usually considered to be the result of German and Scandinavian influence.⁶ Yet, the V2 phenomenon is not identical in Estonian and Germanic. First, while V2 syntax seems quite regular in written prose (Tael 1988), it is not obligatory; informants are not willing to exclude sentences that violate the V2 constraint, and exceptions are found, especially in spoken language (Saareste 1960) and with weak pronominal subjects. Examples of V2 inversion and lack of it are (3a) and (3b) below. In particular, WH questions are preferably rendered without subject-verb inversion. Second, V2 and lexical complementizers are not complementary in Estonian; SOV order is common in subordinate clauses, but by no means obligatory, as shown by the two subordinate clauses in (4). A further complication is the fact that Estonian allows SOV-ordered main clauses, especially negations, such as (5).

(3) Estonian (corp)
    a. Neid sõnu on ta juba ükskord kuulnud.
       those:PRTV words:PRTV is s/he already once hear:PART
       'He has already once heard those words.'
    b. Selles kohvikus ta küll varem ei olnud käinud.
       that:INE cafe:INE s/he PTL earlier NEG be:PART visit:PART
       'That cafe he certainly had not visited before.'

(4) Estonian (corp)
    just odrakülvi ajal,
    just barley.sowing:GEN time:ADESS
    a. kui külvajale lõunalauas antakse süüa seasaba,
       when sower:ALL lunch.table:INE give:IPS eat:INF pigtail
    b. et odrapead niisama pikaks kasvaksid.
       that barley.ear:PL equally long:TRNSL grow:COND:3PL
    '– – just at barley sowing time, when the sower is given a pigtail to eat at lunch so that the barley ears should grow to be as long [as pigtails].'

(5) Estonian (corp)
    Ojasson tavaliselt lehti ei lugenud,
    (name) usually papers:PRTV NEG read:PART
    'Ojasson did not usually read newspapers'

In both Estonian and Finnish, it is quite neutral (indeed, often obligatory) for a non-subject to occupy the subject position if no overt subject is available, as in impersonal sentences. This is one of the reasons why Finnish can be seen as a "discourse-configurational" language much like Hungarian, although obeying a different structural pattern: Contrast (K) – Topic (T) – Rest (Vilkuna 1989, 1995). The SVO character of Finnish is the result of the subject being the default topic element, and the object a postverbal Rest element in the neutral case. (6) is an example of an unmarked order conforming to this pattern, and (7) and (8) illustrate instances of a filled K position according to this analysis.

(6) Finnish
    Mikko_T syö lunta. ~ Täällä_T sataa lunta.
    Mikko eat:3SG snow:PRTV ~ here fall:3SG snow:PRTV
    'Mikko is eating snow. ~ It's snowing here.'

(7) Finnish
    Lunta_K Mikko_T syö. ~ Lunta_K täällä_T sataa.
    e.g. 'It's SNOW Mikko is eating. ~ It's SNOWING here.'
    OR: 'Snow Mikko DOES eat ~ Snowing it IS here'.

(8) Finnish
    Syö_K Mikko_T lunta. ~ Sataa_K täällä_T lunta.
    '(Oh yes), Mikko is eating snow ~ it is snowing here'

These permutations are meaningful, with changes in focusing and intonation, and allow slightly varying interpretations. The two types of reading indicated by the translations of (7) could be called contrastive focus and contrastive topic readings, respectively. None of the Uralic SVO languages are rigid; they all accept SOV under some conditions, as well as any other permutation. It is possible that in some of the languages, SOV could just be an archaic variant with no particular function, as the case might in principle be with Mordvin, where a recent decrease in the use of SOV can be attested on the basis of 19th century folklore texts (Saarinen 1991: 50), or Komi, where oral texts such as those in Rédei (1978) give a similar picture. A matter to be investigated is to what degree the SVO/SOV alternation shares its properties with the same alternation in Russian. Another potential SOV relic should be mentioned: Southern Sami. Interestingly, negation and tense auxiliaries and modal verbs take the medial position in this language, yielding SAuxOV structures such as (9b, c).

(9) a. Sami: Southern (Bergsland 1994: 59)
       Aehtjie ledtiem vöötji.
       father bird:ACC shoot:PST.3SG
       'Father shot a bird.'
    b. Sami: Southern (Bergsland 1994: 59)
       Tjidtjie ij raeffiem äadtjoeh.
       mother not.3SG peace:ACC get
       'Mother does not get peace.'
    c. Sami: Southern (Trosterud to appear)
       Maen'ne ätj'eb buots'ede gehtet [...]
       I shall:1SG reindeer:PL:ACC watch:INF
       'I must watch over the reindeer.'

Trosterud (to appear) reports that simple SOV-ordered main clauses are the majority in his material, recordings of Southern Sami speakers born in the 19th century. He further shows how the Scandinavian Sami area seems to form a continuum from clear SOV/SAuxOV in the South to clear (if not unexceptional) SVO/SAuxVO in Northern Sami. SOV (or complement–verb order in general) has various functions in Finnic and Sami. What is clear in the case of Finnish is that the object in OV (more accurately, in an OV-ordered "Rest" section) never carries the main new information of the sentence. This is not the case in Estonian or Russian, nor does it hold for our "eastern" SVO languages: Karelian, Vepsian, Erzya or Komi. (Cf. Holmberg, this volume.) For example, sentence (10) from the Komi corpus appears neutral in its context and could, in the abstract, very well answer any of the questions 'What now?', 'What has Avko done?', or 'What has Avko done to the bread?', with appropriate intonational variation of course (Jevgeni Cypanov, p.c.).

(10) Komi (corp)
     – Koni n'ebyd n'an'ys, kytts'ö lotin? –
     '– Where is the soft bread, where did you hide it? –'
     Avkoyd stav n'an'sö pazjöma.
     Avko:2SG all bread:ACC crumble:PST2.3SG
     '– That Avko of yours has crumbled all the bread.'

Let us now turn to the languages which are most consistently SOV, Mari, Udmurt and Nenets, surveying their other Greenbergian correlations at the clause level. The following examples are from Mari and come from a test asking informants to translate simple sentences in a context where the speaker reports a piece of news. As (11a) and (11b) show, locative complements seem to be placed like objects, and similar late placement is found with certain types of subject: presentational sentences are LocSV, as exemplified in (11c), although this order seems to be subject to more variation. The same orders are neutral in Udmurt.

(11) Mari (ques)
     a. Poskudo cylymym supses.
        neighbor pipe:ACC pull:3SG
        'The neighbor is smoking a pipe.'
     b. Sol'ym vüdys puren kajen.
        little.brother:1SG water:LAT go.out:GER go:PST2.3SG
        'My little brother fell into the water.'
     c. Kudyvecys(ke) kugu sem masina kudal purys.
        farmyard:LAT big black car run come.in:PST1.3SG
        'A big black car came into the yard.'

Orders like those in (11) have exact SOV(SXV)-type counterparts in Nenets, but in contrast to Mari and Udmurt, Nenets verb-finality is quite rigid, and the occurrence of VO/VX patterns very infrequent (see Salminen to appear).⁷ The SOV order is the typical instantiation of a principle according to which the constituent carrying new information immediately precedes the verb. Thus, along with the SOV sentence in (12a), we find OSV as in (12b) when the object is old and the subject, new:

(12) Nenets (ques)
     a. Nye xasawam Iad0°.
        woman man:ACC hit:SUBJ-3SG
        (What happened?) 'The woman hit the man.'
     b. Xasawam nye Iad0°da.
        man:ACC woman hit:OBJ-SG.SUBJ2-SG
        '(It was) the woman (who) hit the man.'
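The ordering principle illustrated in (12), with the constituent carrying new information placed immediately before a clause-final verb, can be stated as a very small procedure. The Python sketch below is a toy model under simplifying assumptions: only S, O and V are considered, exactly one argument is marked as new, and in an all-new context such as (12a) the object is taken to be that argument. The function name `nenets_order` is hypothetical.

```python
# Toy model of the principle described in the text: the verb is final and the
# argument carrying new information immediately precedes it. Restricting the
# clause to S, O and V, and the function name, are simplifying assumptions.
ARGUMENTS = ("S", "O")


def nenets_order(new):
    """Return the constituent order for a clause whose new information is `new`."""
    if new not in ARGUMENTS:
        raise ValueError("new must be 'S' or 'O'")
    old = [arg for arg in ARGUMENTS if arg != new]
    return old + [new, "V"]


print("".join(nenets_order("O")))  # pattern of (12a)
print("".join(nenets_order("S")))  # pattern of (12b)
```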

As a further "object patterner" (Dryer 1992), nominal and adjectival predicates precede the copula in, e.g., past tenses, as in (13). Note that predicative nominals immediately precede the copula in all but one of my relevant Udmurt examples and that they belong to the set of incorporated, or neutrally focuspositioned constituents in Hungarian.

(13) Mari (Alhoniemi 1993: 49)
     Myj uskal l styso lijam.
     I cow milk:PART be(come):1SG
     'I will be a cow-milker.'

Further, infinitival complements neutrally precede the verb in our SOV languages, as illustrated in (14).⁸ Note also the so-called converb construction in (11b) above, where the finite verb follows the gerundive verb. This construction is typical of Mari but also occurs widely in Udmurt. The Nenets example in (14c) is a "WH" construction exemplifying a non-finite interrogative complement:

(14) a. Mari (ques)
        Mylanem knigam ludas kiiles.
        I:DAT book:ACC read:INF must:3SG
        'I must read a/the book.'
     b. Udmurt (ques)
        Mynym uze mynyny kule.
        I:DAT work:ILL go:INF must
        'I must go to work.'
     c. Nenets (ques)
        Nyar° x0-nyana yilyewamt0 n0mt°r00sy°.
        friend:2SG where live:INF.ACC.2SG find.out:SUBJ-3SG.PST
        'Your friend found out where you live.'

Pre-verbal predicate and infinitival complements are quite marked, although not ungrammatical, in Finnish and Estonian. Even more marked in these languages would be the Mari and Udmurt Verb—Aux ordered compound tense pattern, a SOV-property to be more closely discussed in section 6.3. Interestingly, as we saw above, infinitival complements in Southern Sami are often post-verbal; even the simple examples given by Bergsland (1994: 74) all display the order Vfin—Vinf. Mari and Udmurt are also unique among the discussed languages in having clause-final subordinators (Nenets, as mentioned, does not use such words). The subordinate clauses are typically preverbal, but at least in Udmurt, the postverbal variant in (16b) is also possible, the complementizer in the latter case being optional. The actual variety of subordinate constructions in these languages, both finite and non-finite, cannot be done justice to here.

(15) Mari (ques)
     Nina kolym nales gyn, (myj) (tudym) kockam.
     (name) fish:ACC get:3SG if I it:ACC eat:1SG
     'If Nina gets a fish, I will eat it.'

(16) Udmurt (ques)
     a. Nina mon ts'oryg s'iis'ko suysa malpaz.
        (name) I fish eat:1SG COMP think:PST1:3SG
     b. Nina malpaz, mon ts'oryg s'iis'ko (suysa).
     'Nina thought that I eat fish.'

The perceived neutrality of SOV in Mari and Udmurt is supported by the use of this type of order in contexts such as textbook examples or answers to questionnaires. However, as mentioned earlier, neither of the languages is as rigid as Nenets; no permutation of SOV is excluded, and post-verbal complements are by no means rare. Section 4.2 analyzes Udmurt clause level ordering variation in more detail.

4.2. A closer look at three flexible word order languages

4.2.1. Distribution

This section reports some of the results of an investigation of the variation of major clausal constituent order in standard written Erzya, Komi, Udmurt and Finnish. Counting the frequencies of different orderings of S, O and V from a text corpus requires a relatively large corpus, especially if occurrences are to be restricted as in Siewierska (this volume). Collecting all declarative clauses with an overt S, O and V without additional constituents, but not yet excluding pronoun S and O or negative or subordinate clauses, I caught less than 3% of all the clauses in each language: 27 of the 920 Erzya clauses, 27 of the 963 Komi clauses, and 138 of about 5500 Udmurt clauses. The text frequency of objects (defined as accusative/nominative second arguments) is evidently too small in these languages for reliable assessment of the role of the less frequent variants. The next step was to relax the restriction of an object complement: comparable three-constituent clauses with overt S, V, and X were counted, where X is a complement (an object, a predicate complement, or an oblique complement of some sort) in the form of an NP or a PP. The distinction between oblique complements and adjuncts is notoriously fuzzy, and there probably are some inconsistencies in the analysis, but trusting that these will not cause any systematic error, the results are presented in Table 2.⁹ The Erzya and Komi data represent the complete texts, whereas the Udmurt data are from a 4000-sentence subcorpus where each individual text of the complete corpus is represented by approximately 200 clauses of running text.

Table 2. Order in three-constituent sentences with overt S, V, and X in Erzya, Komi and Udmurt literary texts

Order      Erzya              Komi               Udmurt
SVX         65    60.2%        49    50.0%        37     9.1%
SXV         11    10.2%        19    19.4%       218    53.8%
XSV         13    12.0%         8     8.2%        83    20.5%
XVS         16    14.8%        12    12.2%        53    13.1%
VSX          2     1.9%        10    10.2%        10     2.5%
VXS          1     0.9%         0     0.0%         4     1.0%
Total      108   100.0%        98   100.0%       405   100.0%

As it might be of interest to compare these figures with those from a more stable SVO language, Table 3 presents similar results from Finnish: the order of three-constituent positive declarative clauses with an overt NP subject and object in the so-called HKV corpus. This corpus consists of about 10 000 clauses of expository Finnish prose, mainly newspaper texts, so the text type is not strictly speaking comparable to the others discussed in this section.¹⁰

Table 3. Order in three-constituent sentences with overt S, V, and O in Finnish expository prose

SVO       423   78.2%
SOV        10    1.8%
OSV        27    5.0%
OVS        74   13.7%
VSO         7    1.3%
VOS         0    0.0%
Total     541  100.0%
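The percentages in Tables 2 and 3 can be recomputed from the raw counts, which also makes the cross-language comparison easier to see. The following is only a minimal sketch and not part of the original study: the counts are re-entered by hand from the two tables, the Finnish S/O/V orders are mapped onto the S/X/V labels purely for side-by-side comparison, and the helper name `shares` is hypothetical.

```python
# Counts transcribed from Tables 2 and 3. Mapping the Finnish O onto X is an
# assumption made here only so that the four languages share one set of labels.
ORDERS = ["SVX", "SXV", "XSV", "XVS", "VSX", "VXS"]
COUNTS = {
    "Erzya":   [65, 11, 13, 16, 2, 1],
    "Komi":    [49, 19, 8, 12, 10, 0],
    "Udmurt":  [37, 218, 83, 53, 10, 4],
    "Finnish": [423, 10, 27, 74, 7, 0],
}


def shares(counts):
    """Return each order's percentage share, rounded to one decimal place."""
    total = sum(counts)
    return {order: round(100 * n / total, 1) for order, n in zip(ORDERS, counts)}


for language, counts in COUNTS.items():
    pct = shares(counts)
    favourite = max(pct, key=pct.get)  # most frequent order in this language
    print(f"{language:8} favourite={favourite} ({pct[favourite]}%)  XVS={pct['XVS']}%")
```

Run as written, the loop prints the most frequent order for each language together with the share of the XVS (Finnish OVS) pattern, so the figures can be read off against the tables.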

It should be kept in mind that the data cited here have been collected for a rather special purpose; clauses overtly displaying the three required participants and excluding everything else do not necessarily yield the most natural or representative picture of the role of the relevant patterns in the respective languages. Irrespective of this, and leaving aside potential interference of text type, we can indeed conclude that Finnish exhibits the strongest concentration of SVX clauses, with comparatively lower figures in the other patterns, especially SXV. SXV is the favourite order in Udmurt, with Komi standing between Udmurt and Erzya in this respect. In all four languages, the V-initial orders form the clearest minority. The probability of VXS is low enough to lead to null occurrences in the Komi and Finnish material, and this is not surprising: not only is V-initialness a rare alternative in itself, VXS also presents a marked ordering of subject and complement. A striking coincidence is the similar frequency of the XVS/OVS pattern in all four languages. I will return to this in section 4.2.2, where subject position in Udmurt and Finnish will be discussed in more detail.

Extensive corpus-based calculations of Udmurt word order patterns have earlier been made by Suihkonen (1990). Her results (Table 6.11 in Suihkonen 1990: 264) show that in the 893 transitive clauses in her corpus, the most frequent pattern is object before verb: not SOV (21.2%) but OV (39.4%). The proportion of SVO and VO (30.5% in all) is not negligible either; only patterns with O preceding V and S are relatively infrequent (7.5%; 1.3% of the instances are unanalyzed). Thus the ratio of (S)OV, (S)VO and O(S)V(S) is 6/3/1. On the other hand, actual (S)OV patterns only seem to cover about 23% of the entire material. Compared to the results in Table 2, the frequency of (S)XV-type orders is lower in Suihkonen's material than in the present corpus. More specifically, in my count of "pure" S,O,V clauses, I found 8.7% SVO, while Suihkonen reports 14.3%, and her 21.2% of SOV is much lower than my 60.9% of the S,O,V material. There may be certain aspects of the analysis that partially account for this difference, but my guess is that the main reason is the character of the texts. While the present corpus consists of written fiction, 75% of Suihkonen's comes from old oral texts.
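The 6/3/1 ratio derived from Suihkonen's percentages can be retraced with a few lines of arithmetic. This is a sketch under the assumption that the ratio is obtained by grouping the quoted percentages and expressing each group in rounded tenths of the material; the variable names are illustrative only.

```python
# Percentages as quoted in the text from Suihkonen (1990: 264).
sov, ov = 21.2, 39.4   # object before verb, with and without an overt subject
svo_vo = 30.5          # SVO and VO taken together
o_initial = 7.5        # O preceding both V and S
unanalyzed = 1.3

groups = {"(S)OV": sov + ov, "(S)VO": svo_vo, "O(S)V(S)": o_initial}

# Express each group in tenths of the material, rounded to a whole number.
ratio = {name: round(pct / 10) for name, pct in groups.items()}
print(ratio)
```

The three groups plus the unanalyzed residue add up to 99.9%, so the quoted figures are internally consistent up to rounding.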
A cursory examination of the oral texts, e.g. in Wichmann (1954), shows that some of them clearly favor (S)VX more than any of my literary texts.¹¹ Note also that while I counted 189 pure OV clauses in 1112 clauses with an object, that is, 17.0%, Suihkonen's OV accounted for 39.4% of the material. Again, it is not surprising if the written sentences contained more overt subjects, written clauses being both more complex and more explicit in general.

One potential factor that might influence the SVO/SOV alternation, at least in Permian, is definiteness and individuation of the object as reflected by case marking (accusative) or non-marking (nominative). Sel'kov's authoritative Komi grammar suggests that nominative objects precede the verb (Baker 1985: 36, 38–39), and the same is obvious to the eye in the Udmurt texts. It is thus natural to ask whether the Permian nominative object might be analogous to the Hungarian pre-verbal incorporated object. Recall the Hungarian example in (1 b); neutral Komi and Udmurt sentences such as (17) and (18) would have equally neutral SOV renderings in Hungarian, and the absence of accusative marking makes it even more natural to assume some variant of an incorporation analysis.

(17) Komi (corp)
     Stav n mi ts'umyn syd s'ojam
     all:INST we hut:INE soup eat:1PL
     'We all in the hut eat soup.'

(18) Udmurt (corp)
     Anajez tsukna kytse ke k mets' pyziz
     mother:3SG morning some bread bake:PST1:3SG
     'His mother (had) baked some kind of a bread in the morning.'

The picture that emerges from the present material is that nominative objects indeed favor the OV pattern more clearly than overtly case-marked ones but that this tendency is not exceptionless. The distribution of object case with respect to order is represented in Table 4, where "O-" refers to pre-verbal but not verb-adjacent objects, and the term "non-nominative" covers all types of overt marking.

Table 4. Object case and object position in Permian languages

Udmurt
          Nominative        Non-nominative
OV        256   88.0%       351   42.8%
O-         25    8.6%       346   42.1%
VO         10    3.4%       124   15.1%
Total     291  100.0%       821  100.0%

Komi
          Nominative        Non-nominative
OV         21   41.2%        28   19.7%
O-          1    1.9%        17   12.0%
VO         29   56.9%        97   68.3%
Total      51  100.0%       142  100.0%

(Total number of objects: Udmurt 1112, Komi 193)
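The contrast that Table 4 is meant to bring out, namely the share of immediately pre-verbal (OV) objects among nominative as opposed to non-nominative objects, can be isolated as follows. This is a minimal sketch that re-enters the counts from the table; the function name `ov_share` is hypothetical.

```python
# Counts transcribed from Table 4, in the order (OV, O-, VO) for each case class.
TABLE_4 = {
    "Udmurt": {"nominative": (256, 25, 10), "non-nominative": (351, 346, 124)},
    "Komi":   {"nominative": (21, 1, 29),   "non-nominative": (28, 17, 97)},
}


def ov_share(counts):
    """Percentage of objects that stand immediately before the verb (OV)."""
    ov, o_nonadjacent, vo = counts
    return round(100 * ov / (ov + o_nonadjacent + vo), 1)


for language, by_case in TABLE_4.items():
    nom = ov_share(by_case["nominative"])
    non = ov_share(by_case["non-nominative"])
    print(f"{language}: nominative OV {nom}% vs non-nominative OV {non}%")
```

In both languages the nominative figure comes out at roughly twice the non-nominative one, although the absolute levels differ considerably between Udmurt and Komi.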

Not surprisingly, it is the SOV-inclined Udmurt that most clearly conforms to the prediction. It is also of some interest to observe that the most "movable" object type is pronominal objects (personal and demonstrative pronouns). With these, 154 in all, OV order obtained only in 29.9%, O- in 47.4%, and VO in 22.7%. As for Komi, while the hypothesis that nominative objects are preverbal as such is clearly not borne out, the existence of such a tendency is fairly clear. Also, even though the material is far too small to allow any definite conclusions to be drawn, note that there was only one example of a preposed nominative object in the material. This was also a minority pattern in Udmurt, allowing us to hypothesize that nominative objects are not readily movable in these languages.¹²

4.2.2. Functions of marked orders

In Finnish and Sami, OSV/XSV order generally involves some type of initial emphasis, either focusing or contrastive topicalization, as in the two readings of (7) in section 4.1, that is, movement to the sentence-initial K position (for Sami, see Nickel 1990: 521; standard Estonian, as will be recalled, often resorts to V2 inversion here). In Hungarian, OSV orders can in principle be produced in two ways: both the object and the subject can be topics, or the object can be topic and the subject, the focus. In the more eastern SVO languages, the role of object-initial orders is less clear. Redei (1978: 126) specifically mentions that this order is rare in Komi, and in fact, there are only two OSV cases in my small corpus, one of them the subject-focusing negation that will be seen in example (80) in section 6.3. The Erzya text contains a few more OSV occurrences, but the data are clearly not sufficient to clarify the informational status of this pattern in the language.

With respect to XSV order, there is an important difference between Finnish-like SVO languages and SOV languages with a tendency to place focus or new information closest to the verb, such as Nenets. In the latter type of language, the object of OSV may be old or topical and the subject, new; a Nenets example of this was seen in (12b) above. In the Udmurt corpus, this organization is favoured in existential and presentational sentences, such as (19 a), but transitive examples, such as (19 b) on its most natural, subject-focusing reading, can also be found. On the other hand, there are also XSV patterns with an old subject; on this interpretation, (19 c) is much like the Finnish "contrastive topic" pattern. In Finnish, then, both (19 a) and (19 b) could be rendered with XVS, while (19 c) would preserve the XSV order.

(19) Udmurt (corp)
     a. Odig dzytaz'e so dory, sok-pul' sokasa, Pil'yp
        one evening s/he to DESCR breath:GER (name)
        vuiz. Kotyrys'tyz araky zyn pake
        come:PST1:3SG around:ELA.3SG vodka smell reek:3SG
        'One evening Pil'yp came to him, breathless. A smell of vodka emanates from him.'
     b. Professor al'i Moskvayn, operatsios mon les'tylis'ko.
        professor now Moscow:INE operation:PL I make:1SG
        'The professor is now in Moscow, the [surgical] operations are done by me.'
     c. Ti kytts'y? – börs'az kes'kiz, – kytse nomer? –
        'Where [are] you [going]? she shouted behind him, – What [room] number?'
        Nomerze mon oj ts'akla.
        number:ACC.3SG I NEG:PST1 pay.attention
        'I didn't pay attention to its number.'

The SVO languages in the sample also evince the inverted variant, OVS. This order might in principle be used either for subject focusing or for afterthought-like subjects. Thus, in a construction like the Hungarian (2) in section 4.1, the focused object is preverbal and the subject is old but "neutral" or indifferent in the sense that the sentence is not a predication about Laci (but about tomorrow). In the inverted version of the Finnish (6), (20) below, the object could be the topic and the reason for the subject to appear late would be that it is focused. The "Hungarian" reading with an old final subject is at most marginal in Finnish.

(20) Finnish
     Lunta T syö MIKKO.
     snow:PRTV eat:3SG (name)
     'It is Mikko who eats snow.'

This type of OVS is a marked option in Finnish, but there also exist OVS sentences that are perceived as neutral in their information structure. This may happen when, with some lexical choices, the prototypical participant hierarchy based on agency and animacy is reversed, or the predicate is stative and nonagentive. Thus, (21 a) can be used in an out-of-the-blue context in Finnish and (21 b), apparently, in Erzya. As is to be expected, the Mari counterpart can be rendered as OSV, as in (21 c), and a representative Nenets example is (22):

(21) a. Finnish
        Minua puri käärme.
        I:PRTV bite:PST.3SG snake
     b. Erzya (ques)
        Mon' suskimim guj.
        I:GEN bite:PST.SUBJ-3SG.OBJ-1SG snake
        'I was bitten by a snake.'
     c. Mari (ques)
        Myjym kiske ciinggalyn.
        I:ACC snake bite:PST2.3SG
        'I was bitten by a snake.'

(22) Nenets (ques)
     Syexarim sira toxora°da.
     road:ACC snow cover:OBJ-SG.SUBJ-3SG
     'Snow covered the road.'

In Hungarian, OVS may be informationally neutral under certain semantic conditions (see E. Kiss 1987: 24), or it can represent the pattern that we now turn to.

A particularly interesting topic is that of post-verbal subjects in basically SOV languages such as Udmurt. Udmurt post-verbal subjects display clear similarities with those in Hungarian. As mentioned earlier, post-verbal constituents in Hungarian are informationally "neutral" and freely ordered. At least the latter is true for Udmurt as well; e.g., the post-verbal object and subject are free to change places without special effects, even though pronouns clearly tend to precede other constituents in the postverbal part in the texts. Thus, (23 b) was judged equal to (23 a) (Bibinur Zaguljajeva, p.c.):

(23) Udmurt (corp)
     a. As nylme kad' jaratis'ko mon tone,
        own daughter:ACC.1SG like love:1SG I you:ACC
     b. As nylme kad' jaratis'ko tone mon.
        'I love you like my own daughter.'

Udmurt (and Hungarian) further allow a type of post-verbal subject pattern more typical of SVO languages like Finnish: XVS order in existential and presentative clauses. This is an alternative to XSV but appears to be favoured with longer subjects:

(24) Udmurt (corp)
     [There is a knock on the door, the protagonist opens it.]
     Korka pyrizy lyz sin'el'en jegit vorgoron
     house:ILL enter:PST1:3PL blue greatcoat:INST young man
     no jyraz gord kyseten ts'indyr-ts'andyr mugoro nylmurt.
     and head:INE.3SG red scarf:INST skinny bodied woman
     'A young man in a soldier's coat and a skinny young woman with a red scarf entered the house.'

However, in my material, transparently old, often pronominal or proper noun subjects are the majority in Udmurt OVS/XVS sentences. This is represented by (25 c) in the context of (25 a, b); note also the XSV ordered intransitive in (b).

(25) Udmurt (corp)
     a. Tan'i dysetis' l'ukal'l'a soosty,
        PTL teacher collect:3SG they.ACC
        'See, the teacher keeps inviting them [to meetings],'
     b. udmurtjos polys', Gerej s'ana, Kamaj Onton no
        udmurt:PL among:ELA (name) except (name) and
        Gord Ivan vetlo.
        (name) go:3PL
        'from among the Udmurts, in addition to Gerej, Onton Kamaj and Ivan Gord go there.'
     c. Revol'utsija s'arys', vyl' ulon s'arys' veras'kyle dysetis'.
        revolution about new life about talk:3SG teacher
        'The teacher talks about revolution and new life.'

This freedom of post-verbal subject placement is a feature that sharply distinguishes Udmurt both from Nenets and from Finnish.

Turning now to the verb-initial permutations VSX and VXS, which are ungrammatical only in Nenets, it is probably safe to say that they are typically "loaded" in all the discussed languages (unless required in yes/no questions, as in Finnish). An apparently common exception is that reported for Komi: verb-initial variants are frequent in folktales but also occur in everyday narrative in "listing of events". This is also found in Finnish folktales, although, as already mentioned, verb-initialness is more likely to be used for contrasting in the present-day language, a feature at least to some degree shared by Sami. The following is a small selection of what I have come to see as typical of the V-initial patterns in the Permian texts. Example (26) represents the narrative use, but (27)–(29) convey implications like emotional affect and affirmation.

(26) Komi (corp)
     [Mitruk has prepared his net to get a fish.]
     Pukalö Mitruk beregyn i vitts'ys'ö ts'erilys' sedöm.
     sit:3SG (name) shore:INE and wait:3SG fish:ABL get:PART
     'Mitruk sits on the shore and waits for the fish to catch.'

(27) Komi (corp)
     [Mitruk assumes his pet reindeer has eaten his bread and threatens in his mind: 'Just you wait till we arrive – –']
     Petködla me syly, n'uvs'ys'ly puz.
     show:1SG I 3SG:DAT lick:PART:DAT frost [idiom]
     'I will show him, that licker!'

(28) Udmurt (corp)
     [A young girl threatened by a repulsive marriage: 'No, I won't go! What will I do with his riches?']
     Vijoz so mone, kule övöl so mynym,
     kill:FUT:3SG he I:ACC necessary NEG he I:DAT
     'He'll kill me, I don't want him, '

(29) Udmurt (corp)
     So mynda dyr tsoze propiskatek?
     'So long without propiska [permanent residence permit]?'
     Veras'ko val mon nats'al'stvoly,
     say:1SG AUX I superiors:DAT
     Abas s'eloyn propiskajen besporjadok suysa.
     (name) village papers:INST disorder COMP
     'I did say to our superiors, didn't I, that there is disorder in Abash village with propiskas.'

Similar but not identical implications of affect arise from the markedly verb-initial alternatives in Finnish, which would not be natural in the contexts of (27)–(29). The differences cannot be further discussed in the present context, but a natural suggestion is that the Finnish and the Permian types both focus the verb, but in different ways. The Finnish initial finite verb can only be polarity-focused: the pattern serves as a confirmation or contradiction of the polarity value of the sentence (see Vilkuna 1989: 113). In contrast, the examples in (27)–(29) seem to invite a content-focused reading: the "news" is the lexical content of the verb.

4.3. Focus positions

In Nenets, word order variation is connected to the objective conjugation (use of object agreement marking) in an interesting way. As we saw in section 2.1, the immediately preverbal position is a focus position in a slightly wider sense than that defined for Hungarian by E. Kiss and others, that is, in the sense of conveying new information, alone or together with the verb. When a new object occupies this position, the verb is in the subjective conjugation. But a non-focused object may appear non-adjacent to the verb or be completely omitted, in which case the verb obligatorily appears in the objective conjugation. (Salminen, forthcoming; capitals added to mark focus.)

(30) Nenets (Salminen 1993: 262)
     a. PIDA ng0tye°da.
        he wait:OBJ-SG.SUBJ-3SG
        'HE is waiting for him.'
     b. SYITA ng0tye°.
        he:ACC wait:SUBJ-3SG
        'He is waiting for HIM.'

Encountering a neutral but flexible SOV order, one can always ask whether it might in fact be a Hungarian-type Topic-Focus organization in disguise. The information available at the moment does not suffice to answer this question, but it is clear that the other SOV languages discussed here differ from Hungarian in important ways. Starting from Nenets, a major difference is the connection between objective conjugation and object focusing. In Hungarian, the overt or null-pronoun presence of a definite object is marked on the verb whether the object is focused or not, as illustrated in the following.

(31) Hungarian
     a. EsztertT LACIF värja ~ LACIF värja Esztert
        (name):ACC (name) wait:SUBJ-3SG.OBJ-3SG
        'LACI is waiting for Ester.'
     b. VENDEGETF vär ~ O/T värja
        guest:ACC wait:SUBJ-3SG ~ 3SG:ACC wait:SUBJ-3SG.OBJ-3SG
        'S/he is waiting for GUESTS ~ for HER.'

The second difference is that, while Hungarian WH phrases are confined to the focus position and Nenets also favours this organization, it is not obligatory in the latter:

(32) Nenets (ques)
     a. Xibya tim xada°?
        who reindeer:ACC kill:SUBJ-3SG
        'Who killed a reindeer?'
     b. Xasawa ngomkaem xada°?
        man what:ACC kill:SUBJ-3SG
        'What did the man kill?'

Third, Hungarian allows much more material in the post-verbal position than Nenets, and this suggests that Nenets preverbal phrases cannot always be topics in the "Hungarian" sense. In sum, the Nenets type SOV structure often coincides with the Hungarian SOV, but as it lacks the grammaticalized position for focus-as-operator (including WH) and does not allow free "leaking" of non-Topic phrases into the post-verbal section, the general picture is rather different. Simplistically speaking, it is thus possible to derive a Hungarian-type discourse configurational language from an SOV language with an old-before-new principle by adding unrestricted "leaking" and grammaticalization of Topic and Focus categories in a stricter sense.

In Mari and Udmurt, the two other SOV-inclined languages discussed here, the former precondition obtains, but focus does not appear to be positionally restricted. The Mari and Udmurt pre-verbal position seems to be a neutral and frequent focus and WH position, but this does not prohibit the placement of WH items and exhaustive foci elsewhere (for further details, see section 6.1). The Mari examples in (33) are a subset of the results of a questionnaire task intended to elicit a contrastive focus interpretation. It seems that when the neutral position of a constituent is pre-verbal, it will remain there when focused, but, for example, a subject is not necessarily placed in this position for focusing purposes. Similar conclusions apply to Udmurt.

(33) Mari (ques)
     a. Aca(ze) ergyzym kyren.
        father(:3SG) son:3SG:ACC hit:PST2.3SG
        Uke, aca(ze) UDYRZYM kyren.
        no father(:3SG) daughter:3SG:ACC hit:PST2.3SG
        'A: The father hit the son. B: No, it was the daughter the father hit.'
     b. Aca(ze) ergyzym kyren.
        father(:3SG) son:3SG:ACC hit:PST2.3SG
        Uke, AVAZE tudym kyren.
        no mother:3SG 3SG:ACC hit:PST2.3SG
        ~ Uke, tudym AVAZE kyren.
        ~ Uke, tudym kyren AVAZE.
        'A: The father hit the son. B: No, it was the mother who hit her.'

As the Hungarian focus is an exhaustiveness operator, it is not surprising that explicit indication of exclusion with particles like csak 'only' is confined to the Focus position. This is not the case in Mari or Udmurt: phrases with gine 'only' are freely placed, as e.g. in the following:

(34) Udmurt
     a. Mon tone gine jaratis'ko. ~ Mon jaratis'ko tone gine.
        I you:ACC only love:1SG
        'I love only you.'
     b. Mon gine tone jaratis'ko. ~ Tone jaratis'ko mon gine.
        I only you:ACC love:1SG
        'Only I love you.'

Unlike the languages farther in the east, Finnish, Sami and Estonian have a fixed WH position at the beginning of the sentence, and this position is again open to constituents bearing narrow focus (recall the examples in (7) and (8)). It is a matter of further research whether the other Finnic languages or Sami can be given a discourse configurational analysis of the Finnish type. Northern Sami, however, in spite of marked differences at the phonological and morphological level, displays interesting similarities to Finnish, not only in its general SVO character, but in its option of initial focusing. Example (35) illustrates initial object focusing; to make subject focusing explicit, something more is needed. This may be OV order, as in the Finnish (36 a), but Sami has another device, also known to colloquial Finnish: a focused subject is marked by placing the unaccented pronoun dat 'that, it' after it, as in (36 b):

(35) Sami: Northern (Irja Seurujärvi-Kari, p.c.)
     Bardni dat nieida corbmadii.
     boy:ACC that girl hit:PST.3SG
     'It was the boy whom that girl hit.'

(36) a. Finnish
        Tyttö poikaa löi.
        girl boy:PRTV hit:PST.3SG
        'It was the girl who hit the boy.'
     b. Sami: Northern (Irja Seurujärvi-Kari, p.c.)
        Nieida dat corbmadii bardni.
        girl that hit:PST.3SG boy:ACC
        'It was the girl who hit the boy.'

For Southern Sami, Bergsland gives the following example of the focusing use of the pronoun dihte (or dah; acc. dam) 'that'. His Norwegian translations are cleft sentences.

(37) Sami: Southern (Bergsland 1994: 78–79)
     Aajja jih Javva edtjegan m'innedh viermieh doeredh.
     grandfather and (name) will go nets pull:INF
     Jävva d'ihte sävka. Jaevresne ohtje säaletje.
     (name) that rows lake:INE little island
     Dan duvvelen dam viermieh leah.
     that:GEN beyond that nets are
     'Grandfather and Javva are going to haul up the nets. It is Jävva that rows the boat. There is a little island in the lake. It is out beyond that that the nets are.'

This construction appears formally similar to left dislocation but differs sharply from typical left dislocation in pragmatic value. It could also be analyzed as some kind of a cleft construction, but Vilkuna (1989) suggests that its Finnish, more marginal counterpart should be seen as involving movement of the subject to the K position and expletive filling of the T position. Further shared features in the focusing systems of Sami and Finnish are the frequent preposing of the negative auxiliary (see section 6.3) and contrastive preposing of finite verbs in general:

(38) Sami: Northern
     a. Ii mus leat beana.
        NEG.3SG I:LOC be dog
        'I don't have a dog.'
     b. De husku Mahtte beatnagis.
        PTL beat:3SG Matthew dog:ACC.3SG
        '(Oh yes) Matthew does beat his dog.'

These constructions favor OV order; in fact, the only way to get SOV order at sentence level in Finnish simple-tensed main clauses is to make it accompany initial focusing, whether focusing of a nominal, as in (36 a), or a finite verb (sentence polarity), as in (8) cited earlier. It is natural to suggest that the initial focus patterns in Sami and Finnish should be related to the use of second-position clitic elements, of which dat/dihte can perhaps be assumed to be one.¹³ Interestingly, Komi is another language where initial emphasis is accompanied by a clitic particle, öd, which can only attach to the first constituent:

(39) Komi (ques)
     a. Nyvkays öd kutskis zonkaös.
        girl:3SG PTL hit:3SG boy:ACC
        'It was the girl who hit the boy.'
     b. Zonkasö öd kutskis nyvkays.
        boy:3SG.ACC PTL hit:PST1:3SG girl:3SG
        'It was the boy who the girl hit.'

Another such particle, the question clitic, will be discussed in section 6.1, and more will be said about focusing in relation to negation in 6.3.

4.4. Intransitives and nominal predicates

As already mentioned, a general feature of Uralic languages is the common absence of a finite verb in present-tense indicative sentences. This is an option not reflected in the customary parlance of word order typology, where position of nominal constituents is defined in relation to the verb. The problem is hardly more than technical; as in more traditional accounts, the subject precedes the predicate, both in "normal" intransitive and transitive clauses and in clauses with nominal (NP, AP, PP or adverbial) predicates. The options of leaving out the copula vary; Nenets and Hungarian have no copula with NP or AP predicates but do require it in locational sentences and with non-third persons, whereas even the latter are quite commonly rendered without a copula in Permian, as in (40 c) and (d).

(40) Hungarian
     a. Ő itt *(van).
        s/he here is
     b. Te szep *(vagy).
        you beautiful be:2SG
     Udmurt
     c. So tatyn.
        s/he here
        'S/he is here.'
     d. Ton ts'eber.
        you beautiful
        'You are beautiful.'

Some sources (e.g. Redei 1978: 125) are quite strict in their insistence on subject-predicate order in nominal sentences. The obvious reason is that predicate-subject order would lead to homonymy: reversing the Komi kerka ydzyd 'house [is] big' would yield 'big house', something preferably to be interpreted as a noun phrase. However, the predicate-subject order is possible under appropriate contextual conditions and is widely attested in Hungarian and Udmurt. The ambiguity problem only arises with determinerless single subjects and in very simple sentences. The place of the determiner (article in the Hungarian example, demonstrative in Udmurt) acts as an immediate disambiguator:

(41) a. Hungarian
        A häz nagy. ~ Nagy a häz.
        the house big ~ big the house
     b. Udmurt
        Ta korka baddz'ym. ~ Baddz'ym ta korka.
        this house big ~ big this house
        'The ~ this house is big.'

Further disambiguating factors are intonation, pronoun subjects and intervening adverbials, as well as the fact that plural agreement is typically confined to predicate position. The functional motivation for the subject-predicate order is thus less convincing than it might seem at first sight, and predicate-subject order is indeed found in Komi texts as well. Example (42 a) has a pronoun subject, while the italicized clause in (42 b) is ambiguous in principle but was interpreted as a predicate-subject order:

(42) Komi (corp)
     a. Tsyg sijö, – vis'talis bat'ys.
        hungry it tell:PST1:3SG father
        'It's hungry, father said.'
     b. Vidzödö da, s'uvtöm ts'eriys i s'ömys abu.
        check:3SG PTL gut.less fish:3SG and scale:3SG NEG:COP
        '[Having noticed a fish in his net, H]e checks: the fish is without guts and has no scales.'

Whether a copula or some other verb is present or not, and independently of its placement, intransitives with a locative typically exhibit the ordering pattern illustrated in (43) and (44) below. The order Loc-NP in the (b) sentences characterizes an existential or presentative interpretation in contrast to the locative predicates in the (a) sentences, and the interpretation of the subject NP tends to be indefinite in the (b) pattern. Note, however, that the translations indicate default readings of the sentences in question; other interpretations can be made explicit by prosodic means. Recall, for example, that old destressed postverbal subjects occur in Udmurt, so (43 b) could in principle also represent such a situation.

(43) Udmurt
     a. Kysnomurt komnatayn.
        woman room:INE
        'A/the woman is in the room.'
     b. Komnatayn kysnomurt.
        'There is a woman in the room.'

(44) Nenets (ques)
     a. Nye myak°na me0.
        woman tent:LOC be:3SG
        'A/the woman is in the tent.'
     b. Myak°na nye t0nya°.
        tent:LOC woman exist:3SG
        'There is a woman in the tent.'

In the absence of a locative constituent, an existential sentence may be verb-initial, that is, VS. But in Finnish and Estonian, the occurrence of non-contrastive VS is restricted to a narrow subtype of existential sentences, such as (45) below, and opening formulae of narratives such as '(there) was a man and a woman', the latter a common feature in all the languages under consideration except Nenets. It is interesting to note how this avoidance of VS-ordered sentences adds to the differences between Western Finnic and those Uralic languages that allow VX order in the first place. Not only is the "listing of events" type of verb-initialness (cf. (26) above) uncommon in present-day Finnish, but Finnish and Estonian do not allow such thetic or "all-new" sentences as (46) and (47) in an "out of the blue" context.

(45) Finnish
     Syttyi sota.
     break.out:PST:3SG war
     'A war broke out.'

(46) Hungarian
     Ugat a kutya.
     bark:3SG the dog
     'The dog is barking.'

(47) Udmurt
     Zingyrtiz karted.
     telephone:PST1:3SG husband:2SG
     'Your husband called. [E.g., said by a colleague when addressee returns to office.]'

Note that the problem in Finnish is not the VS order itself, as the equivalent of (47) could well be used as a contrastive sentence of the type (8): 'Oh yes, your husband did call'. Rather, Finnish requires that neutral sentences should have something (a default topic) in the T position.

4.5. Objects and recipients

The ordering of co-complements is particularly variable in the languages under discussion, obeying factors such as those presented by Primus (this volume) at the textual level. As an example of this type of variation, let us now briefly survey the basic ditransitives, the 'give' clauses, of Udmurt and Finnish. None of the languages discussed in this paper has double-object constructions of the English type. In ditransitives, the recipient is coded with a particular case, dative or allative. The mutual order of the theme (patient) and the recipient, or theme and location, is free, and this freedom is generally assumed to reflect informational status (old before new), at least in written Estonian and Finnish. Other competing factors in the ordering of theme and recipient are semantic role as such and weight considerations. Recipients can be considered inherently more topicworthy and thus more likely to be placed earlier, while on the latter account, shorter constituents precede longer ones (cf. Hawkins, this volume). The general picture arising from Finnish and Udmurt texts is that informational, role and weight considerations coincide: the typical state of affairs is for a human, given, often pronominal recipient to precede a non-human, often longer theme. The typical case is illustrated by the Udmurt (48 a) below, while orders such as, say, (48 b) are by no means uncommon.

(48) Udmurt (corp)
     a. Stolovojyn soly kyk murtly tyrmymon syd
        canteen:INE 3SG:DAT two person:DAT suffice:PART soup
        s'otizy,
        give:PST1:3PL
        'In the canteen, he was given a portion of soup enough for two people.'
     b. Nos ik zal'aj, nos ik kyk pud s'oti mon
        again pity.PST1.1SG again two pound give:PST.1SG I
        tynyd.
        you:DAT
        'Again I pitied you, again I gave you two pounds [of grain].'

From the Udmurt corpus, I isolated 51 clauses with the verb s'otyny 'give' with overt theme (henceforth object or O) and recipient (R). In these, R precedes O in 39 clauses, that is, in 76.5% of all instances. This goes together with the fact that Rs are more often pronominal than Os, 1st and 2nd person pronouns, so 's/he, it, that' and ta 'this' being counted as pronouns. R is a pronoun in 51% of all the relevant clauses, and O, in 13%. Correspondingly, Os with two or more words are far more common (53%) than Rs (22%). In the written Finnish material of 196 antaa 'give' clauses reported by Vilkuna (1991), the relations are rather similar: Rs were found to be shorter than Os on average, and often pronominal.

The Udmurt 'give' corpus contains 37 cases where R and O are immediately adjacent, and the largest subgroup of these, 17 in all, are of the type where a shorter R precedes a longer O. The next two groups, exemplified by 6 instances each, were R preceding O but either longer than or equal to O. The same tendency, shorter Rs preceding longer Os, was found in non-adjacent cases as well. The most frequent word order type in R-O sentences was (S/A)ROV, illustrated by (48 a), 25 in all. In clauses where R and O followed the verb, R was again pronominal in all but one instance.

In Finnish clauses with post-verbal R and O, the same prototypical state of affairs was discernible: short (often pronominal) Rs precede longer Os. The first nominal was longer than the second only in 6% of the cases, and shorter in 69%. In accordance with this, RO order outnumbered OR: 66% as opposed to 34%. In the equal-length pattern, R preceded O slightly more often than vice versa. The general conclusion was, however, that weight was a better predictor of order than semantic role.
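The Udmurt 'give' figures reported above can be checked with simple arithmetic. The sketch below only restates counts given in the text; the variable names are illustrative, and the adjacent pairs that the text does not itemise are deliberately left unclassified rather than guessed at.

```python
# Udmurt 'give' clauses with overt recipient (R) and theme (O), as reported in the text.
total_clauses = 51
r_before_o = 39

share_r_first = round(100 * r_before_o / total_clauses, 1)
print(f"R precedes O in {share_r_first}% of the clauses")

# Immediately adjacent R/O pairs, broken down by relative length where reported.
adjacent_total = 37
r_first_shorter = 17   # shorter R precedes longer O
r_first_longer = 6     # R precedes O but is longer
r_first_equal = 6      # R precedes O and is of equal length

itemised = r_first_shorter + r_first_longer + r_first_equal
not_itemised = adjacent_total - itemised  # the text gives no breakdown for these
print(f"{itemised} of {adjacent_total} adjacent pairs are itemised; {not_itemised} are not")
```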

4.6. Adverbials The question of adverbial placement is far too complex to be discussed here in a meaningful way. I will therefore not strain the limits of this survey, but focus on a rather conspicuous feature, the placement of manner adverbials, an "object patterner" according to Dryer (1992). These typically follow the verb in Western Finnic but tend to precede it elsewhere (as they do in Russian). In Finnish, an adverbial like nopeasti 'quickly' often occurs between the finite verb and its complement, but the opposite order is possible as well, often signaling givenness of the complement in written Finnish. This is illustrated in (49); immediately pre-verbal placement of a manner adverbial is not completely excluded in spoken language, however, as (50) illustrates: (49)

a. Mikko leipoi nopeasti kakun. (name) bake:PST.3SG quickly cake:ACC [default reading:] 'Mikko quickly baked a cake.' b. Mikko leipoi kakun nopeasti. [default reading:] 'Mikko baked the cake quickly.'

(50)

Minä ihan nopeasti vaan katson sitä lehteä. I quite quickly just look:1SG that:PRTV paper:PRTV 'I'll just quickly look at the paper.'

A tendency to place manner adverbials immediately before the verb is evident in all the languages that allow unmarked SOV order. Two examples: (51)

Komi (corp) Avko vidzts'ys'ömön s'ibödts'ö dad' dorö. (name) be.wary:GER approach:3SG sleigh towards:ILL 'Avko approached the sleigh cautiously.'

(52)

Erzya (corp) Mazij bankin'es' s'eske tus' sonze pretty box quickly come:PST1.3SG 3SG:GEN.3SG ked's. hand:ILL 'The pretty box quickly came into his possession.'

In Nenets, the preferred placement of central types of adverbials obeys the template Time—S — Place—O — Manner— V; only very rarely does an adverbial occur after the verb, the examples being of cases where two adverbials of the same type occur in the clause. The basic pattern does vary, especially according to the tendency to place new information in front of the verb. Tapani Salminen (questionnaire) reports that, for example, his Nenets corpus contained 60 cases of the adverbial s0waw°na 'well', and only once did something else oust it from its proper place. This happened with an evident idiom — 'hit heart' in (53):

(53)

Nenets (ques) Nyebyanta sewa wadaq s0waw°na syey°x0nta mother:GEN.3SG good word:PL good:PROS heart:DAT.3SG tyeb0°q. hit:SUBJ-3PL 'His mother's good words consoled him well (lit. hit well in his heart).'

The same tendency is strong in Udmurt. I counted 18 instances of dzog 'quickly', of which 16 were immediately in front of the verb (and one in a verbless clause). The figures were exactly the same with umoj 'well', the two counterexamples being with the imperative 'Look well!'. All 8 instances of l'ek 'mean, bad' in its adverbial use immediately preceded the verb. The list could be continued; the general picture is that even though manner adverbials are not totally inseparable from the verb, they stick to preverbal positions, as shown by a decreasing order of naturalness in the following (Bibinur Zaguljajeva, p.c.): (54)

a. Udmurt (corp) Odot'len anajez dzök s'örys' dzogak (name):GEN mother:3SG table behind:ELA quickly potiz. come.out:PST:3SG b. Odot'len anajez dzogak dzök s'örys' potiz. c. Dzogak Odot'len anajez dzök s'örys' potiz. d. *Odot'len anajez potiz dzogak dzök s'örys'. 'Odot's mother quickly left the table.'

A conspicuous feature of Udmurt is the extensive use of descriptive, often untranslatable words, perhaps classifiable as adverbs, specifying the semantic content of the verb. These also stand immediately before the verb, although their actual separability seems to depend on whether they are truly idiomatic with the verb. This is illustrated by the judgements in (55) and (56): (55)

a. Udmurt (corp) Mözmyt s'injosyd mynes'tym s'ulemme tsuzak sad eye:PL:2SG I:ABL heart:ACC.1SG DESCR pös'vuazy. burn:PST1:3PL

b. *Mözmyt s'injosyd tsuzak mynes'tym s'ulemme pös'vuazy. c. *Tsuzak mözmyt s'injosyd mynes'tym s'ulemme pös'vuazy. 'Your sad eyes burned my heart.' (56)

a. Udmurt (corp) Kin ke no korka uknojez zingrak söriz. somebody house window:ACC DESCR break:PST1:3SG b. Kin ke no zingrak korka uknojez söriz. 'Somebody broke the windows of the house.'

Tsuzak in (55) combines with the verb 'burn' to indicate momentaneity, and the result is idiomatic and inseparable; the same could be said about an idiom like kual'ak us'yny 'fall kual'ak: to be suddenly frightened'. Zingrak in (56) evokes the sound of breaking glass but does not form an idiom with söryny 'break', as combinations such as zingrak us'yny 'fall zingrak' are possible of, say, a vase falling and breaking.

Other adverbials display a preference for initial position and seldom occur in the immediately preverbal position. In the Udmurt corpus, sobere 'then' was absolutely initial in 92 of its 93 occurrences, but tabere 'now, from now on, these days' varied more: of 73 instances, only 21 were not initial, but several followed the verb. Of 40 phrases with the postposition dyrja 'while, in the time of', 36 were absolutely initial or followed an initial subject or adverbial. Finally, unlike Nenets, Udmurt rather freely allows post-verbal adverbials, but they are clearly less frequent in the corpus than post-verbal subjects or complements. The 4000-clause subcorpus revealed 347 non-verb-initial clauses with one postverbal constituent, and only 31 or 9% of these were ..VA, a low figure compared to the 167 ..VS and 134 ..VX patterns.
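The relative frequency of post-verbal adverbials can be re-derived from the counts given for the 4000-clause subcorpus. The following is a minimal Python sketch under stated assumptions: the names `postverbal_counts`, `shares` and `unclassified` are hypothetical labels introduced for illustration, and the clauses not covered by the three reported patterns are left unclassified, since the text does not say what they are.

```python
# Non-verb-initial clauses with one post-verbal constituent,
# as reported in the text for the Udmurt 4000-clause subcorpus.
total = 347
postverbal_counts = {"..VA": 31, "..VS": 167, "..VX": 134}

# Percentage share of each pattern, rounded to one decimal place.
shares = {pattern: round(100 * n / total, 1)
          for pattern, n in postverbal_counts.items()}

# Clauses not covered by the three reported patterns (not described in the text).
unclassified = total - sum(postverbal_counts.values())

for pattern, pct in shares.items():
    print(f"{pattern}: {pct}%")
print("unclassified:", unclassified)
```

Keeping the counts in a single mapping makes it easy to see that the three named patterns do not exhaust the 347 clauses.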

5. Subordinate clauses For those languages and constructions that exclusively or predominantly use non-finite subordination, verb-final order is naturally their norm. We have already seen examples of clause-final subordinators in the SOV languages and noted that such clauses typically but not necessarily precede the main clause. In the majority of the languages, the subordinators are clause-initial as a rule. But there are other alternatives. Conditional clauses in the Permian languages are marked with a clitic particle (see Riese 1984).14 The attachment of this particle inside the clause varies quite extensively in Udmurt:

(57)

Udmurt (ques) S'urs man'et-ke s'otod, vuzalo. thousand rouble-if give:FUT:2SG sell:FUT.1SG 'If you give a thousand rubles, I will sell.' ~ S'urs man'et s'otod-ke, vuzalo. ~ S'otod-ke s'urs man'et, vuzalo.

In the Udmurt corpus, -ke clearly favours verb-final orders, and there is reason to believe that it also prefers the Neg—V—Aux group, as it typically attaches to the negative auxiliary in Neg—Verb contexts. The most common pattern is an XV clause with final -ke, and no examples surfaced with final -ke and verb-initial or verb-medial order. Medial -ke favours verb-final orders as well, although two examples of the third type in (57) were found. But -ke is not absolutely tied to the verb; there are of course nominal sentences among the conditionals, and a few examples were found of the type XP-ke XP V. In Komi, the cognate element -ko typically appears after the first constituent both in my corpus and the material treated by Riese (1984). The Russian-origin jes'l'i 'if' appears sentence-initially in Riese's examples, and the same is true of the Erzya but'i, as well as of independent subordinators in these two languages in general. Clause-medial subordinators also exist in eastern Finnic: see the Vepsian example in (58 a) (Remmel 1963: 355 mentions the same pattern in Votian) and Finnish (58 b). In Finnish, the pattern seems to be favored in the Eastern dialects but has a rather stable minor role in the standard language as well:

(58)

a. Vepsian (Kettunen 1943: 572) Sur' ku ol'iiz neizne, ka ii varaid'aiz big if be:COND girl PTL NEG:3SG fear:COND 'If the girl was big, she wouldn't be afraid.' b. Finnish Sen kun ~ jos teet, niin pääset kotiin. that:ACC when ~ if do:2SG so can.go:2SG home 'If ~ when you do that, you can go home.'

In the Karelian area, the subordinator position is extremely variable (as in spoken Russian):

(59)

Karelian: Olonetsian (Ludmila Markianova, p. c.) Kodih tulemmo, konzu lopemmo ruavon. home:ILL come:1PL when stop:1PL work:ACC ~ ruavon konzu lopemmo. ~ ruavon lopemmo konzu. 'We'll come home when we have stopped working.'

As to the internal order of subordinate clauses, only the Finnic group shows any evidence of special characteristics: Estonian, Livonian (according to Remmel 1963: 356) and Finnish allow SOV order without any of the special focusing properties that characterize SOV-ordered main clauses. Still, the variation is not random; it encodes subtle textual differences that cannot be discussed in the present context. What can be mentioned here is the distribution of (S)XV order in my Estonian corpus, counted from approximately 160 clauses of each of the following groups. (S)XV was most frequent in kui 'if, when' clauses (58%) and least frequent (17%) in finite et 'that' complement clauses; relative clauses were 37% XV. In Finnish, the statistical frequency of SOV subordinate clauses is again clearly smaller.

6. Order in different sentence types

6.1. Questions We have already seen that the position of WH question elements varies greatly in Uralic. Western Finnic and Sami (both Northern and Southern) adhere to the sentence-initial position and Hungarian, to the pre-verbal Focus position. In the SOV languages Nenets, Mari and Udmurt, WH elements frequently but not obligatorily occupy the pre-verbal, verb-adjacent position. (In a collection of 308 WH questions in the Udmurt corpus, about 43% had the question word immediately preceding the verb and about 38%, preceding but non-adjacent.) Thus, the analogue of the Mari (60 a) would be ungrammatical in Hungarian and that of (60 b), ungrammatical in Finnish. (60)

Mari (ques) a. Kö cajym jüyneze? who tea:ACC drink:DES.3SG b. Cajym kö jüyneze? 'Who would like to have tea?'

The reverse — frequent but not obligatory sentence-initial position — is reported for Erzya and Komi and is supported by the present corpora. The Komi corpus contained one explicitly non-WH initial sentence and 53 WH-initial ones, while the corresponding numbers in the Erzya corpus were three to 37, one of the three being (61). Similar variation occurs in Karelian and Vepsian, illustrated in (62): (61)

Erzya (corp) Ton mezd'e ars'it'? you what:ABL think:PST1:SUBJ-2SG 'What were you thinking of?'

(62)

Vepsian (Kettunen 1943: 405) Min sä nagrad? ~ sägi min nagrad? what:ACC you laugh:2SG ~ you:too what:ACC laugh:2SG 'What are you laughing at?'

Turning to Yes/No questions, we find two main strategies, often combined in one language: questions marked by intonation only, and the use of particles. Mari is special in that it only uses an optional sentence-final particle, mo. In Hungarian, intonation is the main question marker, as the clitic particle -e only occurs in embedded questions in the present-day language; Nenets and Erzya use intonation only. The Udmurt and Komi question particle (-a and -o respectively) is a clitic attached to the questioned phrase, which is neutrally the verb, but may be any other constituent. In Udmurt, the particle can appear anywhere in the sentence: (63)

Udmurt (ques) Vetlid-a ton klube? go:PST1:2SG-Q you club:ILL ~ Ton-a klube vetlid? ~ Klube-a vetlid ton? ~ Ton klube-a vetlid? 'Did you visit the club?'

The Komi question particle typically, but not necessarily, cliticizes onto the first constituent, and all instances in my Komi corpus were of this type. Northern Sami and Finnish use the particle -kO (Finnish), -go (N. Sami), apparently much like Udmurt and Komi, but the particle must be attached to the first constituent; that is, it is one of the second-position clitics mentioned in section 4.3.15

A clitic question particle obviously marks its host as the focus of the question, and neutral questions result when the particle attaches to the finite verb, in which case it is natural to assume that the focus is the sentence polarity. In Udmurt, where the particle is freely placed, its most common host is still the verb; in a corpus of 96 questions, this was the case in 70 instances, of which 5 had the negative auxiliary as host instead of the main verb. In this respect, the picture is not entirely different from that in Finnish, illustrated in (64). In Finnish, verb + -kO produces a neutral question and other -kO phrases usually a specifically focused one, although some speakers use neutral-ordered sentences such as (64 c) for questions that are neutral with respect to focusing. (64)

Finnish a. Kävitkö sinä kerhossa? visit:PST:2SG:Q you club:INE 'Did you visit the club?' b. Kerhossako sinä kävit? 'Was it the club you visited?' c. Sinäkö kävit kerhossa? 'Was it you who visited the club?' or: 'Am I right in assuming you visited the club?'

Not all Finnic or Sami languages have clitic question particles. Estonian yes/no questions are formed using either an independent particle, kas, or marked verb-initialness alone: (65)

Estonian Kas sa tahad siia tulla? Q you want:2SG here come:INF ~ Tahad sa siia tulla? 'Do you want to come here?'

Southern Sami uses a sentence-initial particle (mah, mejtie, or dagke) or intonation alone (Bergsland 1994: 35). Except for the second-position placement of the question particle in Finnic, the investigated languages do not have any word order properties specific to yes/no questions. On the other hand, yes/no questions, together with negation-initial sentences, account for the bulk of finite-verb initial occurrences in Finnish.

6.2. Imperatives In all the languages under discussion except for the most consistently head-final Nenets, neutral imperatives can be verb-initial; even Mari, where the OV in (66) is the common pattern, allows a VO alternative: (66)

Mari (ques) Myla(ne)m stakan vüdym pu. I:DAT glass water:ACC give:IMP.2SG 'Give me a glass of water.'

In Hungarian, imperatives obey the general Focus-V pattern, verb-initial imperatives simply lacking a focus. The typical imperative order of the SVO languages is naturally VO, but OV order with contrastive O sometimes occurs even in Finnish, especially in negative imperatives. OV imperatives such as (67) (first clause) and (68) occur in my small Erzya and Komi texts, although VO is more common. (67)

Erzya (corp) Er, Matr'a, kasa ejstest pid'ek, PTL (name) porridge 3PL:ELA cook:IMP.2SG kadik ejkakst'e jarsit' pekest let:IMP.2SG.OBJ-3SG child:PL:DEF eat:3PL stomach:3PL pesked'ems. fill:INF:ILL 'Well, Matra, cook porridge out of them, let the children eat their fill.'

(68)

Komi (corp) N'ebyd n'an' vaj, koris Mitruk. soft bread bring.IMP.2SG ask:PST1:3SG (name) 'Bring soft bread, asked Mitruk [after trying to eat frozen bread]'

Finnish seems to allow far less word order variation in the imperative than its more eastern relatives. This is because both non-verb-initial order and overt subject constituents are more alien to Finnish, an intuitive judgment that arises when encountering the extensive ordering variation of Udmurt imperatives. Essentially all conceivable word order variants seem to be present in the Udmurt texts, although imperatives certainly are shorter than indicatives on average. In a sample of 242 instances, the imperative types were distributed as follows: XV (41%), V (29%), VX (19%), XVX (7%), and SV (3%). The Udmurt questionnaire suggests that non-verb-initial imperatives are emphatic, but in view of occurrences like the ones in (69), the difference does not seem too prominent. (69)

Udmurt (corp) Ti sokem en dzoztis'ke. you(PL) so.much NEG.IMP:2 take.offence:IMP.PL Avtobuse puks'e no myne pristan'e. bus:ILL sit.down:IMP.PL and go:IMP.PL wharf:ILL 'Don't [formal] be so offended. Take the bus and go to the wharf. [Advice by hotel receptionist to disappointed customer.]'

6.3. Negation and auxiliaries Rather strict ordering prevails in the Uralic auxiliary structures. The negative auxiliary precedes the main verb in all the languages. Minor exceptions to this include the Mari second past tense construction, illustrated in (70 b), the analogue of which occurs as a dialectal alternative in the Udmurt second past but preferably in the order Neg—V. The negation element used in the second past also occurs after the predicate in nominal-predicate sentences such as the Udmurt (71). Note that the element in question, övöl or ogyl (in Komi, abu) can be characterized as a negative copula, in contrast to the pure negative auxiliary used with verbs in present or first past tense. (70)

Mari (ques) a. Myjym kiske ys cünggal. I:ACC snake NEG:PST1.3SG bite 'I wasn't bitten by a snake.' b. Myjym kiske cunggalyn ogyl. I:ACC snake bite:PST2.3SG NEG.COP.3SG 'I haven't been bitten by a snake.'

(71)

Udmurt (ques) Korkaje baddz'ym övöl. house:1SG big NEG.COP 'My house is not big.'

Sentences with the "pure" negative auxiliary following the verb are an exception in Uralic, but occur in the extremely freely ordered Vepsian, as shown by (72), as well as in the south-eastern Estonian Võru dialect (Savijärvi 1981).

(72)

Vepsian (Kettunen 1943: 571) lile spickoid, a spickoid anda not.be:3SG match:PL:PRTV but match:PL:PRTV give lavossik ii shopkeeper NEG:3SG 'There are no matches, but the shopkeeper will not give any.'

Negative auxiliaries should be compared to tense auxiliaries. These follow the main verb in the SOV languages and precede it in Sami, Finnic and generally (but not obligatorily) also in Komi. Mordvin, as already mentioned, lacks such auxiliaries, and the only comparable element in Hungarian is the uninflected conditional auxiliary volna, post-verbal as in Mari and Udmurt.16 When both auxiliaries are present, Mari and Udmurt display both V—Neg—Aux and Neg—V—Aux order depending on the type of the negation element, Neg—V—Aux being the choice with the "pure" auxiliary of the type (70 a). Adding the Neg—Aux—V order in Finnic, the generalization is that Neg precedes at least Aux but typically also the lexical verb. The verb complex consisting of Neg, Aux and the main verb is typically strictly organized, especially since immediate precedence is often required. This is true of Nenets and Mari, where nothing can intervene between an auxiliary and its main verb, or between two auxiliaries; in Udmurt, only small particles like no 'too', na 'still, yet', (i)n'i 'yet' can be placed between the negative auxiliary and the main verb. The Udmurt verb complex is illustrated in (73): (73)

Udmurt (corp) a. Ug n'i malpas'ky peres' Ignat. NEG:3 anymore ponder old (name) 'Old Ignat is not pondering anymore.' b. Ben mon pel'am ik oj pony val. PTL I ear:ILL.1SG PTL NEG:PST1.1SG put AUX 'Well, I — — did not even take notice of it.'

This strict adjacency of negation and the verb is in sharp contrast to the position of Finnish and Sami, which also differ markedly from the "eastern" type in the structure of their auxiliary complex. In Finnish, the slot between the auxiliary and the verb is heavily trafficked by adverbials, and this includes the negative auxiliary; see (74 a) and (b). It is also not uncommon to prepose an object to the main verb, as in (74 c). Finally, as mentioned earlier, the negative element easily moves to the sentence-initial position, as in (75 a). This has the effect of allowing (but not requiring) the subject to be focused, and can be analyzed as movement of the negative auxiliary to the contrast position in the same sense as preposing of finite (auxiliary and main) verbs in general, which is illustrated in (75 b).17 (74)

Finnish a. Minä olisin sunnuntaina mielelläni lähtenyt I AUX:COND:1SG Sunday:ESS with.pleasure leave:PART ulos. out 'I would have liked to go out on Sunday.' b. Minä en sunnuntaisin mielelläni lähde ulos. I NEG:1SG Sundays with.pleasure leave:INF out 'I don't like to go out on Sundays.' c. Minä en sitä tehnyt. I NEG.1SG it:PRTV do:PART 'I didn't do it.'

(75)

Finnish a. En MINÄ sitä tehnyt ~ En minä SITÄ tehnyt. NEG.1SG I it:PRTV do:PART 'It wasn't me who did it ~ I didn't do THAT.' b. Olen minä sen tehnyt. AUX:1SG I it:ACC do:PART '(Oh yes) I have done it.'

Negation need not be adjacent to the main verb in Sami either. Negatives comparable to (76) are typical and apparently neutral in Sami with pronominal subjects. I have not been able to check whether this has effects on negation focus.

(76) a. Sami: Northern (Nickel 1990: 518) In mum diede. b. Sami: Southern (Bergsland 1994: 44) Im manne daejrieh. NEG.1SG I know 'I don't know.'

Contrasting the Finnish/Sami pattern in (74)–(76) with the Udmurt (73), we see that the former is quite close to the Germanic (tense) auxiliary pattern.

This becomes even clearer if we take into account the fact that the Finnish (and Germanic) tense auxiliary is an inflected finite verb combining with a nonfinite main verb without person inflection. By contrast, the Mari, Udmurt and Komi tense auxiliary is not inflected, although it is similar to the Finnish one in being basically a copula; rather, person and number inflection (as well as basic tense) remain on the main verb. The same holds for the Hungarian volna. (Further differences, which cannot be treated here, are found in the semantics of the compound tenses.) Standard Estonian, whose auxiliary syntax is much like that of Finnish and Sami, differs from them in that the negation element is, again, tied to the immediately pre-verbal position wherever the negation-verb complex is placed; for example, a verb particle cannot intervene between the two, as illustrated by (77). (However, the negation-initial pattern has a marginal existence in Estonian.) (77)

Estonian (corp) Kui te seda nalja ära ei löpeta, if you(PL) that:PRTV joking:PRTV off NEG stop 'If you don't cut off that joking, '

It seems natural to suggest that this restriction is somehow related to inflection; recall that inflection is no longer part of the Estonian (non-imperative) negation. And indeed, looking at the imperative, where the negation element still carries number inflection, we find that it is not inseparable: the orders in (78) are possible variants of the general Neg—V—X pattern: (78)

Estonian Ära siia tule ~ Ärge siia tulge NEG:2SG there come ~ NEG:2PL there come:IMP.2SG 'Don't come here.'

An interesting question that cannot be discussed here in any depth is how the expression of negation focus relates to word order. In Finnic, negation focus must generally follow the main verb, so that the position between the negation and the verb seen in (74 b, c) cannot be occupied by the negation focus. Apparent exceptions to this are the negation-initial sentences such as (75); I have interpreted these as resulting from the movement of the negative auxiliary. The following Hungarian pattern, shared by Komi (as well as Russian), differs rather sharply from Finnic. In Hungarian, the negation particle again immediately precedes the verb except in the case of narrow focusing, where the particle must immediately precede the focused element. The Komi pattern, rendered in most tenses by a negative auxiliary, is illustrated in (80): (79)

Hungarian a. En nem mentem. I NEG go:PST:1SG 'I didn't go.' b. Nem EN mentem. NEG I go:PST:1SG 'It wasn't me who went.'

(80)

Komi (corp) Dad'jassö oz vövjas kyskyny, a körjas. sleigh:PL:ACC NEG:3 horse:PL pull:3PL but reindeer:PL 'The sleighs are not pulled by horses but by reindeer.'

Analogously, and unlike its Udmurt counterpart, the Komi tense auxiliary völi can be separated from the main verb, as in (81); recall that its placement relative to the main verb also varies. (81)

Komi (Redei 1978: 133) Seme völi burdzyka kyjs'ö. (name) AUX better hunt:3SG Syly völi jondzyka sedö ij undzyk kyjö. he:DAT AUX more come:3SG and more catch:3SG 'Seme hunted better. More [game] came to him and he caught more.'

Needless to say, the situation is again different in the languages that do not allow negation and the verb to be separated. I have not detected any generalizations as to negation focus placement in Udmurt. A focusing device in Mari is the negative copula ogyl, perhaps to be interpreted as a particle in a cleft construction, attached to the negation focus, as in (82). A somewhat similar pattern can be found in Udmurt texts with 'not A but B' contrasts (83): (82)

Mari (ques) a. Pusenge ys jörlö. tree NEG.PST.3SG fall 'The tree didn't fall.' b. Pusenge ogyl jörlö. tree NEG.COP fall 'It wasn't the tree that fell.'

(83)

Udmurt (corp) Kolhozen gine övöl, bydes rajonen kivaltis'ko. kolkhoz:INST only NEG.COP whole district:INST direct:1SG 'I'm in charge of not only the kolkhoz but the whole district.'

6.4. Non-finite constructions Recall the Southern Sami situation described in section 4.1; a conclusion from the situation reported by Trosterud (to appear) is that in the process from SOV to SVO, auxiliary-medial sentences were the first to appear. In the Finnic-Sami area, it can be observed that complements of non-finite verbs can maintain the OV order for long periods after the finite clause changes to VO. Thus, present-day Finnish allows OV order in compound tenses and infinitival complements, although apparently not as extensively as Sami (according to Sammallahti 1991: 226), or Estonian, where the following order is quite frequent: (84)

Estonian (corp) a. Ta oli ennegi jaapanlannadest uht-teist s/he AUX:PST:3SG before:too Japanese:fem:ELA one:other:PRTV kuulnud. hear:PART 'He had heard things about Japanese women before.' b. Ma ei oska sellele küsimusele vastata. I NEG can that:ALL question:ALL answer:INF 'I can't answer that question.'

It should be mentioned that head-final order is consistently even more favoured in non-finite adjunct constructions than in complements in both Estonian and Finnish. The following picture emerges from an investigation of the Estonian short story corpus, where the number of instances in each group varies from 100 to 250. In adjunct infinitivals with the ending -des 'while Ving', XV order was practically exclusive except with an especially heavy X. In complement infinitivals, XV occurred in 69% of those with the ending -ma and in 59% of those with -Ta (see 84 b), the difference between the two being essentially one of lexical selection. With participles of active compound tenses such as (84 a), the proportion of XV order was 44% in subordinate clauses
(which easily allow XV order as such) and 29% in main clauses. Counting these figures from written Finnish would not be too instructive, as the proportion of XV would be negligible, but intuitively, the variation is in the same direction. OV type non-finite clauses are naturally the expected order in the SOV languages. In my Udmurt corpus, practically all infinitival complements (without extraction; see below) and all of the extremely common gerunds are verb-final; the Erzya and Komi infinitives in the corpus are predominantly VX, but in Komi, XV appeared to be slightly more common in infinitives than in finite clauses. Another feature of interest in non-finite structures is the extractability of their dependents. The considerable freedom of movement from complement infinitivals and participials, extensively discussed in (Vilkuna 1989), is repeated in Sami (Sammallahti 1991) and Estonian; the Northern Sami examples in (85) would be fine in Finnish (with initial focusing):

(85) a. Sami: Northern (Sammallahti 1991: 226) Biigga bahppa aiggui Anaris vuolgit viezzat. maid:ACC priest intend:PST.3SG (name):LOC go:INF fetch:INF 'The priest intended to go to fetch a maid from Anar.' b. Sami: Northern (Sammallahti 1991: 231) li daid galgga joavdelasaid bargat suovvat. NEG.3SG they:ACC/GEN must follies:ACC do:INF let:INF 'You shouldn't let them do anything foolish.'

On the basis of examples like those in (86) through (88), we can see that constituents of infinitival complements are rather freely extractable in the other languages under consideration as well. It is not impossible for a phrase extracted from an infinitive to end up after the main verb; one of the few Udmurt examples illustrating this is the last one, (89). (86)

Mari (ques) Nina kinosüretym oncas sona. (name) film:ACC see:INF want:3SG ~ Kinosüretym Nina oncas sona. ~ Nina oncas sona kinosüretym. 'Nina wants to see a film.'

(87)

Komi (corp) Mitruk sy oz vermy ledzny, (name) sound NEG:3 can let.out:INF 'Mitruk couldn't make a sound.'

(88)

Erzya (corp) St'apanon' kavto ts'orin'etn'en'en' (name):GEN two son:PL:DAT:DEF ul'ts'aso karmast' mer'eme kamchadalt. street:INE start:PST1.SUBJ-3PL call:INF Kamchatkan:PL 'On the street, people started calling St'apan's two sons Kamchatkans.'

(89)

Udmurt (corp) Medjany no malpaz val muket pöjsurjosty. hire:INF too think:PST1:3SG AUX other animal:PL:ACC 'He [rabbit in a folktale] even thought of hiring other animals.'

7. The noun phrase

7.1. Left and right branching In contrast to the marked differences in clause structure and clausal word order, the Uralic languages are fairly homogeneous in their insistence on head-final NP-internal order. Determiners (excluding, of course, the Mordvin definite suffix), demonstratives, possessors, numerals, quantifiers and adjectives precede the noun as the only or predominant pattern in all the languages. There is some variation in the relative placement of these elements, and also some room for language-internal variation. The normal orders of a multiple-premodifier NP in Komi and Finnish are given in (90) and (91): (90)

Komi (ques) Vas'alön tajö kyk ydzyd kan'ys Vasya:GEN this two big cat:3SG 'these two big cats of Vasya'

(91)

Finnish Näissä Villen kahdessa mustassa kissassa these:INE Ville:GEN two:INE black:INE cat:INE 'in these two black cats of Ville's'18

The head-final pattern is repeated inside adjective and adverb phrases: (92)

Hungarian   nagyon   sok
Udmurt      tuz      uno
Finnish     hyvin    paljon
            very     much

In comparative constructions, the standard of comparison is typically in some locative case (partitive or a 'from' case in Finnic, 'from' cases elsewhere), although Finnic, Sami and Hungarian use than-type complementizers as an alternative. The locative phrase precedes the adjective in the unmarked organization, but permutations are possible, as in the following Vepsian examples; such variation is possible at least in Finnish and Hungarian in predicative positions. (93)

Vepsian (Kettunen 1943: 377, 378) a. Muzik humalakas a ak vou sida humalakhemb man drunk but woman still he:PAR drunk:CMPR 'The man is drunk, but the woman is even more drunk.' b. Tiriine penemp kajagad tern small:CMPR gull:PAR 'A tern is smaller than a gull.'

Some sources report variability of adjective-noun order. The Komi sentence in (94 b) contains a stylistic variant of the neutral arrangement in (94 a), and the same is mentioned for Erzya by Mosin & Bajushkin (1983: 27). The constructions differ in that the postponed adjectives agree with the noun in number. The same option is available in Finnish — where all adjective modifiers agree with their heads — but is generally considered an appositional construction. (94)

Komi (ques) a. ydzyd da pemyd kerkajas big and dark house:PL b. Kerkajas, ydzydös' da pemydös', tydalisny matyn house:PL big:PL and dark:PL be:visible:PSTl:3PL close n'in. already 'Houses, big and dark, could already be seen close by.'

However, at least in the heavily Russian-influenced eastern Finnic, especially in Vepsian, NP-internal order is markedly free. Thus, the available recordings contain numerous occurrences of such postnominal genitives and adjectives as the following. (95)

a. Vepsian (Kettunen 1943: 572) Nece muzik vanhemp kucuu mor'z'men this man older call:PST wife:ACC 'This older man called his wife.' b. Vepsian (Kettunen 1943: 394) Titär icenze ol' ani coma daughter self:GEN was very pretty 'His own daughter was very pretty.'

The order of relative clause and noun varies according to the type of the construction: participial relative clauses precede the noun just like adjective modifiers, while finite relative clauses follow. This is exemplified below by Udmurt:

(96) Udmurt (ques)
a. Piosmurt, kudiz tatyn ule, tros uza.
   man who here live:3SG much work:3SG
   'The man who lives here works a lot.'
b. So nyl ysem gurte vuem.
   he/that girl disappear:PART village:ILL come:PST2.3SG
   '(He) came to the village which (that) girl had disappeared from.'

A question that cannot be answered here is to what degree the individual languages favor non-finite or finite relative clauses. For example, colloquial Finnish makes little use of participial modifiers in NPs; they are generally perceived to be rather complex and are restricted to more formal, planned registers. Also, even though Finnish has a fairly well-developed system of participial verb forms (five in all), these are generally restricted to subject and object relativization, whereas Mari and the Permic languages show greater freedom in this respect (cf. Pajunen 1991). Thus, Finnish would not allow the participial construction in (96b). At the other end of the spectrum, Nenets would only use non-finite constructions. Another point of interest is the post- vs. pre-head position of oblique NP, PP and infinitival modifiers and complements of nouns. Post-head modifiers seem alien to the eastern Finno-Ugrian languages but are fairly common in Finnic.19 The Finnish and Hungarian oblique and PP complements and adjuncts must follow the noun, and only nouns transparently derived from verbs accept oblique premodifiers, as in the construction in (97 a); otherwise the order is strictly as in (97 b):

(97) Finnish
a. Mikolle puhuminen on vaikeaa.
   Mikko:ALL talk:NR is difficult:PAR
   'Talking to Mikko is difficult.'
b. satu prinsessasta, askel kohti parempaa
   story princess:ELA, step towards better:PRTV
   'a story about a princess, a step towards the better'

Estonian constitutes an interesting exception to this picture: oblique case-marked and adpositional premodifiers are tolerated along with postmodifiers. The position of the modifier is not free; premodifiers are said to bear a characterizing meaning (Remmel 1963: 279–280). Thus, there are contrasts like the following:

(98) Estonian
a. pilves ilm, lapsega naine, körvata tass
   cloud:INE weather, child:COM woman, ear.ABE cup
   'cloudy weather, a woman with a child, a cup without a handle'
b. samm vabamate olude poole, vöileib vorstiga
   step free:CMPR conditions:GEN towards, butter.bread sausage:COM
   'a step towards more liberated conditions, a sandwich with sausage'

Oblique cases occur as premodifiers to nouns in the Permian languages. One example is the use of the elative case in constructions involving part-of relations or origin, as in (99); but see (24) in section 4.2.2. for another type of example: "a man in a coat, a woman with a scarf". One subcase is a bit more special: genitive possessors cannot combine with an object, and an ablative must be used instead, as in (100).

(99) Udmurt (corp)
   Mon jaratis'ko stol'itsays' ad'amiosty.
   I love:1SG capital:ELA person:PL:ACC
   'I love people from the Capital.'

(100) Komi (corp)
   Mitruk malystis kyn'pilys' kymössö.
   (name) stroke:PST1:3SG arctic.fox.cub:ABL brow:ACC.3SG
   'Mitruk stroked the little fox's brow.'

Adjectives in at least Finnic and Hungarian may take case-inflected, adpositional or infinitival complements, and these have unmarked post-head placement. Such complements are mainly used with predicative adjectives and can be used with modifying adjectives only when the premodifier organization is possible:

(101) Finnish
a. Mies on ylpeä pojastaan.
   man be:3SG proud son:ELA:3
   'The man is proud of his son.'
b. pojastaan ylpeä mies ~ *ylpeä pojastaan mies
   'a/the man proud of his son'

The use of noun postmodifiers is similarly restricted; a postmodified noun cannot appear, say, as a genitive premodifier. The same generalization applies to postpositions, which do not combine with NPs containing a postmodifier. In principle, abstract nouns and adjectives may take infinitival complements, which typically follow the noun or adjective in Finnic, adding to the inventory of postmodifier structures in NPs. However, the existence of genuine infinitival modifier structures may be disputed. According to Bartens (1979), such complements in Mordvin, Mari and Udmurt (see (102)) are confined to predicative adjectives and nouns in object position; it is often questionable whether these infinitives do form constituents with the proposed noun, and apparently no adjacency with the noun is required. These doubts hold for Finnish as well, although NPs with infinitival complements perhaps have a more extensive distribution in abstract, formal prose.

(102) a. Erzya (Bartens 1979: 83)
   Son anok ul'n'es' tujems kudov
   s/he ready was leave:INF home
   'S/he was ready to leave for home.'
b. Udmurt (Bartens 1979: 185)
   Pi argan sudny tuz usto vylem.
   boy accordion play:INF very good was
   'The boy was good at playing the accordion.'

7.2. Discontinuities

One feature typical of Finnish NP, PP and infinitival modifiers of nouns and adjectives is that strict internal adjacency is not required:

(103) Finnish
   Pojastaan mies on ylpeä.
   son:ELA:3 man is proud
   'The man is proud of his son.'

Even with the more representative premodifier types, occasional nominal discontinuity is probably found in all Uralic languages. However, the extent of the phenomenon is hard to assess, as it apparently tends to be restricted to colloquial speech. In any case, Vepsian is again explicitly mentioned as freely allowing such discontinuity:

(104) Vepsian (Kettunen 1943: 572)
   Sida en n'äge prihad.
   that:PRTV NEG:1SG see boy:PRTV
   'I don't see that boy.'

A noteworthy type of NP discontinuity can be found in the extraction possibilities of possessors. Even if NP discontinuity is quite marginal in Finnish, extraction of the genitive possessor from a postverbally placed NP into the topic position is possible in colloquial Finnish with inalienable possession, as in (105). Adessive and allative NPs alternate (dialectally) with the genitive here, but only the genitive construction can be seen as an extraction.

(105) Finnish (colloquial)
   Mun on pääni kipee ~ mulla on pää kipee.
   I:GEN be:3SG head:1SG sore ~ I:ADE is head sore
   'I have a headache.'

In Sami, the genitive possessor is regularly extracted in WH questions from a predicative NP:

(106) Sami: Northern (Nickel 1990: 522)
   Gean dat lea biila?
   who:GEN that be:3SG car
   'Whose car is that?'

Possessor extraction is especially productive in Hungarian, where possessors are either nominative (unmarked) or dative-marked, and the possessed bears an agreeing possessive suffix. While the nominative possessor must precede the noun (N') and be adjacent to it, as in (107 a), the dative can be freely detached, as in (107 b) and (c) (see Szabolcsi 1983):

(107) Hungarian
a. Janos kalapja eltünt.
   (name) hat:3SG disappear:PST.3SG
b. Janosnak a kalapja eltünt.
   (name):DAT the hat:3SG disappear:PST.3SG
c. Janosnak eltünt a kalapja.
   'John's hat disappeared.'

Hungarian has no actual genitive reserved for adnominal use; the dative, as the term suggests, typically appears as a complement to a verb. But even in languages with a separate genitive and dative, such as Udmurt, the genitive-marked possessor can be separated from the possessed noun in both colloquial speech and writing:

(108) Udmurt (corp)
a. Man'eryz sytse peres' Miktalen.
   manner:3SG such old (name):GEN
   'Such is old Mikta's style. [He never speaks straight]'
b. Vozez potem vorgoronlen.
   anger:3SG come.out:PST2.3SG man:GEN
   'The man got angry ["his anger broke out"].'

The Udmurt ablative-marked possessor construction discussed in section 7.1 is separable as well; the two variants in (109) are synonymous.

(109) Udmurt (corp)
   Valze jusky soles'. ~ Jusky soles' valze.
   horse:ACC.3SG unharness:IMP.2SG s/he:ABL
   'Unharness his horse.'

Baker (1985: 37) mentions postposition or disjunction of genitives in Komi, and these occur in my corpus. Genitive separation is probably found in all the languages under consideration; an example from Nenets:

(110) Nenets (ques)
   syengk0 noxah syur°x01yoy° poh torta ngob-toq!0s°.
   blue arctic.fox:GEN round year:GEN hair:3SG one-alike
   'The fur (lit. hair) of the blue arctic fox is the same throughout the year.'

Note, however, that in Nenets, extraction is only possible with the less common possessor construction, where genitive case is accompanied by a possessive suffix on the head noun, and not if the suffix is absent, as it often is. Genitives also typically encode subject arguments of non-finite constructions and nominalizations in all languages that have a genitive. Such genitives can sometimes also be separated from their heads. A good example is the 'want' construction in Udmurt. The main verb with the meaning 'come out' is combined with the past participial form (or nomen actionis according to Bartens 1979) of the verb denoting the desired action. The subject of this verb is in the genitive and agrees with the possessive suffix of the participial verb as any possessor NP would. Still, this subject is freely movable:

(111) Udmurt
   Mynam s'ijeme pote. ~ S'ijeme mynam pote. ~ S'ijeme pote mynam.
   I:GEN eat:PART:1SG come.out:3SG
   'I want to eat.'

A similar option exists in Mari (Bartens 1979: 122). For Mari, Kangasmaa-Minn (1966: 216–224) assumes three grades of "liberation" of genitives from their immediately pre-nominal position: they can follow the head (112 a), be detached (112 b), or have completely independent function with extra case inflection, an option that need not concern us here. The same seems to happen in Erzya; see (113).

(112) Mari (Kangasmaa-Minn 1966: 216)
a. Idy vatazy Ivanyn tuge listen,
   this wife:3SG Ivan:GEN thus did
   'This wife of Ivan did that way.'
b. Imm'et tyste tyjyn.
   horse:2SG here you:GEN
   'Your horse is here.'

(113) Erzya (corp)
a. Sudozo Oldaen' viskin'e.
   nose:3SG (name):GEN little
   'Olda's nose is small' or: 'Olda has a small nose.'
b. Oldaen' sas' venchamo skazo.
   (name):GEN arrive:PST1.3SG marriage time:3SG
   'The time had come for Olda to get married.'

According to Kangasmaa-Minn, the postnominal genitive is more loosely connected to the head noun than the prenominal one. A natural suggestion is to analyze postnominal genitives as a special case of whatever accounts for extracted genitives in general. The actual description of possessor extraction may also turn out to cover a type of construction with a more obvious clause-level genitive or dative: 'have' constructions such as (114). Sentences like this are often, in principle, ambiguous between the 'have' interpretation and the interpretation where the possessor and the possessed constitute an NP; note especially that Udmurt and Mari have the genitive here, although a separate dative exists. These genitives are placed like subjects and probably display other subject properties.20

(114) a. Hungarian
   Jänosnak könyve van.
   (name):DAT book:3SG be:3SG
b. Udmurt
   Piosmurtlen kn'igajez van'.
   man:GEN book:3SG COP
c. Komi
   Zonkalön em n'ebög.
   boy:GEN be:3SG book:NOM
d. Mari
   Ergyn knigaze ulo.
   boy:GEN book:3SG be:3SG
e. Erzya
   Ts'oran't' (ul'i) kn'igazo.
   boy:GEN.DEF be:3SG book:3SG
f. Sami: Southern (Bergsland 1994: 53)
   Laaran (lea) b'ienje.
   (name):GEN be:3SG dog

8. Conclusion

The Uralic languages discussed in this paper do not have much in common in terms of word order, especially not at the level of clause constituents, and even the common features, such as the usually head-final structure inside nominal phrases, are shared by quite a few other European languages. The other major shared feature is the general flexibility of order, whose syntax and discourse functions in most languages are, however, generally rather poorly understood. Given the historical depth of the language family, there is nothing strange in the fact that there is no "Uralic word order". But further work on the word order of Uralic languages could be fruitful as part of a larger attempt to shed light on the nature and conditions of flexible SVO and flexible SOV order.

Notes

1. My sincerest thanks to those who completed the Eurotyp word order questionnaires and to the following people, who have made this overview possible: Irja Seurujärvi-Kari, Pekka Sammallahti (Sami); Birute Klaas, Tiit Hennoste (Estonian); Ludmila Markianova (Karelian); Jack Rueter (Mordvin); Juri Anduganov, Igor Sadovin, Sergei Cherasov (Mari); Jevgeni Cypanov (Komi); Valej Kel'makov, Bibinur Zaguljajeva, Tatyana Krasnova (Udmurt). Tapani Salminen, Paula Kokkonen, Riho Grünthal and Merja Salo have helped me in various ways and considerably increased my (still deplorable) knowledge of Finno-Ugristics, and Marja Leinonen has been an indispensable source of information about Russian.

2. For an overview of the structure and social situation of the Uralic languages, see Comrie (1981) and Hajdu & Domokos (1987); Sinor (1988) is a collection of comprehensive presentations with a historical-comparative orientation.

3. The numbers of Sami speakers have been updated by Tapani Salminen (p. c.) and are generally lower than on the map referred to.

4. More specifically, the following material has been used:
• Sizable subsets of the Udmurt corpus compiled by Pirkko Suihkonen and Bibinur Zaguljajeva, now published in Suihkonen (ed.) 1995. Approx. 9000 orthographic sentences of prose (20th century fiction).
• Part of the Komi corpus (484 orthographic sentences, 963 clauses), compiled, glossed and translated by Paula Kokkonen. Contemporary fiction, mainly children's; one political column.
• Beginning of a contemporary Erzyan novel (567 orthographic sentences, 920 clauses). Compiled, glossed and translated by Jack Rueter.
• A corpus of approx. 3250 orthographic sentences of contemporary Estonian short stories compiled by Maria Vilkuna.
The glossing and translating of the Erzya and Komi corpora was made possible by ESF Eurotyp Small Grants in 1994. With the exception of the Erzya text, the corpora are situated in the corpus server of the Department of General Linguistics, University of Helsinki. It should be added that the Tundra Nenets questionnaire by Tapani Salminen also relies on extensive corpus work in addition to the judgments of informants.

5. There are four types of sources of examples: the Eurotyp questionnaire or the smaller questionnaire on discourse-related information and further details I prepared in 1992 (marked 'ques'); the corpora mentioned in note 4 (marked 'corp'); linguistic literature (reference given); and my personal knowledge or informant interviews (without marking). The following non-standard abbreviations are used: Q = yes/no question clitic, PTL = non-specified discourse or focus particle, PST 1 and PST 2 = the two past tenses of Permian, Mari and Erzya. Present tense is not separately indicated even where it appears to have overt marking. The tags SUBJ- and OBJ- prefixed to person-number marking indicate subject and object conjugation (Nenets, Erzya, Hungarian). The languages written in the Cyrillic alphabet (Erzya, Mari, Komi and Udmurt) are here represented in a simple, roughly phonematic transcription. Consonant palatalization, variably indicated by the "soft" character or the subsequent vowel character in the orthographies of these languages, is typically phonemic and is here indicated by attaching ' to the consonant.

6. Verb-second syntax was common in written Finnish up to the first decades of the present century, but the phenomenon never became a part of spontaneous colloquial Finnish and exists now as a minor pattern, mainly after preposed subordinate clauses. Remmel (1963: 356) specifically mentions the lack of V2 inversion in Livonian, a language otherwise close to Estonian.

7. Post-verbal subjects are very rare in Nenets, but see the following, where the postponed subject is clearly contrastive; otherwise, post-verbal material is reported to be afterthought-like.
Nenets (ques)
   Nyabyir 0 nyanta ρύ-nyakuna yad0°,
   another:SG2SG friend:GEN.SG3SG behind:LOC walk:3SG
   tu-nyimta ny0qm°byi nyabyir 0 .
   rifle:ACC.SG3SG hold:SUBJ-3SG another:SG2SG
   'One walks behind his friend, and (the other) one holds his rifle.'

8. The sentences in (14 a) and (b) represent the expression of necessity typical of Uralic; the Allative/Dative/Genitive participant can be taken as a quirky subject.

9. Pronominal arguments thus still occur in this material, as well as negations and questions. Note that question formation does not induce any systematic changes in word order in these languages. But a minor pattern systematically increasing the number of VS-type orders in literary material was excluded: reporting clauses after direct quotes. Inversion in such clauses is a conspicuous feature of all three languages (and Russian), but it may be more than just a literary convention as it also occurs in the old oral Udmurt texts in e. g., Wichmann 1954.

10. No data directly comparable to those in Table 2 were presented in Hakulinen et al. (1980), where this corpus and its statistical investigation are documented. The present results were obtained by retrieving the relevant values from the numerical coding made for the original investigation, and it was not possible to retrieve the position of "X" in the manner of Table 2 using this method. No obvious discrepancies with the results of Hakulinen et al. were found, but it might be mentioned that the high proportion of OVS clauses in Table 3 has a connection with the exclusion of adverbials; when these were not excluded, the percentage of OVS was only 7.7%. Working on the subject-focusing OVS clauses of the corpus, I have observed that they indeed disfavor additional constituents. On the other hand, the proportion of XVS order reported by Hakulinen et al. was again as high as 11%; in this case, XVS also includes existential Loc-V-S sentences.

11. The hunch that Komi and Udmurt might differ far less dramatically when examined on the basis of spoken discourse remains to be substantiated by further studies.

12. Erzya is too clearly SVO and our Erzyan corpus too small to warrant any conclusions concerning object case and position. In the present corpus, the number of SXV clauses with an object is only 5, and not a single SOV with a nominative object was found. In the 36 OV clauses, the number of nominative objects was 16 (44.4%), while in VO and O-clauses, the proportion of nominative objects was 31.8% and 26.3%, respectively, so a slight connection may exist, but it would be premature to suggest a similar explanation as for the Permian languages.

13. At this point, it should be added that there is at least one second-position clitic particle in Finnish that has nothing directly to do with focusing: the element -hAn. This clitic has the effect of marking the content of the sentence as somehow pragmatically known:
Finnish
   Minähän kävin kerhossa.
   I+CLT visit:PST:1SG club:INE
   'I visited the club [as you perhaps know].'

14. This particle also occurs as a marker of arbitrariness in quantifiers such as the Udmurt kin ke no 'who ke too', i.e. 'someone', or kön'a ke 'how many ke', i.e., 'some'.

15. More precisely, both constituent-final and constituent-internal placement are possible inside the initial constituent:
Finnish
(i) Siinä kerhossako sinä kävit?
   that:INE club:INE:Q you visit:PST:2SG
   'Did you visit that CLUB ~ THAT club?'
(ii) Siinäkö kerhossa sinä kävit?
   that:INE:Q club:INE you visit:PST:2SG
   'Did you visit THAT club?'

16. Hungarian future tense is rendered with the verb fog, originally 'to get, catch', which unlike volna takes an infinitival complement. Syntactically, fog conforms to such infinitive-taking verbs as akar 'want' and szeret 'like', and the position of the infinitive is variable.

17. Initial negation also has a pragmatically neutral variant in Finnish: clause-initial combinations of inflected negation and a complementizer or conjunction such as ettei 'that not', ellei 'if not', eikä 'and not'.

18. In the nominative, this phrase should be analyzed as containing a numeral head with an NP complement (in the partitive). This is true of all Finnic languages.

19. And Hungarian, but with restrictions. For example, (ii), a participial structure, is favored over (i) in the standard language (Katalin E. Kiss, p.c.).
Hungarian
(i) A beszelgetes Lacival nem sikerült.
   the discussion (name):COM not succeed:PST.3SG
(ii) A Lacival valo beszelgetes nem sikerült.
   the (name):COM be:PART discussion not succeed:PST.3SG
   'The discussion with Laci did not work out.'

20. Finnic and Northern Sami use locative cases at the sentence level, not the genitive:
a. Sami: Northern
   Mähtes leat odda sabehat.
   (name):LOC be:3PL new ski:PL
b. Finnish
   Matilla on uudet sukset.
   (name):ADESS be:3SG new:PL ski:PL
   'Matthew has a new pair of skis.'

References

Alhoniemi, Alho
   1991 "Zur Kasuskennzeichnung des Objekts im Mordwinischen", SKY 1991: 12–30.
   1993 Grammatik des Tscheremissischen (Mari). Hamburg: Helmut Buske Verlag.
Baker, Robin
   1985 The development of the Komi case system. A dialectological investigation. MSFOu 189.
Bartens, Raija
   1979 Mordvan, tseremissin ja votjakin konjugaation infiniittisten muotojen syntaksi [The syntax of the non-finite verb forms in Mordvin, Mari and Udmurt]. MSFOu 179.
Bergsland, Knut
   1994 Sydsamisk grammatikk [A Southern Sami grammar]. Second edition. Davvi Girji O. S.
Comrie, Bernard
   1981 The languages of the Soviet Union. Cambridge University Press.
Dryer, Matthew
   1992 "The Greenbergian word order correlations", Language 68: 81–138.
Filppula, Markku and Anneli Sarhimaa
   1994 "Cross-linguistic syntactic parallels and contact-induced language change", SKY 1994: The Yearbook of the Linguistic Association of Finland, ed. Susanna Shore and Maria Vilkuna, 89–134.
Hajdu, Peter and Peter Domokos
   1987 Die uralischen Sprachen und Literaturen. Hamburg: Helmut Buske.
Hakulinen, Auli
   1976 "Suomen sanajärjestyksen kieliopillisista ja temaattisista tehtävistä" [On the grammatical and thematic functions of Finnish word order], in: Reports on text linguistics: Suomen kielen generatiivista lauseoppia 2. Meddelanden från Stiftelsens för Åbo Akademi Forskningsinstitut, Nr 7. Åbo.
Hakulinen, Auli, Fred Karlsson and Maria Vilkuna
   1980 Suomen kielen tekstilauseiden piirteitä: kvantitatiivinen tutkimus [Features of Finnish text sentences: a quantitative study]. Publications No. 6, Department of General Linguistics, University of Helsinki.
Hawkins, John A.
   this volume "Some issues in a performance theory of word order".
Holmberg, Anders
   this volume "Word order variation in some European SVO languages: a parametric approach".
Kangasmaa-Minn, Eeva
   1966 The syntactic distribution of the Cheremis genitive. MSFOu 139.
Kettunen, Lauri
   1943 Vepsän murteiden lauseopillinen tutkimus [A syntactic study of Vepsian dialects]. Suomalais-ugrilaisen Seuran Toimituksia LXXXVI. Helsinki.
E. Kiss, Katalin
   1987 Configurationality in Hungarian. Budapest: Reidel and Akademiai Kiado.
   1994 "Sentence structure and word order", in: Katalin E. Kiss and Ferenc Kiefer (eds.), Syntax and Semantics 27: The syntactic structure of Hungarian. San Diego: Academic Press, 1–90.
Korhonen, Mikko
   1981 Johdatus lapin kielen historiaan [An introduction to the history of Sami]. Helsinki: Suomalaisen Kirjallisuuden Seura.
MSFOu = Mémoires de la Société Finno-Ougrienne. Helsinki.
Mosin, M. V. and N. S. Bajushkin
   1983 Ersämordvan oppikirja [An Erzya Mordvin textbook]. Helsinki: Suomalais-ugrilainen Seura.
Nickel, Klaus Peter
   1990 Samisk grammatikk [A Sami grammar]. Universitetsforlaget.
Pajunen, Anneli
   1991 "Leksikalistinen hypoteesi ja agglutinoivat kielet" [The Lexicalist Hypothesis and agglutinative languages], Virittäjä 95: 155–181.
Primus, Beatrice
   this volume "The relative order of recipient and patient in the languages of Europe".
Redei, Karoly
   1978 Syrjänische Chrestomathie mit Grammatik und Glossar. Studia Uralica 1. Wien: Verband der wissenschaftlichen Gesellschaften Österreichs.
Remmel, N.
   1963 "Sönajärjestus eesti lauses. Deskriptiivne käsitlus" [Word order in the Estonian sentence. A descriptive study], in: Eesti keele süntaksi küsimusi. Keele ja Kirjanduse Instituudi Uurimused VIII. Tallinn.
Riese, Timothy
   1984 The Conditional sentence in the Ugrian, Permian and Volgaic languages. Studia Uralica 3. Wien: Verband der wissenschaftlichen Gesellschaften Österreichs.
Saareste, Andrus
   1960 "Subjekti ja predikaadi vastastikusest asenemisest eesti keeles laiendi järgi" [On the order of subject and predicate after a complement or adjunct in Estonian dialects], in: Vironseppo, Juhlakirja Julius Mägisten 60-vuotispäiväksi 19.12. 1860 [Festschrift for Julius Mägiste]. Helsinki: Kotikielen seura and Virittäjä, 61–69.
Saarinen, Sirkka
   1991 "Typological differences between the Volgaic languages", SKY 1991: 43–52.
SKY 1991 = The Yearbook of the Linguistic Association of Finland, ed. Maria Vilkuna and Arto Anttila. Helsinki.
Salminen, Tapani
   1993 "Word classes in Nenets", in: Festschrift für Kaija Bartens, MSFOu 215: 257–264.
   to appear "Tundra Nenets", to appear in Daniel Abondolo (ed.), The Uralic languages. London: Routledge.
Sammallahti, Pekka
   1991 "NO PASSING, NO XING: Traffic regulations for word order", in: Lea Laitinen, Pirkko Nuolijärvi and Mirja Saari (eds.), Leikkauspiste. Helsinki: Suomalaisen Kirjallisuuden Seura.
Savijärvi, Ilkka
   1981 "Sanajärjestystyyppi pääverbi-kieltoverbi viron kielessä" [The Verb-Negation order in Estonian], Virittäjä 85: 109–117.
Siewierska, Anna
   this volume "Variation in major constituent order: a global and a European perspective".
Sinor, Denis (ed.)
   1988 The Uralic languages. Description, history and foreign influences. Leiden: E. J. Brill.
Suihkonen, Pirkko
   1990 Korpustutkimus kielitypologiassa sovellettuna udmurttiin [Typological corpus research with special reference to Udmurt]. MSFOu 207.
Suihkonen, Pirkko (ed.)
   1995 Udmurt Texts. Helsinki: Suomalais-Ugrilainen Seura.
Szabolcsi, Anna
   1983 "The possessor that ran away from home", The Linguistic Review 3: 89–102.
Tael, Kaja
   1988 Sönajärjemallid eesti keeles (vörredluna soome keelega) [Estonian word order patterns (compared to Finnish)]. Eesti NSV teaduste akadeemia Keele ja kirjanduse instituut. Preprint KK1-56. Tallinn.
   1990 An approach to word order problems in Estonian. Eesti NSV teaduste akadeemia Keele ja kirjanduse instituut. Preprint KK1-66. Tallinn.
Trosterud, Trond
   to appear "Die südsamische Wortfolge kann als eine Kombination der deutschen und der marischen Wortfolge analysiert werden."
Vainikka, Anne
   1989 Deriving syntactic representations in Finnish. Doctoral dissertation, University of Massachusetts.
Vilkuna, Maria
   1989 Free word order in Finnish: its syntax and discourse functions. Helsinki: Suomalaisen Kirjallisuuden Seura.
   1991 "Constituent order and constituent length in Finnish", in: John A. Hawkins and Anna Siewierska (eds.), Performance principles of word order. EUROTYP Working Paper II.2, 81–106.
   1995 "Discourse configurationality in Finnish", in: Katalin E. Kiss (ed.), Discourse-configurational languages. New York and Oxford: Oxford University Press, 244–268.
Wichmann, Yrjö
   1954 Wotjakische Chrestomathie mit Glossar. Zweite, ergänzte Auflage. Helsinki: Suomalais-ugrilainen Seura.
Wickman, Bo
   1955 The form of the object in the Uralic languages. Uppsala: Almqvist & Wiksell.

Yakov G. Testelec

Word order in Kartvelian languages

1. Introduction

The Kartvelian languages are spoken mostly in Georgia; smaller groups of speakers inhabit Azerbaijan, Turkey and Iran. The number of speakers is approximately 3.8 million. The family consists of four languages: Georgian, including Old Georgian, represented by an uninterrupted written tradition since the 5th century, Mingrelian, Laz (also referred to as Chan), and Svan. Mingrelian and Laz are the closest genetically; they are viewed by some as a single Zan language. Georgian and the Zan languages constitute a separate genetic branch opposed to Svan.1

2. Inflectional and other functional categories

In contrast to the neighbouring North Caucasian languages, which are mostly consistent SOV languages and employ ergative/absolutive systems in case marking and verbal agreement, the Kartvelian languages are rather difficult to characterize in terms of inflectional and case marking typology. Most researchers label these languages as tense/split ergative/accusative (Boeder 1979), or active/accusative (Harris 1981; Klimov 1977). Grammatical relations are marked by cases, postpositions and agreement. Word order seems to convey no grammatical relations at all at the clause level, and very little at the phrase level. Some postpositions are phonetically independent words, others (e. g. Georgian si 'in', Svan -zi 'on') display properties of both postpositions and case endings. Subject, direct object and indirect object are marked basically with the three major cases: nominative, ergative (otherwise called narrative) and dative. Personal pronouns do not distinguish major cases. Other cases include instrumental, adverbial, genitive, and vocative. The major case endings mark not only the syntactic features of the NPs but also denote the transitivity class and one of the three tense subsystems to which their verbal predicate belongs. This is shown in Table 1 below, where Sact stands for subjects of, roughly, active atelic intransitives; Sinact denotes subjects of other intransitives (for details cf. Holisky 1981).2

Table 1. Case marking patterns in Kartvelian

Georgian & Svan, Mingrelian
   Perfect subsystem   transitive     S-DAT      O-NOM
                       intransitive   Sact-DAT   Sinact-NOM
   Present subsystem   transitive     S-NOM      O-DAT
                       intransitive   S-NOM
   Aorist subsystem    transitive     S-ERG      O-NOM
                       intransitive   Sact-ERG   Sinact-NOM   (Georgian & Svan)
                                      S-ERG                   (Mingrelian)

Laz (not sensitive to tense distinctions)
                       transitive     S-ERG      O-NOM
                       intransitive   Sact-ERG   Sinact-NOM
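The patterns in Table 1 can be read as a simple lookup from language, tense subsystem and argument role to a case label. The following is a minimal Python sketch of that reading, not part of the original description. It assumes, as one plausible interpretation of the table, that Mingrelian differs from Georgian and Svan only in the aorist intransitive row; the role names (`S_trans`, `S_act`, `S_inact`, `O`) and the function `case_of` are hypothetical labels introduced here for illustration.

```python
# Case patterns as read off Table 1. The role labels and the assumption that
# Mingrelian matches Georgian/Svan outside the aorist are illustrative only.
GEORGIAN_SVAN = {
    "perfect": {"S_trans": "DAT", "O": "NOM", "S_act": "DAT", "S_inact": "NOM"},
    "present": {"S_trans": "NOM", "O": "DAT", "S_act": "NOM", "S_inact": "NOM"},
    "aorist":  {"S_trans": "ERG", "O": "NOM", "S_act": "ERG", "S_inact": "NOM"},
}

# Mingrelian: in the aorist, every intransitive subject takes the ergative.
MINGRELIAN = {
    **GEORGIAN_SVAN,
    "aorist": {**GEORGIAN_SVAN["aorist"], "S_inact": "ERG"},
}

# Laz: one pattern, regardless of tense subsystem.
LAZ_PATTERN = {"S_trans": "ERG", "O": "NOM", "S_act": "ERG", "S_inact": "NOM"}
LAZ = {subsystem: LAZ_PATTERN for subsystem in GEORGIAN_SVAN}

PATTERNS = {
    "Georgian": GEORGIAN_SVAN,
    "Svan": GEORGIAN_SVAN,
    "Mingrelian": MINGRELIAN,
    "Laz": LAZ,
}


def case_of(language, subsystem, role):
    """Return the case label for a role in a given language and tense subsystem."""
    return PATTERNS[language][subsystem][role]


if __name__ == "__main__":
    # A Georgian aorist transitive clause: ergative subject, nominative object.
    print(case_of("Georgian", "aorist", "S_trans"), case_of("Georgian", "aorist", "O"))
```

Under these assumptions, the lookup makes the two typological splits visible at a glance: the tense-based split (perfect vs. present vs. aorist) and the active/inactive split among intransitive subjects.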

The verb agreement system is basically accusative. The prefix of the verb agrees in person/number with the (in)direct object if it is a 1st or 2nd person pronoun and with the subject otherwise; the suffix agrees in person/number normally with the subject or under special conditions with the object. Adjective and genitive modifiers, if preposed, copy some case features of their heads; if postposed or discontinuous from the head NP, they copy all its case/number characteristics. Unemphatic pronoun drop (see Harris 1981 on Georgian) is obligatory with subject and (in)direct object(s) and impossible with other NPs. As in many other pro-drop languages, two full NPs are rare as S and O in a single clause.
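The choice of controller for the verbal prefix, as stated above, amounts to a two-way conditional. The sketch below illustrates only that rule, under the simplifying assumption that persons are given as integers 1 to 3; the function name `prefix_controller` is hypothetical, and the "special conditions" governing the suffix are deliberately not modelled.

```python
def prefix_controller(subject_person, object_person=None):
    """Return which argument the verbal prefix agrees with.

    Simplified rule: a 1st or 2nd person (in)direct object controls the
    prefix; otherwise the subject does. Persons are integers 1, 2 or 3;
    object_person is None when the clause has no object.
    """
    if object_person in (1, 2):
        return "object"
    return "subject"


if __name__ == "__main__":
    # 2nd person subject, 1st person object: the prefix marks the object.
    print(prefix_controller(subject_person=2, object_person=1))
    # 3rd person subject and object: the prefix marks the subject.
    print(prefix_controller(subject_person=3, object_person=3))
```

The first call corresponds to a configuration with a 2nd person subject and a 1st person object, where the prefix would be expected to carry object marking.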

3. Word order type

(Old) Georgian, Mingrelian and Svan may be classified as free SOV/SVO languages, with SOV prevailing in isolated sentences and statistically. Since NPs are provided with case markers, as shown in (1), and the 1st and 2nd person pronouns as S and O are disambiguated by verbal agreement (2), word order does not have a purely disambiguating function.

(1) Georgian
   Monadire-m mo-kl-a irem-i.
   hunter-ERG PREV-kill-AOR.3SG.SBJ deer-NOM
   'The hunter killed the deer.'

(2) Svan
   Sgäj mi m-amar-e-d.
   you I 1SG.OBJ-prepare-PRS-1/2PL.SBJ
   'You are preparing me.'

The only instance where word order does perform a disambiguating role is in Old Georgian, where S and O, if proper nouns, do not distinguish nominative and ergative. The SO order, as in (3), is used in such situations.

(3) Old Georgian
   Abraham sv-a isaak.
   Abraham bear-AOR.3SG.SBJ Isaac
   'Abraham bore Isaac.'

Laz appears to be a typical SOV language. In regard to the VO/OV typology, all Kartvelian languages, except Old Georgian, are consistently left-branching, but with a right-branched relative clause following the head noun and beginning with a relative pronoun.3 Old Georgian, however, has right-branched parallels to many left-branched structures (for details see below). Georgian is the only Kartvelian language whose word order has been the subject of systematic studies (see e.g. Kaxaze 1953; Kiziria 1950; Pocxua 1962; Alxazisvili 1966; Kvacase 1966, among others). Unfortunately, most of the studies within the native grammatical tradition are rather fragmentary, do not take into consideration negative evidence and often employ vague grammatical notions. Vogt (1971, 1988) was the first to examine word order in Georgian within a more convincing grammatical framework. Harris (1984) investigated word order in Georgian interrogative constructions. Up to now, the most detailed description of modern Georgian is that of Apridonije (1986); her description is not free of some of the shortcomings mentioned above, but nonetheless represents a wealth of important data. The striking contrast between word order in Old Georgian and modern Georgian has not, of course, been neglected by researchers. The drift from 'head-modifier' to 'modifier-head' order has been discussed in much detail by Klimov (1961) and Sar3vela3e (1984).

4. Major clausal constituents in declarative clauses

4.1. The verb and its arguments

Georgian, Mingrelian and Svan do not differ significantly in regard to the degree of word order flexibility of major constituents in declarative clauses. All six possible orders of S, V, and O can be elicited from an informant or found in a written or spoken narrative or dialogue. TFX and TXF orders prevail, where T denotes the topical and F the focused constituent(s). Some relevant examples from Georgian are given in (4), (5), and (6).

(4) Georgian
a. Bic-ma gogo-0 ga-abraz-a.
boy-ERG girl-NOM PREV-make angry-AOR.3SG.SBJ
'The boy made the girl angry.'
b. Bic-ma ga-abraz-a gogo-0.
c. Gogo-0 bic-ma ga-abraz-a.
d. Gogo-0 ga-abraz-a bic-ma.
e. Ga-abraz-a bic-ma gogo-0.
f. Ga-abraz-a gogo-0 bic-ma.

(5)
Ra-0 mo-u-vid-a bic-s?
what-NOM PREV-3.OBJ.VERS-go-AOR.3SG.SBJ boy-DAT
'What happened to the boy?'

(6)
Ra-0 mo-u-vid-a gogo-s?
what-NOM PREV-3.OBJ.VERS-go-AOR.3SG.SBJ girl-DAT
'What happened to the girl?'

In the context of (5), a proper reply may be (4a) or (4b); other orders are less acceptable. In the context of (6), (4d) is more acceptable than (4c) and much better than all the other orders given under (4). Within a narrative discourse, if there are no contrastive or focused NPs or other special cases, all six variants are interchangeable. Such is the case, for example, if the referents of the NPs denote the main characters in the narrative, as in (7).

(7) Georgian
a. Bevri e3eb-a xelmcipe-m svil-i.
much seek-AOR.3SG.SBJ king-ERG son-NOM
'The king was looking for his son for a long time.'
b. Xelmcipe-m bevri e3eb-a svil-i.
c. Svil-i bevri e3eb-a xelmcipe-m.
d. Xelmcipe-m svil-i bevri e3eb-a.

Laz, however, is a strict V-final language. The diachronic drift of Laz toward strict SOV is obviously due to the influence of Turkish. (The Laz people have been bilingual for many centuries.) VX appears only under special conditions, e.g. if the V is moved to second position, following a WH-pronoun, as in (8).

(8) Laz
Mi u-skun worsi lazuri nena-0?
who 3.SBJ-know well Laz language-NOM
'Who knows well the Laz language?'

VO is also possible with imperatives, e.g.

(9)
Ma mi-cw-a dulja-0 skani-0
I 1SG.OBJ-say-IMP business-NOM your-NOM
'Tell me of your business!'

In the Kartvelian languages (except Laz), introductory or 'presentative' intransitive verbs, which denote that the referent of the subject appears on the stage or disappears from it, tend to be placed in first position (evidence for Georgian can be found in Apridonise 1986: 120).

(10) Mingrelian
Ko'ope art-i yalier-i koc-i.
be.AOR.3SG one-NOM rich-NOM man-NOM
'(Once upon a time) there was a rich man.'

(11) Old Georgian
Da gardamoqd-a cyma-j da carmoxecn-es mdinare-n-i da kr-odes kar-n-i.
and open-AOR.3SG.SBJ rain-NOM and overflow-AOR.3PL.SBJ river-PL-NOM and blow-AOR.3PL.SBJ wind-PL-NOM
'And the rain began, and the streams overflowed, and the winds blew.'

Word order is not particularly sensitive to length restrictions. However, with a non-nominalized sentential object, OVS, VOS, and SOV orders are disfavoured, as shown in (12).

(12) Georgian
a. Bic-ma tkv-a rom ar cava saxl-si.
boy-ERG say-AOR.3SG.SBJ COMP NEG go.FUT.3SG.SBJ house-LOC
'The boy said that he wouldn't go home.'
b. *Tkv-a, rom ar cava saxl-si, bic-ma.
c. *Bic-ma, rom ar cava saxl-si, tkv-a.
d. ?Rom ar cava saxl-si, tkv-a bic-ma.

In sentences with a sentential and a nominal object, the first position for the nominal object is obligatory or favoured, as in (13a).

(13) Mingrelian
a. Ndii-k mitxu-u koc-i namu-k-it tina-0 gakurcxin-un-i.
Div-ERG ask-AOR.3SG.SBJ man-NOM which-ERG-REL he.NOM awake-CAUS-PLUP.3SG.SBJ
'The Div asked the man who had awakened him.'
b. *Mitxu-u, namu-k-it tina gakurcxin-un-i, koc-i.

Arguments provided with "long" postposed relative clauses do not seem to avoid the position before their heads; e.g. (14a) is as normal as (14b).

(14) Mingrelian
a. Koc-i namu-sen-it ma 0-pikren-di, doyur-u.
man-NOM which-ABL-REL I 1SG.SBJ-think-IMPF.1/2SG.SBJ die-AOR.3SG.SBJ
'The man, of whom I have been thinking, died.'
b. Doyur-u koc-i, namu-sen-it ma 0-pikren-di.

Apridonise (1986: ch. 1) claims that in 'long' sentences SVO is more frequent than in 'short' ones. Her claim is supported by text counts. Apparently, her results are due to the fact that VO is preferable with 'long' Os.

Contrasted topics, irrespective of the grammatical relation that they bear, always occupy the leftmost position. A contrastive focus may precede the verb or follow it. A setting adverbial may precede the contrastive topic if it modifies the contrastive situation as a whole. Consider, for instance, the examples in (15).

(15) Georgian
a. Saxl-i aasen-a givi-m, vasl-i ki darg-o sandro-m.
house-NOM build-AOR.3SG.SBJ Givi-ERG apple-NOM CONJ plant-AOR.3SG.SBJ Sandro-ERG
'It was Givi who built the house, and it was Sandro who planted the apple-tree.'
b. *Aasen-a saxl-i givi-m, darg-o ki vasl-i sandro-m.

In conjoined clauses with coreferential subjects, V1SX1V2X2 order is frequent in narrative texts. (The figures denote the clauses to which the constituents belong.)

(16) Svan
Cu-1-äg al märe-0 nesga xälx-isga i gargl-i.
PREV-3.SBJ-stand.PRS that man-NOM inside crowd-LOC and speak-PRS.3SG.SBJ
'That man is standing inside the crowd and speaking.'

(17) Mingrelian
Ginocqwid-3 papa-k diakon-isi nopulo 'wil-ua do duucq-3 dara3-ua.
decide-AOR.3SG.SBJ priest-ERG deacon-GEN secretly kill-MSD and begin-AOR.3SG.SBJ watch-MSD
'The priest decided to murder the deacon and began to keep watch over him secretly.'

(18) Georgian
a. (Gela-m gaabraz-a givi-0.) daartq-a ma-s givi-m da gaikc-a.
Gela-ERG make angry-AOR.3SG.SBJ Givi-NOM hit-AOR.3SG.SBJ he-DAT Givi-ERG and run away-AOR.3SG.SBJ
'(Gela made Givi angry.) Givi hit him and ran away.'
b. (...) ??Daartq-a ma-s givi-m.
hit-AOR.3SG.SBJ he-DAT Givi-ERG
'(...) Givi hit him.'

In (18), VS order is not acceptable in isolation, as in (18b), unless a conjoined clause beginning with a verb follows, as in (18a).

4.2. Adjuncts and adverbials

There are no strict rules involving the placement of adverbials of setting, except the general rule that any contrastive topic occupies the first position. If an adverbial is a contrastive topic, it obeys this rule, unless it is preceded by another setting adverbial, which introduces the whole proposition which contains the contrasted elements:

(19) Georgian
a. Cem bay-si gusin da-v-bar-e mica-0, dyes ki v-morgl-av.
my garden-LOC yesterday PREV-1SG.SBJ-dig-AOR.1/2SG.SBJ ground-NOM today CONJ 1SG.SBJ-weed-PRS
'In my garden, yesterday I was digging the ground, and today, I am weeding.'
b. ??Bay-si da-v-bar-e gusin mica-0, v-morgl-av dyes.

Place and time adverbials in Georgian prefer the initial position, whereas manner and circumstantial adverbials tend to be placed immediately before the verb (Vogt 1971: 221, 224). Apridonise (1986) points out that 'absolute' local adverbials, i.e. adverbs denoting place independently of other referents of the utterance, such as guriasi in (20), occupy the leftmost position, whereas 'relative' adverbials, referring to the place via some other referent's place, tend to occur before or after the verb.

(20)
Guria-si qasax-is svil-i cem svil-eb-s cin ver du-u-dg-eb-a.
Guria-LOC cossack-GEN son-NOM my son-PL-DAT before NEG PREV-3.OBJ-stand-PRS-3SG
'In Guria (an area in Western Georgia) the cossack's son won't stand before my sons.' (Apridonise 1986: 37)

This tendency may be due to a more general difference between the 'absolute' and 'relative' adverbials, i.e. that the former are usually shorter, for obvious reasons. It confirms Apridonise's (1986: 5) claim that 'formless' and monosyllabic adverbials prefer the initial position in a sentence. Negative adverbials, like all negative elements, strictly and immediately precede their scope, in this case the verb (Apridonise 1986: 7). Temporal adverbials usually occupy the initial position (according to Apridonise's data, 89% in two-word clauses, 67% in three-word ones, 52% in four-word clauses, etc.). Aspectual adverbials, however, are placed in immediate preverbal position, like other manner adverbials (Apridonise 1986: 39-44). Causal adverbials are sentence-initial (1986: 47-48), whereas goal adverbials, as in (21), follow the verb (1986: 48-49).

(21) Old Georgian
Dasxd-es igin-i cam-ad pur-isa.
sit-AOR.3PL.SBJ they-NOM eat-MSD bread-GEN
'They sat down to eat (some) bread.'

In Svan texts, there are many examples of adverbials in the rightmost position (although there are few or no restrictions on their placement):

(22) Svan
Natxwjar-al-0 mama x-atx-ena twetnald-a girkid.
deer-PL-NOM NEG 3.SBJ-find-PRF Twetneld-GEN around
'He could find no deer around the Twetneld (mountain).'

5. Subordinate clauses

Word order in subordinate clauses does not differ basically from that of main clauses. In Georgian, the conjunctions rom 'that; if' and tu 'if' (in conditional use; they differ in tense/aspect and the semantics of the subordinate clauses they introduce) occur either clause-initially or after the first stressed word (or constituent) in the subordinate clause (Tschenkeli, S 1958: 108).

(23) Georgian
Tkven tu / tu tkven da-0-cer-t ceril-s, me ga-v-gzavn-i ma-s.
you if PREV-2.SBJ-write-PRS.1/2PL.SBJ letter-DAT I PREV-1.SBJ-send-PRS it-DAT
'If you write the letter, I will send it.'

In the Zan languages (i.e. Mingrelian and Laz), in contrast to Georgian and Svan, subordinate clauses have clause-final complementizers, in conformity with their general left-branching strategy. Laz, being a strict SOV language, has moved much further in this direction. The overlap of the two strategies in Mingrelian is evident. A relative clause may begin with a relative pronoun and end with a relative conjunction -ni:

(24) Mingrelian
Mu-ti 0-oko-ni, ti-s e-v-uteen-k.
which-REL 3SG.SBJ-want.PRS-CONJ that-DAT PREV-1.SBJ-do-PRS.1/2.SBJ
'Whatever he wants, I'll do it.'

Most conjunctions are clause-final. E. g. (25)

G-oko-na da, kmo-rt-i! 2.OBJ-want-COND if PREV-come-IMP 'If you wish, come!'

In Laz, a system of gerundial converbs typical of SOV languages has emerged instead of subordinate conjunctions:

(26)

Laz Foma heleke moft-is-is, jemasa yesterday there come-OPT-TEMP afternoon zu saat or-tu. two hour be-PAST.3SG.SBJ 'Yesterday, when I had come from there, it was two o'clock.'

Etymologically, the converb affix in (26) is analyzable as consisting of a genitive case ending -is attached to some nominalized form of the verb.

6. Different sentence types

In Georgian yes/no questions, verb-final order is most common, due to the general tendency of placing the focus of the yes/no question in the rightmost position. The yes/no question focus is provided with a special intonation contour, rising from low or middle to the highest pitch. Alternatives occur if a non-verbal constituent is in focus: (27)

Georgian Gela-0 kitxulob-s cign-s? Gela-NOM read-PRS.3SG.SBJ book-DAT 'Does Gela read the book?', 'Is it the book that Gela reads?'

In Mingrelian yes/no questions are marked by -o, in Laz by -(j)i enclitics attached to the verb, even if the verb is not the question focus, as in (28) and (29). (28)

Mingrelian Si re-k-o koc-i? you be-PRS.1/2SG.SBJ-INT man-NOM 'Are YOU a man?' (question asked by a Div who wants to find a man)

(29)

Laz Artuk haci 0-idater-en-i? well now 2SG.SBJ-leave-PRS-INT 'Well, will you leave NOW?'

In Georgian the optional yes/no question particle tu is always clause-initial. Verbs denoting perceptions, emotions etc. occur in initial position (Apridonise, 1986: 81). E.g. (30)

Georgian Mo-g-con-s kal-i? PREV-2.SBJ-like-3SG.OBJ woman-NOM 'Do you like the woman?'

The WH-phrase position is strictly preverbal in the Kartvelian languages (for details on Georgian see Harris 1984).

(31)
a. Rogor mo-g-econ-a es ambav-i?
how PREV-2.SBJ-like-3SG.OBJ this story-NOM
'(How) did you like this story?'
b. *Rogor es ambav-i mo-g-econ-a?

The question phrase usually precedes the residue of the clause, unless the WH-phrase is a direct object or a setting adverbial. In the two latter cases it may equally well follow it (Apridonise 1986: 72). 'Long' constituents tend to follow the WH-phrase (Apridonise 1986: 73). In Laz the situation is similar, as (32) and (33) illustrate.

(32)

Laz Nam odax-es rkola-0 r-en? which room-GEN key-NOM be-PRS.3SG.SBJ 'Which room is the key from?'

(33)

Ma he bere-0 muco p-cop-a? I this boy how 1.SBJ-catch-OPT.1/2.SBJ 'How can I catch this boy?'

In imperatives, VO as in (34a) seems to prevail over OV as in (34b).

(34) Mingrelian
a. Ko-m-uc-i para-0
PREV-1SG.OBJ-give-IMP.2SG.SBJ money-NOM
'Give me the money!'
b. Para-0 ko-m-uc-i

Sentence negation is expressed in Kartvelian by free negative particles. Each language possesses several negative particles which have different modal functions; these differences have no bearing on word order. The only possible order is XNeg(Aux)VY, as in (35a).

(35) Georgian
a. Ar v-ici.
NEG 1.SBJ-know.PRS
'I don't know.'
b. *V-ici ar.
1.SBJ-know.PRS NEG

Negative adverbials, if they replace the negative particle, occupy the same position (36); if they occur together with the negative particle ('double negation') they can be post- (37a) or preposed (37b).

(36)
a. Araper-i v-ici.
nothing-NOM 1.SBJ-know.PRS
b. *V-ici araper-i.
1.SBJ-know.PRS nothing-NOM

(37)
a. Araper-i ar v-ici.
nothing-NOM not 1.SBJ-know.PRS
'I know nothing'
b. Ar v-ici araper-i.
not 1.SBJ-know.PRS nothing-NOM

7. The noun phrase

Of all the Kartvelian languages, only Old Georgian had articles, and they were postposed. The indefinite article was erti, which functioned as an article only if postposed and as the numeral 'one' if preposed. The demonstrative pronouns igi, ese, ege 'this, that' were preposed; if postposed, they functioned as definite articles. Compare the examples in (38).

(38) Old Georgian
a. kac-i ert-i
man-NOM INDEF-NOM
'a man'
b. ert-i kac-i
one-NOM man-NOM
'one man'

Boeder (1995) has provided sufficient evidence that the articles, as opposed to demonstratives and numerals, were unstressed clitics attached usually to the first stressed constituent of an NP. In view of this they were sometimes placed before the noun as in (39). (39)

qovel-sa ma-s er-sa all-DAT DEF-DAT people-DAT 'to all people'


Demonstratives, numerals and quantifiers are preposed in Georgian and both of the Zan languages. E. g. (40)

Georgian is kac-i / *kac-i is that man-NOM 'that man'

(41)

Mingrelian zir-i koc-i / koc-i zir-i two-NOM man-NOM 'two men'

(42)

tito koc-i / *koc-i tito every man-NOM 'every man'

In Svan, numerals and quantifiers may 'float' into preverbal position:

(43) Svan
a. Mäg cqint-är-0 an-qäd-x.
all boy-PL-NOM PREV-come.AOR-3PL.SBJ
b. Cqint-är-0 mag an-qäd-x.
'All the boys came.'

(44)

a. Semi mare-0 an-qad-0. three man-NOM PREV-come-3SG.SBJ b. Mare-0 semi an-qad-0. 'Three men came.'

Possessive pronouns are preposed in modern Georgian and Svan. Cases of postposition are found in Svan folklore texts (Palmaitis & Gudjedjiani 1986: 42; Klimov 1961: 259): (45)

Dede-s isgw-a si gar x-or-d-as. mother-DAT thy-DAT you only 3SG.SBJ-be-IMP-1/2 'Thy mother had thee only' (from the popular folk song "Mirangula")

Possessive pronouns in Laz are usually postposed. Postposing is obligatory in cases of inalienable possession, as in (46); preposing usually occurs under contrast (Klimov 1961: 258), as in (47).

(46) Laz
toli-0 ckimi-0 / *ckimi-0 toli-0
eye-NOM my-NOM
'my eye'

(47)
Essey-si nena-0 0-i3ere-ja do ckimi nena-0 museni va 0-i3ere-ja?
ass-GEN word-NOM 2SG.SBJ-believe.PRS-CIT and my word-NOM why not 2SG.SBJ-believe.PRS-CIT
'You do believe what the ass says, but why don't you believe what I'm saying (lit.: MY word)?'

In Old Georgian possessive pronouns were usually postposed. (48)

Old Georgian Asul-i cem-i acya mokud-a. daughter-NOM my-NOM just die-AOR.3SG.SBJ 'My daughter has died just now.'

Postposed possessive pronouns are used as second parts of compounds with some kinship terms in modern Georgian and Mingrelian. (49)

Georgian mama-cem-i father-my-NOM 'my father'

(50)

Mingrelian mua-ckim-i father-my-NOM 'my father'

In all the Kartvelian languages except Old Georgian, adjectives are normally preposed, irrespective of their morphological or semantic properties. In modern Georgian, the postposing of an adjective is one of the features of archaic style (which may follow the Old Georgian patterns in many other respects as well). In Old Georgian both orders were possible, and NAdj prevailed: in the first Georgian Gospel there are 847 cases of NAdj to 232 cases of AdjN (Kaxaze 1953: 373). The functional load of this opposition (if there was any) has not yet been investigated. At first sight, preposed adjectives denote stable and non-restrictive characteristics, whereas postposed ones are noncommittal as to this distinction. Sar3vela3e (1984: 510) noted that stable adjectival epithets in Old Georgian like cmidaj 'saint', ujmrtoj 'godless', brjeni 'wise' etc. are always preposed to the names of their characters. Cf. also the following examples. (51)

Old Georgian did-n-i ig-i marxva-n-i great-PL-NOM DEF-NOM fast-PL-NOM 'the Lent (long winter period of fasting)'

(52)

patiosan-sa 3var-sa honourable-DAT cross-DAT 'to the honourable cross'

Adjectives and possessive pronouns in Old Georgian, if contrastively focussed, are obligatorily preposed (first observed in other terms by Klimov 1961: 260-261). E.g. (53)

Sevidod-i-t icro-sa ma-s bce-sa. enter-IMP-PL narrow-DAT DEF-DAT gate-DAT 'Enter the NARROW gate.' (Mth 7:13)

Genitives are usually preposed in all the languages except Old Georgian, where postposing prevails. Mingrelian, Svan, and Georgian have postposed genitives which differ formally from the preposed ones. In Mingrelian an immediately postposed genitive bears the case ending of the NP it belongs to, whereas the head noun does not (54). If a genitive is distantly postposed, both constituents of the NP are case-marked (55); cf. Kipsidze (1914: 135). (54)

Mingrelian skua muarxe-si-s son slave-GEN-DAT 'to the slave's son'

(55)

Giantxuu-d-es kata-s te xencap-si attack-IMP-AOR.3PL.SBJ people-DAT that king-GEN mala-si-s. estate-GEN-DAT 'They attacked the people of that king's estate.'


In Svan, a genitive, which is postposed or discontinuous from its head, has the additional marker -s, originating from the Common Kartvelian genitive marker. (56)

Svan a. gezl-a kor-0 boy-GEN house-NOM b. kor-0 gezl-a-s house-NOM boy-GEN-POSTP MARKER c. *kor-0 gezl-a d. *gezl-a-s kor-0 'boy's house'

In Old Georgian postposed genitives agree with their head in case/number. The case/number suffix of the head noun is copied after the genitive marker (Plank 1990, Boeder 1995): (57)

Old Georgian 3m-isa saxl-man / saxl-man 3m-isa-man brother-GEN house-ERG house-ERG brother-GEN-ERG 'brother's house'

If there are two levels of genitive embedding, only the last, i. e. the lowest genitive NP in a double genitive phrase can bear the case/number agreement markers. First it copies the case/number endings of its immediate head; after that the ending of the top head noun is copied: (58)

saidumlo-j ig-i sasupevel-isa mystery-NOM DEF-NOM kingdom-GEN m-is ymrt-isa-jsa-j DEF-GEN God-GEN-GEN-NOM 'the mystery of the kingdom of God'
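The suffix-copying rule described for (57) and (58) can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: the function names and the list-based representation are hypothetical, the copied endings are supplied by hand in the forms attested in (57) and (58) rather than derived by any allomorphy rule, and the article m-is in (58) is left out.

```python
def postposed_genitive(genitive_form, head_endings):
    """Append copied head endings to a postposed genitive.

    genitive_form: a noun already carrying its own genitive marker, e.g. "3m-isa".
    head_endings: the endings to copy, immediate head first, top head last
                  (given by hand here; no allomorphy is computed).
    """
    return "-".join([genitive_form, *head_endings])


def genitive_chain(genitives, copied_endings):
    """Model a chain of embedded postposed genitives.

    Following the description of (58), only the lowest (last) genitive
    receives the copied endings; higher genitives are left unchanged.
    """
    *higher, lowest = genitives
    return higher + [postposed_genitive(lowest, copied_endings)]


# (57): the genitive copies the ergative ending of its head saxl-man.
print(postposed_genitive("3m-isa", ["man"]))
# (58): the lowest genitive copies GEN from 'kingdom', then NOM from 'mystery'.
print(genitive_chain(["sasupevel-isa", "ymrt-isa"], ["jsa", "j"]))
```

The order of the `head_endings` list mirrors the order of copying stated in the text: the ending of the immediate head comes first, the ending of the top head last.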

If the genitive NP is discontinuous or contains an apposition, the rule can apply to the genitive which is followed by the inserted element, or by the apposition:

(59)
Cuen mose-s-n-i v-ar-t mocape-n-i.
we Moses-GEN-PL-NOM 1SG.SBJ-be-1/2PL.SBJ disciple-PL-NOM
'We are the disciples of Moses.'

(60)
nerg-n-i samotx-isa-n-i m-is edem-isa-n-i
plant-PL-NOM paradise-GEN-PL-NOM DEF-GEN Eden-GEN-PL-NOM
'the plants of Eden the Paradise'

Here again, like the adjectives, preposed genitives usually denote stable and non-restrictive characteristics: (61)

cmid-isa 3uar-isa 3al-isa cineba-sa holy-GEN cross-GEN power-GEN sign-DAT 'the sign of the power of the holy cross'

In modern Georgian texts, postposed genitives are associated with archaic or solemn style. Sometimes NGen is used if N has very long preposed dependents. This is exemplified by (62 a) which is considered to be preferable to the normal GenN order in (62b). 4 (62)

Georgian a. ara mravalricxovani da mainc didi istori-is not numerous but still great history-GEN mkone-0 er-i kartl-isa possessor-NOM people-NOM Georgia-GEN 'although not very numerous, but nonetheless having a great history, the people of Georgia' b. ?kartl-is ara mravalricxovani... etc. er-i Georgia-GEN not numerous ... people-NOM

NGen is also common with long relative modifiers of the genitive, for the same reasons as those mentioned in note 4. A case in point is illustrated in (63b).

(63)
a. ?si3ulvil-is, romel-i-c didixania gv-acux-eb-s, mizez-i
hatred-GEN which-NOM-REL long 1PL.OBJ-trouble-PRS.3SG.SBJ reason-NOM
'the reason of the hatred which troubles us for a long time'
b. mizez-i si3ulvil-isa, romel-i-c... etc.
reason-NOM hatred-GEN which-NOM-REL

As we have seen, Old Georgian tends to have a right-branched equivalent to many (if not all) left-branched structures. The tendency is consistent enough to affect even postpositions. In Old Georgian many adpositions can be placed pre- and postpositionally. (64)

cinase netar-isa susanik-isa before blessed-GEN Shushanik-GEN 'before blessed Shushanik'

(65)

mtavr-isa cinase ruler-GEN before 'before the ruler'

Zorrell's (1930: 38-39) lists of pre- and postpositions overlap greatly, although there were adpositions which were used exclusively pre- or postpositionally. In modern Georgian all the Old Georgian prepositions and variable adpositions have become postpositions.

Relative clause modifiers are always postposed in Georgian and Svan (unless the relative clause is a preposed participial construction). In Mingrelian relative clauses are pre- or postposed; in Laz they are preposed. (66)

Mingrelian wezir-ep-i la'api-s U3ine-d-es-ni wezir-PL-NOM game-DAT see-IMPF-3PL-REL 'the wezirs who saw the game'

(67)

yoront-isa id-3-ni ti koc-i God-LOC go-AOR.3SG.SBJ-REL this man-NOM 'this man who went to God'

(68)

almas-i, namuti yir sumi sopeil-isawe diamond-NOM which.NOM worthy three village-LOC 'the diamond which costs three villages'

(69)

Laz na 0-ili-dortun avi-0 which 2SG.SBJ-kill-PERF bird-NOM 'the bird that you killed'

Sometimes, an NP containing a focused constituent is discontinuous: in the order TFX, the constituent which is the most relevant pragmatically moves outside the material that is pragmatically neutral:


(70)

Old Georgian rametu aravin i-pov-a kac-tagan-i because none PASS-find-AOR.3SG.SBJ man-ABL-NOM 'because NO ONE was from the people.'

(71)

Mingrelian Maran-s did-i lagwan-ep-i r-da cellar-DAT big-NOM jug-PL-NOM be-IMPF.3SG.SBJ ywin-isi epsa. wine-GEN full 'There were BIG JUGS full of wine in the cellar.'

The normal order of the NP modifiers is GenDemPossNumAdjNRel. This is exemplified in (72). (72)

Georgian bic-is es cemi ori didi satamaso-0, boy-GEN this my two big toy-NOM romel-i-c mo-v-itan-e which-NOM-REL PREV-lSG.SBJ-bring-AOR.l/2SG.SBJ 'these two my big toys of the boy which (the toys) I have brought'
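The template GenDemPossNumAdjNRel can be read as a simple ranking of modifier categories. The following Python sketch, which uses the word forms of (72), shows one way to linearize an unordered set of NP parts against that template; the category labels, the `(category, form)` representation and the function name are illustrative assumptions, not part of the description above.

```python
# Category order taken from the template GenDemPossNumAdjNRel.
TEMPLATE = ["Gen", "Dem", "Poss", "Num", "Adj", "N", "Rel"]
RANK = {category: position for position, category in enumerate(TEMPLATE)}


def linearize(np_parts):
    """Return the forms of (category, form) pairs in template order."""
    ordered = sorted(np_parts, key=lambda part: RANK[part[0]])
    return [form for _category, form in ordered]


# The parts of (72), deliberately given in scrambled order.
parts = [
    ("N", "satamaso-0"),
    ("Adj", "didi"),
    ("Rel", "romel-i-c mo-v-itan-e"),
    ("Gen", "bic-is"),
    ("Num", "ori"),
    ("Poss", "cemi"),
    ("Dem", "es"),
]
print(" ".join(linearize(parts)))
```

The sketch covers only the normal order; it does not model the discontinuous NPs of (70) and (71) or the stylistically marked NGen orders discussed earlier.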

8. Conclusion

The Kartvelian languages show a combination of free word order at the clause level and strict head-final order within the noun phrase, which is typical for most languages of the Caucasian area. Unlike, for example, in the East Caucasian languages, in Kartvelian there is apparently no unmarked order of major clausal constituents. Old Georgian and Laz are exceptional in this respect. Laz belongs to the consistent head-final type; a curious characteristic of Old Georgian is that it was "free-branching", i.e. it allowed a head-initial option to almost all head-final constituents, including postpositional phrases.

Notes

1. I wish to thank colleagues and informants who helped me in collecting data on Kartvelian languages, especially Ketevan Gadilia for Georgian, Roza Kantaria for Mingrelian, Zina Gudjedjiani and David Kochkiani for Svan, among many others. I would like to thank Georgij A. Klimov and Winfried Boeder for their valuable comments on the first draft of the paper; all mistakes are my own.
2. S and O are used here in Greenberg's sense and not in Comrie's (1978) or Dixon's (1979).
3. For a discussion of this asymmetry found in a considerable number of left-branching languages see Dryer (1988) and Hawkins (1990).
4. That (62b) is less acceptable follows from Hawkins' (1990) unified theoretical account of word order universals, grammaticalization processes and acceptability scales of equally grammatical options of word order based on considerations of quick and effective parsing of syntactic material. According to his theory, if one of several grammatical orders shows less "distance" between the heads of constituents embedded in one another, this order will be the most acceptable. The method of calculating this distance (the "Early Immediate Constituents value") proposed by Hawkins shows indeed that in (62b) the distance is much greater for the more inclusive NP than it is in (62a).

References

Alxazisvili, A. A.
1959 "Porjadok slov i intonacija v prostom povestvovatel'nom predlozenii gruzinskogo jazyka", Foneticeskij sbornik TGU, posv'ascennyj akad. G. S. Axvlediani. Tbilisi.
Apridonise, Shukia
1986 Sitqvatganlageba axal kartulsi. Tbilisi: Mecniereba.
Boeder, W.
1979 "Ergative syntax and morphology in language change: the South Caucasian languages", in: Frans Plank (ed.), Ergativity. New York: Academic Press, 435-480.
1995 "Suffixaufnahme in Kartvelian", in: Frans Plank (ed.), Double case. New York: Oxford University Press, 151-215.
Comrie, Bernard
1978 "Ergativity", in: Winfried P. Lehmann (ed.), Syntactic typology. Studies in the phenomenology of language. Austin & London: The University of Texas Press, 329-394.
Dixon, R. M. W.
1979 "Ergativity", Language 55: 1-144.
Dryer, Matthew S.
1988 "Object-verb order and adjective-noun order: dispelling a myth", Lingua 74: 185-217.
Harris, Alice C.
1981 Georgian syntax. A study in Relational Grammar. Cambridge: Cambridge University Press.
1984 "The Georgian question", in: W. S. Chisholm (ed.), Interrogativity. TSI, vol. 4. Amsterdam & Philadelphia: John Benjamins.
Hawkins, John A.
1990 "A parsing theory of word order universals", Linguistic Inquiry 21: 223-261.
Holisky, Dee Ann
1981 Aspect and Georgian medial verbs. Chicago: The University of Chicago Press.
Kaxaze, V.
1953 Msazyvrel-sazyvrulis tanamimdevroba otxtavsi. Tbilisis universitetis studenta nasromta krebuli, 6. Tbilisi.
Kiziria, A.
1950 Semasmenlis adgili cinadadebasi. Kartuli ena da literatura skolasi III. Tbilisi.
Kipsidze, Iosif
1914 Grammatika mingrel'skago (iverskago) jazyka. S-Petersburg: Imperatorskaja Akademija Nauk.
Klimov, Georgij A.
1961 "K voprosu porjadke clenov atributivnogo kompleksa v kartvel'skix jazykax", in: Anatolij A. Bokarev (ed.), Voprosy izucenija iberijsko-kavkazskix jazykov. M. Izd-vo AN SSSR, 257-270.
1977 Tipologija jazykov aktivnogo stroja. Moskva: Nauka.
Kvacase, L.
1966 Tanamedrove kartuli enis sintaksi. Tbilisi: Ganatleba.
Palmaitis, Letas K. & Chato Gudjedjiani
1986 Upper Svan: grammar and texts. Vilnius: Vilnius Mokslas Publishers.
Plank, Frans
1990 "Suffix copying as a mirror-image phenomenon", Linguistics 28: 1039-1045.
Pocxua, B.
1962 Sitqvatganlagebistvis kartulsi. Iberiul-kavkasiuri enatmecniereba XIII. Tbilisi.
Sar3vela3e, Zurab
1984 Kartuli saliteraturo enis istoriis sesavali. Tbilisi: Ganatleba.
Tschenkeli, Kita
1958 Einführung in die georgische Sprache. Zürich: Amirani Verlag. B. I.
Vogt, Hans
1971 Grammaire de la langue georgienne. Oslo.
1988 "L'ordre des mots en georgien moderne", in: Hans Vogt, Linguistique caucasienne et armenienne. Oslo: Norwegian University Press.
Zorrell, Franz
1930 Grammatik der altgeorgischen Bibelübersetzung. Roma: Scripta pontificii instituti biblici.

Yakov G. Testelec

Word order in Daghestanian languages1

1. Introduction

The Daghestanian languages belong to the Nakh-Daghestanian, or East Caucasian, branch of the North Caucasian family. Nikolayev & Starostin (1994) have recently clarified the genetic groupings and external relationships of the East Caucasian languages. In particular, they have shown that all the Daghestanian groups belong to the same taxonomic level as the Nakh languages, which have been traditionally viewed as opposed to the whole Daghestanian branch. The classification of the East Caucasian family now looks as follows:

East Caucasian groups:
Nakh group: Batsbi (= Bats), Chechen, Ingush;
Avar-Andian group:
Avar language (a separate subgroup);
Andian (or Andic) subgroup: Akhvakh, Andi, Bagvalal (= Bagulal, Bagvali), Botlikh, Chamalal (= Chamali), Godoberi, Karata, Tindi;
Tsez (or Tsezic, or Dido) group: Bezhta (= Bezhita), Ginukh, Gunzib, Inkhokvari, Khvarshi, Tsez (= Tsezi, Dido);
Lak language (a separate group);
Dargwa (= Dargin) language (actually a group consisting of "dialects" as differentiated as, e.g., the Germanic languages);
Lezgian (or Lezgic) group: Agul, Archi, Budukh, Kryz, Lezgian (= Lezgi), Rutul, Tabasaran, Tsakhur, Udi;
Khinalug language (a separate group).

The term "Daghestanian" has therefore no genetic or classificational value and will be used below only for convenience, since the Nakh languages will not be covered. It must be noted that the differences among the groups listed above are comparable to those among the groups of Indo-European languages. The East Caucasian languages are spoken mostly in the Republic of Dagestan; Chechen and Ingush are spoken in the Chechen and Ingush republics respectively. All three belong to the Russian Federation (Chechnya's attempt to secede since 1991 has led the Chechens to a war with the federal army).

Budukh, Kryz and Khinalug are spoken in Azerbaijan. Speakers of some of the other languages belonging to the Lezgian group inhabit both countries; e.g., after the breakup of the Soviet Union the Lezgian and Tsakhur peoples found themselves divided by the state border. Batsbi is spoken in a single village in Eastern Georgia. Large groups of speakers of East Caucasian languages emigrated from the Caucasus in the 19th century, and their descendants are now scattered across many countries of the Middle East, especially Turkey, where most of the languages have remained in everyday use to this day. The total number of speakers is approximately 2.3 million.

Although there are a few valuable works on the syntax of the East Caucasian languages, little interest has been shown to date in word order issues, especially at the clause level. In most current descriptions the word order is simply characterised as SOV or free, and as modifier > head in the NP. Xaidakov (1975: 90, 99, 114-116; 1977: 238-240) noted some restrictions on word order permutations in simple declaratives in Archi, Tsakhur, and Avar, but the data at my disposal do not confirm them. For his native language Lak, Xaidakov claims that all logically possible permutations are grammatical.

2. Inflectional and other functional categories

The East Caucasian languages display the cluster of typological characteristics which is widespread in Northern Eurasia, i.e., SOV and modifier-head orders (including AdjN), rather consistent agglutination, and rich (mostly suffixal) nominal and verbal paradigms in morphology.² However, unlike the Altaic or Uralic languages, the East Caucasian languages evince ergativity in case marking and agreement. Their morphology is not particularly sensitive to grammatical relations such as subject, object etc.; rather, case endings and agreement patterns encode thematic categories such as Agent, Patient, Experiencer, or complexes of complementarily distributed thematic roles called "hyperroles" by Kibrik (1979). Most of the East Caucasian languages have agreement systems represented by class and number markers in verbs and adjectives. Adjectives agree with head nouns; verbs agree with intransitive subjects and transitive objects (i.e. with NPs in the absolutive case). In several languages, verbs agree in person and number with subjects and objects (if the latter are 1st or 2nd person pronouns). The class agreement markers are glossed below with class numbers: 1CL, 2CL... etc., where nouns denoting men usually belong to the first class, nouns denoting women to the second class, those denoting animals to the third, and inanimates are distributed among several classes. The verb also inflects for tense, aspect (usually durative vs. terminative), various modal categories like mood, interrogativity, evidentiality etc., and negation. Some pragmatic and referential categories like focus or restrictiveness may be marked with affixes or clitics; most descriptions, however, contain little data on this interesting phenomenon.
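The absolutive-controlled class agreement described in this section can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `NP` record, the function names and the noun classes assigned to the sample words are hypothetical choices made for the example, not part of any of the descriptions cited here.

```python
from dataclasses import dataclass


@dataclass
class NP:
    form: str        # surface form of the noun phrase
    case: str        # "ERG", "ABS", "DAT", ...
    noun_class: int  # 1, 2, 3 ... corresponding to the glosses 1CL, 2CL, 3CL


def agreement_controller(arguments):
    """Return the absolutive NP, which controls class agreement on the verb."""
    absolutives = [np for np in arguments if np.case == "ABS"]
    if len(absolutives) != 1:
        raise ValueError("expected exactly one absolutive NP")
    return absolutives[0]


def class_gloss(arguments):
    """Return the class agreement gloss (e.g. '3CL') the verb would carry."""
    return f"{agreement_controller(arguments).noun_class}CL"


if __name__ == "__main__":
    # Transitive clause: the verb agrees with the absolutive object.
    transitive = [NP("was-as", "ERG", 1), NP("t'ex", "ABS", 3)]
    # Intransitive clause: the verb agrees with the absolutive subject.
    intransitive = [NP("kid", "ABS", 2)]
    print(class_gloss(transitive), class_gloss(intransitive))
```

The sketch deliberately ignores person agreement and the hyperrole-based case assignment; it only captures the point that the ergative argument never controls class agreement.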

3. Word order type

The East Caucasian languages all belong to the SOV type and have postpositions only. The unmarked order of all dependent categories, branching and non-branching alike, is pre-head. Different languages of the family permit different degrees of freedom of head-dependent ordering in noun phrases and higher level constituents.

4. Major clausal constituents in declarative clauses

Daghestanian languages admit all six logically possible permutations of Greenberg's categories S, O, and V, as shown in (1) on the basis of Avar, although both V-initial orders are more or less strongly marked, and V-medial orders may have a narrower distribution than the V-final orders (cf. Abdullaev (1971: 27) for Dargwa and Xajdakov (1977: 238-240) for Lak).

(1) Avar³
a. Was-as t'ex c'al-ana.⁴
   boy-ERG book read-AOR
   'The/a boy read the/a book.'
b. T'ex was-as c'al-ana.
   book boy-ERG read-AOR
c. Was-as c'al-ana t'ex.
   boy-ERG read-AOR book
d. T'ex c'al-ana was-as.
   book read-AOR boy-ERG
e. C'al-ana was-as t'ex.
   read-AOR boy-ERG book
f. C'al-ana t'ex was-as.
   read-AOR book boy-ERG
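The six orders in (1) are simply the permutations of three labelled constituents. A minimal Python sketch, assuming the word forms of (1a) and hypothetical helper names, enumerates them and classifies each by verb position (V-initial, V-medial, V-final), the grouping used in the discussion above:

```python
from itertools import permutations

# Word forms taken from Avar example (1a); the labels follow Greenberg's S, O, V.
CONSTITUENTS = {"S": "was-as", "O": "t'ex", "V": "c'al-ana"}


def verb_position(order):
    """Classify an order such as 'SOV' by the position of the verb."""
    return ("V-initial", "V-medial", "V-final")[order.index("V")]


def all_orders():
    """Map each of the six orders of S, O and V to its surface string."""
    return {
        "".join(perm): " ".join(CONSTITUENTS[label] for label in perm)
        for perm in permutations("SOV")
    }


if __name__ == "__main__":
    for order, sentence in all_orders().items():
        print(order, verb_position(order), sentence)
```

The enumeration says nothing about markedness; which of the six strings is felicitous in a given context depends on the pragmatic factors discussed in the following sections.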


Yakov G. Testelec

As noted by Haspelmath (1993: 298), Lezgian exhibits a strong tendency to place the subject before all other arguments. This may be viewed, however, as a particular instance of a more general, pragmatically based tendency for elements bearing given information to precede material bearing new information, since subjects most often represent given information. Haspelmath (1993: 301) observes that "if another argument is given and the subject is new, the subject follows that argument". Such is the case in the third clause of (2), where a xazina 'that treasury' is placed before the subject.

(2)
Cna wa-z xazina hina cünüx-nawa-t'a
we.ERG you-DAT treasury where hide-PRF-CND
luhu-da, wuna pacah-diz a xazina zin-erri
say-FUT you.ERG king-DAT that treasury jinn-PL.ERG
cünüx-na lah.
hide-AOR say:IMP
'We'll tell you where the treasury is hidden, and you tell the king that that treasury was hidden by jinns.'

In some languages, however, an interesting phenomenon of consistent rightmost placement of topics (mostly 'given' subjects which have overt antecedents in the same paragraph) can be observed in narratives. A case in point is that of Petr in the second clause of the Avar example in (3).

(3)
Petri-da rak'-alde sw-ana zin-da-go ab-un
Peter-LOC heart-LOC reach-AOR REFL say-GER
r-uk'-ara-l ra^abi. qw'at'i-we u-n 'ek'ekijalda
PL-AUX-PAST-PL word-PL.ABS outside-LOC go-GER bitterly
Od-ana petr
weep-AOR Peter
'Then Peter remembered the words spoken to him. And Peter went outside and wept bitterly.'

The same phenomenon seems to occur in Archi. Although Kibrik calls the factor which stipulates the right shift logiceskoe vydelenie 'logical marking' and podcerkivanie 'emphasis', which are terms more appropriate for focus, his data on Archi show that it is in fact the topical argument that occupies the rightmost position (Kibrik et al., 1977: 201). Both 'that man' in (4) and 'his wife' in (5) have been previously mentioned.

(4)
Akonniiu xu-li jati werqlu-li jamu bosor.
in the morning rise-GER up go-AOR that man
'Having got up in the morning, that man went.'

(5)
Jamu bosor-mutu jow-mu jamut xarafut heX'ana
that man-COM he-ERG that precious thing
t'al-ow-li ianna-rak.
send-AUX-AOR wife-DIR
'He sent this precious thing for his wife with that man.'

4.1. The verb and its arguments

At the clause level, the order of the verbal arguments relative to each other is sensitive neither to thematic (hyper-)roles nor to grammatical relations. Word order primarily encodes pragmatic functions like topic, focus, givenness, and contrastiveness. The referential characteristics of arguments are involved not directly, but only inasmuch as they determine the pragmatic functions of the arguments. Word order interacts with suprasegmental features like pitch, tone, and phrasal accent, and thus one and the same order may represent different values of pragmatic categories. The focused NP normally occupies the preverbal position, but SOV, as the unmarked option, does not seem to obligatorily render the direct object focal. Instead, OVS order may be used to put the object in focus (postverbal material is usually topical). Compare Archi (6a) and (6b).

(6) Archi⁵
a. Boxlofu-mu xams a-b-c'u.
   hunter-ERG bear kill.AOR
   'The hunter killed a bear.'
b. xams a-b-c'u boxlofu-mu.
   bear kill.AOR hunter-ERG
   'The hunter killed a BEAR.'

On the other hand, SVO, although perfectly grammatical, is not an appropriate means of placing the subject in focus (the superscript * marks inappropriateness); normally OSV order is used for this purpose, as in (7b) and (8).

(7) Archi
a. *Boxlofu-mu a-b-c'u xams.
   hunter-ERG kill.AOR bear
b. xams boxlofu-mu a-b-c'u.
   bear hunter-ERG kill.AOR
   'The HUNTER killed the bear.'

(8) Rutul
Si "Hirceqana-ra jiwxi-ri.
bear hunter-ERG kill-AOR
'The HUNTER killed the bear.'

VSO order is marked and restricted in use. It is used, for example, if the action denoted by the verb is contrasted with some other action, as in (9), a situation that occurs infrequently.

(9) Karata
C'ar-da ida siw iso-l-te.
drink-GER AUX.PRS water cat-ERG-EMPH
'The cat IS DRINKING the milk.' (at first, it didn't want to...)

VSO order may become more acceptable if some constituent other than S and O occupies the leftmost position, as is the case in (10b).

(10) Andi
a. ?C'inni-du moc'i-sdi insu-w-gu homolo^i.
   hit-PERF boy-ERG his-1CL-REFL friend
   'The boy hit his friend.'
b. koiisedu c'inni-du moc'i-sdi insu-w-gu homolo^i
   recently hit-PERF boy-ERG his-1CL-REFL friend
   'Not long ago the boy hit his friend.'

In languages with subject-object person agreement, a verb bearing overt agreement affixes can freely occur sentence-initially. Overt affixes are used for the 1st and 2nd persons; since the 3rd person agreement markers are zero, 3rd person verb forms cannot occupy the initial position. Compare the semigrammatical (11b) with the grammatical (12b).

(11) Tabasaran
a. Ba-li kat'abx-nu >lwan.
   boy-ERG throw-AOR-3SBJ stone
   'The boy threw a stone.'
b. ??Kat'abx-nu ttwan ba-li.

(12)
a. Uzu uwu-z jab-ura-za.
   I you-DAT beat-PRS.-1SG.SBJ
   'I am beating you.'
b. Jab-ura-za uwu-z uzu.
   beat-PRS.-1SG.SBJ you-DAT I

Contrasted topics occur sentence-initially, as is common in languages with free word order. Observe that (13b) is a felicitous answer to the question in (13a), whereas (13c) is not.

(13) Itsari dialect of Dargwa
a. Ce barq'-eb-ni tupang-licila-ra, kul-licila-ra.
   what happen-AOR-INTERR gun-LOC-CNJ skin-LOC-CNJ
   'What happened to the gun and the skin?'
b. Tupang milic'aba-l berq-ib, kul mahamma-l
   gun police-ERG take-AOR skin Muhammad-ERG
   cinni bat-ub.
   himself.ERG hold-AOR
   'The police took the gun, and Muhammad kept the skin himself.'
c. #Milic'aba-l tupang berq-ib, mahamma-l
   police-ERG gun take-AOR Muhammad-ERG
   cinni kul bat-ub.
   himself.ERG skin hold-AOR

This also holds if the finite verb is a contrasted topic, as in (14a), although the cleft construction, in which the initial position of the finite verb is avoided (14b), is also available. (Note that the auxiliary enclitic -caw also serves as the copula enclitic in the cleft construction.)

(14) Itsari dialect of Dargwa
a. Kez-ib-caw surrati-ciw mahamma, kejc-ur-caw 'ali.
   sit-PRT-AUX picture-LOC Muhammad stand-PRT-AUX Ali
   'The one who is sitting in the picture is Muhammad, the one who is standing is Ali.'
b. Surrati-ciw kez-ib-ci mahamma-caw, kejc-ur-ci
   picture-LOC sit-PRT-RESTR Muhammad-COP stand-PRT-RESTR
   'ali-caw.
   Ali-COP

In Tsez, where verb-initial orders are more consistently avoided, the cleft option is the only one available in such cases.

(15) Tsez
a. *QI'ida ic-asi mahama, hec'k'er ic-asi 'ali
   sitting stay-RESULT Muhammad standing stay-RESULT Ali
b. Ql'ida ac-iru mahama jol, hec'k'er ac-iru 'ali
   sitting stay-PRT Muhammad be.PRS standing stay-PRT Ali
   jot
   be.PRS
   'The one who is sitting is Muhammad, the one who is standing is Ali.'

So far, we have only dealt with synthetic verb constructions, where the head of the clause (the tense/aspect/mood finite category) is attached to the verbal stem. In analytical, or auxiliary verb, constructions, it seems more appropriate to take the auxiliary as the V (in Greenberg's terms, though not in terms of terminal categories) and not the main verb. Note that whereas in synthetic constructions the verb immediately follows the focused constituent, in constructions with auxiliaries, such as the one in (16), it is the auxiliary rather than the main verb which occupies this position.

(16) Archi (Kibrik et al., 1977: 201):
Pulannut mazdaj e-b-t-uli tipu
some place.LOC put-PRT three
zawhar b-i.
precious thing 3CL-AUX
'In some place, THREE PRECIOUS THINGS lie.'

The same phenomenon can also be observed in the question-answer pairs in (17) and (18) from Chamalal, where the present form of the auxiliary verb combined with the perfect participle denotes the present perfect tense.

(17) Chamalal
a. Lede rasul c'in?
   who.ERG Rasul beat.INTERR
   'Who has beaten Rasul?'
b. Ihwa-d ida rasul c'Tn.
   shepherd-ERG AUX.PRS Rasul beat.PRF
   'The SHEPHERD has beaten Rasul.'

(18)
a. Ihwa-d ime c'in?
   shepherd-ERG who beat.INTERR
   'Whom has the shepherd beaten?'
b. Ihwa-d rasul ida c'Tn.
   shepherd-ERG Rasul AUX.PRS beat.PRF
   'The shepherd has beaten RASUL.'

Here again, as in the constructions containing no auxiliary, OV order is the unmarked one. The conflict between two unmarked orders for immediate precedence, i.e. between Object-Finite Verb and Main Verb-Aux, seems to be resolved in favour of the former. For instance, in the examples in (19) from Chamalal, where the whole proposition is in the scope of the focus, the unmarked order is felicitous. (19b, c) show that in the unmarked case the auxiliary immediately follows the direct object.

(19)
a. Ede buX?
   what happen.INTERR
   'What happened?'
b. Imu-d 'ali ida wut l.
   father-ERG Ali AUX.PRS lose.PRF
   'Father has lost Ali.'
c. ?Imu-d 'ali wut l ida.
   father-ERG Ali lose.PRF AUX.PRS

Unlike the synthetic forms, however, the auxiliary cannot occur clause-initially. Note the ungrammaticality of (20b, c).

(20) Avar
a. Di-ca t'ex c'al-ule-b b-ugo.
   I-ERG book read-PRS-PRT.3CL 3CL-AUX.PRS
   'I am reading a book.'
b. *B-ugo di-ca t'ex c'al-ule-b.
   AUX.PRS I-ERG book read-PRS-PRT.3CL
c. *B-ugo c'al-ule-b di-ca t'ex.
   AUX.PRS read-PRS-PRT.3CL I-ERG book
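The two generalisations illustrated in (16)-(20), namely that the focused constituent immediately precedes the finite element and that an auxiliary does not stand clause-initially, can be sketched as a pair of simple checks. This is a rough heuristic under stated assumptions: clauses are represented as flat word lists, the function names are hypothetical, and all-focus sentences such as (19b), where the preverbal constituent is not a narrow focus, are not distinguished.

```python
def focused_constituent(words, finite):
    """Return the word directly before the finite element, or None if there is none."""
    position = words.index(finite)
    return words[position - 1] if position > 0 else None


def auxiliary_is_initial(words, auxiliary):
    """True if the auxiliary is the first word of the clause."""
    return bool(words) and words[0] == auxiliary


if __name__ == "__main__":
    # Chamalal (17b) and (18b): the auxiliary 'ida' follows the focused NP.
    print(focused_constituent(["Ihwa-d", "ida", "rasul", "c'Tn"], "ida"))
    print(focused_constituent(["Ihwa-d", "rasul", "ida", "c'Tn"], "ida"))
    # Avar (20a) versus the ungrammatical (20b).
    print(auxiliary_is_initial(["Di-ca", "t'ex", "c'al-ule-b", "b-ugo"], "b-ugo"))
    print(auxiliary_is_initial(["B-ugo", "di-ca", "t'ex", "c'al-ule-b"], "B-ugo"))
```

Treating a constituent as a single word is a simplification; a multi-word focus such as 'three precious things' in (16) would need phrase-level input.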

In constructions with ditransitive verbs, the unmarked order is S-IO-O-V; S-O-IO-V order occurs if the indirect object is in focus, as in (21b).

(21) Archi (Alekseev, p.c.)
a. Ucitel-li laha-s q'onq' λο.
   teacher-ERG child-DAT book give.AOR
   'The teacher gave the book to the child.'
b. Ucitel-li q'onq' laha-s λο.
   teacher-ERG book child-DAT give.AOR
   'The teacher gave the book to the CHILD.'

Verb-initial order with intransitives (both monovalent and bivalent) seems more acceptable than with transitive verbs; cf. Tabasaran (11b) above and (22b), (23b) below.

(22) Tabasaran
a. Bali-z aba kun-du.
   boy-DAT father love-PRS
b. Kun-du bali-z aba.
   love-PRS boy-DAT father
   'The boy loves his father.'

(23)
a. Baj "HIa"H-ura.
   boy walk-PRS
b. "HIa"H-ura baj.
   walk-PRS boy
   'The boy is walking.'

VS order may be marked for modality. For instance, in Archi, VS implies some kind of emotional attitude of the speaker toward the situation, either positive or negative (M. Alekseev, p.c.):

(24) Archi
a. Lo qlwa.
   boy come.AOR
   'The boy came.'
b. QIwa lo.
   come.AOR boy
   ('At last...', or: 'Unfortunately...') 'the boy came.'

Sometimes the acceptability of VS order is lexically restricted, but the relevant restrictions are yet to be investigated in more detail. VS is apparently more acceptable with "presentative" verbs like 'appear', 'come', 'arise' and patientive ("unaccusative") verbs, and less acceptable with some (not all) agentive ("unergative") intransitives.

(25) Chamalal
a. Mil bahinn-aq ida.
   sun arise-GER AUX.PRS
   'The sun is rising.'
b. Bahinn-aq ida mhV
   arise-GER AUX.PRS sun

(26)
a. Ima vi'a.
   father come.AOR
   'Father came.'
b. Vi'a ima.
   come.AOR father

(27)
a. Ima halt'i-d.
   father work-AOR
   'Father worked.'
b. ??Halt'i-d ima.
   work.AOR father

With stative predicates, existential and locative constructions are distinguished by the relative order of the stative Theme and the Location adverbial. Compare Tsez (28a) and Botlikh (29a), which are locative constructions, with (28b) and (29b), which are existential.

(28) Tsez (Polinsky 1995)
a. Rutku hon-xo igo joi.
   house mountain-ADESS near be.PRS
   'The house is near the mountain.'
b. Hon-xo igo "üutku jot.
   mountain-ADESS near house be.PRS
   'There is a house near the mountain.'

(29) Botlikh
a. Jesik'wa adam ida isi.
   female person be.PRS at home
   'The woman is at home.'
b. Isi jesik'wa adam ida.
   at home female person be.PRS
   'There is a woman inside (= in the house).'

Heavy arguments tend to be placed before shorter arguments in verb-final clauses, e.g.:

(30) Lezgian (Haspelmath 1993: 302)
Abur muq'ufda-ldi k'el-un wa ezber-un za kwe-z
they skill-LOC read-MSD and cram-MSD I you-DAT
k'ewelaj mesl t qalur-zawa.
strongly advice show-PRS
'I advise you strongly to read and study them carefully.'

This is also shown by the examples with a heavy subject in Tsakhur (31): whereas (31a) and (31b) are equally acceptable, the order in (31d) is much less felicitous than that in (31c).

(31) Tsakhur
a. Bisi-n n,ak il,ob"Huna.
   cat-ERG milk drink.PAST
b. N,ak bisi-n il,ob"Hu-na.
   milk cat-ERG drink.PAST
   'The cat drank the milk.'
c. Co3-e qabi-ni bisi-n n,ak il,ob>Iuna.
   brother-ERG bring-PRT cat-ERG milk drink.PAST
   'The cat that the brother brought drank the milk.'
d. #N,ak co3-e qabi-ni bisi-n il,ob"Huna.
   milk brother-ERG bring-PRT cat-ERG drink.PAST

4.2. Adjuncts and adverbials

Adverbials which occur inside an NP behave like other dependents, i.e. they normally precede the head noun, e.g.:

(32) Chamalal
a. St'ol-c' axis χο§3 guda.
   table-LOC on book give.IMP
   'Give me the book which is on the table!'
b. Guda st'ol-c' a/is %osa.
   give.IMP table-LOC on book
c. ?Guda xosa stol-c' axis.
   give.IMP book table-LOC on
d. ?Xosa stol-c' a/is guda.
   book table-LOC on give.IMP

Adverbials dependent on adjectives must precede them (whereas the adjectives themselves may precede or follow their head nouns):

(33) Bezhta
a. JiXa bercinab kid j-öq'o-jo.
   very nice girl 2CL-come-AOR
b. Kid jiXa bercinab j-öq'o-jo.
   girl very nice 2CL-come-AOR
c. *Bercinab jiXa kid j-öq'o-jo.
   nice very girl 2CL-come-AOR
d. *Kid bercinab jiXa j-öq'o-jo.
   girl nice very 2CL-come-AOR
   'A very nice girl came.'

Adverbials of setting may occupy any position, whereas manner adverbials tend not to precede the subject (in the subject-first order):

(34) Chamalal
a. (taq) oddi (taq) ihi ida (taq) X'e (taq)
   (yesterday) they.ERG (yesterday) build.PRF AUX.PRS (yesterday) bridge (yesterday)
   'They built the bridge yesterday.'
b. (*t'obad) oddi (t'obad) ihi ida (t'obad) X'e (t'obad)
   (completely) they.ERG (completely) build.PRF AUX (completely) bridge (completely)
   'They built the bridge completely.'

In line with this, Haspelmath notes for Lezgian (1993: 302) that whereas setting adverbials tend to be placed clause-initially, adverbials of manner more frequently occur in what he calls 'clause-medial' position.

However, some other adverbials in Chamalal, e.g. axwis'i 'with diligence, zealously' or bisna 'probably', behave differently in that they normally immediately precede the auxiliary, i.e. they occupy the focus position. This class of adverbials may therefore be called "focus adverbials". Some relevant examples are given in (35) and (36) below.

(35) Chamalal
a. Mac'i-d axwis'i ida ka"Hat qwad-aq.
   boy-ERG diligently AUX.PRS letter write-PRT.PRS
   'The boy is writing the letter diligently.'
b. ?Mac'i-d ka"Hat ida a/wis'i qwad-aq.
   boy-ERG letter AUX.PRS diligently write-PRT.PRS
c. *Mac'i-d ida axwis'i ka"Hat qwad-aq.
   boy-ERG AUX.PRS diligently letter write-PRT.PRS
d. *A%wis'i mac'i-d ka^at qwad-aq ida.
   diligently boy-ERG letter write-PRT.PRS AUX.PRS
e. ??Mac'i-d ka^at axwis'i qwad-aq ida.
   boy-ERG letter diligently write-PRT.PRS AUX.PRS

(36)
a. Mac'i-d qwa-la bisna ida ka"Hat.
   boy-ERG write-PRT.FUT probably AUX.PRS letter
   'The boy will probably write the letter.'
b. Mac'i-d ka>Iat qwa-la bisna ida.
   boy-ERG letter write-PRT.FUT probably AUX.PRS
c. Mac'i-d bisna ida qwa-la kailat.
   boy-ERG probably AUX.PRS write-PRT.FUT letter
d. *Bisna mac'i-d qwa-la ida ka^at.
   probably boy-ERG write-PRT.FUT AUX.PRS letter
e. *Mac'i-d bisna ida qwa-la ka^at.
   boy-ERG probably AUX.PRS write-PRT.FUT letter
f. ?Mac'i-d qwa-la ida bisna ka at.
   boy-ERG write-PRT.FUT AUX.PRS probably letter

In some other languages like Bezhta, Karata and Tsakhur, adverbials can occupy various positions apart from the position between the main verb and the auxiliary. This seems to hold true for all semantic classes of adverbials, as the examples in (37) and (38) illustrate.

(37) Bezhta
a. (Hajhaj) öz-di (hajhaj) ka~Haj (hajhaj) cäx-cäs (*hajhaj) gej (hajhaj).
   (probably) boy-ERG (probably) letter (probably) write-PRT.PRS (probably) AUX.PRS (probably)
   'Probably the boy is writing the letter.'
b. (Mühkango) öz-di (mühkango) ka>Iaj (mühkango) cä%-cäs (*mühkango) gej (mühkango).
   (diligently) boy-ERG (diligently) letter (diligently) write-PRT.PRS (diligently) AUX.PRS (diligently)
   'The boy is writing the letter diligently.'
c. (Ist'oli-jaX'alX'a) öz-di (ist'oli-jaX'alX'a) ka aj (ist'oli-jaX'alX'a) cäx-cäs (*ist'oli-jaX'alX'a) gej (ist'oli-jaX'alX'a).
   (table-LOC) boy-ERG (table-LOC) letter (table-LOC) write-PRT.PRS (table-LOC) AUX.PRS (table-LOC)
   'The boy is writing the letter (while sitting) at the table.'

(38) Tsakhur
a. (Helbette) gad-e (helbette) ka^az (helbette) ok'a-n (*helbette) wod (helbette).
   (of course) boy-ERG (of course) letter (of course) write-PRT (of course) AUX.PRS (of course)
   'Of course the boy is writing the letter.'
b. (Axareqame) gad-e (axareqame) ka^Iaz (axareqame) ok'a-n (*axareqame) wod (axareqame).
   (to the end) boy-ERG (to the end) letter (to the end) write-PRT (to the end) AUX.PRS (to the end)
   'The boy is writing the letter completely.' (lit.: '... to the end')

5. Subordinate clauses

Subordinate clauses in Daghestanian languages are mostly participial, infinitival, absolutive and converb constructions headed by non-finite verb forms. (For details on the placement of relative clauses see section 7 below.) Normally adjunct and argument clauses occur in the same positions as NP, PP and adverbial adjuncts and arguments.

(39) Karata
a. How k'use ust'ul-lic'o xigi ka"Hat qwa-raxa.
   he sit.AOR table-LOC behind [letter write-INF]
   'He sat down at the table in order to write a letter.'
b. Ka"Hat qwa-raxa how k'use ust'ul-lic'o xigi.
   [letter write-INF] he sit.AOR table-LOC behind

However, argument and adjunct clauses are most often heavy constituents and therefore tend to precede all other elements of the superordinate clause, although they may be placed in the centre of the superordinate clause as well (Haspelmath 1993: 375). Word order in subordinate clauses is stricter than in main clauses. An important difference is that whereas V-initial order is admissible in finite clauses, it is ungrammatical in subordinate clauses. Note the ungrammaticality of (40b).

(40) Bezhta
a. Ged-i χο-na m- q-na hi χυλο-β.
   cat-ERG meat-CNJ 3CL-eat-GER milk drink-PRS
   'The cat, having eaten the meat, is drinking the milk.'
b. *Ged-i m-iiq-na χο-na hi χυλο-5.
   cat-ERG 3CL-eat-GER meat-CNJ milk drink-PRS

6. Different sentence types

With imperatives and interrogatives, no special word orders are employed. Yes/no questions are marked morphologically in the verb and/or by a special intonational contour. Verb-initial orders, however, seem to be less marked in yes/no interrogatives:

(41) Tabasaran
a. Ba-li kitab u-b-x-uri ajan?
   boy-ERG book read-PRT AUX.INTERR
b. U-b-x-uri ajan ba-li kitab?
   read-PRT AUX.INTERR boy-ERG book
   'Does the boy read the book?'

Some interrogative particles, such as hani 'isn't it?' in Archi, favour verb-initial order.

(42) Archi (Alekseev, p.c.)
a. Lo qlwa-ra?
   boy come.AOR.INTERR
   'Did the boy come?'
b. Lo qlwa hani // qlwa hani lo?
   boy come.AOR INTERR
   'The boy came, didn't he?'

In WH-interrogatives, of all the six orders, those in which the WH-phrase immediately follows the verb may be ungrammatical, as is the case in Archi (Alekseev, p.c.) and Tabasaran.

(43) Tabasaran
a. Fu u-b-x-ura rasul-i?
   what read-PRS Rasul-ERG
   'What does Rasul read?'
b. Fu rasul-i u-b-x-ura?
   what Rasul-ERG read-PRS
c. Rasul-i fu u-b-x-ura?
   Rasul-ERG what read-PRS
d. U-b-x-ura rasul-i fu?
   read-PRS Rasul-ERG what
e. *Rasul-i u-b-x-ura fu?
   Rasul-ERG read-PRS what
f. *U-b-x-ura fu rasul-i?
   read-PRS what Rasul-ERG

In many (probably all) of the Avar-Andian languages sentences containing WH-questions and other focus markers cannot have finite verb forms; instead, the sentences are headed by the participle. In this case the whole sentence, like a genuine participial construction, does not permit verb-initial order.

(44) Karata
a. Hedol c'alda idja-b was-asul?
   what write.PRT AUX.PRT-3CL boy-ERG
   'What does the boy write?'
b. Was-asul c'alda idja-b hedol?
   boy-ERG write.PRT AUX.PRT-3CL what
c. *C'alda idja-b hedol was-asul?
   write.PRT AUX.PRT-3CL what boy-ERG
d. *C'alda idja-b was-asul hedol?
   write.PRT AUX.PRT-3CL boy-ERG what

7. The noun phrase

In all Daghestanian languages, the unmarked order of all modifier categories is anteposition; postposition of a modifier is either ungrammatical or denotes that the postposed modifier is focused, contrasted, or restrictive.

(45) Bezhta
a. Lana suk'o o%q'-ojo.
   three man come-AOR
   'Three men came.'
b. Suk'o iana oq'o-jo.
   man three come-AOR
   'THREE men came.' (not two and not four...)

(46) Rutul
a. Byt'rad xydyldy ji-r-q'-yri.
   nice woman come

Classical Armenian -> Armenian:  0.0  -> 0.56
Classical Greek -> Greek:        0.34 -> 0.4
Latin -> Romance:                0.14 -> 0.45 (mean; n = 11)
Gothic -> Germanic:              0.4  -> 0.14 (mean; n = 12)
-> Slavic:                       0.78 -> 0.25 (mean; n = 10)
                                 0.33 -> 0.29

There is a slight overall tendency for these languages to become less consistent. However, there are strong differences per case. If we compare these figures with those for flexibility in note 8, there is a tendency for consistency to move in the opposite direction from flexibility. Germanic seems to be the only exception:

            flexibility   consistency
Armenian    -0.4          +0.56
Greek        0.0          +0.07
Romance     -0.66         +0.31
Germanic    -0.28         -0.26
Slavic      +0.06         -0.53

In section 5, I will come back to this observation.

13. If we look only at the parameters Adj, Rel, Dem, Num and Gen, and restrict ourselves to the postpositional languages (40 out of 47 languages; 13 out of 17 patterns), then we find that, for these languages, Hawkins' (1983) Postpositional Noun Modifier Hierarchy (PoNMH) holds, i.e. they conform to (i):

(i) PoNMH: Post -> ((AdjN OR RelN -> DemN AND NumN) AND (DemN OR NumN -> GenN))

14. Again, looking at the parameters Adj, Rel, Dem, Num and Gen only, and restricting ourselves to the prepositional HM languages, we find that Hawkins' (1983) Prepositional Noun Modifier Hierarchy (PrNMH) holds, i.e. they conform to (i):

(i) PrNMH: Prep -> ((NDem OR NNum -> NAdj) AND (NAdj -> NGen) AND (NGen -> NRel))
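Implicational hierarchies of this kind can be checked mechanically against a language's order parameters. The following Python sketch is offered only as an illustration: the dictionary representation, the function names and the two sample languages are hypothetical, and the bracketing assumes that each arrow takes the whole disjunction to its left as antecedent and the whole conjunction to its right as consequent.

```python
def implies(antecedent, consequent):
    """Material implication: false only if the antecedent holds and the consequent fails."""
    return (not antecedent) or consequent


def satisfies_ponmh(lang):
    """Check the Postpositional Noun Modifier Hierarchy for one language."""
    if not lang["Post"]:
        return True  # the hierarchy makes no claim about non-postpositional languages
    first = implies(lang["AdjN"] or lang["RelN"], lang["DemN"] and lang["NumN"])
    second = implies(lang["DemN"] or lang["NumN"], lang["GenN"])
    return first and second


def satisfies_prnmh(lang):
    """Check the Prepositional Noun Modifier Hierarchy for one language."""
    if not lang["Prep"]:
        return True  # the hierarchy makes no claim about non-prepositional languages
    return (
        implies(lang["NDem"] or lang["NNum"], lang["NAdj"])
        and implies(lang["NAdj"], lang["NGen"])
        and implies(lang["NGen"], lang["NRel"])
    )


if __name__ == "__main__":
    # Hypothetical postpositional language with all modifiers prenominal.
    consistent = dict(Post=True, AdjN=True, RelN=True, DemN=True, NumN=True, GenN=True)
    # Hypothetical violation: prenominal demonstrative but postnominal genitive.
    violating = dict(consistent, GenN=False)
    print(satisfies_ponmh(consistent), satisfies_ponmh(violating))
```

A floating parameter, such as the Gen/N order mentioned for Latin, does not fit a single Boolean value and would need a richer representation.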


15.

16.

17.

18.

Dik Bakker

The claim about NNum trivially holds since NNum does not occur in the sample. Arguably, Latin is a counterexample to the claim about NAdj since the position of the genitive is floating while the basic position of the adjective is postnominal. When we include the (7) prepositional MH languages, which is in harmony with the PrNMH, the hierarchy still holds, since all languages only have the relative clause in postnominal position. Of the four languages that are missing from the HM column, two (Latin and Classical Greek) have a floating Gen/N parameter. Norwegian and Faroese (HM) are the real exceptions, although NG is an alternative. The predictive force of the genitive order was noticed earlier by Jepson (1989), among others. Interestingly, as Georg Bossong pointed out to me, W. Schmidt, in Die Sprachfamilien und Sprachenkreise der Erde from 1926, already stresses the importance of the genitive as a typological parameter for word order, albeit for other reasons. If we were to compute the type of a language on the basis of an equal contribution from both levels, taking the PP as a third level, we would get the following distribution of types (in brackets the figures based on the 10 variables):

MH: 41 languages (47)
HM: 42 languages (33)
none: 3 languages (6)
mean consistency: 0.73 (0.52)

Byelorussian and Russian are borderline cases, reducing the number of clear cut cases to 7. 37 languages have alternative patterns of the 'wrong' type, for which the majority of the modifiers are on the other side of the head than is the case for the basic type. Such patterns get a negative score in the calculations. If we take the absolute value of such scores, then 16 rather than 9 languages would have a higher average consistency value for the non-basic patterns.
If we calculate the theoretically possible patterns of a language, including the non-grammatical ones, and the fraction of patterns that would have higher consistency than the basic pattern, and compare the theoretically possible fraction of more consistent alternative patterns with the actual fraction, it turns out that only 18 languages in S 86 score worse than the theoretical fraction. Thus, in the vast majority of the languages the number of patterns that have a higher consistency than the basic pattern is (much) lower than it could be were all alternative patterns allowed. There is a remarkable difference between the two variants of Komi on all three order parameters that were introduced here; cf.

              Komi-Zyrian   Komi-Permyak
Flexibility:  0.7           0.4
Consistency:  0.11          0.78
Consequence:  0.34          1.0

A possible cause of this is the greater influence of Russian on Komi-Zyrian.

19. There are none when the standardized absolute value of the independent variable (either flexibility or consistency) is put at 3, an often assumed default value. Only at the lower value of 2 do the first outliers appear. These are the ones discussed below.

20. This could be due to influence from Russian. It is not clear to me whether precisely these languages should be more sensitive to influence by Russian than the other Caucasian languages, or should have a tendency to more flexibility in general.

Flexibility and consistency in word order patterns


21. If we include the inflexible languages, the negative correlation increases to a more significant -0.36.

22. Of course, these predictions are derived from a set of hypotheses that themselves are partially based on correlations in the same set of languages on which they are tested, and inspired by these correlations. However, the predictions are formulated in terms of groupings that were not established beforehand. Also, in this form, and the reformulation below, they may be tested on other samples, preferably representative for the world's languages.

References

Aristar, Anthony
1991 "On diachronic sources and synchronic pattern: an investigation into the origin of linguistic universals", Language 67: 1-33.
Bakker, Dik
1994 Formal and Computational Aspects of Functional Grammar and Language Typology. Amsterdam: IFOTT.
Bakker, Dik & Anna Siewierska
1991 "A Database System for Language Typology". EUROTYP Working Papers II 2.
1992 "A contribution to the problem of constituent order explanation". EUROTYP Working Papers II 5, 127-144.
Dik, Simon C.
1989 The Theory of Functional Grammar. Part I. The structure of the clause. Dordrecht: Foris.
Dryer, Matthew S.
1992 "The Greenbergian word order correlations", Language 68: 81-138.
this volume "Aspects of word order in the languages of Europe".
Givón, Talmy
1988 "The pragmatics of word order: predictability, importance and attention", in: Michael Hammond et al. (eds.), Studies in Syntactic Typology. Amsterdam: John Benjamins, 243-284.
Greenberg, Joseph H.
1963 "Some universals of grammar with particular reference to the order of meaningful elements", in: Joseph H. Greenberg (ed.), Universals in Language. Cambridge (Mass.): MIT Press, 58-90.
1966 Language Universals. The Hague: Mouton.
Hawkins, John A.
1983 Word Order Universals. New York: Academic Press.
Jepson, J.
1989 "Holistic models of word order typology", Word 40: 297-314.
Lehmann, Winfred P.
1973 "A structural principle of language and its implications", Language 49: 47-66.
Nichols, Joanna
1992 Linguistic Diversity in Space and Time. Chicago: University of Chicago Press.
Rijkhoff, Jan
this volume "Order in the noun phrase in the languages of Europe".
Rijkhoff, Jan, Dik Bakker, Kees Hengeveld & Peter Kahrel
1993 "A method of language sampling", Studies in Language 17: 169-203.
Ruhlen, Merritt
1987 A Guide to the World's Languages. Vol. 1: Classification. London: Edward Arnold.
Siewierska, Anna
1988 Word Order Rules. London: Croom Helm.
this volume "Variation in major constituent order: a global and a European Perspective".
Siewierska, Anna & Dik Bakker
1996 "The distribution of subject and object agreement and word order type". Studies in Language 20: 115-161.
Tallerman, Maggie
this volume "Word order in Celtic".
Tesnière, L.
1953 Esquisse d'une Syntaxe Structurale. Paris: Klincksieck.
Testelec, Yakov
this volume "Word order variation in some SOV languages of Europe".
Vennemann, Theo
1972 "Analogy in generative grammar, the origin of word order", in: L. Heilman (ed.), Proceedings of the Eleventh International Congress of Linguistics Vol. 2. Bologna: Il Mulino, 79-83.
Vilkuna, Maria
this volume "Word order in European Uralic".

Appendix A: The languages in the S 86 sample

Altaic (7):
  Southern Turkic (1): Turkish.
  Western Turkic (3): Bashkir, Crimean Tatar, Kumyk.
  Central Turkic (1): Nogai.
  Bolgar (1): Chuvash.
  Mongolian (1): Kalmyk.
Caucasian (19):
  North-West (2): Abaza, Abkhaz.
  Nakh (2): Chechen, Ingush.
  Avaro-Andi (5): Andi, Avar, Botlikh, Chamalal, Godoberi.
  Dido (2): Dido, Hinukh.
  Lak-Dargwa (2): Dargwa, Lak.
  Lezgian (6): Archi, Khinalug, Lezgi, Rutul, Tabasaran, Tsakhur.
Kartvelian (1): Georgian.
Indo-European (47):
  Indo-Iranian (1): Kirmanji.
  Armenian (2): Classical Armenian, Armenian.
  Germanic (12): Danish, Dutch, English, Faroese, Frisian, German, Gothic, Icelandic, Luxembourgeois, Norwegian, Swedish, Yiddish.
  Albanian (1): Albanian.
  Greek (2): Classical Greek, Greek.
  Latino (1): Latin.
  Romance (11): Catalan, French, Friulian, Galician, Italian, Ladin, Portuguese, Romansch, Rumanian, Sardinian, Spanish.
  Baltic (2): Latvian, Lithuanian.
  Slavic (10): Bulgarian, Byelorussian, Lower Sorbian, Macedonian, Polish, Russian, Serbo-Croatian, Slovak, Slovene, Upper Sorbian.
  Celtic (5): Breton, Cornish, Irish, Manx, Scottish Gaelic.
Afro-Asiatic (1): Maltese.
Isolates (1): Basque.
Uralic (10):
  Permic (3): Komi-Zyrian, Komi-Permyak, Udmurt.
  Volgaic (1): Mordvin.
  Saamic (1): Northern Saami.
  Baltic Finnic (5): Estonian, Finnish, Ingrian, Livonian, Vepsian.

Appendix B: The flexibility and consistency values for the individual languages

Note that flexibility is defined on the basis of 10 word order variables, and not only on the basis of Subject/Verb/Object order, as in Siewierska (this volume).

Language             Flexibility   Consistency   Consequence
Abaza                .20           .80           .67
Abkhaz               .50           .60           .81
Albanian             .50           .56           .97
Andi                 .60           1.00          1.00
Archi                .10           1.00          1.00
Armenian             .40           .56           .94
Avar                 .60           1.00          1.00
Bashkir              .10           1.00          1.00
Basque               .60           .56           .89
Botlikh              .60           1.00          1.00
Breton               .20           .60           1.00
Bulgarian            .60           .11           .67
Byelorussian         .70           .20           .50
Catalan              .40           .40           .67
Chamalal             .60           1.00          1.00
Chechen              .10           1.00          1.00
Chuvash              .10           1.00          1.00
Classical Armenian   .80           .00           1.00
Classical Greek      .60           .33           .65
Cornish              .20           .60           1.00
Crimean Tatar        .20           1.00          1.00
Danish               .30           .00           1.00
Dargwa               .50           1.00          1.00
Dido                 .10           1.00          1.00
Dutch                .40           .11           .67
English              .40           .00           1.00
Estonian             .40           .20           .27
Faroese              .40           .20           .93
Finnish              .60           .11           .49
French               .10           .40           1.00
Frisian              .30           .11           .87
Friulian             .10           .40           1.00
Galician             .20           .40           .67
Georgian             .50           .75           1.00
German               .40           .11           .67
Godoberi             .40           1.00          1.00
Gothic               .70           .40           .94
Greek                .60           .40           .65
Hinukh               .00           1.00          1.00
Icelandic            .40           .40           1.00
Ingrian              .50           .11           .29
Ingush               .00           1.00          1.00
Irish                .20           .60           .71
Italian              .30           .40           .86
Kalmyk               .10           1.00          1.00
Khinalug             .00           1.00          1.00
Kirmanji             .10           .33           .67
Komi-Zyrian          .70           .11           .34
Komi-Permyak         .40           .78           1.00
Kumyk                .10           1.00          1.00
Ladin                .10           .40           1.00
Lak                  .00           1.00          1.00
Latin                .90           .14           .50
Latvian              .50           .00           1.00
Lezgi                .10           1.00          1.00
Lithuanian           .30           .11           .14
Livonian             .40           .20           .27
Lower Sorbian        .40           .33           1.00
Luxembourgeois       .40           .11           .67
Macedonian           .50           .20           .81
Maltese              .30           .60           .86
Manx                 .30           .78           .87
Mordvin              .60           .11           .49
Nogai                .10           1.00          1.00
Northern Saami       .40           .11           .33
Norwegian            .40           .20           .93
Polish               .60           .20           .65
Portuguese           .30           .40           .86
Romansch             .10           .40           1.00
Rumanian             .50           .78           .97
Russian              .70           .20           .50
Rutul                .30           1.00          1.00
Sardinian            .20           .60           1.00
Scottish Gaelic      .30           .56           .52
Serbo-Croatian       .50           .20           .81
Slovak               .50           .20           .81
Slovene              .50           .40           .97
Spanish              .30           .40           .43
Swedish              .40           .00           1.00
Tabasaran            .40           1.00          1.00
Tsakhur              .30           1.00          1.00
Turkish              .20           1.00          1.00
Udmurt               .50           .56           .81
Upper Sorbian        .60           .50           1.00
Vepsian              .80           .11           .50
Yiddish              .50           .00           1.00

Beatrice Primus

The relative order of recipient and patient in the languages of Europe

1. Introduction¹

Typological word order studies are carried out in terms of linearization patterns that are commonly referred to as the basic order of a language. Within the context of typological studies the term "basic order" is usually introduced by the following working definition or a very similar variant thereof:²

(1)

Basic order is the order that occurs in stylistically neutral, independent, indicative clauses with full noun phrase participants, where the subject is definite, agentive and human, the object is a definite semantic patient and the verb represents an action, not a state or an event.

This is an explicit or implicit working definition for many influential publications on universals of argument position such as Greenberg (1963), Hawkins (1983), Tomlin (1986), and Siewierska (1988). The working definition (1) is restricted in application to agents and patients and interprets subject and object as those syntactic functions that express, in general, an agent and a patient, respectively, of a basic transitive verb such as hit and write. The exclusion of experiencers and stimuli of transitive stative verbs such as know and like, as well as the exclusion of recipients of ditransitive verbs such as give and sell, is motivated mainly by practical considerations: these roles are semantically and formally more difficult to identify cross-linguistically. Nevertheless, these roles often offer the crucial piece of evidence for an empirically exact and simple formulation of the linearization rules in a language. One of the purposes of the present paper is to fill this gap by taking a closer look at the relative order of recipients and patients in the languages of Europe. In most of the languages of the world the relative order of constituents is determined by markedness (or preference) rules rather than by strict rules. Preference rules are strictly local in the sense that they hold only relative to one determining factor. This is formulated in (1'):

(1') The order <X, Y> of two constituents is an unmarked (or basic) order relative to a linearization (sub)rule R if and only if R is a preference rule or subrule with only one determining factor such that <X, Y> is ordered according to the rule and <Y, X> is not (<Y, X> is the marked (or non-basic) order relative to R).

<Y, X> is not an exception to the rule, of course, unless <Y, X> is statistically dominant over <X, Y> and this dominance cannot be explained by the intervention of another synchronic or diachronic rule.³ Within the present approach, word order is viewed as a multi-factor phenomenon. There are several competing linearization factors and each factor determines a particular word order. In language use, a linearization pattern is the more preferred, the more linearization rules it obeys. This competition model of word order has been successfully applied to languages with apparent word order freedom such as German (cf. Reis 1987; Jacobs 1988; Primus 1994a). In such languages, a word order <X, Y> obeying a linearization rule R1 is reversed to <Y, X> only if there is another linearization rule R2 that motivates <Y, X>. The more rules <X, Y> obeys, the more rigid <X, Y> becomes, i.e. the less motivated the reverse order <Y, X> is. The impression of freedom arises in German due to the interaction of several competing rules. By contrast, in a genuine free word order language, <X, Y> can be reversed to <Y, X> without motivation. The competition model of word order is supported by the performance model with the same name proposed by Bates and MacWhinney (1989) on the basis of Parallel Distributed Processing Models (cf. Rumelhart & McClelland 1986). (1') also allows pragmatic factors to determine a basic order, and some languages with only this type of basic order will be mentioned in section 5.3 below. By contrast, pragmatic factors lie outside the scope of the above-mentioned working definition (1). The elimination of pragmatic factors is obvious from the restriction to stylistically neutral, independent, indicative clauses with full noun phrases that are both definite. Since even grammatical factors are reduced to one (i.e. to subject and object interpreted thematically), the working definition (1) favours the idea of the (particular) basic order of a language.
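The competition between linearization rules can be pictured with a small sketch. The Python code below is only an illustration under stated assumptions: the two rules (`agent_first`, `topic_first`), the dictionary encoding of constituents, and all function names are hypothetical and not part of the analysis presented here. It counts how many preference rules each of the two orders of a pair obeys; the order obeying more rules is the preferred one, and a tie corresponds to a situation in which both orders are equally motivated.

```python
from typing import Callable, Dict, List, Tuple

Constituent = Dict[str, object]
Rule = Callable[[Constituent, Constituent], bool]


def agent_first(first: Constituent, second: Constituent) -> bool:
    """Hypothetical rule: obeyed when an agent precedes a patient."""
    return first["role"] == "agent" and second["role"] == "patient"


def topic_first(first: Constituent, second: Constituent) -> bool:
    """Hypothetical rule: obeyed when a topic precedes a non-topic."""
    return bool(first["topic"]) and not second["topic"]


def rules_obeyed(first: Constituent, second: Constituent, rules: List[Rule]) -> int:
    """Number of preference rules obeyed by the order <first, second>."""
    return sum(1 for rule in rules if rule(first, second))


def preferred_orders(x: Constituent, y: Constituent, rules: List[Rule]) -> List[Tuple[str, str]]:
    """Return the order(s) obeying the most rules; a tie yields both orders."""
    score_xy = rules_obeyed(x, y, rules)
    score_yx = rules_obeyed(y, x, rules)
    orders = []
    if score_xy >= score_yx:
        orders.append((x["name"], y["name"]))
    if score_yx >= score_xy:
        orders.append((y["name"], x["name"]))
    return orders


RULES = [agent_first, topic_first]

# Both rules favour the same order: it is the only preferred one.
mother = {"name": "mother", "role": "agent", "topic": True}
daughter = {"name": "daughter", "role": "patient", "topic": False}
print(preferred_orders(mother, daughter, RULES))

# The rules compete (patient is the topic): both orders are motivated.
mother_comment = dict(mother, topic=False)
daughter_topic = dict(daughter, topic=True)
print(preferred_orders(mother_comment, daughter_topic, RULES))
```

Counting obeyed rules with equal weight is the simplest possible scoring; nothing in the sketch settles how competing rules should be weighted against each other.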
As will be shown below, some languages have two grammatical linearization factors that do not necessarily coincide in their prediction so that even without taking pragmatic factors into consideration one cannot assume only one basic order. One of the aims of the present paper is to draw attention to distinctions regarding the freedom of word order. As will be shown below, recipient-patient order in some languages is apparently free. But this is not due to the fact that


the word order rules determining agent-patient order (i. e. subject-object order) do not apply to recipients and patients. Rather, in some languages this freedom is an instance of two grammatical factors leading to divergent basic orders. There are, of course, also languages with absolutely free word order of verbal arguments relative to grammatical factors in the precise sense that there are no grammatical word order rules in these languages. Yet another reason for classifying languages into free vs. rigid word order languages is the fact that universal preferences show up as strict rules in some languages and as preference rules in others. A rigid order of subject and object, for instance, means that any permutation of subject and object also changes the interpretation of the sentence in terms of thematic roles, as in the following English examples: (2)

a. The mother kissed the daughter.
b. The daughter kissed the mother.

In (2a) the mother must be interpreted as the agent, in (2b) only the daughter can be interpreted as the agent.⁴ Other languages with rigid order of subject and object in this sense are, for example, Mam (Mayan, cf. England 1983), Jatmul and Yessan-Mayo (both Papuan, cf. Foley 1986: 103). Many languages have a certain degree of freedom in ordering subject and object. German is a typical language without a rigid order of arguments. See (3):

(3)

a. Die Mutter küßte die Tochter.
b. Die Tochter küßte die Mutter.
'The daughter kissed the mother.' or 'The mother kissed the daughter.'

The noun phrases show no overt case marking in these examples, so that the interpretation of the sentences in terms of thematic roles relies heavily on word order. The relative word order freedom is reflected in the fact that in (3a) the mother is not necessarily but only preferably interpreted as the agent, and that (3b) also allows this interpretation. The preference to interpret the initial argument as the agent is very strong in German. As shown in Primus (1994a), agent-patient order is judged to be far more acceptable than patient-agent order in acceptability tests, and it reaches a frequency of ca. 97% as opposed to ca. 3% patient-agent order in the texts analysed by Hoberg (1981) and Primus (1994a). This means that German has a preference rule according to which agents precede patients. The difference between English and German is


that the same rule is a strict rule in English and a preference rule in German. We obtain preference rules from strict rules by introducing hedges such as "preferably", "in the unmarked case", or "tends (to precede)". The content of the rule remains the same. The present paper will show that the exact nature of the grammatical factors involved in the linearization of verbal arguments cannot be explored by using conglomerate terms such as subject and object, whose usefulness relies heavily on an idealization. No matter how subject and object are defined, each definition overgeneralizes a correlation between two relational concepts belonging to different autonomous systems (e.g. agent-nominative, agent-topic, etc.). This idealization and the thematic orientation in the interpretation of subject and object in typological word order studies have led to a neglect of formal marking and of its role in determining the basic position of a verbal argument. This paper will show that formal marking, specifically case marking, offers genuine explanations for word order variation across languages. The conspicuous property of the present approach is that instead of the notions of subject, direct object, and indirect object, it uses the more basic relational concepts entering the definition of subject and object in various theoretical approaches. This approach allows various types of relational concepts to apply independently of each other. Every system has its relational concepts aligned on a hierarchy. Therefore, the present approach is called Generalized Hierarchy Grammar (cf. Primus 1994b). There are at least two types of relational concepts that qualify as grammatical factors in determining the basic position of verbal arguments: thematic relations and formal relations established by the case or adpositional marking of a verbal argument. I will show that each type of relation is relevant for universal ordering preferences.
This parametrization gives the present approach the flexibility to cope with constructions in which the functions of arguments are incongruous in terms of thematic roles and formal marking. Such constructions are, for instance, ditransitive constructions involving recipients and patients, oblique experiencer constructions, and ergative constructions. In this paper I will concentrate on ditransitive constructions (cf. oblique experiencer constructions and ergative constructions in Primus 1994b). I will not consider constructions with clitic, pronominal or clausal arguments. Pronominal arguments in German will be mentioned as independent evidence, because their linearization reflects the principles defended in this paper. The order of presentation is as follows. Section 2 discusses phrase structure restrictions on word order. Section 3 introduces the approach to thematic roles defended in this paper and states the preference rule of word order based on thematic roles. Section 4 states the preference rule relative to formal complexity


and discusses the hierarchy that determines this preference. Section 5 takes a close look at the relative order of recipients and patients in 54 European languages. The data from the European languages are supplemented by data from a broader cross-linguistic sample. Section 6 takes a closer look at the role of animacy in determining the relative order of verbal arguments. Section 7 discusses the relevance of verb-patient bonding in determining the proximity of recipient or patient to the verb. Section 8 summarizes the results of this investigation. An appendix tabulates the 54 European languages according to the relative order of recipient and patient.

2. Remarks on phrase structure

Within most phrase structure grammars, phrase structure rules generate ordered expressions, and accordingly, linearize the constituents of the clause. This does not apply, of course, to models that do not represent syntactic phrase structures such as Relational Grammar (e.g. Perlmutter & Rosen 1984) or Functional Grammar (e.g. Dik 1978, 1989). It also does not apply to recent approaches within Generalized Phrase Structure Grammar (cf. Gazdar et al. 1985; Uszkoreit 1986). In this model, phrase structure rules only represent the dominance relations within complex linguistic expressions. Linear order rules are stated separately and this method is particularly congenial for languages where the grammatical word order rules are multi-factored and do not yield one specific basic order (cf. Uszkoreit 1986). But even within the Generative Grammar influenced by Chomsky (1981, 1986), the linearization patterns defined by phrase structure rules cannot be claimed to impose too many word order restrictions by themselves. Consider the commonly assumed structure of a ditransitive clause in English as generated by phrase structure rules in (4):

(4) [S NP [VP V NP NP]]

A genuine phrase structure restriction on linearization blocks discontinuity which results from the prohibition of crossing branches. This seems to be a good prediction, because discontinuous placement of two constituents that belong to the same phrase is certainly a marked phenomenon. But beyond that, there is nothing in the phrase structure rules generating (4) nor in the phrase structure representation itself that prohibits interpreting the NP immediately dominated by S as the patient and one of the VP-dominated NPs as the agent. Such is the case in Pari, an ergative Nilotic language (cf. Andersen 1988). Without the addition of case assignment principles, (4) does not even prohibit having an accusative NP immediately dominated by S in a nominative language. These observations reveal that, as in the above-mentioned proposals within Generalized Phrase Structure Grammar, word order rules have to be stated separately, either literally as word order rules or as other rules that show the intended word order effect. This suggests that phrase structure rules and linear order rules are disjunct even in grammatical models in which phrase structure rules determine linearized expressions. Both structural relations and precedence relations are hierarchy relations (cf. Erne 1982). One of their main functions is to express the syntactic superiority of a verbal argument over another verbal argument. The hierarchical organization of verbal arguments explains the fact that certain arguments, for instance nominative agents, are more readily involved in grammatical rules than other verbal arguments, for instance accusative patients. This insight underlies various universal syntactic theories such as those of Keenan & Comrie (1977) and Chomsky (1973 and subsequent work). The idea of a hierarchical organization of verbal arguments is unchallenged, but the exact nature of this hierarchical organization is a matter of constant debate. The Generalized Hierarchy Approach defended in this paper allows different types of pragmatic, semantic or morphosyntactic hierarchy relations and considers both structural and precedence relations as two independent hierarchy relations. The structural superiority of a verbal argument over another has been defined in more recent syntactic studies in terms of the c-command relation (cf. Reinhart 1983):

(5)

A node X c-commands a node Y if
(a) X and Y do not dominate each other, and
(b) the first branching node dominating X dominates Y.
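The definition in (5) can be checked mechanically on the structure in (4). The following Python sketch is an illustration only: the `Node` class, the function names, and the labels NP1, NP2 and NP3 for the three noun phrases of (4) are assumptions introduced here, and the exclusion of a node c-commanding itself is an added convention. The helper `asymmetrically_c_commands` holds when X c-commands Y but Y does not c-command X.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(eq=False)  # eq=False: nodes are compared by identity, not by label
class Node:
    label: str
    children: List["Node"] = field(default_factory=list)
    parent: Optional["Node"] = field(default=None, repr=False)

    def __post_init__(self) -> None:
        for child in self.children:
            child.parent = self


def dominates(x: Node, y: Node) -> bool:
    """True if x properly dominates y (x is an ancestor of y)."""
    node = y.parent
    while node is not None:
        if node is x:
            return True
        node = node.parent
    return False


def first_branching_node(x: Node) -> Optional[Node]:
    """The lowest node dominating x that has at least two children."""
    node = x.parent
    while node is not None and len(node.children) < 2:
        node = node.parent
    return node


def c_commands(x: Node, y: Node) -> bool:
    """Clauses (a) and (b) of (5); x is y is excluded by convention."""
    if x is y or dominates(x, y) or dominates(y, x):
        return False
    branching = first_branching_node(x)
    return branching is not None and dominates(branching, y)


def asymmetrically_c_commands(x: Node, y: Node) -> bool:
    return c_commands(x, y) and not c_commands(y, x)


# Structure (4): [S NP1 [VP V NP2 NP3]]
np1, v, np2, np3 = Node("NP1"), Node("V"), Node("NP2"), Node("NP3")
vp = Node("VP", [v, np2, np3])
s = Node("S", [np1, vp])

print(asymmetrically_c_commands(np1, np2))         # NP1 over NP2
print(c_commands(np2, np3), c_commands(np3, np2))  # mutual c-command inside VP
```

In this encoding the NP outside the VP asymmetrically c-commands both NPs inside it, while the two VP-internal NPs c-command each other.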

X is structurally superior to Y if and only if X asymmetrically c-commands Y. In the example (4) above, the first NP asymmetrically c-commands the other NPs. The first NP is therefore structurally superior to the others. The second and third NPs c-command each other, so that neither is structurally superior to the other. The fact that structural and topological asymmetries between verbal arguments do not necessarily coincide shows up most clearly in non-configurational, so-called "flat" structures and in left-branching structures (cf. Reinhart 1983). The first option is illustrated in the VP structure in (4). Both NPs within the VP c-command each other so that the first one cannot be claimed to be


structurally superior to the second one. A hierarchical relation between the two NPs arises by precedence alone. As pointed out by Barss & Lasnik (1986), the topological asymmetry in terms of basic order suffices to explain the binding facts shown in (6): (6)

a. I showed Mary_i herself_i in the mirror.
b. *I showed herself_i Mary_i in the mirror.
c. Herself_i I showed Mary_i in the mirror.

The relevant rule for English is that antecedents have to c-command and precede their reflexives in the basic order. (6c) is not a counterexample to this rule because it involves a marked order obtained through topicalization. A case of conflict between structural and topological superiority arises in left-branching structures. Let us consider VORS languages such as Tumbala Chol (Mayan), Tzotzil (Mayan), and Mezquital Otomi (Oto-Manguean), cf. Blansitt (1973). If we consider word order alone, such languages are rather remarkable in having two objects before the subject as a basic order. But under the analysis [[[VO]R]S], the subject (S) emerges as structurally superior to the recipient object (R). The same holds for the recipient object relative to the patient object (O). If the basic structural positions of recipient and patient are thematically determined as discussed in section 5.1 below, and there is no formal asymmetry between O and R, one expects [[[VO]R]S] as a basic structure in a left-branching language. Another language which may be seen to display a left-branching construction is Albanian. In Albanian, the relative order of patient and recipient is considered to be rather free (cf. Hubbard 1985, ADB), but nevertheless there is a preference for the order SVOR (cf. Williams 1988, Buchholz & Fiedler 1987), as illustrated in (7):

(7) Albanian (Buchholz & Fiedler 1987: 546)
Profesori [[u dha sqarime] disa studenteve]
the-professor-NOM CLT gave comments-ACC some students-DAT
'The professor gave some students comments.'

Under the structural analysis indicated by the brackets in (7), one can explain the reflexivization pattern of Albanian with respect to the binding properties of accusative and dative objects in terms of a universal c-command restriction. The restriction is that an antecedent has to c-command its anaphor (cf. Reinhart 1983). In Albanian active sentences, a dative or nominative argument can


serve as an antecedent of an accusative reflexive irrespective of the surface order of antecedent and reflexive (cf. Hubbard 1985, Williams 1988). Cf. the data offered by Williams (1988: 161-162) in (8): (8)

Artisti ia tregoi veten Drites.
the-artist-NOM CLT-CLT showed self-ACC Drita-DAT
Artisti ia tregoi Drites veten.
'The artist showed Drita himself/herself.'

If the reflexive is dative, the accusative argument is blocked as an antecedent. The only remaining option is coreference with the nominative argument, as illustrated in (9): (9)

Artisti ia tregoi Driten vetes.
Drita-ACC self-DAT
Artisti ia tregoi vetes Driten.
'The artist showed Drita to himself/*to herself.'

As shown in (7), the structural VP analysis which accounts for the reflexivization facts is left branching. The above facts cannot be captured in terms of precedence, which constitutes a strong argument for the empirical relevance of structural relations in general, and that of the c-command relation in particular. The present paper does not take structural relations into consideration for practical reasons. Structural relations are more difficult to identify, since they are not directly visible on the surface. This is why assumptions about phrase structure relations are very controversial even for well-studied languages such as English. Information on phrase structure relations for less-studied European languages is even more controversial or absent. These practical considerations should not lead to the premature conclusion that structural relations are theoretical, fake constructs without empirical substance. Keenan (1978) and Reinhart (1983), for instance, draw attention to phenomena of anaphor binding in left-branching structures, e. g. in VOS languages, and the above Albanian data also clearly indicate the empirical value of the c-command relation. Structural relations have to be taken into consideration and they may lead to an evaluation of some of the languages considered in this paper which differs from an evaluation based on purely topological relations.

3. Basic order determined by semantic dependencies

The first type of relational concepts we will discuss are thematic roles, which are also called deep cases or semantic relations. The present approach to thematic roles can be viewed as a further development of Dowty's (1991) analysis. Under this view, thematic roles are decomposed into more basic thematic relations involving control, causation, sentience etc. Thematic roles are treated as cluster concepts of prototype theory (cf. Rosch & Mervis 1975; Dahl 1987). This allows a verbal argument to have different degrees of membership in a prototype role. This gradience is a reflection of the fact that an argument may be characterized only by several, but not by all of the entailments defining the prototype role. Under this assumption, Dowty needs only two theta-role prototypes: Proto-Agent and Proto-Patient. The Proto-Roles posited by Dowty are similar, but not identical to the macroroles Actor and Undergoer within Role and Reference Grammar (cf. Foley & Van Valin 1984). While both Proto-Roles and macroroles are more general than the specific thematic relations of agent and patient, the macroroles and microroles of the Role and Reference Grammar Model belong to different levels (tiers) of semantic role representation. Macrorole assignment principles are needed to regulate the relationship between the roles of the two levels. By contrast, a Proto-Role is defined on the basis of a list of basic thematic relations. The relation between a Proto-Role, such as Proto-Agent, and a specific thematic role, such as agent, experiencer, or causer, can be captured in this approach on the same level of thematic representation. A list of basic thematic relations that characterize the Agent Proto-Role includes at least the following:

(10)

a. α controls P[α]
b. α causes P[α]
c. α is active in P[α]
d. α is sentient of P[α]
e. α possesses an entity involved in P[α]

α represents a participant in P[α]; the term "participant" is used for the semantic representation of an argument of a predicate on the level relevant for thematic information. P[α] represents the situation denoted by the rest of the sentence including α, the predicate of α and other participants or modifiers of this predicate that are relevant for thematic information. This captures the fact that thematic relations are not always solely determined by verb lexemes. In principle, any part of P[α] may contribute to or may be affected by the thematic property of α. Thus, thematic role analysis is not confined to lexical information, but it can be restricted in this way if so desired.


The list in (10) is very similar, but not identical to Dowty's (1991). Each of the basic thematic relations in (10) is semantically independent, although most of the verbs in English and other languages select more than one such basic relation for an argument. For example, consider the subject argument of the two-place predicates x murders y, x nominates y, and x interrogates y. The basic thematic relations that they all share include x controlling the situation described by the verb, and the fact that x intends this to be the kind of act named by the verb. Furthermore, x causes some event to take place involving y (y dies, y acquires a nomination, y answers questions), x is sentient of the event named by the verb, and x moves or changes externally (i.e. not just mentally). The control relation is not shared by the subject argument of kill (traffic accidents also kill) and by the subject argument of convince (one can convince inadvertently); the causation and activity relations are not shared by the subject argument of like and understand. The set of basic relations of murders, nominates, and interrogates with respect to their subject argument is the same, so that all of these verbs can be said to select the same thematic role for their subject argument. This role is traditionally named agent. The thematic role of the subject argument of kill is called causer or effector in some approaches. The thematic role of the subject argument of like and understand is traditionally called experiencer. The possessor relation is selected by verbs such as acquire, take and have. The Patient prototype is defined by thematic dependence in the present approach. Thematic dependence is meant to capture the intuitive notion of affectedness. The basic thematic relations defining Proto-Patient imply thematic dependence on another participant.
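The idea that verbs select sets of basic thematic relations, and that identical sets amount to the same thematic role, can be sketched in a few lines of Python. Everything in the sketch is a simplifying assumption made for illustration: the relation labels abbreviate (10a-e), the entries for kill, like and have are minimal encodings (the text only states which relations these subjects lack), and measuring degree of Proto-Agent membership as a simple proportion is one possible reading of gradience, not a claim of the present approach.

```python
# Labels abbreviating the basic thematic relations in (10a-e).
PROTO_AGENT_RELATIONS = {"control", "cause", "active", "sentient", "possess"}

# Hypothetical encodings of the relations selected for the subject argument.
SUBJECT_RELATIONS = {
    "murder": {"control", "cause", "active", "sentient"},
    "nominate": {"control", "cause", "active", "sentient"},
    "interrogate": {"control", "cause", "active", "sentient"},
    "kill": {"cause"},     # assumed minimal entry: no control entailed
    "like": {"sentient"},  # assumed minimal entry: no causation or activity
    "have": {"possess"},   # assumed minimal entry
}


def same_role(verb_a: str, verb_b: str) -> bool:
    """Two verbs select the same role for their subject if the sets coincide."""
    return SUBJECT_RELATIONS[verb_a] == SUBJECT_RELATIONS[verb_b]


def proto_agent_degree(verb: str) -> float:
    """Assumed measure: share of the relations in (10) selected for the subject."""
    selected = SUBJECT_RELATIONS[verb] & PROTO_AGENT_RELATIONS
    return len(selected) / len(PROTO_AGENT_RELATIONS)


print(same_role("murder", "nominate"))  # same set of relations
print(same_role("murder", "kill"))      # kill lacks control
for verb in ("murder", "kill", "like", "have"):
    print(verb, proto_agent_degree(verb))
```

Under these encodings the subjects of murder, nominate and interrogate come out as the same role, while the subjects of kill, like and have are more peripheral members of the Proto-Agent category.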
If one participant of a predicate is causally affected, the predicate necessarily selects a causer as another participant, and correspondingly for controlled and moved participants. If one participant of a predicate is a stimulus, the predicate necessarily selects an experiencer as another participant. Correspondingly, the thematic role of the possessed can be defined only on the basis of the notion of possessor. Thematic (in)dependence is the crucial property distinguishing Proto-Agents from Proto-Patients in the present approach. This is captured in (11):

(11)
(a) Proto-Agent is defined by those and only those basic thematic relations that imply thematic independence;
(b) Proto-Patient is defined by those and only those basic thematic relations that imply thematic dependence on another participant.

Thematic independence and dependence are defined in (12):

(12) α is thematically independent in P[α] if and only if the basic thematic relations of α in P[α] are assigned independently from the basic thematic relations of another participant in P[α]. Correspondingly, α is thematically dependent in P[α] if and only if the basic thematic relations of α in P[α] are dependent on the basic thematic relations of another participant in P[α].

Other non-agentive roles such as recipient, addressee, goal, and benefactive, which have been neglected by Dowty, can be made more precise in a similar way. See (13): (13)

If α is recipient, goal, addressee or benefactive in P[a], then there is another participant β in P[a] that controls and causes P[a].

Remarks that come close to the implication in (13) referring to benefactives have been made as early as Fillmore (1968) and Dik (1978). Apparent counterexamples such as in "this is good for him" do not meet the condition for benefactives in (13), and indeed "for him" does not seem to be a benefactive proper in this example. As with other non-agentive roles, recipients, addressees, and goals cannot be distinguished from each other or from a patient-like role in isolation. They are distinguished by the type of activity involving the proto-agentive participant in (13). If β is involved in a transaction activity, α is a recipient or transaction goal, and if β is involved in a speech activity, α is an addressee. Recipients, goals, addressees, and benefactives have an affinity to Proto-Agents. Their involvement in the event described by the verb unilaterally implies that they acquire or cease to have a Proto-Agent property (for a similar view cf. Smith 1985). Recipients or transaction goals such as Mary in Peter gives Mary a cake and some benefactives such as Mary in Peter is baking Mary a cake become possessors. The relation between Mary and cake in these examples unilaterally implies a possessor-possessed relation (cf. Jackendoff 1991). The same holds for related roles such as source-recipient, a quasi-converse of a goal-recipient, as in the German sentence Peter nahm Maria den Kuchen weg 'Peter took away the cake from Mary'. The role frame of this predicate unilaterally implies that Mary was a possessor of the cake. Other typical ditransitive verbs such as teach, tell, and show have similar Proto-Agent implications for their Proto-Recipient. If x teaches y z, then x intends that y gets to know z. If x shows y z, then x intends that y sees z (cf. Blansitt 1973). Restrictions such as (11b) and (13) predict that there are no lexical predicates with role frames that contain, for instance, only one stimulus, benefactive,


recipient or addressee participant. This seems to be a valid (possibly statistical) generalization across languages. Recipients, addressees, goals, and benefactives form a natural class, since they share common thematic implications, namely that in (13). This class is called Proto-Recipient in the present approach. By the criterion of the thematic dependence relation, Proto-Patients and Proto-Recipients are thematically dependent on Proto-Agents and Proto-Patients are dependent on Proto-Recipients as described above. This yields the Thematic Hierarchy in (14):

(14)

Proto-Agent < Proto-Recipient < Proto-Patient

[...] 0.001), though slightly less so than that of agreement. The languages which lack case marking, like those which lack agreement, favour rigid (36%) or restricted (33%) word order, but less so than the languages with no agreement. Recall that 81% of the languages without agreement are rigid or restricted word order


ones, whereas the corresponding figure for languages without case marking is 69%, i.e. 12% lower. The lower incidence of rigid and restricted word order languages without case marking relative to those without agreement marking can be observed in all three of the dominant word order types. Compare the figures in Table 12 with those in Table 13.

Table 13. Number of word order variants relative to presence and absence of overt case marking in SOV, SVO and V1

basic w/o     Case    0          1          2          3          4+
SOV N = 69    -20      8  40%     8  20%     1   5%     2  10%     1   5%
              +49      6  13%    24  49%     8  16%     5  10%     6  12%
SVO N = 56    -38     15  39%    15  39%     4  11%     2   5%     2   5%
              +18      2  11%     4  22%     3  17%     4  22%     5  28%
V1 N = 29     -17      6  35%     1   6%     7  41%     2  12%     1   6%
              +12      0   0      5  42%     2  17%     4  33%     1   8%

Whereas 78% of the SOV languages lacking agreement have rigid or restricted order, only 60% of those without case marking are rigid or restricted order languages. The corresponding figures for SVO are 96% vs 78%, and for V1 60% vs 41%. As for the distribution of case marking relative to the different degrees of word order flexibility, shown in Table 14 below, there is an increase in the level of case marking from 24% in rigid word order languages to 79% in highly flexible word order languages.

Table 14. Presence and absence of overt case marking relative to number of word order variants

Case         0          1          2          3          4+
- Case 80    29 76%     26 44%     13 45%     7  33%     5  21%
+ Case 91    9  24%     33 56%     16 55%     14 66%     19 79%
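The percentages in Table 14 follow from its raw counts. The Python sketch below recomputes the share of case-marking languages at each degree of flexibility; the names are illustrative and not part of the original study. Under conventional rounding the case-marked share of the flexible languages (three variants) comes out at 67%, where the table prints 66%.

```python
# Counts from Table 14, indexed by number of word order variants (0, 1, 2, 3, 4+).
NO_CASE = [29, 26, 13, 7, 5]     # languages lacking overt case marking (N = 80)
WITH_CASE = [9, 33, 16, 14, 19]  # languages with overt case marking (N = 91)


def case_share(with_case, no_case):
    """Percentage of case-marking languages at each degree of flexibility."""
    return [round(100 * w / (w + n)) for w, n in zip(with_case, no_case)]


def no_case_share_flexible(with_case, no_case):
    """Percentage lacking case marking among languages with 3 or 4+ variants."""
    lacking = sum(no_case[3:])
    total = lacking + sum(with_case[3:])
    return round(100 * lacking / total)


if __name__ == "__main__":
    print(case_share(WITH_CASE, NO_CASE))
    print(no_case_share_flexible(WITH_CASE, NO_CASE))
```

The second function pools the flexible and highly flexible languages, which is the grouping behind the 27% figure cited in the discussion of Table 14.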

However, unlike with agreement, there is a sharp rise in case marking between rigid and restricted word order languages (24% vs 56%), a marginal drop between restricted and variable order ones, and then an 11% and 13% increase between variable and flexible (55% vs 66%) and flexible and highly flexible (66% vs 79%) ones. Thus whereas rigid word order languages are much more likely to lack case marking than highly flexible word order ones, the restricted, variable and flexible word order languages show no strong preferences in this respect. Moreover, 27% of the flexible and highly flexible order languages lack case marking. Recall that for agreement the corresponding figure was only 16%. The figures in Tables 12 and 14 suggest that the relationship between case marking and word order flexibility is somewhat different from that between agreement and word order flexibility. Absence of case marking is a poorer predictor of rigid or restricted order than absence of agreement (by 13%), and flexible and highly flexible order is a poorer predictor of the presence of case marking than of the presence of agreement (by 11%). The most interesting relationship between case marking and word order flexibility defined by the data is that between rigid word order and lack of case marking; 76% of the rigid word order languages lack case marking. Though this figure is not as high as standard assumptions would lead us to expect, note that among the rigid word order languages, lack of case marking is considerably more common than lack of agreement, both overall (76% vs 45%) and for the dominant word order types (57% vs 29% for SOV, 88% vs 65% for SVO and 100% vs 33% for VSO). Among the restricted word order languages, on the other hand, the proportion of languages without case marking is the same (44%) as that of languages without agreement. The relationship between rigid word order and lack of case marking, however, is very weak in SOV languages; only 57% (8/14) of the rigid SOV languages lack case marking as compared to 88% (15/17) of the SVO and 100% (6/6) of the V1. It needs to be noted that, as in the case of agreement, word order variation in SVO languages is more sensitive to case marking than in SOV.
Though Table 13 documents that the percentage of SVO languages with case marking which exhibit rigid order is similar to that of SOV (11% and 13% respectively), the percentage of flexible and highly flexible SVO languages with case marking is considerably higher than that of the SOV, the relevant figures being 28% vs 12%. Since there are considerably more languages without case marking than without agreement, the likelihood of the word order variants defined as dispreferred in § 3.3.1 occurring in languages lacking case marking is greater than in those lacking agreement. Yet in relation to the verb-peripheral variants, this is not so. There is only one verb-initial language without case marking, Upper Chinook (Boas 1911b), which exhibits verb-final variants, and there are two verb-final languages without case marking which have verb-initial variants, namely Abkhaz (Hewitt 1979), which has both VSO and VOS (used mainly for contrastive purposes), and Karok (Bright 1957), which has VOS. By contrast, the OVS variant in verb-initial languages is just as common in languages without case marking (7) as in case marking languages (7). In SOV languages without case marking, on the other hand, OVS order is rare. There is only one such language in the sample, namely Wichita (Rood 1976). As alluded to at the end of Section 4.1, lack of case marking, unlike lack of agreement marking, does not preclude the occurrence of postverbal transitive subjects in SVO languages; 6/14 SVO languages which have OVS variants, 8/12 which have VOS variants and 6/12 which have VSO ones lack case marking. In the light of Holmberg's (this volume) discussion of SVO languages, it needs to be noted that the least common word order variant in SVO languages without case marking is SOV. Nonetheless, even if we disregard the four SVO languages in which SOV order is dependent on tense or aspect (see note 9), still 4/13 (30%) of the SVO languages which have an SOV variant lack case marking. The languages in question are Mandarin (Sun & Givón 1985), Salinian (Dryer 1989), Tuscarora (Mithun 1976) and Woods Cree (Wolfart & Carroll 1981). In Mandarin and Tuscarora the SOV variants are used only in contrastive contexts, but not in the other two languages.31 Note, however, that three of the above languages are Amerindian. Thus Holmberg's suggestion that the presence of overt case marking, though not a sufficient condition, is a necessary condition for the occurrence of SOV order in an SVO language may be in essence correct for European SVO languages and perhaps SVO languages in other areas, but not for NAmerican SVO languages. In contrast to the SOV variant, the most common word order variant in SVO languages without case marking is OSV; 10/16 (63%) instances of this variant occur in languages with no case marking.
This discrepancy in the frequency of SOV as compared to OSV orders in SVO languages without case marking is presumably connected with the identification of the subject with immediate preverbal position.

5. Word order variation: the languages of Europe

The following discussion is based on 48 European languages, roughly a third of the languages of Europe, selected according to the same methodology as the global sample. The languages in question are listed according to phyla and major sub-branches in Appendix II. Since the majority of the languages of Europe belong to the Indo-European phylum, this phylum is the best represented with 27 languages. The Altaic phylum is represented by six languages, the Uralic phylum by five, the Nakh-Daghestanian by four and the Kartvelian and Northwest Caucasian by two each. The two remaining languages are the Afro-Asiatic Maltese and the language isolate Basque.


For 38 of the 48 languages in the European sample, data on word order variation were obtained by means of an extensive questionnaire completed by native speaker linguists of the languages in question or experts of the given language in cooperation with native speakers.32 Information on the remaining ten languages was gathered from grammars and linguistic articles.

5.1. The basic order

In terms of the word order typology that I am using, the languages of Europe exhibit considerably less diversity in basic order than that found globally. There are no VOS, OVS or OSV languages in Europe. The vast majority of the 48 languages in the sample are SOV, 22 (46%), or SVO, 20 (42%). Of the remaining six languages, three are VSO and three split SOV/SVO. The split languages are Hungarian, Georgian and Upper Sorbian. Though, in the light of Kiss's (this volume) discourse-configurational analysis of Hungarian, the language could be classified as exhibiting no basic order, in terms of statistical frequency it qualifies as a split SOV/SVO language. For instance, in a small written corpus investigated by Behrens (1982: 161) consisting of 1009 declarative clauses, among the 83 transitive clauses with two nominal arguments there were 59 (71%) SVO clauses, 18 (22%) SOV and only 3 (4%) OSV, 2 (2%) OVS, and 1 (1%) VSO. In her spoken corpus of 1463 declarative clauses (ibid.: 198), the 103 transitive clauses with two arguments comprised 45 (44%) SVO clauses, 33 (31%) SOV, 9 (9%) OSV, 7 (7%) OVS, 5 (5%) VOS and 2 (2%) VSO. Thus, despite the fact that SVO clauses outnumber SOV ones, the latter are evidently favoured over any other of the transitive orders. As for Georgian, according to Testelec (this volume), the existing text counts of the order of NPs in Modern Georgian, while favouring an SOV classification, cannot be considered reliable. That SOV order is not, in fact, statistically dominant in Georgian is also suggested by Vogt's (1971: 220-224) analysis of 50 pages of a modern novel in which SVO clauses clearly outnumber SOV. For Upper Sorbian there do not appear to be any statistical data available. My classification of Upper Sorbian as having split basic order is essentially the result of the conflicting claims that have been made in regard to this language.
For example, while Stone (1993: 655) claims that SOV order is basic in non-compound tenses, Uhlířová (private communication) argues that this is only the case in subordinate clauses, while in main clauses both SOV and SVO are common. Moreover, in compound tenses, the finite and nonfinite verbs are split, as in German, with the finite verb occurring in second or initial position and the nonfinite finally (see Siewierska & Uhlířová this volume). This suggests that a straightforward SOV classification of Upper Sorbian would be misleading.

The majority of the SOV languages in the European sample are non-Indo-European (68%). By contrast all the VSO languages (i.e. three Celtic languages) and the majority of the SVO (85%) languages are Indo-European. The distribution of basic order is thus very much genetically determined. This is shown more clearly in Table 15.

Table 15. The distribution of basic order in the European sample according to phylum

phylum             SOV N = 22    SVO N = 20    VSO N = 3    split N = 3    total N = 48
IE                 6  27%        17 85%        3  100%      1  33%         27 56%
Altaic             6  27%        0  0          0  0         0  0           6  13%
Uralic             2  9%         2  11%        0  0         1  33%         5  10%
Nakh-Daghestan.    4  18%        0  0          0  0         0  0           4  8%
Kartvelian         1  5%         0  0          0  0         1  33%         2  4%
NW Caucasian       2  9%         0  0          0  0         0  0           2  4%
Other              1  5%         1  5%         0  0         0  0           2  4%

5.2. Word order variation

A priori there is little reason to expect word order variation in Europe to diverge substantially from that in the global sample. Admittedly, Proto-Indo-European, the ancestor of over half of the languages in the European sample, is said to have had highly flexible word order, but this does not necessarily entail that the same property will be manifested in its descendants. It therefore comes as somewhat of a surprise to find that the level of word order variation among the languages in the European sample is 26% higher than in the global sample. The 48 languages in the European sample display 66% of the logically possible alternatives to the basic order, while those in the global sample display only 40%. This high level of word order variation holds irrespective of whether we do or do not take into account the morphologically and phonologically marked nonbasic orders. The relevant data are presented in Table 16 and a language by language specification of the word order variants is provided in Appendix III. As in Table 5, which showed the corresponding data in the world's languages, the first two columns of figures in Table 16 specify the instances of the relevant order which qualify as variants on the basic order in the sense of the term defined in § 2. The third and fourth columns of figures represent the morphologically marked nonbasic orders and those containing a disjuncture. The fifth and sixth columns give the totals of each of the six linearizations that occur as nonbasic orders. And the last column specifies the number of languages manifesting a basic order other than that given in the column on the far left. The percentages, apart from those in the bottom row, are calculated with respect to the figures in this last column. The percentages in the bottom row refer to the number of nonbasic orders relative to the logically possible 237 if all the languages in the sample had five, or in the case of the split languages, four variants each.

Table 16. The distribution of the logically possible linearizations of the subject, object and verb in the European sample

w/o      as a variant order    with additional morphological    total       basic order
                               marking and/or a disjuncture                 other than
SOV      11  48%               1   4%                           12  52%     23
SVO      17  68%               3   11%                          20  80%     25
VSO      26  58%               2   4%                           28  62%     45
VOS      23  48%               3   6%                           26  54%     48
OVS      28  58%               10  18%                          38  79%     48
OSV      37  77%               3   6%                           40  83%     48
total    142 57%               22  9%                           164 66%
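The last column of Table 16 and the figure of 237 logically possible nonbasic orders can both be derived from the basic-order counts given in § 5.1. The Python sketch below is only an illustration (the names are not from the original study). It assumes, as the text states, that a language cannot have its own basic order as a variant, so that the three split SOV/SVO languages are excluded from both the SOV and the SVO rows. Row totals are computed as plain variants plus marked orders.

```python
# Basic-order counts for the 48-language European sample (§ 5.1).
BASIC = {"SOV": 22, "SVO": 20, "VSO": 3, "split SOV/SVO": 3}
SAMPLE_SIZE = sum(BASIC.values())  # 48
ORDERS = ["SOV", "SVO", "VSO", "VOS", "OVS", "OSV"]

# Nonbasic occurrences per order in Table 16: (plain variants, marked orders).
OCCURRENCES = {"SOV": (11, 1), "SVO": (17, 3), "VSO": (26, 2),
               "VOS": (23, 3), "OVS": (28, 10), "OSV": (37, 3)}


def eligible_languages(order):
    """Number of languages for which `order` could be a nonbasic order."""
    excluded = BASIC.get(order, 0)
    if order in ("SOV", "SVO"):
        # Split languages already have both SOV and SVO as basic orders.
        excluded += BASIC["split SOV/SVO"]
    return SAMPLE_SIZE - excluded


def row_total_percentage(order):
    """Total nonbasic occurrences of `order` relative to the eligible languages."""
    return round(100 * sum(OCCURRENCES[order]) / eligible_languages(order))


if __name__ == "__main__":
    for order in ORDERS:
        print(order, sum(OCCURRENCES[order]), eligible_languages(order),
              f"{row_total_percentage(order)}%")
    # Logically possible nonbasic orders across the whole sample.
    print(sum(eligible_languages(order) for order in ORDERS))
```

Summing the eligible languages over the six orders gives the same 237 as counting five possible variants for each of the 45 non-split languages and four for each of the three split ones.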

As evidenced by the figures in Table 16, just as in the case of the global sample, the inclusion of the phonologically and morphologically marked orders has little overall effect on the frequency of occurrence of the nonbasic orders. The marked orders increase the number of nonbasic orders by only 9%, the most notable increases involving the object-first orders, again as in the global sample. If we compare the data in Table 16 with the corresponding global data in Table 5 we see that the occurrence of each of the six linearization patterns as nonbasic orders is on average around 30% higher than in the global sample. The biggest differences involve OVS (79% vs 43%) and VSO (62% vs 29%) orders, the smallest SOV orders (52% vs 36%). There are also differences in regard to the relative ranking of the nonbasic orders in terms of their frequency of occurrence. Unlike in the global sample, VSO is just as frequent as OVS as a variant order and VOS just as frequent as SOV. And overall SOV is the least common nonbasic order, occurring considerably less frequently than VSO and marginally less frequently than VOS. Needless to say, these differences in the relative ranking of the six linearization patterns as nonbasic orders are essentially due to the higher incidence of basic SOV (46% vs 40%) and SVO (42% vs 32%) languages and lower incidence of VSO (6% vs 14%) languages in the European sample as compared to the global one. This, however, has no direct bearing on the overall higher incidence of each of the nonbasic orders.

The distribution of the word order variants relative to word order type among the languages in the European sample basically echoes that of the global sample in the case of SOV languages, but not in the case of the SVO and VSO ones. The data in question are shown in Table 17.

Table 17. The distribution of variant orders relative to basic order in the European sample

basic w/o      SOV variant    SVO variant    VSO variant    VOS variant    OVS variant    OSV variant
SOV N = 22     0  0           17 77%         10 46%         9  41%         10 46%         22 100%
SVO N = 20     11 55%         0  0           13 65%         11 55%         15 75%         12 60%
VSO N = 3      0  0           0  0           0  0           0  0           0  0           0  0
split N = 3    0  0           0  0           3  100%        3  100%        3  100%        3  100%

As in the global sample, the most common word order variant in the SOV languages of Europe is OSV, followed by SVO, 100% and 77% respectively. Interestingly enough though, VSO is just as common as OVS (46%), unlike in the global sample. In the case of the European SVO languages the most common word order variant is OVS (75%) rather than VOS as in the global sample. And the second most common variant is VSO (65%) which, also unlike in the global sample, is marginally more common than VOS (55%). As for the European VSO languages, in contrast to those in the global sample, they exhibit no word order variation and thus no preference for SVO variants. However, as suggested by the data in Table 17, all three of the Celtic languages in the sample have SVO if the subject is followed by a special particle.33

(36) Breton (Janig Stephens, Eurotyp questionnaire)
     Annaig a lenn ul levr.
     Annaig PTL read:PRS:3SG a book
     'Annaig reads a book.'

(37) Welsh (Maggie Tallerman, Eurotyp questionnaire)
     Fy nhad a bryn-odd y beic newydd.
     my father PTL buy-PST:3SG the bike new
     'It was my father who bought the new bike.'

(38) Irish (Ó Dochartaigh 1992: 49)
     (Is) carr a bhuail an coisí ólta.
     cop-PST car PTL hit:PST a pedestrian drunk
     'It was a car that hit the drunk pedestrian.'

In Irish, as shown in (38), and also in two of the other Celtic languages which are not in the sample, i.e. Manx (Thomson 1992: 111) and Scots Gaelic (Gillies 1993: 211-212), fronting of the subject and also of the object typically requires an initial copula which in the case of Irish optionally deletes in the simple present. The structures in question are generally considered to be cleft constructions, though the copula may, in fact, be better analyzed as a complementizer (see Tallerman this volume).34

Despite the above differences in distribution, the word order variants in the languages of Europe do conform to the word order generalizations advanced in § 3.3.1. There are no verb-final variants in the VSO languages; the SOV languages disfavour the verb-first orders and the SVO languages exhibit no dispreferences. What is striking, though, is that even the dispreferred verb-initial variants in SOV languages are more common than the neutral SVO variants in the global sample.

5.3. Word order flexibility

As already suggested by the data in Table 17, there are no rigid SOV languages in Europe. Of the SVO languages only two have rigid order. By contrast, all three of the VSO are rigid word order languages. This is shown in Table 18, which presents the degrees of word order flexibility among the languages in the European sample.

Table 18. Number of word order variants relative to basic order in the European sample

basic w/o    0          1          2          3          4+
SOV          0  0       5  23%     6  27%     1  5%      10 46%
SVO          2  11%     5  26%     2  11%     1  5%      10 50%
VSO          3  100%    0  0       0  0       0  0       0  0
split        0  0       0  0       0  0       0  0       3  100%
total        5  10%     10 20%     8  17%     2  4%      23 48%

Recall that in the global sample 20% of the SOV, 25% of the VSO and 30% of the SVO languages were rigid word order ones. Thus, whereas the European SOV and SVO languages display much lower levels of rigid order than globally, the converse applies to the VSO languages. If we also take into account the morphologically and phonologically marked orders, none of the European VSO languages will emerge as rigid word order languages, since Breton and Welsh and arguably also Irish have nonbasic SVO and OVS orders. But in this case the number of rigid SVO languages would also drop to zero. The two SVO languages in the sample which have no word order variants, i.e. Maltese and French, both display nearly all of the possible orders of the subject, object and verb with disjunctures. According to Vanhove (Eurotyp questionnaire), Maltese, in addition to its basic SVO order, has S#OV, O#SV, O#VS and VO#S with the indicated disjunctures. Some examples of each order are given in (39 a) through (39 d).

(39) Maltese
     a. It-tifel il-kelba la?at.
        the boy the dog hit:PFV:3SG:M
        'It is the dog that the boy hit.'
     b. Il-kelba it-tifel la?at.
        the dog the boy hit:PFV:3SG:M
        'It is the dog that the boy hit.'
     c. Il-kelba la?at it-tifel.
        the dog hit:PFV:3SG:M the boy
        'It is the dog that the boy hit.'
     d. La?at il-kelba it-tifel.
        hit:PFV:3SG:M the dog the boy
        'He hit the dog, the boy.'

And Colloquial French has O#SV, O#V#S, V#SO and perhaps also V#O#S with the indicated disjunctures and pronominal proclitics, e.g.:

(40) French (Suzanne Schlyter, Eurotyp questionnaire)
     a. Ce garçon, Marie le connaît.
        this guy Marie him knows
        'This guy, Mary knows him.'
     b. Ce garçon, elle le connaît, Marie.
        this guy she him knows Mary
     c. Elle le connaît, Marie, ce garçon.
        she him knows Mary the guy
     d. Elle le connaît, ce garçon, Marie Dupont.
        she him knows the guy Mary Dupont

Even more striking than the virtual absence of rigid word order languages is the high incidence of highly flexible SOV and SVO languages among the European sample; 46% of the European SOV languages and 50% of the SVO ones have highly flexible order. The corresponding figures for the global sample are only 7% for each. And 2/7 of the highly flexible SOV languages and 3/6 of the highly flexible SVO languages in the global sample are in fact European. If we were to eliminate the 14 European languages from the global sample, the number of highly flexible word order languages would drop from 24 (14%) to 16 (10%). In the light of our findings in §4.1 and §4.2 concerning the relationship between the absence of agreement and case marking and lack of word order flexibility, we would expect the low incidence of rigid and restricted SOV and SVO languages and the high incidence of rigid VSO ones in Europe to be related to comparable differences in the distribution of morphological marking. Let us now consider whether this is indeed so.

6. Morphological subject and object encoding in the languages of Europe

The distribution of agreement and case marking among the languages in the European sample is shown in Table 19.

Table 19. The distribution of agreement and case marking in the languages in the European sample relative to word order type

basic w/o    Agr         No Agr      Case        No Case
SOV          21 96%      1  4%       21 96%      1  4%
SVO          18 90%      2  10%      16 80%      4  20%
VSO          0  0        3  100%     0  0        3  100%
split        3  100%     0  0        3  100%     0  0
total        42 87%      6  13%      40 83%      8  17%

All but six of the European languages in the sample (87%) have agreement with nominal NPs. The corresponding figure for the global sample is 69%. There is thus an 18% difference in the level of agreement in the two samples. Particularly interesting is the difference in the distribution of agreement relative to basic order. The six European languages lacking agreement are Breton, Irish, Welsh, English, Swedish and Lezgian. The first three are VSO languages, the following two SVO and Lezgian is SOV. Thus whereas in the global sample 79% of the VSO languages have agreement as compared to 74% SOV and 56% SVO, in the European sample none of the VSO have agreement as opposed to 96% SOV and 90% SVO.

The difference in the levels of case marking among the European languages as compared to globally is even higher than the difference in the levels of agreement. All but eight (83%) of the European languages have case marking as compared to only 53% in the global sample, which is a difference of 30%. Of the eight European languages which lack case marking, three are VSO (Breton, Irish and Welsh), four are SVO (Dutch, English, French and Swedish) and one is SOV (Abkhaz). The differences in the levels of case marking relative to word order type in the European languages as compared to those in the global sample are respectively: for SOV 96% vs 71%, for SVO 80% vs 33% and for VSO 0% vs 42%. Thus, in the European SVO languages, case marking is more than twice as common as globally, while in the European VSO languages case marking is completely absent.

Since the level of agreement and particularly case marking among the languages of Europe is higher than the global level, the relationship between morphological marking and word order flexibility is also somewhat different. The relevant data are presented in Tables 20 and 21. Table 20 depicts the different degrees of flexibility among the languages with and without agreement and case marking and Table 21 the number of languages with and without agreement and case marking among the languages with different degrees of word order flexibility. I have not broken down the languages according to word order type, since this can be largely determined from Tables 17 and 19.

Table 20. Number of word order variants relative to the presence and absence of overt agreement and case marking in the European sample

              0          1          2          3          4
Agr     -     3  50%     2  33%     1  17%     0  0       0  0
        +     2  5%      8  19%     7  17%     2  5%      23 55%
Case    -     4  50%     3  38%     0  0       0  0       1  13%
        +     1  3%      7  18%     8  20%     2  5%      22 55%

Table 21. The presence and absence of overt agreement and case marking relative to number of word order variants in the European sample

              0          1          2          3          4
Agr     -     3  60%     2  20%     1  11%     0  0       0  0
        +     2  40%     8  80%     7  89%     2  100%    23 100%
Case    -     4  60%     3  30%     0  0       0  0       1  5%
        +     1  40%     7  70%     8  100%    2  100%    22 95%

Observe from the data in Table 20 that only 5% of the languages with agreement and 3% of the languages with case marking have rigid order. By contrast 55% of the languages with agreement and also 55% of those with case marking have highly flexible order. The former set of figures is considerably lower than those in the global sample given earlier in Tables 9 and 11 (3% vs 18% and 5% vs 10%) and the latter considerably higher (55% vs 17% and 55% vs 21%). Moreover, as the figures in both Tables 20 and 21 show, there is no flexible or highly flexible European language which lacks agreement and only one which lacks case marking. In the global sample (see Tables 10 and 14), by contrast, 17% of the flexible and highly flexible word order languages lack agreement and 21% lack case marking. On the basis of the European data one could therefore be justified in considering both agreement and case marking as fairly good predictors of word order flexibility and word order flexibility to be a strong predictor not only of the presence of agreement but also of the presence of case marking. Nonetheless, while the distribution of agreement and case marking in the languages of Europe may well lie at the heart of the common assumption that morphological marking goes hand in hand with word order flexibility, we have seen that there is in fact no necessary relationship between the two. Accordingly, the absence of agreement and case marking among the European VSO languages, and the high incidence of both among the SVO and particularly SOV languages may be seen as providing an explanation for the rigidity of order of the former and the low level of rigidity of the latter. However, since globally there is no correlation between the presence of agreement and/or case marking and word order flexibility, but only a correlation between the absence of agreement and lack of word order flexibility and between rigid word order and absence of case marking, it would be difficult to argue that the high levels of agreement and case marking among the languages of Europe are directly responsible for the amazingly large number of highly flexible word order languages in Europe.

7. Word order variation: the genetic and areal factor

When we look at the genetic distribution of the flexible word order languages in Europe, we see that in the main they belong to the Slavic, Baltic and Hellenic branches of Indo-European, to the Uralic phylum and the Northwest Caucasian and Nakh-Daghestanian phyla. The Celtic, Romance and Germanic branches of Indo-European and the Altaic languages, on the other hand, display less word order variation. Though with the exception of Abkhaz and Lezgian all the languages which lack case and/or agreement marking are Celtic (three), Germanic (three) or Romance (one), the Altaic languages have both case and agreement marking, but exhibit relatively little word order variation, at least in comparison to the Slavic and Uralic languages. This suggests that word order flexibility is to some extent dependent on genetic and areal factors. Such is also the impression that one gets on the basis of the distribution of the various levels of word order flexibility in the global sample. Given that the genetic affiliation fairly closely correlates with areal distribution and the latter can be more conveniently dealt with in tabular form than the former, in Table 22 I give the distribution of the various degrees of word order flexibility among the languages in the global sample relative to the six macro-areas distinguished by Dryer.

Table 22. Degrees of word order flexibility in the six macro-areas

macro-area          0          1          2          3          4+
Eurasia N = 23      1  4%      5  22%     4  17%     4  17%     9  39%
Africa N = 35       17 49%     13 37%     3  9%      1  3%      1  3%
SEA & Oc N = 27     8  30%     11 41%     4  15%     3  11%     1  4%
Aust & NG N = 31    5  16%     12 39%     2  7%      5  16%     7  23%
NAmer N = 33        4  12%     10 30%     7  21%     7  21%     5  15%
SAmer N = 22        3  14%     8  36%     9  41%     1  5%      1  5%

We see that there are considerable differences in the frequency of occurrence of the various degrees of word order flexibility in the six macro-areas. The most drastic differences emerge in terms of the number of highly flexible and rigid word order languages. Eurasia overshadows each of the other macro-areas in its proportion of highly flexible word order languages (39%). The only other macro-area which comes anywhere close to Eurasia in this regard is Aust-NG with 23%. NAmerica trails far behind with only 15% and the other macro-areas have only 5% or less.35 With respect to rigidity of order, Eurasia again stands out with proportionally the lowest number of rigid word order languages (4%), and Africa with proportionally the highest (49%). Since the macro-areas which exhibit the most languages with highly flexible word order also manifest low levels of rigid word order, we may expect the overall level of word order variation in the six macro-areas to echo the levels of highly flexible and rigid order. That this is indeed so is documented in Table 23, which presents the proportion of occurring word order variants relative to the logically possible number of variants in each macro-area and also the average number of word order variants.

Table 23. The distribution of word order variants relative to the logically possible ones according to macro-area

macro-area          Possible variants    Occurring variants    Average number of variants
Eurasia N = 23      112                  66  59%               2.87
Africa N = 35       175                  27  15%               0.77
SEA & Oc N = 27     136                  32  24%               1.19
Aust & NG N = 31    156                  65  42%               2.10
NAmer N = 33        167                  69  41%               2.09
SAmer N = 22        110                  33  30%               1.50
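The two derived columns of Table 23 can be recomputed from its counts. The following Python sketch is an illustration only, with names that are not part of the original study. It takes the number of languages, the possible variants and the occurring variants for each macro-area as given in the table, and derives the level of variation and the average number of variants per language.

```python
# Rows of Table 23: macro-area -> (languages, possible variants, occurring variants).
TABLE_23 = {
    "Eurasia": (23, 112, 66),
    "Africa": (35, 175, 27),
    "SEA & Oc": (27, 136, 32),
    "Aust & NG": (31, 156, 65),
    "NAmer": (33, 167, 69),
    "SAmer": (22, 110, 33),
}


def variation_level(area):
    """Occurring variants as a percentage of the logically possible ones."""
    _, possible, occurring = TABLE_23[area]
    return round(100 * occurring / possible)


def average_variants(area):
    """Mean number of word order variants per language in the macro-area."""
    languages, _, occurring = TABLE_23[area]
    return round(occurring / languages, 2)


if __name__ == "__main__":
    for area in TABLE_23:
        print(area, f"{variation_level(area)}%", average_variants(area))
    # Gap in the average number of variants between Eurasia and Africa.
    print(round(average_variants("Eurasia") - average_variants("Africa"), 1))
```

The same subtraction applied to Eurasia and SEA & Oc gives the second gap in average number of variants discussed for these macro-areas.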


The figures in Table 23 show that Eurasia exhibits the highest level of variation (59%) followed by Aust-NG (42%) and NAmerica (41%), then SAmerica (30%), SEA & Oc (24%) and finally Africa (15%). Moreover, the level of word order variation in Eurasia is nearly four times as high as in Africa and two and a half times as high as in SEA & Oc, the difference in the average number of variants between Eurasia and Africa being 2.1 and between Eurasia and SEA & Oc 1.7. Also worth noting is the fact that even Aust-NG and NAmerica, the macro-areas closest to Eurasia in terms of word order variation, have on average nearly one variant fewer than Eurasia. While due to lack of data the global sample may underrepresent the degree of word order variation in NAmerica (recall that I was unable to determine the word order variation for quite a few of the Amerindian languages in my original 237 language sample), it is significant that the extraordinarily high level of word order variation in Eurasia as compared to the languages in the other macro-areas is not echoed by an analogous difference in levels of case and agreement marking. The Eurasian languages in the sample have higher levels of both case and agreement marking than any other area, but not radically so. The relevant figures for agreement marking are 91% for Eurasia, 81% for SAmerica, 78% for NAmerica, 74% for Aust-NG, 51% for Africa and 44% for SEA & Oc.36 The corresponding figures for case marking are: Eurasia 83%, Aust-NG 74%, Africa 49%, SAmerica 48%, SEA & Oc 37% and NAmerica 33%. The percentage of highly flexible word order languages among the languages with agreement or case marking in Eurasia is, however, predictably higher than in the other macro-areas. For instance, whereas in Eurasia 29% of the languages with agreement and 37% of those with case marking have highly flexible order, the relevant figures for Aust-NG are 17% and 27% and for NAmerica 17% and 17%.
The influence of genetic and areal factors on word order variation can also be discerned by considering the distribution of word order flexibility relative to basic order in the six macro-areas. Though the macro-areas with the lowest levels of word order variation, Africa and SEA & Oc, also have the highest levels of SVO order (see Table 3), 57% of the Eurasian SVO languages, 33% of those from Aust-NG and 25% of the NAmerican ones are highly flexible, but none of the SVO languages from the other macro-areas have highly flexible order. Yet another potential correlate of genetic and areal factors is the high proportion of flexible and highly flexible word order languages which display non-accusative agreement and/or case marking. Two-thirds (16/24) of the highly flexible word order languages in the sample manifest non-accusative agreement and/or case marking. Of the 16 non-accusative highly flexible word order languages, seven are from Australia, four from Eurasia, three from NAmerica and only one each from SEA & Oc and SAmerica. The vast majority (14/16) thus originate from the three macro-areas which exhibit the highest levels of word order variation.

In sum, in the light of the above data, the fact that word order flexibility in Europe is closely tied to genetic and areal features is not an isolated phenomenon, but a reflection of a global tendency. While lack of word order flexibility correlates with lack of agreement and case marking, high word order flexibility appears to be to a large extent dependent on genetic and areal factors. In any case, if there are structural or morphological features which underlie the occurrence of word order flexibility, they remain to be discovered.

8. Concluding remarks
At the outset of this investigation several questions were posed concerning the distribution of word order variants relative to the basic order of a language, its morphological marking characteristics and its genetic and areal classification. We are now in a position to provide some tentative answers to these questions.
First of all, regarding the distribution of word order variants relative to word order type, the occurrence of verb-peripheral variants appears to be heavily dependent on the basic order that a language displays, while that of verb-medial variants is considerably less so. Recall from § 3 that languages with verb-peripheral basic order disfavour word order variants with the mirror image peripheral order, but verb-medial variants, though more common in verb-initial languages than in verb-final ones, are not disfavoured by any word order type. Moreover, OVS variants are dispreferred relative to SVO variants irrespective of word order type.
Concerning word order flexibility, i. e. the number of word order variants that a language displays, there are no significant differences relative to word order type with respect to either rigid or highly flexible order. VSO languages, however, are significantly less likely than SOV or SVO languages to exhibit no or minimal word order variation.
Turning to the relationship between word order variation and morphological marking, the analyzed data reveal that neither the presence of agreement nor that of case marking is a sufficient condition for flexible order, nor does rigid order entail the absence of either form of morphological marking. There is nonetheless a relationship between the two phenomena: flexible order tends to be accompanied by the presence of overt agreement and/or case marking, and lack of agreement and/or case marking tends to be accompanied by rigid or restricted order. This relationship, while valid for all the dominant word order types, is particularly strong for SVO languages; 92% of the flexible and highly flexible SVO languages in the sample have overt agreement and 69% have overt case marking; and 96% of the SVO languages without agreement and 79% of those without case marking are rigid or restricted word order languages. As the above figures suggest, the relationship between word order flexibility and overt agreement marking, on the one hand, and lack of agreement marking and rigid and restricted order, on the other, is stronger than that between word order flexibility and case marking.
As for potential areal and genetic differences in regard to word order flexibility, these are highly significant both overall and in relation to word order type. Globally, Eurasia exhibits higher levels of word order flexibility than any other macro-area, which is largely due to the high degree of word order flexibility among the languages of Europe. Nearly half (48%) of the languages in the European sample exhibit highly flexible order as compared to only 14% of those in the global sample. And whereas globally VSO languages tend not to exhibit rigid or restricted order, in Europe they do. There are also significant differences in word order flexibility among the European languages. The most flexible word order is manifested by the Northwest Caucasian, Uralic, Nakh-Daghestanian and Balto-Slavic languages and also Greek and Basque, the least flexible by the Celtic.
To what extent is the difference in the level of word order variation among the languages of Europe as compared to other areas determined by the current investigation an artifact of our limited knowledge of non-European languages? It is highly unlikely that we will ever be in a position to provide an answer to this question. I have no doubt that at least some of the languages in the global sample exhibit word order variants that so far have not been discerned.
As is well known, variations in word order are very sensitive to text type. Most descriptions of non-European languages are based on small corpora, typically consisting exclusively of transcripts of narratives which favour pronominal as opposed to nominal arguments. Since in narratives clauses with two overt nominal arguments are relatively infrequent, it is quite probable that even a reasonably sized corpus will not feature all the permissible constellations of the verb and its arguments in the language in question. This can be better appreciated by the rarity of some of the transitive word order patterns in the well studied European languages. For instance, in my small 2247 clause corpus of written Polish, a language which allows all six possible constellations of the subject, object and verb, there were only 7 (0.3%) instances of OSV orders and 11 (0.5%) of SOV (Siewierska 1993). Similar distributional data are cited by Vilkuna (this volume) for Finnish and several other Finnic languages. Particularly telling are Vilkuna's statistics for Finnish. In a corpus of expository prose consisting of 10141 clauses, she found only 8 VOS, 14 SOV and 42 VSO clauses which together constitute less than 0.5% of the data. In the light of such statistics the possibility of a particular word order option in a much more poorly studied language being overlooked must be assumed to be quite high.
The above notwithstanding, if the dispreference for mirror image orders in verb-peripheral languages discussed in § 3.3.1 is indeed valid, we would not expect a consideration of other text types or more extensive corpora to result in a proliferation of such orders, for instance in the languages of SEA & Oc or New Guinea. And if lack of case marking correlates with lack of word order flexibility, as argued in § 4.2, we should not expect the rigid or restricted non-case-marking SVO languages of Africa or South-East Asia to suddenly emerge as highly flexible ones. In other words, while the results of this investigation may under-represent the level of word order flexibility in some geographical areas, particularly NAmerica, I do not believe that they can be simply dismissed.
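To make concrete how rare such orders are in a corpus, the Polish percentages cited in the concluding remarks can be recomputed from the raw counts. This is only a sketch: the counts are those reported in the text for the 2247 clause corpus, while `POLISH_CORPUS_CLAUSES`, `polish_counts` and `share_pct` are hypothetical names chosen for this illustration.

```python
# Clause counts reported in the text for the written Polish corpus
# (Siewierska 1993). The names below are illustrative only.
POLISH_CORPUS_CLAUSES = 2247
polish_counts = {"OSV": 7, "SOV": 11}


def share_pct(count, total):
    """Return `count` as a percentage of `total`."""
    return 100 * count / total


if __name__ == "__main__":
    for order, n in polish_counts.items():
        pct = share_pct(n, POLISH_CORPUS_CLAUSES)
        print(f"{order}: {n}/{POLISH_CORPUS_CLAUSES} clauses = {pct:.1f}%")
```

With shares of this size, a corpus of a few hundred clauses could easily contain no instance at all of a permitted order, which is the point the text makes about more poorly documented languages.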

Notes
* I would like to thank all the members of the constituent order group for comments on earlier versions of this paper and particularly Matthew Dryer for his many helpful suggestions and insightful remarks. I would also like to express my thanks to Georg Bossong. Most of the research for this paper was funded by the Royal Netherlands Academy of Arts and Sciences (KNAW) whose support I gratefully acknowledge.
1. One could also take into account semantic features such as animacy and definiteness. This is somewhat problematic owing to the fact that there is currently no agreement as to the animacy and definiteness status of prototypical direct objects. In any case, languages in which certain word order patterns are allowed only if a constituent is definite or indefinite etc., tend to also mark such linearization patterns in other ways.
2. This statement holds for finite verbs. Nonfinite verbs may occur clause-initially in most of the Germanic languages as a result of a phenomenon called, in the generative literature, VP-preposing.
3. Verb-initial orders in transitive indicative declaratives were also possible in Old High German (Lenerz 1985: 103), Old Norse (Christoffersen 1980) and Gothic (Petersen Eurotyp questionnaire).
4. There are even languages which, arguably, do not allow two non-oblique NPs in a simple clause. According to Hukari (λ^β], such is the case in the Coast Salish language Lushootseed.
5. Whether the absence as opposed to the presence of agreement marking should be viewed as suggestive of the word order pattern in question being the basic order, by analogy to the occurrence of clitic doubling mentioned above, is by no means clear. I have accepted Andersen's (1988) analysis of Pari as an OVS language, since there are reasons other than the agreement facts which argue for this being the basic order.
6. In the case of SOV languages, the term free order is often used to denote free reordering of the verbal arguments and adjuncts in preverbal position.
7. I am aware of the fact that Ruhlen's (1987) classification of the world's languages is highly controversial (particularly the Amerindian, Indo-Pacific, Altaic and Austric phyla) and have made various adjustments to it in selecting the languages for my sample.
8. It is unquestionably the case that in some of the languages which have been assigned a basic order of S, O and V the relevant order is an epiphenomenon of a strong tendency for the subject and object to bear particular pragmatic statuses associated with specific clausal locations. The Slavic languages are a case in point, as are also the discourse configurational languages discussed by Kiss (this volume) and Primus (this volume). Needless to say, in a cross-linguistic study of this size one cannot hope to be able to distinguish systematically between languages in which a particular constellation of the subject, object and verb is essentially due to syntactic as opposed to pragmatic factors.
9. For the purpose of this exercise, the split languages are treated as having both SVO and SOV basic order. Thus, the number of languages with nonbasic SOV order in table 5 is 94 not 99 and the number of languages with nonbasic SVO order is 109 and not 114.
10. It should be mentioned in this context that in four of the 16 SVO languages which have an SOV variant, namely in Ewe, Godie, Tikar and Vute, the occurrence of SOV order is conditioned by tense.
11. I would like to stress that throughout the discussion, the term 'frequency' is always used in relation to the occurrence of a given nonbasic order relative to word order type and not in relation to its frequency in individual languages. I make no claims concerning the frequency of occurrence of the various nonbasic orders within languages.
12. In counting the preferences and dispreferences, I have ignored the split word order languages and the languages with no basic order. In the case of the minority word order types, VOS, OVS and OSV, which are so poorly represented in the sample, I have counted as a dispreference only the absence of a variant. In the dominant word order types, I have taken a variant to be dispreferred if it occurs in less than 15% of the languages manifesting the given word order type. Only those variants were counted as preferred which occur clearly more frequently than any other of the variants in a given word order type. Using this procedure, there are 11 instances of dispreferred variants and five of preferred variants; seven of the dispreferred variants do not occur at all in the languages exhibiting the given order.
13. It must be pointed out, though, that Steele's (26) correctly predicts that SVO languages exhibit less word order flexibility than the other word order types, while my (23) does not. As will be discussed in § 3.3.2, I, unlike Steele, do not use the generalizations in (23), (24) and (25) as a basis for measuring flexibility, since I find her typology not very illuminating.
14. Neither Steele's generalizations nor my own make any distinction between the frequency of occurrence and the potential likelihood of co-occurrence of the following neutral variants: SVO and OSV in SOV languages; SVO and SOV in OSV languages; SVO and VOS in VSO languages; SVO and VSO in VOS languages; and SVO and SOV in OVS languages. Nor is a distinction made between the following neutral variants: VSO and VOS in SOV and OSV languages, SOV and OSV in VSO and VOS languages.
15. According to Dunn (1979), Coast Tsimshian allows SVO order only if the subject is pronominal. Recall that nonbasic orders involving pronouns are not taken into account in this study. OVS order and also SV, on the other hand, may involve a nominal object or intransitive subject.
16. I know of six other instances, from languages outside the sample, in which the predictions in (23), (24) or (25) are broken. The SOV language Carib has an OVS variant but no SVO. The SOV language Parecis has OSV and VOS variants, but no SVO variant. The SOV Gugu Yalandji has SVO, OVS and VOS but no OSV. The OSV language Jamamadi has SOV and OVS but no SVO.
17. This would also follow from Steele's (1978) word order constraints, since if each word order variant in SVO languages is equally dispreferred, we would not expect there to be any patterns in the co-occurrence possibilities of the variants.
18. The second column of percentages reveals that the inclusion of the marked orders has no direct bearing on the dispreferences captured in (23), (24) and (25), with the exception of SOV languages in which now OVS variants slightly outnumber SVO variants.
19. The difference in the ratio of dispreferred and neutral orders which receive additional marking is particularly high in SOV languages; 45% of the dispreferred orders are marked, as opposed to only 8% of the neutral. The corresponding figures for VSO are 30% vs 12% and for VOS 50% vs 14%.
20. I have not been able to establish the factors which underlie verb-final order in the three V-first languages in the sample which have SOV and/or OSV clauses, namely, Nez Perce, Upper Chinook and Tamazight.
21. Two other Mayan languages which are claimed to have both preverbal topics and foci identified solely positionally, i. e. without additional morphological markers such as clitics, demonstratives or relational nouns, are Quiche (Norman 1978) and Kaqchikel (England 1991).
22. The same holds for Classical Nahuatl which, according to Steele (1976), is a VOS language in the process of change to an SVO one. Though the language exhibits SOV orders, Steele mentions that SOV clauses are very uncommon. OSV order, on the other hand, is possible only with a pause after the object.
23. The verb-final languages in the sample which exhibit VSO and/or VOS order are: Bandjalang, Yidiɲ, Basque, Evenki, Abkhaz, Sandawe, Karok (VOS), Mountain Maidu (VSO), Tucano (VSO) and Hindi (VOS).
24. Note that the above reasoning holds even if one adopts the view that fronting is essentially due not to topicality but to relative importance or newsworthiness, as claimed by Givon (1987), since entities rendered by subjects have been shown by all relevant studies (see e.g. the articles in Payne 1992b) to have a higher persistence (and thus importance) in the subsequent discourse than objects.
25. As mentioned in the introduction, the measure of word order flexibility adopted here differs significantly from that used in Bakker (this volume), where flexibility is determined on the basis of the number of head/modifier pairs in a language which exhibit both pre- and post-head order.
26. Note that if we consider only the word order types which fall within the scope of the Greenbergian typology of SOV, SVO, VSO, VOS, OVS and OSV, the absence of word order variants is more common in languages in which the subject precedes the object than in languages in which the object precedes the subject.
27. These labels are used only for ease of exposition and should not be interpreted as corresponding to how the same labels are used elsewhere in this volume.
28. The issue of the relationship between word order flexibility and agreement marking is also briefly discussed in Bakker (this volume).
29. The three flexible languages without agreement are: Mohave (for third person), Pitta-Pitta and Tagalog. The four highly flexible languages are: Hanis Coos, Bandjalang, Wangkumara and Yidiɲ.
30. Heine & Reh (1984: 152) state that the basic order of Gude is actually VSO and that SVO order is used only in constructions marked for focus which involve the occurrence of a special form of an aspect particle. Hoskison (1983), on the other hand, gives SVO order as the basic order in what he calls neutral aspect and VSO order as basic in completive, continuous and potential aspects. I have classified Gude as an SVO language on the basis of the order in the neutral aspect.
31. In claiming that SOV order in Mandarin is used for contrastive objects, I follow Sun & Givon (1985) rather than the earlier work of Li & Thompson (1975).
32. I would like to thank the following scholars for filling out the word order questionnaire or contributing language data: A. Afarli, G. M. Awbery, Emanuel Banfi, Giovanni M. G. Belluscio, Ines Loi Corvetto, Inge Genee, Riho Grünthal, Martin Haspelmath, Mateja Hocevar, Anders Holmberg, Jan de Jong, Johannes Gisli Jonsson, Katalin Kiss, Chryssoula Lascaratou, Ruta Marcinkeviciene, Yaron Matras, Juan Carlos Moreno Cabrera, Igor Nedjalkov, Christian T. Petersen, Beatrice Primus, Donall P. Baoill, Eusebio Osa, Bernard Oyharcabal, Jan Rijkhoff, Tapani Salminen, Merja Salo, Gerjan van Schaaik, Suzanne Schlyter, Rienk Smeets, Svillen Stanchev, Janig Stephens, Pirkko Suihkonen, Maggie Tallerman, Yakov G. Testelec, Ingrid Thelin, Martina Vanhove, Maria Vilkuna, Bibinur Zaguljajeva, Tomaso Zorzutti and Bostjan Zupanicic.
33. In Colloquial Welsh this particle is typically absent, although its mutation effects remain.
34. According to O Siadhail (1989: 210), Irish does allow initial subjects in narrative style. Thomson (1992: 110) states that in Manx a subject can also be placed in initial position for emphasis, without using the cleft construction. And MacAulay (1992a: 189) mentions a non-cleft form of fronting in Scots Gaelic. However, in each case I have been unable to determine whether this holds for both transitive and intransitive subjects and if so, whether the object can be a noun, as opposed to a pronoun. Cornish, in contrast to all the other Celtic languages, appears to have had much more flexible word order. For instance, according to George (1993: 455-57), Middle Cornish had VSO, SVO, OVS and SOV clauses.
35. The lower incidence of word order flexibility in North America and Australia, as compared to Eurasia, is quite startling given that these two areas are renowned for their freedom of order. This is undoubtedly to a large extent a consequence of incomplete data. More about this will be said in § 8.
36. Though these data appear to counter Nichols's (1992) findings in relation to the particularly high level of head-marking in North America, it must be remembered that I, unlike Nichols, have taken into account only agreement with third person referents. Four of the six languages in North America which do not have agreement with 3rd person subjects or objects do display agreement with speech act participants. In my original 237 language sample the level of agreement in NAmerica is marginally higher than in Eurasia; 95% of the languages in NAmerica exhibit agreement, as compared to 93% in Eurasia.

References
Abaev, V. I.
1964 A grammatical sketch of Ossetic. Publication 35 of the Indiana Research Center in Anthropology, Folklore and Linguistics. Bloomington: Indiana University.
Abbott, Miriam
1991 "Macushi", in: Desmond C. Derbyshire & Geoffrey K. Pullum (eds.), vol 3, 22-160.
Agricola, William Jan
1987 Lokomo Duan the Arawak language of Suriname. Ann Arbor: University Microfilms.
Alphonse, Ephraim S.
1956 Guaymi grammar and dictionary with some ethnological notes. Smithsonian Institution. Bureau of American Ethnology Bulletin 162.
Andersen, Torben
1988 "Ergativity in Pari, a Nilotic language", Lingua 75: 289-324.
Arnott, David W.
1970 The nominal and verbal systems of Fula. Oxford: Oxford University Press.
Badmajev, B. B.
1966 Grammatika kalmyckovo jazyka. Morfologija. Elista: Kalmyckoe kniznoe izdatel'stvo.
Bakker, Dik
this volume "Flexibility and consistency in word order patterns in the languages of Europe".
Ball, Martin J. (ed.)
1993 The Celtic languages. London: Routledge.
Bascom, Burton
1982 "Northern Tepehuan", in: Ronald W. Langacker (ed.), 267-393.
Bashir, Elena
1985 "Towards a semantics of the Burushaski verb", in: Arlene R. K. Zide et al. (eds.), Proceedings of the Conference on Participant Roles: South Asia and Adjacent Areas. Bloomington: Indiana University Linguistic Club, 1-32.
Bauman, James J.
1975 Pronouns and pronominal morphology in Tibeto-Burman. Ann Arbor: University Microfilms.
Beaumont, Clive H.
1979 New Ireland Tigak. Pacific Linguistics B 58. Canberra: Australian National University.
Behrens, Leila
1982 Zur funktionalen Motivation der Wortstellung: Untersuchungen anhand des Ungarischen. München: Veröffentlichungen des Finnisch-Ugrischen Seminars an der Universität München.

Bender, M. Lionel
1976 The Non-Semitic languages of Ethiopia. East Lansing (Michigan): African Studies Centre.
1986 "Asymmetrical case correspondences in Ethio-Semitic", Afrikanistische Arbeitspapiere 7: 127-135.
Benjamin, Geoffrey
1976 "An outline of Temiar grammar", in: Phillip N. Jenner et al. (eds.), Austroasiatic Studies. Part I. Honolulu: The University Press of Hawaii.
Berinstein, Ava
1986 "Indirect object constructions and multiple levels of syntactic representation", CLS 22, 36-50.
Berman, Ruth A.
1980 "The case of an (S)VO language: subjectless constructions in Modern Hebrew", Language 56: 759-776.
Bhat, D. N. S.
1991 Grammatical relations: the evidence against their universality. London: Routledge.
Birk, D. B. W.
1976 The Malak-Malak language, Daly River. Pacific Linguistics B 45. Canberra: Australian National University.
Blake, Barry J.
1979 "Pitta-Pitta", in: Robert M. W. Dixon & Barry J. Blake (eds.), 183-244.
1987a Australian Aboriginal grammar. London: Croom Helm.
1987b "The grammatical development of Australian languages", Lingua 71: 179-201.
Boas, Franz (ed.)
1911a Handbook of American Indian languages Part I. Washington: Government Printing Office.
Boas, Franz
1911b "Tsimshian", in: Franz Boas (ed.), 283-422.
1911c "Chinook", in: Franz Boas (ed.), 559-678.
Boas, Franz & John R. Swanton
1911 "Siouan Dakota", in: Franz Boas (ed.), 875-966.
Borg, A. J. & Bernard Comrie
1984a "Object diffuseness in Maltese", in: Frans Plank (ed.), 109-126.
Borgman, Donal
1990 "Sanuma", in: Desmond C. Derbyshire & Geoffrey K. Pullum (eds.), vol 2, Berlin: Mouton de Gruyter, 17-248.
Borsley, Robert D. & Janig Stephens
1989 "Agreement and the position of subjects in Breton", Natural Language and Linguistic Theory 7: 407-427.
Bradley, C. Henry
1970 A linguistic sketch of Jicaltepec Mixtec. University of Oklahoma: SIL.
Breen, J. Gavin
1976 "Ergative, locative and instrumental case inflections in Wangkumara", in: Robert M. W. Dixon (ed.), 336-339.
1981 "Margany and Gunya", in: Robert M. W. Dixon & Barry J. Blake (eds.), 275-393.

Bright, William
1957 The Karok language. University of California Publications in Linguistics 13. Berkeley and Los Angeles: University of California Press.
Broadbent, Sylvia M.
1964 The Southern Sierra Miwok language. University of California Publications in Linguistics 38. Berkeley and Los Angeles: University of California Press.
Bromley, H. Myron
1981 A grammar of Lower Grand Valley Dani. Pacific Linguistics C 63. Canberra: Australian National University.
Bruce, Les
1984 The Alamblak language of Papua New Guinea (East Sepik). Pacific Linguistics C 81. Canberra: The Australian National University.
Bynon, Theodora
1979 "The ergative construction in Kurdish", BSOAS 42: 211-224.
1980 "From passive to active in Kurdish via the ergative construction", in: Elizabeth C. Traugott et al. (eds.), 151-163.
Camaj, Martin
1984 Albanian grammar. Wiesbaden: Harrassowitz.
Camp, Elizabeth L.
1985 "Split ergativity in Cavinena", International Journal of American Linguistics 51: 38-58.
Campbell, Lyle
1985 The Pipil language of El Salvador. Mouton Grammar Library 1. Berlin: Mouton.
Capell, A. & H. E. Hinch
1970 Maung grammar, texts and vocabulary. The Hague: Mouton.
Caughley, Ross C.
1982 The syntax and morphology of the verb in Chepang. Pacific Linguistics B 84. Canberra: Australian National University.
Chadwick, Neil
1975 A descriptive study of the Djingili language. Canberra: Australian Institute of Aboriginal Studies.
1976 "The Western Barkly languages", in: Robert M. W. Dixon (ed.), 390-396, 432-436.
Chappell, Hilary
1991 "Syntax and semantics of the benefactive construction in Mountain Sgaw Karen", La Trobe University Working Papers in Linguistics 4: 37-52.
Christoffersen, Marit
1980 "Marked and unmarked word order in Old Norse", in: Elizabeth C. Traugott et al. (eds.), 115-121.
Chung, Sandra
1978 Case marking and grammatical relations in Polynesian. Austin: University of Texas Press.
Claudi, Ulrike & Daniela Mendel
1991 "Noun/verb distinction in Egyptian-Coptic and Mande: a grammaticalization perspective", Afrikanistische Arbeitspapiere (no number): 31-53.
Cole, Peter
1982 Imbabura Quechua. Lingua Descriptive Studies 5. Amsterdam: North Holland.

Comrie, Bernard
1979 "Degrees of ergativity: some Chukchee evidence", in: Frans Plank (ed.), 219-240.
1981 The languages of the Soviet Union. Cambridge: Cambridge University Press.
1984 "Some formal properties of focus in Modern Eastern Armenian", Annual of Armenian Linguistics 5: 1-21.
Conrad, Robert J. & Kepay Wogiga
1991 An Outline of Bukiyip grammar. Pacific Linguistics C 113. Canberra: Australian National University.
Conzemius, Eduard
1929 "Notes on the Miskito and Sumu language of Eastern Nicaragua and Honduras", International Journal of American Linguistics 5: 57-117.
Cooreman, Anne
1983 "Topic continuity and the voice system of an ergative language: Chamorro", in: Talmy Givon (ed.), 425-490.
1988 "The antipassive in Chamorro: variations on the theme of transitivity", in: Masayoshi Shibatani (ed.), 561-594.
Craig, G. Colette
1977 The Jacaltec language. Austin: University of Texas Press.
Crowley, Terry
1978 The Middle Clarence dialects of Bandjalang. Canberra: Australian Institute of Aboriginal Studies.
Daglish, Gerad M.
1979 "Subject identification and free word order: the case of Sandawe", Studies in African Linguistics 10: 273-310.
Dawkins, C. H.
1969 The fundamentals of Amharic. Sudan Interior Mission: Addis Ababa.
De Vries, Lourens
1989 Studies in Wambon and Kombai. Ph. D. dissertation, University of Amsterdam.
Demuth, Katherine & Mark Johnson
1989 "Interaction between discourse functions and agreement in Setwana", Journal of African Languages and Linguistics 11: 21-35.
Dench, Alan
1991 "Panyjima", in: Robert M. W. Dixon & Barry J. Blake (eds.), 125-244.
Derbyshire, Desmond C.
1981 "A diachronic explanation for the origin of OVS in some Carib languages", Journal of Linguistics 17: 209-220.
1983 "Ergativity and transitivity in Paumari", Working Papers of Summer Institute of Linguistics University of North Dakota 27: 11-28.
1985 Hixkaryana and linguistic typology. Dallas: Summer Institute of Linguistics & The University of Texas at Arlington.
1986a "Comparative survey of morphology and syntax in Brazilian Arawakan", in: Desmond C. Derbyshire and Geoffrey K. Pullum (eds.), 469-566.
1986b "Topic continuity and OVS order in Hixkaryana", in: J. Sherzer & G. Urban (eds.), Native South American discourse. Berlin: Mouton, 237-306.

Derbyshire, Desmond C. and Geoffrey K. Pullum
1981 "Object initial languages", International Journal of American Linguistics 47: 192-214.
Derbyshire, Desmond C. and Geoffrey K. Pullum (eds.)
1986 Handbook of Amazonian languages, vol 1. Berlin: Mouton de Gruyter.
1990 Handbook of Amazonian languages, vol 2. Berlin: Mouton de Gruyter.
1991 Handbook of Amazonian languages, vol 3. Berlin: Mouton de Gruyter.
Diesing, Molly
1990 "Verb movement and the subject position in Yiddish", Natural Language and Linguistic Theory 8: 41-79.
Dimmendal, Gerrit
1983 The Turkana language. Dordrecht: Foris.
Dixon, Robert M. W. (ed.)
1976 Grammatical categories in Australian languages. Canberra: Australian Institute of Aboriginal Studies.
Dixon, Robert M. W.
1977 Yidiɲ. Cambridge: Cambridge University Press.
1988 A grammar of Boumaa Fijian. Chicago: The University Press of Chicago.
Dixon, Robert M. W. & Barry J. Blake (eds.)
1979 The handbook of Australian languages, vol 1. Canberra: Australian National University.
1983 The handbook of Australian languages, vol 3. Canberra: Australian National University.
1991 The handbook of Australian languages, vol 4. Oxford, Melbourne: Oxford University Press.
Dryer, Matthew S.
1983 "Coos word order", paper delivered at the Western Conference on language. Eugene, Oregon.
1989 "Salinan word order", in: Scott Delancey (ed.), Papers from the 1988 Hokan-Penutian languages workshop. Eugene: University of Oregon Papers in Linguistics, Publications of the Center for Amerindian Linguistics and Ethnography 1, 40-49.
1992 "The Greenbergian word order correlations", Language 68: 81-138.
1995 "On the six-way word order typology", to appear in Studies in Language.
Dunn, John A.
1979 A reference grammar for the Coast Tsimshian language. Ottawa: National Museum of Canada.
Durie, Mark
1987 "Grammatical relations in Acehnese", Studies in Language 11: 365-399.
Eades, Diana
1979 "Gumbaynggir", in: Robert M. W. Dixon & Barry J. Blake (eds.), 245-361.
Eastman, Carol M.
1979 "Word order in Haida", International Journal of American Linguistics 45: 141-148.
1986 "Haida: exemplar of a pragmatic communicative mode", in: Benjamin F. Elson (ed.), Language in global perspective. Papers in honor of the 50th anniversary of the Summer Institute of Linguistics. Dallas: Summer Institute of Linguistics, 329-45.

Ebert, Karen
1979 Sprache und Tradition der Kera. Berlin: Verlag von Dietrich Reimer.
Egerod, Sören
1966 "Word order and word order classes in Atayal", Language 42: 346-369.
Einarsson, Stefan
1949 Icelandic. Baltimore: The Johns Hopkins Press.
Eissien, Okon E.
1990 "The aspectual character of a verb and tense in Ibibio", Journal of West African Languages 20: 64-72.
England, Nora C.
1991 "Changes in basic order in Mayan languages", International Journal of American Linguistics 57: 446-486.
Everett, Daniel L.
1986 "Piraha", in: Desmond C. Derbyshire & Geoffrey K. Pullum (eds.), 200-325.
Fagan, Joel L.
1986 A grammatical analysis of Mono-Alu. Pacific Linguistics B 96. Canberra: Australian National University.
Foley, William A.
1986 The Papuan languages of New Guinea. Cambridge: Cambridge University Press.
1991 The Yimas language of New Guinea. Stanford (California): Stanford University Press.
Foreman, Velma
1974 Grammar of Yessan Mayo. Santa Ana (California): Summer Institute of Linguistics.
Fortescue, Michael
1984 West Greenlandic. London: Croom Helm.
1993 "Eskimo word order variation and its contact-induced perturbation", Journal of Linguistics 29: 267-289.
Fortune, George
1955 An analytical grammar of Shona. London: Longmans, Green & Co.
Foster, Mary LeCron
1969 The Tarascan language. University of California Publications in Linguistics 56. Berkeley and Los Angeles: University of California Press.
Frachtenberg, Leo J.
1922 "Coos", in: Franz Boas (ed.), Handbook of American Indian languages, vol 2. BAE Bull. 40 No. 2. Washington, D.C.: Smithsonian Institution, 297-429.
Franklin, Karl James
1971 A grammar of Kewa, New Guinea. Pacific Linguistics C 16. Canberra: Australian National University.
Furby, E. S. & C. E. Furby
1977 A preliminary analysis of Garawa phrases and clauses. Pacific Linguistics B 42. Canberra: Australian National University.
George, Ken
1993 "Cornish", in: Martin J. Ball (ed.), 410-468.
Georgopoulos, Carol
1991 Syntactic variables, resumptive pronouns and A' binding in Palauan. Dordrecht: Kluwer Academic Press.


Gillies, William 1993 "Scottish Gaelic", in: Martin J. Ball (ed.), 145-227.
Givón, Talmy 1976 "On the VS word order in Israeli Hebrew: theoretical implications", in: Peter Cole (ed.), Studies in Modern Hebrew syntax. Amsterdam: North Holland, 153-181.
1987 "The pragmatics of word order: predictability, importance and attention", in: Michael Hammond et al. (eds.), Studies in syntactic typology. Amsterdam: John Benjamins, 243-284.
Givón, Talmy (ed.) 1983 Topic continuity in discourse. Amsterdam: John Benjamins.
Green, M. M. & G. E. Igwe 1963 A descriptive grammar of Igbo. Berlin: Akademie Verlag. London: Oxford University Press.
Greenberg, Joseph H. 1963 "Some universals of grammar with particular reference to the order of meaningful elements", in: Joseph H. Greenberg (ed.), Universals of language. Cambridge, Mass.: MIT Press, 73-113.
Grimes, Joseph E. 1964 Huichol syntax. The Hague: Mouton.
Gulian, Kevork H. 1957 Elementary Modern Armenian grammar. New York: Frederick Ungar.
Hagège, Claude 1969 Esquisse linguistique du tikar. Paris: SELAF.
Hagman, Roy S. 1973 Nama Hottentot grammar. Ann Arbor: University Microfilms.
Haider, Hubert & Martin Prinzhorn (eds.) 1986 Verb second phenomena in Germanic languages. Dordrecht: Foris.
Haile, F. Bernard 1926 A manual of Navaho grammar. New York: AMS Press.
Haiman, John 1980 Hua: a Papuan language of the Eastern Highlands of New Guinea. Amsterdam: John Benjamins.
Hale, Austin & David Watters 1973 Clause, sentence and discourse patterns in selected languages of Nepal. Part II Clause. Norman: Summer Institute of Linguistics, University of Oklahoma.
Hale, Kenneth 1973 "A note on subject-object inversion in Navajo", in: Braj B. Kachru et al. (eds.), Issues in linguistics: Papers in honour of Henry and Renée Kahane. Urbana: University of Illinois Press, 300-309.
Harbert, Wayne & Willem Pet 1988 "Movement and adjunct morphology in Arawak and other languages", International Journal of American Linguistics 54: 416-435.
Harford Perez, Carolyn 1983 "Locative pseudo-subjects in Shona", Journal of African Languages and Linguistics 5: 131-155.


Harris, Alice C. 1985 Syntax and semantics 18. Diachronic syntax: the Kartvelian case. New York: Academic Press.
Harris, H. 1981 A grammar of Comox. Ann Arbor: University Microfilms.
Harrison, Carl H. 1983 "Typological disharmony and ergativity in Guajajara", Working Papers of Summer Institute of Linguistics North Dakota 27: 73-106.
Harrison, Sheldon P. 1976 Mokilese reference grammar. Honolulu: The University Press of Hawaii.
Hartzler, Margaret 1993 "Sentani", in: Peter Kahrel & René van den Berg (eds.), 51-64.
Haspelmath, Martin 1993 A grammar of Lezgian. Mouton Grammar Library 9. Berlin: Mouton de Gruyter.
Haviland, John 1979 "Guugu Yimidirr", in: Robert M. W. Dixon & Barry J. Blake (eds.), 599-611.
Heath, Jeffrey 1978 Ngandi grammar, texts and dictionary. Canberra: Australian Institute of Aboriginal Studies.
Heine, Bernd 1980 "Language typology and linguistic reconstruction: The Niger-Congo case", Journal of African Languages and Linguistics 2: 95-112.
Heine, Bernd & Mechthild Reh 1984 Grammaticalization and reanalysis in African Languages. Hamburg: Helmut Buske Verlag.
Hetzron, Robert 1976 "The Agaw languages", Afroasiatic Linguistics 3: 31-45.
Hewitt, B. George 1979 Abkhaz. Lingua Descriptive Studies 2. Amsterdam: North Holland.
Hinds, John 1982 Ellipsis in Japanese discourse. Alberta: Linguistic Research, Inc.
1983 "Topic continuity in Japanese", in: Talmy Givón (ed.), 43-93.
Holmer, Arthur 1993 "Atayal clitics and sentence structure", Working Papers Lund University Department of Linguistics 40: 71-94.
Holmberg, Anders this volume "Word order variation in some European SVO languages: a parametric approach".
Hook, Peter E. 1976 "Is Kashmiri an SVO language?", Indian Linguistics 37: 133-142.
1987 "Poguli syntax in the light of Kashmiri: a preliminary report", Studies in the Linguistic Sciences 17: 63-71.
Hoskison, James T. 1983 A grammar and dictionary of the Gude language. Ann Arbor: University Microfilms.


Huang, Lillian N. 1994 "Ergativity in Atayal", Oceanic Linguistics 33: 131-143.
Hudson, Richard A. 1976 "Beja", in: M. Lionel Bender (ed.), 97-132.
Huffman, Franklin E. 1976 Modern Spoken Cambodian. New Haven: Yale University Press.
Hukari, Thomas E. 1976 "Person in Coast Salish", International Journal of American Linguistics 42: 305-318.
Hutchinson, John P. 1971 "Coreferent pronominalization in Dire Songhai", Studies in African Linguistics 2: 83-103.
1976 Aspects of Kanuri syntax. Ph. D. dissertation. Indiana University.
1986 "Major constituent case marking in Kanuri", in: Gerrit J. Dimmendaal (ed.), Current approaches to African linguistics, vol. 3. Dordrecht: Foris, 191-208.
Jackson, Ellen M. 1980 "Aspect, tense, and time shifts in Tikar", Journal of African Languages and Linguistics 2: 17-37.
Jaggar, Philip 1978 "And what about ...? - topicalization in Hausa", Studies in African Linguistics 9: 69-81.
Jankowski, Henryk 1992 Gramatyka języka krymsko-tatarskiego. Poznań: Wydawnictwo Naukowe Uniwersytetu imienia Adama Mickiewicza.
Jensen, John T. 1977 Yapese reference grammar. Honolulu: The University Press of Hawaii.
Jones, Linda K. 1986 "The question of ergativity in Yawa, a Papuan language", Australian Journal of Linguistics 6: 37-56.
Jones, Michael A. 1993 Sardinian syntax. London: Routledge.
Jones, Wendell & Paula Jones 1991 Barasano syntax. Studies in the languages of Colombia 2. SIL and The University of Texas at Arlington.
Josephs, Lewis S. 1975 Palauan reference grammar. Honolulu: The University Press of Hawaii.
Junghare, Indira Y. 1985 "The functions of word order variants in Indo-Aryan", in: Elena Bashir et al. (eds.), Selected Papers from SALA-7. Bloomington: Indiana University Linguistic Club, 236-253.
Kahrel, Peter & René van den Berg (eds.) 1994 Typological studies in negation. Amsterdam: John Benjamins.
Kaswanti Purwo, Bambang 1988 "Voice in Indonesian: a discourse study", in: Masayoshi Shibatani (ed.), 195-241.
Keen, Sandra 1983 "Yukulta", in: Robert M. W. Dixon & Barry J. Blake (eds.), vol. 3, 191-304.


Keenan, Edward L. 1976 "Remarkable subjects in Malagasy", in: Charles Li (ed.), Subject and topic. New York: Academic Press, 249-301.
Kiss, Katalin É. 1981 "Structural relations in Hungarian, a "free" word order language", Linguistic Inquiry 12: 185-213.
this volume "Discourse-configurationality in the languages of Europe".
Knecht, Laura E. 1986 Subject and object in Turkish. Ph. D. dissertation. MIT.
Kumakhov, Mukhadin & Karina Vamling 1993 "Complement types in Kabardian", Working Papers Lund University Department of Linguistics 40: 151 — 131.
Kuprijanova, Z. N. & M. J. Barmic & L. V. Homic 1985 Neneckij jazyk. Leningrad: Gosudarstvennoje Ucebno-Pedagogiceskoe Izdatel'stvo Ministerstva Prosvescenija RSFSR.
Ladusaw, William A. 1985 "The category structure of Kusaal", Proceedings of the Berkeley Linguistic Society 11: 196-206.
Langacker, Ronald W. (ed.) 1982 Studies in Uto-Aztecan grammar, vol. 3. Arlington: Summer Institute of Linguistics.
Lascaratou, Chryssoula 1989 Functional approach to constituent order with particular reference to Modern Greek. Athens.
1994 "An overview of word order in Modern Greek", EUROTYP Working Paper II/6: 39-58.
Lawal, Nike S. 1987 "Yoruba relativization and the continuous segment principle", Studies in African Linguistics 18.1: 67-89.
Lee, Kee-dong 1975 Kusaiean reference grammar. Honolulu: University Press of Hawaii.
Lenerz, Jürgen 1985 "Diachronic syntax: verb position and comp in German", in: Jindrich Toman (ed.), Studies in German grammar. Dordrecht: Foris, 103-132.
Levine, Robert D. 1979 "Haida and Na-Dene: a new look at the evidence", International Journal of American Linguistics 45: 157-170.
Li, Charles & Sandra A. Thompson 1975 "The semantic function of word order in Chinese", in: Charles Li (ed.), Word order and word order change. Austin: University of Texas Press, 163-195.
Li, Charles & Sandra A. Thompson & Jesse O. Sawyer 1977 "Subject and word order in Wappo", International Journal of American Linguistics 43: 85-100.
Lindenfeld, Jacquelin 1973 Yaqui syntax. Berkeley & Los Angeles: University of California Press.
Little, Greta D. 1976 "Word order function typology: the Amharic connection", Studies in African Linguistics 7: 83-90.


Longacre, Robert E. 1990 Storyline concerns and word order typology. The James S. Coleman African Studies Centre and The Department of Linguistics University of California at Los Angeles.
Longacre, Robert E. & F. Woods (eds.) 1977 Discourse grammar, vol. 3. Dallas: University of Texas at Arlington.
Loogman, A. 1965 Swahili grammar and syntax. Pittsburgh: Duquesne.
Lydall, Jean 1976 "Hamer", in: M. Lionel Bender (ed.), 393-437.
Mac Eoin, Gearóid 1993 "Irish", in: Martin J. Ball (ed.), 101-144.
Macaulay, Donald 1992a "The Scottish Gaelic language", in: Donald Macaulay (ed.), 137-230.
1992b (ed.) The Celtic languages. Cambridge: Cambridge University Press.
Macaulay, Monica 1992 "Inverse marking in Karuk: the function of the suffix -ap", International Journal of American Linguistics 58: 182-201.
Macdonald, Lorna 1990 A grammar of Tauya. Mouton Grammar Library 6. Berlin: Mouton de Gruyter.
Mao, T-W & T-Y Chou 1972 "A brief description of the Yao language", in: H. C. Purnell (ed.), Miao and Yao Linguistic Studies. Ithaca: Department of Asian Studies, Cornell University, 239-255.
Marchese, Lynell 1978 "Time reference in Godie", in: J. E. Grimes (ed.), Papers on discourse. Arlington, Texas: Summer Institute of Linguistics: The University of Texas at Arlington, 63-75.
1983 "On assertive focus and the inherent focus nature of negatives and imperatives: evidence from Kru", Journal of African Languages and Linguistics 5: 115-130.
1984 "Tense innovation in the Kru language family", Studies in African Linguistics 15: 189-213.
Masica, Colin P. 1976 Defining a linguistic area, South Asia. Chicago: University of Chicago Press.
Matras, Yaron 1992 "Markedness shift and multiple markedness shift in Indo-Iranian: Kurdish, Hindi and Romani", Paper presented at the 5th International Conference on Functional Grammar, University of Antwerp, 24-28th August.
Matras, Yaron & Hans-Jürgen Sasse (eds.) 1995 Verb-subject order and theticity in European languages. Special issue of Sprachtypologie und Universalienforschung 48.1/2.
McLeod, Ruth 1974 "Paragraph, aspect and participants in Xavante", Linguistics 132: 51-74.
Mithun, Marianne 1976 A grammar of Tuscarora. New York: Garland.

1987 "Is basic word order universal?", in: Russell S. Tomlin (ed.), Coherence and grounding in discourse. Amsterdam: John Benjamins, 281-328.
Mock, Carol C. 1980 "Chocho case marking and the typology of case", ms.
Morin, Yves-Charles & Etienne Tiffou 1988 "Passives in Burushaski", in: Masayoshi Shibatani (ed.), 493-524.
Munro, Pamela & Lynn Gordon 1982 "Syntactic relations in Western Muskogean: a typological perspective", Language 58: 81-115.
Munro, Pamela 1976 Mojave syntax. New York: Garland.
Murane, Elizabeth 1974 Daga grammar. Summer Institute of Linguistics. Norman: University of Oklahoma.
Nedjalkov, Igor 1983 "Evenki", in: Peter Kahrel & René van den Berg (eds.), 1-34.
Nedjalkov, Vladimir P. 1979 "Degrees of ergativity in Chukchee", in: Frans Plank (ed.), 241-262.
Newman, Stanley S. 1946 "The Yawelmani dialect of Yokuts", in: Harry Hoijer et al. (eds.), 227-248.
1965 Zuni grammar. Albuquerque: The University of New Mexico Press.
Nichols, Johanna 1984 "Direct and oblique objects in Chechen-Ingush and Russian", in: Frans Plank (ed.), 183-209.
1992 Linguistic diversity in space and time. Chicago: The University of Chicago Press.
Nichols, Johanna & Anthony C. Woodbury (eds.) 1981 Grammar inside and outside the clause. Cambridge: Cambridge University Press.
Noonan, Michael 1992 A grammar of Lango. Mouton Grammar Library 7. Berlin: Mouton de Gruyter.
Norman, William M. 1978 "Advancement rules and syntactic change: the loss of instrumental voice in Mayan", Proceedings of the Berkeley Linguistic Society 4: 258-276.
Ó Dochartaigh, Cathair 1992 "The Irish language", in: Donald Macaulay (ed.), 11-99.
Ó Siadhail, Mícheál 1989 Modern Irish. Cambridge: Cambridge University Press.
Oates, L. F. 1976 "Ergative, locative and instrumental case inflection in Muruwari", in: Robert M. W. Dixon (ed.), 342-347.
Okell, John 1969 A reference grammar of colloquial Burmese. London: Oxford University Press.
Olson, Michael L. 1981 Barai clause junctures: towards a functional theory of interclausal relations. Dissertation, Australian National University. Canberra.


Osborne, C. R. 1974 The Tiwi language. Canberra: Australian Institute of Aboriginal Studies.
Owens, Jonathan 1985 A grammar of Harar Oromo. Hamburg: Helmut Buske Verlag.
Palmer, F. R. 1957 "The verb in Bilin", Bulletin of the School of Oriental and African Studies 19: 131-159.
1958 "The noun in Bilin", Bulletin of the School of Oriental and African Studies 21: 376-391.
Pasch, Helma & Talmy Givón 1988 "Verb complementation in Sango", Afrikanistische Arbeitspapiere 16: 69-96.
Payne, Dorothy 1990 The pragmatics of word order. Typological dimensions of verb initial languages. Berlin: Mouton de Gruyter.
1991a "Introduction", in: Dorothy Payne (ed.), 1-15.
1991b (ed.) Pragmatics of word order flexibility. Amsterdam: John Benjamins.
Payne, Thomas E. 1982 "Role and reference related subject properties and ergativity in Yup'ik Eskimo and Tagalog", Studies in Language 6.1: 75-106.
Peck, Stephen M., Jr. 1988 Tense, aspect and mood in Guinea-Casamance Portuguese Creole. Ann Arbor: University Microfilms.
Penchoen, T. G. 1973 Tamazight of the Ayt Nidhir. Los Angeles: Undena Publications.
Perrin, M. 1974 "Mambila", in: John Bendor-Samuel (ed.), Studies in Nigerian languages 4. Ten Nigerian tone systems. Kano: Centre for the Study of Nigerian languages, Ahmadu Bello University, 93-108.
Plank, Frans (ed.) 1979 Ergativity. New York: Academic Press.
1984 Objects. London: Academic Press.
Platzack, J. Christer 1986 "The position of the finite verb in Swedish", in: Hubert Haider & Martin Prinzhorn (eds.), 27-47.
Polinskaja, Maria S. 1989 "Object initially: OSV", Linguistics 27: 257-303.
Poppe, Nicholas 1964 "Der altaische Sprachtyp", in: B. Spuler (ed.), Handbuch der Orientalistik Voll: Mongolistik. Leiden: Brill, 1-16.
Primus, Beatrice this volume "The relative order of recipient and patient in the languages of Europe".
Prost, G. R. 1962 "Signaling of transitive and intransitive in Chacobo (Pano)", International Journal of American Linguistics 28: 108-118.
Prost, R. P. A. 1956 "La Langue Sonay et Ses Dialectes", Mémoires de l'Institut Français d'Afrique Noire 47. Dakar: IFAN.


Quizar, Robin & Susan M. Knowles-Berry 1991 "Ergativity in Cholan languages", International Journal of American Linguistics 54: 77-95.
Recinos, Adrian 1957 Crónicas indígenas de Guatemala. Guatemala: Imprenta Universitaria.
Reh, Mechthild 1985 Die Krongo Sprache (niino mo-di). Berlin: Dietrich Reimer Verlag.
1986 "Where have all the case prefixes gone?", Afrikanistische Arbeitspapiere 3: 121-134.
Rehg, K. L. 1981 Ponapean reference grammar. Honolulu: The University Press of Hawaii.
René van den Berg 1989 A grammar of the Muna language. Ph. D. dissertation. Leiden.
Rigsby, Bruce 1975 "Nass-Gitksan: an analytic ergative syntax", International Journal of American Linguistics 41: 346-354.
Rijkhoff, Jan, Dik Bakker, Kees Hengeveld & Peter Kahrel 1993 "A method of language sampling", Studies in Language 17: 169-203.
Robins, R. H. 1958 The Yurok language: grammar, texts, lexicon. University of California Publications in Linguistics 15. Berkeley and Los Angeles: University of California Press.
Romero-Figueroa, Andres 1985 "OSV as the basic word order in Warao", Lingua 66: 115-134.
Rood, David S. 1976 Wichita grammar. New York: Garland.
Rosenbaum, Harvey 1977 "Zapotec gapping as counterevidence to some universal proposals", Linguistic Inquiry 8: 379-395.
Rude, Noel 1982 "Promotion and topicality of Nez Perce objects", Proceedings of the Berkeley Linguistic Society 8: 463-483.
1983 "Ergativity and the active-stative typology in Loma", Studies in African Linguistics 14: 265-283.
1991 "On the origin of the Nez Perce NP suffix", International Journal of American Linguistics 57: 24-50.
Ruhlen, Merritt 1987 A guide to the world's languages, vol. 1. Classification. Stanford: Stanford University Press.
Saltarelli, Mario 1988 Basque. London: Routledge.
Samarin, William J. 1967 A grammar of Sango. The Hague: Mouton.
Sampson, John 1968 The dialect of the Gypsies of Wales. Oxford: The Clarendon Press (first published 1926).
Schachter, Paul & Fe Otanes 1972 Tagalog reference grammar. Berkeley & Los Angeles: University of California Press.


Sharpe, M. C. 1976 "Are Australian languages syntactically nominative-ergative?", in: Robert M. W. Dixon (ed.), 505-515.
Shibatani, Masayoshi (ed.) 1988 Passive and voice. Amsterdam: John Benjamins.
Shipley, William F. 1964 Maidu grammar. University of California Publications in Linguistics 51. Berkeley and Los Angeles: University of California Press.
Shnukal, Anna 1988 Broken. An introduction to the creole language of Torres Straits. Pacific Linguistics C. Canberra: Australian National University.
Siewierska, Anna & Ludmila Uhlířová this volume "An overview of word order in Slavic languages".
Silverstein, Michael 1974 "Deixis and deducibility in Wasco-Wishram passive of evidence", Proceedings of the Berkeley Linguistic Society 4: 238-253.
1976 "Hierarchies of features and ergativity", in: Robert M. W. Dixon (ed.), 172-190.
Smeets, Ineke 1989 A Mapuche grammar. Ph. D. dissertation. Leiden.
Smeets, Rienk 1992 "On valencies, actants and actant coding in Circassian", ms., Leiden University.
Soane, Ely B. 1913 Grammar of the Kurmanji or Kurdish language. London: Luzac & Co.
Speiser, E. A. 1941 Introduction to Hurrian. New Haven: American School of Oriental Research.
Starks, Donna 1987 "Word ordering: more than ordering subjects, objects and verbs", in: Paul D. Kroeber & Robert E. Moore (eds.), Native American languages and grammatical typology. Bloomington, Indiana University: Indiana University Linguistic Club.
Steele, Susan 1976 "A law of order: word order change in Classical Aztec", International Journal of American Linguistics 42: 31-45.
1978 "Word order variation: a typological survey", in: Joseph H. Greenberg (ed.), Universals of human language, vol. 4. Stanford: Stanford University Press, 585-623.
1989 Agreement and antiagreement: A syntax of Luiseño. Dordrecht: Reidel.
Stone, Gerald 1993 "Sorbian", in: Bernard Comrie & Greville G. Corbett (eds.), 593-685.
Suárez, Jorge A. 1983 The Mesoamerican Indian languages. Cambridge: Cambridge University Press.
Subrahmanyam, P. S. 1971 Dravidian verb morphology. Annamalai University. Annamalainagar, Tamilnadu.


Sun, Chao-Fen & Talmy Givón 1985 "On the so-called SOV word order in Mandarin Chinese: a quantified text study and its implications", Language 61: 329-351.
Tallerman, Maggie this volume "Celtic word order: some theoretical issues".
Testelec, Yakov this volume "Word order in Kartvelian languages".
this volume "Word order in Daghestanian languages".
Tamrazian, Armine' 1991 "Focus and wh-movement in Armenian", University College London Working Papers in Linguistics 3: 101-121.
Thomas, David D. 1971 Chrau grammar. Honolulu: University of Hawaii Press.
Thompson, Lawrence C. 1965 A Vietnamese grammar. Seattle: University of Washington Press.
Thomson, Robert L. 1992 "The Manx language", in: Donald Macaulay (ed.), 100-136.
Thráinsson, Höskuldur 1986 "V1, V2, V3 in Icelandic", in: Hubert Haider & Martin Prinzhorn (eds.), 169-194.
Thwing, Rhonda & John Watters 1987 "Focus in Vute", Journal of African Languages and Linguistics 9: 95-121.
Tiersma, Pieter M. 1985 Frisian reference grammar. Dordrecht: Foris.
Topping, Donald 1979 Chamorro reference grammar. Honolulu: University of Hawaii Press.
Traugott, Elizabeth C. et al. (eds.) 1980 Papers from the fourth International Conference on Historical Linguistics. Amsterdam: North Holland.
Tucker, A. N. & M. A. Bryan 1956 The Non-Bantu languages of North-Eastern Africa. London: Oxford University Press.
1966 Linguistic analyses: The Non-Bantu languages of North-Eastern Africa. London: Oxford University Press.
Unseth, Pete 1986 "Word order shift in negative sentence of Surma languages", Afrikanistische Arbeitspapiere 5: 135-143.
Urban, Greg 1985 "Ergativity and accusativity in Shokleng (Ge)", International Journal of American Linguistics 51: 164-187.
Van Valin, Robert D. 1986 "Case marking and the structure of the Lakhota clause", in: Johanna Nichols & Anthony C. Woodbury (eds.), 363-397.
Vilkuna, Maria 1989 Free word order in Finnish. Helsinki: Suomalaisen Kirjallisuuden Seura.
this volume "Word order in European Uralic".
Vogt, Hans 1971 Grammaire de la langue géorgienne. Oslo: Universitetsforlaget 1971 (Instituttet for Sammenlignende Kulturforskning B, 57).


Walker, Alan T. 1982 "A grammar of Savu", NUSA Linguistic Studies in Indonesian and Languages in Indonesia vol. 13.
Warotamasikkhadit, Udom 1972 Thai syntax: an outline. The Hague: Mouton.
Watkins, Laurel J. 1980 A Grammar of Kiowa. Ann Arbor: University Microfilms.
Watters, David E. 1973 "Clause patterns in Kham", in: Austin Hale & David E. Watters (eds.), 39-202.
Wedekind, Klaus 1972 An outline of the grammar of Busa. Ph. D. dissertation. Kiel.
Wells, Margaret A. 1979 Siroi grammar. Pacific Linguistics B 51. Canberra: Australian National University.
Welmers, W. E. 1973 African language structures. Berkeley: University of California Press.
West, Birdie 1977 "Results of Tucanoan syntax questionnaire pilot study", in: Robert E. Longacre & F. Woods (eds.), 339-375.
Westermann, Diedrich & M. A. Bryan 1952 Languages of West Africa. (Handbook of African Languages 2). London: International African Institute.
Whistler, Kenneth 1986 "Focus, perspective, and inverse person marking in Nootkan", in: Johanna Nichols & Anthony C. Woodbury (eds.), 227-265.
Whitehead, Carl R. 1981 "Subject, object and indirect object: towards a typology of Papuan languages", Language and Linguistics in Melanesia 13: 32-63.
Williamson, Kay 1965 A grammar of the Kolokuma dialect of Ijo. Cambridge: Cambridge University Press.
Wise, Mary R. 1986 "Grammatical characteristics of Preandine Arawakan languages of Peru", in: Desmond C. Derbyshire & Geoffrey K. Pullum (eds.), 566-642.
Wolfart, H. C. & J. F. Carroll 1981 Meet Cree. A guide to the Cree language. Edmonton: The University of Alberta Press.
Yar-Shater, Ehsan 1969 A grammar of the southern Tati dialects. The Hague - Paris: Mouton.
Yiman, B. 1988 "Focus in Oromo", Studies in African Linguistics 19: 365-384.
Zakrzewska, Ewa 1993 "The inner verbal subject in Bohairic Coptic", Discussions in Egyptology 26: 71-90.


Appendix I Global sample

Languages (N=171)

AFRICA:
Afro-Asiatic (Amharic, Beja, Bilin, Coptic, Gude, Hamer, Kera, Oromo, Tamazight, Hebrew)
Khoisan (Nama, Sandawe)
Niger-Kordofanian (Busa, Ewe, Fula, Godie, Igbo, Kolokuma, Krongo, Kusaal, Loma, Sango, Shona, Swahili, Tikar, Vute, Yoruba)
Nilo-Saharan (Fur, Kanuri, Lango, Murle, Pari, Songhai, Turkana)
Pidgins & Creoles (Kriol)

AUST-NG:
Australian (Alawa, Bandjalang, Djingili, Garawa, Gugu-Yimidhirr, MalakMalak, Maung, Muruwari, Ngandi, Panjima, Pitta-Pitta, Tiwi, Wangkumara, Yidiny, Yukulta)
Indo-Pacific (Alamblak, Barai, Daga, Gapun, Grand-Valley-Dani, Hua, Kewa, Mountain-Arapesh, Sentani, Siroi, Tauya, Wambon, Yava, Yessan-Mayo, Yimas)
Pidgins & Creoles (Broken)

EURASIA:
Altaic (Evenki, Japanese, Karachay, Turkish)
Northwest Caucasian (Abkhaz; Georgian)
Chukchi-Kamchatkan (Chukchi)
Elamo-Dravidian (Kannada)
Austric (Santali)
Indo-Hittite (Albanian, Armenian, Dutch, Greek, Hindi, Italian, Kashmiri, Polish, Welsh)
Language Isolates (Basque, Burushaski, Hurrian)
Uralic-Yukaghir (Finnish, Hungarian)

N-AMER:
Amerind (Chocho, Choctaw, Chontal, Comox, Dakota, Hanis Coos, Huichol, Jacaltec, Karok, Kiowa, Luiseño, Mixtec, Mohave, Mountain-Maidu, Nez-Perce, Nootka, Northern Tepehuan, Pipil, Salinan, Southern Sierra Miwok, Tsimshian, Tarascan, Tuscarora, Upper-Chinook, Wappo, Wichita, Woods Cree, Yurok, Zapotec, Zuni)
Eskimo-Aleut (Greenlandic Eskimo)
Na-Dene (Navajo, Haida)

S-AMER:
Amerind (Amuesha, Arawak, Cavineña, Chacobo, Chavante, Guajajara, Guaymi, Hishkaryana, Makushi, Mapuche, Miskito, Paumari, Pirahã, Quechua, Sanuma, Southern-Barasano, Tucano, Warao, Waura, Xokleng, Yagua)
Pidgins and Creoles (Saramaccan)

SEA & OC:
Sino-Tibetan (Burmese, Chepang, Kham, Mandarin, Newari, Sgaw)
Austric (Acehnese, Atayal, Chamorro, Chrau, Fijian, Indonesian, Khmer, Malagasy, Maori, Mono-Alu, Muna, Palauan, Ponapean, Punu, Savu, Tagalog, Temiar, Thai, Tigak, Vietnamese, Yapese)

Appendix II European sample

Languages (N=48)

INDO-EUROPEAN: Celtic (Breton, Irish, Welsh); Romance (Standard French, Sardinian, Latin, Rumanian, Spanish); Germanic (Dutch, English, German, Gothic, Icelandic, Swedish); Slavic (Bulgarian, Polish, Russian, Slovene, Upper Sorbian); Baltic (Lithuanian); Hellenic (Greek); Armenian (Armenian); Albanian (Albanian); Indo-Iranian: Indo-Aryan (Welsh Romany); Iranian (Kirmanji, Ossetic, Tati).
URALIC: Finnic (Finnish, Udmurt, Mordvin); Ugric (Hungarian); Samoyed (Nenets).
NORTHWEST CAUCASIAN: (Abkhaz, Kabardian)
KARTVELIAN: (Georgian, Laz)
NAKH-DAGHESTANIAN: Nakh (Chechen); Daghestanian (Avar, Dargwa, Lezgian).
ALTAIC: Turkic (Crimean Tatar, Chuvash, Karachay, Nogai, Turkish); Mongolian (Kalmyk).
AFRO-ASIATIC: Semitic (Maltese)
ISOLATES: Basque

Appendix III Word order variants in the European sample

language        basic
Abkhaz          SOV
Albanian        SVO
Armenian        SOV
Avar            SOV
Basque          SOV
Breton          VSO
Bulgarian       SVO
Chechen         SOV
Crimean Tatar   SOV
Chuvash         SOV
Dargwa          SOV
Dutch           SVO
English         SVO
Finnish         SVO
French          SVO
Georgian        split
German          SVO
Gothic          SOV
Greek           SVO
Hungarian       split
Irish           VSO
Icelandic       SVO
Kabardian       SOV
Kalmyk          SOV
Karachay        SOV
Kirmanji        SOV
Latin           SOV
Laz             SOV
Lezgian         SOV
Lithuanian      SVO
Maltese         SVO
Mordvin         SVO
Nogai           SOV
Nenets          SOV
Ossetic         SOV
Polish          SVO
Romani          SVO
Rumanian        SVO
Russian         SVO
Sardinian       SVO
Slovene         SVO
Spanish         SVO
Swedish         SVO
Tati            SOV
Turkish         SOV
Udmurt          SOV
U. Sorbian      split
Welsh           VSO

[Further columns: SOV, SVO, VSO, VOS, OVS, OSV, with "+" marking the orders that occur as variants in each language; the assignment of the "+" marks to languages and columns is not recoverable.]

Anders Holmberg

Word order variation in some European SVO languages: a parametric approach

1. Introduction

The languages which are discussed and compared in this paper are all SVO, and typologically similar in a number of other respects, too. I regard them as representative of a class of languages which includes all the Indo-European languages in Europe, and most, or all, of the Finno-Ugric languages in Europe. I will refer to this class as 'the European SVO languages', although not all of these languages are traditionally classified as SVO.¹ Among these languages there is, not surprisingly, considerable variation regarding word order. In particular there is some striking variation regarding what is traditionally termed 'freedom of word order'. Some of these languages seem to permit almost any order of sentence constituents, at first sight, while other languages seem to permit very little variation of a 'basic' word order pattern, and some languages are somewhere in between.

In this paper I will discuss the notion of freedom of word order on the basis of facts from some of these languages. I will begin by making some general remarks on the relation between surface word order and structure. I will then discuss word order variation within languages, with illustrations from mainly four languages, representing different degrees of freedom of word order: English, Finnish, Russian, and Icelandic. The hypothesis that I will pursue is that all these languages have essentially the same underlying structure, the same set of categories, and the same structural positions. They differ primarily with regard to the feature composition or feature values of certain functional sentential heads, for instance C(omp) and T(ense), which determines how they can employ the various structural positions. I will formulate four major parameters which concern sentential word order, and which distinguish among these four languages.

In order to establish the role of these parameters, we need to have structural analyses of all the major sentential constructions in the languages under discussion. For this purpose I will employ a fairly simple method of determining sentential structure, based on the distribution of topic and focus in conjunction with the distribution of certain sentence adverbs. In the final two sections I will discuss briefly where a number of other languages in Europe stand with respect to the parameters proposed. The theoretical framework is that of the Principles and Parameters program ("GB theory"), including some aspects of the Minimalist program of Chomsky (1993).

2. Surface word order and structure

One lesson to be learned from the past two decades or so of detailed studies of the sentence structure of a number of languages in Europe and elsewhere, is that the surface order of the major constituents, S, O, and V is a poor predictor of the structure of the sentence. For instance, in Swedish there are two quite distinct SVO structures: in main clauses the structure of an SVO sentence Johan köpte en bok 'Johan bought a book' is roughly (1a), while the structure of a superficially identical embedded SVO sentence att Johan köpte en bok 'that Johan bought a book' is (1b); see Platzack (1986); Vikner (1995); Holmberg & Platzack (1995).

(1) a. [CP Johan_i [C' köpte_j [IP e_i [I' I_j [VP V_j en bok]]]]]
    b. [C' att [IP Johan [I' I [VP köpte en bok]]]]

In main clauses the subject is in specCP and the (finite) verb in C, while in embedded clauses the subject is in specIP and the verb in its base position in VP. One indication of this structural difference is the position of sentence adverbs: in a main clause sentence adverbs must appear to the right of the finite verb. In an embedded clause sentence adverbs must appear in the domain between the complementizer and the finite verb.

(2) a. (*möjligen) Johan (*möjligen) köpte (möjligen) en bok.
       (possibly) Johan (possibly) bought (possibly) a book
    b. att (möjligen) Johan (möjligen) köpte (*möjligen) en bok.

This follows if (a) S-adverbs can be adjoined to IP or to VP, but not to CP, and also not to any X' category, and (b) we assume the analyses in (1a, b). In that case an adverb preceding the subject in (1a) will have to be adjoined to CP, which is ruled out, while an adverb preceding the verb will have to be adjoined to C', which is also ruled out. That adverbs cannot be adjoined to CP or C' is shown by their distribution in wh-questions, where we know that the initial wh-phrase is in specCP and the finite verb in C (see Holmberg 1986).

(3) (*möjligen) vilken bok (*möjligen) köper (möjligen) Johan (möjligen)?
    (possibly) which book (possibly) buys (possibly) Johan (possibly)
    'Which book will Johan possibly buy?'

The position of the object is not fixed either. Taking Icelandic as an example, the object in a simple SVO sentence may be within VP in what appears to be the base position, or, if it is definite (or better, specific), it may appear in a position immediately preceding VP (so-called Object Shift; see Holmberg & Platzack 1995). The finite verb in Icelandic has moved out of VP, to I, in both cases.

(4) a. að [IP Jón_i [I' keypti_j [VP V_j bók_k]]]
       that Jon bought (a) book
    b. að [IP Jón_i [I' keypti_j [ bókina_k [VP V_j e_k]]]]
       that Jon bought the-book

Again, one indication that the structure is different in each case is the position of sentence adverbs, for instance the negation (which is a sentence adverb in Icelandic).

(5) a. að Jón (*ekki) keypti (ekki) bók (*ekki).
       that Jon (not) bought (not) book (not)
    b. að Jón (*ekki) keypti (ekki) bókina (ekki).

Again this follows if (a) the negation must adjoin to (the left of) VP (crucially not to , and not to the right of VP), and (b) we assume the analyses in (4a, b). (The reason why the negation can precede or follow the object in (5b) is that the definite object may remain inside VP or move to the pre-VP position.)

These examples show that in the same surface order SVO the subject may appear in two distinct positions (specCP or specIP), the object in two distinct positions (comp V or a position between I and VP), and the verb in three positions (C, I, or V). This does not exhaust the possible analyses of an SVO sentence. To begin with, the subject may appear overtly in specVP in some languages.2 For instance, there is good reason to think that the subject is VP-internal in the Finnish construction (6) (PRTV = partitive, NOM = nominative):

(6) Poikaa puri ilmeisesti koira.
    boy:PRTV bit apparently dog:NOM
    'Apparently it was a dog that bit the boy.'

Furthermore, at least in some languages the subject may appear in a position between specIP and specVP. (7) is an example of such a construction (see Holmberg & Nikanne 1994):

(7) Sitä voi hevonen luultavasti yhtäkkiä potkaista sinua päähän. (Finnish)
    there may a-horse probably suddenly kick you in-the-head
    'Probably you may suddenly get kicked in the head by a horse.'

In this construction the expletive subject sitä occupies the highest argument position within IP; according to Holmberg & Nikanne (1994) the position can be identified as specAgrSP. On the (standard) assumption that the adverbs in (7) are outside VP, the thematic subject hevonen 'horse' must also be outside VP, in an argument position between specAgrSP and VP.

An idea which I will pursue in this paper is that languages in general, or at least languages of the broad type to which all the European SVO languages belong, have a similar sentence structure, and a similar set of syntactic categories and structural positions, but differ with respect to the extent that they can employ these positions (in a sense to be specified below), which is formally related to which features are associated with these positions. In particular the "discourse semantic" features focus and topic will be shown to be crucially involved.

I make the following set of assumptions, more or less standard within recent Principles-and-Parameters theory, concerning sentence structure: The sentence consists (universally) of three domains, called the CP domain, the IP domain and the VP domain. Only binary branching is permitted, and syntactic phrases have the familiar structure [XP ZP [X' X° YP]], where ZP is the specifier, called specXP, and YP the complement of X°. At this point I leave open the internal composition of the CP, IP, and VP domains. There are extremely good reasons to think that they each consist of several heads and associated spec-positions; cf. for example Pollock (1989); Travis (1992); Holmberg et al. (1993). The composition of the IP domain will be crucial later in this paper. Another crucial assumption is that the subject is base-generated in specVP universally. If we assume a more articulated structure of VP, as in Travis (1992) and Holmberg & Platzack (1995: 20–22), the position of the base-generated subject must be defined more exactly. Here I will simply assume it is specVP.

3. Free vs. rigid word order

One simple measure of freedom of word order is the number of permutations that a language permits of S, O, and V. Compare English, Icelandic, Finnish and Russian:

(8)  English   Icelandic   Finnish   Russian
     SVO       SVO         SVO       SVO
     *SOV      *SOV        SOV       SOV
     OSV       *OSV        OSV       OSV
     *OVS      OVS         OVS       OVS
     *VSO      VSO         VSO       VSO
     *VOS      *VOS        VOS       VOS
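The permutation count used as a measure in (8) can be made explicit in a short sketch. The judgements below are transcribed from (8); the names PERMITTED, freedom, and ranking are illustrative choices and not part of the original analysis.

```python
from itertools import permutations

# Grammatical orders per language, transcribed from table (8).
# Starred (ungrammatical) orders are simply left out of each set.
PERMITTED = {
    "English": {"SVO", "OSV"},
    "Icelandic": {"SVO", "OVS", "VSO"},
    "Finnish": {"SVO", "SOV", "OSV", "OVS", "VSO", "VOS"},
    "Russian": {"SVO", "SOV", "OSV", "OVS", "VSO", "VOS"},
}

# The six logically possible orders of S, V, and O.
ALL_ORDERS = {"".join(p) for p in permutations("SVO")}


def freedom(language):
    """Return how many of the six S/V/O orders the language permits."""
    return len(PERMITTED[language] & ALL_ORDERS)


def ranking():
    """Return the languages ordered from most rigid to freest."""
    return sorted(PERMITTED, key=freedom)
```

The count is only a surface measure: it does not distinguish between the different structures that may underlie one and the same string.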

By this measure Finnish and Russian have very free word order: all the logically possible permutations of S, O, and V are in fact grammatical. English and Icelandic, on the other hand, have quite rigid word order, English being more rigid than Icelandic. I will now present a theory which will "generate all and only the grammatical permutations of S, O, and V" in these languages, to use a famous phrase slightly modified.

As noted in the previous section, the surface order of the constituents does not tell us very much about the structure. What we have to do, as a first step, is to determine for each string of S, O, and V which is its structural analysis. For this purpose I will use principally two criteria: (a) the topic-focus structure of the sentence, and (b) the distribution of adverbs. Obviously the permitted sequences of S, O, and V in a language are not in free variation, but on the contrary, each variant is associated with a particular interpretation, in terms of notions like topic, focus, contrast, specificity, etc. Given a set of general assumptions regarding the distribution of topic, focus, etc., these interpretations give us a clue as to the structure of these various constructions. For my present purposes the following assumptions are largely sufficient:

(9) a. Focus is inside VP, or in a designated Focus position;
    b. Nonfocus is outside VP.

The term Focus is used here to refer to 'nonpresupposed information', whether 'new' or 'contrasted'. That is to say, it includes 'informational focus' (in the sense of Vallduví 1992) as well as contrastive focus and contrastive topic. Everything which is not Focus is Nonfocus, i.e. 'presupposed information'. For example, in (10a) I is Nonfocus and John is Focus. In (10b) I is Nonfocus in both sentences, John is Focus in the first sentence, Peter in the second.

(10) a. (Who did you see?) I saw John.
     b. (discussing John and Peter) John, I like. It's Peter I can't stand.

It is well known that there is a strong tendency across languages for old information to precede new information. It is also common for contrasted categories to be fronted, and it is also not uncommon for focused (nonpresupposed) categories to be fronted to a special position, the most obvious and well-known example of focus-fronting being that of wh-movement in questions. (9a, b) constitutes a hypothesis concerning the syntactic correlate of these generalizations. It is probably too general, and does not appropriately distinguish different types of foci, but it is, I believe, "correct enough" to serve as a basis for the theory of sentence structure to be expounded here.

Now, given a sentence with a given Focus structure we can, if not assign an analysis to the sentence, at least exclude certain logically possible analyses. For instance, given a string VOS (of an SVO language), where O is Nonfocus and S is Focus, and given the hypothesis (by now a standard hypothesis) that the subject is base-generated in specVP, we can assign it the structure [V_i O_j [VP S e_j e_i]].

Another indication of sentence structure, discussed in the previous section already, is the position of adverbs. Again we need some basic assumptions. For present purposes the following assumption will be sufficient.

(11) Epistemic adverbs are adjoined to VP or IP.

There are various restrictions on the adjunction of epistemic adverbs (possibly, surely, maybe, etc.) to VP or IP, with variation across languages, but these will not concern us for the moment. What is crucial is that (a) epistemic adverbs are not found inside VP, and (b) they are not adjoined to CP. There are other adverbs which are found inside VP (e.g. manner adverbs), and still other adverbs which are adjoined to CP (especially speech-act oriented adverbs such as

frankly, honestly, etc.). But epistemic adverbs appear to be confined to the 'IP domain', broadly speaking, at least in the languages studied here. Take a fronted wh-phrase to be a canonical occupier of specCP in Finnish and Russian, just as in Icelandic or English. In none of these languages can an epistemic adverb precede a fronted wh-phrase; see (3) for a Swedish example.

(12) a. (*Most likely) who (*most likely) will (most likely) win the match?
     b. Icelandic
        (*Sennilega) hver (*sennilega) mun (sennilega) ... vinna leikin
        (probably) who (probably) will (probably) win the-match
     c. Finnish
        (*Luultavasti) kuka (luultavasti) voittaa ottelun?
        (probably) who (probably) wins the-match
     d. Russian
        (*Skoree vsego) kto (skoree vsego) vyigraet matč?
        (most likely) who (most likely) wins the-match

This is one piece of evidence that epistemic adverbs cannot adjoin to CP.3, 4

The strategy is to first compare Finnish, English, and Icelandic, and formulate the parameters which distinguish among these three languages: it will be shown that two parameters are sufficient to handle most of the facts. Subsequently I will discuss a construction in Finnish which is problematic within the theoretical framework assumed. An additional parameter is formulated, which accounts for the properties of this construction. I will then turn to Russian, showing that Russian has, in a sense, even more freedom of word order than Finnish. I will suggest a formulation of the parameter distinguishing between Finnish and Russian.

4. Finnish, English, and Icelandic

4.1. Finnish

Consider first Finnish. (13) shows the distribution of the adverb ilmeisesti 'evidently' in each of the constructions consisting of just S, O, and V. I will disregard the possibility of placing the adverb in final position. This is always possible with a comma break before the adverb. It is, however, difficult in some cases to determine whether a comma break is required or not. NOM = nominative, PRTV = partitive.

(13) a. (Ilmeisesti) koira (ilmeisesti) puri (ilmeisesti) poikaa.    (SVO)
        evidently dog+NOM bit boy+PRTV
        'Evidently the dog bit the boy.'
     b. (*Ilmeisesti) koira (ilmeisesti) poikaa (ilmeisesti) puri.   (SOV)
     c. (*Ilmeisesti) poikaa (ilmeisesti) koira (ilmeisesti) puri.   (OSV)
     d. (Ilmeisesti) poikaa (ilmeisesti) puri (ilmeisesti) koira.    (OVS)
     e. (*Ilmeisesti) puri (ilmeisesti) koira (ilmeisesti) poikaa.   (VSO)
     f. (*Ilmeisesti) puri (ilmeisesti) poikaa (ilmeisesti) koira.   (VOS)

The interpretation of the NPs as definite or indefinite (specific or nonspecific) depends on their structural position, interacting with (9). I will comment on the interpretation of the NPs when relevant. The distributional facts in (13) follow given the assumptions in (9) and (11) if in each case where an initial adverb is excluded, the first non-adverbal constituent is in the CP-domain. Thus only in (13a) and (13d) is the first non-adverbal constituent in the IP domain. A constituent which can be followed by ilmeisesti (without a clear comma break) is outside VP. This yields the following analyses. I have indicated the verb trace by 'V'; for the sake of simplicity I have omitted the trace of the subject in VP.

(14) a. [IP koira puri [VP V poikaa]]5                  SVO
     b. [CP koira_i [IP e_i poikaa_j [VP puri e_j]]]     SOV
     c. [CP poikaa_j [IP koira [VP puri e_j]]]           OSV
     d. [IP poikaa_j puri [VP koira V e_j]]              OVS
     e. [CP puri [IP koira [VP V poikaa]]]               VSO
     f. [CP puri [IP poikaa_i [VP koira V e_i]]]         VOS
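The diagnostic that leads from the judgements in (13) to the analyses in (14) can be sketched as a simple lookup. The sketch assumes, with (11), that an epistemic adverb cannot adjoin to CP, so that an excluded initial adverb places the first constituent in the CP-domain. The names INITIAL_ADVERB_OK and domain_of_first_constituent are illustrative only.

```python
# Judgements from (13): may ilmeisesti precede the first constituent?
INITIAL_ADVERB_OK = {
    "SVO": True,   # (13a)
    "SOV": False,  # (13b)
    "OSV": False,  # (13c)
    "OVS": True,   # (13d)
    "VSO": False,  # (13e)
    "VOS": False,  # (13f)
}


def domain_of_first_constituent(order):
    """Infer the domain of the first constituent from the adverb test.

    Assumption (11): epistemic adverbs adjoin to VP or IP, never to CP.
    An initial adverb is therefore possible only if the first
    constituent is in the IP domain; otherwise it is in the CP-domain.
    """
    return "IP" if INITIAL_ADVERB_OK[order] else "CP"


# Mapping from each order to the inferred domain of its first constituent.
ANALYSES = {order: domain_of_first_constituent(order) for order in INITIAL_ADVERB_OK}
```

The resulting mapping agrees with the outermost labels in (14): IP for SVO and OVS, CP for the other four orders.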

The analyses predict correctly the interpretation of the constructions: In (14b, c) the initial NP is Focus, the readings being "It was a/the dog that bit the boy" and "It was a/the boy that the dog bit", respectively. In Finnish the CP-domain is a Focus domain. More precisely, it is the landing site of wh-phrases and constituents focused in yes-no questions. In non-questions, a phrasal category fronted to the CP-domain is generally contrastive, the construction being usually translatable as a cleft construction; see Vilkuna (1989, 1995). Heads may also be fronted to the CP-domain, in which case they, too, are focused. For

instance, the only possible interpretation of (14e) is roughly "Sure the dog bit the/a boy" or "The dog did bite the/a boy". Furthermore, the object in (14b) is necessarily Nonfocus, as predicted since (a) it is outside VP and (b) there is no Focus position in the IP domain in Finnish (unlike the situation in certain other languages; see below), while the subject in (14d) must be Focus, the interpretation being roughly "It was a dog that bit the boy", or "The boy was bitten by A DOG".6

The subject-final constructions (13d, f) could theoretically be derived by rightwards movement of the subject, for instance right-adjoining it to VP. Following Kayne (1994), I will assume, as a working hypothesis, that there is no rightwards movement. At least for the constructions discussed in this paper, there is no compelling reason to assume rightwards movement. Given that essentially all categories can move leftwards, and apparently do in many well-studied constructions, and given that the subject is base-generated in VP, the null hypothesis would seem to be that the subject-final constructions shown above are derived by moving the object and the verb, leaving the subject in situ.7

4.2. English

Now compare English, the canonical rigid language. Only two permutations of S, O, and V are possible. Inserting an epistemic adverb in these two constructions yields the following result:8

(15) a. (Apparently) the/a dog (apparently) bit (*apparently) the/a boy.
     b. (*Apparently) the boy (?apparently) the dog (apparently) bit.

If we ignore the marginal status of the medial adverb in (15b), the analyses are as follows, given the assumptions made above:

(16) a. [IP the dog [VP bit the boy]]
     b. [CP the boy_i [IP the dog [VP bit e_i]]]

The fact that an adverb definitely cannot precede the fronted object in (15b) implies that the object is not adjoined to IP: English is quite free as regards stacking adjuncts at the beginning of the sentence (Yesterday, apparently, without a sign of warning, the dog bit a boy), but the class of adverbs which can precede a fronted object is very small, and possibly coextensive with the class of adverbs which can precede a fronted wh-phrase, including speech-act oriented adverbs such as frankly, to be honest, etc. If it is not adjoined to IP, the fronted object must occupy a spec-position in the CP-domain, although the head of this spec-position is never visible: see Müller & Sternefeld (1993) for other arguments that "topicalized" arguments in English are not adjoined to IP, but in specCP.

The marginal status of the adverb placed between the fronted object and the subject is unexpected (irrespective of whether the object is adjoined to IP or substituted into specCP): an epistemic adverb may be adjoined to IP in (15a), as it looks, so why not in (15b)? I have no answer to this question.

4.3. Icelandic: articulating the IP domain

Now compare Icelandic; the permitted strings are SVO, OVS, VSO. VSO is normally a question structure, but may be a declarative (in narratives; see Sigurðsson 1991). In (17c) I assume it is a question; for pragmatic reasons I use a different adverb there.

(17) a. (*Greinilega) hundurinn (?greinilega) beit (greinilega) strákin.
        (evidently) the-dog:NOM (evidently) bit (evidently) the-boy:ACC
     b. (*Greinilega) strákin (*greinilega) beit (?greinilega) hundurinn.
     c. (*Raunverulega) beit (?raunverulega) hundurinn (raunverulega) strákin.
        (actually) bit (actually) the-dog+NOM (actually) the-boy:ACC
        'Did the dog actually bite the boy?'

Consider first (17b). The distribution of adverbs is compatible with the following analysis:

(18) [CP strákin_i beit_j [IP hundurinn [VP e_j e_i]]]

This is, indeed, the standard analysis of such sentences, in Icelandic and other Germanic V2 languages: the object is moved to specCP and the finite verb to C. Consequently no adverb can precede the object, or intervene between the object and the verb; the latter would entail adjunction to X', which is not permitted in the model assumed here. The analysis predicts that an adverb may precede the subject, adjoined to IP, or follow the subject, adjoined to VP. The latter is clearly the unmarked option in Icelandic. The distribution of adverbs in (17c) indicates the following analysis:

(19) [CP Q beit_i [IP hundurinn [VP e_i strákin]]]

The order Adverb-V-S-O is generally permitted in declarative sentences (with most types of adverbs), but that is because Icelandic, like all the other Germanic V2 languages, permits adverbs to occupy specCP, with the verb in C. In questions this is not possible, which I assume is because an abstract question operator occupies specCP. Again an adverb may precede or (preferably) follow the subject.

Now consider (17a). The analysis of subject-initial main clauses in Icelandic is a controversial issue (see for instance Rögnvaldsson & Thráinsson 1990; Vikner and Schwartz, to appear). The controversy concerns whether the initial subject is in specCP or in specIP. On the basis of the assumptions made so far, the distribution of adverbs suggests the structure (20).

(20) [CP hundurinn_i [IP e_i beit [VP V strákin]]]

This analysis predicts that no adverb can precede the subject, since it would then have to adjoin to CP, and it predicts that an adverb may precede the verb by adjoining to IP, and follow the verb, by adjoining to VP (as noted in section 1, the finite verb always moves to I in Icelandic). In fact, nobody has ever proposed an analysis like (20). Those who claim that the initial subject is in specCP (e.g. Vikner & Schwartz 1995; Holmberg & Platzack 1995) assume that the finite verb is in C in all main clauses, including subject-first ones. That analysis is, however, incompatible with the possibility (marginal though it is) of having an adverb placed between the subject and the finite verb, as in (17a).

Focus structure does not, unfortunately, give any clear indications of the position of the subject in (17a). A fronted object must be contrastive, to some degree. Thus a comparatively natural reading of (17b) is "The boy the dog bit; it didn't bite the girl". An initial subject need not be contrastive, but it is not clear that anything can be concluded from this, since the specCP position is not a "designated contrast position" in Icelandic (or Germanic V2 languages generally) the way it is in, for instance, Finnish. As mentioned, various kinds of adverbials including most epistemic adverbs can occupy specCP without any contrast (epistemic adverbs cannot be contrasted), being always immediately followed by the finite verb, standardly assumed to be in C.

Consider, however, the following observation: if the adverb in (17b, c) precedes the subject, the subject must be focused. If the subject precedes the adverb, it may but need not be focused. So, for instance, in (17c) the subject cannot be a weakly stressed pronoun if it is preceded by an adverb:

(21) Beit raunverulega hundurinn/*hann strákin?
     bit actually the-dog/it the-boy

Why would adjunction of an adverb to IP have this effect? Assume, however, that adverbs cannot be adjoined to IP in Icelandic. In that case the subject in (21) is either in VP, which would explain why it must be focused, or in a focus position outside VP but lower than the traditional specIP, the adverb being adjoined to VP or to some projection between IP and VP. But if adverbs cannot adjoin to IP in Icelandic, then the exclusion of the initial adverb in (17a) is compatible with an analysis where the initial subject is within IP.

I will assume the following analysis. Following Pollock (1989) and most subsequent work within the Principles-and-Parameters (P & P) program, I assume an articulated IP structure, where I is split into at least two distinct heads, each projecting a phrase and each licensing a spec-position. In the references mentioned the two highest sentential heads in the IP-domain are identified as AgrS (subject agreement) and Tense.9 For reasons which will become clear below, I will use the less specific label 'ExtTP', short for 'Extension of TP', instead of AgrSP, and consequently 'ExtT' instead of 'AgrS'.

(22) [tree diagram of the articulated IP structure]

The finite verb normally moves up to ExtT, and the subject to specExtTP via specTP. An object, or other phrasal category may move to specCP, in which case the verb obligatorily moves to C (hence the V2 order). Under certain conditions the subject may, however, remain in specTP, the lower subject position. The crucial condition is that the subject should be focused.10 Sentence adverbs are normally adjoined to VP, but can, somewhat marginally, be adjoined to TP. They cannot ever adjoin to CP or ExtTP. This will predict all the word order facts in (17). In particular, the subject in (17 b, c) may be preceded by an adverb, but in that case the subject must be focused.11

5. Explaining the differences between Finnish, English, and Icelandic

Why is movement of constituents freer in some languages than in others? Why is movement restricted in any language? There is one hypothesis which we may dispose of immediately, and that is that richness of case morphology can explain why Finnish and Russian have freer word order than English and Icelandic. True, English has almost no case morphology, while both Finnish and Russian have rich case morphology. But Icelandic, too, has a system of morphological cases which is fairly rich, by usual standards (distinguishing four cases: nominative, accusative, dative, and genitive). In this respect Icelandic contrasts sharply not only with English but also with the closely related Mainland Scandinavian languages, which, just like English, have only pronominal case. Yet, at least in terms of SVO permutations, word order in Icelandic is not freer than in Mainland Scandinavian. Clearly rich case morphology is not sufficient to permit "free word order": see also Bakker (this volume: 6f.) and Siewierska (this volume: 24 ff.). I will, however, return to the question of case morphology below, and suggest a more restricted role for it.

One of the central principles restricting word order postulated in P & P theory is Relativized Minimality (Rizzi 1990). The principle may be formulated as follows:12

(23) In a configuration A...B...C no grammatical relation R can involve A and C without also involving B if B is a potential member of R, and B c-commands C but not A.

This principle rules out, for instance, the English construction (24a), while permitting (24b):

(24) a.  *Will this book John buy?
     a'. Will this book_j [VP John [V' buy e_j]]
     b.  Will this book be bought?
     b'. Will this book_j be [VP e_j [V' bought e_j]]

(24a) is ruled out by Relativized Minimality (RM): the object has moved across the subject in specVP, i.e. the chain headed by the object does not include specVP, a potential member of an A-chain. In (24b) specVP is included in the A-chain, hence RM is respected. RM will not rule out movement of the object across the subject in the Focus construction (25a).

(25) a. This book John will not buy.
     b. [CP This book_i [IP John_j will not [VP e_j [V' buy e_i]]]]

This is because the CP-domain is an A-bar domain (a nonargument domain). Hence the chain (this book_i, e_i) in (25b) is an A-bar chain (an A-bar relation). Neither specVP nor specIP is a potential member of an A-bar chain, hence (25) respects RM. One of the properties of English restricting movement is (26) (see Pollock 1989).

(26) English has no verb movement of main verbs.13

This excludes VSO and VOS as possible permutations of S, O, and V. It is also sufficient to exclude OVS. It does not, on its own, exclude SOV, though, as in (27), for instance:

(27) [IP the dog_j the boy_i [VP e_j [V' bit e_i]]]

Following Chomsky (1993), I assume the problem here, too, just as in (24a), is that movement of the object violates RM, by crossing the VP-internal subject position. That is to say, absence of verb movement, a property which English shares with, probably, many other languages (although not many in Europe; see section 6 below), together with the universal principle RM suffices to predict the range of constructions allowed in English by permuting S, O, and V.

RM is a powerful constraint on grammatical relations. Too powerful, in fact, to permit all of the relations in (14), the permutations of S, O, and V in Finnish. Finnish has verb movement, clearly, all the way to C (see Holmberg et al. 1993). Hence it permits VSO. As shown in (14) Finnish permits OSV only with O in the CP-domain, which is allowed by RM even in its present form. But it also permits OVS and VOS, with O in the IP domain. This is not permitted by RM as it stands. Furthermore Finnish permits SOV with the object in the IP domain, also not allowed by RM as it stands. This means that we have to relax RM somewhat, to allow for the freedom of word order found in Finnish, yet not so much that we will allow more freedom of word order in English (and Swedish and other more rigid languages) than what is actually attested.

One case of object movement from the VP to the IP domain which has been studied quite intensively is Scandinavian Object Shift. As shown in the previous section, the Scandinavian languages do permit object movement from VP to the IP domain, so-called Object Shift, but crucially only when the main verb itself has been moved from VP to I, giving the order SVO, but now with V as well as O preceding all VP-adjuncts, including epistemic adverbs. (This generalization has been called "Holmberg's generalization" in some recent literature.)

(28) Icelandic
     a. Hundurinn beit strákin ekki.
        the-dog bit the-boy not
     b. *Hundurinn hefur strákin ekki bitið.
        the-dog has the-boy not bit

Adopting what I take to be the essence of Chomsky's explanation (cf. Chomsky 1993), I assume the reason why V-movement has this effect is that it extends the V-domain. Now if RM applies not to movement within a Domain (that is, a syntactic domain of a certain kind), but only to movement across Domain boundaries, then it follows that V-movement to I will make object movement across specVP possible. Following Mahajan (1990), Chomsky (1993), and Bobaljik & Jonas (1996), among others, I assume that the shifted object lands in the spec-position of an abstract head which in the references mentioned is identified as AgrO, object agreement. For reasons to be made clear below I will use the categorially less specific label 'ExtV', projecting ExtVP, short for 'Extension of VP'. For reasons of presentation I will not represent T and ExtT separately below, but collapse them into I. In the Object Shift construction, the verb moves first to ExtV, and then on to I, extending the V-domain all the way to IP. The structure resulting from Object Shift will thus be (29):

(29) [IP hundurinn_i [I beit_j] [ExtVP strákin_k [ExtV' ExtV_j [VP ekki [VP e_i [V' V_j e_k]]]]]]

We allow for this possibility by adding an exception clause to the formulation of RM:14

(30) In a configuration A...B...C no grammatical relation R can involve A and C without also involving B if B is a potential member of R, and B c-commands C but not A, except if A, B, and C are in the same Domain.
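The logic of (30) can be sketched as a small predicate. This is a deliberately simplified model, not a full implementation of the principle: c-command is not computed, the intervening positions are simply listed by type, and membership in one Domain is supplied as a flag. The names Movement and violates_rm, and the way the examples are encoded, are illustrative assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Movement:
    """A simplified record of one movement step (illustrative model)."""
    chain_type: str    # "A" or "A-bar": the kind of relation R
    crossed: tuple     # types of the skipped spec-positions (the Bs)
    same_domain: bool  # True if A, B, and C are all in one Domain


def violates_rm(movement):
    """Return True if the movement violates RM as formulated in (30)."""
    # Exception clause of (30): no violation inside a single Domain.
    if movement.same_domain:
        return False
    # A skipped position is a potential member of R only if it is of
    # the same type as the chain, e.g. specVP for an A-chain.
    return any(pos == movement.chain_type for pos in movement.crossed)


# (24a): A-movement of the object across specVP, main verb in situ.
ENGLISH_24A = Movement("A", ("A",), False)
# (25): A-bar movement to specCP across specVP and specIP.
ENGLISH_25 = Movement("A-bar", ("A", "A"), False)
# (28a): Object Shift with the verb moved, extending the V-domain.
OBJECT_SHIFT_28A = Movement("A", ("A",), True)
# (28b): the main verb stays in VP, so the V-domain is not extended.
NO_VERB_MOVEMENT_28B = Movement("A", ("A",), False)
```

On this encoding the predicate flags (24a) and (28b) as violations and lets (25) and (28a) through, which matches the judgements reported in the text.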

One type of Domain is defined by V. More precisely, the Domain of V is the specifier and complement of each link in the chain created by V-movement. I leave it open at this point what else can define a Domain except V; I will return to this issue in a separate section. Note that movement of V to I (i.e. T and ExtT) in (29) extends the V-domain to include all of IP, permitting also subject movement across the shifted object without violating RM.

As for what triggers verb movement, the standard view is that it is "triggered by verb morphology": until Chomsky (1993), tense, agreement, and other verb morphology was generally taken to be base-generated in functional head positions outside VP, verb movement being a way of joining the verb with the inflections. Chomsky (1993) proposed that verbs (and other lexical heads) are inserted with inflections already attached, with verb movement out of VP being triggered by the need to check the morphological features on the lexical head against the features of the functional sentential heads T, AgrS, AgrO, etc. generated outside VP. I adopt, provisionally, the latter hypothesis. That is to say, the verb moves to T and ExtT to check its morphological features against the features of the sentential heads T, ExtT, and C (on movement to C, see Holmberg & Platzack 1995). However, movement of V has the effect, accidentally as it were, of permitting object movement out of VP, as well as subject movement across a shifted object. Since English does not have verb movement, we predict that English will not allow Object Shift.

(31) *The dog bit the boy/him never.

One problem which this theory now faces is why (32) is not well formed in, for instance, Icelandic, with the analysis shown (i.e. with the object in the I/VP domain):

(32) *Strákin beit hundur/hundurinn.
     the-boy(A) bit a-dog(N)/the-dog(N)
     [IP strákin_i [I' beit_j ... [VP hundur(inn) V_j e_i]]]

Note that since Icelandic has morphological case on nouns (distinguishing nominative, accusative, dative, and genitive), the subject-object relations are in a sense recoverable in (32). Yet the construction is totally ungrammatical under the analysis shown, i.e. with the focus structure and adverb distribution associated with this analysis; as shown it does not matter whether the subject is indefinite or definite. It seems that the ungrammaticality of (32) must be ascribed to some independent condition which prohibits any other argument than the subject argument from moving to specIP. Note, however, that this word order is well formed in Finnish, as was shown in (14). The relevant example is repeated as (33a), the structure of which, we may now assume, is (33b):

(33) a. Poikaa puri koira.
     b. [IP poikaa_i [I' [I puri_j] [ExtVP e_i [ExtV' ExtV_j [VP koira [V' V_j e_i]]]]]]

Note, furthermore, that (34a) is not well formed, except under the analysis (34b), where the initial object is in specCP, heading an A-bar chain, as discussed above (the position of the verb cannot be determined without additional data; I assume here that it remains in VP, but this is not crucial). It is thus not well formed under the analysis (34c):

(34) a. Poikaa koira puri.
     b. [CP poikaa_i [IP koira_j [I' I ... [VP e_j puri e_i]]]]
     c. [IP poikaa_i [I' I ... [VP koira puri e_i]]]

This is what we predict: (33 b) does not violate RM, since verb movement has extended the V-domain, permitting object movement across specVP. (34 b) does not violate RM, since the object is moved out of VP by A-bar movement, directly to specCP, and consequently may skip specVP. (34 c) does violate RM, since the movement is A-movement, specVP is not part of the movement chain, and the verb has not extended its domain. That is to say, we find the same pattern in Finnish as in Icelandic: the object can move, by A-movement, across the subject only if the verb moves out of VP, at least to ExtV. This indicates that the interplay of verb movement and object movement is not just a quirk of the Scandinavian languages. Of course, if it were the case that V-movement was simply obligatory in Finnish, then (34 c) would be ruled out independently of RM, with no consequences at all for Holmberg's generalization.15 But as will be discussed below, verb movement is not obligatory in Finnish: in certain constructions the verb,
finite or nonfinite, remains inside VP, which is to say that the necessary feature checking can be accomplished without verb movement (at least without overt verb movement: cf. Chomsky 1993; Pollock 1994).16 Thus (34 c) is, apparently, not ruled out simply because of containing unchecked verbal inflectional features. I therefore maintain that it is ruled out because it violates RM. We can now specify exactly the point in the derivation where Finnish and Icelandic differ: both allow V-movement out of VP, first to ExtV. This makes possible object movement to specExtVP. This is shown schematically in (35).

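(35) [Tree diagram: IP dominates I', which dominates ExtVP; the shifted object NP_i is in specExtVP, and the verb is in ExtV.]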

At this point, Finnish but not Icelandic permits object movement higher up the structure, to specIP. In Icelandic the subject must move across the object to specIP. In order not to violate RM, this presupposes movement of V on to I, extending the VP-domain to IP. The result is the Object Shift construction. In Finnish in principle any category which can serve as topic can move to specIP. It may be the subject, or an object, or even an adverbial. If the category moved to specIP is a nonsubject, the subject normally remains in VP, being focused, as for instance in (36), exemplifying movement of a locative phrase to specIP. That the locative phrase has moved to specIP can be verified in various ways. If epistemic adverbs can only be adjoined to IP or VP but not to CP (as discussed in section 2), then (36 a) shows that the locative phrase is not in specCP. If the auxiliary is in I (the fact that it precedes the adverb aina 'always' shows that it is outside VP) then the locative phrase is in specIP. The fact that the locative phrase may be preceded by the negation moved to C and affixed with the question-particle, as in (36 b) is additional evidence to the same effect.

Finally (36 c, d, e) show that there is not enough space for both the locative phrase and the nominal argument lapsia 'children' in the IP-domain: either the locative phrase or the nominal argument moves to the IP-domain, but not both. This is strong evidence that they occupy the same position, namely specIP.

(36)

a. Ilmeisesti kadulla on aina leikkinyt lapsia.
   apparently street:ADESS have always played children:PRTV
   'Apparently there have always been children playing in the street.'
b. Eikö kadulla ole aina leikkinyt lapsia?
   not+Q street:ADESS have always played children
   'Haven't there always been children playing in the street?'
c. Ilmeisesti lapset ovat aina leikkineet kadulla.
   apparently children:NOM have always played street:ADESS
   'Apparently children have always been playing in the street.'
d. *Ilmeisesti lapset kadulla ovat aina leikkineet.
   apparently children street:ADESS have always played
e. *Ilmeisesti kadulla lapset ovat aina leikkineet.
   apparently street:ADESS children have always played

Thus Finnish is not 'subject prominent', the way for instance English and the Scandinavian languages are: in these languages only the subject, in the sense of the most prominent argument in VP, can move to specIP. Thus, if the verb is transitive, only the Agent can (and generally must) move to specIP. If the verb has no Agent but has an Experiencer, the Experiencer must move (I will discuss such cases below). If there is only a Theme, the Theme can move. In Finnish, as in many other languages, sentence functions other than the subject can move to specIP, as a marked alternative.17 Why does an argument need to move to specIP at all? In standard GB theory this condition is referred to as the Extended Projection Principle (EPP). I assume, following Kiss (this volume) and Holmberg (1993 a), that the EPP is a consequence of conditions on predication: an argument must move out of VP, or more precisely ExtVP, to specIP in order to create a syntactic configuration where an argument in specIP binds an open A-position inside VP. This is what predication consists of, in this view. A predicate is a maximal projection of a lexical thematic head (V, A, N, or P) which contains an open (empty) argument position. Predication occurs when this argument position is bound by an argument (a referential expression) in an A-position outside the predicate. Usually, the way this configuration is created is by movement of an argument out of
VP, to specIP. Note that moving an argument to specExtVP is not sufficient for predication: an argument has to move out of ExtVP, to specIP, for predication to occur. This entails that the ExtV-projection is part of the predicate.18 I will call the binding argument the predication-subject. In English and Scandinavian only the highest argument (in terms of a thematic hierarchy Agent < Experiencer < Theme < Adverbial) 19 in VP can be predication-subject. In Finnish, in principle any argument can be predication subject. This is an important parameter distinguishing among languages, affecting sentential word order in particular. See Kiss (this volume) for discussion; see also note 17. How should the parameter be formulated? An interesting hypothesis, widely assumed within the P & P program, is that parameters in syntax are all 'morphological' in the sense that they concern features of functional heads such as T, Agr, C, D(eterminer), etc. (cf. Borer 1984; Ouhalla 1991; Chomsky 1993). As mentioned, there are very good reasons (at least in the case of some languages) to assume that I consists of two heads, each projecting a spec-position, traditionally AgrS and T, here called ExtT and T. If so, what is characteristic of a subject-prominent language, for instance Icelandic, is that the object cannot move from specExtVP to specTP (and then on to specExtTP). In other words, T has some feature which is incompatible with any other argument than the highest argument in VP. I assume that the highest argument in VP is assigned the case feature [ + nom] (which is usually, but not always, phonetically realized as nominative case in those languages which have visible case morphology; see section 6, below). The subject-topic prominence parameter can then be formulated as in (37): (37)

±(T is [+nom])
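The effect of the two settings of (37) on the choice of predication-subject can be made concrete with a minimal sketch. It assumes, as stated above, that a positive setting restricts the predication-subject to the highest argument on the thematic hierarchy, while a negative setting in principle admits any argument. The function name and the representation of VP-internal arguments as a set of role labels are hypothetical, introduced only for illustration.

```python
# Thematic hierarchy as given in the text, highest role first.
HIERARCHY = ["Agent", "Experiencer", "Theme", "Adverbial"]


def predication_subject_candidates(roles, t_is_nom):
    """Return the roles that may serve as predication-subject.

    roles    -- thematic roles present in VP (hypothetical input format)
    t_is_nom -- True for the positive setting of (37), i.e. T is [+nom]
    """
    # Keep only the roles actually present, ordered by the hierarchy.
    present = [role for role in HIERARCHY if role in roles]
    if not present:
        return []
    if t_is_nom:
        # Subject-prominent setting: only the highest argument qualifies.
        return present[:1]
    # Topic-prominent setting: in principle any argument qualifies.
    return present


# Transitive verb under both settings (illustrative input only).
print(predication_subject_candidates({"Agent", "Theme"}, t_is_nom=True))
print(predication_subject_candidates({"Agent", "Theme"}, t_is_nom=False))
```

The sketch deliberately ignores focus structure and verb movement; it only encodes which arguments are eligible, not which word orders result.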

As for the category ExtT, if in some languages any argument can be predication-subject, and if the predication-subject eventually moves to the higher spec-position, it is probably not a good idea to identify the higher head as AgrS universally. It must be the case in topic-prominent languages that subject agreement can be checked without movement of the subject to specExtTP. For instance in (38) the verb agrees with the postverbal subject, by hypothesis in VP, not with the preverbal object argument which functions as predication-subject.

(38)

Poikaa purivat koirat.
boy:PRTV bit:3PL dog:NOM PL
'It was the dogs that bit the boy./The boy was bitten by the dogs.'

That is to say, the nominative subject need not move out of the predicate to have its features checked against those of the verb or auxiliary.20 I will later
suggest that in some languages 'specAgrOP' is not reserved just for objects but may host for instance the nominative subject. This is the main reason why I use the more category-neutral labels ExtT and ExtV instead of AgrS and AgrO, allowing for variation among languages as to the precise feature composition of these heads. Now we can specify even more exactly how Icelandic and Finnish differ, with respect to the facts discussed so far.

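(39) [Tree diagram. Recoverable node labels, from top to bottom: ExtTP, ExtT', ExtT, T [+nom], ExtVP with NP_i in its specifier, ExtV, VP with NP [+nom] in its specifier, V, and the object trace e_i.]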

In Icelandic, movement of the object NP_i from specExtVP to specTP is impossible due to feature conflict. Only the [+nom]-marked thematic subject can move to specTP, to create a configuration of predication. But movement of the subject across the object presupposes verb movement to T, so as not to violate RM; hence the result is always SVO order. In Finnish, the object can move on from specExtVP to specTP. (Alternatively, there is no ExtV in the relevant construction, so that the object can move directly from VP to specTP.) For a language like Finnish, then, which has verb movement and where T is not marked [+nom], we predict the possibility of SVO, VSO, OVS, and VOS, with the analyses shown in (14). VOS results from applying V-movement to C (verb focusing) to an OVS structure. SOV is still a problem. The problem is allowing SOV in Finnish while excluding it in Icelandic. As discussed, one reason why SOV is ruled out in Icelandic is that it violates RM, so the question is, why is SOV not ruled out in the same way in Finnish? This problem will be dealt with in a separate section. Summarizing, apart from the remaining problem of accounting for SOV in Finnish, I have accounted for the differences between Finnish, English, and Icelandic as regards word order on the sentence level principally by the two parameters in (40):

(40)

1. ±V-movement
2. ±(T is [+nom])

A third 'minor' parameter concerns the place of adjunction of adverbs: in Finnish and English an adverb can be adjoined to ExtTP (see (13 a) and (15 a)). Arguably this is not possible in Icelandic, as discussed in section 3.3 (see note 21).

6. On the role of case morphology

Is it possible to derive the variation expressed above as settings of the parameter (40.2) from some other more basic parameter? A not implausible candidate would be case morphology. Conceivably, in a language where all NPs have overt case-marking, as in Finnish and Russian, a nominative-marked NP is licit even if it stays in VP. In a language without overt case-marking, on the other hand, an NP must move to a special position, namely specTP, to be assigned nominative (or have its abstract nominative feature checked). If so, the value for (40.2) depends on the presence or absence of morphological case; a version of the old generalization according to which rich case morphology makes possible more freedom of word order. Consideration of Icelandic and Faroese shows that this is not quite correct. As mentioned, Icelandic and Faroese have rich enough case morphology (Icelandic slightly more than Faroese), yet they are not topic-prominent languages, that is to say, (40.2) takes a positive value in Icelandic and Faroese just as well as in English and the caseless Mainland Scandinavian languages. Clearly, while case morphology may be a necessary condition for topic prominence, it is not a sufficient condition. This is confirmed by Siewierska (this volume): one of the generalizations emerging from the large-scale survey reported there is that although flexible word order languages more often than not have case morphology, there are not a few languages which have case morphology yet have rigid word order. Of the 171 languages investigated by Siewierska, 92 have case morphology (on nouns, not only pronouns). Of these languages 9 have
what she calls rigid word order (defined as permitting no permutation of the basic order of S, O, and V), while 33 have what she calls restricted word order (permitting only one permutation). There is a construction in Icelandic (and, with slightly different properties, in Faroese) which, on the face of it, looks like a counterexample:

(41)

a. Jóni líkar bjórinn.
   Jon:DAT likes the-beer:NOM
b. [IP Jóni_i [I' [I líkar_j] [VP e_i [V' e_j bjórinn]]]]

It has been established in a number of studies (see for instance Zaenen et al. (1985) and Sigurðsson (1989)) that the dative experiencer in (41) is a grammatical subject, and clearly it is the predication-subject in present terms. Thus it is potentially a problem for the claim that Icelandic is strictly subject-prominent, i.e. takes the positive value for the parameter (40.2). However, a closer look reveals that (41) is not an example of non-subject-prominence, although it shows that the formulation of the parameter (40.2) is not optimal. It can easily be shown that the construction (41) is very different from, say, the Finnish OVS construction (13 d), repeated here (without adverbs):

(13)

d. Poikaa puri koira.
   boy:PRTV bit dog:NOM

To begin with, the arguments in (41) cannot be switched around, the way they can in (13 d) (where, of course, the unmarked word order is SVO): the dative argument in (41) has to precede the nominative one.

(42)

a. Síðan hvenær hefur Jóni líkað þessi bjór?
   since when has Jon:DAT liked this beer:NOM
b. *Síðan hvenær hefur þessi bjór Jóni líkað/líkað Jóni?
   since when has this beer:NOM Jon:DAT liked/liked Jon:DAT

(The nominative argument can be first if it is moved to specCP, but this is of course irrelevant. The example is designed to rule out such an analysis.) Second, (41) may have unmarked focus structure, while (13d) requires focus on the agent. Yet another difference is that the dative argument can bind an anaphor inside VP in (41) (just like any grammatical subject), which the partitive object argument cannot do in (13d); compare (43 a, b):22

(43)

a. Jóni líkar húsið sitt.
   Jon:DAT likes house his:REFL NOM
b. *Poikaa puri koiransa.
   boy:PRTV bit dog:POSS NOM

The possessive suffix in Finnish is an anaphor with a distribution comparable to Germanic reflexive possessives (see Trosterud 1993). As shown in (43), an oblique argument in specIP in Icelandic can control a nominative reflexive possessive in VP, but an oblique (here partitive) argument in specIP in Finnish cannot control a nominative possessive suffix. In Icelandic an oblique NP may move to specIP with a nominative NP remaining inside VP only in passives and experiencer constructions ("psych-verb" constructions) like (41), that is, constructions lacking an agent argument. The crucial property which experiencer constructions have, which makes it possible to move the oblique experiencer to specIP, is that the experiencer is higher than the theme argument in D-structure: in the absence of an agent, the experiencer is the highest argument in VP (see Ottosson (1991), Holmberg & Platzack (1995: ch. 7)). On the other hand, to retain the formulation (40.2) of the parameter, we have to assume that the dative argument in (41) is "covertly nominative", so that it does not clash with the feature [+nom] on T; see Sigurðsson (1994) for discussion.23 There are constructions, though, where case morphology seems to have an effect on movement and (hence) 'freedom of word order'. This can be seen most clearly when comparing essentially the same construction involving NP-movement in two languages which differ minimally in that one but not the other has case morphology. 'Scrambling' in the double object construction in Dutch and German is a well-known case, recently discussed in Weerman (1994): German, but not Dutch, allows the order DO IO V as well as the order IO DO V with verbs like geben 'give' (DO = Direct Object, Theme, IO = Indirect Object, Benefactive), while Dutch only allows IO DO V. This difference is plausibly due to the fact that German but not Dutch has case morphology.
Scrambling in the double object construction is also found in Icelandic, but not in Mainland Scandinavian, again presumably due to the fact that Icelandic, but not Mainland Scandinavian, has case morphology (see Holmberg & Platzack 1995, Holmberg 1994). Another construction in Scandinavian where case morphology seems to have an effect on movement is the Scandinavian Object Shift construction. In Holmberg (1986) I argued that case morphology plays a crucial role in Scandinavian Object Shift. This explains why Object Shift is restricted to pronouns in Mainland Scandinavian while it applies to all NPs in Icelandic:
in Mainland Scandinavian only pronouns are overtly marked for case while in Icelandic all nouns are:

(44)

a. Icelandic
   Jón keypti þær/bækurnar ekki.
   John bought them/the-books:ACC not
b. Swedish
   Johan köpte dem/*böckerna inte.
   John bought them/the-books not

Faroese seems to be a counterexample at first, since it has morphological case on nouns (distinguishing nominative, accusative, and dative), yet only pronouns undergo Object Shift, just as in the caseless Mainland Scandinavian languages. However, as I have discussed in Holmberg (1994), a number of facts concerning the internal structure of noun phrases as well as the distribution of nominal arguments show that Faroese has "weaker" case than Icelandic, although morphologically virtually the only difference is that Icelandic has genitive (as a verbal as well as nominal case), in addition to nominative, accusative, and dative. For instance, while Icelandic, as mentioned, allows inversion, or 'scrambling', in the double object construction, allowing the order V DO IO in addition to the unmarked V IO DO, Faroese allows only V IO DO, just like Mainland Scandinavian and English.

(45)

a. Icelandic
   Hún gaf Kjartani bókina. / Hún gaf bókina Kjartani.
   she gave Kjartan:DAT book:DEF / she gave book:DEF Kjartan:DAT
   'She gave Kjartan the book.'
b. Faroese
   Hon gav Kjartani bókina. / *Hon gav bókina Kjartani.
   she gave Kjartan:DAT book:DEF / she gave book:DEF Kjartan:DAT

As mentioned, case morphology is, presumably, a precondition for the inversion exemplified in (45 a) (which also requires focus on the IO). However, (45 b) shows that case morphology is not a sufficient condition, since Faroese does not allow this inversion. Another fact indicating that case in Faroese is
weaker than in Icelandic is the following: in Icelandic a lexically assigned object case is always preserved under passivization; in Faroese it is usually not preserved, but replaced by nominative. For instance, the verb meaning 'help' selects a dative object in Icelandic as well as Faroese. In Icelandic the dative is preserved under passivization, but not in Faroese.

(46)

Icelandic
a. Hann hjálpaði Siggu.
   he helped Sigga:DAT
b. Siggu/*Sigga var hjálpað.
   Sigga:DAT/*NOM was helped

(47)

Faroese
a. Hann hjálpti Siggu.
   he helped Sigga:DAT
b. Sigga/*Siggu bleiv hjálpin.
   Sigga:NOM/*DAT was helped

Apparently, although case in Faroese is morphologically almost identical to case in Icelandic, it is somehow syntactically weaker than in Icelandic.24 If this is correct, we can maintain that case morphology facilitates movement, as long as (a) the movement is within a Domain, and (b) the case morphology is of the right kind. In other words, morphological case can play a role for Scrambling and Object Shift, and similar processes, which I suggest can be generalized as movement within a Domain, but not for movement across Domain-boundaries, and definitely not for A-bar-movement.25

7. Verb-final order in Finnish

As a rough generalization, SOV order is found in Finnish in certain embedded clauses or when a constituent is moved to specCP; see Vilkuna (1989: 121-125).

(48)

a. Jos koira poikaa puri, niin se ammutaan.
   if dog boy bit, then it will-be-shot
b. Milloin koira olisi poikaa purrut?
   when dog have:COND boy bit
   'When would the dog have bitten a boy?'
c. Eilen koira poikaa puri.
   yesterday dog boy bit
   'It was yesterday that the dog bit the/a boy.'
d. Koira-ko poikaa puri?
   dog-Q boy bit
   'Was it the dog that bit the/a boy?'
e. (*)Koira poikaa puri.

(48 a) exemplifies SOV order in an embedded clause. The other constructions are all plausibly analyzed as having a constituent moved to specCP, which, in the case of non-wh-phrases, yields the characteristic contrastive reading of the initial constituent. Thus (48 e) is acceptable only if the subject has the contrastive reading indicative of movement to specCP, i.e. "It was a dog that bit the boy".26 The fact that verb-final order is found in embedded clauses is reminiscent of the situation in German and Dutch, where V-final order is obligatory in embedded clauses. The generalization concerning verb-placement in German and Dutch is that the finite verb moves to C if and only if C is not occupied by a complementizer (see den Besten 1983).27 In fact, something similar holds true of Finnish: as mentioned, a category moved to specCP is always focused. Sometimes the fronted constituent has a focus particle suffixed to it. For instance in (48 d) the initial constituent has the question particle -ko suffixed to it. In addition to -ko, Finnish has a couple of other focus particles which may be attached to a constituent moved to specCP. I assume that whenever a constituent is fronted to specCP, there is a focus feature in C, which may or may not be phonetically realized. Assume that C with a focus feature counts as "filled". Then we can state the following generalization for verb-final order in Finnish:

(49)

Verb-final order is possible in Finnish if and only if C is filled.
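The biconditional in (49) can be spelled out as a small sketch. It assumes that C counts as filled either by a complementizer (as in the embedded clause (48 a)) or by a focus feature, overt or abstract (as in the specCP constructions). The function names and the hand-coded classification of the clauses in (48) are hypothetical and serve only as illustration.

```python
def c_is_filled(has_complementizer=False, has_focus_feature=False):
    """C counts as filled if it hosts a complementizer or a focus feature."""
    return has_complementizer or has_focus_feature


def verb_final_possible(has_complementizer=False, has_focus_feature=False):
    """Generalization (49): verb-final order is possible iff C is filled."""
    return c_is_filled(has_complementizer, has_focus_feature)


# Clause types of (48), encoded by hand (this encoding is an assumption).
CLAUSES = {
    "(48 a) embedded clause with jos 'if'": {"has_complementizer": True},
    "(48 b) wh-question": {"has_focus_feature": True},
    "(48 d) question with -ko": {"has_focus_feature": True},
    "(48 e) neutral reading": {},
    "(48 e) contrastive subject": {"has_focus_feature": True},
}

for label, features in CLAUSES.items():
    print(label, "->", verb_final_possible(**features))
```

On this encoding the only clause type that excludes verb-final order is the neutral declarative, which matches the parenthesized star on (48 e).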

This makes Finnish and Dutch-German look very similar, in that verb movement is in both cases conditioned by the content of C.28 I assume that V-final order is the result of leaving the verb in situ while moving all other constituents out of VP. If so, the question is, how is this possible without violating RM? In the case of object movement discussed in section 4, I put forth a version of the hypothesis that V-movement is a prerequisite for object movement out of VP: V-movement has the effect of extending the V-domain, within which crossing A-chains are permitted (perhaps only if the arguments have overt case).

Putting aside the case of subordinate clauses such as (48 a), it now appears that the focus feature in C has the same effect as verb movement on object movement out of VP. The following is a suggestion as to how to explain why this should be the case. The suggestion is based on the idea that violation of RM is allowed within Domains. Assume that a feature [+focus] in C has the effect that the rest of the sentence can be [−focus]. This can be understood as follows: every sentence must have a focus.29 The default setting is that VP is focus while IP outside VP is nonfocus. If, however, C is assigned the feature [+focus], with the effect that specCP is interpreted as focus, the rest of the sentence, including VP, can be [−focus]. This means that IP and VP constitute a single domain with respect to focus. This, in turn, I assume, means that RM is suspended, making it possible for the object to move out of VP, across the subject position. That is to say, [±focus] can define a Domain in the sense of definition (30). Why does Icelandic not employ this way of allowing SOV order? One crucial difference, I suggest, is the feature composition of C. Finnish has a special focus feature associated with C, which is realized overtly in questions by the question particle -ko, and frequently, though not obligatorily, by certain other focus particles in declaratives. By virtue of this feature, Finnish C can define a domain in which crossing A-chains are permitted. Although Icelandic has a specCP position which is a landing site for wh-phrases and can also be used for contrastive focusing of non-wh-constituents, C is "weaker" in Icelandic (reflected in the absence of focus morphology), leaving V-movement as the only way to create a domain where crossing A-chains are permitted.

8. Russian

Like Finnish, Russian permits any permutation of S, O, and V. I take this to mean that Russian has "Finnish values" for the two parameters in (40): it has V-movement, and T is not [+nom]. However, when taking focus structure and adverb placement into account, it appears that Russian is (at least) one notch freer still than Finnish. That is to say, there is an additional parameter which affects word order, and which has different values in Finnish and Russian. Consider first the construction with OVS order. With respect to this order, Russian and Finnish are alike: the subject is focus, the object nonfocus, and adverbs can be inserted anywhere, including sentence-initially.30

(50)

a. Finnish
   (Ilmeisesti) poikaa puri koira.
   evidently boy:PAR bit dog:NOM
b. Russian
   (Vidimo) malcika ukusila sobaka.
   evidently boy:ACC bit dog:NOM
   'Evidently the boy was bitten by a dog.'

This is accounted for if the subject is left inside VP, the verb is moved to a head position outside VP, the object is moved to a spec-position outside VP, more precisely specIP (the predication-subject position), and the adverb is adjoined to IP.

(51)

[IP vidimo [IP malcika_i [I' [I ukusila_j] [ExtVP [VP sobaka [V' e_j e_i]]]]]]

The analysis observes all the principles assumed so far, including RM. That Russian and Finnish are different is clearly shown by (52), the OSV construction:

(52)

a. Finnish
   *Ilmeisesti poikaa koira puri.
   evidently boy dog bit
b. Russian
   Vidimo malcika sobaka ukusila.
   evidently boy dog bit

In Finnish the order OSV is possible only if the object is focused while the subject is nonfocused, roughly "It was a/the boy that the dog bit". As discussed in the preceding section, V-final order is allowed (in a main clause) only when there is a focus feature in C, morphologically realized or abstract, which presupposes that there is a constituent in specCP or adjoined to C, either a wh-phrase or a contrastively focused non-wh-category. Since adverbs cannot adjoin to CP, (52 a) is ungrammatical. I assume, on the basis of (12) (see also note 3), that epistemic adverbs cannot adjoin to CP in Russian either. If so, the fact that (52 b) is well formed implies that the object is not in specCP, but rather in specIP (functioning as the predication-subject in this construction). That OSV sentences in Finnish and Russian are structurally distinct is indicated also by a clear difference in focus structure. (52 b) has two readings: under both readings the object must be nonfocus, but under one reading the subject is focus, roughly "Evidently it was a dog that bit the boy", while under the other reading the verb is focused, roughly "The boy was evidently BITTEN by a dog".

The contrast between (53 a, b) presents a parallel case:

(53)

a. Finnish
   *Ilmeisesti koira poikaa puri.
   evidently dog boy bit
b. Russian
   Vidimo sobaka malcika ukusila.
   evidently dog boy bit

The reason why (53 a) is ill-formed is exactly the same as in the case of (52 a): the order SOV is possible here only if the subject is in specCP, with a contrastive reading. But then the subject cannot be preceded by an adverb. Without the adverb the sentence is fine. Again, the Russian SOV sentence differs from the Finnish SOV sentence in that (a) in Russian an epistemic adverb may precede the subject and (b) the focus structure is different: in (53 b) the subject is nonfocus and either the object or the verb is focused. The situation in Russian is similar to that of Hungarian: in Hungarian a focused constituent is always in immediately preverbal position, being optionally preceded by constituents which are topics (non-focus, old information). Thus, in an SOV sentence S may be topic and O focus (alternatively S and O are both topics). Correspondingly, in an OSV sentence O can be topic, and S focus (alternatively S and O are both topics). In wh-questions the wh-phrase is in the focus position immediately preceding the verb, so that the wh-phrase, too, can be preceded by topicalized constituents. Such facts have led a number of scholars to assume that the Hungarian sentence contains a special focus position, situated somewhere between ExtTP and VP (in present terms); see Kiss (this volume) for discussion and references. It is characteristic of Hungarian that the focused constituent always immediately precedes the verb; for instance, adverbs may not intervene between the focused phrase and the verb. This implies that there is a head which takes a focused phrase as its specifier, and which must be lexicalized by the verb. Given no adjunction to the X' level, it then follows that the focus and the verb have to be adjacent. Several hypotheses regarding the identity of the focus position have been proposed in the literature; see Kiss (this volume) for discussion and references.
In the present framework it seems most plausible that the focus position is specExtVP. That is to say, in Hungarian the abstract functional head ExtV has the features [+focus, ±WH]. The head must be lexicalized by the verb. In Hungarian it seems clear that the focus position, by assumption specExtVP, is an A-bar position. Hence it will not interfere with A-movement from VP to specTP.

The structure of a Hungarian SOV sentence with focused O, for instance (54 a), would thus be roughly (54 b):

(54)

a. Janos Marit szereti.
   Janos Marit loves
   'As for Janos, it is Marit whom he loves.'
b. [TP Janos_i [T' T [ExtVP Marit_j [ExtV' [ExtV[+Foc] szereti_k] [VP e_i [V' e_k e_j]]]]]]

Although they are similar in some respects, there are one or two differences between Hungarian and Russian which argue against assuming a structure like (54 b) in the case of Russian. In particular, the verb and the focused preverbal object need not be adjacent. See (55), where an adverb intervenes between the object and the verb. (55)

Masha MISHU, konecno, pozvala.
Masha:NOM Misha:ACC of-course called
'Of course it was Misha whom Masha called.'

This argues against postulation of a sentence-medial special focus position in Russian. Instead, I propose the following: In Russian, ExtVP lacks inherent features altogether. This, I assume, means that it inherits all the properties of VP, including the property of being a focus domain. Thus, just like the subject left behind in VP will be focus, VP being a focus domain, an object moved to specExtVP will be focus. By contrast, in English, Icelandic, and Finnish ExtV has inherent features. I assume that one of these features is [ — focus]. Arguably, moving an argument A to the spec of a functional head B, whose features are all inherited from the complement C of B, is equivalent to adjoining A to C. If so, what sets Russian off from the other languages discussed here is that Russian allows adjunction of an object to VP. I shall maintain, however, following Kayne (1994) and Chomsky (1993), that arguments can only move to spec-positions, which means that there must be a head, be it an entirely featureless head, projecting a spec-position for the object in the Russian SOV construction.31 The structure of, for instance, (53 b) will then be (56): (56)

[ExtTP sobaka_i [TP e_i [T' T [ExtVP malcika_j [ExtV' ExtV [VP e_i [V' ukusila e_j]]]]]]]

If the object stops in specExtVP while the subject moves to specTP, and possibly on to specExtTP, we get an SOV construction with the object focused. If
the object moves on from ExtVP to specTP (which is possible because T is not [ + nom] in Russian) and possibly on to specExtTP, while the subject remains in specVP, we get an OSV sentence with subject focus. Finally, since both SOV and OSV sentences have a reading where only the verb is focused, we have to allow for the possibility that both arguments move out of ExtVP. I suggest that ExtTP, too, is a category without inherent features in Russian, which I take to mean that TP can be extended in principle indefinitely, by adding new featureless heads, each with an associated spec-position, to ExtTP. Thus any number of nonfocused phrases may precede ExtVP (i.e. the predicate phrase).32 Adverbs may be adjoined to, presumably, any of the clausal maximal projections including VP. Since ExtVP is automatically an extension of VP, copying all the features of VP, verb movement to ExtV is not required in order to permit object movement to specExtVP: SpecVP and specExtVP are in the same domain. Movement of the subject across specExtVP is, however, ruled out by RM as formulated. This is, obviously, a problem for the theory as formulated here. We will have to say that movement to specTP, the predication-subject position, is not subject to RM. This indicates that the two positions specExtVP and specTP do not have the same status in terms of the A/A-bar dichotomy. In a case like (55), where the object, by assumption in specExtVP, is focused, it seems relatively unproblematic to assume that the position counts as an A-bar position. Whether there is independent support for such an analysis, and whether it can be extended to all the relevant structures, I do not know.

9. Conclusions: four parameters

Summarizing, the following parameters distinguish the four languages discussed here, regarded as representatives of varying degrees of freedom of word order among European SVO languages:

(57)
1. ± (absence of verb movement)
2. ± (T is [+nom])
3. ± (ExtVP has inherent features)
4. ± (absence of [+focus] in C)

Following a suggestion by the series editor I now formulate the verb movement parameter and the C-focus parameter negatively (i.e. as ± absence of a property). In this way the value '+' entails rigidity, the value '−' freedom of word order. The parameters take the following values in the languages we have discussed.

(58)
Parameter   English   Icelandic   Finnish   Russian
1           +         −           −         −
2           +         +           −         −
3           +         +           +         −
4           +         +           −         +

English differs from the other three languages by its absence of V-movement. Icelandic and English differ from Finnish and Russian by having [+nom] T, thus excluding anything but subjects in specTP and specExtTP. Russian differs from the other three languages in that ExtVP has no inherent features, which entails that an argument in specExtVP is focused, just like an argument in specVP, on account of VP being a focus domain. In English, Icelandic, and Finnish ExtVP has the feature [−focus] (in addition, possibly, to other features). Hence, in order to create a domain where the object can move across the subject (by A-movement), the verb must move to ExtV, which is possible in Icelandic and Finnish, but not in English. Hence English is clearly the most rigid of the four languages, disallowing object A-movement across specVP altogether. Hungarian also has the value + for parameter 3, but in Hungarian ExtV is [+foc, ±WH], so that superficially Hungarian looks more like Russian than like, say, Finnish as regards sentential word order. Another parameter which may be important in this context is whether the language has case morphology or not. I suggested that crossing A-chains within domains, allowed by the formulation of RM assumed here, is possible only in languages which have overt case morphology.

Are the parameters in (57) independent of each other, or are they, or some subset of them, interdependent? As the theory stands, the parameters are construed as independent, defining 16 possible language types. The prediction is that, in principle, all these types should be represented among the languages of the world. If all 16 types can be attested, this shows that the parameters are independent, and the theory is thus supported. If not all 16 types can actually be attested, the parameters may still be independent, since the gaps may be accidental. On the other hand, if only very few of the 16 types can actually be attested, this implies that the theory is on the wrong track and that word order variation among the languages of the world is not characterizable by the four binary parameters proposed here. My impression is that none of the 16 types defined as possible by this theory seems to be a priori impossible, but it remains to be seen how many can actually be attested. For instance, the theory defines


as possible a language which lacks verb movement but has 'Russian values' for the other parameters (in other words, it has the combination + − − +). Assuming SVO as underlying order, and assuming that the subject cannot be postposed, the result would be an SVO language which permits SOV (with the object focused) and OSV (with the subject focused), but not VSO, VOS, or OVS, all of which presuppose verb movement. The theory also predicts as possible a language which is like Russian except that T is [+nom] (i.e. it has the combination − + − +). Again assuming SVO as underlying order and no rightwards movement of the subject, the language would permit SOV, but only with the object focused. It would, potentially, permit VSO, depending on how far the verb is allowed to move, but it would not permit VOS, since it does not allow O as predication subject. For the same reason it would not allow OSV or OVS except if O is in specCP. Furthermore, the theory predicts as possible a language which is like Icelandic except that C is [+focus], as in Finnish (i.e. − + + −). The result is a language which is like Icelandic except that it also permits SOV with a nonfocused object. And so on for other combinations of the four parameters: none appears to be a priori an impossible language.
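The independence claim can be made concrete with a small enumeration. The following Python sketch is an illustration only and not part of the analysis itself: it encodes the four parameters of (57) as booleans (True for '+'), records the values given in (58), and lists which of the 16 logically possible combinations are not covered by the four languages discussed. All identifiers (PARAMETERS, ATTESTED, show, all_types, unattested) are hypothetical names introduced for the example.

```python
from itertools import product

# The four parameters of (57); True stands for '+', False for '-'.
PARAMETERS = (
    "absence of verb movement",
    "T is [+nom]",
    "ExtVP has inherent features",
    "absence of [+focus] in C",
)

# Values from (58) for the four languages discussed in the text.
ATTESTED = {
    "English": (True, True, True, True),
    "Icelandic": (False, True, True, True),
    "Finnish": (False, False, True, False),
    "Russian": (False, False, False, True),
}


def show(values):
    """Render a tuple of parameter values as a string such as '+ - - +'."""
    return " ".join("+" if v else "-" for v in values)


def all_types():
    """Every combination of four independent binary parameters."""
    return list(product((True, False), repeat=len(PARAMETERS)))


def unattested(types, attested):
    """Combinations not instantiated by any language in `attested`."""
    seen = set(attested.values())
    return [t for t in types if t not in seen]


types = all_types()
gaps = unattested(types, ATTESTED)
print(len(types), "possible types;", len(gaps), "not covered by (58)")
for values in gaps:
    print(show(values))
```

The three hypothetical languages discussed above (+ − − +, − + − +, and − + + −) all fall among the combinations that (58) leaves uncovered, which is why they serve as test cases for the prediction.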

10. Other languages in Europe

How do all the other SVO languages of Europe pattern with respect to these parameters? As regards V-movement, I do not know of any European SVO language other than English that lacks V-movement (and, as mentioned, even English has movement of have and be). Absence of V-movement is (presumably) characteristic of strictly V-final languages, so if we take these languages into account (see Testelec (this volume) on SOV languages in Europe), English will not be alone, even in Europe.33 The languages which have [+nom] T, i.e. the 'subject prominent' languages, include the Germanic languages, and among the Romance languages at least French. According to the theory articulated here, there are three parameter values which allow deriving SOV order from an SVO base: ExtV lacks inherent features (Russian), ExtV is [+focus, +WH] (Hungarian), and C is [+focus] (Finnish). Which languages in Europe belong to the 'Hungarian type' as opposed to the 'Russian type' is a moot point; see Kiss (this volume) for discussion. As for parameter 4, which singles out Finnish among the four languages in this study, I have conducted a small investigation to determine which languages in Europe have the 'Finnish type' of SOV construction, as opposed to the Russian or Hungarian type. The most conspicuous difference between the two types is that O is nonfocus in the Finnish type, but (optionally) focus in the Russian and


Hungarian type. I asked informants speaking various SVO languages with optional SOV order whether a question like "What did John buy?" could be answered by a sentence with SOV order. If not, the language had the Finnish value for the relevant parameter. The languages from which I acquired data, apart from Finnish and Russian, are: Northern Saami, Estonian, Latvian, Lithuanian, Polish, Bulgarian, Greek, and Rumanian. Only one of these languages had the Finnish type of SOV construction, namely Northern Saami. This is, in fact, what the theory predicts, if the C-feature [+focus] is crucial in this construction, as I argued above: of the languages mentioned, Northern Saami is the one which resembles Finnish most closely as regards question and focus formation, employing various affixed focus particles in conjunction with fronting to specCP. Not even Estonian (genetically more closely related to Finnish than Saami is) has this type of question and focus formation, but instead has a sentence-initial free question particle. As expected, Estonian also does not have the Finnish type of SOV construction.34 For the other languages, in order to determine whether their SOV constructions are truly of the Russian type, more data are required, for instance regarding placement of epistemic adverbs, to determine whether the initial argument is inside or outside ExtTP.35
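The informant test just described amounts to a one-step decision procedure, which the following Python sketch restates purely for illustration. The class and function names are hypothetical. The judgements for the languages other than Finnish and Northern Saami are an assumption inferred from the statement that only Northern Saami patterned with Finnish; they are not reported item by item in the text. As noted above, a positive answer does not by itself separate the Russian type from the Hungarian type.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Judgement:
    language: str
    # True if an SOV sentence can answer an object question such as
    # "What did John buy?", i.e. the preverbal object can be focus.
    sov_answers_object_question: bool


def sov_type(judgement):
    """Classify a language's SOV construction by the question-answer test."""
    if judgement.sov_answers_object_question:
        return "Russian or Hungarian type"
    return "Finnish type"


# Assumed judgements, inferred from the summary in the text (see lead-in).
JUDGEMENTS = [
    Judgement("Finnish", False),
    Judgement("Northern Saami", False),
    Judgement("Russian", True),
    Judgement("Estonian", True),
    Judgement("Latvian", True),
    Judgement("Lithuanian", True),
    Judgement("Polish", True),
    Judgement("Bulgarian", True),
    Judgement("Greek", True),
    Judgement("Rumanian", True),
]

finnish_like = [j.language for j in JUDGEMENTS if sov_type(j) == "Finnish type"]
print(finnish_like)
```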

11. A note on Domains

Modifying a proposal by Chomsky (1993), I have proposed that V-movement 'extends the V-domain', and thereby creates a Domain where A-chains can be crossed. Which property of the verb is crucial for creation of a Domain? Plausibly, it is not the semantic, or thematic features of the verb, or even the verbal, categorial features as such, but rather the grammatical features which the verb is a carrier of. Most scholars who have investigated verb movement phenomena would probably agree that the verb does not move to I, or C, etc. in the capacity of being a verb, but in the capacity of being the carrier of tense, mood, and agreement inflections: The verb moves to 'pick up the inflections' in I or C, etc., assuming the incorporation theory of verbal inflections (cf. Baker 1988; Pollock 1989), or to 'check its morphological features' against those of I, or C, etc., assuming the checking theory of Chomsky (1993). The following is a suggestion: Domains are defined either by [±finite] or by [±focus], and languages differ as to whether they employ one or the other, or possibly both. In Icelandic, for instance, V-movement to ExtV, T, and ExtT has the effect of enlarging the finiteness domain, the verb being the carrier of finiteness features (or one of the carriers, if complementizers are also carriers of


finiteness, as argued by Holmberg & Platzack 1995). RM can be violated within a [+finite] or [−finite] domain, and the effect of V-movement on A-movement is that it can extend the [+finite] or [−finite] domain (the latter in languages where there is movement of nonfinite verbs as well). The Domains created by [+focus] in C in Finnish, or by extension of VP in Russian, do not, on the other hand, involve finiteness. If this is correct, we do not need to assume covert V-movement in, say, the Finnish and the Russian SOV constructions in order to account for the apparent violation of RM. Instead they represent alternative ways of creating Domains where crossing A-chains are allowed.

12. A note on the Germanic SOV languages

Recently Zwart (1994 a, b) has argued that Dutch (and by the same argument, German and Frisian) is underlyingly SVO. That is to say, the SOV order found in non-finite VPs and finite embedded clauses would be derived by object movement leftwards, from complement-of-V to a specifier position to the left of V.

(59) Jan heeft Marie (niet) gekust.
     Jan has Marie not kissed

Under this hypothesis finite object clauses, as in (60), are in the basic object position. That is to say, NPs (and most other complements) move leftwards, but not clauses.

(60) Piet heeft betreurd [dat Jan Marie gekust heeft].
     Piet has regretted that Jan Marie kissed has

If so, the main difference between Dutch and the other Germanic SOV languages on the one hand, and the Germanic SVO languages on the other hand, is that the former have an obligatory object NP movement leftwards which is insensitive to the focus, or definiteness, or other properties of the NP (only very heavy NPs may be exempted). Apart from this difference, Dutch, German, and Frisian seem to have 'Icelandic values' for the parameters in (57).36 As noted by Zwart (ibid.), if the NP object shifted leftwards moves out of VP (the position of the negation in (59) shows that it can move out of VP), it violates RM as formulated above (essentially following Chomsky 1993), since it moves the object across specVP without V-movement, without extending the VP domain to ExtVP the Russian way, and apparently without a special focus feature


in C, the Finnish way. Possibly, what makes this violation of RM possible in Dutch, German and Frisian is that the NP-movement is obligatory, triggered by some inherent feature of NPs and insensitive to features like focus or finiteness,37 and by virtue of this overrides RM. Alternatively, these languages employ some variant of the Finnish system, where a feature of C creates a domain where RM is suspended. The feature in question is presumably not [+focus], but might be [+finite], on the assumption that V2 languages have a finiteness feature in C (a hypothesis investigated in Holmberg & Platzack 1995). If so, it might not be a coincidence that the only Indo-European languages in Europe which are SOV (i.e. have obligatory NP object movement) are V2 languages.

Acknowledgements

I wish to thank the following persons for providing me with data and often other important observations and comments: Josep Fontana, Toomas Help, Baiba Kangere, Maria Koptjevskaja-Tamm, Chryssoula Lascaratou, Ruta Marcinkeviciene, Beatrice Primus, Halldor A. Sigurðsson, Kaja Tael, Marianne Tomma, Enric Vallduvi, Virginia Vasiliauskiene, Jordan Zlatev. I am especially grateful to my colleagues in the EUROTYP Constituent Order group (Anna Siewierska, Beatrice Primus, Dik Bakker, Jack Hawkins, Jan Rijkhoff, Katalin Kiss, Maggie Tallerman, Yakov Testelec, and Maria Vilkuna) for many helpful comments and observations, and more generally, for all contributing to the creation of an atmosphere of cooperation, common goals, and friendship in this initially rather heterogeneous group of linguists.

Notes

1. The Celtic languages are usually taken to be VSO (except possibly Breton; see Tallerman, this volume), while Dutch and German are usually taken to be SOV. Some of the Finno-Ugric languages are also predominantly SOV; see Vilkuna (this volume). The Celtic languages are not discussed at all in the present paper. Dutch and German are discussed quite briefly in section 8. However, at least in the case of Dutch-German, and with less confidence in the case of Celtic, I regard them as being only superficially different from the uncontroversial SVO languages in Europe, differing primarily with regard to the surface position of the verb and, in the case of Celtic, the subject. These are precisely the kinds of parameters which also distinguish among the uncontroversial SVO languages, as I will show. Among the important typological features that all these languages share is that they are clearly dependent-marking and nominative-accusative. In this they differ from the languages discussed


in Testelec (this volume) ('the European SOV languages'), which, apart from being (rigidly) SOV, are, at least for the most part, head-marking and ergative-absolutive.

2. See Guilfoyle et al. (1992) for evidence from Austronesian languages of VP-internal subjects.

3. The reason why the adverb cannot precede the finite verb/auxiliary in English or Icelandic, while it can in Finnish and Russian, is that the finite verb/auxiliary is obligatorily in C in English and Icelandic, but not in Finnish or Russian. A construction where the finite verb is in C in Finnish is yes-no questions (the verb affixed with the question morpheme -ko). This is also true of a form of yes-no questions in Russian, namely those employing the question morpheme li; see King (to appear). In neither case can an adverb (epistemic or other) precede the verb, which follows if adverbs cannot adjoin to CP, and if specCP is occupied by an abstract question operator.

(i) a. *(Zavtra) vyigraet li (zavtra) Ivan (zavtra) matč?
       (tomorrow) wins Q (tomorrow) Ivan (tomorrow) the-match
    b. *(Huomenna) voittaa+ko (huomenna) Juha (huomenna) ottelun?
       (tomorrow) wins+Q (tomorrow) Juha (tomorrow) the-match
    'Will John win the match tomorrow?'

4. This constraint may be a special case of a general constraint against adjunction of anything to CP, suggested by Chomsky (1986). See, however, Müller & Sternefeld (1993) for a different view on adjunction to CP. Cf. also the existence of adverbs like frankly, mentioned above. The standard analysis of adverbs as adjuncts has, in fact, been challenged in recent work by Guglielmo Cinque. Arguing against the analysis of adverbs as adjuncts is the fact that they typically occur in fixed sequences. For the purpose of this paper it does not matter whether adverbs are adjuncts or generated in specially defined positions. What matters is that given types of adverbs are confined to given structural domains.

5. Alternatively the verb may remain inside VP, being preceded by the adverb. When it is preceded by the adverb the verb is preferably focused, as predicted.

6. The subject need not be contrasted in the OVS construction, and need not even itself be the focus, as long as it is part of the focus. Thus (13 d) is a possible answer to the question "What happened to the boy?", where the whole VP is focused. A more cautious formulation of the condition on the subject in the OVS construction is that it cannot be Nonfocus.

7. The subject in (13 d) can be followed by other VP-internal material, for instance an adverbial, in which case both the subject and the adverbial are focused.

(i) Ilmeisesti poikaa puri koira käteen.
    apparently boy+PAR bit dog+NOM in-hand
    'Apparently the boy was bitten by a dog, in his hand.'

8. Here, as in many other places in this paper, the examples are constructed in order to highlight specific variables and compare them in a set of languages, and therefore the examples may not always sound perfectly natural. The claim is, however, that with another choice of lexical items and given a suitable context, the unstarred examples represent more or less natural word orders, while the starred ones do not. For example, (i) is a perfectly natural English sentence given the right context,


(ii) is acceptable but requires 'parenthetical intonation' of the adverb, while the word order in (iii) is perhaps acceptable if the adverb is intonationally set off clearly enough, constituting a kind of truncated sentence of its own.

(i) That book he apparently likes.
(ii) That book, apparently, he likes.
(iii) Apparently: That book he really likes.

Since the word order (iii) is not pronounceable as a single sentence, the initial adverb is starred in (15 b).

9. Negation, if present, is a head which may intervene between AgrS and T, in some descriptions, and in some languages. Pollock (1989) proposed the reverse order of AgrS and T. See discussion in Chomsky (1991). See also Ouhalla (1991), Siewierska (1993), and Holmberg et al. (1993).

10. One such construction is discussed in detail in Bobaljik & Jonas (to appear) and Jonas (1994): the Icelandic so-called Transitive Expletive Construction (i):

(i) Það hafa sennilega margir stúdentar þegar lesið þessa bók. (Icelandic)
    there have probably many students already read this book

In this construction, they argue, the thematic subject is in specTP; note that the fact that it precedes the adverb þegar shows that it is outside VP. In this construction the subject must be focused and indefinite (or more correctly, it must be quantified; cf. Vangsnes 1995). In Holmberg (1993 b) I argue that the thematic subject is in specTP also in constructions such as (21) and (ii), common in Swedish and Norwegian.

(ii) Jag tror att möjligen Johan har läst den boken. (Swedish)
     I think that possibly Johan has read this book

Just as in (21), the subject in (ii) must be either indefinite (representing new information) or contrastive.

11. The analysis predicts that a subject-first main clause may be preceded by an adverb if the subject is focused, a blatantly false prediction:

(i) *Greinilega HUNDURINN beit strákinn.
    apparently the-dog bit the-boy

For this reason, among others, I still regard the analysis of subject-first main clauses in Icelandic as an open question.

12. Cf. Shortest Move or the Minimal Link Condition of Chomsky (1993, 1995).

13. English has movement of have and be; see Pollock (1989).

14. In Chomsky (1993), RM is formulated in terms of the Shortest Move condition, which says, informally, that a movement should always be as short as possible. The shortest possible move for an object of V is to specVP; consequently movement of an object directly to specAgrO is ruled out. The exception clause in (28) is expressed by Chomsky in terms of the notion 'equidistance': V-movement to AgrO/ExtV has the effect of making specAgrOP and specVP equidistant from the complement of V position. Consequently movement of the object to specAgrOP is technically as short as movement to specVP, hence Shortest Move is not violated in (27), due to verb movement to AgrO.


15. This point has been made by Zwart (1994), not in relation to Finnish but in relation to Scandinavian.

16. In the checking theory of Chomsky (1993) obligatoriness of movement is formulated in terms of 'feature strength': For instance, if the functional head AgrS has a 'strong V-feature', V-movement to AgrS is obligatory in overt syntax, since strong features have to be checked off in overt syntax, i.e. before the sentence enters PF. Weak features need not be checked off before PF, and therefore, if AgrS has a weak V-feature, V-movement to AgrS can be postponed to LF. The general idea is that all languages look the same at LF, with the corollary that all languages have V-movement to T, AgrS, etc. in order to check verbal features, but while some languages have it overtly, i.e. before spell-out to PF, other languages have it covertly, i.e. after spell-out. It could be added that so far the postulation of strong or weak features has provided little more than another name for obligatory movement.

17. I do not use the terms 'subject prominent' and 'topic prominent' in quite the same way as Kiss (this volume). The present terminology is closer to that of Kiss (1994).

18. Recall that VP is a focus domain, more precisely the domain of new information, while the IP-domain is the domain of old information (the presupposition domain). The function of ExtV can then be seen as providing a place for an argument which is old information, yet is part of the predicate (in the languages discussed so far; below I will propose that ExtV may have other functions in other languages).

19. See Speas (1990: 14 ff., 74) on thematic hierarchies. See Speas (1990: ch. 2) for arguments that the thematic hierarchy is mirrored in the syntax in the structure of VP.

20. Chomsky (1995: ch. 4) argues, on the basis of theory-internal considerations, that subject-verb agreement is checked by movement of the relevant features (person, number, etc.) from an NP in VP to the functional head containing AgrS, where the feature movement may or may not be visible overtly as movement of the entire NP. We may here adopt this theory, in which case (35) would be a case where the feature movement is not accompanied by movement of the entire NP.

21. For ease of exposition I have not considered constructions with auxiliary verbs in this paper. I believe, however, that consideration of sentential structures with auxiliaries does not require more than some minor adjustments in the theory. In particular we need to postulate at least one more functional head with a sentential projection. It could be noted that the subject in the Finnish OVS construction will be postverbal also when an auxiliary is added.

(i) a. Ilmeisesti poikaa on purrut koira.
       apparently boy:PRTV has bitten dog:NOM
    b. *Ilmeisesti poikaa on koira purrut.

This is consistent with the hypothesis that object movement across the subject requires movement of the main verb, in order to extend the V-domain. The construction in (ia) requires postulation of two additional functional heads, one hosting the auxiliary, and one serving as a landing site for the moved participial main verb: see Holmberg et al. (1993).

22. See Guilfoyle et al. (1992) on subjects, topics and anaphora.

23. It is not clear how to rule out the possibility that the nominative Theme argument first moves to specExtVP (by Object Shift), and then on to specTP, given that it seems to have the right case. This indicates that the exact formulation of the parameter may have to be reconsidered. I will leave this problem unresolved here. See


Taraldsen (1994) for a theory of case and agreement (in Icelandic and Faroese) where the nominative object in constructions such as (38) is checked in (i.e. moves to) specAgrOP.

24. The other facts which, according to Holmberg (1994), point in the same direction are: (a) Faroese has obligatory definite and indefinite articles, Icelandic does not; (b) Faroese but not Icelandic has "free experiencers", that is, optional arguments such as John in I baked John a cake; (c) Icelandic has Object Shift of lexical NPs, Faroese only of pronouns.

25. That case-morphology does not affect A-bar-movement is shown by the fact that the Mainland Scandinavian languages are at least as free as Icelandic as regards movement to spec(CP), i.e. "topicalization" and wh-movement, although the result is not infrequently a structurally ambiguous sentence, due to the V2 requirement in conjunction with lack of overt case or agreement morphology. Thus (i) is a perfectly well-formed and commonplace Swedish question in spite of the ambiguity (note that Swedish has neither case morphology nor subject-verb agreement morphology to disambiguate such constructions).

(i) Vem älskar John?
    who love John
    'Who does John love?' or 'Who loves John?'

26. That the subject is in specCP in this construction, when contrastively focused, can be confirmed by a number of tests; for instance, it cannot be preceded by an adverb:

(i) *Ilmeisesti koira poikaa puri.
    apparently dog:NOM boy:PRTV bit

See Holmberg & Nikanne (1994).

27. Whether the finite verb moves to C also in main clauses with an initial subject, with concomitant subject movement to spec(CP), is a highly controversial question. See Zwart (1994 b) for a recent discussion.

28. There are other similarities: for instance, in Finnish as well as in Dutch-German, basically all kinds of verb complements and modifiers can occur in preverbal position, except object clauses. The fact that object clauses are always postverbal in Dutch and German is the main empirical reason for analyzing these languages as underlyingly SVO, with OV order derived by complement movement to the left, in Zwart (1994 a, b). A clear difference between Finnish and Dutch-German is that V-final order seems never to be obligatory in Finnish.

29. See Vallduvi (1992: 9 ff., 45 ff.). In some cases the whole sentence can be focus, as typically, but not necessarily, in the case of existential sentences.

30. As in the case of Finnish, the subject in the Russian OVS construction need not itself be the focus, but must be part of the focus. Thus (48 b) can be used to answer the questions "What animal bit the boy?" (focus on the subject), "What happened to the boy?" (focus on the VP), or even, in a context where the boy is known but the dog is new in the discourse, "What happened?".

31. For one thing, this allows us to express the difference between Russian and Hungarian with regard to sentence structure very simply: In Hungarian but not in Russian ExtV has the feature [±WH].


32. According to Kiss (this volume) this is characteristic of one type of topic-prominent languages: In principle any number of topics (in our terms, non-focused arguments) may (or must) move to the IP-domain.

33. Siewierska (this volume) shows that languages classified as V-final usually do not have VSO or VOS as alternative orders. This generalization is explained if such languages do not have V-movement. Many of these SOV languages have SVO as an alternative word order. This is consistent with the hypothesis that they lack V-movement if we assume (a) that SVO is universally the underlying order (following Kayne 1994), and (b) that so-called SOV languages have more or less obligatory movement of the object across the verb, which in these languages does not move (see the text below on SOV order in Dutch, Frisian, and German). A weaker hypothesis is that just those so-called SOV languages which have SVO as an alternative order actually have underlying SVO order with object movement applying in the unmarked case.

34. In fact Russian li-questions (see King, to appear) are very similar to Finnish and Northern Saami questions. However, while Finnish and Northern Saami have several focus morphemes appearing in C, and obligatorily form yes/no questions with a focus particle, Russian has only one, namely li, involved in one form of yes/no question.

35. From Lithuanian I have enough data (thanks to Virginija Vasiliauskiene) to claim that Lithuanian has the 'Russian value' for parameter 3, i.e. ExtV lacks inherent features.

36. As far as verb movement is concerned, Dutch, German, and Frisian are (arguably) closer to Mainland Scandinavian than to Icelandic, since the verb moves overtly only to C, not to I, as in Icelandic (see Zwart 1994 a, b).

37. In addition, the languages in question have NP-movement which is sensitive to focus, shifting specific NPs to a higher position than unspecific ones, for instance. It is interesting to compare Northern Saami to these languages (Marit Julien p.c.): At least some varieties of Northern Saami appear to be 'semi-SOV languages', with 'short NP movement' leftwards in nonfinite VPs, where the movement is sensitive to focus. Thus (i) is the unmarked order if the object is non-focus, while (ii) is preferred if the object is focus (for instance as an answer to "What did you buy?"):

(i) Mun lean girjji lohkan.
    I have book read

(ii) Mun lean lohkan girjji.


References

Bakker, Dik
  this volume "Flexibility and consistency in word order patterns in the languages of Europe".
Besten, Hans den
  1983 "On the interaction of root transformations and lexical deletive rules", in: Werner Abraham (ed.), On the formal syntax of the Westgermania. Papers from the 3rd Groningen Grammar Talks, January 1981, Amsterdam and Philadelphia: John Benjamins, 47-131. Also published as chapter one in Besten, Hans den: Studies in West Germanic syntax. Amsterdam, Atlanta: Rodopi, 1989.
Bobaljik, Jonathan & Dianne Jonas
  1996 "Subject positions and the roles of TP", to appear in Linguistic Inquiry 27: 195-236.
Borer, Hagit
  1984 Parametric syntax. Dordrecht: Foris.
Chomsky, Noam
  1986 Barriers. Cambridge, Mass.: The MIT Press.
  1991 "Some notes on economy of derivation and representation", in: Robert Freidin (ed.), Principles and parameters in comparative grammar, Cambridge, Mass.: The MIT Press, 417-454.
  1993 "A minimalist program for linguistic theory", in: Kenneth Hale & Jay Keyser (eds.), A view from Building 20, Cambridge, Mass.: MIT Press.
  1995 The Minimalist program. Cambridge, Mass.: The MIT Press.
Guilfoyle, Eithne, Henrietta Hung & Lisa Travis
  1992 "SPEC of IP and SPEC of VP: Two subjects in Austronesian languages", Natural Language and Linguistic Theory 10: 375-414.
Holmberg, Anders
  1986 Word order and syntactic features in the Scandinavian languages and English, PhD dissertation, Department of Linguistics, University of Stockholm.
  1993 a "On the structure of predicate NP", Studia Linguistica 47: 126-138.
  1993 b "Two subject positions in IP in Mainland Scandinavian", Working Papers in Scandinavian Syntax 52, Department of Scandinavian Languages, University of Lund.
  1994 "Morphological parameters in syntax: the case of Faroese", Report 35, Department of General Linguistics, University of Umea.
Holmberg, Anders & Urpo Nikanne
  1994 "Expletives and subject positions in Finnish", in: M. Gonzalez (ed.), NELS 24, University of Massachusetts, Amherst MA.
Holmberg, Anders, Urpo Nikanne, Irmeli Oraviita, Hannu Reime & Trond Trosterud
  1993 "The structure of INFL and the finite sentence in Finnish", in: Anders Holmberg & Urpo Nikanne (eds.), Case and other functional categories in Finnish syntax, Berlin: Mouton de Gruyter, 177-206.
Holmberg, Anders & Christer Platzack
  1995 The role of inflection in Scandinavian syntax. New York and Oxford: Oxford University Press.


Kayne, Richard
  1994 The antisymmetry of syntax, Cambridge: MIT Press.
King, Tracy
  1993 Configuring Topic and Focus in Russian. [Unpublished Ph.D. dissertation, Department of Linguistics, Stanford University.]
  to appear "Focus in Russian li-questions", to appear in Journal of Slavic Linguistics 2.
Kiss, Katalin E.
  1987 Configurationality in Hungarian. Dordrecht: Reidel.
  1994 "Sentence structure and word order", in: Ferenc Kiefer & Katalin E. Kiss (eds.), Syntax and Semantics 27. The syntactic structure of Hungarian. San Diego: Academic Press, 1-90.
Mahajan, Anoop
  1990 The A/A-bar distinction and Movement Theory. [Unpublished Ph.D. dissertation, MIT.]
Müller, Gereon & Wolfgang Sternefeld
  1993 "Improper movement and unambiguous binding", Linguistic Inquiry 24: 461-507.
Ouhalla, Jamal
  1991 Functional categories and parametric variation, London and New York: Routledge.
Ottosson, Kjartan
  1991 "Psych-verbs and binding in Icelandic", in: Halldor A. Sigurðsson et al. (eds.), Papers from the Twelfth Scandinavian Conference of Linguistics, Linguistic Institute, University of Iceland.
Platzack, Christer
  1986 a "COMP, INFL, and Germanic word order", in: Lars Hellan & Kirsti Koch Christensen (eds.), Topics in Scandinavian syntax, Dordrecht: Kluwer, 185-234.
Pollock, Jean-Yves
  1989 "Verb movement, universal grammar, and the structure of IP", Linguistic Inquiry 20: 365-424.
  1994 "Checking theory and bare verbs", in: Guglielmo Cinque et al. (eds.), Paths towards universal grammar: Studies in honor of Richard S. Kayne, Washington DC: Georgetown University Press, 293-310.
Rizzi, Luigi
  1990 Relativized minimality, Cambridge MA: MIT Press.
Rohrbacher, Bernhard
  1993 The Germanic languages and the full paradigm: a theory of V to I Raising. [Unpublished Ph.D. dissertation, University of Massachusetts at Amherst.]
Rögnvaldsson, Eirikur & Höskuldur Thrainsson
  1990 "On Icelandic word order once more", in: Joan Maling & Annie Zaenen (eds.), Syntax and Semantics 24. Modern Icelandic syntax. San Diego: Academic Press, 3-40.
Siewierska, Anna
  1993 "On the ordering of subject agreement and tense affixes", EUROTYP Working Papers, 11/5, 101-126.
  this volume "Variation in major constituent order: a global and European perspective".


Sigurðsson, Halldór Á.
1989 Verbal syntax and case in Icelandic in a comparative GB approach, Ph. D. dissertation, Department of Scandinavian Languages, Lund University.
1994 "Morphological case, abstract case, and licensing", in: Cecilia Hedlund & Anders Holmberg (eds.), Papers from the 14th Scandinavian Conference of Linguistics: Special session on Scandinavian syntax, Department of Linguistics, University of Gothenburg.
Speas, Margaret J.
1990 Phrase structure in natural language. Dordrecht: Kluwer.
Tallerman, Maggie
this volume "Celtic word order: Some theoretical issues".
Taraldsen, K. Tarald
1994 "Reflexives, pronouns and subject/verb agreement in Icelandic and Faroese", Working Papers in Scandinavian Syntax 54, Dept. of Scandinavian Languages, University of Lund.
Testelec, Yakov
this volume "Word order variation in European SOV languages".
Travis, Lisa
1992 "Inner aspect and the structure of VP", Cahiers de Linguistique de UQAM vol. 1, no. 1.
Trosterud, Trond
1993 "Anaphors and binding domains", in: Anders Holmberg & Urpo Nikanne (eds.), Case and other functional categories in Finnish syntax, Berlin: Mouton de Gruyter, 225-243.
Vallduví, Enric
1992 The informational component. New York and London: Garland.
Vangsnes, Øystein
1995 "Referentiality and argument positions in Icelandic", Working Papers in Scandinavian Syntax 55, Dept. of Scandinavian Languages, University of Lund.
Vikner, Sten
1995 Verb movement and expletive subjects in the Germanic languages. New York and Oxford: Oxford University Press.
Vikner, Sten & Bonnie Schwartz
1995 "The verb always leaves IP in V2 clauses", in: Adriana Belletti & Luigi Rizzi (eds.), Parameters and functional heads. Essays in comparative syntax. New York and Oxford: Oxford University Press, 244-268.
Vilkuna, Maria
1989 Free word order in Finnish. Helsinki: Suomen kirjallisuusseura [The Finnish Literature Society].
this volume "Word order in European Uralic".
1995 "Discourse configurationality in Finnish", in: Katalin E. Kiss (ed.), Discourse-configurational languages, New York and Oxford: Oxford University Press, 244-268.
Weerman, Fred
1994 "Scrambling and morphological case without case theory", ms., OTS Utrecht University.


Zaenen, Annie, Joan Maling & Höskuldur Thráinsson
1985 "Case and grammatical functions: the Icelandic passive", Natural Language and Linguistic Theory 3: 441-483.
Zwart, Jan-Wouter
1994a "Dutch is head initial", The Linguistic Review 11: 377-406.
1994b Dutch syntax. A Minimalist approach. Department of General Linguistics, University of Groningen.
1994c "On Holmberg's Generalization", in: A. de Boer, H. de Hoop & H. de Swart (eds.), Language and cognition 4, University of Groningen, 229-242.

Maggie Tallerman

Celtic word order: some theoretical issues1

1. Introduction

Amongst the languages of Europe, only the Celtic family is typified by VSO word order. However, none of the Celtic languages exhibit VSO order in all clause types. Some, as we will see, rarely exhibit that order in main clauses; and in each of the languages there are numerous optional and obligatory word order variations. Main clauses do not necessarily follow the pattern of embedded clauses, and finiteness can affect word order. Furthermore, changes in word order due to focalization and topicalization are particularly prevalent in Celtic. An outline of the basic facts about Celtic word order can be found in Tallerman, this volume.

In the current paper I examine the main word order permutations in Celtic, from the theoretical perspective of generative grammar. Section 2 considers the range of fronting constructions found in P-Celtic.2 This branch contains the two languages with the most striking word order variations, Breton and Cornish, and the discussion will concentrate on certain aspects of the syntax of these languages, and of Middle Welsh. Section 3 turns to the Celtic copular clause: through various historical developments these clauses have tended to lose their verb-initial character. Copular verbs have been analysed as moving to the complementizer position in both branches of Celtic. Some of these verbs have been grammaticalized as complementizers, and then lost, whilst elsewhere, movement of a constituent to the Spec, CP position leads to non-VSO order. Section 4 outlines a phenomenon which is peculiar to Irish, that of pronoun postposing. There is a strong tendency for Irish object pronouns to migrate to the end of the clause, contrary to the typical behaviour of pronouns both crosslinguistically and in Irish. Section 5 will be devoted to a consideration of the analysis of Celtic clause structure in generative grammar.
I critically examine a proposal that Celtic languages are atypical of VSO languages, and suggest finally that there is no one single "VSO type". Indeed, even within a single language family such as Celtic, there is evidence of important syntactic distinctions.

Before turning to the major word order variations in Celtic, I will summarize briefly the basic word order facts. All embedded finite clauses have the inflected verb in initial position in unmarked word order; the same is true of finite main clauses, except in Breton and Cornish, where verb-initial word order is mostly ungrammatical. Infinitival clauses in Celtic are subject-initial, although the position of the object varies between and within languages, giving both SOV and SVO orders. Fronting constructions, both topicalizations and focalizations, are frequent in all the Celtic languages and are not highly marked. Finite clauses are often VSO, but a more useful term is "finite-verb initial", since there are two main clause types: firstly, a VSOX pattern:
(1)

Irish (Ó Siadhail 1989: 213)
Chonaic mé an fear ar maidin.
see:PST I the man on morning
'I saw the man this morning.'

(2)

Welsh
Gwelais i'r dyn y bore yma.
see:PST:1SG I-the man the morning here
'I saw the man this morning.'

Secondly, there is a periphrastic pattern, which has an initial finite auxiliary verb, then the subject, and a verb phrase constituent containing the non-finite lexical verb and the object:3 (3)

Irish (Ó Siadhail 1989: 236)
Bhí an fear ag péinteáil cathaoir inné.
be:PST the man PROG paint chair yesterday
'The man was painting a chair yesterday.'

(4)

Welsh
Roedd y dyn yn peintio cadair ddoe.
be:PST:3SG the man PROG paint chair yesterday
'The man was painting a chair yesterday.'

2. Fronting constructions and V2: word order in P-Celtic

2.1. Outline of word order in Breton and Cornish

Although in most respects Breton is typologically similar to its closest living relative, Welsh, there is no direct Breton counterpart to (2). In fact, V[+finite]SO word order is usually ungrammatical:

(5)

Breton (Borsley 1990: 81)
*Lenn Anna al levr.
read:PRS Anna the book
('Anna reads the book.')

Instead, some constituent other than the finite verb must appear in initial position; for example, the subject, as in (6a), the object, (6b) or a negative particle, (6c):4
(6)

Breton
a. Anna a lenn al levr.
   Anna PTL read:PRS the book
   'Anna reads the book.'
Borsley (1990: 82)
b. Al levr a lenn Anna.
   the book PTL read:PRS Anna
   'Anna reads the book.'
c. Ne lenn ket Anna al levr.
   NEG read:PRS NEG Anna the book
   'Anna doesn't read the book.'

Because of these facts, Breton is often characterized as a verb-second (V2) language; see for example Stump (1989). However, we will see that neither Breton nor Cornish conform to the typical V2 pattern found, for example, in Germanic languages. Most of what is known about Cornish dates from around 1400 to 1700 AD, a period in which the word order appears to have been stable: see George (1991, 1993). There are strong superficial similarities with modern Breton. Some constituent must normally precede the verb. Most frequently the subject is initial, in both Middle and Late Cornish (from around 1200 to the late eighteenth century). The resulting SVO word order is pragmatically unmarked: (7)

Cornish (adapted from Gregor 1980: 217, 147)
a. My a aswon Albanek.
   I PTL know:PRS Scotsman
   'I know a Scotsman.'
b. An venen a dheth.
   the woman PTL come:PST
   'The woman came.'


In unmarked negative clauses the negative particle is initial, and the finite verb is in second position: compare the Breton (6c): (8)

Cornish (from Beunans Meriasek, cited by George 1991: 226-7)
a. Ny seff henna yth galloys.
   NEG stand:PRS:3SG that in:2SG power
   'That does not stand in your power.'
b. Ny vyn mernans ov gueles.
   NEG wish:PRS:3SG death 1SG see
   'Death does not wish to see me.'

Other constituents are focalized when they are fronted: (9)

Cornish (from Beunans Meriasek, cited by George 1991: 228, 239)
a. Dour ny effsen eredy.
   water NEG drink:COND:1SG certainly
   'I would certainly not drink water.'
b. Hedhyw y hwelsons ...
   today PTL see:3PL
   'Today they see ...'

Both Breton and Cornish also have periphrastic clauses, but again, the finite verb cannot be in absolute initial position, as the ungrammaticality of (10) shows: (10)

Breton
*Ra Anna lenn al levr.
do:PRS Anna read the book
('Anna reads the book.')

A typical Cornish periphrastic clause is shown in (11): (11)

Cornish (from Beunans Meriasek, cited by George 1991: 213)
Eff a ra scollya the goys.
he PTL do:PRS spill 2SG blood
'He will spill thy blood.'


In both languages, the entire VP can be fronted and so focalized: (12)

a. Breton
   [VP Lenn al levr] a ra Anna.
   read the book PTL do:PRS Anna
   'Anna reads the book.'
b. Cornish (from Beunans Meriasek, cited by George 1991: 235)
   [VP Terry y wormenadow] a regh why heb feladow.
   break 3MSG commandment:PL PTL do:PRS:2PL you without fail
   'Break his commandments you do without fail.'

However, what distinguishes Breton and Cornish from the remaining Celtic languages is that a non-finite verb minus its complement can appear in sentence-initial position, as (13) illustrates from Breton: (13)

Breton (Borsley 1990: 83)
Lenn a ra Anna al levr.
read PTL does Anna the book
'Anna reads the book.'

I return to the construction in (13) in § 2.2. In some cases both Cornish and Breton do have an initial finite verb. Firstly, certain finite forms of the verb 'be' (bezan in Breton, bos in Cornish) are sentence-initial: the "locative" form of Breton bezan which occurs in initial position is used in stage-level predications according to Robin Schafer (p.c.).5 (14)

Breton (Press 1986: 148)
Emaint o tebriñ.
be:PRS:3PL PROG eat
'They're eating.'

Secondly, VSO is the unmarked order of finite embedded clauses in both languages: (15)

Breton (Borsley 1990: 82)
Gouzout a ran [e lenn Anna al levr].
know PTL do:1SG PTL read:PRS Anna the book
'I know that Anna reads the book.'

(16)

Cornish (adapted from Gregor 1980: 190)
Certan yu [y scryfas dhedhy].
certain be:PRS PTL write:PST:3SG to:3FSG
'It is certain that he wrote to her.'

Embedded clauses are well known to be more conservative, and are often considered a more reliable indicator of word order type than main clauses (see Hawkins 1983); German and Dutch, for example, only exhibit V2 effects in main clauses. Finite verbs also occur initially in main clauses preceded by a subordinate clause, although this is compatible with a V2 analysis of Breton; if the subordinate clause is seen as an initial XP constituent, then the finite verb is predictably in second position:
(17)

Breton (Anderson 1981: 28)
E-kreiv evan e vanne, e kwezhas Yann war e hed.
after drink his glass PTL fall:PST Yann on his length
'After drinking his glass, Yann fell flat out.'

2.2. Long Head Movement in Breton and Cornish

The Breton construction illustrated in (13) involves V-fronting of a non-finite verb without its complement(s): in other words, a head rather than a phrasal constituent is fronted.6 Three different types of non-finite verb occur in this construction in Breton: firstly, the "verb-noun" or infinitival form, as illustrated in (13) and in (18):
(18)

Breton (Press 1986: 188)
Gwelout a ra Yann e vignonez.
see PTL do:PRS Yann his girlfriend
'Yann sees his girlfriend.'

In (18), the finite verb is a form of ober 'do'. Secondly, the past participle can front, in which case the finite verb is the perfective auxiliary, a form of endevout 'have': (19)

Breton (Borsley 1990: 82)
Lennet en deus Tom al levr.
read:PART 3MSG have Tom the book
'Tom has read the book.'


Thirdly, a passive participle can front, and the finite verb is the passive auxiliary, a form of bezan 'be': (20)

Breton (Borsley, Rivero & Stephens 1996: 54)
Lennet e oa al levr gant Yann.
read:PART PTL be:PAST the book by Yann
'The book was read by Yann.'

These examples are not at all emphatic, and do not produce the effects of topicalization or focalization; this distinguishes them pragmatically from sentences in which an entire VP or other constituent has been fronted. Although rarer than in Breton, the V-fronting construction also occurs in Middle Welsh and in Cornish: (21)

Cornish (from Beunans Meriasek, cited by George 1991: 233)
avel hermyt ... speyna a ra y dethyou.
like hermit spend PTL do:PRS 3MSG day:PL
'Like a hermit ... he spends his days.'

(22)

Cornish (adapted from Gregor 1980: 255)
Leverel a wrug dhedha ...
say PTL do:PST:3SG to:3PL
'And he said to them ...'

This type of V-fronting, which occurs in several other European languages,7 is known in the generative literature as Long Head Movement. An account of Breton in terms of Long Head Movement is proposed by Borsley, Rivero and Stephens (1996):8 the verb moves from its underlying position in VP directly to the complementizer position, C°; since the verb bypasses the intermediate head INFL, its movement is "long":

(23) lennet_i en deus Tom e_i al levr (= (19))


Syntactically, there is good evidence for distinguishing V-fronting from topicalization or focalization, as V-fronting has a number of distinctive features.9 Firstly, it is clause-bound: (24)

Breton (Borsley, Rivero & Stephens 1996: 55)
*[Desket] am eus klevet he deus Anna he c'henteliou.
learn:PART 1SG have:PRS hear:PART 3FSG have:PRS Anna 3FSG lesson:PL
('I've heard that Anna has learnt her lessons.')

Topicalization is not clause-bound: (25) illustrates unbounded movement of NP, and (26), of VP: (25)

Breton (Borsley, Rivero & Stephens 1996: 55)
[NP Al levr] a lavaras Yann e lennas.
the book PTL say:PST Yann PTL read:PST
'Yann said that he read the book.'

(26)

Breton (Stephens 1982: 99)
[VP Sevel ar mogeriou] a ouien e rae ar vasonerien.
raise the wall:PL PTL know:PRS:1SG PTL do:PST the mason:PL
'I knew that the masons built the walls.'

Secondly, whilst topicalizations and clefts can occur in negative clauses, V-fronting cannot:
(27)

Breton
*Lennet n'o deus ket ar baotred al levr.
read:PART NEG-3PL have:PRS NEG the boy:PL the book
'The boys have not read the book.'

Thirdly, V-fronting is restricted to root clauses, as (28) illustrates:10 (28)

Breton (Borsley, Rivero & Stephens 1996: 59)
*Lavaret he deus Anna [lennet en deus Tom al levr].
say:PART 3FSG have:PRS Anna read:PART 3MSG have:PRS Tom the book
('Anna said Tom had read the book.')


This example would be grammatical if the finite verb were initial in the embedded clause, rather than the non-finite verb. Fourthly, V-fronting does not co-occur with topicalization of XP: (29)

Breton (Borsley, Rivero & Stephens 1996: 60)
*Al levr lennet en deus Tom.
the book read:PART 3MSG have:PRS Tom
('Tom has read the book.')

Borsley et al. show that such properties are all typical cross-linguistic features of the Long Head Movement construction. Another general property is that the construction is obligatory, in the sense that if no other constituent is fronted, then V-fronting must apply. It thus forms a kind of repair device for sentences which would otherwise have an illicit clause-initial structure. As (29) shows, V-fronting does not co-occur with movement of XP into Spec, CP, since such movement would already have created a legitimate clause-initial structure. An initial negative complementizer also creates a licit clause structure.11 To summarize, the data in § 2.1 and § 2.2 indicate that Breton and Cornish are typologically VSO (the order in embedded finite clauses) but have a general constraint against sentence-initial finite verbs. In generative accounts, this constraint has been variously characterized as V2, or as Long Head Movement; the reader is referred to the references cited for further details.
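The generalization summarized in this section can also be restated procedurally. The following Python sketch is offered only as an illustration and is not part of any of the analyses cited: the function names and the flat list-of-labels representation of a clause are assumptions made for this example. It encodes the constraint that a main clause may not begin with the finite verb, and it treats V-fronting as a last-resort repair that applies only when nothing else precedes the finite verb. Embedded clauses and the initial forms of 'be' in (14) are deliberately left out of the model.

```python
# Toy model of the Breton main-clause pattern of sections 2.1-2.2.
# All labels and function names are hypothetical simplifications.

FINITE = "V[+finite]"
NONFINITE = "V[-finite]"
PARTICLE = "PTL"


def is_licit_main_clause(order):
    """Licit in this toy model if the clause does not start with the finite verb."""
    return bool(order) and order[0] != FINITE


def repair_by_v_fronting(order):
    """Front a bare non-finite verb only if the clause is finite-verb initial."""
    if is_licit_main_clause(order):
        # Something already precedes the finite verb, so no V-fronting, cf. (29).
        return list(order)
    if NONFINITE not in order:
        # Nothing to front in this simplified model.
        return None
    rest = [label for label in order if label != NONFINITE]
    return [NONFINITE, PARTICLE] + rest


# The ungrammatical periphrastic order of (10) is repaired to the pattern of (13).
print(repair_by_v_fronting([FINITE, "S", NONFINITE, "O"]))
```

The design mirrors the text in one respect only: V-fronting is modelled as applying when, and only when, no other element occupies the initial position.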

2.3. Other views of Breton word order in the literature

From a pragmatic point of view, topicalization of an NP, PP or VP constituent gives different degrees of emphasis (see Stephens 1982, Press 1986: ch. 4 for discussion). Although fronting of PP, VP and object NP indicates focus on the preposed constituent in all varieties of Breton, a fronted subject NP may be either unemphatic or focussed; see Varin (1979), Timm (1989; 1991) for discussion. There is widespread disagreement over what constitutes the unmarked word order in Breton. Some claim that SVO, as in (6a), is the neutral word order: see Varin (1979), Jouin (1984), and Ternes (1992: 386). Other writers argue that if any construction at all can be said to be neutral, it is the V-fronting construction: see particularly Stephens (1982) and Press (1986).

Timm (1989; 1991) has a different view of Breton word order. Rather surprisingly, since, as we have seen, V[+finite]SO order is usually ungrammatical, Timm claims that Breton is a surface VSO language. On the basis of text counts of word order frequencies, Timm (1989) concludes that 69.2% of clauses are verb-initial. For a variety of reasons, however, this conclusion is rather problematic.12 Firstly, Timm conflates the statistics for main clauses and embedded clauses. As noted above, VSO order is unmarked in finite embedded clauses, so taking both clause types together results in a higher count for verb-initial clauses than if main clauses only are surveyed. But this approach also obscures the non-trivial fact that V[+finite]SO order is generally disallowed in main clauses. Secondly, her tally of 'verb-initial' clauses includes any examples which have a negative particle or complementizer (e.g. (6c)) in absolute initial position, whereas under any kind of V2 or Long Head Movement analysis, this element is considered to be the true initial constituent (see Stump 1989: 466, Borsley et al., 1996). The complementizer itself is typically absent in speech13 but unless the analysis counts the negation as an initial constituent, then the word order differences between affirmative and negative clauses are inexplicable: compare for example (5) and (6c). Thirdly, Timm registers as "verb-initial" all clauses which have a fronted adverbial or other non-valency constituent, on the grounds that the word order following the initial constituent is VSO. In contrast, under the more abstract syntactic analyses discussed above, the clause-second position of the verb is predicted exactly because some constituent fills the absolute sentence-initial position. Fourthly, her data includes imperatives, which swell the verb-initial count artificially; English imperatives also have initial verbs, but are not considered to reflect verb-initial typology in that language. Fifthly, Timm conflates the construction in which a VP is fronted, as in (12a), with the V-fronting construction. Since these clearly have different syntactic properties, as we saw in § 2.2, such treatment is undesirable. In fact, Timm considers both construction types to display verb-initial word order, which means that she does not distinguish between a finite verb in initial position, and a non-finite one. Many non-VSO languages have constructions with initial finite verbs, such as yes-no questions and imperatives; many languages also allow VPs to be fronted. Such constructions do not imply that a language is verb-initial in the sense of having a neutral pattern with a clause-initial finite verb in affirmative declarative clauses. Since Timm includes a variety of clause-types which are verb-initial in some other sense, we do not discover from her work just how common V[+finite]SO order is in Breton.

Two main factors seem responsible for the general lack of agreement over Breton word order in the literature. Firstly, there are various language-specific considerations: there is no single standard language, and written Breton exhibits a particularly wide range of styles and registers. There may be efforts by some modern writers to adopt a "purer" Celtic syntax (i.e. one which is verb-initial) rather than using the traditional word order; see Varin (1979) for an expansion of this thesis. This situation is of course not unique to Breton, but it does contribute to the variety of opinion over which pattern constitutes the unmarked word order. The second factor is the development of V2 syntax in main clauses. I suggest that the dispute is actually over the most likely element to precede the finite verb (or the most unmarked). As noted already, the two main views propose either SVO order or the V-fronting construction as neutral.
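The methodological point made in this section about text counts can be made concrete with a small sketch. The Python code below is purely illustrative: the eight clause records are invented for the example and do not reproduce Timm's corpus or her exact coding scheme, and the names `Clause`, `broad_share` and `strict_share` are hypothetical. It contrasts a broad criterion of the kind criticized above (any clause whose first element is a finite or non-finite verb, a fronted VP, a negative particle or a fronted adverbial, with main and embedded clauses and imperatives pooled) with a strict criterion (a finite verb in absolute initial position in a declarative main clause).

```python
from collections import namedtuple

# level: "main" or "embedded"; mood: "declarative" or "imperative";
# first: label of the element in absolute clause-initial position.
Clause = namedtuple("Clause", "level mood first")

# Initial elements that the broad criterion treats as "verb-initial".
BROAD_VERB_INITIAL = {"Vfin", "Vnonfin", "VP", "NEG", "ADV"}


def broad_share(clauses):
    """Share of all clauses counted as verb-initial under the broad criterion."""
    hits = sum(1 for c in clauses if c.first in BROAD_VERB_INITIAL)
    return hits / len(clauses)


def strict_share(clauses):
    """Share of declarative main clauses that begin with the finite verb."""
    pool = [c for c in clauses if c.level == "main" and c.mood == "declarative"]
    hits = sum(1 for c in pool if c.first == "Vfin")
    return hits / len(pool) if pool else 0.0


# Invented toy sample, for illustration only.
sample = [
    Clause("main", "declarative", "S"),
    Clause("main", "declarative", "NEG"),
    Clause("main", "declarative", "ADV"),
    Clause("main", "declarative", "Vnonfin"),
    Clause("main", "declarative", "VP"),
    Clause("main", "imperative", "Vfin"),
    Clause("embedded", "declarative", "Vfin"),
    Clause("main", "declarative", "O"),
]

print(broad_share(sample), strict_share(sample))
```

On this invented sample the two criteria diverge sharply, which is the only point of the sketch: the figure obtained depends on which clause types are pooled and on what is allowed to count as "verb-initial".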

2.4. The Welsh "abnormal" construction

As Long Head Movement in Breton illustrates, fronting does not always entail emphasis or topicalization in the Celtic languages. The Welsh construction discussed in this section has never been obligatory, but was once a highly prevalent pattern, at least in literature. In Middle and Early Modern Welsh, the "abnormal" construction was used extensively as a stylistic device, particularly in the Welsh Bible translation of 1588. One or more constituents may be fronted, but there is no degree of focus. What is "abnormal" about this construction is not the word order, but the agreement pattern: plural lexical NPs do not trigger subject-verb agreement in VSO or cleft word order, but in the abnormal construction they do, as (30) shows:14
(30)

Welsh (Matthew viii, 25) A'i ddisgyblion a ddaethant ato... and-3MSG disciple:PL PTL come:PST:3PL to:3MSG 'And his disciples came to him ...'

Although this example has the appearance of SVO word order, in fact not only subjects can front, but any argument or non-argument. Superficially, constructions such as (30) look like the cleft construction, since the initial constituent is followed by the same pre-verbal particles or complementizers (a, y) as are clefted constituents. However, even apart from the agreement pattern mentioned above, several features distinguish the abnormal construction from clefts. Firstly, only one constituent may be clefted, whilst as many as five constituents have been found in initial position in the abnormal construction. (31) illustrates:
(31)

Welsh (Brut y Brenhined, Llanstephan MS. 1 Version, 712-3)
Ac [wrth henny] [er rey ereyll] [en kyflawn o ofyn] [adav e dynas] a orugant.
and at that the one:PL other PRED full of fear leave the city PTL do:PST:3PL
'And at that, those others, full of fear, left the city.'


Secondly, the abnormal construction only occurs in root clauses, unlike topicalization or focalization in Welsh, which also occur in embedded clauses. Thirdly, the abnormal construction never has a copula verb in initial position, preceding the fronted constituent, whilst earlier Welsh clefts did. In Middle Welsh, a copula occurred optionally;15 in modern Welsh clefts this has disappeared, but a reflex of it occurs in the complementizer mai, which is etymologically a copula, and precedes the cleft constituent in embedded clauses. Along with Long Head Movement, the abnormal construction was one of the ways that Middle Welsh maintained its V2 requirement: see Koch (1991) and Willis (1996) for extensive discussion.

2.5. The analysis of non-VSO finite clauses in P-Celtic

The P-Celtic languages all exhibit, or have exhibited, a variety of non-VSO finite clause types. As we have seen, both Breton and Cornish have V-fronting, as did Middle Welsh; Middle Welsh also has the abnormal construction; and in each language clefting is a prominent part of the syntax. In Breton and Cornish, clefting is one device which ensures that finite verbs are not initial in main clauses. There is evidence, though, that not all initial XPs are derived in the same way in the three languages.

Recent generative accounts of Celtic agree that topicalizations and clefts involve wh-movement to Spec, CP: see for example Hendrick (1988; 1990), Borsley and Stephens (1989), and Stump (1989), on Breton; Harlow (1981), Sproat (1985) and Tallerman (1996) on Welsh, amongst many other references. On the other hand the Welsh abnormal construction (§ 2.4) is analysed by Tallerman (1996) as adjunction to CP, with no wh-movement in the derivation. When a subject appears in the fronted position, the syntactic subject of the clause is realized as pro: since pro, like other pronouns, triggers subject-verb agreement in Welsh, the correct agreement pattern (see note 18) is predicted. Adjunction to CP accounts for the iterative nature of the abnormal construction; the cleft construction, however, targets the landing site peculiar to wh-movement constructions: only one Spec, CP position is available. The restriction of the abnormal construction to matrix clauses is handled by the Adjunction Prohibition of Chomsky (1986: 6) and McCloskey (1992): this is a universal prohibition on adjunction to lexically-selected arguments, such as embedded clauses. Conversely, clefts are able to occur in embedded clauses, since they involve movement to Spec, CP rather than adjunction. The analysis of Breton by Borsley et al., discussed in § 2.2, also assumes that initial topicalized or clefted XPs are sited in Spec, CP.
Recall from note 11 the proposal that the V2 constraint actually requires tense to be licensed, and that one way to achieve this is to fill the specifier position of C°. Cornish appears to share this property, but there is evidence that Cornish also has a way of satisfying the V2 requirement which is not found in Breton. Under standard assumptions Cornish has clefted XPs in Spec, CP: see for example (9a) and (9b), which respectively have an NP and an adverbial phrase in that position. At first glance, Cornish fronted subjects are no different from any fronted XPs: (11) and (32) are typical examples:
(32)

Cornish (from Beunans Meriasek, cited by George 1991: 211)
Ena eff a deske dadder.
there he PTL learn:PRS goodness
'There he will learn goodness.'

The construction contains the particle a which introduces cleft clauses (see note 3), and, as is usual in Celtic subject clefts, the verb always appears in its analytic form (identical to the third person singular), as (33) through (35) show. However, there are strong indications that non-focalized preverbal subjects are in Spec, IP rather than Spec, CP in Cornish.16 The first piece of evidence that fronted subjects are not in Spec, CP is that in Cornish, but not elsewhere in Celtic, clefted XPs can appear to the left of a fronted subject; a number of examples from the medieval play Beunans Meriasek are cited by George (1991), including the following: (33)

Cornish (from Beunans Meriasek, cited by George 1991: 216)
[Cryst ha ty] me a dhefy.
Christ and thee I PTL defy:PRS
'I defy thee and Christ.'

(34)

Cornish (from Beunans Meriasek, cited by George 1991: 215)
[An forth dalleth yredy] ny a vyn.
the road begin certainly we PTL wish:PRS
'The road begin certainly we will.'

(35)

Cornish (from Beunans Meriasek, cited by George 1991: 210)
[In crist ihesu] ny a greys.
in Christ Jesus we PTL believe:PRS
'In Christ Jesus we believe.'

The fronted subject NP is preceded by the object NP in (33), by VP in (34) and by PP in (35). Assuming that the clefted XPs are in each case in Spec, CP, then the subject cannot be in that position.17 There are no grounds for analysing these examples in the same way as the abnormal construction in Welsh, where all the fronted constituents are adjoined to CP. Evidence against this treatment is that there is no verbal agreement with the subject, as (33) through (35) show, whereas full agreement with a pro subject occurs in the abnormal construction (see (30) above). This suggests that the subject is internal to IP, but is not in an appropriate position to trigger agreement.18 On the other hand, fronted subjects do not precede another fronted constituent. So although OSV order occurs when a focalized object is in Spec, CP and the subject is in Spec, IP, as in (33), it is correctly predicted by the current analysis that Cornish does not display SOV word order.19

Examining the position of the subject relative to a fronted non-finite verb (Long Head Movement) there are two further arguments that Cornish fronted subjects are not in Spec, CP. The next piece of evidence is that fronted subjects can occur to the right of the non-finite verb, which is in C°:20, 21
(36)

Cornish (from Beunans Meriasek, cited by George 1991: 217)
Omma gul me a vyn ryp chapel maria wyn thym oratry.
here do I PTL wish:PRS beside chapel Mary Blessed for:1SG oratory
'Here I wish beside the chapel of Blessed Mary for me an oratory.'

If the non-finite verb gul has moved to C°, and the subject me is to the right of it, then the subject is presumably in a specifier position within IP.22 On the other hand, it seems that a fronted subject cannot occur to the left of a verb which has undergone Long Head Movement. This is of course expected in Breton, where no topicalized element can co-occur with V-fronting: see (29) above. However, Cornish seems not to have prohibited such constructions completely, as (21), repeated here as (37), shows: (37)

Cornish (from Beunans Meriasek, cited by George 1991: 233)
avel hermyt ... speyna a ra y dethyou.
like hermit spend PTL do:PRS:3SG 3MSG day:PL
'Like a hermit ... he spends his days.'

Nonetheless, George explicitly states (1991: 217) that such examples as (38) do not occur: (38)

Cornish (George 1991: 217)
*My gweles a wra.
I see PTL do:PRS:3SG
('I do see.')


Although we cannot be sure that such examples were ungrammatical, it seems likely, given the variety of word order permutations found in all Middle Cornish texts, that such an order would be attested if it were grammatical. The ungrammaticality of (38) is predicted on the analysis proposed here: if the subject is in an IP-internal position, then it cannot be followed by the non-finite verb, which is in C°.23

It seems clear that if Breton and Cornish are V2 languages, they are certainly not V2 in the same sense that Germanic languages are. Whilst both Breton and Cornish permit an XP constituent in Spec, CP, this is only one strategy which results in V2 order, whilst in Germanic this is the major strategy:24 see also Borsley (1994). Similarly, finite verbs in Germanic V2 clauses are assumed to move to C°, whereas the C° position is occupied in Breton and Cornish by a negative element or a non-finite verb, but not a finite verb. If the analysis of Cornish fronted subjects proposed here is correct, then the V2 requirement of Cornish can be fulfilled without involving any element outside IP: the subject moves no further than the specifier of the highest functional projection in IP. In § 5.3 we consider what the highest functional projection might be in the Celtic languages.
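The predictions of the analysis of Cornish sketched in this section can be checked mechanically against the orders discussed. The Python sketch below is an illustrative simplification, not a formalization taken from the literature: the slot numbering, the labels and the function name `predicted_grammatical` are assumptions made for this example, and every preverbal subject is treated as non-focalized, i.e. as sitting in Spec, IP. The model assumes a fixed left-to-right template of Spec, CP (one clefted XP), then C° (a non-finite verb or a negative element), then Spec, IP (the subject), followed by the particle and the finite verb, with at least one preverbal slot filled.

```python
# Hypothetical slot assignment for elements preceding the particle + finite verb.
SLOT = {
    "O": 0, "PP": 0, "VP": 0, "ADV": 0,  # clefted XP in Spec, CP
    "Vnonfin": 1, "NEG": 1,              # head in C
    "S": 2,                              # non-focalized subject in Spec, IP
}


def predicted_grammatical(preverbal):
    """preverbal: labels of the elements preceding the particle and finite verb."""
    if not preverbal:
        # The finite verb would be clause-initial.
        return False
    slots = [SLOT[label] for label in preverbal]
    # Each slot is used at most once and the slots appear in template order.
    return all(a < b for a, b in zip(slots, slots[1:]))


examples = {
    "(33) O S": ["O", "S"],
    "(36) ADV Vnonfin S": ["ADV", "Vnonfin", "S"],
    "(38) S Vnonfin": ["S", "Vnonfin"],
    "SOV": ["S", "O"],
}
for name, preverbal in examples.items():
    print(name, predicted_grammatical(preverbal))
```

Under these assumptions the template admits the attested OSV order of (33) and the order of (36), and excludes both SOV and the unattested order of (38); the requirement of strictly increasing slots also reflects the claim that only one Spec, CP position is available.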

3. Celtic copular clauses

As mentioned in § 2.4, the Welsh cleft construction is a development from a copular construction: the copular verb was grammaticalized as a complementizer, and then lost, so that the resulting construction is no longer verb initial. Here, then, we have a familiar case of word order change as a result of the grammaticalization of lexical elements. In Modern Welsh, the cleft construction is used extensively, but is optional. However, cleft-like fronting of a constituent is obligatory under certain conditions in copular clauses, and an initial copula disallowed. In copular clauses with an indefinite predicate NP, copula-initial word order is optional, with a predicate marker yn obligatorily accompanying the predicate phrase:
(39)

Welsh Mae Mr. Jones yn brifathro newydd. be:PRS:3SG Mr. Jones PRED headteacher new 'Mr. Jones is a new headteacher.'

Alternatively, the indefinite predicate nominal can be fronted for focus, as in (40 a), with no predicate marker. But the subject cannot be initial in this construction, as (40 b) shows:

(40)

Maggie Tallerman

Welsh a. Prifathro newydd yw Mr. Jones. headteacher new be:PRS Mr. Jones 'Mr. Jones is a new headteacher.' b. *Mr. Jones yw prifathro newydd. Mr. Jones be:PRS headteacher new ('Mr. Jones is a new headteacher.')

Focus on the subject requires the construction in (41), with an obligatory predicate marker yn,25 and, in the present tense, a special form of the copula, as (41 a) shows: (41)

Welsh a. Mr. Jones sy'n brifathro newydd. Mr. Jones be:PRS-PRED headteacher new 'Mr. Jones is a new headteacher.' b. Mr. Jones oedd yn brifathro newydd. Mr. Jones be:PST PRED headteacher new 'Mr. Jones was a new headteacher.'

On the other hand, if the predicate nominal is definite, the copula-initial order is ungrammatical: (42)

Welsh *Mae Mr. Jones (yn) y brifathro newydd. be:PRS:3SG Mr. Jones PRED the headteacher new ('Mr. Jones is the new headteacher.')

NP V NP is the only order possible; see Rouveret (1991; 1996), and Morris Jones (1993) for an analysis of these constructions:

(43)

Welsh a. Y prifathro newydd yw Mr. Jones. the headteacher new be:PRS Mr. Jones 'The new headteacher is Mr. Jones.' b. Mr. Jones yw'r prifathro newydd. Mr. Jones be:PRS-the headteacher new 'Mr. Jones is the new headteacher.'

Celtic word order

Either the predicate (43 a) or the subject can be initial in this case, again giving focus on the fronted constituent. The same word orders are found with a sentential predicate: the copula must not be initial, but either the predicate or the subject is fronted: (44)

Welsh a. *Mae fy nod (yn) ddadlennu manylion y sefyllfa. be:PRS my objective PRED reveal detail:PL the situation ('My objective is to reveal the details of the situation.') b. Dadlennu manylion y sefyllfa yw fy nod. reveal detail:PL the situation be:PRS my objective 'My objective is to reveal the details of the situation.' c. Fy nod yw dadlennu manylion y sefyllfa. my objective be:PRS reveal detail:PL the situation 'My objective is to reveal the details of the situation.'

Constructions such as these, which do not have an initial copula, are analyzed by Rouveret (1996) as V2 constructions: the fronted constituent is in Spec, CP, and the second element, the copula, moves to C°; see also Tallerman (1996). The constructions are in effect clefts, and the fact that they do not display the typical clefting particles, a, y, is predicted if both these and the copula are sited in C°. A similar analysis of the Irish copula is proposed by Carnie and Harley (1993), Carnie (1994). Traditional grammars distinguish a copula verb is from a substantive verb tá, bí (cf. the Spanish ser and estar distinction). Carnie and Harley (1993: 3) point out that both forms are "copular in nature"; they argue, though, that whereas tá, bí are verbs, is is actually a complementizer. (45) through (47) illustrate its usage: (45)

Irish (Stenson 1981: 104) Is e an muinteoir an sagart. COP him the teacher the priest 'The teacher is the priest/The priest is the teacher.'

(46)

Irish (Ó Siadhail 1989: 228) (Is) ise an muinteoir. COP she the teacher 'She is the teacher.'

(47)

Irish (Ó Dochartaigh 1992: 40-1) (Is) tiománaí maith i. COP driver good her 'She's a good driver.'

Carnie and Harley show is to be in complementary distribution with the preverbal complementizer particles; in fact in many cases it is identical in form with these particles. Unlike verbs, is is morphologically defective: it displays only a past vs. nonpast distinction, and has no agreement morphology. Like complementizers in Irish, but unlike verbs, is can delete in the spoken language, as shown in (46) and (47).26 Compare (48), where the sentence is ungrammatical without the verb tá: (48)

Irish *(Tá) sé tinn. be:PRS he sick 'He's sick.'

Is also deletes optionally in the cleft construction, as Ó Siadhail (1989) shows: (49)

Irish (Ó Siadhail 1989: 236) (Is) ag peinteail cathaoir a bhi an fear. COP PROG paint chair PTL be:PST the man 'The man was painting a chair.'

The data presented in this section show how grammaticalization followed by the loss of certain functional category items can lead to a change in word order. In particular, copular elements which once moved productively to C° have been grammaticalized as complementizers in both branches of the Celtic family, often leading to the loss of verb-initial clauses in some constructions.

4. Irish pronominal objects

I turn now from the initial part of the clause to the word order at the end of the finite clause. In general (pace the situation in Breton and Cornish) Celtic exhibits VSOX word order with an inflected main verb. In Irish, however, pronominal object NPs give rise to an interesting exception to this generalization. Full lexical objects occur in the expected position, following the verb and preceding an indirect object and any non-valency constituents. Pronominal objects, on the other hand, exhibit a strong preference for clause-final position, even following any non-valency constituents, so that the word order becomes VSXO. Compare (50 a) and (b): (50)

Irish (Ó Siadhail 1989: 207-8) a. Bhris se an chathaoir leis an ord areir. break:PST he the chair with the sledgehammer last.night 'He broke the chair with the sledgehammer last night.' b. Bhris se areir leis an ord é. break:PST he last.night with the sledgehammer it 'He broke it last night with the sledgehammer.'

The (b) example is somewhat surprising since it has the lightest constituent of all in clause-final position, rather than a heavy constituent; it thus counterexemplifies Hawkins' (1990) principle of Early Immediate Constituents, according to which the preferred word order is one that enables all daughter constituents of a mother node to be recognized as early as possible. Such data also run counter to a well-known general tendency for more topical material, especially "given" elements such as pronouns, to occur early on in the clause (see Mallinson and Blake 1981: ch. 3). Where pronouns appear in a position different than full NPs, they are usually expected to occur earlier in the clause rather than later. Furthermore, nothing in the general typological character of Celtic leads us to expect this Irish phenomenon: as we have seen (Tallerman, this volume) the regular word order of Celtic is VO. Chung and McCloskey (1987) suggest that the pronominal is right-adjoined to a VP constituent which contains it. The fact that the pronominal can "stop off" after some intermediate constituent rather than moving to the end of the clause is seen as evidence that the VP is layered. Another possibility is presumably that it has a flat structure. (51) illustrates (with a hash sign) the permitted positions of a pronominal object which co-occurs with a string of adverbials: (51)

Irish (Ó Siadhail 1989: 209) Fágadh e ina loighe # ar an talamh # leave:IPS it in:3MSG lying on the ground taobh thair den sciobol # areir #. behind of:the barn last.night 'It was left lying on the ground behind the barn last night.'

Chung and McCloskey (1987: 195) note, though, that the unmarked position for a pronominal object is clause-final, even following a string of adverbials. They also show that not only pronominals which are syntactic direct objects undergo postposing:

(52)

Irish (Chung and McCloskey 1987: 196) Tharlaigh [ amuigh ar mo thriall me]. happen:PST out on 1SG travel me 'I happened to be out wandering.'

(53)

Irish (Chung and McCloskey 1987: 196) Ba mhinic [ faoi ionsai e]. be:PST often under attack him 'He was often under attack.'

In Chung and McCloskey's analysis, the postposed pronouns indicated in (52) and (53) are the subject of a "small clause".27 Duffield (1995) shows that postposing also applies to inflected prepositions, as in (54): (54)

Irish (Duffield 1995) Bhi an sagart inne aice. be:PST the priest yesterday at:3FSG 'The priest was with her yesterday.'

Chung and McCloskey's adjunction analysis is criticized by Duffield on the grounds that it cannot explain the set of restrictions on the postposing construction. Firstly, postposing only applies to weak pronouns, and never to contrastive or emphatic pronouns, or full NPs. Secondly, the subjects of lexically selected small clauses can postpose, but not the subjects of unselected small clauses.28 Thirdly, there are constraints on the type of adverbials which can follow the postposed pronominal: although the pronominal can "stop off" before absolute clause-final position, as (51) illustrated, only certain orders of adverbial plus pronominal are grammatical. All these issues are addressed in Duffield's alternative analysis, which involves leftward movement of the entire clause rather than rightward movement of the pronominal. Duffield analyzes the weak Irish pronouns as heads; in line with generative analyses of clitic pronouns in other European languages, the weak pronouns move not rightwards but leftwards, to a position not available for full NPs. This position is a functional head below the complementizer but above the highest verbal projection, referred to by Duffield as the Wackernagel head, since it is in second position in a full clause. The clause itself then raises to the Specifier of the Wackernagel position, so that the pronoun appears to the right of the remainder of the clause.29 Duffield's account also addresses the fact that pronoun postposing is restricted to finite clauses, and is absolutely ungrammatical in infinitival clauses; since finite clauses raise to check the + finite/Tense feature, there will be no raising to the Specifier of a Wackernagel head in infinitival clauses.

The pronoun postposing construction is confined to the Q-Celtic languages; it is prevalent in Irish, where it is the strongly preferred construction, but seems to exist as a possibility in the other Q-Celtic languages. Normally, pronominal objects in Scottish Gaelic precede the indirect object and any adverbials: (55)

Scottish Gaelic (Mackinnon 1971: 177) Ithidh mi e anns a' mhadainn. eat:FUT I it in the morning 'I will eat it in the morning.'

Postposing apparently occurs in (56), however: (56)

Scottish Gaelic (Ramchand 1993: 27) Chunnaic Calum air ball mi. see:PST Calum immediately me 'Calum saw me immediately.'

As we have seen in the previous three sections, the Celtic languages exhibit a variety of clause types which are not VSOX. § 2 discussed different fronting constructions in P-Celtic, and § 3 broadened the discussion to include other non-verb-initial clauses on both sides of the Celtic family. § 4 outlined a construction which is peculiar to Irish, the pronoun postposing phenomenon. Having examined some of the most noteworthy word order variations in Celtic, I turn now to the analysis of ordinary Celtic finite and infinitival clauses in generative grammar.

5. The analysis of Celtic word order in generative grammar

Section 5.1 gives a brief overview of the treatment of Celtic in generative grammar in the last thirty years. Section 5.2 details a specific typology proposed by Ouhalla (1991), who claims that VSO languages form a different "type" than SVO languages. The Celtic languages, however, are considered not to constitute "true" VSO languages, but instead to have more in common with SVO languages. These proposals are critically examined. In Section 5.3 I outline some alternative proposals for Celtic clause structure.

5.1. Overview of the major proposals in the generative literature

In the late 1970s the question of underlying word order in the Celtic languages became an issue in generative grammar, and since then, certain points of consensus have emerged. Firstly, it is clear that a "flat" VSO structure is problematic, and there is general agreement in the literature that such a structure is incorrect. The subjects of SVO/SOV languages display a privileged syntactic status, accounted for via a configuration in which the subject asymmetrically c-commands the object. The fact that Celtic exhibits subject/object asymmetries suggests that hierarchical syntactic structure is also a feature of VSO languages: see for example Anderson and Chung (1977), McCloskey (1983), Sproat (1985), Hendrick (1988; 1990), and Woolford (1991). Evidence for such a structure came in part from the periphrastic Celtic word order pattern, which has an inflected initial auxiliary, followed by the subject and a VP constituent: see (3) and (4). In early transformational work on Celtic this construction was argued to be evidence for an underlying SVO word order: see Jones and Thomas (1977), Emonds (1979, 1980, 1985), Harlow (1981), Koopman (1984), Chung and McCloskey (1987), McCloskey (1991) and Koopman and Sportiche (1991). The surface order is derived by verb fronting, either of an auxiliary or the lexical verb. Although the details of the analysis have changed, the basic proposal has become the standard treatment of Celtic. Accounts of Celtic differ with respect to the landing site of the fronted verb. Many non-verb-initial languages also display V-fronting phenomena, such as subject-auxiliary inversion in English; the general V2 construction of Germanic languages such as German or Dutch; the verb-initial construction of Spanish, as well as the more general I°-to-C° constructions of Romance. Should we assume that a fronted verb always moves to the same position?
In the Germanic languages V-fronting is typically analyzed as movement to C° (see den Besten 1983), a proposal which has also been extended to Celtic; see for example Stowell (1989: 320).30 Given a clause structure such as (57), in which the subject is the specifier of IP, then V-movement to I°, followed by I°-to-C° movement, is the only way to derive verb-initial order:

Here, the subject is in the specifier position of IP, and within the sentence there are no higher functional heads than I°. Several authors have, however, noted conceptual and empirical problems connected with I°-to-C° movement in Celtic: see for example Koopman and Sportiche (1991: 219 f), McCloskey (1996 a) and Bobaljik and Carnie (1996). In V2 languages of the "asymmetrical" Germanic type,31 the restriction of V-fronting to main clauses is predictable under the I°-to-C° account, since in embedded contexts the verb's potential landing site is filled by an overt complementizer. This results in subordinate clauses with C°-SVO or C°-SOV word order. But finite clauses in the Celtic languages have V-initial word order whether root or embedded: see Tallerman (this volume). As McCloskey (1996 a) points out, an I°-to-C° head-movement account of Celtic cannot readily explain why V-fronting in these languages is not similarly restricted to root clauses. A second distinction between the two language types is that overt complementizers in Celtic can co-occur with a fronted finite verb, as (58) shows for Irish: (58)

Irish (Bobaljik & Carnie 1996: 227) Ceapaim [go bhfaca se an madra]. think:PRS:1SG COMP see:PST he the dog 'I think that he saw the dog.'

If the I°-to-C° analysis results in the verb literally occupying the C° position, there is no immediately obvious account of such data as (58). However, an alternative view which is compatible with (58) is that the verb adjoins to the C° position, rather than occupying it; C° could then contain an overt complementizer. There is also language-specific evidence against an I°-to-C° analysis. Duffield (1991 a) argues on the basis of mutation facts that the verb does not raise as high as C° in Irish. The Long Head Movement analysis of V-fronting in Breton and Cornish (see § 2.2 above) requires that finite verbs do not move to C°, since that head position is filled by V-fronting of non-finite verbs. There are, then, several arguments against a general I°-to-C° account of finite verb movement in Celtic. Clearly, if the subject is in the canonical position in the specifier of IP, then given an underlying SVO analysis the verb must move to a position higher than IP in VSO languages. However, recent work on the position of subjects allows for another possibility: the VP Internal Subject Hypothesis (see Manzini (1983), Fukui & Speas (1986), Kitagawa (1986), Kuroda (1988), and Koopman & Sportiche (1991), amongst other authors) proposes that subjects are universally base-generated within the VP, perhaps as the specifier of VP. Under this analysis, the subjects of VSO languages might remain (somewhere) within the V-projection, whilst the subjects of SVO languages raise. VSO languages would thus have the following (simplified) clause structure:

It then remains to be explained under what circumstances a subject either raises or stays in situ. A further possibility stems from the more articulated clause structure proposed by Pollock (1989), Belletti (1990) and Chomsky (1991), in which the unitary I-projection is replaced by separate functional heads T(ense) and Agr(eement). Each head forms its own projection, outside the VP. Under this analysis, the subject of a VSO clause might raise from VP, but rather than moving to the highest specifier position in the clause it would raise to the specifier of whichever is the lower functional head. The verb moves stepwise to the higher functional head. A "split-Infl" account has been proposed for Welsh by Rouveret (1991), and Ouhalla (1991); for Welsh and Breton by Hendrick (1991); and for Irish by Dooley Collberg (1991) and Duffield (1991 b). There is, however, controversy over what should be the higher functional head position in the various Celtic languages, a point we now turn to in § 5.2.

5.2. A proposal for a VSO versus SVO typology: Ouhalla (1991)

As noted earlier, a standard proposal in generative grammar is that Celtic shares an underlying (SVO) structure with surface SVO languages, and that the differences in superficial word order relate to the movements of the subject NP, and perhaps the verb. This means there is no reason to suppose that a specific VSO "type" exists. Interestingly, this view converges with that of recent syntactic typology: see Dryer (1991). Dryer shows that, overwhelmingly, V-initial languages pattern with SVO languages in terms of ordering of pairs of elements, whilst V-final languages contrast with both. Dryer in fact concludes that work by Lehmann (1973; 1978) and Vennemann (1974; 1976) proposing a broad VO versus OV typology is correct. A somewhat different view of VSO and SVO languages is taken by Ouhalla (1991): under these proposals, these language types do not share an underlying structure and are held to be typologically distinct. In this section I present a critical evaluation of Ouhalla's word order typology.

5.2.1. Introduction

Ouhalla proposes that three typological properties distinguish VSO from SVO languages:

(60)

VSO languages
a. have Agr inside Tense
b. have SVO as an alternative order
c. lack non-inflected infinitives

(61)

SVO languages
a. have Agr outside Tense
b. tend not to have VSO as an alternative order
c. have non-inflected infinitives
(Ouhalla 1991: 110)

The first property, (60/61 a), is crucial, since Ouhalla claims that the order of the tense and agreement morphemes predicts the value of the remaining properties, and establishes a VSO versus SVO typology. Celtic languages, according to Ouhalla, do not exhibit the typological properties of VSO languages. Except perhaps for Breton and Cornish (which Ouhalla doesn't discuss) the Celtic languages do not have unmarked SVO word order in finite clauses, (60b), and Celtic verbal infinitives are not inflected, (60c).32 Ouhalla claims that these facts follow from (61 a), as Celtic has Agr outside Tense: he therefore analyzes the Celtic languages as typologically SVO, as against "true" VSO languages, for example Berber, Arabic, and Austronesian languages such as Chamorro. Ouhalla's property (60 a) states that true VSO languages have Agr inside Tense, whereas SVO languages have Tense inside Agr. The underlying structures proposed for each language type are shown in (62) and (63) (from Ouhalla 1991: 113):

(62) VSO languages: [tree diagram; TenseP is the highest projection]

(63) SVO languages: [tree diagram; AgrP is the highest projection]

In Ouhalla's account, the canonical subject position is Spec, AgrP, although not all subjects move to that position: if the subject can receive Case in its underlying position, it remains there rather than raising to receive Case. In structure (62), the verb raises from Agr to Tense, and whether the subject is in Spec, AgrP or Spec, VP it will always be to the right of the verb, resulting in VSO order. In SVO languages, with the structure in (63), the subject typically moves to the canonical Spec, AgrP position, and so ends up to the left of the verb which has raised through Tense to Agr. However, suppose the subject can receive Case in its underlying position: then the verb would move to the left of the subject. This, according to Ouhalla, is the derivation of the Celtic languages; their VSO word order results from the fact that unlike, say, English, the subject does not raise to the canonical Spec, AgrP position. "True" VSO languages (not Celtic) allow an alternative SVO order, (60 b), but crucially, it has a different derivation from the SVO order of languages like English. Ouhalla analyzes an initial subject in, say, Arabic as a topic rather than a canonical subject; it is base-generated in [Spec, TP], and co-indexed with the true subject, pro in [Spec, AgrP]. The relationship is one of topic and resumptive pronoun, which is subject to an m-command condition, and this is fulfilled by the structure in (62). Therefore SVO order occurs freely as the alternative order in VSO languages. SVO languages, however, do not typically display an SVO/VSO alternation. In Ouhalla's terms, the reason is that given the structure in (63), a Topic in [Spec, TP] would fail to m-command a resumptive pro subject in [Spec, AgrP]. Consequently, Ouhalla concludes, we do not find postverbal subject topics in SVO languages, and so VSO is predicted not to occur as the alternative word order. Having outlined the basic principles of Ouhalla's analysis, in the following sections I will examine each of the properties in (60)/(61), both in relation to Celtic, and from a more general standpoint.
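The derivations just described can be made concrete with a small sketch. It is only a toy linearization, not Ouhalla's formalism: it assumes that every specifier precedes its head and every head precedes its complement, that the verb ends up in the highest functional head, and that the object stays inside VP. The function and variable names are hypothetical.

```python
# Toy linearization of structures (62) and (63); an illustration, not Ouhalla's formalism.
# Assumption: every specifier precedes its head, and every head precedes its complement.

def slots(head_order):
    """Left-to-right positions for a stack of projections, highest first."""
    result = []
    for head in head_order:
        result.append(f"Spec,{head}P")  # specifier of the projection
        result.append(head)             # the head itself
    result.append("Obj")                # the object stays inside VP, after V
    return result


def surface_order(head_order, subject_site):
    """Put the verb in the highest head and the subject in subject_site."""
    verb_site = head_order[0]           # the verb raises stepwise to the top head
    filled = {verb_site: "V", subject_site: "S", "Obj": "O"}
    return "".join(filled[s] for s in slots(head_order) if s in filled)


VSO_STRUCTURE = ["Tense", "Agr", "V"]   # (62): Agr inside Tense
SVO_STRUCTURE = ["Agr", "Tense", "V"]   # (63): Tense inside Agr

for label, heads in [("(62)", VSO_STRUCTURE), ("(63)", SVO_STRUCTURE)]:
    for site in ["Spec,AgrP", "Spec,VP"]:
        print(label, site, surface_order(heads, site))
```

Under these assumptions structure (62) yields VSO wherever the subject sits, whereas structure (63) yields SVO when the subject raises to Spec, AgrP and VSO when it stays in Spec, VP, which is the derivation Ouhalla proposes for Celtic.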

5.2.2. The order of tense and subject agreement morphemes

By the Mirror Principle (Baker 1985) tree structures should reflect the order in which morphological affixes are adjoined to a stem. Providing the theory permits parametric variation in the hierarchical order of functional heads,33 any language which has the subject agreement morpheme closer to the verbal stem than the tense morpheme will have the structure in (62), whilst languages in which Tense is closer to the verbal stem than Agr have the structure in (63). Ouhalla proposes that the Celtic languages pattern like SVO languages with respect to the ordering of functional heads34 because, he claims, the agreement morpheme is attached to the verb outside the tense morpheme. Specifically, Ouhalla cites Welsh, although he provides no arguments for his proposed morpheme order. However, since Ouhalla's analysis of Welsh hinges on this claim, it should be examined in more detail. Rouveret (1991) has also proposed that in Welsh "the agreement affix is clearly external to the tense affix" (1991: 374). He cites as evidence just one member of one paradigm for two different tenses, analyzed as shown in (64):

(64)

a. can-a-f
   sing-T-Agr
   sing-Pres-1s
   'I sing/I will sing'
b. darllen-as-ant
   read-T-Agr
   read-Past-3p
   'They read (past)'
(after Rouveret 1991)

Rouveret himself argues (1991: 374) that functional structure should reflect morpheme order if and only if the morphology is concatenative; see also Baker's discussion of the same point (1985: fn. 5; and 401-2). But if there really are identifiable tense and agreement morphemes in Welsh finite verbs, as Rouveret claims, then each of these morphemes should recur throughout the paradigm for that particular tense. However, this is not the case, as we see from the full paradigms of the verbs in (64). Both verbs are regular. Figure 1 illustrates the particularly fusional morphology of the Literary Welsh present tense: there is no recurring tense morpheme -a-, contrary to Rouveret's proposal in (64):

canu 'to sing' (pres)
1s  can-af     1p  can-wn
2s  cen-i      2p  cen-wch
3s  can        3p  can-ant

Figure 1.

Figure 2 illustrates the full past tense of darllen 'read', and also, for comparison, the pluperfect:

darllen 'to read' (past)
1s  darllen-ais     1p  darllen-asom
2s  darllen-aist    2p  darllen-asoch
3s  darllen-odd     3p  darllen-asant

darllen 'to read' (pluperfect)
1s  darllen-aswn    1p  darllen-asem
2s  darllen-asit    2p  darllen-asech
3s  darllen-asai    3p  darllen-asent

Figure 2.

The paradigms in Figure 2 illustrate the most agglutinative verb forms in Welsh, yet as they show, there are no uniquely identifiable tense and agreement morphemes. What Rouveret considers to be a past tense morpheme -as- only occurs in the plural forms of the past tense, but crucially, it is also found throughout the pluperfect paradigm. So the -as- inflection clearly cannot be equated with tense, since it appears in two different tenses. As Figure 2 shows, what actually differentiates the plural forms of the past tense from the pluperfect is a vowel change in Rouveret's "agreement" morpheme. These facts are telling: Welsh displays a classic case of fusional35 morphology in all its verbal paradigms: there are no separate tense and agreement affixes in the surface morphology. Of course, we cannot expect to find a strict cut-off point separating fusional and concatenative morphology, either between languages or within a language: instead what we find is a continuum, with absolutely agglutinative morphology at one end, and merger of morphemes at the other. This parameter is known in Greenbergian typology as the index of fusion: see for example Comrie (1981: 43). We must, however, presume that there is a point along this continuum at which morphological affixes are not discrete enough to provide any evidence. I contend that Welsh is at that point. To make matters entirely clear, note that I am not proposing a surface morpheme order of Stem-Agr-Tense for Welsh: my claim is that since the verbal morphology of Welsh is fusional rather than concatenative, it is tangential to any discussion of the order of functional heads. Much the same is true of the other Celtic languages, which have similarly fusional (and suppletive) morphology.
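The recurrence test applied above can be stated mechanically. The following sketch takes the endings exactly as segmented in Figures 1 and 2 and lists the person/number cells in which a candidate tense marker appears at the start of the ending. The helper name cells_with is hypothetical, and treating "begins with the candidate string" as the test for the presence of a morpheme is a simplifying assumption.

```python
# Endings as segmented in Figures 1 and 2 (stem-ending division taken from the text).
PRESENT_CANU = {"1s": "af", "2s": "i", "3s": "", "1p": "wn", "2p": "wch", "3p": "ant"}
PAST_DARLLEN = {"1s": "ais", "2s": "aist", "3s": "odd",
                "1p": "asom", "2p": "asoch", "3p": "asant"}
PLUPERFECT_DARLLEN = {"1s": "aswn", "2s": "asit", "3s": "asai",
                      "1p": "asem", "2p": "asech", "3p": "asent"}


def cells_with(paradigm, candidate):
    """Person/number cells whose ending begins with the candidate tense marker."""
    return sorted(cell for cell, ending in paradigm.items()
                  if ending.startswith(candidate))


# A genuine tense morpheme should appear in all six cells of its own tense
# and in no cell of any other tense.
print("present -a-:", cells_with(PRESENT_CANU, "a"))
print("past -as-:", cells_with(PAST_DARLLEN, "as"))
print("pluperfect -as-:", cells_with(PLUPERFECT_DARLLEN, "as"))
```

On this data the candidate -a- is found in only two present tense cells, and -as- in only the three plural cells of the past tense but in all six cells of the pluperfect, which is the distribution the argument relies on.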
As we will see, though, (§ 5.2.4, 5.3) evidence concerning the order of functional heads can be obtained from (P-Celtic) infinitival clauses, and this evidence in fact supports a clause structure which has Agr outside Tense, as Ouhalla and Rouveret propose. Ouhalla's (1991) hypothesis that morpheme order predicts word order is of course not disconfirmed by the Welsh data. That hypothesis can only be tested by seeing whether the majority of (nonfusional) surface VSO languages do have the tense morpheme outside the agreement morpheme. Such an investigation into affix order has been undertaken by Siewierska (1993), who surveyed a representative sample of 308 languages.36 Siewierska considers two language types: those where the Agr and Tense morphemes are both affixed to the verb, and those where Agr and/or Tense are not bound morphemes. Combining the results from both groups, she finds that 57% of the languages with Tense outside Agr are V-initial.37 Since the world's total percentage of V-initial languages is not more than 15-20%, Siewierska's results seem to support Ouhalla's hypothesis that morpheme order predicts word order: there is a pronounced skewing toward V-initial word order in languages with Tense outside Agr. On the other hand, and contrary to Ouhalla's predictions, Siewierska finds that V-initial languages are split between Tense outside Agr order and Agr outside Tense order rather evenly: only 54% of V1 languages have Tense outside Agr. This latter result suggests that VSO languages are simply not all of the same type: rather than postulate a two-way distinction between "true" VSO languages and others, however, as Ouhalla proposes, it seems likely that parametric variations in fact trigger many fine distinctions between VSO languages, even within one language family: see McCloskey (1996 b) on the same point.
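The size of the skewing can be made explicit with a short calculation. The sketch below uses only the percentages quoted above; the constant and function names are hypothetical. Since the 15-20% baseline is given as an upper estimate, the ratios should be read as rough lower bounds rather than as results reported by Siewierska.

```python
# Percentages quoted in the text (Siewierska 1993); nothing else is assumed.
V_INITIAL_AMONG_TENSE_OUTSIDE_AGR = 57  # % of Tense-outside-Agr languages that are V-initial
BASELINE_V_INITIAL = (15, 20)           # % of the world's languages that are V-initial
TENSE_OUTSIDE_AGR_AMONG_V_INITIAL = 54  # % of V-initial languages with Tense outside Agr


def overrepresentation(observed, baseline):
    """How many times more frequent the observed share is than the baseline share."""
    return observed / baseline


ratios = [overrepresentation(V_INITIAL_AMONG_TENSE_OUTSIDE_AGR, b)
          for b in BASELINE_V_INITIAL]
agr_outside_tense_among_v_initial = 100 - TENSE_OUTSIDE_AGR_AMONG_V_INITIAL

print([f"{r:.2f}" for r in ratios])       # ratio against the 15% and the 20% baseline
print(agr_outside_tense_among_v_initial)  # % of V-initial languages with Agr outside Tense
```

On these figures V-initial languages are roughly three to four times as frequent among Tense-outside-Agr languages as in the world's languages overall, while the reverse conditional is close to an even split (54% against 46%).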

5.2.3. Alternative constituent orders

One of the major typological parameters separating VSO from SVO languages, according to Ouhalla (1991), is the availability of the alternative word order; see also Bakker, this volume, and Siewierska, this volume. Ouhalla's hypothesis borrows from traditional typology: he extends Greenberg's Universal 6 to (65) by adding the second clause, which refers to SVO languages: (65)

All languages with dominant VSO order have SVO as an alternative or as the only alternative, whilst not all languages with dominant SVO order have VSO as an alternative. Ouhalla (1991: 108)

Of course, the validity of (65), like all other proposed universals, could only be confirmed via a large scale survey of a representative language sample. It is clear from the work of Dryer (1991) that Greenberg's original language sample was not representative. This means that a more careful survey might well disconfirm Universal 6/(65). In fact, it seems clear that there are indeed VSO languages other than Celtic which do not have alternative SVO order, thus disconfirming (65). Longacre (1990: 65) states that in the Nilo-Saharan language Toposa there is no SVO order, and Siewierska (p. c.) points out that there are several VSO languages which only have SVO if there is a special particle between the subject and the rest of the clause (as is also the case in Celtic): languages like this include Ixil, Maori, Tsimshian and Yapese. Furthermore, there is no SVO order with or without a particle in either of the VSO languages Amuesha and Chocho, although Amuesha does allow SV order in intransitive clauses. In any case, the status of (65) is problematic. Suppose firstly we assume that (65) refers only to the word order of finite clauses. An obvious question is whether "dominant" order means that of root clauses or embedded clauses, widely regarded as a more reliable indicator of word order type. Breton and Cornish have VSO finite embedded clauses; since both languages also have SVO as an alternative order in main clauses, they presumably conform to (65). Alternatively, if (65) is interpreted more liberally, then not just finite clauses but any SVO clause type could be counted. In that case, all the Celtic languages conform to (65), since all display SVO order in infinitival clauses.38 Ouhalla's view is that the Celtic languages do not have an alternative SVO word order, unlike VSO languages such as Arabic and Berber, and therefore cannot be considered "true" VSO languages. However, like all Greenbergian universals, (65) refers to observable surface word order. It does not, and indeed cannot, make reference to the derivation of VSO order in a theoretical model. If it did, it would be circular: only those languages which derive VSO order in a particular fashion would conform to the universal. Celtic would then be excluded, as its VSO order does not have the required derivation, and consequently, the failure of Celtic to exhibit alternative SVO order would be accounted for. But it would not be independently predicted, and thus the generalization would be invalid. Since (65) can, then, only refer to surface word order, the Celtic languages can hardly be excluded from its domain, as they do have "dominant (surface) VSO order". On this view, the Celtic languages might well constitute counterexamples to (65), thus disconfirming it. The alternative position is that (65) makes reference not to surface word order but to Ouhalla's rather abstract VSO/SVO typology; in that case the first clause can exclude Celtic, and apply only to a subset of surface VSO languages. But then (65) is unfalsifiable.
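The surface reading of the first clause of (65) can be checked against the languages named in this section. The sketch below is only an illustration: the three-way coding ("free", "particle", "none") is an assumption introduced here, Arabic and Berber are coded as Ouhalla describes them, and the class and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Language:
    name: str
    svo_alternative: str  # "free", "particle" (SVO only with a particle), or "none"


# Dominant-VSO languages as characterized in the text above.
VSO_SAMPLE = [
    Language("Arabic", "free"),
    Language("Berber", "free"),
    Language("Ixil", "particle"),
    Language("Maori", "particle"),
    Language("Tsimshian", "particle"),
    Language("Yapese", "particle"),
    Language("Toposa", "none"),
    Language("Amuesha", "none"),
    Language("Chocho", "none"),
]


def counterexamples(sample, particle_counts):
    """Languages violating 'every dominant-VSO language has SVO as an alternative'."""
    accepted = {"free", "particle"} if particle_counts else {"free"}
    return [lang.name for lang in sample if lang.svo_alternative not in accepted]


print(counterexamples(VSO_SAMPLE, particle_counts=True))
print(counterexamples(VSO_SAMPLE, particle_counts=False))
```

Even on the lenient reading, where SVO with a particle counts as an alternative order, Toposa, Amuesha and Chocho remain as counterexamples; on the strict reading the particle languages join them.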

5.2.4. Inflected infinitives and VSO word order

The third typological distinction between VSO and SVO languages proposed by Ouhalla (1991) is that infinitival clauses in VSO languages always display an Agr element, as (66) illustrates for Berber:

(66) Berber (Ouhalla 1991: 108)
y-arzu uxwwan [ad-y-awer].
3MSG-try:PST thief INF-3MSG-escape
'The thief tried to escape.'

SVO languages, conversely, tend to have uninflected infinitives. Ouhalla (1991: 122) proposes that inflected infinitivals occur in VSO languages because Tense subcategorizes Agr in the structure in (62), and so Agr obligatorily projects in finite and non-finite clauses. In SVO languages Agr is the higher functional head and is not subcategorized for, and so may fail to project. When it does project, we get SVO languages which do display inflected infinitivals. In Ouhalla's view, the Celtic languages do not meet with (60 c): illustrating from Welsh, he suggests that Celtic has uninflected infinitives:

(67) Welsh (Ouhalla 1991: 109)
Disgwyliodd Siôn [i Gwyn weld Mair].
expect:PST:3SG John for Gwyn see Mair
'John expected Gwyn to see Mair.'

Although the embedded clause in (67) displays no overt Agr, compare the situation if the subject of the infinitival clause is pronominal:

(68) Welsh
Disgwyliodd Siôn [iddo fo weld Mair].
expect:PST:3SG John for:3MSG him see Mair
'John expected him to see Mair.'

These data, not considered by Ouhalla, are telling: the infinitive itself is not inflected, but the clause certainly does display an obligatory Agr element, namely the inflection found on the element i (etymologically a preposition meaning 'to, for'). The fact that Agr does not appear overtly in (67) is due to the well-known Celtic complementarity between overt arguments and "rich" agreement.39 So (67) could be said to display "covert" or null Agr, as proposed by Hendrick (1988) for finite verbs. Such an analysis extends easily to Breton infinitival clauses, where the element da 'to, for' also inflects obligatorily when the subject is pronominal:40

(69) Breton (Stephens 1990: 155)
Kavet am eus ur bluenn vat [din da skrivañ gwelloc'h].
find:PART 1SG have:PRS a quill good for:1SG to write better
'I have found a good pen so that I can write better.'

In fact it is clear that infinitival clauses in the P-Celtic languages do display obligatory agreement. Another type of infinitival clause occurs in Welsh and Cornish: this is the equivalent to the finite periphrastic clause illustrated for Welsh in (4):

(70) Welsh
Dywedodd o [ei bod hi'n mynd].
say:PST:3SG he 3FSG be she-PROG go
'He said she was going.'

(71) Cornish (from Passio Christi, cited by George 1993: 460)
Yn sur ef a wothfye [y bos hy peghadures].
PRED sure he PTL know:COND 3FSG be she sinner:F
'Surely He would know that she is a sinner.'

In (70) and (71) the infinitival verbs bod, bos 'be' have an obligatory agreement proclitic. As in (67), if the subject is a lexical NP and not a pronoun, there is no overt agreement, but once again we can attribute this to the Celtic complementarity between overt (nominal) arguments and overt agreement. These data would not constitute evidence for Ouhalla that P-Celtic is typologically VSO, since, as noted, he allows for SVO languages in which Agr does happen to project. However, the data presented throughout Section 5.2 seem to indicate that the Celtic languages are not, after all, so different typologically from other VSO languages. If Ouhalla's criteria are crucial, then Celtic seems to conform to them quite well: it has both an alternative SVO word order and inflected infinitival clauses. On the other hand, from recent work on other verb-initial languages, it has clearly emerged that there is no "typical" VSO language. For example, Chung (1990) on Chamorro and Guilfoyle et al. (1992) on Malagasy, Tagalog, Cebuano and Bahasa show that Austronesian verb-initial languages display syntactic properties very different from those of any of the Celtic languages, or indeed the Semitic languages. It seems likely, then, that there is no specific VSO "type": see also McCloskey (1996b). I end this section with two further questions concerning Ouhalla's proposals. Firstly, note that a number of surface VSO languages lack full subject/verb agreement in finite clauses. This is the case in Arabic as well as the Celtic languages: in both families, finite verbs fail to agree in number with the postverbal subject, so that plural subjects co-occur with singular verbal affixes. Ouhalla (1991: 124) says this indicates that postverbal subjects do not agree with Agr in Arabic and Celtic.
Why is this lack of full inflection in Arabic, a "true" VSO language, not taken as evidence against an Agr projection in finite clauses, whilst a lack of agreement in infinitival clauses (in, say, Irish) is taken to imply that Agr fails to project? Why does the superficial absence of agreement count as evidence in the one case but not in the other? Secondly, why is it that verbal agreement occurs regularly in finite clauses in SVO languages but is (apparently) only rarely found in infinitival clauses? In other words, why in Ouhalla's account does Agr project just in case the verb is finite?
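The complementarity between overt lexical subjects and overt agreement, as it bears on Welsh i in (67) and (68), can be sketched as a small Python function. The function name `i_clause`, the feature label strings, and the one-entry paradigm table are assumptions made for illustration. Only the form iddo (3MSG) is taken from the examples above, and the other person/number forms are deliberately left out rather than guessed.

```python
from typing import Optional

# Inflected forms of Welsh i 'to, for'. Only the 3MSG form is attested in (68);
# the rest of the paradigm is intentionally not filled in here.
INFLECTED_I = {"3MSG": "iddo"}


def i_clause(subject: str, verb_phrase: str,
             pronoun_agr: Optional[str] = None) -> str:
    """Build a bracketed Welsh infinitival clause introduced by i.

    pronoun_agr is None for a lexical NP subject (agreement stays covert)
    and a feature label such as "3MSG" for a pronominal subject.
    """
    if pronoun_agr is None:
        head = "i"                        # null Agr: bare form of the element
    else:
        head = INFLECTED_I[pronoun_agr]   # overt Agr realized on the element
    return f"[{head} {subject} {verb_phrase}]"


if __name__ == "__main__":
    print(i_clause("Gwyn", "weld Mair"))        # lexical subject, as in (67)
    print(i_clause("fo", "weld Mair", "3MSG"))  # pronominal subject, as in (68)
```

The sketch treats agreement as always present and only its overt realization as conditioned by the subject type, which is the "covert Agr" reading of (67) rather than Ouhalla's reading on which Agr fails to project.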


5.3. Alternative views of Celtic word order and Case marking

In the early 1990s a view emerged in the generative literature according to which abstract Case marking is responsible for the parametric variation between VSO and SVO languages. The basic idea is that in VSO languages the subject NP receives Case without raising to the specifier position of the highest functional projection, whilst in SVO languages, the subject only receives Case in that position: see for example Koopman & Sportiche (1991), McCloskey (1991), Rouveret (1991). In Ouhalla's (1991) analysis, subjects generally raise to the specifier position of AgrP, and receive nominative Case under government from Agr. (This could be revised to give nominative assignment via Spec/Head agreement, à la Chomsky 1993.) Since Ouhalla claims that the Agr projection is sited differently in VSO and SVO languages, as in (62) and (63), different word orders result. Raising the subject to the Agr projection also ensures that there will be subject-verb agreement, via the Spec/Head relationship. When a lexical NP does not trigger full subject-verb agreement, as in Arabic and Celtic, Ouhalla proposes that the subject receives a default nominative Case in its underlying position in [Spec, VP]. It then has no need to raise, and hence cannot raise.

Ouhalla's account is problematic for the Celtic languages in various ways. Consider firstly Welsh. Ouhalla assumes that the VSO order results from the subject remaining in situ in VP, whilst the verb raises. The first piece of evidence against this comes from adverbial placement. Making the standard assumption that VP adverbials are adjoined to VP, data like (72) and (73)41 show that the subject must have raised out of the VP:42

(72) Welsh
Mae ei rhieni [VP bob amser [VP wedi gofalu am hynny]].
be:PRS:3SG her parent:PL every time PFV care about that
'Her parents have always taken care about that.'

(73) Welsh
Doeddwn i [VP prin [VP yn fy nal fy hun yn ôl]].
NEG:be:PST:1SG I scarce PROG 1SG hold 1SG self PRED back
'I could scarcely hold myself back.'


Since it precedes the adverbial, the subject cannot be within VP. Secondly, subjects also precede the negative element ddim:43

(74) Welsh
Welodd y ferch ddim y dyn.
see:PST:3SG the girl NEG the man
'The girl didn't see the man.'

Assuming a negation projection which is above the VP, again we see that the subject cannot be within VP. From this evidence, it seems that Welsh does not have a default Case assignment rule for subjects: instead, the subject must raise. Comparing Breton to Welsh, it is clear that the two languages do not behave identically. In Breton the subject follows, rather than precedes, the negative morpheme ket, so the default Case assignment proposal might perhaps be sustained, with the subject remaining within VP:

(75) Breton
Ne lenn ket ar vugale.
NEG read:PRS NEG the child:PL
'The children don't read.'

An alternative might be that the position of the negative head itself varies cross-linguistically (see Ouhalla 1993), or that the subject raises covertly, at LF, in the sense of Chomsky (1993). For Cornish, I argued in § 2.5 that preverbal subjects move to the specifier position of the highest functional projection within the clause. However, it must clearly be the case that such movement is not driven by Case requirements, since subjects can also appear postverbally, just as in Breton and Welsh. Infinitival clauses in the P-Celtic languages shed further light on clause structure. As we saw in § 5.2.4, such clauses exhibit an Agr element, which is overt just in case the subject is pronominal (and empty, i.e. pro, in Breton). Specifically, we can propose that all clauses, whether finite or infinitival, are Agr-initial in P-Celtic:

(76) [Agr-SP Agr-S [TP Tns ... [VP Su V ...]]]

The difference between finite and infinitival clauses lies in how the initial Agr element is supported. In finite clauses it is supported by the raised verb, moving through Tense to Agr-S. In infinitival clauses, most infinitival verbs are not lexically specified to take subject agreement, so do not raise. The inability of (most) infinitival verbs to raise gives the appearance of SVO word order, for example in (67) and (68), but such clauses are actually headed by Agr-S, just like finite clauses. Where the infinitival verb does not raise, an element must insert (or perhaps raise) to give lexical support to Agr: this is the function of i in Welsh (67), (68), and of da in Breton, (69). These particles (etymologically, inflecting prepositions) adjoin to Agr, which would otherwise be unsupported. However, the verbs bod (Welsh) and bos (Cornish) do have subject agreement in their infinitival forms, as (70) and (71) show. These verbs are therefore unique in that they can move to Agr-S, unlike other infinitival verbs.44 Subjects in Welsh appear to raise overtly to the specifier position of a projection above VP, judging by (72) through (74). The same is true of the subjects of infinitival bod clauses, which also precede VP-adverbials:

(77) Welsh
... er ei bod hi [VP eisoes [VP wedi bod yn socian am dridiau]]
although 3FSG be she already PFV be PROG soak for three.days
'although it had already been soaking for three days'

We might then assume that the subject in Welsh raises to Spec, TP. In this position it can be governed by Agr-S, if Case is assigned under government, or alternatively we can say that the subject raises covertly to the specifier position of Agr-S, where Case is checked under the Spec/head relationship. Under either variant, Agr-S will not be associated with Case assignment to subjects unless it is lexically supported, either by a finite verb, an infinitival verb, or by an inserted "prepositional" element which inflects. These proposals enable us to dispense with an artificial VSO/SVO division between finite and infinitival clauses in P-Celtic: this classification is unnecessary and misleading, once we accept that both clause types are Agr-S initial, and what varies is the nature of the element that gives lexical support to Agr. A subject in Spec, TP may subsequently move to a higher, preverbal position: in Cornish, this is the Spec, Agr-SP position, if the arguments in § 2.5 are correct. This derives the non-focalized "SVO" word order of Cornish: see for example (7a) and (7b). We can surmise that the reason the subject moves (optionally) to this position has to do with the V2 requirement in Cornish. From Spec, Agr-SP, the subject may then move to Spec, CP, which is the focus position for subjects (and other constituents) in all the Celtic languages. This derives such examples as (78):

(78) Cornish (from Beunans Meriasek, cited by George 1991: 229)
Nahen dewes nynsa om ganov defry.
no drink NEG:enter.3SG 1SG mouth sure
'No other drink will enter my mouth for sure.'

Since the subject is focalized, according to George (1991: 229), and also appears to the left of the negative complementizer ny, it seems plausible to analyse it as being in Spec, CP. At first sight, Irish infinitival clauses might appear to provide support for Ouhalla's (1991) default nominative Case assignment rule, since they have no Case-assigning Agr element before the subject:

(79) Irish (McCloskey 1986 a: 266)
Tá mé iontach sásta [é a bheith ar an fhoireann].
be:PRS I very happy him be on the staff
'I am very glad for him to be on the staff.'

(80) Irish (McCloskey & Sells 1988: 149)
B'éadóiche [iad cruinniú].
be:COND:improbable them assemble
'It would be improbable that they would assemble.'

In fact, Irish is equally problematic for Ouhalla, as the morphological case of infinitival subjects is not nominative but accusative. On the other hand, subjects of finite clauses do bear nominative case, a contrast Ouhalla would have to account for. Presumably one solution would be different default Case assignment rules for finite and infinitival clauses, but this is highly stipulatory. Note that we cannot propose that the subjects of infinitival clauses move to escape the default nominative which is assigned in Spec, VP: no movement can occur unless it has to, and if the infinitival subject were receiving a default Case in situ it wouldn't need to move further. Another possibility in keeping with Ouhalla's analysis, although again a somewhat ad hoc one, is that infinitival subjects are base-generated in some position other than [Spec, VP]. That position could not be Spec, AgrP, since this is associated with nominative Case in Ouhalla's account. If Agr does not project, which is an option Ouhalla proposes for Celtic, then Irish preverbal subjects might be in [Spec, TP]: in this position they would be accessible to government from outside the clause. In fact, they would be in the same position as that proposed for Arabic preverbal subject topics. Ouhalla analyzes the latter as receiving accusative Case from the C° ?inna or the matrix V, under government. Yet a parallel account cannot be correct for Irish: James McCloskey has demonstrated convincingly that accusative Case in Irish is not assigned under government, but is, rather, a default Case. It appears not only on embedded infinitival subjects, but also on preverbal subjects in independent infinitival clauses, on NPs in all verbless clauses, and NPs in syntactic isolation, all contexts where there is no external governor and Case assigner: see McCloskey (1986 a; 1986 b), Chung & McCloskey (1987) and McCloskey & Sells (1988).45 Conversely, nominative Case is argued by McCloskey (1996 b) to be assigned (checked) via the Spec, Head relationship, with the subject moving to Spec, Agr-SP.46

5.4. Conclusions

Although the Celtic languages appear not to conform to the order of functional projections proposed by Ouhalla (1991) for "true" VSO languages, or at least not unequivocally, they share a variety of other properties with VSO languages. We have shown that of the remaining typological properties proposed by Ouhalla, Celtic does have alternative subject-initial word order (in infinitival clauses) and P-Celtic does have inflected infinitival clauses. This means that the Celtic languages are hardly atypical of VSO typology, even by Ouhalla's criteria. In any case, the link between the three proposed properties in (60) has been argued above to be tenuous, so that their value as diagnostics for VSO word order is called into question. It seems that a completely different set of properties might equally well be chosen to represent "true" VSO languages. For instance, Kayne (1994) makes the point that only VSO languages have prepositions which inflect to agree with their complements (as the Celtic languages all do): this might constitute one typological property. A second might be that "true" VSO languages lack subject/verb agreement for number with full lexical NPs, as is true of both the Celtic languages and Arabic (but not Berber). A third property might be the existence of VSO as the unmarked order of finite embedded clauses (covering all the Celtic languages including Breton and Cornish). It would no doubt be possible to link these properties in some way, just as Ouhalla has done for his three properties in (60)/(61). However, it is doubtful that the result would be either of typological validity or theoretical interest. It seems more likely that there are no "true" VSO languages, but that certain properties tend to recur amongst languages with VSO word order: see also Payne (1990). The challenge for theoretical linguistics is to discover why these properties occur, and why languages with the same superficial word order don't always share them. Current evidence suggests that not all languages with, say, V2 effects derive verb-second order in the same way. The same seems to be true of verb-initial order, and it is apparent that not all "initial" subjects are in the same clausal position. VSO languages are, then, unlikely to form a homogeneous class, even if they are closely related members of the same family.

Notes

1. A number of people read earlier versions of this paper with great care, and so prevented me from making many more mistakes than I would otherwise have done. My grateful thanks, then, to Robin Schafer and Bob Borsley, and to the members of my EUROTYP group, in particular Anders Holmberg, Anna Siewierska and Yakov Testelec. None of these people are responsible for any remaining errors.
2. The Insular Celtic languages are divided into two major groups: P-Celtic, containing Welsh, Breton and Cornish, and Q-Celtic, containing Irish, Scottish Gaelic and Manx. Cornish and Manx have no native speakers: Cornish died out in the eighteenth century, and Manx in the 1970s.
3. The word order within the verb phrase is either VO or OV; see Tallerman (this volume) for further details.
4. The analysis of Celtic pre-verbal particles is a difficult area. For example, although a and ne in (6) are in complementary distribution, they are often considered to be in different positions, since the negative particle "counts" as a pre-verbal constituent in the verb-second construction but a does not. These particles or their cognates occur in all the Celtic languages, notably in cleft constructions such as (6), and relative clauses. The main question is whether or not they should be analysed as complementizers. I leave this question open, usually glossing these elements as particles. The negative particle ne is typically deleted in spoken Breton, leaving only its initial mutation (where appropriate) on the finite verb.
5. Occasionally Cornish has sentence-initial finite lexical verbs: see Tallerman (this volume) and George (1991: 238; 1993: 453). Unlike in Modern Breton, then, there seems not to have been an absolute prohibition on main clause V-initial word order in Cornish.
6. The discussion of Breton in this section owes much to Borsley, Rivero & Stephens (1996). For an alternative view of V-fronting in Breton, see Schafer (1994).
7. It occurs, for example, in some of the Slavic languages, Balkan languages, and Old Spanish; see Rivero (1992; 1994) for discussion. However, no other living Celtic language allows this type of V-fronting: although the Welsh example in (i) is cited by Sproat (1985: 188) as grammatical, in fact it is completely impossible:

(i) Welsh
*Gweld a wnaeth Siôn y tŷ.
see PTL do:PST:3SG Siôn the house
(*'It's see that Siôn did the house.')

However, fronting of non-finite V did occur freely in Middle Welsh, as these examples show:

(ii) Middle Welsh (from Pedeir Keinc y Mabinogi, cited by Evans 1989: 160)
Edrych a wnaeth Manauydan ar y dref yn Llundein ...
look PTL do:PST:3SG Manawydan on the town in London
'Manawydan looked upon the town in London ...'

(iii) Middle Welsh (from Buched Dewi, cited by Evans 1989: 160)
Mynet a oruc Padric y Iwerdon.
go PTL do:PST:3SG Patrick to Ireland
'Patrick went to Ireland.'

8. Other analyses of Breton word order are proposed by Stephens (1982), in an earlier transformational framework; Borsley (1990), in a GPSG framework; and Borsley & Stephens (1989), who regard Breton as a VSO language with a general rule of subject topicalization, rather than an SVO language. Their view is also compatible with a V2 account of Breton. See Stump (1989) for an expansion of the V2 analysis.
9. Schafer (1994: ch. 4) discusses two other constructions which have an initial X° predicate. The first has an adjective in initial position, followed by the finite copula:

(i) Breton
Fall eo an amzer.
bad is the weather
'The weather is bad.'

Amongst the similarities which this construction has to V-fronting are the following: it is discourse neutral; it only occurs in root clauses (compare (28)); and it cannot occur in negative clauses (compare (27)):

(ii) Breton
*Fall n'eo ket an amzer.
bad NEG-is NEG the weather
('The weather is not bad.')

The grammatical version of the negative clause is as in (iii):

(iii) Breton
N'eo ket fall an amzer.
NEG-is NEG bad the weather
'The weather is not bad.'

The other construction Schafer discusses has initial infinitival bezañ 'be', and a finite lexical verb in second position, rather than an auxiliary or copula:

(iv) Breton (Schafer 1994: 170)
Bezañ a ouie an tu.
be PTL know:IMPF the way
'S/he knew the way (indeed).'

Examples like (iv) again have similarities with the type of V-fronting discussed by Borsley et al., most specifically that they only occur in root clauses, and cannot occur in negative clauses. However, Schafer notes (1994: 146) that the initial verb bezañ cannot be seen as in any sense "belonging" inside the clausal sister to C°: "It cannot be licensed in any position internal to that clause". For this reason, I leave aside such constructions in the following discussion.
10. In fact Long Head Movement does occur in embedded clauses which are not L-marked in the sense of Chomsky (1986): clauses which are not the complement to some lexical head.
11. To be precise, Borsley et al. propose that Long Head Movement languages share a particular licensing system for Tense, which accounts for their common properties. In their analysis of Breton, Tense is licensed when C° is filled either by negation or by a non-finite verb; or when the Spec of C° is filled, or when C° heads a CP which is L-marked, as for example in typical embedded clauses. The latter case gives rise to the VSO word order found in embedded clauses. Anders Holmberg (p. c.) suggests that movement to Spec, CP (topicalization/focalization) might require a particular feature in C°, the existence of which blocks V-fronting into C°: this proposal permits a unified approach to initial clause structure, relying on the features of C°.
12. However, it is fair to point out that one of Timm's main aims was to discover the order of the verb relative to its arguments, rather than to examine the absolute position of the verb in a clause. In particular, she is interested in Varin's claim that SVO is the unmarked word order in Breton, a claim which she is able to dismiss on the basis of clause counts from written texts.
13. Its presence is, however, signalled by mutation effects, if the verb has a mutable initial consonant.
14. (30) and (31) are cited by Fife and King (1991).
15. Compare the Irish examples in § 3, where the copula is optional under certain circumstances.
16. Stump (1984) proposed that Breton subjects are in Spec, IP, an analysis which is refuted by Borsley & Stephens (1989) and Stump (1989). Note, though, that Stump's motivation for the 1984 account was the lack of subject/verb agreement in affirmative clauses, whilst the Cornish evidence presented here relies on the position of the subject relative to other elements in the clause. We will also see in § 5.3 that the Cornish subject may move to Spec, CP for purposes of focalization.
17. Depending on how ena 'there' is analysed in (32), this example may also illustrate the same point.
18. Postverbal pronouns trigger full agreement in Celtic: compare the Cornish example in (12 b); postverbal lexical NPs do not trigger agreement. See Rouveret (1991) for an analysis of Welsh agreement. Occasionally full agreement did occur with lexical NPs in Middle Welsh, and the same is true of Middle Cornish: see George (1991: 232).
19. George (1991: 213) cites as an "exceptional case" one example of SOV with a full NP object:

(i) Cornish (from Beunans Meriasek, cited by George 1991: 213)
Ha me an benediccon a ra oma purdyson.
and I the benediction PTL do:PRS here forthwith
'And I the benediction shall perform here forthwith.'

20. Mynnes 'wish' is analysed by George (1991) as an auxiliary, which seems reasonable; compare English will.
21. George (1991: 217) describes the example in (36) as an "unusual" and "extraordinarily contorted" word order, but also cites the example in (i) as "standard":

(i) Cornish (George 1991: 214)
Gweles my a vynn.
see I PTL wish:PRS
'I wish to see.'

22. As Robin Schafer (p. c.) points out, this raises the problem of how the verbal particle a should be analyzed: it clearly cannot be a complementizer under this analysis, since the non-finite verb is in C°. However the particles are analyzed, the same question arises for Long Head Movement constructions in Breton, where the non-finite verb also co-occurs with a pre-verbal particle: see for example (18) and (20).
23. George also notes that the order Object-Verb-Auxiliary-Subject is unattested, which would give examples parallel to the ungrammatical (29) in Breton.
24. It is unlikely to be the only strategy: see for example Santorini (1994), who proposes that the landing site of topicalization in "symmetrical" V2 languages such as Yiddish and Icelandic is Spec, IP.
25. Rouveret (1996) points out that predicative yn always introduces stage-level predicates, that is, those having a transitory interpretation.
26. The facts of Scottish Gaelic and Manx are very similar to Irish, and these languages also have a zero copula under certain circumstances:

(i) Manx (Broderick 1993: 279)
Juan Mooar yn fer share.
John big the man best
'Big John is the best man.'

(ii) Scottish Gaelic (Calder 1923: 261)
A phobull sinn.
his people we
'We are his people.'

27. This term is used in Government and Binding syntax to refer to a "clause" which has a "subject" and a non-finite or even non-verbal predicate, as in I consider [John a fool/incompetent].
28. Compare (i), a selected small clause, with an unselected small clause in (ii). Without pronoun postposing, (ii) is grammatical.

(i) Irish (Duffield 1995)
Ba annamh [sa bhaile é].
be:PST rare him at.home him
'He was rarely at home.'

(ii) Irish (Duffield 1995)
*agus [sa bhaile é].
and him at.home him
('and him at home')

29. I refer the reader to Duffield (1995) for full details, including motivation for the movements, and further evidence for the analysis.

30. Such a proposal was first explicitly made for Celtic by Emonds (1979).
31. As opposed to the "symmetrical" V2 languages Icelandic and Yiddish, where V2 occurs in both main and embedded clauses: see for example Santorini (1994).
32. However, § 5.2.4 presents an alternative view of infinitival clauses.
33. Of course, the most restrictive assumption is that there is absolute identity of underlying structure cross-linguistically.
34. In fact, somewhat similar claims for Celtic have been made by Hendrick (1990: 141 ff.). Also citing Berber as a contrast, Hendrick suggests that in Breton, Tense is inside Agr; however, in a later paper (Hendrick 1991) he proposes the opposite order.
35. I have illustrated with paradigms from Literary Welsh, since this variety displays a great deal of verbal morphology. Colloquial Welsh has much less verbal morphology (and has consequently virtually ceased to be a null subject language); Thomas (1982: 215) shows that the past tense paradigm of his dialect only differentiates first and third person singular forms, the remaining members of the paradigm being morphologically identical. Literary Welsh (from which Rouveret's two examples in (64) are taken) is largely a written variety, and contains artificial literary forms and orthographic representations which do not reflect, and indeed may never have reflected, the actual morphophonology. Williams (1980: 85) comments on the tradition of representing the third person plural suffix as -nt: "This is merely a literary practice, as the final -t is never heard in natural speech". According to Williams, this practice stems from the Welsh Bible translation of 1588. Perhaps this was an early attempt to make the morphology appear neatly concatenative, since the -t suffix would serve to distinguish first person plural forms in the past tense (in spoken Welsh often -son) from third person plural forms (also -son). Orthographic representations therefore cannot be relied on as evidence.
36. An obvious restriction was that the language had to display both affixes, and word order data had to be available. Siewierska did not consider the question of segmentability, so it is likely that some languages in her sample have (to some degree) fusional morphology, and so are not strictly relevant.
37. Note though that Siewierska does not distinguish between VSO and VOS languages, whereas Ouhalla is specifically discussing VSO.
38. Although as Tallerman (this volume) shows, Irish and Scottish Gaelic have predominantly SOV order in infinitival clauses.
39. In Welsh, overt pronominals but not lexical NPs co-occur with agreement.
40. In Breton the complementarity noted above also covers overt pronominals, so the subject in (69) must be pro.
41. These examples, and (77), are taken from Annwyl neb, translated by Emily Huws, Corner Press, 1993.
42. More accurately, out of an "aspect phrase", since (72) and (73) also contain an aspect marker which might be considered to head its own projection.
43. In some dialects the negation marker precedes the subject under certain conditions: see Awbery (1990) and Rouveret (1991) for an account.
44. In Middle Welsh, other infinitival verbs also bore subject agreement, and so moved to Agr, but this construction has died out.
45. Note, incidentally, that McCloskey's default Case rule applies wherever there is no external governor, and thus requires no special restrictions.
46. McCloskey follows Duffield (1991 b) on the order of functional heads, proposing that the higher projection is Tense. However, in a footnote he suggests an alternative under which the Agr-S projection is higher, as proposed here for P-Celtic.

References Anderson, Stephen 1981 "Topicalization in Breton", Proceedings of the Berkeley Linguistics Society 7: 27-39. Anderson, Stephen & Sandra Chung 1977 "On grammatical relations and clause structure in verb-initial languages", in: Peter Cole & Jerry Sadock (eds.) Syntax and Semantics 8. New York: Academic Press, 1—25. Awbery, Gwenllian 1990 "Dialect syntax: a neglected resource for Welsh", in: Randall Hendrick (ed.), 1-25. Baker, Mark 1985 "The Mirror Principle and morphosyntactic explanation", Linguistic Inquiry 16: 373-417. Bakker, Dik this volume "Flexibility and consistency in the languages of Europe". Ball, Martin (ed.) 1993 The Celtic languages. London: Routledge. Belletti, Adriana 1990 Generalized verb movement. Turin: Rosenberg and Sellier. Besten, Hans den 1983 "On the interaction of root transformations and lexical deletive rules", in: Werner Abrahams (ed.), On the formal syntax of the 'Westgermania. Amsterdam: John Benjamins, 47—131. Bobaljik, Jonathan & Andrew Carnie 1996 "A minimalist approach to some problems of Irish word order", in: Robert D. Borsley & Ian Roberts (eds.), 223-240. Borsley, Robert D. 1990 "A GPSG approach to Breton word order", in Randall Hendrick (ed.), 81-95. 1994 "Why Breton is not German", Ms: University of Wales, Bangor. Borsley, Robert D., Maria Luisa Rivero, & Janig Stephens 1996 "Long head movement in Breton", in: Robert D. Borsley & Ian Roberts (eds.), 53-74. Borsley, Robert D. & Ian Roberts (eds.) 1996 The syntax of the Celtic languages: a comparative perspective. Cambridge: Cambridge University Press. Borsley, Robert D. &c Janig Stephens 1989 "Agreement and the position of subjects in Breton", Natural Language and Linguistic Theory 7: 407—427. Broderick, George 1993 "Manx", in: Martin Ball (ed.), 228-285.


Calder, George 1923 A Gaelic grammar. Reprinted by Gairm Publication, Glasgow, 1990.
Carnie, Andrew 1994 "Complex predicates and deriving copula word order", Paper presented at the International Conference on Language in Ireland, Parasession on the Generative Grammar of Irish, June 25, 1994.
Carnie, Andrew & Heidi Harley 1993 "Predicate raising and the Irish copula", Ms: MIT.
Chomsky, Noam 1986 Barriers. Linguistic Inquiry Monograph 13. Cambridge, MA: MIT Press. 1991 "Some notes on economy of derivation and representation", in: Robert Freidin (ed.), Principles and parameters in comparative grammar. Cambridge, MA: MIT Press, 417-454. 1993 "A minimalist program for linguistic theory", in: Kenneth Hale & Samuel J. Keyser (eds.), The view from Building 20: essays in honour of Sylvain Bromberger. Cambridge, MA: MIT Press, 1-52.
Chung, Sandra 1990 "VPs and V-movement in Chamorro", Natural Language and Linguistic Theory 8: 559-619.
Chung, Sandra & James McCloskey 1987 "Government, barriers, and small clauses in Irish", Linguistic Inquiry 18: 173-237.
Comrie, Bernard 1981 Language universals and linguistic typology. Oxford: Basil Blackwell.
Dooley Collberg, Sheila 1991 Comparative studies in current syntactic theories. Working Papers 37, Department of Linguistics, Lund University.
Dryer, Matthew 1991 "SVO languages and the OV:VO typology", Journal of Linguistics 27: 443-482.
Duffield, Nigel 1991 a "On negation, minimality and Irish relative clauses", in: L. Dobrin, M. Nichols & R. M. Rodriguez (eds.), Papers from the 27th regional meeting of the Chicago Linguistic Society. Part Two: The parasession on negation. Chicago: Chicago Linguistic Society, 30-48. 1991 b Particles and projections. PhD dissertation, University of Southern California. 1995 "Are you right? On pronoun postposing and other problems of Irish word order", in: R. Aranovich, W. Byrne, S. Preuss & M. Senturia (eds.), Proceedings of the Thirteenth West Coast Conference on Formal Linguistics. Stanford: Stanford Linguistics Association, 221-236.
Emonds, Joseph E. 1979 "Word order in generative grammar", in: G. Bedell et al. (eds.), Explorations in linguistics. Tokyo: Kenkyusha Press. 1980 "Word order in generative grammar", Journal of Linguistic Research 1: 33-54. 1985 A unified theory of syntactic categories. Dordrecht: Foris.


Evans, D. Simon 1989 A grammar of Middle Welsh. Dublin: Dublin Institute for Advanced Studies.
Fife, James & Gareth King 1991 "Focus and the Welsh 'abnormal sentence': a cross-linguistic perspective", in: James Fife & Erich Poppe (eds.), 81-153.
Fife, James & Erich Poppe (eds.) 1991 Studies in Brythonic word order. Amsterdam: John Benjamins.
Fukui, N. & Margaret Speas 1986 "Specifiers and projection", in: N. Fukui, T. Rapoport & E. Sagey (eds.), Papers in theoretical linguistics: MIT Working papers in linguistics 8: 128-172.
George, Ken 1991 "Notes on word order in Beunans Meriasek", in: James Fife & Erich Poppe (eds.), 205-250. 1993 "Cornish", in: Martin Ball (ed.), 410-468.
Gregor, Douglas 1980 Celtic. A comparative study. Cambridge: The Oleander Press.
Harlow, Stephen J. 1981 "Government and relativization in Celtic", in: Frank Heny (ed.), Binding and filtering. London: Croom Helm, 213-254.
Hawkins, John A. 1983 Word order universals. New York: Academic Press. 1990 "A parsing theory of word order universals", Linguistic Inquiry 21: 223-261.
Hendrick, Randall 1988 Anaphora in Celtic and universal grammar. Dordrecht: Kluwer. 1990 "Breton pronominals, binding and barriers", in: Randall Hendrick (ed.), 121-165. 1991 "The morphosyntax of aspect", Lingua 85: 171-210.
Hendrick, Randall (ed.) 1990 Syntax and Semantics 23. San Diego: Academic Press.
Jones, B. M. & Alan R. Thomas 1977 The Welsh language: Studies in its syntax and semantics. Cardiff: University of Wales Press.
Jouin, Beatris 1984 Petite grammaire du Breton. Rennes: Ouest France.
Kayne, Richard 1994 The antisymmetry of syntax. Linguistic Inquiry Monograph 25. Cambridge, MA: MIT Press.
Kitagawa, Y. 1986 Subjects in Japanese and English. PhD dissertation, University of Massachusetts at Amherst.
Koch, John 1991 "On the prehistory of Brittonic syntax", in: James Fife & Erich Poppe (eds.), 1-43.
Koopman, Hilda 1984 The syntax of verbs. Dordrecht: Foris.


Koopman, Hilda & Dominique Sportiche 1991 "The position of subjects", Lingua 85: 211-258.
Kuroda, Y. 1978 "Whether we agree or not: A comparative syntax of English and Japanese", Linguisticae Investigationes 12: 1-47.
Lehmann, Winfred P. 1973 "A structural principle of language and its implications", Language 49: 42-66. 1978 "The great underlying ground-plans", in: Winfred P. Lehmann (ed.), Syntactic typology. Austin: University of Texas Press, 3-55.
Longacre, Robert E. 1990 Storyline concerns and word order typology in East and West Africa. The James S. Coleman African Studies Centre and the Department of Linguistics, University of California, Los Angeles.
MacAulay, Donald (ed.) 1992 The Celtic languages. Cambridge: Cambridge University Press.
McCloskey, James 1983 "A VP in a VSO language?", in: G. Gazdar, E. Klein, and G. K. Pullum (eds.), Order, concord and constituency. Dordrecht: Foris, 9-55. 1986 a "Inflection and conjunction in Modern Irish", Natural Language and Linguistic Theory 4: 245-281. 1986 b "Case, movement and raising in Modern Irish", in: J. Goldberg, S. MacKaye, & M. Wescoat (eds.), Proceedings of the West Coast Conference on Formal Linguistics 4. Stanford: Stanford Linguistics Association, Stanford, California. 1991 "Clause structure, ellipsis and proper government in Irish", Lingua 85: 259-302. 1992 "Adjunction, selection and embedded verb second", Ms: University of California at Santa Cruz. 1996 a "On the scope of verb movement in Irish", Natural Language and Linguistic Theory 14: 47-104. 1996 b "Subjects and subject positions in Irish", in: Robert D. Borsley & Ian Roberts (eds.), 241-283.
McCloskey, James & Peter Sells 1988 "Control and A-chains in Modern Irish", Natural Language and Linguistic Theory 6: 143-189.
Mackinnon, Roderick 1971 Teach yourself Gaelic. Sevenoaks: Hodder and Stoughton.
Mallinson, Graham & Barry Blake 1981 Language typology: Cross-linguistic studies in syntax. Amsterdam: North-Holland.
Manzini, Rita 1983 Restructuring and reanalysis. PhD dissertation, MIT.
Morris Jones, Bob 1993 "Ascriptive and equative sentences in children's Welsh", (Studies in Child Language, Aberystwyth Education Papers) University of Wales, Aberystwyth.


Ó Dochartaigh, Cathair 1992 "The Irish language", in: Donald MacAulay (ed.), 11-99.
Ó Siadhail, Mícheál 1989 Modern Irish. Grammatical structure and dialectal variation. Cambridge: Cambridge University Press.
Ouhalla, Jamal 1991 Functional categories and parametric variation. London: Routledge. 1993 "Subject extraction, negation and the anti-agreement effect", Natural Language and Linguistic Theory 11: 477-518.
Payne, Doris 1990 The pragmatics of word order. Typological dimensions of verb initial languages. Berlin & New York: Mouton de Gruyter.
Pollock, Jean-Yves 1989 "Verb movement, universal grammar, and the structure of IP", Linguistic Inquiry 20: 365-424.
Press, Ian 1986 A Grammar of Modern Breton. Mouton Grammar Library 2. Berlin: Mouton de Gruyter.
Ramchand, Gillian 1993 Aspect and argument structure in Modern Scottish Gaelic. PhD dissertation, Stanford University.
Rivero, Maria Luisa 1992 "Long head movement and negation: Serbo-Croatian vs. Slovak and Czech", The Linguistic Review 8: 319-351. 1994 "Clause structure and V-movement in the languages of the Balkans", Natural Language and Linguistic Theory 12: 63-120.
Rouveret, Alain 1991 "Functional categories and agreement", The Linguistic Review 8: 353-387. 1996 "Bod in the present tense and in other tenses", in: Robert D. Borsley & Ian Roberts (eds.), 125-170.
Santorini, Beatrice 1994 "Some similarities and differences between Icelandic and Yiddish", in: David Lightfoot & Norbert Hornstein (eds.), Verb movement. Cambridge: Cambridge University Press, 87-106.
Schafer, Robin 1994 Nonfinite predicate initial constructions in Breton. PhD dissertation, University of California, Santa Cruz.
Siewierska, Anna 1993 "On the ordering of subject agreement and tense affixes", in: Anna Siewierska (ed.), EUROTYP Working Papers 11/5, 5: 101-124. this volume "Variation in major constituent order: a global and a European perspective".
Sproat, Richard 1985 "Welsh syntax and VSO structure", Natural Language and Linguistic Theory 3: 173-216.
Stenson, Nancy 1981 Studies in Irish syntax. Tübingen: Gunter Narr Verlag.


Stephens, Janig 1982 Word order in Breton. PhD dissertation, School of Oriental and African Studies, University of London. 1990 "Non-finite clauses in Breton", in: Martin J. Ball, James Fife, Erich Poppe, & Jenny Rowland (eds.), Celtic linguistics: Readings in the Brythonic languages, a Festschrift for T. Arwyn Watkins. Amsterdam: John Benjamins, 151-165.
Stowell, Tim 1989 "Raising in Irish and the projection principle", Natural Language and Linguistic Theory 7: 317-359.
Stump, Gregory 1984 "Agreement vs. incorporation in Breton", Natural Language and Linguistic Theory 2: 289-348. 1989 "Further remarks on Breton agreement", Natural Language and Linguistic Theory 7: 429-471.
Tallerman, Maggie this volume "An overview of the main word order characteristics of Celtic". 1996 "Fronting constructions in Welsh", in: Robert D. Borsley & Ian Roberts (eds.), 97-124.
Ternes, Elmar 1992 "The Breton language", in: Donald MacAulay (ed.), 371-452.
Thomas, Alan 1982 "Change and decay in language", in: David Crystal (ed.), Linguistic controversies. London: Edward Arnold.
Timm, Lenora 1989 "Word order in 20th century Breton", Natural Language and Linguistic Theory 7: 361-378. 1991 "Discourse pragmatics of NP-initial sentences in Breton", in: James Fife & Erich Poppe (eds.), 275-310.
Varin, Amy 1979 "VSO and SVO word order in Breton", Archivum Linguisticum 10: 83-101.
Vennemann, Theo 1974 "Analogy in generative grammar: the origin of word order", Proceedings of the Eleventh International Congress of Linguists, 1972. Bologna: Il Mulino. 1976 "Categorial grammar and the order of meaningful elements", in: A. Juilland (ed.), Linguistic studies offered to Joseph Greenberg on the occasion of his sixtieth birthday. Saratoga, California: Anma Libri, 615-634.
Williams, Stephen 1980 A Welsh grammar. Cardiff: University of Wales Press.
Willis, David 1996 The loss of verb-second in Welsh: a study of syntactic change. PhD dissertation, University of Oxford.
Woolford, Ellen 1991 "VP-internal subjects in VSO and nonconfigurational languages", Linguistic Inquiry 22: 503-540.

Yakov G. Testelec

Word order variation in some SOV languages of Europe

1. Introduction

This paper deals with some interrelated issues in the syntax of the SOV languages of Europe, mainly of the Nakh-Daghestanian (East Caucasian) and Kartvelian (South Caucasian) families. My aim is to investigate some instances of word order variation in these languages involving pre- and postposing of dependents relative to their heads and to show how this variation correlates with differences in constituent structure.

Languages of the SOV type, although dominant in Northern Eurasia, are in the minority in Europe. They include the Turkic languages (e.g. Turkish, Tatar, Nogai, Kumyk and several others), the North Caucasian languages (of the Abkhaz-Adyghe and Nakh-Daghestanian branches), the South Caucasian languages (or Kartvelian) and Basque. Many Finno-Ugric languages belong to this type (like Mari) or at least share some of its crucial characteristics (e.g. Northern Saami, Udmurt, or Hungarian). Latin predominantly showed SOV order of its clausal constituents, although most other relevant characteristics were unambiguously of the SVO type. (For the details on the distribution of the SOV features in Europe see the Appendix to this volume.)

To date, researchers in the field of the syntax of SOV languages have shown little interest in the problem of marked head-initial orders, partly because in the best studied SOV languages of the Altaic family like Turkish or Japanese instances of, say, VO or NGen are mostly peripheral or ungrammatical. By contrast, the SOV languages which belong to the indigenous linguistic families of the Caucasus regularly employ marked head-initial order. In the existing works on the syntax of the Caucasian languages, such orders are typically overlooked or described in the same structural terms as the unmarked head-final orders. Their functional load is usually characterized as involving a shift in communicative or pragmatic status (Kibrik et al. 1977; Haspelmath 1993). The only work known to me in which differences in order are associated with differences in constituent structure is Boeder's (1995) investigation of Old Georgian.


In this paper I will try to show that not only in Old Georgian but in the Caucasian languages in general constructions which manifest head-initial orders may evince a constituent structure quite different from their normal head-final counterparts. I will argue that in the constructions in question there is no head-dependent relation between the semantically connected pairs of constituents. This holds true not only for noun phrases but also for some clauses headed by finite inflection markers. The non-clause-final position of the finite inflection or focus marker is determined by a special type of constituent structure.

The analysis of constituent structure to be presented below is based on the theory-neutral tests for constituency elaborated mostly within generative grammar. My analysis is not, however, an actual GB one, since I do not use the double-bar phrase structure representation that is now standard in GB, nor consistent binary branching, functional agreement projections, etc.

The first part of the paper is devoted to head-initial orders in the NP. In § 2.1. through § 2.3. I argue that in some SOV languages, postposed modifiers of the NP constitute another NP in which they are heads or in which the head is deleted. Then in § 2.4. I consider binary branching structures, which in SOV languages may exhibit considerable variation in order. This variation is shown to be partially constrained by the Early Immediate Constituent hypothesis (Hawkins 1991) and by restrictions pertaining to the number of NPs with a postposed focused or contrastive modifier.

The second part of the paper deals with focus constructions mainly in Daghestanian languages. I examine the restrictions imposed on word order variation in such constructions, first in Godoberi (§ 3.2.) and then in Avar (§ 3.3.) and discuss the potential constituent structure analyses of the problematic constructions with non-clause-final heads.¹

2. Postposed modifiers in the noun phrase

2.1. General characteristics

Though in head-final languages not all modifiers of the noun exhibit the same likelihood of being placed in pre-head position (Hawkins 1983), any SOV language may be expected to position at least some modifiers before the head noun. In Nakh-Daghestanian languages, the pre-head position is the normal location of all the nominal modifiers. Postnominal placement, in terms of the markedness criteria for word order discussed, for example, by Keenan (1979), is obviously marked. In Kartvelian and Turkic languages, relative clauses may follow their head nouns, which is the case in about half of the SOV languages in the world (cf. Dryer 1988). All the other modifiers, however, precede the noun. Thus with the exception of the just mentioned postnominal relative clauses, in those families that contain most of the SOV languages of Europe all noun modifiers have the head-final order as the unmarked option or as the only grammatical one.

2.2. Georgian: postposed modifiers as separate NPs

At least in some languages, a phenomenon which looks superficially like a marked position of a modifier requires a structural description different from that of the unmarked construction. Constituents which look like postposed modifiers behave like separate NPs not immediately dominated by any other NP. Sometimes these postposed modifiers acquire all the crucial characteristics of a head noun; sometimes they can be viewed as NPs with heads deleted under coordinate reduction or similar syntactic operations. The facts considered below fit well within the paradigm of modifiers vs. appositions explored in much detail on the basis of a considerable sample of languages by Jan Rijkhoff (this volume).

In head-final languages the right boundary of a phrase is often morphologically marked. In Georgian (South Caucasian), apart from relative clauses, head nouns occupy the rightmost position in NPs and are marked for case. Head nominals inflect for the full set of cases, irrespective of their lexical class: noun, adjective, demonstrative, etc. Omitting some peripheral details of nominal inflection, preposed modifiers, again irrespective of the class they belong to, in modern non-archaizing Georgian normally inflect for a restricted ('adjectival') set of cases which includes only nominative-genitive-instrumental (NGI), dative-transformative (DT) and vocative forms. Demonstratives distinguish nominative and oblique (OBL) suppletive forms. Head nouns inflect for the full set of cases, and for two numbers. These morphological properties are illustrated in the examples in (1).

(1) a. is or-i lamaz-i kal-i
       that.NOM two-NGI nice-NGI woman-NOM
       'those two nice women'
    b. am or-i lamaz-i kal-is
       that.OBL two-NGI nice-NGI woman-GEN
       'of those two nice women'
    c. am or-Ø lamaz-Ø kal-s
       that.OBL two-DT nice-DT woman-DAT
       'to those two nice women'

In contrast to other constituents of the NP, preposed genitives do not inflect at all, i.e. they do not allow any case ending to be attached to the genitive marker. Any modifier, except for a relative clause, when postposed, inflects for the full set of cases and numbers:

(2) a. kal-i lamaz-i
       woman-NOM nice-NOM
       'the nice woman'
    b. kal-s lamaz-s
       woman-DAT nice-DAT
       'to the nice woman'
    c. kal-eb-s lamaz-eb-s
       woman-PL-DAT nice-PL-DAT
       'to the nice women'

As shown in (3 b), genitives do not differ in this respect from other categories.

(3) a. davit-is mama-s
       David-GEN father-DAT
       'to David's father'
    b. mama-s davit-isa-s
       father-DAT David-GEN-DAT
       'to David's father'
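The inflectional pattern in (1)-(3) can be restated procedurally. The following is a minimal Python sketch, not part of the original analysis: it produces gloss lines rather than Georgian word forms, and the function name `gloss_np`, the modifier kinds `"dem"`, `"gen"` and `"other"`, and the case labels `INS` and `TRANSF` are illustrative assumptions. It covers only the cases named in the text.

```python
# Restricted ("adjectival") case set used by preposed modifiers, as described
# in the text: nominative-genitive-instrumental (NGI), dative-transformative
# (DT) and vocative. The labels INS and TRANSF are assumptions of this sketch.
RESTRICTED = {
    "NOM": "NGI", "GEN": "NGI", "INS": "NGI",
    "DAT": "DT", "TRANSF": "DT",
    "VOC": "VOC",
}


def gloss_np(head, case, pre=(), post=(), plural=False):
    """Return the gloss line of an NP.

    head   -- gloss of the head nominal, e.g. "woman"
    case   -- full case of the NP, e.g. "DAT"
    pre    -- preposed modifiers as (gloss, kind) pairs
    post   -- postposed modifiers as (gloss, kind) pairs
    plural -- whether the head carries the plural marker
    """
    if case not in RESTRICTED:
        raise ValueError(f"case {case!r} is not covered by this sketch")
    full = ("PL-" if plural else "") + case
    words = []
    for stem, kind in pre:
        if kind == "dem":
            # Demonstratives only distinguish nominative and oblique.
            words.append(f"{stem}.{'NOM' if case == 'NOM' else 'OBL'}")
        elif kind == "gen":
            # Preposed genitives take no further case ending.
            words.append(f"{stem}-GEN")
        else:
            words.append(f"{stem}-{RESTRICTED[case]}")
    words.append(f"{head}-{full}")
    for stem, kind in post:
        # Postposed modifiers carry the full case/number ending of the head.
        base = f"{stem}-GEN" if kind == "gen" else stem
        words.append(f"{base}-{full}")
    return " ".join(words)


print(gloss_np("woman", "DAT",
               pre=[("that", "dem"), ("two", "other"), ("nice", "other")]))
print(gloss_np("father", "DAT", post=[("David", "gen")]))
```

The two calls print the gloss lines of (1 c) and (3 b) respectively. Treating the full ending as a property of the right edge of each NP, rather than as agreement, is the design choice that the sketch shares with the analysis developed in this section.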

Whenever a modifier is discontinuous with its head, no matter in which order, it copies the head's case and number endings. The simplest way of treating these facts is to propose that each NP has a 'full' case/number ending which marks its right boundary. All modifiers precede their head nominals and cannot be postposed to them or be discontinuous. All examples which look like instances of modifier postposing as in (2) or (3 b) or discontinuity are better described as separate NPs, each having its own 'full' case/number.

In order to discover the structure of such 'paired' NPs, let us look at some more obvious cases when head nominals are deleted under coordination. Consider the examples in (4)-(6) which illustrate respectively simple coordinate reduction (4), reduction on the right periphery of the second coordinated clause (5), and gapping (6).²

(4) K'arg-i amb-eb-i v-ici,
    good-NGI news-PL-NOM 1SG.SUB-know
    cud-eb-i ar m-inda.
    bad-PL-NOM not 1SG.SUB-want
    'I am aware of the good news, and I want no bad.'

(5) Sen g-esinia cem-i megobr-eb-is,
    you 2SG.SUBJ-fear my-NGI friend-PL-GEN
    me k'i sen-eb-is.
    I CNJ your-PL-GEN
    'You are afraid of my friends, and I am of yours.'

(6) a. Me v-k'itx-ulob cem-Ø c'ign-s,
       I 1.SBJ-read-PRS my-DT book-DAT
       sen k'i giorg-is c'ign-s.
       you CNJ George-GEN book-DAT
       'I read my book, and you George's book.'
    b. ... sen k'i giorg-isa-s
       you CNJ George-GEN-DAT
       '... and you George's.'

The head noun deleted under coordination strands its case and number markers on the modifier (or, in other terms, only the stem, but not the case ending, undergoes deletion). Exactly the same occurs, as we have seen above, with all modifiers (except relative clauses) that follow their heads or precede them discontinuously. The question arises whether the deletion account can be generalized to other cases where no coordination is involved. Since most modifiers can occur alone, without head nominals, anyway, this hypothesis seems redundant. The only modifier which always requires a head nominal is the genitive. For non-coordinated genitives, the deletion hypothesis seems plausible: as shown in (7), the head noun of a genitive deletes under coreference with a preceding nominal, stranding its case marker on the modifier.

(7) Cem-i deda-Ø mo-vid-a da
    my-NGI mother-NOM PREV-come-AOR.3SG.SUBJ and
    Ø-elaparak'-a nino-sa-s.
    3.OBJ-speak-AOR.3SG.SUBJ Nino-GEN-DAT
    'My mother came and had a conversation with Nino's (mother).'


The same holds for genitives that belong to the predicate NP, as illustrated in (8).

(8) Megobr-eb-i giorgi-s-eb-i ar-ian.
    friend-PL-NOM George-GEN-PL-NOM be-PRS.3
    'The friends are George's.'

The assumption proposed above enables us to avoid a more traditional description via agreement which requires postulating that a nominal in the genitive case takes another case ending from its head. Instead, all instances of modifiers not immediately preceding their semantic heads may be regarded as NPs with deleted heads.

2.3. Avar-Andi and Tsez languages: postposing of contrastive modifiers

In most languages of the East Caucasian (Nakh-Daghestanian) family, especially in the Avar-Andi-Tsez group, an NP need not be strictly head-final, although modifiers are postposed infrequently. Rather, the order of constituents in the noun phrase reflects the degree of their contribution to the identification of the NP's referent: the more identifying the constituent with respect to the property it conveys and the referent that is to be denoted, the further to the right it can be placed. The noun, being normally the most identifying constituent, occupies the rightmost position; other modifiers which are usually less identifying precede it and may be postposed only under contrast, i.e. if the referent of the NP is contrasted with some other referent that has a different characteristic. In that case the characteristic denoted by the contrasted modifier becomes crucial for the identification of the referent and may be placed postnominally, as is the adjective in the Andi (9 b).

(9) Andi
    a. Hac'a k'otu b-iq-o.
       white horse 3CL-come-AOR
       'The white horse came.'
    b. K'otu hac'a b-iq-o.
       'The white horse came (of several horses, the one which is white).'

In some of the Avar-Andi-Tsez languages, there are modifier categories which may be postposed even if not contrastive or focal. A case in point is that of the relative clause in Godoberi (Avar-Andi group, data from Kazenin (forthc.)) which, though typically prenominal as in (10 a), may occur postnominally as in (10 b).

(10) Godoberi
    a. Di-ra haYa mahackala-jalda
       I-AFF see.AOR Makhachkala-LOC
       j-i-h-i-bu jaci-Ø.
       2CL-live-PAST-PRT sister-ABS
       'I saw my sister who lived in Makhachkala.'
    b. Di-ra jaci-Ø haYa mahackala-jalda j-i-h-i-bu
       I-AFF sister-ABS see Makhachkala-LOC 2CL-live-PAST-PRT

Typically, however, modifiers follow their head nouns only under contrast or emphasis. In (11) we see this in relation to the relative clause, in (12) in relation to the numeral and in (13) in relation to the genitive.

(11) Chamali
    a. Imu-d d-i'a-da χο§3-Ø wuk.
       father-ERG 4CL-bring-PRT book-ABS fall.AOR
       'The book that Father had brought fell down.'
    b. XOsa-Ø imu-d d-i'a-da wuk.
       book-ABS father-ERG 4CL-bring-PRT fall.AOR
       'The book that Father had brought (e.g. that one of several other books) fell down.'

(12) Bezhta
    a. Lana suk'o Ø-oq'-ojo.
       three man 1CL-come-AOR
       'Three men came.'
    b. Suk'o tana Ø-oq'-ojo.
       'THREE men came (not two and not four etc.).'

(13) Andalal dialect of Avar
    a. Dir wac-asul ?adi-Ø j-ch-ana.
       my brother-GEN wife-ABS 2CL-come-AOR
       'My brother's wife came.'
    b. £adi-Ø dir wac-asul j-eh-ana.
       wife-ABS my brother-GEN 2CL-come-AOR
       'My brother's wife came (not a wife of someone else).'


Whenever the contrastive interpretation is blocked by the context, as in (14), the acceptability of the marked order becomes questionable.

(14) Andalal dialect of Avar
    a. Hanib Yumro habu-dala bercina-j jas-ai.
       here life do-PRS nice-2CL girl-ERG
       'A nice girl lives here.'
    b. ? Hanib Yumro habu-dala jas-ai bercina-j.
       here life do-PRS girl-ERG nice-2CL

Given that preposed modifiers may but need not be contrastive, the distribution of pre- and postposed modifiers shows unambiguously that head-initial orders are marked. Andi differs from the other languages mentioned in that it has a special marker (-Rib) for contrastive, or restrictive, modifiers (Boguslavskaja 1989). As evinced by the previously cited (9 b), this special marker is not obligatory in all cases of contrastiveness. Since contrastive modifiers allow both orders whereas non-contrastive modifiers always precede their heads, we would expect modifiers marked by -Rib to be able to either follow or precede their heads. This is indeed so, as (15) illustrates.

(15) a. hac'a-Rib k'otu b-iq-o
        white-CTR horse 3CL-come-AOR
     b. K'otu hac'a-Rib b-iq-o
        '(Of several horses,) the white horse came.'
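The distribution described so far amounts to a simple rule: the head-final order is always available, while the head-initial order requires contrast, unless the modifier category is one that a given language allows to be postposed freely (as with Godoberi relative clauses). A minimal Python sketch of that rule follows; the function name `admissible_orders` and the flag `freely_postposable` are illustrative assumptions, and the sketch ignores gradient acceptability.

```python
def admissible_orders(modifier, head, contrastive, freely_postposable=False):
    """List the modifier/head orders allowed by the rule sketched in the text."""
    orders = [(modifier, head)]          # unmarked head-final order
    if contrastive or freely_postposable:
        orders.append((head, modifier))  # marked head-initial order
    return orders


# Andi 'white horse', cf. (9) and (15)
print(admissible_orders("hac'a", "k'otu", contrastive=False))
print(admissible_orders("hac'a-Rib", "k'otu", contrastive=True))
```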

Whether a postposed contrasted modifier in the Avar-Andi-Tsez languages constitutes a separate NP resulting in a flat structure, as is the case in Georgian, is not clear; evidence can be found both for and against this hypothesis. In the Avar-Andi-Tsez languages, as in the Kartvelian languages, modifiers show a restricted set of inflectional forms, if any at all, compared to head nouns. But, unlike in Kartvelian, they do not differ in morphological properties with respect to the position they occupy relative to the head. Thus, for example, in Bezhta a relative clause modifier in the form of a participle takes, like all modifiers, one of the two case forms: direct (if the head is in the absolutive) or oblique (if the head is in any other case). In (16 a), for instance, since the head noun özdil 'the boy' is in the dative, i.e. an oblique case, the oblique ending -la is required for the participle.

(16) Bezhta
    a. Di-l Ø-iq'e-ca-la özdi-l
       I-DAT 1CL-know-PRT-OBL boy-DAT
       t'ek-Ø j-Tqo-s.
       book-ABS 4CL-get-PRS
       'The boy whom I know got the book.'

When postposed, however, the relative modifier keeps the same binary direct vs. oblique form distinction; a participle cannot take the dative, nor any other case marker, unless it is the head itself, cf. (16 b):

    b. Özdi-l dil Ø-iq'e-ca-la (*Ø-iq'e-ca-l) t'ek j-Tqos.
       (*1CL-know-PRT-DAT)
       'The boy WHOM I KNOW (e.g. of several boys) got the book.'

Applying the usual tests for constituency to NPs with postposed modifiers we arrive at contradictory results. Consider, for instance, the Shared Constituent Coordination test in (17) as applied to Bezhta. Whereas a noun with a preposed adjective can become the target of the type of deletion illustrated in (17 a), a noun with a postposed adjective cannot; (17 b) is ungrammatical. This suggests that the noun and postposed adjective in (17 b) do not form a constituent.

(17) Bezhta
    a. Özdi b-aq'o-t abo b-oc'-ijo
       boy.ERG 3CL-bring-GER father.ERG 3CL-drive.away-AOR
       häldijo gedo-Ø.
       white cat-ABS
       'The boy brought, and Father drove away the white cat.'
    b. *Özdi b-aq'o-t abo
       boy.ERG 3CL-bring-GER father.ERG
       b-oc'-ijo gedo-Ø häldijo.
       3CL-drive.away-AOR cat-ABS white

The same holds true for the noun and a postposed relative clause modifier. Again, while (18 a) is grammatical, (18 b) is not.

(18) a. Özdi b-aq'o-i abo b-oc'-ijo
        boy-ERG 3CL-bring-GER father.ERG 3CL-drive.away-AOR
        χο-Ø j-üq-ijo gedo-Ø.
        meat-ABS 4CL-eat-PRT cat-ABS
        'The boy brought, and Father drove away the cat that had eaten the meat.'
     b. *Özdi b-aq'o-i abo b-oc'-ijo
        boy-ERG 3CL-bring-GER father.ERG 3CL-drive.away-AOR
        gedo-Ø χο-Ø j-üq-ijo.
        cat-ABS meat-ABS 4CL-eat-PRT

However, other tests, like the ability of single constituents to undergo simple coordination, when applied to the same categories of modifiers suggest that the noun and postmodifier do form a constituent. For example, in Bezhta, NPs with postposed relative modifiers can be conjoined under simple coordination.

(19) Bezhta
    Hok-co okko-na mi j-aq'o-jo
    he-ERG money.ABS-CNJ you 4CL-bring-PRT
    t'ek-la-na do j-aq'o-wa: j-owax-ijo.
    book-PL-CNJ I 4-bring-PRT.PL 4CL-take:PL-AOR
    'He took the money that you had brought and the books that I had brought.'

2.4. Binary branching structures

In Daghestanian languages that allow postposition as an option for some, or all, modifiers, it is impossible to postpose a modifier if it is itself an NP with a postposed modifier. Thus, for example, in Godoberi (Kazenin forthc.), a genitive can follow its head (20 b) unless its own modifier is postposed as in (20 c).

(20) Godoberi
    a. [[[wasu-λ'i] imu-λ'i] hanq'u]
       boy-GEN father-GEN house
       'the house of the boy's father'
    b. [hanq'u [[wasu-λ'i] imu-λ'i]]
       house boy-GEN father-GEN
    c. ?? [hanq'u [imu-λ'i [wasu-λ'i]]]
       house father-GEN boy-GEN

Another case in point, this time from Andalal, is illustrated in (21).

(21) Andalal
    a. [[Wac-asul [son sahara-iasa weh-ara-w]]
       boy-GEN yesterday city-LOC come-PRT-3CL
       cu] iut-umo a-na.
       horse run-GER go-AOR
       'The horse of the boy who had come from the city yesterday, ran away.'
    b. *[cu [wac-asul [son sahara-?asa
       horse boy-GEN yesterday city-LOC
       weh-ara-w]]] tut-umo a-na
       come-PRT-3CL run-GER go-AOR
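The restriction illustrated in (20) and (21) can be stated as a check on nested structures: a postposed modifier may not itself contain a postposed modifier. Below is a minimal Python sketch of that check, under the simplifying assumption that each NP has at most one modifier; the class `NP` and the functions `linearize` and `violates_double_postposing` are hypothetical names introduced for illustration, and glosses stand in for the word forms.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class NP:
    head: str
    modifier: Optional["NP"] = None
    postposed: bool = False  # True if the modifier follows the head


def linearize(np):
    """Return the surface order of an NP as a list of words."""
    if np.modifier is None:
        return [np.head]
    inner = linearize(np.modifier)
    return [np.head] + inner if np.postposed else inner + [np.head]


def violates_double_postposing(np):
    """True if a postposed modifier itself contains a postposed modifier."""
    mod = np.modifier
    if mod is None:
        return False
    if np.postposed and mod.modifier is not None and mod.postposed:
        return True
    return violates_double_postposing(mod)


# The three Godoberi orders in (20), using glosses instead of word forms.
a = NP("house", NP("father-GEN", NP("boy-GEN")))
b = NP("house", NP("father-GEN", NP("boy-GEN")), postposed=True)
c = NP("house", NP("father-GEN", NP("boy-GEN"), postposed=True), postposed=True)

for np in (a, b, c):
    print(" ".join(linearize(np)), violates_double_postposing(np))
```

Only the third structure, corresponding to (20 c), is flagged. The check is categorical, whereas the judgements reported above are gradient ('??' vs. '*'), so it is a rough approximation of the pattern rather than a model of acceptability.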

The constraint on double postposing may receive a functional explanation. For obvious pragmatic reasons, there can be no more than one contrastive focus within a single NP. Any dependent of an NP may move to its right except one that is contained in another, less inclusive NP. The latter cannot be moved either alone (crossing branches) or together with its closest head, if the latter precedes it. In Bezhta, such double postposing results in a Georgian-like copying of the morphological form of the postposed head noun onto its modifier, which is also postposed.

(22) Bezhta
    [Biλo [ist'i-s [iX'e saharba-'
    house brother-GEN last.year city-LOC
    eλ'-eja-s (*eλ'-ej-a)]]] j-ek'e-jo.
    go-PRT-GEN (go-PRT-OBL) 2CL-burn-AOR
    'The house of my brother who had left for the city in the last year, burnt.'

Instead of the regular oblique ending -a, the participle modifying the postposed genitive inflects like a head noun, with the genitive ending -s. Therefore, an analysis similar to that proposed above for Georgian is required for (22), since the relative clause is a separate NP in apposition to its semantic head.

Turning to binary branching structures consisting of categories other than NPs, with some categories, many SOV languages not only allow double head-initial orders but favour them. This is consistent with the predictions made by Hawkins's (1990, 1994, this volume) Early Immediate Constituent (EIC) recognition hypothesis according to which such structures are favoured due to their high EIC values. The EIC hypothesis predicts that there is a universal performance-based preference for orders that provide the quickest parsing procedure. Hawkins designed a formal method for determining which of several ordering options is the optimal for parsing. Leaving aside technical details (see Hawkins this volume), each order for a given constituent can be associated with a quantitative value (called the EIC percentage) which determines comparatively their scale of acceptability. Informally, this value corresponds, very roughly, to the distance, in terms of words, between the head of a given constituent and the head of the constituent immediately embedded in it.

In his publications Hawkins provides evidence for the EIC from binary branching structures involving the set of language universals which underlie his earlier Cross-Category Harmony principle (Hawkins 1983). He does not extend his hypothesis to the issue of the acceptability of different orders in binary branching structures. However, as I will show below, the EIC hypothesis can happily accommodate the acceptability of different orders in binary branching structures in the Caucasian languages.

Consider the sentences in (23) from Bezhta, listed according to their relative acceptability, which show the four logically possible orders in a doubly embedded infinitival construction, namely: a) [[in-order-not-to-miss... to-go] I want]; b) [I want [to-go in-order-not-to-miss...]]; c) [I want [in-order-not-to-miss... to-go]]; d) [[to-go in-order-not-to-miss...] I want]. Two infinitival clauses are embedded into one another: one is the most common form for complement clauses (-al), the other is a final infinitive (-tRa) 'in order to...'. The EIC values for each order are given separately for the VP and for S' constituents.³

(23)

a. Dil [[[sa'at UnX'a meX'e-ca-la pojezba-X'a
   I hour five go-PRT-OBL train-LOC
   hakote-'efHa], eX'-al]S' at'na gej]VP
   miss-NEG-FIN.INF go-INF want AUX
   'I want to leave in order not to miss the train that departs at five o'clock.'
   VP: EIC = 100% S': EIC = 100%
b. Dil [VP at'na gej [S' ελ'-al [sa'at iinX'a
   I want AUX go-INF hour five
   meX'e-ca-la pojezba-X'a hakote-'e-fHa]]]
   go-PRT-OBL train-LOC miss-NEG-FIN.INF
   VP: EIC = 72% S': EIC = 44%
c. Dil [VP at'na gej [[ sa'at iinX'a meX'ecala
   I want AUX hour five go-PRT-OBL
   pojezbaX'a hakote-'e-ftla ]] eX'al S']]
   train-LOC miss-NEG-FIN.INF go-INF
d. */?? dil [[S' eX'al [ sa'at iinX'a meX'ecala
   I go-INF hour five go-PRT-OBL
   pojezbaX'a hakote-'e-fHa ]] at'na gej VP]
   train-LOC miss-NEG-FIN.INF want AUX
   VP: EIC = 39% S': EIC = 44%

As shown by the EIC values calculated for (23 a) through (23 d), for binary branching structures the EIC makes a paradoxical prediction which is nevertheless borne out: two "wrongly", i. e. head-initially, ordered constituents embedded in each other, as in (23 b), give a higher EIC value and are more acceptable in this order than when only one of these two constituents is head-initial, as in (23 c). The lowest EIC values, and the least acceptable orders, obtain when the larger constituent is head-final and the embedded constituent is head-initial, as in (23 d). Again this is in line with the predictions of the EIC, because in such instances the distances between the main and embedded heads are maximal. The branching part of both the VP and S', namely the final infinitival clause, may be enlarged, increasing the distance between the heads in (23 c, d); this clause can itself be head-initial, but those cases are not considered here.

As shown above, word order in embedded structures in the languages under consideration is free and the distribution of acceptability judgements obeys the predictions of the Early Immediate Constituent theory. If both constituents in a binary branching structure are NPs, orders like (23 b) are totally ungrammatical because of the impossibility of crossing two NP boundaries. This restriction outweighs their comparatively high EIC values, cf., for example, (21) above. In Turkic languages, head-initial orders in NPs are, as a rule, ungrammatical. Given that most embedded clauses are NPs in Turkic, constructions like (23 b) are not possible either, as illustrated in (24) on the basis of Tatar.

(24)

Tatar
a. Sagat 5-te tor-gan pojez-ga kicek-mek öcen kit-arga teli.
   hour five go-PRT train-DAT miss-INF for go-GER want
   'He wants to leave in order not to miss the train that departs at five o'clock.'
b. *teli kitarga sagat 5-te torgan pojezga kicekmek öcen

In Turkic languages, the position immediately preceding the verb is the focus position. The data from Turkish presented by Erguvanli (1984) offer sufficient evidence that there is no fixed sentence position for the topic, whereas the focus is placed before the verbal predicate. If only a part of an NP is focused, the whole NP moves into this linear position, as in the second clause in (25) (Erguvanli 1984: 36).

(25)

Turkish
— Baba-n-a şarab-ı yeni bardak-la ver-di-n mi?
  father-POSS2-DAT wine-ACC new glass-INST give-PAST-2SG.SBJ QUEST
  'Did you give the wine to your father in/with the NEW glass?'
— Hayir, eski bardak-la ver-di-m.
  no old glass-INST give-PAST-1SG.SBJ
  'No, I gave (it to him) in the OLD glass.'

Here the whole NP is subject to focus movement, although only a part of it is in the scope of focus. Therefore a pied piping analysis, involving movement of the whole NP, seems unavoidable. In the binary branching structures, investigated in much detail by Erguvanli, a nominal dependent can leave the NP it belongs to if it is a topic of a sentence, but cannot if it is its focus. Given that the considered subordinate clauses in Turkish are NPs, it is not surprising that 'focusing does not operate across clause boundaries' (Erguvanli 1984: 97).

(26)

a. [Ben her gün on-un-la görüş-mek]-ten sikil-iyor-um.
   I every day he-GEN-with see-INF-ABL bored-PROG-1SG.SBJ
   'I am tired of seeing him every day.'
b. *[Ben gün görüş-mek]-ten on-un-la sikil-iyor-um
   I everyday see-INF-ABL he-GEN-with bored-PROG-1SG

Again, we see that in SOV languages, sentences containing focused or contrasted constituents show regular correspondences to their counterparts where the same constituents are not focused or contrasted. Those correspondences can be described via a rule of focus or contrastive movement. NP boundaries are in this case subjacent nodes, i. e. they can bar the movement: in some languages, the focused constituent cannot pass over one NP boundary; in others, two NP boundaries block the focus or contrastive movement. The double head-initial order is always ungrammatical, even in cases when the single-step crossing of the two boundaries could be avoided cyclically.

3. Focus constructions in East Caucasian languages

Below I will use the label 'focus construction' in a restricted sense, namely for a class of constructions in which special markers are used to express different kinds of focus and in which, very often, the syntactic structure is different from that of their non-focused counterparts. My understanding of 'focus' corresponds in the main to that of Kiss (this volume). Described first in the Lak and Dargin languages by Said Khaidakov (Xajdakov 1986), focus constructions have recently become a point of keen interest for students of the syntax of Daghestanian languages. Focus words (they may be pronouns, particles, or auxiliaries) obligatorily follow the scope of the focus or a constituent that contains the scope, as in (27).

(27)

Bezhta
a. hugi [t'ek-ka:'] qow-al 0-oq'o-jo.
   he book.ABS-FOC read-INF 1CL-come-AOR
b. hugi [t'ek qow-al-la:'] 0-oq'o-jo.
   he book.ABS read-INF-FOC 1CL-come-AOR
   'He came in order to read THE BOOK.'

In both (27 a, b), the scope of the focus word is the same, but in (27 a) the focus marker -Ca:' (where C is the preceding consonant) follows the NP in scope, whereas in (27 b) it follows the infinitival clause, a larger constituent containing the NP under focus. The set of focus words may include a genuine focus (or: comment) particle (or a clitic, or an auxiliary verb), a yes/no question focus particle, a negation particle, and WH-pronouns. In most Daghestanian languages, not all of the functional units mentioned here require a special focus construction. In some, for example in Tsez, there seem to be no focus constructions different from standard ones at all.

The structure of focus constructions resembles that of cleft sentences, i. e. a finite clause transformed into two clauses. The main clause consists of a copula verb with two NP arguments: one NP under focus, and another modified by a relative clause (sometimes with a null or pronominal head). Compare 'John goes', 'It is John who goes', and 'The one who goes is John'.

Consider first non-focus constructions with a copula verb. The copula verb with the meaning 'to be' takes two dependents, one of them obligatorily an NP, and the other a phrase which may be an NP as well, or a postpositional phrase, or any other kind of adverbial expression. As in many SOV languages, the phrase which occurs closer to the finite verb (here the copula) is usually, although not always, the focus, whereas the other phrase is the topic. If we were to postulate pragmatic functional nodes for the copula construction, the straightforward solution would be to regard the topic phrase as the specifier and the focus phrase as the complement of the copula, schematically as in (28).

(28)

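[Tree diagram: CopulaP, with the topic phrase as specifier and the focus phrase ('in the house') as complement of the copula ('is')]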

In Bezhta the two orders shown in (29) are possible, with a focus interpretation of the preverbal phrase favoured in both.

(29)
a. oze biXo-' gej
   boy.ABS house-LOC be.PRS
b. Βϊλο-' ze gej.
   house-LOC boy.ABS be.PRS
   'A boy is in the house.'

Now if T or P or both are relative clauses we may get a kind of cleft construction. Focus words behave somewhat like the copula in cleft constructions, i. e. they take two dependents, one of them a topical relative clause, the other the focused constituent moved out of that clause. Here the Topic-Specifier and Focus-Complement treatment seems less convincing, however: we would expect rather that the Specifier be the landing site for the movement, whereas a relative NP is a typical branching Complement. To avoid the difficulties with the straightforward application of the double-bar hypothesis to the material of languages with free word order at the clause level, below I will present the two dependents of the focus word in the form of a flat structure.

3.1. Focus constructions in Godoberi

In Godoberi, according to data presented in Kazenin (forthc.), focus constructions different from standard ones occur only with analytical verbal forms that make use of the auxiliary. The auxiliary verb is itself a focus marker: the constituent it is encliticized to, unless it is the main verb, is a focus of contrast (Kazenin 1993: 1-2). We see this in (30 b, c), where the auxiliary is encliticized to 'Ali' and 'house' respectively.

(30)

Godoberi
a. Tali-di hanq'u-0 bi%-ata buk'a.
   Ali-ERG house-ABS build-GER AUX.PAST
   'Ali was building a house.'
b. ?ali-di buk'a hanq'u-0 bi
   Ali-ERG AUX.PAST house-ABS build-GER
   'It was Ali who was building a house.'
c. Tali-di hanq'u-0 buk'a bix-ata.
   Ali-ERG house AUX.PAST build-GER
   'It was a house that Ali was building.'

In WH-questions the auxiliary is deleted, and the interrogative clitic can be attached to the WH-pronoun. Kazenin concludes that both focus words (i. e. the interrogative clitic and the auxiliary), being in complementary distribution, belong to the same syntactic category, which he calls I(NFL). If an NP of a complement infinitival clause is in the scope of focus, a focus word can be attached either to the head of this NP or to the infinitive, as is the case in Bezhta (27), given earlier, and in (31) (Kazenin 1993: 7).4

(31)

Godoberi
a. Quca-da hosu-ti idai-ata mahamadi-ti iki.
   book-FOC he-DAT want-GER Muhammad-DAT give.INF
b. Quca mahammadi-ti iki-da hosu-ii idat-ata.
   book Muhammad-DAT give.INF-FOC he-DAT want-GER
   'He wants to give Muhammad the BOOK.'

Since in (31 b) the focus word marks the whole infinitival clause though only a part of it is in focus, we may expect more than one interpretation for this sentence. It is indeed so: (31 b) can also have the meaning 'He wants to give the book to MUHAMMAD' and 'It is TO GIVE MUHAMMAD A BOOK that he wants to do'; in the latter case the whole clause is in focus. Kazenin points out that the focused constituent usually occurs sentence-initially whereas the larger constituent it belongs to may be left in situ, as it is in (31 a). If quca-da 'book-FOC' still belonged to the infinitival clause, the latter would be split in two by the main clause, which is impossible. He therefore proposes an account via movement of the focused constituent into the S-dependent position. The focus movement meets the subjacency requirements; in particular, it is also impossible in the following three cases:

i) If a whole NP is in focus, the focus word is attached to its head, but if the scope is only a part of an NP, the focus word still occurs after the head. In other words, NPs behave like islands with respect to focus movement (Kazenin 1993: 4). This is demonstrated in (32).

(32)

a. Ιηλ:38υςυδ3^υ hosty ik-ata mahamadi-h'.
   which book-FOC he.ERG give-GER Muhammad-DAT
   'Which book is he giving to Muhammad?'
b. *ΐηλ:35υ^υ quca hosty ik-ata mahamadi-h'
   which-FOC book he.ERG give-GER Muhammad-DAT

ii) If the focused constituent is within a relative clause, the focus word is also attached to the head noun. (33)

a. Hosty te q'ardi-bu quca-wu bal-ata.
   he.ERG who.ERG write-GER book-FOC read-GER
   'He is reading a book written by whom?'
b. *hosty te-wu qardi-bu quca bal-ata
   he.ERG who.ERG.FOC write-PRT book read-GER

iii) Godoberi also makes use of a 'normal' cleft construction; if the scope of the focus falls on some constituent within the relative S' in a cleft construction, that constituent cannot be moved out of the S'; the focus marker can only be attached to its participial head.

In order to account for the facts in i) to iii), Kazenin proposed a pied-piping analysis. In terms somewhat different from his own, Kazenin's proposal is as follows. A constituent can move into the focus position immediately dominated by S. The larger constituent containing the focused material can also move into the focus position (33 a). The latter becomes obligatory if the scope of the focus is a part of a constituent belonging to a special class ('islands') which prohibits movement outside itself. Consider the structure for (32) presented in (34).

(34)

[Tree diagram, including a node labeled NP/S'. Terminal elements with their glosses: (which); quca-wu 'book-QUEST'; hosty 'he'; ika-ta 'give-PRT'; mahamadi-h 'Muhammad-DAT']

'Which book is he giving to Muhammad?' (recall that the question enclitic -wu belongs to the class of focus markers)

The node labeled here as NP/S' represents a mixed category displaying some properties of NP and some of S', which is rather common with nominalizations. Its NP-like properties are of special interest. Its most significant clause-like property is its word order, which is not as free as in a 'genuine' finite clause (e. g. verb-initial orders are ungrammatical) but nevertheless not as rigid as in a 'genuine' NP. In other East Caucasian languages, however, some focus constructions may include non-mixed NPs in the same structural position. Such is the case, for instance, in Lak: in (35 b) the focused constituent can be inserted into the normal converb phrase NP/S' (-s:a), whereas in (35 d) it cannot be inserted into the restrictive converb phrase NP (-ma). We see therefore that in the Lak focus construction, normal converbs behave more like finite clauses, but restrictive converbs behave more like genuine NPs with nontransparent boundaries (examples kindly provided by K. Kazenin; in Lak, subject agreement clitics, here -r(i) 3SG.SBJ, function as focus markers):

(35)

Lak
a. rasull-ul cu uwk-siwu busan-t'i-s:a-r.
   Rasul-ERG who come-COMP say-FUT-PRT-3SG.SBJ
b. rasull-ul cu-ri uwk-siwu busan-t'i-s:a.
   Rasul-ERG who-3SG.SBJ come-COMP say-FUT-PRT
c. cu-ri rasull-ul uwk-siwu busan-t'i-ma.
   who-3SG Rasul-ERG come-COMP say-FUT-RESTR
d. *rasull-ul cu-ri uwk-siwu busan-t'i-ma.
   Rasul-ERG who-3SG come-COMP say-FUT-RESTR
   'Rasul says that who will come?'

Given that the cleft sentences as complements in focus constructions constitute another relative NP dominated by the relative NP created by the focus, it turns out that the 'island' class consists solely of NPs. We see therefore that no focused constituent can pass over two NP boundaries. A generalized view of (34) is depicted in (36).

(36)

[Tree diagram with the nodes F (focus position), I (focus word), NP and S']

X in (36) belongs to the lower NP and can be moved into the focus position only together with the whole of that NP.
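The generalization that a focused constituent may not pass over two NP boundaries, and that a larger constituent is pied-piped instead, can be illustrated with a small Python sketch. This is only an illustration under stated assumptions, not Kazenin's or the present author's formalism: the function name `pied_piping_depth`, the representation of a structure as a list of node labels, and the treatment of the mixed node NP/S' as an ordinary NP are all hypothetical simplifications.

```python
def pied_piping_depth(ancestors, max_np_boundaries=1):
    """Return how many dominating nodes must move along with the focused item.

    `ancestors` lists the labels of the nodes dominating the focused item,
    innermost first, up to (but not including) the root S.  A result of 0
    means the focused item can move alone; a result of k means that the
    k-th dominating node has to be moved (pied-piped) as a whole.

    Assumption of this sketch: only nodes labeled "NP" count as boundaries,
    and at most `max_np_boundaries` of them may be crossed.
    """
    for depth in range(len(ancestors) + 1):
        # Moving the constituent at this depth crosses every node above it.
        crossed = sum(1 for label in ancestors[depth:] if label == "NP")
        if crossed <= max_np_boundaries:
            return depth


# A focused object of an infinitival clause inside the relative NP of the
# focus construction crosses one NP boundary and can move alone.
print(pied_piping_depth(["S'", "NP"]))

# A focused modifier inside an NP inside that clause would cross two NP
# boundaries, so the NP that contains it is moved instead.
print(pied_piping_depth(["NP", "S'", "NP"]))
```

Setting `max_np_boundaries` to 0 would model the stricter type of language mentioned above, in which a focused constituent cannot pass over even one NP boundary.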

3.2. Focus constructions in the Andalal dialect of Avar

I shall now discuss the idiom which to my knowledge displays the widest set of focus constructions, the Andalal dialect of the Avar language.5 All focus constructions show remarkable uniformity and can be easily distinguished from non-focused ones. In Andalal, the set of focus words includes at least the following units:

i) the (contrastive) focus enclitic -χα;
ii) the yes/no question focus enclitic -la;
iii) the negation focus marker guro (a stressed word);
iv) all interrogative pronouns like su-C 'what' or ki-C 'where'; -C denotes here the position for the class agreement suffix.

As in Godoberi, focus words require a cleft-like structure with two dependents: the focus position to which the focused constituent is moved, and a topical dependent in the form of a headless relative clause. This is shown in (37) with the contrastive focus enclitic -χα, where the focus constituent 'was-as' together with its case ending has been raised into the finite clause.

(37)

a. was-as-Xa zundergo masina tunk-ara-b
   boy-ERG-FOC his car break-PAST-PRT.3CL
   'It was the boy who broke his car.'
   [Category labels in the original diagram: N (was-as), D (zundergo), N (masina), V, I and C (tunk, -ara, -b)]

That (37 a) is indeed a cleft-like construction with tunkarab functioning as a participle in a headless relative clause rather than as a finite verb is evinced by the word order restriction of the constituents in focus constructions as compared to those in non-focus ones. The nonfocus version of (37 a) is (38 a).

(38)

a. Was-as zundergo masina tunk-ana.
   boy-ERG his car break-AOR
   'The boy broke his car.'
b. tunkana wasas zundergo masina.
   break-AOR boy-ERG his car
c. tunkana zundergo masina wasas.
   break-AOR his car boy-ERG

As shown in (38 b, c), in the nonfocus clause both verb-initial orders are grammatical, although, of course, marked and restricted in use. By contrast, in the focus construction the participle tunkarab cannot occur initially. Note the ungrammaticality of (37 b, c).

(37)
b. *tunk-ara-b was-as-xa zundergo masina
   break-PAST-PRT boy-ERG-FOC his car
c. *tunk-ara-b zundergo masina was-as-xa
   break-PAST-PRT his car boy-ERG-FOC

Word order in relative clauses in Andalal is strictly verb-final, as evinced by (39 a, b) with a relative clause, and (39 c, d) with a converb (gerundial) clause.

(39)

a. son masina tunk-ara-b haw was haniwe w-eh-ana.
   yesterday car break-PAST-PRT this boy here 1CL-come-AOR
   'Having broken yesterday his car, this boy came here.'
b. *tunk-ara-b son masina haw was haniwe w-eh-ana.
   break-PAST-PRT yesterday car this boy here 1CL-come-AOR
c. son masina tunk-umo haw was haniwe w-eh-ana.
   yesterday car break-CONV this boy here 1CL-come.AOR
d. tunk-umo son masina haw was haniwe w-eh-ana
   break-CONV yesterday car this boy here 1CL-come.AOR

The focus constituent marked with the focus word may also raise and cross the right boundary of the relative clause, as is the case in (37 d).

(37)
d. zundergo masina tunk-ara-b was-as-xa
   his car break-PAST-PRT boy-ERG-FOC

Its position is not unrestricted, however; it cannot precede the participle followed by all the nonfocus material. Hence, though sentences such as (37 e) are grammatical, they do not belong to the paradigm (37 a-d).

(37)

e. wasas-xa tunkarab zundergo masina
   boy-FOC break-CONV his car

Rather, the syntactic and pragmatic structure of (37 e) is different from the rest of the examples in (37); zundergo masina 'his car' is topical, or backgrounded, information, thus being the topical dependent of the focus word, whereas the relative clause wasas-(xa) tunkarab is the focused dependent of the focus word. We will see below that the focused dependent position may be filled with even larger constituents containing the focus scope.

The focus constructions in Andalal differ from those in Godoberi in two respects. The first, already mentioned, is that focus words in Andalal must immediately follow their scope constituent and not some larger constituent, as is common in Godoberi. The second is that in finite clauses, focus words are in complementary distribution with the copula verb -uk'-/-ug- 'to be' and with the auxiliary (which is synchronically homophonous with it). In Godoberi, by contrast, the copula-and-auxiliary clitic is itself a contrastive focus word. Given this second characteristic, the focus word is the highest predicate in a sentence. One of its dependents must be a relative NP which may include a simple verbal construction, as in (37), a construction with an auxiliary, or a cleft construction.

Some examples with focus words other than -χα are given below; they do not differ from the examples with -χα. (40) and (41) illustrate the yes/no question particle -la, where the subject (40) and the object (41) are questioned. Note that (41) is a construction with an auxiliary.

(40)

a. Was-as-la zundergo masina tunk-ara-b?
   boy-ERG-QUEST his car break-PAST-PRT
   'Was it the boy who broke his car?'
b. *tunk-ara-b was-as-la zundergo masina?
   break-PAST-PRT boy-ERG-QUEST his car
c. *tunk-ara-b zundergo masina was-as-la?
   break-PAST-PRT his car boy-ERG-QUEST
d. Zundergo masina tunk-ara-b was-as-la?
   his car break-PAST-PRT boy-ERG-QUEST
e. Zundergo masina was-as-la tunk-ara-b?
   his car boy-ERG-QUEST break-PAST-PRT
f. was-as-la tunk-ara-b zundergo masina?
   boy-ERG-QUEST break-PAST-PRT his car

(41)

a. Was-as ex-la c'al-dalab b-ug-a-b.
   boy-ERG book.ABS-QUEST read-PRT 3CL-AUX-PRS-PRT
   'Is it the book that the boy is reading?'
b. Ex-la was-as c'al-dalab b-ug-a-b.
   book.ABS-QUEST boy-ERG read-PRT 3CL-AUX-PRS-PRT
c. *ex-la c'al-dalab b-ug-a-b was-as
   book.ABS-QUEST read-PRT 3CL-AUX-PRS-PRT boy-ERG
d. Was-as c'al-dalab b-ug-a-b ex-la.
   boy-ERG read-PRT 3CL-AUX-PRS-PRT book.ABS-QUEST

In (41 c), the ergative NP cannot be postposed to the participle. Unlike the absolutive NP in (37 e), it cannot be the topical dependent of the focus word. It seems that of the two focus dependents, the topical one obligatorily takes the absolutive case. In (42) we have an auxiliary construction whose subject is negated by means of the term negation particle guro.

(42)

a. Dir wac-as guro ka"Hat c'al-dalab b-ug-a-b.
   my brother-ERG not letter read-PRT 3CL-be-PRS-PRT
   'It is not my brother who is reading the letter.'
b. KaHat dir wac-as guro c'al-dalab b-ug-a-b.
   letter my brother-ERG not read-PRT 3CL-be-PRS-PRT
c. Dir wac-as guro c'al-dalab b-ug-a-b ka"Hat.
   my brother-ERG not read-PRT 3CL-be-PRS-PRT letter
d. *c'al-dalab b-ug-a-b...
   read-PRT 3CL-be-PRS-PRT...
   etc.

And (43) features an auxiliary construction with an interrogative pronoun. (43)

a. Lita ex c'al-dala-b b-ug-a-b? (*b-ug-o)
   who.ERG book read-PRT-3CL 3CL-AUX-PRS-PRT (*3CL-PRS)
   'Who is reading the book?'

The ungrammaticality of the finite form of the auxiliary bugo in (43 a) shows that interrogative pronouns behave like constituents provided with a focus word, although they contain no separate focus marker. The expected restrictions on word order variation are met here as well, as evinced by the ungrammaticality of (43 d, e).

b. Ex iiia c'al-dala-b b-ug-a-b.
   book who.ERG read-PRT-3CL 3CL-AUX-PRS-PRT
c. Liia c'al-dala-b b-ug-a-b ex.
   who.ERG read-PRT-3CL 3CL-AUX-PRS-PRT book
d. *c'al-dala-b b-ug-a-b iita ex
   read-PRT-3CL 3CL-AUX-PRS-PRT who.ERG book
e. *c'al-dala-b b-ug-a-b ex tita
   read-PRT-3CL 3CL-AUX-PRS-PRT book who.ERG

In Andalal, as in Godoberi, focus movement out of an NP is impossible. Given that only NPs restrict the possibility of head-initial order, this means that although any constituent of an NP may be focused, no such constituent can leave more than one of the NPs that it belongs to. Consider first an infinitival clause embedded into an auxiliary construction. The focused object, which crosses only one NP boundary (that is, the boundary of the S'-dominating S'/NP of the focus construction), can be moved out of that clause (44 a) or remain in situ (44 b).

(44)
a. Sundu-ta d-eje ka"Hat x°a-de b-oX'u-mo b-ug-a-b?
   what-INSTR I-DAT letter write-INF 3CL-want-GER 3CL-AUX-PRS.PRT-3CL
   'What do I want to write the letter with?'
b. D-eje ka'Hat sunduia x°a-de boX'u-mo b-ug-a-b?
   I-DAT letter what-INSTR write-INF 3CL-want-GER 3CL-AUX-PRS.PRT-3CL

But when the genitive modifier is focused, the whole NP must undergo movement, as in (45 a); if only the modifier undergoes movement, as in (45 b), the sentence becomes ungrammatical.

(45)
a. Li-l rucka-jat d-eje ka>Iat x°a-de boX'u-mo b-ug-ab?
   who-GEN pen-INSTR I-DAT letter write-INF want-GER 3CL-AUX-PRS.PRT-3CL
   'Whose pen do I want to write the letter with?'
b. *ti-l d-eje ka^at rucka-jat x°a-de boX'u-mo b-ug-ab?
   who-GEN I-DAT letter pen-INSTR write-INF want-GER 3CL-AUX-PRS.PRT-3CL

The same holds true for embedded relative clauses. Whenever a (non-head) constituent of a relative clause is focused, it cannot be moved out of it. Compare (46 a) and (47 a) with the ungrammatical (46 b) and (47 b). (46)

a. Dir wac-asa w-ex-dala-w w-ug-a-w
   my brother-DAT 1CL-see-PRS-PRT-1CL 1CL-be-PRS.PRT-1CL
   kiwe w-a^-ana-w ci?
   where 1CL-go-PRT-1CL man.ABS
   'Where is the man going whom my brother sees?', lit. 'My brother sees a where going man?'
b. *kiwe dir wac-asa w-ex-dala-w w-ug-a-w
   where my brother-DAT 1CL-see-PRS-PRT-1CL 1CL-be-PRS.PRT-1CL
   w-a^ana-w ci?
   1CL-go-PRT-1CL man.ABS

(47)

a. Dir wac-as c'al-dala-b b-ug-a-b
   my brother-ERG read-PRS-PRT:3CL 3CL-AUX-PRS-PRT
   zundergo was-as-xa b-eh-ara-b ka^Iat.
   his son-ERG-FOC 3CL-bring-PAST-PRT letter:ABS
   'My brother is reading a letter brought by HIS SON.'
b. *zundergo was-as-xa dir wac-as c'al-dala-b b-ug-a-b...
   his son-ERG-FOC my brother-ERG read-PRS-PRT.CL 3CL-AUX-PRS-PRT...

A focused constituent within a doubly subordinated clause behaves identically. As shown in (48 b), the extraction of the interrogative pronoun into the main clause is ungrammatical.

(48)

a. Dir wac-as ii-da tik'go ta-dala-w 6
   my brother-ERG who-LOC well know-PRS.PRT-1CL
   ci-jas b-eh-ara-b ka"Hat c'al-dala-b b-ug-a-b.
   man-ERG 3CL-bring-PRT-3CL letter read-PRT-3CL 3CL-AUX-PRS.PRT-3CL
   'My brother is reading a letter brought by a man whom who knows well?'
b. *ti-da dir wac-as...
   who-LOC my brother etc.

In generative grammar such constraints on WH-extraction are usually accounted for via government requirements or via the principle of Subjacency. Assuming that the head NP inherits barrierhood from the non-theta-marked S' that it dominates, the wh-extraction would constitute a violation of Subjacency.

There are two major problems posed by the Andalal focus constructions. The first one may be solved in principle in the same way as in Godoberi, namely with reference to Subjacency and pied piping. If the scope constituent is contained in two NPs, it cannot be moved into the finite focus position; instead, the whole NP moves there. The second problem is that there can be only one focus word in a sentence (apart from marginal cases with two interrogatives like 'Who bought what?' etc.). This single focus word is the highest predicate which determines the form of the finite clause. In spite of that, the focus word must follow its scope constituent, no matter how deeply embedded, and undergo the focus movement together with it, or even with a larger constituent. This raises the question of how the focus word 'gets' into the subordinate clause given that it is the head of the finite construction, and its scope constituent cannot always move up, as is required by Subjacency. A promising solution is captured in the hypothesis in (49).

(49)

The Downward Movement hypothesis: The focus word moves down from the highest I position to the constituent which is its scope; together with this constituent, or a larger constituent, it can move up into the focus (S-dependent) position.

An alternative solution could be that suggested by Aoun & Yen-hui Audrey Li (1993) for Chinese. Aoun & Yen-hui Audrey Li have proposed that in Chinese the WH-phrase in situ is co-indexed and interpreted with respect to a nonovert question operator XP which moves to the Spec of Comp position. "As a consequence of the existence of this operator, the wh-element itself in a language like Chinese cannot be treated like an operator; it functions as a polarity item. The wh-element in English, on the other hand, functions as an operator" (1993: 235). However, since in East Caucasian languages question words are not polarity items, and no such distribution of properties between the nonovert operator and the focused item in situ might be suggested for them, extending the above analysis to our data would be at best an ad hoc solution.

3.3. Remarks on focus constructions in Basque

In Basque, another European SOV language, clausal pied piping in focus constructions has been described recently by Ortiz de Urbina (1993). Whereas extraction out of a relative clause island yields an ungrammatical sentence (50), there is another strategy involving movement of the whole embedded clause containing a WH-operator into the focus position, immediately preceding the matrix inflected verb (51 a).

(50)

*Nori irakurri duzu [ Mikelek t eman dio-n] liburua?
who read AUX Mikel give AUX-COMP book
'To whom have you read the book that Mikel gave t?'

(51)
a. [Nor etorriko d-ela bihar] esan diozu Mireni?
   who come AUX-that tomorrow say AUX Mary
   'That who will come tomorrow have you told Mary?'
b. *[Nor etorriko dela bihar] Mireni esan diozu

As shown in (51 b) no constituent can intervene between the fronted embedded clause and the matrix verb (Ortiz de Urbina 1993: 194). In cases of double embedding, the embedded clause can move higher up, undergoing successive cyclic movement from Spec, C to Spec, C (a position that Ortiz de Urbina proposes for the moved WH-operator). (52)

[Nor etorriko d-ela] esan du Mirenek t uste du-ela Peruk
who come AUX-COMP said AUX Mary think AUX-COMP Peter
'That who will come has Mary said (that) Peter thinks?'

The cyclic character of the movement resulting on each clause level in an operator (or a clause pied-piped by it) being in the highest specifier position is noteworthy, as well as the fact that the focusing category in Basque, unlike in Daghestanian, requires no additional clause level above the whole sentence. As a result, Ortiz de Urbina is not faced with the problem of 'downward movement' as in Andalal. Foci in Basque "share many of the syntactic properties of interrogative operators, triggering, in particular, adjacency to the left of the verb" (Ortiz de Urbina 1993: 196). Focus in an 'island' embedded structure pied-pipes it into the focus position in the same way as WH-operators do. (53)

a. [JON etorriko d-ela bihar] esan diot Mireni
   Jon come AUX-COMP tomorrow say AUX Mary
   'That it is Jon that will come tomorrow have I told Mary?'
b. ??JON etorriko d-ela bihar Mireni esan diot

Ortiz de Urbina concludes that 'operators in non-specifier positions seem to be able to pied-pipe the maximal projection they complement or modify' (1993: 212).

Clausal pied piping has also been reported for Imbabura Quechua, an SOV language described by Cole (1982). In Imbabura Quechua, if a dependent of a noun phrase is in the focus of a question, only the whole NP may be questioned. When elements internal to the relative clause are questioned, only clause fronting is possible (Cole 1982: 23).

4. Conclusion

In SOV languages, unlike in most SVO languages, word order seems to encode directly the relation of syntactic dependency. In an ordered pair XY where X and Y are both constituents and Y is a head, it is expected that X is a dependent of Y, whereas in SVO languages no such predictions can be made. This expectation (as well as the contrary expectation for V-initial languages) is confirmed by many instances of illusory head-dependent order where the two do not form a single constituent, e. g. a postposed modifier behaves more like a separate NP, or 'apposition' (see Rijkhoff, this volume): it may display the head morphology, may be discontinuous from its semantic head etc.

Focus may be viewed as a category which normally may occur only once within a sentence, for pragmatic reasons. In head-initial languages, it is usually represented in the finite, or the leftmost, clause. In some head-final languages, categories that tend to occur only once within a sentence, like focus, may occur within subordinate clauses, presumably because it is convenient for the speaker not to 'postpone' them until the rightmost, i. e. highest, clause. In SOV languages the subordinate clause preceding the main one acquires many characteristics which belong only to the main clause in SVO languages. A relative clause or other 'island' constituent containing the focus may be pied-piped into the main clause in order to move up the finite category. It is to be expected therefore that more data of clausal pied piping may be found in SOV languages.

Notes

1. Much of the work on this paper was done while the author was a visiting researcher at the University of Umeå, Sweden. I am very much indebted to the Svenska Institutet, which supported my research work there with a grant in November-December 1993. I am also indebted for valuable comments and corrections to Winfried Boeder, Anna Siewierska, Maria Polinsky, Anders Holmberg, and Konstantin Kazenin; all shortcomings and mistakes are my own.
2. In (4), the direct object is marked with nominative; in (5), it is in genitive; in (6), the subject is in dative; the choice of cases follows some complicated rules which are not of importance here; for details see e.g. Harris 1981.
3. The procedure of calculating the EIC values is roughly as follows. The number of immediate constituents of the VP in the main clause is two, and the parser needs to parse two words in order to discover the structure: go-INF and WANT. The average value of the number of immediate constituents discovered at a given step divided by the number of words parsed is in (23 a) the average of 1/1 and 2/2 = 100%; in (23 b) the average of 1/1, 1/2 and 2/3 = 72%; in (23 c) the average of 1/1, 1/2, 1/3, 1/4, 1/5, 1/6, 1/7 and 2/8 = 36%, etc. The average value for the first embedded clause is counted in an analogous way. For more details of this procedure see Hawkins (1990, 1994 and this volume).
4. There are two dative-marked NPs in (32) because the subject of verbs like 'want' requires the dative case, as well as the recipient of 'give'.
5. The data on the Andalal dialect of the Avar language are based on field research in the village of Sogratl (Gunibskij rayon, Republic of Daghestan) during a linguistic field trip arranged by Moscow University in July 1992. I am indebted to Prof. A. E. Kibrik who led the field trip; the research in Sogratl was also partly supported by EUROTYP.
6. The locative case in (48 b) is required by the subject of the verb lade 'to know'.
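The arithmetic behind the EIC percentages quoted in the notes can be checked with a short Python sketch. It rests on a simplified reading of the procedure: at each word parsed, the number of immediate constituents (ICs) recognized so far is divided by the number of words parsed so far, and the ratios are averaged. The function names are illustrative and are not Hawkins's; the word positions at which the two ICs are recognized (1 and 2, 1 and 3, 1 and 8) are inferred from the ratios given in the note, and the sketch assumes that the last ratio for (23 c) is 2/8, which is what yields the reported 36%.

```python
from fractions import Fraction


def ic_to_word_ratios(recognition_points):
    """Return the IC-to-word ratio at each word parsed.

    recognition_points lists the word positions (1-based) at which
    each immediate constituent is recognized by the parser.
    """
    last_word = max(recognition_points)
    ratios = []
    for word in range(1, last_word + 1):
        ics_so_far = sum(1 for point in recognition_points if point <= word)
        ratios.append(Fraction(ics_so_far, word))
    return ratios


def eic_percentage(recognition_points):
    """Average the ratios and express the result as a rounded percentage."""
    ratios = ic_to_word_ratios(recognition_points)
    return round(100 * sum(ratios) / len(ratios))


# Assumed recognition points for the three orders discussed in the note.
score_23a = eic_percentage([1, 2])  # ratios 1/1, 2/2
score_23b = eic_percentage([1, 3])  # ratios 1/1, 1/2, 2/3
score_23c = eic_percentage([1, 8])  # ratios 1/1, 1/2, ..., 1/7, 2/8
print(score_23a, score_23b, score_23c)
```

Exact fractions are used so that the rounding step is the only place where precision is lost; the further the second IC is from the first, the lower the score.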

References

Aoun, Joseph & Yen-hui Audrey Li
1993 "Wh-elements in situ: syntax or LF?", Linguistic Inquiry 24: 199-238.
Boeder, W.
1995 "Suffixaufnahme in Kartvelian", in: Frans Plank (ed.), Double Case. New York: Oxford University Press.
Boguslavskaja, Ol'ga Yu.
1989 Struktura imennoj gruppy: opredelitel'nye konstrukcii v dagestanskix jazykax. Avtoreferat kandidatskoj dissertacii. M.: MGU. (The noun phrase structure: attributive constructions in Daghestanian languages)
Cole, Peter
1982 Imbabura Quechua. Lingua Descriptive Studies 5. Amsterdam: North Holland.
Dryer, Matthew S.
1988 "Object-verb order and adjective-noun order: dispelling a myth", Lingua 74: 185-217.
Erguvanli, Eser E.
1984 The function of word order in Turkish grammar. Berkeley: University of California Press.
Haspelmath, Martin
1993 A Grammar of Lezgian. Mouton Grammar Library 9. Berlin-New York: Mouton de Gruyter.
Harris, Alice C.
1981 Georgian syntax: a study in relational grammar. Cambridge: Cambridge University Press.
Hawkins, John A.
1983 Word order universals. New York: Academic Press.
1990 "A parsing theory of word order universals", Linguistic Inquiry 21: 223-261.
1994 A performance theory of order and constituency. Cambridge: Cambridge University Press.
this volume "Some issues in a performance theory of word order".
Kazenin, Konstantin I.
1993 Focus constructions in Godoberi, ms.
forthc. Order of modifiers in Godoberi noun phrase, in: Aleksander Kibrik (ed.), Studies in Godoberi.
Keenan, Edward L.
1979 "On the syntax of VOS languages", in: Winfried P. Lehmann (ed.), Syntactic typology. Austin & London, 267-328.
Kibrik, Aleksandr E., S. V. Kodzasov, I. P. Olovjannikova & D. S. Samedov
1977 Opyt strukturnogo opisanija arcinskogo jazyka. Vol. III. Moskva: Izd-vo MGU.
Ortiz de Urbina, Jon
1993 "Feature percolation and clausal pied-piping", in: J. I. Hualde & Jon Ortiz de Urbina (eds.), Generative studies in Basque linguistics. Amsterdam: John Benjamins.
Rijkhoff, Jan
this volume "Order in the noun phrase of the languages of Europe".
Xajdakov, S. M.
1986 "Logiceskoe udarenie i clenenie predlozenij (dagestanskie dannye)", in: Georgij A. Klimov (ed.), Aktual'nye problemy dagestansko-naxskogo jazykoznanja, Maxackala. Institut istorii, jazyka i literatury, 79-96. (Logical stress and structure of sentences (Daghestanian data))

Katalin E. Kiss

Discourse-configurationality in the languages of Europe

1. Introduction

This study will examine which European languages encode the discourse-semantic functions 'topic' and 'focus' structurally, and how they encode them in their sentence structures. The framework of the investigation is the Government and Binding version of generative theory. As follows from this framework, the paper will relate discourse-configurational languages to non-discourse-configurational ones, and will look upon their differences as the manifestations of parametric variation in some principles of Universal Grammar.

Section 2.1. will describe the discourse-semantic function topic as understood in this paper, whereas section 2.2. will interpret, in structural terms, the notion of topic-prominence. Section 2.3. will examine how topic-prominence is realized in various European languages. Section 3.1. will describe the discourse-semantic notion of focus. Section 3.2. will examine the type of language in which this notion is structurally expressed, exploring various ways of placing focus in the X-bar-theoretic representation of the sentence. Section 3.3. will describe and compare the relevant syntactic properties of the European languages with structural focus.

The typological survey to be presented is based on 35 European languages: Armenian, Basque, Bezhta (Daghestanian), Breton, Bulgarian, Catalan, Chamali (Nakho-Daghestani), Czech, Danish, Dutch, English, Estonian, Finnish, Georgian, German, Greek, Hungarian, Icelandic, Irish, Italian, Laz (Kartvelian), Lezgian (Nakho-Daghestani), Lovari (a Roma dialect in Hungary), Mingrelian (Kartvelian), Norwegian, Polish, Rumanian, Russian, Scottish Gaelic, Slovak, Spanish, Svan (Kartvelian), Swedish, Turkish, and Welsh. The information I have on these languages derives partly from a questionnaire which I prepared and had filled in,1 and partly from studies dealing with the expression of the notions of topic and focus in these languages. I have also used the material of a general word order questionnaire compiled for the word order work group, and the word order studies included in this volume.

My specific hypotheses on the position of topic and focus in the sentence structure of discourse-configurational languages have been based on detailed syntactic analyses of some of the above languages prepared independently of this project, among them on the analysis of Basque (Ortiz de Urbina 1991), Bulgarian (Rudin 1986), Catalan (Vallduvi 1992 a), Finnish (Vilkuna 1989), Modern Greek (Tsimpli 1990), Hungarian (Horvath 1986; Brody 1990; E. Kiss 1994), Italian (Calabrese 1987), and Russian (King 1993).

2. The structural encoding of topic

2.1. The discourse-semantic function 'topic'

In the terminology to be employed in this study, the discourse-semantic function 'topic' is the function of the constituent which is predicated about in the sentence.2 The discourse-semantic function 'topic' is not necessarily associated with a constituent 'topicalized' in syntax. (Thus a constituent preposed by Topicalization e.g. in English or German can also be a focus, when bearing the heaviest stress of the sentence.) The discourse-semantic function 'topic', which corresponds to that of 'notional subject', needs to be distinguished from the function 'grammatical subject' (which is associated with the constituent bearing the most prominent theta-role),3 even though the two functions are often carried by the same constituent.

The independence of the discourse-semantic function 'topic/notional subject' from the grammatical functions will be demonstrated by the Hungarian sentences in (1 a-c). The predicate-external constituent in topic function, denoting the referent that the sentence is about, can be represented by a grammatical subject, object, dative, etc.:

(1) Hungarian
a. János [PredP IMRÉT mutatta be Zsuzsának]
   John Imre.ACC introduced prev Susan.DAT
   'John introduced to Susan IMRE.'
b. Zsuzsának [PredP JÁNOS mutatta be Imrét]
   Susan.DAT John introduced prev Imre.ACC
   'Susan was introduced Imre by JOHN.'
c. Imrét [PredP ZSUZSÁNAK mutatta be János]
   'Imre was introduced by John to SUSAN.'

(1 a) is a statement about John.NOM: it says that he introduced Imre to Susan. (1 b) is a statement about Susan.DAT, saying that Imre was introduced to her by John. (1 c) is a statement about Imre.ACC: it states with respect to Imre that John introduced him to Susan.


Even though the patterns in (1 a-c) are equally grammatical, they are not equally common: the topic role is most often associated with the constituent functioning as the grammatical subject. The reason must be that speakers prefer the strategy of describing situations as statements about their human participant. The grammatical subject (usually an agent or an experiencer) is the most common topic because it is the argument most likely to have the feature [+human].

A constituent can function as a topic/notional subject if it denotes a specific individual, a group of specific individuals, or a kind. The most common instantiations of topic constituents include proper names (see (2 a)), definite NPs (see (2 b)), specific indefinite NPs (see (2 c)), and generic NPs (see (2 d)):4

(2)
a. Fido is [PredP chewing a bone]
b. The dogs [PredP are chewing bones]
c. There are three dogs and two cats in front of the house. A dog [PredP is chewing a bone]
d. Dogs [PredP like bones]

In (2 c), a dog denotes a member of a set previously introduced into the domain of discourse, and in (2 d), dogs denotes (the instantiations of) the kind 'dog'.

Not every sentence expresses a predication about somebody or something. In the logical theory of Marty (1918, 1965), introduced into generative linguistics by Kuroda (1972-73), judgments are of two types: categorical and thetic judgments.5 As Kuroda (1972-73) puts it, a categorical judgment consists of two acts: the act of recognition of that which is to be made the notional subject, and the act of affirming or denying what is expressed by the predicate about the subject. It is only categorical judgments that have a notional subject-notional predicate structure: they foreground a constituent denoting an individual, a group, or a kind, and then make a statement about it, as was illustrated in (2 a-d). A thetic judgment, on the other hand, consists of a single act: the act of the recognition of the material of a judgment. In other words, a thetic judgment contains no notional subject/topic. The linguistic realizations of thetic judgments include the following sentence types:

Impersonal sentences:
(3)

Hungarian
a. [PredP Esik]
   rains
   'It is raining.'
b. [PredP Látszik, hogy Jánosnak igaza van]
   seems that John.DAT right is
   'It seems that John is right.'

Existential sentences:
(4)
[PredP Kutya van a szobában] / [PredP Van egy kutya a szobában]
dog is the room.in / is a dog the room.in
'There is a dog in the room.'

Sentences with a non-specific indefinite subject:
(5)
a. [PredP Egy autó állt meg a ház előtt]
   a car stopped prev the house in.front.of
   'A car stopped in front of the house.'
b. [PredP Kutyák ugatnak az udvaron]
   dogs bark the courtyard.on
   'Dogs are barking in the courtyard.'

In many languages, specific, referential subject NPs (among them, generic NPs) can also be left in a Predicate-Phrase-internal position. (As Calabrese (1987) points out, this can happen when the referent of the subject is part of the 'common ground', i. e., the shared background knowledge of the speaker and the listener, but is newly introduced into the domain of discourse.) If no other argument is externalized, such sentences will give the impression of thetic judgments: they describe an event without formulating it as a statement about one of its participants. See (6 b) and (7) again from Hungarian. (6)

a. Mi történt tegnap este?
   'What happened last night?'
b. [PredP Meglátogatott bennünket Zsuzsa]
   visited us Susan
   'Susan visited us.'

(7)
Költöznek a fecskék.
migrate the swallows
'Swallows are migrating.'


If, following Davidson (1967), Kratzer (1989), and others, we assume an event argument in sentences expressing actions/stages, the difference between categorical and thetic judgments can also be looked upon in a different way: thetic judgments can be analyzed as covert predication structures predicating about a phonologically empty, but deictically or anaphorically bound event argument, that is, about 'here and now', or 'there and then'. This would explain why individual-level predicates, possessing no event argument, cannot participate in thetic judgments.

Kuroda (1972-73) reports that in the logical theory of Marty, sentences with a universal subject are categorized as thetic judgments, though in Japanese they seem to pattern with categorical sentences. In fact, in Hungarian, Catalan (see Vallduvi 1992 b), and various non-European languages, a universal subject does not share either the position of the non-specific predicate-internal subject of a thetic judgment, or that of the specific, referential predicate-external subject of a categorical judgment at S-structure. Observe the different positions of a non-specific indefinite subject (8 a), a universal subject (8 b-c), and a specific, referential subject (8 d) in the Hungarian sentences:

(8)

a. Szerintem ugat egy kutya.
   according.to.me barks a dog
   'In my opinion, a dog is barking.'
b. Szerintem minden kutya ugat.
   according.to.me every dog barks
   'In my opinion, every dog barks/is barking.'
cf.
c. *Minden kutya szerintem ugat.
d. Fido szerintem ugat.
   Fido according.to.me barks
   'Fido in my opinion is barking.'

As argued in E. Kiss (1994) in detail, the non-specific indefinite subject in (8 a) is in its base-generated position in the VP, the universal subject in (8 b) is adjoined to the Predicate Phrase, whereas the specific, referential subject in (8 d) has been preposed out of the Predicate Phrase. Of course, I do not claim that the surface syntactic behavior of universal subjects attested in Hungarian or Catalan is universal, although it very well may be on the level of Logical Form. (Thus Stowell & Beghelli (1994) argue that the landing site of universal quantifiers in English is a Distributive Phrase (DistP) above IP but below CP.) In any case, it would seem semantically unjustified to group universal statements either with categorical sentences (as they do not predicate about a referent), or with thetic sentences (as they do not predicate about an event argument). On the other hand, the possibility that quantified subjects occupy the S-structure position of the subjects of categorical sentences in some languages and the S-structure position of the subjects of thetic sentences in others cannot be excluded.

Kuroda does not consider sentences with a contrastive focus subject or an interrogative subject. It seems to me that such sentences raise problems similar to those of sentences with a universal subject. Semantically, they cannot be categorized as categorical, unless the operator subject is part of a Predicate Phrase predicated of a non-subject argument (as in (9 b) and (10 b)); and, if thetic sentences are those that involve covert predication about an empty event argument, they cannot be categorized as thetic, either.

(9)

a. Ki szereti Jánost?
   who loves John.ACC
   'Who loves John?'
b. Jánost [PredP ki szereti?]
   'John who loves?'

(10)
a. MARI szereti Jánost.
   Mary loves John.ACC
   'MARY loves John.'
b. Jánost [PredP MARI szereti]
   'John, MARY loves.'

In languages with overt Focus Movement and WH Movement, like Hungarian, focussed and interrogative subjects are already in an A-bar position at S-structure; but in other languages they may move to operator position only at Logical Form. Within the latter type, a focussed or an interrogative subject may very well share the S-structure position of thetic subjects in some languages and that of categorical subjects in others.

In view of these considerations, I tentatively assume that sentences form not two, but three logical-semantic types. (i) They can be categorical: expressing an overt predication relation between an initial constituent functioning as a notional subject, and the subsequent sentence part functioning as a notional predicate. (ii) They can be thetic, consisting overtly of a mere Predicate Phrase; in fact, expressing predication about an empty event argument bound deictically by the situation, or anaphorically by the context. (iii) Finally, sentences can be non-predicational, expressing quantification (universal quantification, contrastive focussing, or interrogation) which is not subordinated to a predication relation. Of the three logical-semantic sentence types established above, only the categorical sentences contain a constituent in topic function. An NP functions as a topic, or notional subject, if it is predicated about in a categorical judgment.

2.2. The topic-prominent language type

It has been a basic assumption of syntactic approaches involving immediate constituent analysis, among them generative theory, that the basic structural relation constituting clause structure is the grammatical subject-grammatical predicate relation. While syntactic research focussed on English and some related languages, there did not seem to be any reason to question this assumption. Other languages studied in the early seventies, however, demonstrated quite transparently that the constituent to which the Predicate Phrase bears a predication relation can be different from the grammatical subject, and it can also be missing altogether. The phenomena that proved to be hard to handle assuming the traditional grammatical subject-grammatical predicate dichotomy included, for example, the lack of an overt dummy subject in impersonal sentences, and the postverbal, clearly VP-internal position of non-specific subjects, with or without a non-subject argument externalized. Linguistic theory attempted to handle these problematic facts by claiming that they are facts of one particular language type only. The newly established language type was identified as topic-prominent, as opposed to the subject-prominent English and related languages. This hypothesis, different versions of which have been proposed by Li & Thompson (1976) and Sasse (1987), has been the initial hypothesis of the examination of the European languages reported on in this study, as well.

Li & Thompson's typology (1976), classifying languages into topic-prominent, subject-prominent, both topic- and subject-prominent, and neither topic- nor subject-prominent types, is based on the assumption that in some languages the basic syntactic organization of the sentence expresses the notional subject-notional predicate (in other words, topic-predicate, or topic-comment) relation; in other languages it expresses the grammatical subject-grammatical predicate relation; still in other languages different constructions express both relations; whereas in a few languages neither relation is syntactically encoded. Li & Thompson do not define the notions of topic-prominence and subject-prominence precisely; they merely circumscribe them:

"In subject-prominent languages, the structure of sentences favors a description in which the grammatical relation 'subject-predicate' plays a major role; in topic-prominent languages, the basic structure of sentences favors a description in which the grammatical relation 'topic-comment' plays a major role" (Li & Thompson 1976: 459).

Li & Thompson also enumerate a set of properties of topic-prominent languages; since, however, they do not demonstrate that these properties are necessary consequences of the essence of topic-prominence, they cannot be interpreted as defining criteria of topic-prominence. The properties attributed by Li & Thompson to topic-prominent languages are as follows:

(i) Topic-prominent languages display surface coding for the topic, but not necessarily for the subject. Li & Thompson also consider initial position a way of surface coding; that is how the topic is marked e.g. in Mandarin Chinese. But practically all languages appear to have some visible preposing process, the target of which can function as a topic. Are all languages topic-prominent?

(ii) Among topic-prominent languages, the passive construction either does not occur at all, or appears as a marginal construction. If passivization is looked upon as the externalization of the theme argument, then, indeed, it would seem unmotivated for a language that can extract a theme from the Predicate Phrase also without turning it into a grammatical subject to employ passivization. However, if passivization is viewed as the absorption of the agent argument, it is not clear why it should be incompatible with topic-prominence.

(iii) Topic-prominent languages do not contain dummy subjects. Only a subject-prominent language may need a subject whether or not it plays a semantic role. In fact, the use of overt dummy subjects may very well be determined by factors that are not directly related to topic-prominence; thus it may be the consequence of the lack of pro-drop, as well as a particular mode of nominative assignment, with the place of nominative assignment fixed, but the position of the subject argument free. On the other hand, a dummy pronoun can also serve as a place holder for a topic constituent, as is the case with the German es in impersonal passive constructions, which is not needed in non-initial position or in embedded contexts. Cf.

(11)

German
a. Es wurde getanzt.
   'It was danced.'
but:
b. Gestern wurde getanzt.
   'Yesterday was danced.'
c. dass getanzt wurde
   that danced was

(iv) Topic-prominent languages display double-subject constructions of the following type:

(12) Japanese
Sakana wa tai ga oisii.
fish TOP red.snapper NOM delicious
'Fish, red snapper is delicious.'

Actually, the use of the double-subject construction has only been reported for a group of East-Asian languages. If we regard only languages displaying the double subject construction as topic-prominent, we may narrow down topic-prominence to an areal feature. (Non-argument topics appear to be possible in every language (also in English, for example: 'As for fish, I like red snapper'), but it is not obvious that they represent topics proper; they usually function as contrastive topics, which are not (necessarily) notional subjects; they do not even have to be referential.)

(v) In topic-prominent languages, the topic controls coreference.

(vi) Topic-prominent languages are V-final languages. In topic-prominent languages also displaying preverbal structural focus, indeed, one of the most unmarked orders is the SOV order. In languages with no structural focus, on the other hand, it is not clear why an SVO basic order could not simultaneously also function as a TVO order.

(vii) In a topic-prominent language, the selection of the topic is unconstrained. In subject-prominent languages, on the other hand, only constituents of certain grammatical functions (e.g. subjects) can be topicalized. In fact, it is not obvious that there are languages in which only the grammatical subject can be externalized. But even if there were, it would not seem to be justified to classify a language in which only "topical", i.e., referential subjects can be topicalized, but [-specific] subjects and non-subjects cannot, as subject-prominent. It would seem more appropriate to consider it as representing a constrained sub-type of topic-prominence.

(viii) In a topic-prominent language, topic-comment sentences are neutral; in a subject-prominent language, on the other hand, they are highly marked. This criterion is applicable only if we do not analyze topical subjects as topics. If we do, and we should, then the topic-comment sentence is basic in most languages.

Sasse (1987) views the typological differences noticed by Li & Thompson e.g. between English and Chinese as differences in the expression of categorical and thetic statements. He claims that languages like English (the subject-prominent type of Li & Thompson) realize both categorical and thetic statements through grammatical subject-grammatical predicate constructions, "dethematizing" the grammatical subject in thetic sentences only by phonological means. Languages like Chinese (the topic-prominent type of Li & Thompson), on the other hand, realize categorical and thetic statements through different syntactic structures, which are direct mappings of the notional predication structures of the given sentences. Sasse (1995 a) also analyses Greek and Hungarian as instances of the latter language type. For a survey of the expression of theticity across languages, see Sasse (1995 b).

Relying on Li & Thompson's intuitions, and on Sasse's theory, I have formulated the following starting hypothesis for the classification of European languages with respect to the structural position of the topic/notional subject, and with respect to the structural encoding of the notional subject-notional predicate relation. I have assumed that in one class of languages the syntactic structure of sentences is the direct equivalent of their logical-semantic predication structure, that is, their categorical or thetic character; this is the topic-prominent language type. In another class of languages, representing the subject-prominent type, on the other hand, sentences invariably display a syntactic predication structure, whether or not they express predication on the logical-semantic level.

I interpret the notion of syntactic predication similarly to Williams (1980) and Rothstein (1983). The syntactic (primary) predication relation is the structural relation that the Predicate Phrase (the highest verbal projection, whose category will also be defined more precisely later on) bears to the pre-PredP argument. Adapting the predication definition of Williams (1980) to current theoretical assumptions (e.g. Chomsky 1992), I propose the definition of syntactic primary predication in (13).

(13)

Syntactic Primary Predication
PredP bears a primary predication relation to XP iff
i. PredP is a complement of Y, and XP is a specifier of YP or is adjoined to YP; and
ii. XP binds an argument position in PredP.

Notice that according to definition (13), a primary predication involves no restriction on the grammatical function, the category, or the case of XP, the subject-of-predication. Definition (13) is less constrained than previous definitions, e.g. that of Williams (1980), in that it allows a PredP to bear a syntactic primary predication relation to more than one PredP-external argument. This feature of the definition is motivated by the fact that in a very large number of languages, among them Catalan, Basque, Italian, Hungarian, Russian, Bulgarian, Turkish, or Georgian, more than one argument can be extracted from the PredP, and the externalized constituents share the same syntactic behaviour and have the same semantic role, to the extent that their relative order is of no syntactic or semantic significance. Compare:

(14)

Hungarian
a. [YP János [YP Marit [PredP Évának mutatta be]]]
   John Mary.ACC Eve.DAT introduced prev
   'Mary, John introduced to EVE.'
b. [YP Marit [YP János [PredP Évának mutatta be]]]

(15) Catalan
a. [YP L'Anna [YP el cafe [PredP el va fer ahir]]]
   the Anna the coffee it PST.3SG make yesterday
   'The coffee, Anna made yesterday.'
b. [YP El cafe [YP l'Anna [PredP el va fer ahir]]]

According to the analysis of Vallduvi (1993), in (15) both preverbal constituents are immediately dominated by an IP node (even if the right one is dominated by both segments of IP, whereas the left one is dominated only by one segment of it). There is no syntactic or phonological rule or constraint that would affect only the left or the right of the two PredP-external constituents, but not the other. Similarly, the two constituents have the same discourse-semantic role: thus both (15 a) and (15 b) make a statement about Anna and the coffee, namely, that she made it yesterday. It is NOT the case that (15 a) predicates about Anna that she made the coffee yesterday, and (15 b) predicates about the coffee that Anna made it yesterday. The same holds for the Hungarian examples in (14a, b). These data indicate that the two PredP-external constituents share the same syntactic and semantic function: the PredP bears a syntactic and a semantic predication relation to both of them. Point (ii) of definition (13) requires that the subject-of-predication bind a trace or a resumptive pronoun in the Predicate Phrase. It excludes a predication relation between a PredP and a PredP-external adjunct, e. g. the speaker-oriented adverbial in (16): (16)

Hungarian
[YP Szerintem [PredP JÁNOS mutatta be Évát Marinak]]
according.to.me John introduced prev Eve.ACC Mary.DAT
'In my opinion, JOHN introduced Eve to Mary.'
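The two clauses of definition (13) can be read as a conjunction of two checks. The Python sketch below is only an illustration under simplifying assumptions: the class `External` and its fields are hypothetical names, the structural facts (the position of the constituent relative to YP, and whether it binds a trace or resumptive pronoun inside PredP) are entered by hand rather than computed from a tree, and treating the external constituents of (14 a) and (16) as adjoined to YP is an assumption made for the example.

```python
from dataclasses import dataclass


@dataclass
class External:
    """A PredP-external constituent (hypothetical, hand-annotated)."""
    form: str
    position: str                   # "specifier" or "adjoined" relative to YP
    binds_argument_in_predp: bool   # binds a trace or resumptive pronoun


def is_subject_of_predication(xp: External) -> bool:
    # Clause (i): XP is a specifier of YP or is adjoined to YP.
    # (That PredP is the complement of Y is taken for granted here.)
    clause_i = xp.position in {"specifier", "adjoined"}
    # Clause (ii): XP binds an argument position in PredP.
    clause_ii = xp.binds_argument_in_predp
    return clause_i and clause_ii


# (14 a): both externalized arguments bind a position inside PredP.
janos = External("János", "adjoined", True)
marit = External("Marit", "adjoined", True)
# (16): the speaker-oriented adverbial binds no argument position.
szerintem = External("Szerintem", "adjoined", False)

results = {x.form: is_subject_of_predication(x) for x in (janos, marit, szerintem)}
print(results)
```

Nothing in the check refers to grammatical function, category, or case, and nothing limits the number of constituents that can pass it, which mirrors the two points the text makes about the definition.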


I assume that referential temporal and locative adverbials can be analyzed either as adjuncts or as optional arguments. In the latter case, they function as subjects-of-predication in PredP-external position. Consider: (17)

a. [YP A magas hegyekben_i [PredP nyáron is esik a hó t_i]]
   the high mountains.in summer.in also falls the snow
   'In high mountains, it also snows in the summer.'
b. [YP A magas hegyekben Éva [PredP légszomjjal küszködik]]
   the high mountains.in Eve breathlessness.INST struggles
   'In the high mountains, Eve struggles with breathlessness.'

The actual subject-of-predication or adjunct analysis of a referential adverbial may depend on whether or not there are also obligatory arguments in Predicate-Phrase-external position in the given sentence. If there are not, as in (17 a), the subject-of-predication analysis of the adverbial is more likely. The predication definition does not require that the subject-of-predication be an argument of the matrix V; hence it also covers Raising structures, and sentences involving long Topicalization. Cf. (18)

a. John_i [PredP seems t_i to be happy]
b. Hungarian
   Jánossal_i [PredP valószínű, [CP hogy már találkoztam t_i valahol]]
   John.INST probable that already meet.PST.1SG somewhere
   'John, it is probable that I have already met somewhere.'

Under the assumptions outlined above, the language types established by Li & Thompson (1976) differ in the mapping relation that holds between the syntactic and the logical-semantic structures of their sentences. That is: (19)

a. A topic-prominent language
   A language is topic-prominent if it formulates categorical judgments as primary predication structures, and thetic judgments as mere Predicate Phrases.
b. A subject-prominent language
   A language is subject-prominent if it formulates both categorical and thetic judgments as primary predication structures.
c. Neither a topic-, nor a subject-prominent language
   A language is neither topic-prominent nor subject-prominent if it formulates neither categorical nor thetic sentences as primary predication structures.

In this version of Li & Thompson's and Sasse's typology of language, there is no room for both topic-prominent and subject-prominent languages; if the grammatical subject-grammatical predicate construction is understood as the most unmarked realization of the topic-comment structure, then no language is expected to exist in which constructions of topic-comment function and those of grammatical subject-grammatical predicate function are realized in different syntactic structures.
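The three-way classification in (19) amounts to a small decision table over two yes/no properties. The following Python sketch restates it under that reading; the function name and the boolean parameters are illustrative assumptions, and the fourth logical combination (thetic but not categorical judgments realized as primary predication) is not defined in (19), so the sketch merely flags it. The values passed for Hungarian and English follow the characterizations cited in the text and are not independent findings.

```python
def classify_language(categorical_as_predication: bool,
                      thetic_as_predication: bool) -> str:
    """Classify a language according to the definitions in (19)."""
    if categorical_as_predication and not thetic_as_predication:
        # (19 a): thetic judgments surface as mere Predicate Phrases.
        return "topic-prominent"
    if categorical_as_predication and thetic_as_predication:
        # (19 b): both judgment types are primary predication structures.
        return "subject-prominent"
    if not categorical_as_predication and not thetic_as_predication:
        # (19 c): neither judgment type is a primary predication structure.
        return "neither topic- nor subject-prominent"
    # Combination not covered by (19).
    return "not covered by (19)"


# Values assumed from the characterizations cited in the text.
hungarian = classify_language(True, False)
english = classify_language(True, True)
verb_initial_type = classify_language(False, False)
print(hungarian, english, verb_initial_type, sep="\n")
```

The table has no cell for a language that is both topic-prominent and subject-prominent, which matches the remark that this version of the typology leaves no room for that type.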

2.3. Topic-prominence in the languages of Europe

The most surprising finding of this study is that non-topic-prominent languages represent a very small minority of the languages investigated. The neither topic-prominent, nor subject-prominent language type is represented in Europe by the VSO Celtic languages, most clearly by Welsh, Irish, and Scottish Gaelic. In these languages, both categorical and thetic sentences are formulated as verb-initial phrases (IPs derived by V-movement into I according to McCloskey (1991) and Tallerman (this volume)); that is, the syntactic structure of a sentence corresponds to its semantic predication structure only in the case of thetic judgments. Compare the following Welsh examples: (20)

a. Ffoniodd Sion Mair neithiwr.
   phoned John Mary last night
   'John phoned Mary last night.'
b. Mae cath mewian wrth y drws.
   is cat miao at the door
   'There is a cat miaowing at the door.'

At the same time, Welsh allows argument preposing (though not as extensively as e.g. Breton), and its target can function either as topic or as focus. An interesting study by Ramchand (1994) on Scottish Gaelic, and one by Doherty (1994) on Irish make the classification of these languages as neither topic-prominent nor subject-prominent also somewhat less straightforward. Ramchand (1994) claims that Scottish Gaelic has a sentence type displaying syntactic predication: it involves the copula and a nominal predicate, and it is used when an individual-level property is predicated of an individual or a kind. The subject-of-predication sits in Spec, IP, to the right of the VP. For example: (21)

a. Is faicilleach Calum.
   COP.PRES careful Calum
   'Calum is a careful person.'

[Tree diagram for (21 a); surviving labels: Spec, A'; terminals: Is, faicilleach, Calum.]

Compare:

b. Tha Calum faicilleach.
   be.PRES Calum careful
   'Calum is being careful.'

[Tree diagram for (21 b); surviving labels: IP, Spec, I', AP, Spec, A'; terminals: Tha, Calum, faicilleach.]

Ramchand (1994) assumes, following Kratzer (1989), that Scottish Gaelic sentences containing a non-individual-level predicate express a predication about a situation, and the position of the subject-of-predication in Spec, IP is occupied by a non-overt event argument. If she and Doherty (1994) are right, then Scottish Gaelic and Irish are less rigidly non-topic-prominent than claimed above. They can express categorical judgments by a syntactic primary predication structure, but only if the sentence contains an individual-level non-verbal predicate selecting no event argument, and if the predication is mediated by the copula.

Interestingly, the subject-prominent language type, in which categorical and thetic judgments are expressed by identical primary predication structures in syntax, may be even more underrepresented among the European languages than the neither topic-prominent nor subject-prominent type; it is not clear if this type exists (in Europe) at all. As the discussion of the topic-prominent language type will make clear, none of the non-VSO languages investigated clearly violates the criterion of topic-prominence in (19 a).

Our criterion of topic-prominence has been whether a language realizes categorical judgments as primary predication structures, and thetic judgments as mere Predicate Phrases with no argument externalized. Among the 35 European languages examined, all the languages except Irish, Scottish Gaelic and Welsh formulate categorical judgments as primary predication structures, so what can potentially set apart a subject-prominent type is the way they formulate thetic judgments with a non-specific grammatical subject. I have found that non-specific subjects remain inside the Predicate Phrase in at least 31 of the 35 languages examined, and they may be inside the Predicate Phrase in the remaining four languages, too.

In the questionnaire I employed, the following question was posed to establish the position of a non-specific grammatical subject: "How do you translate the sentence A girl got on the bus, when said in situation (22 a) and in situation (22 b)? (22)

a. Several girl-friends of yours were waiting for the bus. The bus arrived. A girl got on the bus.
b. You were sitting in a bus alone at night, frightened. Luckily, a girl got on the bus."

Whereas the specific a girl of (22 a) clearly appears in Predicate-Phrase-external position in all non-VSO languages, the non-specific a girl of (22 b) stands in a Predicate-Phrase-internal position in the overwhelming majority of the translations (it stands in a PredP-internal position in at least one of the alternative translations provided in every language). For example: (23)

Turkish (questionnaire)
a. Bir kiz otobus-e bin-di.
   a girl bus-DAT board-PST
b. Otobüse bir kiz bindi.

(24)

Lezgi (questionnaire)
a. Sa rus avtobus.di-z hax-na.
   one girl bus-DAT get.in-AOR
b. Avtobus.di-z rus hax-na.

(25)

Czech (questionnaire)
a. Jedna dívka do autobusu nastoupila.
   a/one girl on bus got
b. Do autobusu přistoupila nějaká dívka.
   on bus got a/some girl

(26)

Italian (questionnaire)
a. Una ragazza è salita sull'autobus.
   a girl be.PRS.3SG got on.the bus
b. È salita sull'autobus una ragazza.

(27)

Swedish (questionnaire)
a. En flicka steg på bussen.
   a girl got on the.bus
b. Det steg en flicka på bussen.
   there got a girl on the.bus

The sentence A car ran over John was also included in the questionnaire to test the possibility of a PredP-internal subject. In practically all the languages examined, it was translated with John preposed (presumably extracted out of the PredP), and with a car kept next to the V (presumably in PredP-internal position). E.g. (28)

a. Bezhta (questionnaire)
   Rasul-X'a X'odol-X'a masina meX'e-jo.
   Rasul-LOC up-LOC car go-AOR
b. Estonian (questionnaire)
   Jaan jäi auto alla.
   John changed car below.to
c. Polish (questionnaire)
   Jana przejechał samochód.
   Jan.ACC ran.over car.NOM

d. Spanish (questionnaire)
   A Juan le atropelló un coche.
   to John him ran.over a car

In Icelandic, the sentence was translated with the non-specific subject inside the PredP, but without the specific, [+human] argument externalized: (29)

Icelandic (questionnaire)
Það keyrði bíll á Jón.
there drove a.car on John

In the case of rigidly V-final languages such as Bezhta or Laz, of course, the possibility cannot be excluded that the subject is Predicate-Phrase-external both in SOV sentences and in OSV sentences (e.g. in (28 a)). The assumption that the immediately preverbal subject is in the Predicate Phrase ought to be confirmed by further tests, e.g. by showing that in an OSV sentence no sentence adverbial can intervene between S and V.

The languages for which the possibility of a PredP-external non-specific subject cannot be completely excluded on the basis of the questionnaire are Danish, Swedish, and Norwegian. (Icelandic has been shown to be clearly topic-prominent; Icelandic non-specific subjects cannot leave the Predicate Phrase.) Further investigations should be performed to clarify the status of PredP-external indefinite subjects in these languages. The judgments offered are often very delicate. Thus according to the informant filling out my questionnaire for Danish, the Danish sentence (30 b) would be appropriate in describing a traffic scene when a car can be "accommodated", i.e., interpreted as 'one of the vehicles present in the domain of discourse'. (30)

Danish (questionnaire)
a. John blev kørt over af en bil.
   John was run over by a car
b. En bil kørte over John.
   a car ran over John

If she is right, then in Danish a Predicate-Phrase-external indefinite subject is practically forced to have a specific interpretation. At the same time, it appears that a nominative subject cannot simply remain in situ in Danish: the usual way of keeping it in Predicate-Phrase-internal position is to use an existential construction plus a relative clause:

(31)

Der er en kat, der står og miauer ved døren.
there is a cat that stands and miauws at the.door
'There is a cat miauwing at the door.'

The situation seems to be similar in Norwegian — see Faarlund (1985) and (1995). When starting this project, I assumed (following Li & Thompson 1976) that the clearest example of a subject-prominent language is English. However, I soon realized that the distributional arguments indicating different sentence positions of non-specific and specific subjects in topic-prominent languages are also valid for English. Thus: (i) Whereas the specific subject of a categorical sentence can be followed by a sentence adverbial (without any comma-intonation), the non-specific subject of a thetic sentence cannot: (32)

a. John fortunately has been born on time.
b. *A baby fortunately has been born.

(ii) The negative particle follows the specific subject of a categorical sentence, and precedes the non-specific subject of a thetic sentence: (33)

a. John was not born on time.
b. *A baby was not born.

(34)

a. Not a baby was born.
b. *Not John was born on time.

The usual position of sentence adverbials and the negative particle across languages is between the subject of predication and the Predicate Phrase. This is the distribution they display in the above English examples, too, provided that non-specific subjects are internal to the Predicate Phrase (IP), and specific subjects are external to it.

(iii) The VP of a thetic sentence cannot undergo VP-deletion, as was observed by Gueron (1980): (35)

*A riot occurred and then a flood did.

The ungrammaticality of (35) falls out if VP-deletion is interpreted as Predicate Phrase (i.e., IP-) deletion. In (35), deletion is illicit because the deleted phrase constitutes merely a subpart of the Predicate Phrase (IP).

(iv) A sentence-initial only (as well as a sentence-initial also or even) can only have sentential scope when it is followed by a non-specific subject, as in (36 b). A sentence-initial only followed by a specific subject, e.g. that in (36 a), can only have scope over the subject: (36)

a. *Only John was born on time, nothing else happened.
b. Only a baby was born, nothing else happened.

The grammaticality difference between (36 a) and (36 b) in all probability falls out if the maximal scope of only (as well as also and even) is the Predicate Phrase, which includes the subject in thetic sentences but does not include it in categorical sentences.

The data in (32)-(36) indicate that the subject of a thetic sentence and that of a categorical sentence occupy different positions in English, too. If the subject of a thetic sentence is in the specifier of IP, the highest V-related projection, then the subject of a categorical sentence must be in a projection outside IP but inside CP, which I will tentatively call Topic Phrase (TopP).

The finding that non-specific subjects do not leave the Predicate Phrase in most (perhaps all) European languages is not only significant from a typological point of view. It is also relevant for generative syntactic theory: it casts doubt on the alleged motivation for Subject Movement. If Subject Movement out of the Predicate Phrase were motivated by the requirements of case assignment alone, then it should affect specific and non-specific subjects alike. The above data suggest that, even though case requirements may be involved in Subject Movement in some languages, the universal trigger of Subject Movement is the requirement of creating a predication structure, and the universal licensing factor is the [+specific] feature of the target NP.

Whereas on the basis of the criterion of the position of non-specific subjects, most European languages fall into the topic-prominent language type, they do display minor variation along certain parameters. Thus, they differ in the position of the grammatical subject within the PredP in thetic sentences. It appears that in some languages the subject remains in its base-generated argument position, in others it is preposed into a PredP-internal non-argument position, whereas in a third type either option is possible. This is the case in Hungarian: (37)

Hungarian
a. Van egy könyv az asztalon.
   is a book the table.SUPESS
   'There is a book on the table.'

b. EGY KÖNYV van az asztalon.
   'A BOOK is on the table.'

(38)

a. Be-jött egy lány a szobába.
   in-came a girl the room.ILL
   'There came a girl into the room.'
b. EGY LÁNY jött be a szobába.
   'A GIRL came into the room.'

In (37 b) and (38 b), the non-specific subject is in focus position; still, there is no significant difference between the interpretations of the (a) and (b) sentences. English represents a different version of this mixed type: the subject of an existential sentence is presumably in its base-generated position, whereas other non-specific subjects are preposed into Spec, IP. A further parameter of variation is that, whereas in most languages there is no syntactic indication of an empty subject-of-predication position in thetic sentences, a few languages do employ a dummy place-holder. E.g.: (39)

Swedish (questionnaire)
Det står en katt vid dörren.
there stands a cat at the.door

These languages are the V2 languages, and partially, English, which requires a dummy subject only in existential and impersonal sentences. Since English also shows certain V2 properties, I tentatively assume that the presence of a dummy subject is required by the parameter of grammar that is responsible for V2.

Adopting one of Li & Thompson's criteria of topic-prominence, I have also examined which languages impose a restriction on topic-selection. In fact, all languages except the VSO Irish and Scottish Gaelic allow the extraction of specific arguments other than the grammatical subject out of the Predicate Phrase. At the same time, in the majority of languages, there can be more than one constituent of topic/notional subject function in a sentence. The V2 languages, as well as Estonian, Finnish, and Breton, on the other hand, allow at most one topic per clause. Languages which allow more than one topic in a clause may differ according to whether the order of subject and non-subject topics is free, as in the Spanish examples in (40), or bound, as in their English equivalents. The bound order attested in English is rare; Italian is the only other language in my sample in which this occurs.

(40)

Spanish (questionnaire)
a. A Juan, María llegó a conocer-lo el año pasado.
   to Juan Maria got to know-him the year last
   'John, Mary got to know last year.'
b. María, a Juan llegó a conocer-lo el año pasado.
   Maria to Juan got to know-him the year last
   'John, Mary got to know last year.'

Further variation involves the position that quantified and focussed subjects occupy. In topic-prominent languages with structural focus, among them Basque, Hungarian, Greek, and Georgian, focussed subjects are clearly barred from the PredP-external subject-of-predication position, and in several of these languages it is also evident that quantified subjects do not share the PredP-external position of topic subjects, either (see the Hungarian (8) above). The distributional tests for subject position indicate that most instances of quantified subjects occupy a PredP-internal position in English, too.

The syntactic category of the Predicate Phrase, i.e., the projection that bears a predication relation to the constituent in topic function, is identified differently in various studies on topic-prominent languages. It is claimed to be a VP in Hungarian in the analysis of E. Kiss (1994), or in Finnish in the analysis of Vilkuna (1995); but it is called an IP in Catalan (cf. Vallduvi 1992 a), or in Hungarian in the theory of Horvath (1995). King, who analyzes Russian (1993), or Pinon (1993), who analyzes Hungarian, regard the Predicate Phrase as a Sigma Phrase (i.e., essentially a Focus Phrase), following Laka's analysis of Basque (1990). In Tsimpli's theory of Greek sentence structure (Tsimpli 1995), the Predicate Phrase is analyzed as a T(ense)P.

In many topic-prominent languages, questions also allow a topic, in which case the topic phrase precedes the WH phrase (see (9b) above). As a consequence of this, authors who take the Spec, CP position of WH operators to be an invariable fact of Universal Grammar place the topic position before Spec, CP. In embedded clauses, however, the topic follows the complementizer. This ordering paradox can be avoided by assuming CP recursion (cf. Maracz (1989) on Hungarian).
The paradox does not arise in Bulgarian (see Rudin 1986) and Basque (see Ortiz de Urbina 1991), as in Bulgarian, the constituent in topic function (or at least one type of it) precedes the complementizer in embedded sentences, and in Basque, the complementizer is cliticized to Infl. In these analyses, the Predicate Phrase is claimed to be a CP.

The fact that a sentence can have more than one topic in most topic-prominent languages has led several linguists to place the topic in an adjoined position, given that adjunction is an iterable operation. Thus it is claimed to be adjoined to CP in Bulgarian (Rudin 1986), and in Basque (Ortiz de Urbina 1991); it is adjoined to IP in Vallduvi's analysis of Catalan (1992 a), and in King's analysis of Russian (King 1993); and it is claimed to be adjoined to TenseP in Greek (Tsimpli 1994). In E. Kiss's analysis of Hungarian (1994), one topic is in Spec, IP, and the rest are adjoined to IP. It is unclear at the moment to what extent these differences in the structural analysis and the labelling of topic-prominent sentence structure are notational differences, which can be eliminated, and to what extent they reflect factual differences.

The difficulties of choosing between these descriptive options can be illustrated on the basis of Basque. In a Basque sentence of the order [XP YP V], XP functions as topic, and YP functions as focus, unless XP is subject and YP is object, in which case the sentence can also be understood as communicatively neutral. Ortiz de Urbina (1995) accounts for its neutrality by assuming SOV (structured as [IP S [VP O V]]) to be the initial order, and deriving the TFV order by Topicalization (via adjunction to CP), Focussing (via movement into Spec, CP), and V Movement into C. This derivation would predict the possibility of TSV sentences in which a non-subject constituent adjoined to CP functions as topic, and the subject, sitting in Spec, IP, is communicatively neutral. In fact, such sentences do not exist; in an OSV sentence, S is necessarily focus. Ortiz de Urbina rules out TSV sentences by an ad hoc filter. Another possibility would be to analyze the neutral SOV sentences as TFV sentences, deriving their neutrality from the fact that S is the unmarked carrier of the topic function, and O is the unmarked carrier of focus function. The latter alternative would make the demotion of the focus to Spec, CP from under the IP projection unnecessary.
Although I do not consider it my task to try to choose between such options, and in general, to try to reduce the attested variation in the description of topic-prominent sentence structure, let me point out certain elements of structure supported by independent cross-linguistic evidence. In the majority of languages with structural focus (in TFV languages), the focus position is the initial slot (presumably the specifier) of the Predicate Phrase. Tuller (1992) argues on the basis of cross-linguistic material that the position of structural focus across languages is Spec, IP (cf. also Horvath (1995)). Consequently, the constituent functioning as PredP must be of the category IP. If our analysis of English thetic and categorical sentences above was right, then English also supports this hypothesis. On the other hand, there is convincing syntactic and phonological evidence from Hungarian (cf. E. Kiss 1994) that the constituent in topic function is dominated by a projection different from the projection dominating the Predicate Phrase; that is, if the Predicate Phrase is an IP across languages, the topic cannot be adjoined to IP.

Evidence for this comes from the operation of the Nuclear Stress Rule. In Hungarian, the Nuclear Stress Rule puts phrasal stress on the leftmost constituent of phrases. If the topic were the leftmost constituent of the same phrase that also dominates the focus and the V, it would follow that a stressed topic would be the heaviest stressed constituent of the sentence. In fact, its stress can never be heavier than that of the first stressed constituent of the Predicate Phrase (the focus, or in its absence, the V). Consider how phrasal stresses add up in (41). The post-focus V obligatorily loses its stress. (41)

Hungarian
[TP János [IP a szekrényhez [I' vágott egy könyvet]]]
[Stress grid of X marks over the constituents; column alignment not recoverable.]
John the wardrobe.to threw a book.ACC
'John threw a book at the wardrobe.'

The distribution of stresses in (41) indicates that the topic phrase is outside the phrase dominating the focus and the V, as it constitutes a separate domain for the Nuclear Stress Rule.

Another piece of evidence in support of the proposed analysis is based on the placement of sentence adverbials and predicate adverbials. It can be shown that the distribution of sentence adverbials and predicate adverbials, and the interpretation of adverbials that are ambiguous between a sentence adverbial and a predicate adverbial can only be predicted if sentence adverbials and predicate adverbials are attached to different projections. Hungarian sentence adverbials precede or follow the topic, and predicate adverbials precede or replace the focus. Cf. (42)

a. Szerencsére János jól megoldotta a feladatot.
   fortunately John well solved the problem.ACC
   'Fortunately, John solved the problem well.'
b. János szerencsére jól megoldotta a feladatot.
c. *Jól János szerencsére megoldotta a feladatot.
d. *János jól szerencsére megoldotta a feladatot.

If the focus were in Spec, IP, and the topic were adjoined to IP (as is essentially proposed e.g. in Brody (1990)), i.e., if both adverbials were adjoined to the same projection, it would be unclear why their relative order is fixed. It is, in fact, not the relative order of the two adverbials that matters; a predicate adverbial cannot precede the topic, and a sentence adverbial cannot go to focus position even if there is no other adverbial present.
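
The ordering facts in (42) can be restated as a simple linear check: material belonging to the topic field (the topic and sentence adverbials) must not follow material belonging to the Predicate Phrase. The Python fragment below is a minimal sketch of that restatement only; the label names and the flat list representation are assumptions introduced for illustration, and the sketch says nothing about the structural analysis itself.

```python
# Hypothetical labels for the constituents of a Hungarian clause.
TOPIC_FIELD = {"topic", "sentence_adverbial"}
PREDICATE_PHRASE = {"predicate_adverbial", "focus", "verb", "other"}


def order_is_licensed(labels):
    """Return True if no topic-field element follows Predicate Phrase material."""
    inside_predicate_phrase = False
    for label in labels:
        if label in PREDICATE_PHRASE:
            inside_predicate_phrase = True
        elif label in TOPIC_FIELD:
            if inside_predicate_phrase:
                return False
        else:
            raise ValueError(f"unknown label: {label}")
    return True


# The four orders of (42), encoded with the hypothetical labels above.
examples = {
    "42a": ["sentence_adverbial", "topic", "predicate_adverbial", "verb", "other"],
    "42b": ["topic", "sentence_adverbial", "predicate_adverbial", "verb", "other"],
    "42c": ["predicate_adverbial", "topic", "sentence_adverbial", "verb", "other"],
    "42d": ["topic", "predicate_adverbial", "sentence_adverbial", "verb", "other"],
}

for name, labels in examples.items():
    print(name, order_is_licensed(labels))
```

The check accepts the orders of (42 a) and (42 b) and rejects those of (42 c) and (42 d), without referring to the relative order of the two adverbials as such.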

Furthermore, if the focus and the topic were immediately dominated by segments of the same projection, an adverbial that is ambiguous between a sentence adverbial and a predicate adverbial should be ambiguous. In fact, it is not; its interpretation depends on whether it is in the pre-PredP sentence part, or in the PredP. If it stands between the topic and the focus, where it could belong to either sentence part, its interpretation is determined by whether or not it bears the heaviest stress of the sentence. Compare: (43)

a. János okosan MEG válaszolta a kérdést.
   John cleverly PREV answered the question.ACC
   'Cleverly, John answered the question.'
b. János OKOSAN meg válaszolta a kérdést.
   'John answered the question cleverly.'

The stressing of okosan depends on whether it is adjoined to the projection dominating the Predicate Phrase, or it is adjoined to the projection dominating the topic phrase. If it is adjoined to the Predicate Phrase, the Nuclear Stress Rule will assign heavier stress to it than to meg, the second constituent of the Predicate Phrase. If it is outside the Predicate Phrase, it cannot have heavier stress than meg, which is the leftmost constituent of the Predicate Phrase in that case. If okosan is adjoined to the same projection in both (43 a) and (43 b), the interpretation (and stress) difference between (43 a) and (43 b) cannot be predicted.

If the Predicate Phrase is indeed an IP, as proposed by Horvath (1995), then the subject-of-predication, i.e., the topic, is dominated by an additional phrasal node intervening between IP and CP, which I will call Top(ic) P(hrase), as I did in the case of English. I assume that IP is a complement to an abstract T head, and the constituent in topic function occupies the specifier position of the TP projection, as follows:

(44)

[Tree diagram, linearized: [CP [TopP XP(topic) [IP(=PredP) XP [VP]]]]]

A constituent in topic function binds an empty argument position in the Predicate Phrase in most of the topic-prominent languages in question. In Greek, Catalan, and optionally, in Bulgarian, it binds a resumptive pronoun instead — see (45). (On Greek, consult also Lascaratou (this volume), and on Bulgarian, Guentcheva (1994)). In the Romance languages, a resumptive pronoun coindexed with the topic may be obligatory, optional, or excluded, depending on many factors. (45)

Greek (Tsimpli 1995: 177)
To vivlio to-edhose i Maria sto Yani.
the.ACC book it-gave the.NOM Maria to.the.ACC Yani
'The book, Maria gave it to Yani.'

The relation between a topic constituent and the empty argument coindexed with it seems to observe Subjacency, which indicates that the topic is extracted from the Predicate Phrase by movement. The presence of a resumptive pronoun in Greek has led Tsimpli (1995) to the conclusion that the topic is base-generated outside the Predicate Phrase, and an invisible operator coindexed with it moves in LF. Long, cyclic Topic Movement is also possible in most topic-prominent languages, among them Hungarian, Lovari, Italian, and Polish. E.g. (46)

Spanish (questionnaire)
A Juan, dudo que lo conozca Maria.
to John doubt-I that him knows Maria
'John, I doubt that Mary knows.'

The languages in which long topic extraction (perhaps long movement, in general) is blocked include the Caucasian languages. The constituent in topic function bears the case assigned to the empty argument or resumptive pronoun coindexed with it in every language. In some languages, e. g. in Greek, Italian, or Russian, the PredP-external argument can also be in the nominative (nominative pendens). Compare: (47)

Greek (Tsimpli 1994: 180)
a. Tus fitites, oli i kathigites tus-ipostirizun.
   the.ACC students all the lecturers them-support.3PL
   'All lecturers support the students.'
b. I fitites, oli i kathigites tus-ipostirizun.
   the.NOM students all the lecturers them-support.3PL
   'All lecturers support the students.'

In fact, it is not obvious that a nominative pendens, presumably base-generated in Left Dislocation, and a topic constituent bearing the case assigned to the predicate-internal argument coindexed with it have the same discourse-semantic role. In Hungarian, as well as in Turkish, there is a clear difference between both the syntactic properties and the discourse-semantic function of a constituent extracted from the predicate, and a constituent generated in Left Dislocation. The latter does not function as a topic proper; it does not even have to be referential. It functions as a so-called contrastive topic instead, which appears to be more like an operator implying contrast than a subject-of-predication. In addition to the topic proper, which precedes the sentence part bearing a predication relation to it, some authors also identify an after-topic following the predicate (e.g. Tsimpli (1994) for Greek and Vallduvi (1992a) for Catalan. See also Lambrecht (1986) on French.) For example: (48)

Catalan (Vallduvi 1995: 128)
[IP El ficarem al calaix] el ganivet.
it put-FUT.1PL in.the drawer the knife
'We will put the knife in the drawer.'

Even though after-topics may resemble topics proper in their stress and reference properties, they do not seem to have the function of a notional subject, foregrounding the referent that the sentence predicates about. In Vallduvi's dynamic semantic framework, only pre-topics, i. e., topics proper, have the role of presenting the address under which the information conveyed by the sentence is to be entered.

2.4. Summary

In sum: topic-prominence, that is, the consistent encoding of the semantic predication structure of a sentence in syntactic structure, is a property shared by almost all European languages, except the clearly VSO Irish, Scottish Gaelic, and Welsh. At the same time, topic-prominent languages display syntactic variation along various minor parameters. Thus they differ in whether a non-specific, PredP-internal subject is left in its base-generated position or is preposed into an A-bar position inside PredP. This difference may correlate with whether in the given language nominative case can be assigned inside the VP. Some languages, belonging to the V2 type, require a dummy pronoun in the subject-of-predication position in thetic sentences; others can do without such a pronoun. Some languages display restrictions on topic-selection; thus in English a [+specific] subject must always be among the topic constituents of a sentence; and in a multiple topic construction, it must be the topic closest to the finite V. A few languages allow only one topic per clause; in the majority of languages, on the other hand, multiple topic constructions are also possible.
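
Two of the parameters summarized above, the number of topics per clause and the ordering of subject and non-subject topics, can be pictured as a small settings table. The following Python fragment is a hedged sketch: the class names, field names, and the three sample settings are assumptions introduced for illustration, based only on the descriptions of Spanish, English, and Estonian given in section 2.3.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Topic:
    form: str
    is_subject: bool


@dataclass(frozen=True)
class TopicParams:
    max_topics: Optional[int]   # None = no upper bound on topics per clause
    subject_topic_last: bool    # a subject topic must be closest to the finite V


def topics_allowed(topics: List[Topic], params: TopicParams) -> bool:
    """Check a left-to-right sequence of topics against the two parameters."""
    if params.max_topics is not None and len(topics) > params.max_topics:
        return False
    if params.subject_topic_last:
        for position, topic in enumerate(topics):
            if topic.is_subject and position != len(topics) - 1:
                return False
    return True


# Hypothetical settings, following the prose description only.
SPANISH = TopicParams(max_topics=None, subject_topic_last=False)
ENGLISH = TopicParams(max_topics=None, subject_topic_last=True)
ESTONIAN = TopicParams(max_topics=1, subject_topic_last=False)

object_first = [Topic("John", False), Topic("Mary", True)]
subject_first = [Topic("Mary", True), Topic("John", False)]

print(topics_allowed(object_first, SPANISH), topics_allowed(subject_first, SPANISH))
print(topics_allowed(object_first, ENGLISH), topics_allowed(subject_first, ENGLISH))
print(topics_allowed(object_first, ESTONIAN))
```

The sketch deliberately leaves out the further English requirement that a [+specific] subject always be among the topics, since that depends on specificity rather than on order or number.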

3. The structural encoding of focus

3.1. The discourse-semantic function 'focus'

The majority of the European languages that I have examined not only express the discourse-semantic function topic/notional subject structurally; they also express the discourse-semantic function 'focus' by a structural relation.

The term 'focus' is used in at least two different senses in the literature. It is used to denote the new information in the sentence, or, alternatively, to denote an operator expressing identification. When it is understood as information focus, i.e., the sentence part carrying new information, the non-focus, presupposed part of the sentence is called 'background', or 'focus-frame'. The information focus can be a constituent of any size: e.g. a mere N or Adj, a NP, a VP, or the whole sentence, i.e., it can be 'narrow' focus or 'wide' focus. Its only function is to express the novelty of the material that it contains. It is not an operator either in the semantic or in the syntactic sense; it never involves movement either in syntax or in logical form.

The notion of focus that is relevant for us in this study is that of an identifying operator (or focus operator). The focus operator serves to introduce a set and to identify a subset of it as the one of which the predicate exclusively holds. It is a major constituent of the sentence, which can be moved into operator position without violating Subjacency, as I will argue below. It undergoes operator movement either in syntax or in LF, landing in a position from which it c-commands its scope. The focus operator is always part of the notional predicate; an information focus, on the other hand, can also extend over the notional subject (topic), e.g. in 'all new' sentences. In Hungarian, an interrogative phrase can be answered either with an information focus or with a focus operator.
The difference between the two types of answer is that the focus operator (preposed into immediately preverbal position) is understood as an exhaustive answer, whereas an in-situ information focus is interpreted as a non-exhaustive one. Compare the two possible answers to the question Kivel találkoztál a hangversenyen? 'Who did you meet at the concert?'

(49)

Hungarian
a. Erzsivel találkoztam.
   Elisabeth.INST met.PST.1SG
   'I met ELISABETH [It was Elisabeth I met].'
b. Találkoztam Erzsivel.
   'I met ELISABETH [among others].'
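
The difference between the exhaustive reading of (49 a) and the non-exhaustive reading of (49 b) can be stated in terms of sets. The Python fragment below is an illustrative sketch under simplifying assumptions: the function names, the representation of the predicate as the set of individuals it holds of, and the sample domain of acquaintances are all invented here for the purpose of the example.

```python
def focus_operator_holds(identified, domain, predicate_holds_of):
    """Exhaustive identification: within the domain, the predicate holds of
    exactly the identified subset and of nothing else."""
    if not identified <= domain:
        raise ValueError("the identified subset must be drawn from the domain")
    return (domain & predicate_holds_of) == identified


def information_focus_holds(mentioned, predicate_holds_of):
    """Non-exhaustive reading: the predicate holds of everything mentioned,
    possibly of other individuals as well."""
    return mentioned <= predicate_holds_of


# Hypothetical situation: the relevant set of acquaintances, and the people
# the speaker actually met at the concert.
domain = {"Erzsi", "Péter", "Anna"}
met = {"Erzsi", "Péter"}

print(focus_operator_holds({"Erzsi"}, domain, met))   # reading of (49 a)
print(information_focus_holds({"Erzsi"}, met))        # reading of (49 b)
```

In this situation the answer naming only Erzsi is false on the focus-operator reading but true on the information-focus reading; the two readings coincide only when Erzsi is the sole member of the domain whom the speaker met.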

In (49 a) Erzsivel is in the focus position, where it functions as an identifying operator. As an operator, it introduces a set of relevant persons (presumably the set of the mutual acquaintances of the participants of the discourse), and asserts that it was Elisabeth who I met, also expressing that I did not meet anybody else of this set. Erzsivel is the only piece of new information and the only stressed constituent in (49 b), as well, but there it is not moved into operator position, and it does not express identification and does not imply exclusion; it is a mere information focus.

In languages like Italian, Rumanian, Arabic, Greek etc., the question Who did you meet at the concert? could, in fact, be answered only with an information focus. In such languages, the focus operator can only be used if there is a closed set of individuals present in the domain of discourse for it to quantify over. In these languages, only questions like Which of your class-mates did you meet at the concert? provide an appropriate trigger for a focus operator.

In languages in which the two focus notions are not structurally distinguished, they are difficult to keep apart consistently. It is not always clear even in the semantic literature which focus notion is being accounted for, and whether authors holding conflicting views are analyzing the same phenomenon. The semantic approaches to focus include the 'structured meaning' theory of focus, elaborated by von Stechow (1981, 1991), Jacobs (1983, 1986), von Stechow & Uhmann (1986), Krifka (1991, 1992), etc. In this theory, the focus feature of a constituent induces the partitioning of the semantic representation of a sentence into a focus part and a background part. Consider the semantic structure of (50a): (50)

a. John loves MARY. b. e outside the CRD for VP, just as the corresponding material in mPP2 will be outside the CRD of a higher VP domain in English structures of the type VP[V mPP1 mPP2]. The processing of 2PPm in Japanese and Korean proceeds bottom-up in the structure VP[2PPm 1PPm V], with daughter constituents of 2PPm being recognized before this PP itself is constructed, and with PP being recognized before VP is actually constructed. The result is equally efficient EIC ratios, achieved through mirror-image orderings as a consequence of the left- versus right-peripheral positioning of mother-node-constructing categories in the grammars of these different language types. Some relevant data from Japanese have been collected by Kaoru Horie and are illustrated in Tables 3 and 4. Table 3 tests EIC's predictions on structures of the type [{NP-o PPm} V], where NP-o stands for a direct object NP containing an accusative particle -o, occurring in combination with a postpositional phrase. An example of the relevant sentence type is given in (12):

(12) Tanaka ga {[sono hon o] [Hanako kara]} katta.
     Tanaka SU that book OBJ Hanako from bought
     i. e. 'Tanaka bought that book from Hanako.'

The results are given in Table 3. They reveal a clear preference for the longer IC (abbreviated here as 2ICm) to precede the shorter one (1ICm), to an extent that is directly proportional to the length difference between them. This long-before-short pattern is the exact opposite of the short-before-long pattern in the English and Hungarian data of Tables 1 and 2. The unmarked case prediction in Table 3 achieves an 82.4% score, and all the marked case predictions are correct.10 Table 4 presents Japanese data comprising two PPs corresponding to the English data of Table 1. An example sentence is (13):

Performance theory of word order

(13) Tanaka wa {[Tookyoo e] [fune de]} itta.
     Tanaka TOP Tokyo to ship by went
     i. e. 'Tanaka went to Tokyo in a ship.'

The results show the same pattern, with the unmarked case prediction scoring 84.8% and with the marked case predictions being again correct.

Table 3. Japanese [{NP-o PPm} V]

150 pages of data: first 70 pp. from Juugo Kuroiwa Yama No Hada, Kobunsha Tokyo; first 40 pp. from Yasuyoshi Kobakura Okinawa Monogatari, Shinchoosha Tokyo; first 40 pp. from Isamu Togawa Konoe Fumimaro To Juushintachi, Kodansha Tokyo.

[(TopP) [{(S > 1) (NP-ga)} {NP-o PPm} V]] (no other ICs in the clause apart from optional adverbials in any position), where PPm = PP[NP P]

[{2ICm 1ICm} V]   n = 244

                  NP-o = PPm    2ICm > 1ICm: 1-2    3-4    5-8    9+
X [2ICm 1ICm]         91                59           21     20     10
Y [1ICm 2ICm]                           30            8      4      1
Ratio of Y/X                            34%          28%    17%     9%

EIC predictions (assuming no VP discontinuity; consistent long before short predictions)
Unmarked case: 201/244 = most optimal (X + =), i.e. 82.4%   Correct
Marked case: ratio of Y/X for 2ICm > 1ICm: 1-2 ≥ 3-4 ≥ 5-8 ≥ 9+   All correct
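The arithmetic behind the unmarked and marked case predictions for Table 3 can be retraced in a few lines. The sketch below is only an illustration: the assignment of counts to cells is an assumption (it is the reading of the table under which the stated totals of 201 and 244 come out), and the names `EQUAL`, `X`, `Y`, `unmarked_score` and `y_share` are invented here. The percentage labelled "Ratio of Y/X" is computed as the share of Y orders among all X and Y orders in a column, which is what reproduces the printed figures.

```python
# Counts read off Table 3 (Japanese [{NP-o PPm} V], n = 244).
# Assumption: X = longer IC first, Y = shorter IC first; keys are the
# word-length differences between the two ICs.
EQUAL = 91
X = {"1-2": 59, "3-4": 21, "5-8": 20, "9+": 10}
Y = {"1-2": 30, "3-4": 8, "5-8": 4, "9+": 1}


def unmarked_score(equal, x, y):
    """Return (optimal orders, all orders, percentage optimal)."""
    total = equal + sum(x.values()) + sum(y.values())
    optimal = equal + sum(x.values())  # "X + =" in the table
    return optimal, total, 100 * optimal / total


def y_share(x, y):
    """Percentage of Y orders among all X and Y orders, per column."""
    return {k: round(100 * y[k] / (x[k] + y[k])) for k in x}


optimal, total, pct = unmarked_score(EQUAL, X, Y)
shares = y_share(X, Y)
print(f"unmarked case: {optimal}/{total} = {pct:.1f}%")
print("share of Y per length difference:", shares)
```

The marked case prediction holds if the shares do not increase as the length difference grows, which can be checked by comparing the values of `shares` from left to right.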

Table 4. Japanese [{1PPm 2PPm} V]

150 pages of data: cf. Table 3

[(TopP) [{(S > 1) (NP-ga)} {1PPm 2PPm} V]] (no other ICs in the clause apart from optional adverbials in any position), where PPm = PP[NP P]

n = 66

                  1PPm = 2PPm    2PPm > 1PPm: 1 word    2-5    6+
X [2PPm 1PPm]         31                  4              18     3
Y [1PPm 2PPm]                             2               8     0
Ratio of Y/X                             33%             31%    0%

EIC predictions (assuming no VP discontinuity; consistent long before short predictions)
Unmarked case: 56/66 = most optimal (X + =), i.e. 84.8%   Correct
Marked case: ratio of Y/X for 2PPm > 1PPm: 1 ≥ 2-5 ≥ 6+   All correct


John A. Hawkins

2.3. EIC's predictions for grammar

EIC derives its predictions for the conventionalized ordering rules of grammars from the performance preferences that have been observed in free word order structures and languages and in ordering rearrangements. The basic premise is that if a grammar conventionalizes a particular ordering, i. e. "fixes" what was hitherto free ordering of a set of ICs, it will do so in a way that matches EIC's predictions for performance. Optimal orders will be grammaticalized in the unmarked case, and the frequency of any non-optimal orders across languages will be in proportion to the magnitude of their ratios. In order to assess which orderings of constituents are actually optimal, we need to know how long a constituent such as NP, PP, VP, or S' is on average, and the predictions for grammar are accordingly formulated in terms of EIC ratios applied to ICs of aggregate length. The general theory of PTOC predicts that there should be a correlation between performance and grammar, whereby, for example, the most frequent orderings of a given set of ICs in a free word order language will be the orderings that are generally conventionalized in fixed word order languages. This is captured in the following prediction:

(14) EIC Basic Order Prediction
     The basic orders assigned to the ICs of phrasal categories by grammatical rules or principles will be those that have the most optimal ratios for ICs of aggregate length, in the unmarked case; any basic orders whose EIC ratios are not optimal will be more or equally frequent across languages in direct proportion to their EIC ratios.

EIC also makes predictions for grammaticalized rules of rearrangement, such as Extraposition in English and for word order hierarchies involving center-embedded constituents of increasing length and complexity within a CRD.11 Consider as an illustration of (14) a binary branching structure containing a PP embedded in a VP, VP{V PP{P NP}}. There are four logically possible orderings of these constituents, whose CRDs for VP recognition are shown to the right of each structure (V constructs VP, and P constructs PP):

(15) a. VP[V PP[P NP]]     CRD: V P
     b. VP[PP[NP P] V]     CRD: P V
     c. VP[V PP[NP P]]     CRD: V NP P
     d. VP[PP[P NP] V]     CRD: P NP V


The (a) and (b) versions are both optimal for VP recognition, since two ICs, V and PP, can be recognized by two adjacent words. The (c) and (d) versions have longer CRDs and are less efficient, since the NP intervenes between V and P. It is accordingly predicted that (a) and (b) will account collectively for the great majority of languages, and they do. Current samples show that (a) or (b) are grammaticalized in at least 90% of languages (cf. Hawkins 1994: 257, which cites data from Matthew Dryer's language sample). EIC also predicts a slight preference for (c) over (d) within the set of marked languages when IC-to-word ratios are calculated left-to-right (cf. n. 6), a prediction which is also confirmed in this case. EIC accordingly explains the preferred adjacency of V and P. Head adjacency, and the adjacency of other mother-node-constructing categories (cf. Hawkins 1993), should not therefore be regarded as a primitive fact of grammar that has to be stipulated within a grammatical theory: it is a consequence of performance. More generally, the existence of two productive language types, the head-initial and the head-final type with their respective word order correlations (cf. Dryer 1992), follows from EIC since structures such as (15 a) and (15 b) can both attain optimal IC-to-word ratios. Their equal optimality is reflected in their roughly equal frequency across languages (cf. Hawkins & Cutler 1988 for a summary of the relevant language quantities in numerous language samples). And diachronically languages can be expected to gravitate towards one or the other type over successive generations, as speakers gradually conventionalize EIC's effects and remove any contrary word orders that may have been inherited or borrowed. This kind of evolutionary movement towards the EIC-defined ideal has been demonstrated in an intriguing computer simulation of EIC effects on word order fixing over time in Kirby (1994).
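The comparison of the four orderings in (15) can be made concrete with a small calculation. The following is a minimal sketch rather than the procedure of PTOC itself: it assumes a two-word NP (the constant `NP_WORDS` is hypothetical), measures the CRD from the word that constructs the first IC to the word that constructs the last IC, and reports the aggregate IC-to-word ratio, not the left-to-right calculation referred to in n. 6.

```python
NP_WORDS = 2  # hypothetical aggregate NP length in words


def crd_length(ics):
    """Length in words of the constituent recognition domain.

    ics: (n_words, constructor_position) pairs in linear order, where
    constructor_position is the 0-based position, inside the IC, of the
    word that constructs it. The CRD runs from the word constructing the
    first IC to the word constructing the last IC, inclusive.
    """
    start = ics[0][1]
    words_before_last = sum(n for n, _ in ics[:-1])
    end = words_before_last + ics[-1][1]
    return end - start + 1


V = (1, 0)
PREP_PP = (1 + NP_WORDS, 0)         # PP[P NP]: P is the first word
POST_PP = (1 + NP_WORDS, NP_WORDS)  # PP[NP P]: P is the last word

orders = {
    "a": [V, PREP_PP],   # VP[V PP[P NP]]
    "b": [POST_PP, V],   # VP[PP[NP P] V]
    "c": [V, POST_PP],   # VP[V PP[NP P]]
    "d": [PREP_PP, V],   # VP[PP[P NP] V]
}

# IC-to-word ratio: number of ICs divided by the words in the CRD.
ratios = {k: len(ics) / crd_length(ics) for k, ics in orders.items()}
for label, ratio in ratios.items():
    print(label, f"{ratio:.0%}")
```

Under these assumptions (a) and (b) reach the optimal ratio because V and P are adjacent, while in (c) and (d) every additional NP word lengthens the CRD and lowers the ratio.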
EIC therefore explains why there should be two productive language types for binary branching structures, and it asserts that basic orders are conventionalizations of performance preferences. (Where there are left-right asymmetries and/or other principles to counteract EIC in certain structures, cf. § 3.3, more variation is of course predicted, synchronically and diachronically.) In a multiple branching structure such as NP{N, Adj, S'} (e.g. [good movies [that we saw recently]]) there are six possible relative orderings, and these six jump to 12 if the internal structure of S' is distinguished according to whether the category that constructs it (such as a complementizer or relative pronoun and abbreviated here as C) occurs on a left or a right periphery, i.e. S'[C S] or S'[S C]. Four of these 12 relative orderings are optimal for EIC, since they permit recognition of the three ICs of NP in just three adjacent words: [N Adj [C S]], [Adj N [C S]], [[S C] Adj N], and [[S C] N Adj]. These four appear to be exactly the orderings that have been productively grammaticalized across


languages, and there is a progressive decline in grammaticalization down the remaining eight orders, in accordance with their EIC ratios, as predicted by (14).12 In addition to accounting for these universal generalizations, EIC makes some fine-tuned predictions for the rule formulations of the better-studied languages of Europe. For example, the distribution of single-word and multi-word ICs in the English NP and VP appears to be exactly what EIC predicts, and the relative ordering of complements of the head is also as predicted. The following regularities of English can be argued to be "performance-driven":13 (16)

a. If an IC is possibly a single-word in the CRD for NP (e. g. determiner, numeral, an AdjP without PP or VP' complements, N itself), then it precedes all necessarily multi-word ICs (such as PP, S', AdjP with PP or VP' complements).
b. If an IC is necessarily a single-word in the CRD for VP (e. g. V itself, Pro), then it precedes all possibly multi-word ICs (NP, PP, S').
c. The relative ordering of complements of a head category X is in accordance with their increasing aggregate weights [X NP PP S'].
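The count of four optimal orders among the twelve possible NP orderings of N, Adj and S' discussed before (16) can be checked by enumeration. The sketch below is an illustration under stated assumptions: the clause S is given a hypothetical length of four words (`S_WORDS`), N and Adj are treated as single words, and an ordering counts as optimal when the CRD for NP is exactly three words long. It does not attempt to rank the remaining eight orders.

```python
from itertools import permutations

S_WORDS = 4  # hypothetical length of the clause S, excluding C


def ic(label, c_side):
    """Return (n_words, constructor_position) for one IC of NP."""
    if label == "S'":
        n = S_WORDS + 1  # S plus the constructing category C
        return (n, 0) if c_side == "left" else (n, n - 1)
    return (1, 0)  # N and Adj: single words that construct themselves


def crd_length(ics):
    """Words from the constructor of the first IC to that of the last."""
    start = ics[0][1]
    words_before_last = sum(n for n, _ in ics[:-1])
    end = words_before_last + ics[-1][1]
    return end - start + 1


def show(order, c_side):
    """Bracketed notation, e.g. [N Adj [C S]]."""
    clause = "[C S]" if c_side == "left" else "[S C]"
    return "[" + " ".join(clause if x == "S'" else x for x in order) + "]"


orderings = [
    (order, side)
    for order in permutations(("N", "Adj", "S'"))
    for side in ("left", "right")
]
optimal = sorted(
    show(order, side)
    for order, side in orderings
    if crd_length([ic(x, side) for x in order]) == 3
)
print(len(orderings), "orderings,", len(optimal), "optimal:")
for s in optimal:
    print(" ", s)
```

The enumeration singles out the orders in which S' is peripheral in NP and C sits on the edge of S' adjacent to N and Adj, which are the four orders listed in the text.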

3. Some theoretical issues

The processing approach to linear ordering phenomena summarized in Section 2 raises a number of general issues in linguistic and psycholinguistic theory. First, does processing shape the grammatical conventions of rules to the extent that I have argued it does, and if it does, what consequences will this have for principles of linear ordering that are currently formulated in purely grammatical terms? Second, what is the precise interaction between the role of syntactic processing, as represented by EIC, and non-syntactic determinants of linear ordering involving information status (i. e. the givenness or newness of entities to the hearer)? PTOC claims that it is grammar processing that drives linear ordering selections in performance, not information status, and it claims further that observed pragmatic regularities in free word order structures and languages can be derived as secondary consequences of EIC. What is the evidence for these claims? Third, what is the relationship between EIC and other principles of syntactic and/or semantic processing? And fourth, does the EIC efficiency metric apply equally to production and to comprehension, or just to the latter? It is not my goal in this context to try to give detailed answers to these questions. Space limitations preclude this. I simply want to define the issues in the clearest possible way, and mention the kinds of data and linguistic generalizations that can contribute to their resolution in the future. In particular, I shall discuss some ideas and data that have been brought to my attention by collaborators in the EUROTYP project.

3.1. The grammaticalization of processing principles

There has been a long-standing assumption within generative grammar, since Chomsky (1965), that processing difficulty can determine only the acceptability status of sentences, and not their grammaticality. According to this view, structures that are difficult to use, such as self-embedded relative clauses or certain center embeddings, are grammatical (i. e. they are generated by the grammar), but unacceptable (according to certain independent principles of e. g. structural complexity, cf. Miller & Chomsky 1963). This assumption has been challenged by a handful of proposals, primarily by psycholinguists and computational linguists, arguing that certain conventions of the grammar have been shaped by processing and that the formal properties of the rules or principles describing these conventions can be explained by independently needed mechanisms of performance. For example, Janet Fodor (1978) gives a parsing explanation for the Nested Dependency Constraint; Berwick & Weinberg (1984) explain Subjacency in terms of the bound on left context within their parser; and Frazier (1985) proposes parsing explanations for a number of grammatical phenomena both language-particular (such as Complementizer Deletion environments in English) and universal (such as Head Adjacency). Chomsky himself has also explicitly allowed for the possibility that processing explanations for grammatical phenomena can be invoked at the level of the evolution of the species, cf. Chomsky & Lasnik (1977). Considerations of performance may have shaped the human language faculty in its evolution, favoring some innate universals of grammar over others (cf. Newmeyer 1990, 1991 for a more detailed discussion).
But at the level of a particular grammar, performance principles and competence rules or principles are claimed to be quite orthogonal, and the latter cannot be explained in terms of the former, except via a possible processing motivation for an innate grammatical universal which in turn structures the grammar and possible parameter settings for a particular language. We can refer to this view of the relationship between performance and grammar as the assumption of pure acceptability. There are several problems with this assumption. At the level of particular grammars, it does not account for variation across languages with respect to the much-discussed sentence types that involve performance difficulty, the self-embedded relatives (e.g. the mouse that the rat that the cat chased bit died) and center-embedded finite clauses (e. g. did that Harry lost the race surprise Mary?). If such sentence types are all grammatical but unacceptable, we expect similar judgments by native speakers across languages, in those languages that have independently motivated grammatical conventions capable of generating them. But judgments are not constant. Self-embedded relatives are rejected and essentially unattested in English; in German they are accepted and attested, as long as argument-predicate relations are sufficiently clear and distinctive so that they can be recovered from what is evidently still a difficult structure. The obvious empirical generalization to be made here is that self-embedded relatives do exist in German, but not in English. Similarly, center embeddings of finite clauses are impossible in English, but they are accepted and attested in Japanese, as long as the length of the center embedding is not too great.14 What this suggests is that the grammars of English, German and Japanese differ with respect to these difficult structures. Self-embedded relatives are not generated by the grammar of English, whereas they are generated by the grammar of German, and different levels of acceptability are then assigned to these German sentences based on the ease with which the respective argument-predicate relations can be recognized. Similarly, finite center embeddings are blocked by the grammar of English, but not by the grammar of Japanese. Grammatical conventions appear to differ with regard to structures that are difficult to use, therefore, and hence processing ease does shape the grammatical conventions of particular languages, resulting in conventionalized responses in many languages whose effect is to avoid the generation of structures that are difficult to perform.
One way to try to avoid this conclusion would be to argue that the different judgments made by English, German and Japanese native speakers are the result of differences in performance mechanisms across these languages and language users. But since the structures in question (self-embeddings, etc.) are identical, it is unclear why they should cause more difficulty for the performance mechanisms of one set of language users than for those of another. This approach is also at variance with an assumption that is now widely held among psycholinguists who have investigated syntactic processing across languages, to the effect that the fundamental principles of processing are universal. What varies is simply the set of grammaticalized structural options that are made available by each language and to which an identical and minimal set of parsing operations applies as needed (cf. e.g. Frazier 1985, Frazier & Rayner 1988, Inoue & Fodor 1994, Inoue 1991). The Chomskyan assumption that performance principles and competence principles are orthogonal at the level of particular grammars, and that processing may, at best, have impacted the evolution of the innate grammar, does not account for these kinds of cross-linguistic differences.15 At the same time, these differences raise the following question: if processing can indeed shape grammatical conventions, resulting in a "grammaticalization" of certain rules or constraints on rules whose effect is to make the structures generated by the grammar more user-friendly, then in what ways precisely will it do so and why aren't difficult structures blocked in all grammars? PTOC addresses this issue and points to an interesting correlation between the logical structuring of many grammatical conventions across languages, typically captured in the form of implicational universals, and degrees of complexity and performance difficulty defined by the theory and metric of Structural Domains (cf. Section 2.0). This correlation can be seen in the grammaticalized and hierarchically arranged environments for relative clause formation across languages (discussed by Keenan & Comrie 1977) and in the correlating degrees of performance difficulty associated with these environments (established by Keenan & S. Hawkins 1987). It can be seen in hierarchies of increasingly complex center-embedding environments, whereby the center-embedding of a finite clause implies the possibility of a center-embedded non-finite clause, which in turn implies the possibility of center-embedding even less complex non-clausal nodes, etc.16 It can be seen also in the correlations between the performance preferences of EIC defined on free choice word order alternatives, exemplified in Tables 1 through 4 above, and the cross-linguistic ordering rules that have been grammaticalized, as illustrated in Section 2.3. By examining implicational universals and hierarchies in the structuring of language-particular conventions in this way, we see a clear and apparently general correlation with complexity and performance difficulty.
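The logic of an implicational hierarchy of this kind can be stated as a simple check. The sketch below uses the three center-embedding environments named in the text, ordered from least to most complex; the function name `conforms` and the sample inventories are hypothetical and serve only to show what the implication rules out. They are not descriptions of any particular language.

```python
# Center-embedding environments, least complex first (as ordered in the text).
HIERARCHY = ["non-clausal phrase", "non-finite clause", "finite clause"]


def conforms(permitted, hierarchy=HIERARCHY):
    """True if the permitted environments respect the implication.

    Permitting a more complex environment implies permitting every less
    complex one, so the permitted set must be an unbroken run starting
    at the least complex end of the hierarchy.
    """
    flags = [env in permitted for env in hierarchy]
    return all(earlier or not later for earlier, later in zip(flags, flags[1:]))


# Hypothetical inventories used only to exercise the check.
samples = {
    "all three": set(HIERARCHY),
    "no finite clauses": {"non-clausal phrase", "non-finite clause"},
    "finite clauses only": {"finite clause"},
}
results = {name: conforms(envs) for name, envs in samples.items()}
for name, ok in results.items():
    print(f"{name}: {'consistent' if ok else 'violates the hierarchy'}")
```

An inventory that allows finite clauses to be center-embedded while excluding the simpler environments is the only pattern of the three that the implication excludes.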
The theory of Structural Domains defines complexity differences between different instances of one and the same grammatical structure or parsing operation. As a result it makes many predictions for performance and for grammaticalization across languages, and these predictions need to be tested further. For example, more data must be collected, from performance and grammar, that test for increasing complexity in the domains for WH-movement. More generally, all cross-linguistic hierarchies need to be examined from the perspective of performance in order to see whether complexity provides an explanation for what appears to be an extremely general source of structuring in cross-linguistic regularities. Just as there are hierarchies in syntax, so too there are hierarchies of expanding vowel and consonant inventories in phonology, for which processing explanations have been offered by Lindblom et al. (1984) and Lindblom & Maddieson (1988). Richer phonological inventories, according to these authors, are associated with more elaborated and complex phonetic variables.


There is also evidence for increasing conceptual complexity in many of the semantic markedness hierarchies discussed by Croft (1990).17 The metric of structural complexity proposed here, and the proposed correlation between implicationally arranged grammatical generalizations involving Structural Domains and degrees of performance difficulty, provides at least a partial answer to the question of which structures will be grammaticalized across languages. Languages will conventionalize grammatical structures in accordance with their ease of use, preferring co-occurring linear ordering rules that make for optimal constituent structure recognition, grammaticalizing a more complex relative clause structure only if all simpler relative clauses of the same type are already grammatical, and so on. This theory defines no absolute prohibitions against structures that are supposedly unprocessable, such as self-embeddings or subjacency violations, whose unprocessability has been claimed to follow from a particular processing architecture. There are too many counterexamples in languages other than English to the proposals that have been made, and the evidence supporting these architectures is often speculative and weak. What does seem to be empirically supported, both in performance data from individual languages and in cross-linguistic comparison, is the notion of degree of complexity and difficulty, with structures being attested within and across languages in proportion to their degree of processing difficulty. The so-called "unprocessable" structures are simply those with a high degree of associated processing difficulty, which makes them relatively infrequent. It is conceivable that the processing load associated with certain structures will be so great as to make them completely impossible to use. But such a claim will require careful examination of many more languages than have informed current processing theories.
In the interim, claims of absolute unprocessability are largely premature, while the correlation between cross-linguistic implicational regularities, such as hierarchies, and degrees of processing difficulty seems to be empirically more defensible, and no less significant a finding. Notice now one consequence of this correlation for the field of psycholinguistics. If grammars are as responsive to processing as we claim, then grammatical conventions become directly relevant for theories of processing. A particular grammatical convention may be a direct response to performance, rather than an instantiation of some ultimately innate grammatical principle or parameter. Whether processing is a plausible explanation in any one case needs to be argued for, of course, by showing that the processing theory can subsume facts or generalizations that the grammatical theory cannot, or that the grammatical convention is simply stipulated and doesn't follow from anything on the innate grammar account, whereas performance does provide an explanation. Arguments of this type are adduced in PTOC to show that the grammaticalized ordering conventions of Section 2.3 are not adequately explained by grammatical principles and parameters in the literature, and that they do follow readily from performance, specifically from EIC. What remains to be more clearly established is: what other principles of processing are reflected in grammatical conventions apart from EIC (cf. Sections 3.2 and 3.3)? and what impact does shifting the ultimate explanation from the innate grammar module to the performance module have for the grammatical generalizations themselves? For a model such as Government-Binding theory, the philosophy of PTOC advocates that there are no innate and parameterized universals of directionality for case and theta-role assignment (cf. Koopman 1984, Travis 1984, 1989). The innate grammar may contain general principles of linguistic structure, but it is performance, in the form of EIC, that determines the orderings of elements that become conventionalized in particular languages. Generative rule types and principles such as directionality of case- and theta-role assignment may still be useful at the level of describing particular languages, however, and in predicting all and only the grammatical sentences in the relevant domain, but they are no longer a part of UG itself. For certain other grammatical principles our empirical findings suggest that they should be eliminated altogether. The Axiom of No Crossed Branching, which disallows crossed branching and which has guided the formulation of phrase-structure rules since their inception (cf. e.g. Chomsky 1956), is a principle which seems at first sight to be highly consistent with, and perhaps even motivated by, processing. If the daughters of all mother nodes are continuous and adjacent to one another, then the correct recognition of constituent structure relationships would appear to be simplified. However, performance data such as Extraposition from NP in German (cf. (11)) provide direct evidence for discontinuity of structure and against the Axiom of No Crossed Branching. The data are correctly predicted on the assumption that the extraposed S' is still (discontinuously) attached to its mother NP, and the relative advantages for the one CRD (e. g. the containing VP) are then measured against the disadvantages for the other (the discontinuous and now longer NP CRD). This theory assumes discontinuity, therefore, and makes correct predictions derived from this assumption. It does not appeal to alternative grammatical mechanisms such as the extraction of S' from its mother NP and attachment to a higher S (cf. Reinhart 1983), with co-indexation between S' and some empty category in the original position of the S' (adjacent to the head noun). Such an alternative would predict a pattern of results that make Extraposition from NP look fundamentally similar to Extraposition of a clause (in which the extraposed S' is attached to the rightmost position of VP, cf. Reinhart 1983).


But these two types of extraposition are not at all similar. Extraposition of a clause is always preferred in performance over non-extraposition (cf. Erdmann's 1988 data from English); Extraposition from NP is only preferred when it results in a (short) distance between S' and its head noun, and when the declining EIC ratio for the NP does not offset the advantage for the containing VP resulting from the extraposition. As a result, Extraposition from NP is very infrequent out of a complex NP in a subject and/or topic position, since the whole matrix VP must then intervene between the discontinuous NP constituents (cf. Shannon 1992, 1995 for relevant supporting data). EIC, as currently formulated, predicts this. More generally, the adjacency of daughter ICs is not always optimal for processing, since the processing advantages for distinct nodes can be in conflict with one another. In structures in which there is no such conflict, EIC and the Axiom of No Crossed Branching are congruent. But if there is a conflict, EIC makes the right predictions (assuming discontinuity), and the Axiom is not supported. Hence, the Axiom appears to be false, and EIC can account both for the cases in which it is correct, and for those in which it is false. The Axiom of No Crossed Branching should be removed from the grammar altogether (cf. McCawley 1982, 1987 for a similar conclusion), and its putative effects should be derived from performance principles that structure the grammatical conventions of particular languages. One model of grammar which has taken typological word order differences across languages very seriously and which has come up with a number of interesting structural principles to account for them is Functional Grammar, cf. Dik (1978, 1989), Rijkhoff (1992), and Siewierska (1991 b). These principles include: LIPOC (i. e. the Language-Independent Preferred Order of Constituents, cf.
Dik 1989: 351); Head Proximity ("the head of the domain tends to be contiguous with the head of the superordinate domain", cf. Rijkhoff 1992: 229); Domain Integrity ("constituents prefer to remain within the boundaries of their proper domain", cf. Rijkhoff 1992: 221); and the Relator principles ("relators have their preferred position: (a) at the periphery of the relatum with which they form one constituent (if they do so); (b) in between their two relata", cf. Siewierska 1991 b: 207). These principles are, in essence, statistical generalizations across grammars, and these generalizations appear to be correct. For example, heads are generally contiguous, in accordance with Head Proximity (e. g. P, the head of PP, is generally contiguous to V, the head of VP, in structure (15) above); and immediate constituents are preferably continuous rather than discontinuous, in accordance with Domain Integrity. But such preferences are exactly what EIC defines: CRDs are optimal when heads are contiguous as in (15); continuous ICs


are predicted by EIC for most structures, except in a certain subset of conflicting CRDs exemplified by (11), for which EIC also predicts when discontinuity will be preferred and when not; the Relator principles also appear to follow from EIC, on the assumption that Relators are constructing categories in parsing; and so does LIPOC.18 These statistical generalizations appear to follow from EIC, therefore. What are stipulated structural conditions with stipulated preferences in Functional Grammar can be explained by principles and preferences of language performance that are independently motivated. But if the structural preferences of these many grammatical principles of FG are indeed identical to the preferences defined on these same structures by a single principle, EIC, then maintaining the former in place of the latter will result in a loss of significant generalizations. It may be typologically useful to maintain certain separate subregularities for descriptive convenience, but if there is a higher-order generalization, it needs to be captured somewhere. In a similar way, if one were to formulate several rules of ordering for ICs of the English NP along the lines of 'the definite article precedes count nouns', 'the definite article precedes mass nouns', 'the demonstrative determiner precedes singular and plural nouns', etc. one would miss the correct linguistically significant generalization: 'determiners precede nouns'. Some issues for further research within this model are therefore: can some of its many principles be eliminated in favor of a single generalization? and does performance actually explain the grammatical preferences that are simply stipulated in FG at the present time?

3.2. EIC and pragmatic principles

The processing theory proposed here provides a unified theory of performance and grammar and claims that linear orderings in free word order languages and linear orderings in fixed word order languages are regulated by the same principle. Where there are no grammaticalized ordering conventions defined on the ICs of a given phrase, EIC predicts orderings based on the syntactic weight that each IC happens to have in each performance instance, ultimately for expressive reasons (more precisely, based on the amount of structural material that each IC adds to the CRD for this phrase, as defined in (5) above). Where there are grammatical conventions, these are also claimed to be determined by EIC, based on the weight aggregates of the ICs. This view of free word order languages amounts to the claim that the selection of alternative orderings in these languages is driven by syntactic processing, and not by pragmatic information status and the "givenness", "newness", and

752

John A. Hawkins

"predictability" of entities that are referred to in discourse, as is widely believed (cf. e.g. Firbas 1964, 1966; Thompson 1978; Givon 1983, 1988; Mithun 1992; Payne 1992). The syntactic processing alternative can, it is claimed, derive pragmatic ordering effects from EIC. Notice that there is considerable disagreement in the pragmatic word order literature over what the precise informational correlates of ordering are supposed to be. There are two major research traditions: the Prague School (cf. e. g. Firbas 1964, 1966) claims that "communicative dynamism" increases in the left-to-right presentation of a sentence as increasingly new information in the discourse follows given information; Givon and his followers (cf. e. g. Givon 1983, 1988) propose that "task urgency" regulates ordering, whereby, inter alia, new and unpredictable information precedes predictable and more given information. But these two research traditions are, in essence, advocating contradictory theses about discourse organization, so they can't both be right! Empirically, PTOC tests these pragmatic theories on a subset of the free word order data and languages upon which EIC was tested, using the quantified methodology of Givon and focusing primarily on degrees of givenness and newness of entities measured by referential distance to a previous mention (if any) within the immediately preceding 20 clauses of a text. Two findings emerge. First, the pragmatic theories make less successful predictions than EIC. Both given before new (Prague School) and new before given (task urgency) orders are productively attested in each language examined (English, Hungarian, German and Japanese). Second, there are, however, interesting correlations between information status and EIC. These results can be exemplified by first examining a subset of the English data of Table 1 involving the relative ordering of two post-verbal PPs. The data are set out in Table 5. 

Table 5. Given-New and New-Given versus EIC in English [NP V {PP1 PP2}]
90 pages of data: first 90 pp. from D. H. Lawrence The Fox, Bantam Books; cf. Table 1 for structural template and additional EIC data

I. EIC predictions in relation to Given-New

EIC status       No.   EIC ratio of        No. of critical   GN ratio of
                       correct:incorrect   GN preds.         correct:incorrect
                       critical preds.                       critical preds.
Non-optimal      10    21%                 4/10              0/4   =   0%
Equals           21                        14/21             3/14  =  21%
Optimal (1)      16    79% (all optimal)   14/16             11/14 =  79%
Optimal (2-8)    20                        14/20             12/14 =  86%
Optimal (9+)      2                        2/2               2/2   = 100%
Totals           69    79%                 70%               58%

II. EIC predictions in relation to New-Given

EIC status       No.   EIC ratio of        No. of critical   NG ratio of
                       correct:incorrect   NG preds.         correct:incorrect
                       critical preds.                       critical preds.
Non-optimal      10    21%                 4/10              4/4   = 100%
Equals           21                        14/21             11/14 =  79%
Optimal (1)      16    79% (all optimal)   14/16             3/14  =  21%
Optimal (2-8)    20                        14/20             2/14  =  14%
Optimal (9+)      2                        2/2               0/2   =   0%
Totals           69    79%                 70%               42%

This table tests the "critical predictions" of EIC and of given before new ordering. Critical predictions are those in which a theory defines a preference for the ordering [AB] and against the alternative [BA], as opposed to remaining neutral to, and equally compatible with, both orders. For EIC, critical predictions are defined on any pair of ICs whose weights are different rather than equal. For the Prague School's communicative dynamism theory, critical predictions are defined on any pair of ICs that do actually differ in givenness within the preceding 20 clauses, as opposed to being equally given or equally new, i. e. one IC might be previously mentioned and the other not, or one might be mentioned more recently than the other. In both cases the one IC will have a lower referential distance than the other and will be predicted by this theory to come first. For the opposite (task urgency) theory, the newer IC with greater referential distance will be predicted to come first.

The first column in section I of Table 5 groups the orderings according to their EIC status: non-optimal (i.e. a longer PP2 precedes a shorter PP1), equal in length, or optimal (a shorter PP1 precedes a longer PP2, by 1 word, 2-8 words, and 9+ words). Ignoring the equals cases, the optimal orderings defined by EIC are attested in 79% of the data, and the non-optimal orderings in 21%, making a 79% success rate for the critical predictions of EIC. (Recall that most of these non-optimal orderings involve only a one-word differential and hence only a very mild departure from the EIC-defined ideal, cf. Table 1.) Within these groups of data classified according to their EIC status, the next column quantifies the number of orderings for which given before new (GN) defines a critical prediction, i. e. there is some difference in referential distance between the two PPs and previous mention of the entities referred to by the NP immediately contained within these PPs. GN defines a critical prediction for 70% of these data. The final column then calculates the ratio of correct to incorrect critical predictions within these same groups of data. The overall success rate
of GN is only 58%, which means that as many as 42% of the two PPs are ordered new before given and so support the opposite task urgency theory. But what is interesting is that there is a direct correlation between the increasing success rate of GN down the column and the increasing preferences and successes defined by EIC, resulting from the increasing weight differences between the two PPs. In other words, GN does best when EIC does best, even though the overall success rate for GN (58%) is significantly lower than that of EIC (79%), for these critical predictions. Correspondingly, NG does best when EIC does worst (in long-before-short orders), as shown in section II of Table 5. But success rates of 58% and 42% are close to random distribution; 79% is nonrandom, meaning success four times out of five in critical cases (and optimality in close to 90% of the data when the orderings with equal weights are also considered). Two issues are raised by these data. First, are pragmatic predictors less good predictors of free word order selections in general than syntactic processing? Second, why are there correlations between GN and EIC in English? With regard to the first question, all the data considered in PTOC comparing EIC with pragmatic predictions (both given before new and new before given) in English, Hungarian, German and Japanese corroborate the results of Table 5: the pragmatic predictions are significantly less successful than EIC. A similar conclusion has been reached by Primus (1994) with regard to "middle field" constituents in German. For Polish, Siewierska (1991 a, 1993) has compared the weight aggregates of subjects and objects in the six logically possible orderings of {S, V, O} with the givenness and newness status of these variably positioned NPs, measured in terms of referential distance aggregates. The weight aggregates are very much in accordance with EIC (cf. n. 
8), and her referential distance figures show an almost perfect correlation with weight (cf. below). But just as the packaging of syntactic data in terms of weight aggregates precludes a direct testing of EIC's Text Frequency Prediction (cf. (9)), for the reasons discussed in footnote 8, so too the use of aggregates for the referential distance measurements effectively conceals the number of individual orderings for which EIC or pragmatics makes correct predictions. These data as presented do not decide between one or the other approach, therefore. With regard to the second question (why are there correlations between GN and EIC?), it turns out that the syntactic weight-information status correlation is not consistently that which is presented in Table 5. If EIC predicts the positioning of a short constituent before a longer one (for left-recognized ICs of the type mIC), then the corresponding pragmatic correlation is indeed given before new. But in those structures for which EIC predicts long before short
(i.e. for right-recognized ICs of the type ICm), as in Japanese head-final PPs and NPs, then the corresponding pragmatic correlation is: new entity before given entity, or NG, i. e. just the opposite of English. This is shown in Table 6 for a subset of the Japanese data presented in Table 4 involving two PPs. EIC gets 73% of its critical predictions correct, compared with 58% for GN, and hence 42% for NG. But this time given before new gets its successes precisely when EIC is incorrect, and orders are short before long, as shown in the rightmost column of section I in Table 6. When weights are equal or optimal for EIC, GN's successes decline to the point of being random (50% and 45%). Correspondingly, the successes of new before given increase as EIC's long-before-short predictions are increasingly successful, as shown in section II. NG gets predictions correct in those Japanese orderings for which EIC makes correct (long before short) predictions, whereas GN gets its predictions correct in English orderings for which EIC makes correct (short-before-long) predictions.

Table 6. Given-New and New-Given versus EIC in Japanese [{1PPm 2PPm} V]
70 pages of data: first 70 pp. of data from Juugo Kuroiwa Yami No Hada, Kobunsha Tokyo; cf. Table 4 for structural template, additional EIC data and EIC calculation assumptions.

I. EIC predictions in relation to Given-New

EIC status       No.   EIC ratio of        No. of critical   GN ratio of
                       correct:incorrect   GN preds.         correct:incorrect
                       critical preds.                       critical preds.
Non-optimal       7    27%                 5/7               5/5   = 100%
Equals           22                        10/22             5/10  =  50%
Optimal          19    73%                 11/19             5/11  =  45%
Totals           48    73%                 54%               58%

II. EIC predictions in relation to New-Given

EIC status       No.   EIC ratio of        No. of critical   NG ratio of
                       correct:incorrect   NG preds.         correct:incorrect
                       critical preds.                       critical preds.
Non-optimal       7    27%                 5/7               0/5   =   0%
Equals           22                        10/22             5/10  =  50%
Optimal          19    73%                 11/19             6/11  =  55%
Totals           48    73%                 54%               42%
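
The summary percentages in Tables 5 and 6 can be recomputed from the raw counts. The following is a minimal Python sketch, assuming the counts as transcribed from the two tables; the tuple layout and the names `pct` and `summarize` are illustrative assumptions and are not part of PTOC. It relies on the fact that, for any pair that differs in givenness, a wrong GN prediction is a correct NG prediction.

```python
from fractions import Fraction

# Each row: (EIC status, number of orderings,
#            number of critical GN/NG predictions, number correct for GN).
# Counts are transcribed from Tables 5 and 6; the layout is an assumption.
TABLE_5_ENGLISH = [
    ("non-optimal", 10, 4, 0),
    ("equal", 21, 14, 3),
    ("optimal", 16, 14, 11),  # short before long by 1 word
    ("optimal", 20, 14, 12),  # short before long by 2-8 words
    ("optimal", 2, 2, 2),     # short before long by 9+ words
]
TABLE_6_JAPANESE = [
    ("non-optimal", 7, 5, 5),
    ("equal", 22, 10, 5),
    ("optimal", 19, 11, 5),
]


def pct(numerator, denominator):
    """Percentage rounded to a whole number, as reported in the tables."""
    return round(100 * Fraction(numerator, denominator))


def summarize(rows):
    """Recompute the 'Totals' line of one table from its rows."""
    total = sum(n for _, n, _, _ in rows)
    optimal = sum(n for status, n, _, _ in rows if status == "optimal")
    non_optimal = sum(n for status, n, _, _ in rows if status == "non-optimal")
    critical = sum(c for _, _, c, _ in rows)
    gn_correct = sum(g for _, _, _, g in rows)
    return {
        # EIC is only tested on pairs of unequal weight.
        "eic_success": pct(optimal, optimal + non_optimal),
        # Share of orderings whose two PPs differ in givenness.
        "critical_share": pct(critical, total),
        "gn_success": pct(gn_correct, critical),
        # Every critical pair that GN gets wrong, NG gets right.
        "ng_success": pct(critical - gn_correct, critical),
    }


print(summarize(TABLE_5_ENGLISH))
print(summarize(TABLE_6_JAPANESE))
```

Under these assumptions the sketch reproduces the totals cited in the text: 79%, 70%, 58% and 42% for English, and 73%, 54%, 58% and 42% for Japanese.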


In both language types, therefore, short before long correlates with given before new, and long before short with new before given, even though the weight distribution predictions are different in English and Japanese. Is this correlation to be expected? I believe it is. It makes sense when we invoke an additional empirical finding in this context, namely: shorter constituents refer to more referentially given and definite entities, longer constituents to newer and more indefinite entities, irrespective of how these constituents are ordered. In other words, new entities require more words for referent identification than given and previously mentioned entities. As a result, if we have a theory predicting short before long ordering for constituents of the type mIC, and long before short ordering for constituents of the opposite type ICm, as EIC does, then a further correlation is automatically predicted with given before new and new before given respectively, simply by adding the length-information status correlation to EIC. These pragmatic theories are not sufficiently well supported empirically to be considered primitive generalizations in their own right. And theoretically they provide no explanation for why given before new should be favored in one structural type, and new before given in another. But all the performance data are readily explained, even the pragmatic data, if we take EIC as the primitive and basic determinant of order in performance, and add to it an independent correlation between the length of a constituent and its information status. Pragmatic theories appear to be epiphenomenal, therefore: they are derived or secondary consequences of a more primitive generalization, EIC, and of an independent correlation between constituent length and pragmatic status. Support for this independent correlation comes from Siewierska's (1991 a, 1993) aggregated data from Polish, summarized in Table 7. 

Table 7. Weight and Referential Distance Aggregates in Polish {S, V, O}

I. The six orders of {S, V, O}

Order   Overall      Subject                       Object
        frequency    RD      Weight   RD/Weight    RD      Weight   RD/Weight
VSO      7.1%         3.2    1.07     3.0          16.6    6.6      2.5
SVO     69.1%         7.6    2.5      3.0          11.9    3.69     3.2
SOV      2.2%         4.9    1.61     3.0           5.6    1.63     3.4
VOS     10.7%        16.7    5.02     3.3           3.3    1.3      2.5
OVS      9.0%        15.1    3.67     4.1           7.2    3.25     2.2
OSV      1.6%         5.0    1.08     4.6           7.6    3.0      2.5

II. RD-Weight correspondences

RD      Weight
16.6    6.6
16.7    5.02

11.9    3.69
15.1    3.67
 7.2    3.25
 7.6    3.0
 7.6    2.5

 5.6    1.63
 4.9    1.61
 3.3    1.3
 5.0    1.08
 3.2    1.07

Section I of this table gives the aggregate referential distance (RD) to a previous mention (the lower the score, the more given the entity), and the corresponding aggregate weight for each NP (the lower the score, the fewer words in the NP), and then divides the RD by the weight in each case in order to quantify and compare the relationship between the two in each linear ordering. This relationship is quite consistent. Within each of the six orders, the NP with less weight has less RD as well (i.e. is more given). Section II then displays the correspondences between RD and weight for each NP: the NPs with low aggregate weights in the range 1.07 to 1.63 all have the lowest RDs, in the range 3.2 to 5.6; the NPs with intermediate aggregate weights from 2.5 to 3.69 have intermediate RDs in the range 7.2 to 11.9 with one higher instance of 15.1; and the highest aggregate weights of 5.02 and 6.6 are associated with the highest RDs of all, 16.7 and 16.6. Overall, there is a consistent relationship between weight and RD, therefore, with the RD score being on average 3.1 times greater than the weight score, and with the range of variation spanning 2.2 times through 4.6 times greater.
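
The RD-to-weight ratios of Table 7 can be checked directly from the aggregates. The following is a minimal Python sketch, assuming the twelve (RD, weight) pairs as transcribed from the table; the dictionary layout and the function names are illustrative assumptions, not Siewierska's or Hawkins's own procedure.

```python
# (RD, weight) aggregates for subject and object in each order,
# transcribed from Table 7. The layout is an assumption of this sketch.
TABLE_7 = {
    "VSO": {"S": (3.2, 1.07), "O": (16.6, 6.6)},
    "SVO": {"S": (7.6, 2.5), "O": (11.9, 3.69)},
    "SOV": {"S": (4.9, 1.61), "O": (5.6, 1.63)},
    "VOS": {"S": (16.7, 5.02), "O": (3.3, 1.3)},
    "OVS": {"S": (15.1, 3.67), "O": (7.2, 3.25)},
    "OSV": {"S": (5.0, 1.08), "O": (7.6, 3.0)},
}


def rd_weight_ratios(table):
    """RD divided by weight for every NP in every order."""
    return [rd / weight for nps in table.values() for rd, weight in nps.values()]


def lighter_np_is_more_given(table):
    """For each order, check that the lighter NP also has the lower RD."""
    result = {}
    for order, nps in table.items():
        (s_rd, s_weight), (o_rd, o_weight) = nps["S"], nps["O"]
        result[order] = (s_weight < o_weight) == (s_rd < o_rd)
    return result


ratios = rd_weight_ratios(TABLE_7)
mean_ratio = sum(ratios) / len(ratios)

print(round(mean_ratio, 1), round(min(ratios), 1), round(max(ratios), 1))
# prints: 3.1 2.2 4.6
print(all(lighter_np_is_more_given(TABLE_7).values()))
# prints: True
```

On these figures the mean ratio and its range match the values stated in the text, and the lighter NP is the more given one in all six orders.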

There is one important and partial proviso that needs to be made to all of this. Certain grammaticalized phrasal nodes in a number of languages have been argued to have a primarily pragmatic function, or both a pragmatic and a semantic function, as opposed to a syntactic one. These nodes are referred to as "discourse-configurational" nodes in Kiss (this volume). Examples are the various kinds of topic and focus nodes, as in Hungarian (cf. Kiss 1987) and Finnish (cf. Vilkuna 1989). The existence of these nodes in grammars is plausibly explainable ultimately on the basis of the pragmatic notions they express, notions such as the "aboutness" relation for topics discussed in Reinhart
(1981), and the selection of constituents to fill these nodes in performance is then made in accordance with these notions. But while the existence of these nodes may be pragmatically motivated, it does not follow, as is often assumed, that their positioning is pragmatically determined as well. These nodes cooccur in a tree diagram with nodes that are mostly syntactic and non-discourse-configurational, and whose ordering has been argued to be driven by EIC. PTOC argues that the positioning of discourse-configurational nodes is no different, and is driven by EIC as well. For example, the positioning of a topic phrase on the left periphery of the sentence in languages such as Hungarian reflects the weight distribution of topic and predicate, topics being shorter, and so results in optimal EIC ratios. This argument is based on the more detailed discussion of the relationship between topic and focus nodes and the EIC principle given in Primus (1991, 1993). Primus argues that the positioning of these nodes exhibits clear weight sensitivities of the type EIC predicts. And while there is a plausible cognitive explanation for the universally preferred leftward skewing of topics (cf. Sasse 1987 for what is essentially a logical priority or order of computation argument that entities must first be conceived of before they can be commented on), this explanation does not generalize to explain the much more variable positioning of focus nodes. Since then, however, Primus (1995: 211) has modified her position, suggesting that the processing principle that drives topic positioning may not be EIC, but may be the same principle that regulates other left-right asymmetries such as antecedent before anaphor preferences, wide scope quantifiers before narrow scope quantifiers, etc. I shall return to this point in the next section.
In the meantime, it is important to make the following distinctions and to raise the following issues in connection with pragmatic principles of linear ordering. We need to distinguish, on the one hand, between pragmatic theories for free ordering phenomena, i. e. those orderings for which there is no grammatical rule, principle or constraint and no grammaticalized discourse-configurational node and positioning, and whose selection in performance has been claimed to be determined by pragmatic distinctions such as givenness and newness; and pragmatic theories for the grammaticalized conventions of particular languages, on the other hand, for example the discourse-configurational nodes and their positioning. The results of Tables 5 and 6 comparing EIC with pragmatic predictions for free ordering phenomena suggest that pragmatic information status is not a primary determinant of order in its own right, but is derivative of EIC and of an independent correlation between the amount of syntactic weight in a phrase and degree of givenness. Clearly, this claim needs to be tested on further languages, using syntactically analyzed structures for which
both EIC and pragmatic theories make testable predictions. The results of Primus (1991, 1993) suggest further that the positioning of discourse-configurational nodes is not driven by pragmatics either, but by syntactic processing. Primus (1995) argues that EIC is not the processing principle that is relevant here, at least for topic nodes, and this raises the question of what this alternative principle is and of how it interacts with EIC to predict the weight-based sensitivities of focus nodes.

3.3. EIC and left-right asymmetries

EIC is not the only processing principle regulating linear ordering proposed in PTOC, though it is the major one. In addition, the principles of Promotion Attachment and Promotion Construction predict the possibility of certain focus positions occurring immediately adjacent to a verb.19 And the principle of Immediate Matrix Disambiguation predicts, in conjunction with EIC, a number of left-right asymmetries involving the positioning of sentential nodes across languages, e. g. the rightward skewing in favor of postnominal versus prenominal relative clauses.20 The general expectation made in PTOC is that all orderings will be regulated by EIC and will be optimally efficient or nearly so, and that within the range of options so permitted certain additional principles can then constrain the logical possibilities further. PTOC thereby attempts to derive a very broad range of data from EIC, including certain other left-right asymmetries such as the well-known statistical preference for subject before object ordering in grammars and performance. It remains to be seen whether EIC can indeed cover as much ground as this theory claims. In any scientific theory we want to derive as many empirical facts as possible from as few principles as possible, and EIC was proposed initially, as I explained in the Introduction, in reaction to a multiplicity of seemingly unrelated principles, across models and approaches, and sometimes within one and the same model. But the question now arises as to whether it needs to be trimmed back, in the light of other processing principles of generality that may be orthogonal to it. One productive area of data in which to look is the full set of left-right asymmetries, some of which have already been accounted for in terms of an interaction between distinct processing advantages.
For example, Primus (1995) gives an account of the asymmetrical preference for subject before object positioning across languages that differs from the syntactic weight- and constituency-based EIC theory of Hawkins (1994: 328-39). Her account is based on her own theory of grammatical relations, which rejects subject and object as valid categories, and instead decomposes
these notions into the more primitive syntactic, morphological and theta-role properties that define them and that do not always overlap consistently in many structures and languages, resulting in the well-known difficulties in actually defining grammatical relations in a consistent way. Her alternative description of the facts opens up the possibility of capturing new higher-level linguistic generalizations, and indeed a key component of her account involves the very important insight that, just as there are asymmetries between syntactic nodes in a tree in terms of c-command, correlating with syntactic and semantic asymmetries in anaphoric binding and logical scope, so too there are asymmetries between theta-roles such as agent and patient, with the former implying "thematic independence" within the situation described by a verb and its argument structure, and the latter "thematic dependence".21 At the level of explanation, Primus then proposes a Principle of Structural Expression of Thematic Information regulating the mapping of such asymmetric (and hierarchically arranged) theta-roles onto surface syntax. For agent and patient this predicts that the agent will c-command and/or precede the patient in the unmarked case. This principle unifies the theory of theta-roles with other asymmetric dependencies in syntax and semantics and so provides an explanation for the assignment of theta-roles to surface syntactic configurations, i. e. it explains what is simply stipulated in current generative syntactic accounts. An extension of this same principle is then applied to the preference for left-peripheral topic positioning across languages: the discourse-semantic properties of a predication are dependent on those of a topic (being "about" the topic in Reinhart's 1981 sense), and again the topic precedes and also c-commands the predication.
Part of the motivation for Primus' principle comes from a generalization that was formulated in Hawkins (1994: 51) as the Principle of Structural Determination, which states that a node X can depend syntactically and/or semantically only on those nodes that structurally integrate it (as illustrated in Section 2.0 above), i. e. on nodes within a Structural Domain of X in a higher constituent C. This captures the fact that dependent nodes such as anaphors can only be related to antecedent nodes that c-command their anaphors, as well as accounting for many other dependencies of a syntactic and semantic nature. There is a slightly different way of formulating Primus's explanation for agent-patient ordering, and also topic-predication ordering. It may be desirable, both descriptively and theoretically, to distinguish between the grammaticalized effects of the Principle of Structural Determination, which is not ultimately a performance principle but a principle deriving from the theory of grammar, and the grammaticalization of principles that derive from a theory of performance. In the present context, this would mean separating universal generalizations stated in terms of c-command, as reflections of grammatical structure, from generalizations in terms of linear precedence, whose ultimate explanation reflects the advantages and disadvantages of producing and receiving items in a particular order. The formulation of individual rules in particular languages will often (but not always) refer to both, but at the level of universal grammar it may be desirable to keep these phenomena distinct, certainly within a modular theory of language universals of the kind envisaged here. Applied to Primus' Principle of Structural Expression of Thematic Information, this would mean that the hierarchically higher and thematically independent agent would be predicted to c-command the patient in the unmarked case, not to c-command and/or precede it. The c-command generalization alone is descriptively stronger and more restrictive than an inclusive disjunction generalization in terms of both c-command and precede, and the stronger generalization appears to be correct. All the major basic word order types that are usually distinguished, SOV, SVO, VSO, and also VOS, OVS and OSV, are compatible with it, both in analyses with a VP node and in those without. The only structural types that go against it are VP-nominative structures such as [[V S] O] in which the patient (direct object) asymmetrically c-commands the agent (subject), cf. Keenan (1988) for discussion and examples from Toba Batak. Hence, an unmarked case prediction is perfectly adequate in terms of c-command alone. Moreover, an unmarked case prediction in terms of c-command and/or precede explicitly allows for marked-case structures in which the agent neither c-commands nor precedes the patient, e.g. [O [V S]], yet such cases seem to be non-existent. Let us define Primus's structural expression principle in terms of c-command alone, therefore.
We can now supplement this principle with a separate linear precedence principle, derived from a theory of language performance, in order to account for the cross-linguistic distribution of basic word order types. We have seen that processing principles such as EIC are reflected in clear preferences in performance and across grammars, and that such preferences operate relative to the constituent structures defined by the theory of grammar. Let us propose the following (tentative) parsing principle for the processing of asymmetric syntactic and semantic dependencies:

(17) Dependent Nodes Later (DNL)
     If a node Y is semantically and/or syntactically dependent on a node X, then the human parser prefers to receive and parse X before Y.

This principle makes the right predictions for the distribution of basic word order types. The great majority should have the thematically independent agent preceding the thematically dependent patient, given Primus's theory of these
theta-roles. In addition, c-command is required by the Principle of Structural Expression. As a result, the agent should preferably both precede and c-command the patient, and this is what is typically found in SOV, SVO and VSO languages, which account for close to 96% of languages in Tomlin's (1986) sample. A dispreferred minority will then have the agent c-commanding but not preceding the patient. This is typically the case in VOS, OVS and OSV languages, which account for 4.2% of Tomlin's sample. DNL also predicts a preference for antecedents to precede their anaphors, in performance and in grammars, and for wide scope quantifiers to precede narrow scope quantifiers. These preferences appear to be real, though they remain to be quantified, both within and across languages. DNL also operates in conjunction with the Principle of Structural Determination to predict the positioning of topic and predication nodes: the topic should c-command the predicate phrase and will preferably precede it. It does so, cf. Primus (1991, 1993, 1995) and also Gundel (1988). The formulation of DNL given in (17) still requires further attention. The notion of semantic dependence needs to be precisely defined so that dependencies can be clearly recognized between phrases other than agent and patient NPs, antecedent and anaphor, wide scope versus narrow scope operators, for all of which definitional criteria are now available in the literature. In addition, the precise interaction between DNL and other processing preferences, such as EIC, needs to be further investigated, both in performance and in the grammaticalized rules referring to such dependencies in particular languages. If DNL can survive in some form following such further investigation, then we will indeed have an independently motivated alternative processing explanation for subject before object positioning, and this motivation may be preferable to the EIC account, whose coverage may need to be scaled back. 
Similarly, DNL may give a better account of topic positioning, and of other left-right asymmetries as well. Some general issues raised by such an interactive model of processing would then include: does DNL operate within the set of constraints defined by EIC (i. e. does it select from among equally good orderings defined by EIC)? or is it actually opposed to EIC's preferences in some cases? If the latter, which principle wins out and when?
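
The precedence half of this account, DNL as stated in (17), can be illustrated for the six basic orders. The following is a minimal Python sketch under a simplifying assumption: S stands in for the thematically independent agent and O for the thematically dependent patient, although Primus's theory does not treat subject and object as primitives. The function name `dnl_violations` and the pair format are hypothetical and chosen for illustration only.

```python
def dnl_violations(order, dependencies):
    """Return the (dependent, independent) pairs that a linear order
    presents in the dispreferred sequence, i.e. dependent node first."""
    position = {node: index for index, node in enumerate(order)}
    return [
        (dependent, independent)
        for dependent, independent in dependencies
        if position[dependent] < position[independent]
    ]


BASIC_ORDERS = ["SOV", "SVO", "VSO", "VOS", "OVS", "OSV"]

# Assumption of this sketch: the patient (O) depends on the agent (S).
PATIENT_ON_AGENT = [("O", "S")]

preferred = [o for o in BASIC_ORDERS if not dnl_violations(o, PATIENT_ON_AGENT)]
dispreferred = [o for o in BASIC_ORDERS if dnl_violations(o, PATIENT_ON_AGENT)]

print(preferred)     # ['SOV', 'SVO', 'VSO']
print(dispreferred)  # ['VOS', 'OVS', 'OSV']
```

The same function could be given other dependency pairs, such as anaphor on antecedent or predication on topic, once the notion of dependence has been defined precisely enough to list them. It says nothing about c-command, which the text assigns to a separate grammatical principle, and nothing about how DNL interacts with EIC.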

3.4. Production versus comprehension

The formulation of EIC given in (8) above defines complexity and efficiency from the perspective of the hearer rather than of the speaker. The crucial concept is that of a Constituent Recognition Domain, within the context of an on-line parser, rather than of a Constituent Production Domain within a production model. This raises the following general issue: is linear ordering primarily driven by the benefits for comprehension rather than by those for production? The processing theory advocated here does not necessarily claim that the needs of the hearer are paramount. EIC is formulated in terms of comprehension for practical and not for theoretical reasons. We know more about the details of syntactic comprehension than we do about the production of syntax, and at least one systematic attempt to account for weight effects within a production model, that of De Smedt (to appear), makes the wrong predictions for head-final languages. Our theory is therefore non-committal about the precise benefits for production, and formulates EIC in terms of comprehension only, pending further research on relevant aspects of a production model.22 De Smedt (to appear) explains the postposing of heavy constituents in English-type languages in terms of his Incremental Parallel Formulator, i. e. a production model. According to this model, syntactic segments are assembled incrementally into a whole sentence structure, following message generation within a conceptualizer, and the relative ordering of segments can reflect both the original order of conceptualization and the processing time that is required by the formulator for more complex constituents. The late occurrence of heavy constituents is explained in this model in terms of the greater speed with which short constituents can be formulated in the race between parallel processes in sentence generation. However, heavy sentential or NP constituents with right-peripheral complementizers and head nouns respectively are preferably preposed in head-final languages such as Japanese, and this fact appears to be in direct conflict with the predictions of the Incremental Parallel Formulator. There are two possible conclusions that one can draw from this.
First, there may be linear orderings in which the benefits for the speaker and those for the hearer do not overlap. Preposing a heavy subordinate clause to the left in Japanese benefits the hearer, because it shortens the CRD for the matrix S by including within this CRD only the right-peripheral complementizer, i. e. the first constituent that makes it clear to the hearer that this clause is a subordinate 1C within the matrix, by constructing it. Preposing is not predicted by De Smedt's production model, however, and more generally if we were to measure Constituent Production Domains from the speaker's perspective, we would begin to register the existence of the matrix S from the moment its first dominated constituent appears in the production string within the formulator, on the assumption that the speaker already knows whether this S is a main or a subordinate clause, and it should then make no difference whether a subordinate clause is preposed or center-embedded. The production domain, measured in terms of

the non-terminal and terminal nodes that are accessed when producing the matrix, would be the same in either case, and it is only from the hearer's perspective that preposing is preferred. A second option is to pursue the (simpler and more likely) possibility that both the hearer and the speaker benefit from EIC's preferences, even in these mirror-image orderings across head-initial and head-final languages. This would entail a rethinking of the relationship between the conceptualizer and the formulator, so that heavy constituents would not necessarily be serialized later. It would also require formulating Constituent Production Domains in a way that parallels Constituent Recognition Domains and that sets the same premium on the rapid and efficient positioning of categories that construct phrasal nodes. In other words, the conceptualizer could be cognitively aware of the subordinate status of a preposed clause within a matrix S, but the syntactic node dominating this subordinate clause would not actually be constructed in the formulator until the right-peripheral complementizer was produced, and this complementizer would thereby initiate the Constituent Production Domain for the matrix S by bottom-up processing, just as it does in the corresponding CRD. If the production and recognition of abstract syntactic nodes follow the same time course in this way, within the formulator and the parser respectively, then EIC's linear ordering preferences will indeed apply equally to production and comprehension. I find this latter possibility very attractive. However, it makes certain assumptions about production that require further research and that may not be valid. In the interim, EIC is not a principle of comprehension only. It is a general processing principle, with numerous clear overlaps between benefits for the speaker and benefits for the hearer.
Certain cases of apparent non-overlap, however, require a more cautious formulation in terms of comprehension only, thereby enabling me to remain non-committal about issues that have not yet been resolved in the production literature and to whose resolution EIC effects may ultimately contribute.

4. The empirical testing of EIC

In Tables 1–4, EIC's predictions for performance (cf. Section 2.2) were tested on alternative orders of just two ICs, within a controlled syntactic environment, and in written text samples from the languages in question. EIC's predictions for grammar (cf. Section 2.3) have been tested on ordering data from cross-linguistic samples (both controlled and convenience samples), and on more detailed properties of ordering rules in a number of better-studied languages. The mathematics of the preference defined by EIC is rather complex, however, and defining the precise predictions that it makes and the level of empirical success that can still be said to support it is no trivial matter. The data in these tables have all been taken from written texts, but there are other kinds of data that provide relevant evidence, and the structures selected for testing have been limited. In this section I define (briefly) some of the issues that merit further consideration in the empirical testing of EIC, and I illustrate a new testing procedure that consists in examining orders from a much broader range of syntactic environments.

4.1. EIC's preferences and relevant data

If a syntactic environment is chosen for the testing of EIC in which there are two variably positioned ICs, then there will be only two possible orderings. If one of these is optimally efficient, and the other is non-optimal, then the optimal one will be preferred in the unmarked case, and the non-optimal one should be found in a distinct minority of instances and languages (cf. (8), (9) and (14) above). Further, the non-optimal order should only be tolerated at all when weight differences between the two competing orders are small, and should decline in frequency as these weight differences get larger. When we increase the variably positioned ICs from two to three, the number of possible orderings jumps to six, as in the NP{N, Adj, S'} structure discussed in Section 2.3. More than one of these may now be optimal, e.g. both [N Adj S'[C S]] and [Adj N S'[C S]]; more than one may now be non-optimal, e.g. [N S' Adj] and [Adj S' N]; and there may be a ranking in the degrees of preference, with e.g. [N Adj S'[C S]] preferred to [N S'[C S] Adj] which is in turn preferred to [S'[C S] N Adj]. If these ICs are freely ordered in a language, then EIC's predictions for the distribution of the six orders in performance are going to be more complex. Any one of the optimal orderings need not now achieve an 85%+ frequency score, because optimality is partitioned among two or more equally good alternatives. These alternatives may divide up this 85%+ evenly and randomly, or they may not and there may be further interacting considerations that skew the distribution so that only one optimal order has a high frequency of occurrence. What must be done in these cases is to add up the total distribution of the optimal orders and compare it with the total for the non-optimal orders. But now there is a further consideration.
Let us say that two of the six alternatives are optimal and that the remaining four exhibit different degrees of non-optimality. Since there are now twice as many non-optimal as optimal orders for the speaker to choose from, it is conceivable that this could affect the frequency with which non-optimal orders are selected,

resulting in a higher overall proportion of non-optimal to optimal orders than we find in the alternative pairs of Tables 1—4. The statistical distribution of optimal to non-optimal orders may therefore be a function both of EIC's degree of preference, as reflected in these tables, and of the number of competing options. This needs to be investigated. Similar considerations apply when investigating a pair of binary branching phrases such as {S {V O}} in Polish, which are compatible with six possible orderings of these three elements, two of which have variable length (S and O), the verb being a single-word item. What must be done in this case is to add up the total number of sentences in the data with each of the six orders and with each set of word total assignments (S = 2 O = 3, S=2 O=4, S=3 O=5, etc.; or simplifying: O>S by 1, 2, 3, O=S, etc.). IC-to-word ratios must then be calculated for each set of word total assignments in each order, in order to see whether the best order occurs in the unmarked case in all the sentences containing each set of word total assignments, and whether the marked-case predictions hold for the non-optimal orderings in each case as well. The relative frequencies of the six orders overall will then be a consequence of (and predicted by) the word totals that happen to be assigned to subject and object in all the sentences of the text for expressive reasons, in conjunction with the EIC efficiency metric. If four alternatively positioned ICs are considered, the number of orderings jumps to twenty-four, and the practical testing of EIC soon becomes very complex and requires a large data-base in order to witness any examples at all of the less preferred orderings. It is for this reason that I have kept the empirical testing of EIC as simple as possible hitherto, generally limiting the alternative orderings to two in the performance data, and to four or six in the crosslinguistic grammatical data. 
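The growth in the number of competing orders described above (two, six, twenty-four) is simply the factorial of the number of variably positioned ICs. A minimal sketch, assuming the ICs can be treated as freely permutable labels; the function name `orderings` and the labels in the loop are illustrative only and not part of the original analysis:

```python
from itertools import permutations
from math import factorial


def orderings(ics):
    """Return every logically possible linear ordering of the sister ICs."""
    return list(permutations(ics))


# Two, three and four variably positioned ICs, as discussed in the text.
for ics in (["NP", "PP"], ["N", "Adj", "S'"], ["IC1", "IC2", "IC3", "IC4"]):
    n = len(orderings(ics))
    # The count always equals n! for n distinct ICs.
    assert n == factorial(len(ics))
    print(len(ics), "ICs:", n, "possible orderings")
```

Each ordering returned by `orderings` would still need its own IC-to-word ratio for every set of word totals, which is why the testing effort grows so quickly with the number of ICs.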
The availability of computerized corpora, however, now opens up the possibility of a much more ambitious testing of EIC in performance, permitting more predictions to be tested on greater numbers of ordering alternatives.23 There is another point that requires comment in this context. I argued in Section 2 that the order preferred by EIC should be realized in 85% + of textual instances in alternations such as those of Tables 1—4, with the dispreferred order accounting for the rest. Where does this 85% figure come from? Why is the unmarked preference not attested in, say, 75% of the data, or in 95% ?24 Recall that EIC's unmarked case prediction is tested by adding up the total number of preferred orders in a text sample and adding to it the total number of orders in which weights are equal. The proportion of this combined total relative to all the orders in a sample is then calculated, and this proportion quantifies the number of orders that are as optimal as they can possibly be,

compared with those that could be better, according to EIC's metric, if ordered differently. These dispreferred orders, in turn, are predicted to decline in frequency, the worse their ratios are compared with their preferred counterparts. Their overall frequency can, therefore, vary in principle, depending on how large the weight differentials are between ICs in the relevant structures of a text sample. If there are many ICs with just a one-word differential, there could be many dispreferred orders in absolute terms; if there are only few ICs with one-word differentials, there will be few dispreferred orders. The 85% + figure for EIC's success rate is an average figure for the data samples and languages that have been investigated so far, and it is a consequence both of EIC's preference for the most optimal orders in general, and of the average weight differentials that have been empirically observed in my text samples. If the number of ICs in Tables 1—4 with a one-word differential had been much greater, then EIC's preferred word orders would only have accounted for 80% or 75% of the data; if this number had been less, the preferred orders would have been in the 95% range. Interestingly, it appears that different text types do exhibit different weight differentials between ICs. In spoken data, NPs and PPs have smaller weights in general, and hence the range of variation in their differentials is smaller than it is in written texts, as Hagstrom (1994) has shown in her preliminary testing of EIC on spoken data. This means that there could potentially be more dispreferred orderings in spoken language, since dispreferred and preferred orderings will have similar ratios more of the time. 
On the other hand, my general expectation is that spoken data will be even more sensitive to EIC's predictions than written data, since the sensory input for the processing of syntactic structure is available in a more transient and fleeting form than it is in the visual medium, and this imposes a greater strain on working memory and permits less complexity of structure in general. This would lead us to expect a greater tolerance for dispreferred orders in written texts. As a result, these two considerations could balance each other out, so that spoken data end up with very similar EIC results to written data. The data reported in Hagstrom (1994) suggest that this is indeed the case. There does not appear to be a significant difference between EIC's success rate applied to written and spoken data. Again, this needs to be further investigated. EIC must also be tested psycholinguistically, in experiments that control for and test its role in comprehension and production. There are a number of experimental results that have been reported in the psycholinguistic literature that are relevant for EIC's predictions, and these are summarized in Hawkins (1994: 64–66). Some experiments that specifically test for syntactic weight effects in comprehension and production are currently being conducted at USC,

and early results are reported in Stallings, MacDonald & O'Seaghdha (in preparation). Their results support the kinds of syntactic weight effects predicted by EIC in Heavy NP Shift structures.
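The unmarked-case calculation described in this section (preferred orders plus orders with equal weights, as a proportion of all orders in a sample) can be sketched as follows. The function name and the counts in the usage line are hypothetical, invented purely for illustration; they are not figures from Tables 1–4.

```python
def unmarked_case_rate(preferred, equal, dispreferred):
    """Percentage of orders that are as optimal as they can possibly be.

    preferred    -- orders with the better IC-to-word ratio of the pair
    equal        -- orders in which the weights of the ICs are equal
    dispreferred -- orders that could be better if ordered differently
    """
    total = preferred + equal + dispreferred
    if total == 0:
        raise ValueError("empty sample")
    return 100.0 * (preferred + equal) / total


# Hypothetical counts, chosen only to illustrate the arithmetic.
print(unmarked_case_rate(preferred=120, equal=50, dispreferred=30))
```

Because the dispreferred count depends on how many ICs differ by only one word, the same function applied to a sample with smaller weight differentials would be expected to return a lower figure, which is the point made above about spoken data.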

4.2. Testing EIC on a complete text

The data of Tables 1 and 2 examined post-verbal NPs and PPs in English and Hungarian. But what about the ordering of AdvPs in relation to these phrases, or the ordering of three PPs? What about the relative orderings of other daughters of VP and of S, or of PP and S' complements of N within an NP? EIC makes predictions for the positioning of every single syntactic phrase and terminal element in these languages, and hence there are many more structural types that are relevant for its predictions. Moreover, since EIC is a unified theory of performance and grammar (i.e. grammatical ordering rules are simply conventionalizations of performance efficiency), we do not have to test its predictions on grammatically free (or transformationally rearrangeable) word orders and fixed word orders separately. We can examine all orderings of constituents in some body of data, whether conventionalized or not, and predict the same preference for the EIC-defined orders that we do for the sample word orders of Tables 1 and 2. The testing of this prediction on a broad range of phrases and their daughter constituents is not straightforward. For every phrase with more than two ICs, there will be many alternative orderings to consider whose respective efficiency levels will reflect the weights of each IC, the precise place at which each is constructed, etc. And assessing the efficiency of the actually selected order will require calculating the IC-to-word ratios for all competing orderings, measuring the degree of preference for the selected one, and then comparing across all phrases and all alternative orderings in all phrases in order to quantify how many actual orderings are preferred by EIC, and how efficient the actual orderings are in IC-to-word ratios compared with their competitors. That means a lot of calculations!
One way to make the mathematics more manageable while at the same time providing a realistic assessment of the degree to which all orderings of all constituents in a given language regularly conform to EIC is as follows. Calculate the IC-to-word efficiency ratio for every ordering of constituents that actually occurs in some text sample, and compare it with just two other calculations: the best possible ordering for these same constituents, as defined by EIC; and the worst possible ordering. Comparison of the actual scores with the best and worst scores will then make it possible to quantify (a) the degree to which attested orderings are in accordance with EIC's efficiency metric in the range

from best to worst, and (b) the number and proportion of orderings that are as optimal as they could possibly be. To this end I conducted the following test. I selected an arbitrary English text: the lead article in The Guardian Weekly for the week in which I began the analysis, entitled "Karadzic orders Serb defiance", given in Appendix I. I then analyzed the orderings of ICs in all the major phrasal categories, S, VP, NP and PP, from this point of view. The method of analysis is set out in (18):

(18) Method of Analysis
(i) Draw a syntactic tree diagram for every sentence in the text (n = 17):
    • assume the following phrasal categories or "maximal projections": S, VP, NP, PP, AdjP, AuxP, AdvP, PossP, QP;
    • assume no intermediate projections between the terminal categories and their maximal projections, i.e. assume flat structures such as NP[Det AdjP N PP], VP[V NP PP PP], etc.
(ii) Define CRDs (cf. (5) above) for all branching phrases of the type S, VP, NP and PP:
    • assume that [that] constructs an embedded S;
    • assume to does not construct an embedded VP (since this to is identical to the preposition to);
    • count the possessive -s as a separate word within PossP;
    • assign to each instance of a branching phrase a numbered index reflecting the order in which it is received by the parser, NP1, NP2, etc.
(iii) Calculate an EIC score for each branching S, VP, NP and PP node in the text by:
    • calculating as a percentage the ratio of ICs to words within each S, VP, NP and PP node;
    • calculating as a percentage the ratio of ICs to words within the phrasal category that most immediately contains each S, VP, NP and PP node (including, where relevant, a containing node other than S, VP, NP and PP, e.g. PossP or AdjP);
    • aggregating the percentages for these two CRDs to achieve an actual EIC score for each branching S, VP, NP and PP.
(iv) Compare the actual EIC score for each branching S, VP, NP and PP with the best possible ordering of its ICs and the worst possible ordering:
    • best and worst scores are again aggregations of the IC-to-word ratio for each branching node and its immediately containing branching XP;

    • all logically possible alternative orderings of ICs are considered when assessing which is best and which is worst; some of these will still be grammatical strings of English, others will not be, being ruled out by the ordering conventions of English grammar;
    • when the logically possible orderings of ICs in any given phrase are enumerated, only the orderings of sister ICs are changed; the ordering of elements within each IC is held constant.

This method therefore assigns an actual IC-to-word score, a best score and a worst score to each branching S, VP, NP and PP node. Each such score is an aggregation of the IC-to-word ratio for each phrase and its immediately containing phrase. The reason for this calculation procedure is that different orderings of e.g. P and NP within PP can affect not just the ratio for PP itself, but also the ratio for an immediately containing branching node that dominates PP, such as VP, cf. VP[V PP[P NP]] versus VP[V PP[NP P]]. The aggregation reflects this. The conditions in (iv) permit us to manipulate the positioning of a preposition in English, converting it to a postposition even though postpositions are ungrammatical in English, in order to assess how efficient prepositions are relative to other orderings of constituents. When manipulating orderings for this purpose, however, it is only permissible to manipulate the relative ordering of sisters, such as P and NP, not to simultaneously change orderings within these ICs. Such changes can be made when assessing the efficiency of the NP CRD relative to its best and worst alternatives. But when the PP CRD is being considered, only the relative ordering of P and NP can be manipulated, with their respective daughters being held constant. Unless this is done, the comparison of actual, best and worst orderings of a given set of sister ICs will be meaningless, because one will no longer be comparing different orderings of the same entities.
We can illustrate the method of (18) by examining sample calculations for the following sentence in the text:

(19) [Tree diagram not reproduced; recoverable terminal string and node labels:]
     the rejection VP1[left them AdjP1[bereft PP1[of allies]]]

EIC scores for the VP1 and PP1 nodes are given in (20):

(20) VP1:
       actual EIC score:                      VP1 CRD: 3/3 = 100%    S1 CRD: 3/4 = 75%        Agg. = 87.5
       best EIC score:   same                                                                 Agg. = 87.5
       worst EIC score:  [them AdjP1 left]    VP1 CRD: 3/5 = 60%     S1 CRD: 3/8 = 37.5%      Agg. = 48.8

     PP1:
       actual EIC score:                      PP1 CRD: 2/2 = 100%    AdjP1 CRD: 2/2 = 100%    Agg. = 100
       best EIC score:   same                                                                 Agg. = 100
       worst EIC score:  [allies of]          PP1 CRD: 2/2 = 100%    AdjP1 CRD: 2/3 = 66.67%  Agg. = 83.4
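The VP1 figures in (20) can be reproduced with a short sketch of steps (iii) and (iv) of the method in (18). Several things here are assumptions made for illustration rather than part of the original procedure: each IC is reduced to a word count plus the position of the word that constructs it; VP is taken to be constructed by V and AdjP by its first word; and the material preceding VP1 inside S1 is modelled as a one-word IC followed by the two-word subject NP, which is inferred from the 3/4 and 3/8 ratios in (20) and not stated in the text. The names `IC`, `crd_ratio`, `eic_score` and `actual_best_worst` are hypothetical.

```python
from dataclasses import dataclass
from itertools import permutations


@dataclass(frozen=True)
class IC:
    label: str
    length: int    # number of words in this IC
    ctor: int = 0  # 0-based position of the word that constructs this IC


def crd_ratio(ics):
    """IC-to-word ratio (as a percentage) for one CRD.

    The CRD runs from the constructing word of the first IC
    to the constructing word of the last IC, inclusive.
    """
    if len(ics) < 2:
        raise ValueError("a branching phrase needs at least two ICs")
    first, last = ics[0], ics[-1]
    words = (first.length - first.ctor) \
        + sum(ic.length for ic in ics[1:-1]) \
        + (last.ctor + 1)
    return 100.0 * len(ics) / words


def eic_score(child, head, before, after=()):
    """Average of the ratios for a phrase and its containing phrase."""
    offset = 0
    for ic in child:
        if ic.label == head:
            ctor = offset + ic.ctor  # where the head constructs the phrase
            break
        offset += ic.length
    else:
        raise ValueError("head IC not found")
    phrase = IC("XP", sum(ic.length for ic in child), ctor)
    parent = list(before) + [phrase] + list(after)
    return (crd_ratio(child) + crd_ratio(parent)) / 2


def actual_best_worst(child, head, before, after=()):
    """Permute sister ICs only; material inside each IC is held constant."""
    scores = [eic_score(p, head, before, after) for p in permutations(child)]
    return eic_score(child, head, before, after), max(scores), min(scores)


# VP1 of (19): [left them [bereft of allies]]
vp1 = [IC("V", 1), IC("NP", 1), IC("AdjP", 3)]
# Assumed left context inside S1: a one-word IC plus "the rejection".
s1_before = [IC("X", 1), IC("NP", 2)]
print(actual_best_worst(vp1, "V", s1_before))
```

Under these assumptions the actual and best aggregates coincide and the worst aggregate is 48.75, which (20) reports rounded as 48.8. Running the same functions over every branching node of a parsed text would give the per-node actual, best and worst scores that the method calls for.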

In both VP1 and PP1 the aggregated actual EIC score is identical to the best possible score. The range of possible variation is captured by the difference between the best possible and the worst possible EIC scores. The results of the analysis are set out in Table 8. The text contains 259 branching S, VP, NP and PP nodes. The average EIC score for the actual orders in all of these nodes is 79.4. The average for the best scores is only three points higher, 82.4. The average for the worst scores is 48.0. What this means is that the actual orderings of all ICs in all these branching phrases are extremely close to maximum efficiency. We can quantify this as follows. The range of variation from best to worst is 82.4 − 48.0 = 34.4. Within this range, the actual EIC scores are (on average) just 3.0 points lower than the best and 31.4 points higher than the worst. The efficiency ratio of the actual orders is therefore 31.4/34.4 = 91.3%. The actual orderings of ICs achieve an efficiency level in terms of EIC's predictions that is 91.3% of what it could be. This supports the reality of the preference that EIC defines. Were EIC not operative in shaping the selections of orderings in a text, these orderings would not have been so close to the EIC-defined ideal. Table 8 also enumerates the numbers of nodes in which the actual ordering of ICs is identical to, or close to, the best possible ordering. 80.7% of nodes have actual orders with EIC scores that are identical to the best possible scores, as in (20); 83.8% have scores that are within 5% of the best; 87.3% come within 10%; and 90.3% come within 15%.25 These results are reminiscent of the success levels for EIC's predictions in the ordering alternations of Tables 1–4. They support the claim that EIC does

Table 8. Testing EIC on a Complete English Text

Branching Nodes (S, VP, NP, PP)   Best EIC Score (Average)   Actual EIC Score (Average)   Worst EIC Score (Average)
259                               82.4                       79.4                         48.0

No. of branching nodes with Actual Score identical to Best Score: 209/259 (80.7%)
No. of branching nodes with Actual Score within 5% of Best Score: 217/259 (83.8%)
No. of branching nodes with Actual Score within 10% of Best Score: 226/259 (87.3%)
No. of branching nodes with Actual Score within 15% of Best Score: 234/259 (90.3%)

indeed define an ordering preference that holds in general or in the unmarked case. Only now this preference has been shown to hold for a much broader class of structures, within a whole text, and despite any alternative preferences that might be argued to be relevant for linear ordering, involving pragmatic, semantic or additional processing principles of the types discussed in Section 3. This supports the basicness of EIC as a determinant of linear ordering for which I have argued in this paper, and the epiphenomenal or secondary nature of other explanatory generalizations that have been proposed. Considerations other than EIC do not appear to be pulling orders away from the EIC-defined ideal to any great extent.
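The summary figures of this section follow from simple arithmetic on the Table 8 averages, which can be checked with a minimal sketch. The function names `efficiency_ratio` and `within` are hypothetical; `within` implements the reading given in note 25, where closeness to the best score is measured in raw points rather than relative to the best-to-worst range.

```python
def efficiency_ratio(best, actual, worst):
    """Position of the actual score in the worst-to-best range, in percent."""
    return 100.0 * (actual - worst) / (best - worst)


def within(best, actual, margin):
    """True if the actual score is no more than `margin` points below best."""
    return best - actual <= margin


# Averages from Table 8.
print(round(efficiency_ratio(best=82.4, actual=79.4, worst=48.0), 1))

# Proportions of the 259 branching nodes reported in Table 8.
for count in (209, 217, 226, 234):
    print(count, round(100.0 * count / 259, 1))

# The example of note 25: best 80, actual 75 or 70.
print(within(80, 75, 5), within(80, 70, 5), within(80, 70, 10))
```

A range-sensitive alternative would compare `efficiency_ratio` node by node, which note 25 explicitly says the reported counts do not do.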

5. Conclusions

In sections 1 and 2 of this paper I defined the EIC theory of word order, and illustrated some of the performance data on which this theory has been tested, by members of the ESF Constituent Order Group and by other ESF-funded collaborators. Section 3 laid out a number of issues for future research and clarification, resulting from discussions that were held during the EUROTYP project. Many details relating to EIC will undoubtedly need to be revised, but I still maintain that this principle provides a potential explanation for, and unification of, a very large range of grammatical and performance regularities of linear ordering that have been pursued in isolation from one another hitherto. The general processing approach of EIC is compatible with certain revisions, however, and with additional principles such as Dependent Nodes Later (cf. (17)), and the model of "Performance-Driven Grammar" initiated in PTOC will need to evolve in response to new data and theorizing in much the same way that current models of grammar have evolved. Section 4 discussed issues involving the empirical testing of EIC, and illustrated an additional and rather

ambitious testing procedure. The contributions of EUROTYP participants in developing and critically evaluating this theory, and their data from European languages, have been invaluable and are hereby gratefully acknowledged.

Notes

1. I am most grateful to members of the Constituent Order Group of EUROTYP for their critical feedback throughout the project and for their assistance in collecting data relevant to EIC's predictions. Their comments on the first draft of this paper have been incorporated in the current version wherever possible, as have comments by the de Gruyter series editor Georg Bossong. Comments made by Ekkehard König were also especially valuable. The role of the ESF in funding the meetings of the Constituent Order Group is gratefully acknowledged, and a Small Grant received from the ESF made it possible for further data to be collected at the University of Southern California. Additional financial assistance for this research was received from the University of Southern California Faculty Research and Innovation Fund (FRIF), which is gratefully acknowledged, as is a summer stipend that I was awarded by the National Endowment for the Humanities (FT-341 50-90) in 1990. The work reported here would not have been possible without my many collaborators, whose contributions are explicitly acknowledged in the main text.
2. Cf. Hawkins (1994: 24–32) for discussion of this complexity metric in terms of Structural Domains.
3. Note that each of the PPs in this configuration could be immediately dominated by VP or by S. The consequences of each of the domination possibilities for EIC's predictions are carefully worked out in Hawkins (1994: 123–29). It turns out that EIC predicts a consistent short before long preference, regardless of the attachment site, as long as we assume non-discontinuous structures. If discontinuity is permitted (e.g. if PP2 is dominated by VP and PP1 by S in the order [V PP1 PP2]), then certain long before short PPs may be preferred, depending on additional assumptions about whether alternating orders of [V PP1 PP2] and [V PP2 PP1] can change constituency and move an originally VP-dominated PP under the immediate domination of S, even in models in which discontinuity is allowed. The EIC predictions of Table 1 assume non-discontinuity, for the sake of simplicity.
4. Kimball's (1973) New Nodes principle states that: the construction of a new node is signalled by the occurrence of a function word. The class of constructing categories is rather broader than this, however, since it includes both lexical and functional categories, as illustrated in the main text.
5. Cf. Hawkins (1994: 74–77) for discussion of the use of IC-to-word ratios in lieu of IC-to-nonIC ratios.
6. The EIC definition for IC-to-nonIC ratios measured left-to-right is as follows. The L-to-R IC-to-nonIC ratio for a non-optimal CRD is measured by first counting the ICs in the domain from left to right (starting from 1), and then counting the non-ICs (or words alone) in the domain (again starting from 1). The first IC is then divided by the total number of non-ICs that it dominates (e.g. 1/2); the second IC is divided by the highest total for the non-ICs that it dominates (e.g. if this IC dominates the third through seventh non-IC in the domain, then 2/7 is the ratio for the second IC); and so on for all subsequent ICs. The ratio for each IC is expressed as a percentage, and these percentages are then aggregated to achieve a score for the whole CRD. For example, the ratios for (4a) and (4b) in the main text measured left-to-right would be as follows:

   (4')  a. VP CRD:   1/1     2/4     3/5
                      100%    50%     60%      Agg. = 70%
         b. VP CRD:   1/1     2/8     3/9
                      100%    25%     33.3%    Agg. = 52.8%

   The left-to-right calculation procedure is especially useful for discriminating between orderings whose absolute IC-to-word ratios are otherwise identical, cf. Hawkins (1994: 81–83). In other cases, however, it generally defines the same preferences as the simpler procedure in terms of absolute IC-to-word ratios and can be dispensed with.
7. German Extraposition from NP data are presented in Hawkins (1994: 197–210). Most of these data come from Shannon (1992, 1995).
8. There is a reason why EIC's predictions for performance are not normally formulated in terms of the weight aggregates of the component constituents in each order. EIC cannot make any predictions for the relative frequencies of different orderings in a text sample, unless we know the actual word totals for subjects and objects in all individual sentences. The relative number of, for example, SVO versus VOS orders that is predicted in a text depends crucially on how many words have been assigned to each S and O for expressive reasons. If objects happen to be significantly longer than subjects in general, then SVO will predominate; if subjects are significantly longer, then VOS will predominate. An aggregate for a given order is derived from whatever set of sentences a text happens to have with that order, and depending on the raw numbers, any aggregate is mathematically compatible with any relative frequency ranking (the aggregate for SVO could be the same whether there were 2 or 2,000 SVO orders in the text). The kinds of predictions that EIC can make for these performance aggregates are exemplified in the main text. The predictions refer only to the relative sizes of the aggregate weights in the constituents of each sentence type, and say nothing about the relative frequencies of the different orders. The other complication in testing EIC on Siewierska's data comes from uncertainty over the precise structural analysis of OSV and OVS orders in Polish.
OSV accounts for 1.6% of her data, and OVS for as much as 9%. If this latter has a structure in which V and S form a constituent that is c-commanded by a higher topicalized O, i.e. [O [V S]], then the weight aggregates are exactly as predicted by EIC: a shorter O (3.3 words) precedes a longer [V S] constituent (4.7 words). The weight distribution of OSV is not accounted for by a similar structural assignment of [O [S V]] (O = 3 words, S = 1.1), though a discontinuous analysis of O and V in which both remain VP-dominated would predict an unusually small S in this range, just as it does for the mirror-image VSO structure (in which the aggregate for S = 1.07). Independent syntactic and semantic evidence is needed in order to decide on the best structural analysis in these cases, so that additional EIC predictions can be tested. If some of these turn out to be unfavorable for EIC, this will not change the situation radically, for it remains the case that the weight aggregates in at least 90% of these transitive sentences support EIC strongly, and the predictions derivable from the many structural analyses that are clear support EIC as well. The Polish data appear to be very much in accordance with EIC, therefore, contrary to what Siewierska asserts. Cf. the further discussion of her data in Section 3.2.
9. Cf. Hawkins (1994: 178–80) for a summary of Lascaratou's (1989) data.
10. The testing of EIC's predictions in Table 3 is simplified in that it does not control for different structural possibilities in the data. The PP could be dominated by VP or by S, for example, just as the English PPs can be VP- or S-dominated in Table 1 (cf. n. 3). When structural alternatives are controlled for, EIC's unmarked case prediction scores an even higher percentage, cf. Hawkins (1994: 152–53). Despite the simplified presentation, Table 3 reveals a clear long-before-short preference, with increasing effect the larger the weight differential is. This is the exact mirror-image of the English and Hungarian data in Tables 1 and 2.
11. Cf. Hawkins (1994: 100, 102) for a definition of these EIC predictions for grammars.
12. For relevant data, cf. Hawkins (1994: 272) which draws on Christian Lehmann (1984).
13. Cf. Hawkins (1994: 282–96) for illustration and discussion. Notice that EIC's predictions for grammars have to be defined on the aggregate weights of categories in order to discriminate between the complements NP, PP and S' in (16c). It is the relative aggregate weight ranking S' > PP > NP that results in EIC's predicted [X NP PP S'] serialization in a head-initial language.
14. Cf. Hawkins (1994: 7–12) for further examples of these hard-to-process constructions in English, German and Japanese, and for relevant discussion.
15. Nor does Chomsky's view define a very plausible scenario in its own right, apart from being unprovable. The basic theoretical problem with it is that the performance mechanisms of the human language faculty must be just as biologically determined and innate as whatever grammatical universals are innate. So if performance determines certain grammatical conventions, is it reasonable to suppose that one module of the mind is in effect duplicated into another? It is more economical to assume that performance mechanisms constrain options that are left open by the innate grammar. This alternative view defines the locus of the grammatical response to performance at the level of particular grammars, rather than at the level of the innate grammar.
16. Cf. Hawkins (1994: 315–21) for illustration.
17. In some work in progress examining morphological feature hierarchies of the type first proposed in Greenberg (1966), I argue that complexity per se accounts for only some of these hierarchies. The more general performance phenomenon to which languages have responded (by conventionalizing morphological distinctions in their paradigms, numbers of allomorphs, etc.) appears to be frequency of occurrence of the relevant semantic and/or grammatical distinctions in performance. Cf. further Hawkins (in prep.).
18. Cf. Hawkins (1994: 119–20) for discussion of LIPOC.
19. Cf. Hawkins (1994: 373–79) for discussion.
20. Cf. Hawkins (1994: 323–28) for discussion.
21. Primus (1995) provides her own definitions for the criterial properties that distinguish theta-roles such as agent and patient, building on the work of Dowty (1991).
22. Cf. Hawkins (1994: 425–27) for discussion.
23. Cf. for example Raumolin-Brunberg (1994) for an interesting illustration of the potential that a computerized corpus provides in this context. She examines the relative ordering of adjective and noun in Late Middle English.


John A. Hawkins

24. I am grateful to Tom Wasow (personal communication) for discussion of this issue.
25. If the best IC-to-word aggregate was, say, 80, and the worst 42, then if the actual aggregate was 75, this was considered to be within 5% of the best; if the actual score was 70, this was considered to be within 10%; and so on. This calculation of the numbers of actual orderings within 5%, 10% and 15% of the best does not take into account the permitted range of variation from best to worst in each case, therefore, but simply measures the percentage departure from the ideal on the assumption that the range of variation is 100 to 0.
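The calculation described in note 25 can be illustrated with a minimal sketch. The function names below are hypothetical and are not taken from the chapter; the sketch assumes that aggregates are expressed as percentages and that "within 5%" means a departure of at most 5 percentage points from the best aggregate. The second function shows, purely for contrast, the range-normalised measure that the note says was not used.

```python
def tolerance_band(best, actual, bands=(5, 10, 15)):
    """Return the smallest band (in percentage points) containing the actual
    aggregate's departure from the best aggregate, or None if it lies
    outside every band. Band limits are treated as inclusive (assumption)."""
    departure = best - actual
    for band in bands:
        if departure <= band:
            return band
    return None


def range_normalised_departure(best, worst, actual):
    """Contrast case, NOT the measure used in note 25: the departure from
    the best aggregate as a percentage of the best-to-worst range."""
    if best == worst:
        raise ValueError("best and worst aggregates must differ")
    return 100 * (best - actual) / (best - worst)


if __name__ == "__main__":
    # The figures from note 25: best aggregate 80, worst 42.
    print(tolerance_band(80, 75))   # actual 75
    print(tolerance_band(80, 70))   # actual 70
    print(range_normalised_departure(80, 42, 75))
```

On the note's own figures, an actual aggregate of 75 falls in the 5-point band and 70 in the 10-point band, whereas a range-normalised measure would place 75 at roughly 13% of the way from best to worst, which is why the two approaches can sort the same ordering into different bands.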

References

Berwick, Robert C. & Amy S. Weinberg
  1984  The Grammatical Basis of Linguistic Performance: Language Use and Acquisition, Cambridge, Mass.: MIT Press.
Chomsky, Noam
  1956  "Three models for the description of language", in: I.R.E. Transactions on Information Theory, Vol. IT-2, 113-124. Reprinted with corrections in: R. D. Luce, R. Bush & E. Galanter (eds.), Readings in Mathematical Psychology, Vol. 2, New York: Wiley, 1965.
  1965  Aspects of the Theory of Syntax, Cambridge, Mass.: MIT Press.
Chomsky, Noam & Howard Lasnik
  1977  "Filters and control", Linguistic Inquiry 8: 425-504.
Croft, William
  1990  Typology and Universals, Cambridge: Cambridge University Press.
De Smedt, K. J. M. J.
  to appear  "Parallelism in incremental sentence generation", in: G. Adriaens & U. Hahn (eds.), Parallelism in Natural Language Processing, Ablex.
Dik, Simon C.
  1978  Functional Grammar, London: Academic Press.
  1989  The Theory of Functional Grammar, Part 1: The Structure of the Clause, Dordrecht: Foris.
Dowty, David
  1991  "Thematic proto-roles and argument selection", Language 67: 547-619.
Dryer, Matthew S.
  1992  "The Greenbergian word order correlations", Language 68: 81-138.
Erdmann, Peter
  1988  "On the principle of 'weight' in English", in: Caroline Duncan-Rose & Theo Vennemann (eds.), On Language, Rhetorica Phonologica Syntactica: A Festschrift for Robert P. Stockwell from his Friends and Colleagues, London: Routledge, 325-339.
Firbas, Jan
  1964  "On defining the theme in functional sentence analysis", Travaux Linguistiques de Prague 1: 267-280, Prague: Academia.
  1966  "On the concept of communicative dynamism in the theory of functional sentence perspective", Sbornik Praci Filosoficke Fakulty Brnenske University A-19, 135-144.

Performance theory of word order


Fodor, Janet D.
  1978  "Parsing strategies and constraints on transformations", Linguistic Inquiry 9: 427-473.
Frazier, Lyn
  1985  "Syntactic complexity", in: David Dowty, Lauri Karttunen & Arnold Zwicky (eds.), Natural Language Parsing: Psychological, Computational, and Theoretical Perspectives, Cambridge: Cambridge University Press, 129-189.
Frazier, Lyn & Keith Rayner
  1988  "Parameterizing the language processing system: left- versus right-branching within and across languages", in: John A. Hawkins (ed.), Explaining Language Universals, Oxford: Basil Blackwell, 247-279.
Givón, Talmy (ed.)
  1983  Topic Continuity in Discourse. A Quantitative Cross-Language Study, Amsterdam: John Benjamins.
Givón, Talmy
  1988  "The pragmatics of word order: predictability, importance and attention", in: Michael Hammond, Edith A. Moravcsik & Jessica Wirth (eds.), 243-284.
Greenberg, Joseph
  1966  Language Universals with Special Reference to Feature Hierarchies, The Hague: Mouton.
Gundel, Jeanette K.
  1988  "Universals of topic-comment structure", in: Michael Hammond, Edith A. Moravcsik & Jessica Wirth (eds.), 209-239.
Hagstrom, Cynthia
  1994  "A test of Early Immediate Constituent predictions for spoken English", MA paper, USC.
Hammond, Michael, Edith A. Moravcsik & Jessica Wirth (eds.)
  1988  Studies in Syntactic Typology, Amsterdam: John Benjamins.
Hawkins, John A.
  1988  "On explaining some left-right asymmetries in syntactic and morphological universals", in: Michael Hammond, Edith A. Moravcsik & Jessica Wirth (eds.), 321-357.
  1990  "A parsing theory of word order universals", Linguistic Inquiry 21: 223-261.
  1992  "Syntactic weight versus information structure in word order variation", in: Joachim Jacobs (ed.), Informationsstruktur und Grammatik, Linguistische Berichte Special Issue No. 4, 196-219.
  1993  "Heads, parsing and word-order universals", in: Greville G. Corbett, Norman M. Fraser & Scott McGlashan (eds.), Heads in grammatical theory, Cambridge: Cambridge University Press, 231-265.
  1994  A Performance Theory of Order and Constituency, Cambridge: Cambridge University Press.
  in preparation  "A typological approach to morphological expressiveness in Germanic".
Hawkins, John A. (ed.)
  1988  Explaining Language Universals, Oxford: Basil Blackwell.


Hawkins, John A. & Anne Cutler
  1988  "Psycholinguistic factors in morphological asymmetry", in: John A. Hawkins (ed.), 280-317.
Hawkins, John A. & Anna Siewierska (eds.)
  1991  Performance Principles of Word Order. EUROTYP 11/2. Working Paper 2 Theme Group 2, European Science Foundation Programme in Language Typology, ESF Office, Strasbourg.
Inoue, Atsu
  1991  "A comparative study of parsing in English and Japanese", Ph. D. dissertation, University of Connecticut, Storrs.
Inoue, Atsu & Janet D. Fodor
  1994  "Information-paced parsing of Japanese", in: Reiko Mazuka & Noriko Nagai (eds.), Japanese Syntactic Processing, Hillsdale, New Jersey: Lawrence Erlbaum, 125-155.
Keenan, Edward L.
  1988  "On semantics and the binding theory", in: John A. Hawkins (ed.), 105-144.
Keenan, Edward L. & Bernard Comrie
  1977  "Noun phrase accessibility and Universal Grammar", Linguistic Inquiry 8: 63-99.
Keenan, Edward L. & Sarah Hawkins
  1987  "The psychological validity of the Accessibility Hierarchy", in: Edward L. Keenan (ed.), Universal Grammar: 15 Essays, London: Routledge (Croom Helm), 60-85.
Kimball, John
  1973  "Seven principles of surface structure parsing in natural language", Cognition 2: 15-47.
Kirby, Simon
  1994  "Adaptive explanations for language universals", Sprachtypologie und Universalienforschung 47: 186-210.
Kiss, Katalin É.
  1987  Configurationality in Hungarian, Dordrecht: Reidel.
  1991  "Logical structure in syntactic structure: The case of Hungarian", in: C. T. J. Huang & Robert May (eds.), Logical Structure and Linguistic Structure, Dordrecht: Reidel, 75-95.
  this volume  "Discourse Configurationality in the languages of Europe".
Koopman, Hilda
  1984  The Syntax of Verbs: From Verb Movement Rules in the Kru Languages to Universal Grammar, Dordrecht: Foris.
Lascaratou, Chryssoula
  1989  A Functional Approach to Constituent Order with Particular Reference to Modern Greek, Athens: English Department, Athens University.
Lehmann, Christian
  1984  Der Relativsatz, Tübingen: Gunter Narr Verlag.
Lindblom, Björn, Peter MacNeilage & Michael Studdert-Kennedy
  1984  "Self-organizing processes and the explanation of phonological universals", in: Brian Butterworth, Bernard Comrie & Östen Dahl (eds.), Explanations for Language Universals, New York: Mouton, 181-203.


Lindblom, Björn & Ian Maddieson
  1988  "Phonetic universals in consonant systems", in: Larry M. Hyman & Charles N. Li (eds.), Language, Speech and Mind: Studies in Honour of Victoria A. Fromkin, London: Routledge, 62-78.
McCawley, James D.
  1982  "Parentheticals and discontinuous constituent structure", Linguistic Inquiry 13: 91-106.
  1987  "Some additional evidence for discontinuity", in: Geoffrey J. Huck & Almerindo E. Ojeda (eds.), Syntax and Semantics, Vol. 20: Discontinuous Constituency, New York: Academic Press, 185-200.
Miller, George & Noam Chomsky
  1963  "Finitary models of language users", in: R. D. Luce, R. R. Bush & E. Galanter (eds.), Handbook of Mathematical Psychology, Vol. 2, New York: Wiley, 130-175.
Mithun, Marianne
  1992  "Is basic word order universal?", in: Doris L. Payne (ed.), 15-32.
Newmeyer, Frederick J.
  1990  "Speaker-hearer asymmetry as a factor in language evolution: a functional explanation for formal principles of grammar", Proceedings of the Berkeley Linguistic Society 16, 102-112.
  1991  "Functional explanation in linguistics and the origins of language", Language and Communication 11: 3-28.
Payne, Doris L. (ed.)
  1992  Pragmatics of Word Order Flexibility, Amsterdam: John Benjamins.
Primus, Beatrice
  1991  "A performance based account of topic positions and focus positions", in: John A. Hawkins & Anna Siewierska (eds.), 1-33.
  1993  "Word order and information structure: a performance-based account of topic positions and focus positions", in: Joachim Jacobs, Arnim von Stechow, Wolfgang Sternefeld & Theo Vennemann (eds.), Syntax. An international Handbook of Contemporary Research, Berlin: de Gruyter, 880-896.
  1994  "Grammatik und Performanz: Faktoren der Wortstellungsvariation im Mittelfeld", Sprache und Pragmatik 32: 39-86.
  1995  "Cases and Thematic Roles: Ergative, Accusative, Active", University of Munich.
Raumolin-Brunberg, Helena
  1994  "The position of adjectival modifiers in Late Middle English noun phrases", in: Udo Fries, Gunnel Tottie & Peter Schneider (eds.), Creating and Using English Language Corpora, Papers from the Fourteenth International Conference on English Language Research on Computerized Corpora, Zürich, 159-168.
Reinhart, Tanya
  1981  "Pragmatics and linguistics: an analysis of sentence topics", Philosophica 27: 53-94.
  1983  Anaphora and Semantic Interpretation, London: Routledge (Croom Helm), and Chicago: University of Chicago Press.


Rijkhoff, Jan
  1992  "The noun phrase: a typological study of its form and function", Ph. D. dissertation, University of Amsterdam.
Sasse, Hans-Jürgen
  1987  "The thetic/categorical distinction revisited", Linguistics 25: 511-580.
Shannon, Thomas F.
  1992  "Toward an adequate characterization of relative clause extraposition in Modern German", in: Irmengard Rauch, Gerald F. Carr & Robert L. Kyes (eds.), On Germanic Linguistics: Issues and Methods, Berlin: de Gruyter, 1-18.
  1995  "Extraposition of NP complements in Dutch and German: an empirical comparison", in: Thomas F. Shannon & Johan P. Snapper (eds.), The Berkeley Conference on Dutch Linguistics 1993, University Press of America, Lanham, MD, 87-116.
Siewierska, Anna
  1991a  "Syntactic weight versus information structure and word order variation in Polish", in: J. A. Hawkins & A. Siewierska (eds.), 107-140.
  1991b  Functional Grammar, London: Routledge.
  1993  "Syntactic weight vs information structure and word order variation in Polish", Journal of Linguistics 29: 233-265.
Stallings, Lynne M., Maryellen C. MacDonald & P. G. O'Seaghdha
  in preparation  "Phrasal ordering constraints in sentence production: phrase length and verb disposition in Heavy-NP Shift", MS, Dept of Linguistics, USC.
Thompson, Sandra A.
  1978  "Modern English from a typological point of view: some implications of the function of word order", Linguistische Berichte 54: 19-35.
Tomlin, Russell S.
  1986  Basic Word Order: Functional Principles, London: Routledge (Croom Helm).
Travis, Lisa
  1984  "Parameters and effects of word order variation", Ph. D. dissertation, MIT.
  1989  "Parameters of phrase structure", in: M. R. Baltin & A. S. Kroch (eds.), Alternative Conceptions of Phrase Structure, Chicago: University of Chicago Press.
Vilkuna, Maria
  1989  Free Word Order in Finnish: its Syntax and Discourse Functions, Helsinki: Suomalaisen Kirjallisuuden Seura.
  1991  "Constituent order and constituent length in Finnish", in: J. A. Hawkins & A. Siewierska (eds.), 81-106.


Appendix 1: Lead Article from the Manchester Guardian Weekly (week ending 14th August 1994)

Ian Traynor Karadzic orders Serb defiance The Bosnian Serb leader, Radovan Karadzic, beleaguered as never before in more than two years of war, sought to shore up his power base at the weekend by stoking a mood of emergency, unity and defiance in the 70 per cent of Bosnia held by his forces. As president of the breakaway Bosnian Serb republic, he issued a decree ordering a full-scale mobilisation of the workforce to counter the blockade ordered by President Slobodan Milosevic of Serbia to punish the Bosnian Serbs for rejecting the international peace plan embraced by Belgrade. Weekend reports suggested that Mr Milosevic was making good his threat to seal the border, barring dozens of lorries from entering Bosnian Serb territory. His priority is to bring an end to two years of United Nations economic sanctions against Serbia, an aim that is being frustrated by the Bosnian Serbs' refusal to accept the peace plan. Mr Karadzic responded by vowing that the Bosnian Serbs would go it alone if need be and ordering all people of working age to report for compulsory work. "We will ask the Serb nation for further sacrifice," he said. After Nato's air strike on the Bosnian Serbs around Sarajevo on Friday, Mr Karadzic sought to make a virtue of his isolation. But in cannily worded statements, he also tried to keep his options open, hinting that he could campaign for a Yes vote in a referendum on the peace plan that the Bosnian Serbs are to hold at the end of the month. He also sent conciliatory signals, returning the five weapons his forces stole last week from a UN-guarded arms depot outside Sarajevo — the incident that triggered the air strike by breaching the Nato-decreed heavy-weapons ban around the Bosnian capital. Three times in the past three weeks the Bosnian Serbs have spurned the peace plan, which has been backed by Bosnia's Muslims and Croats. The rejection has left them bereft of allies. 
They are confronted by the Muslims and Croats in Bosnia, Nato and the western powers, Serbia and Russia. In another blow to their hopes, Moscow in effect endorsed the Nato air strike on Saturday, saying the Bosnian Serbs had brought it on themselves. The Russian statement appeared to confirm that the main priority of the five powers who drafted the latest peace plan — the US, Russia, Germany, France, and Britain — is to maintain their consensus. Mr Milosevic's tactics in the propaganda war are to target not the Serbs in Bosnia but their "leadership", who are "betraying the nation". In a direct reference to Mr Karadzic, a part-time poet, the Milosevic press has for the first time been invoking sympathy for the besieged people of Sarajevo. "Once peace comes, the people cannot be led by the men who bombarded civilians in Sarajevo and those who, to the world's revulsion, promote their poetry over Sarajevo while the city was burning," Politika, the president's main mouth-piece in Belgrade, said.

Anna Siewierska, Jan Rijkhoff and Dik Bakker

Appendix — 12 word order variables in the languages of Europe

The following appendix lists the values for twelve word order variables in the languages of Europe as defined by the EUROTYP project. The data comprising the appendix have been collected over the duration of the EUROTYP project on the basis of a questionnaire, direct consultation with language experts and native speakers and an examination of the existing literature. The rather elaborate word order questionnaire, devised by members of the constituent order group, was filled out by native speaker linguists or linguists working in consultation with native speakers for 56 languages. The languages in question are marked in the appendix by the letter Q next to the language name. We are greatly indebted to the many scholars who have shared their expertise with us and particularly to those who spared the time to complete the questionnaire. Our chief consultants together with the names of the languages on which they were consulted are listed at the end of the appendix. Needless to say, while we have relied on the knowledge of these scholars, the ultimate responsibility for how we have interpreted the answers to our queries rests with us. Though we have gone to great lengths to check the accuracy of the data contained in the appendix, confirming the data provided by one source with that of several other sources, for some of the languages we have had to rely solely on grammatical sketches and/or comparative remarks in surveys of groups of languages. Most of the Turkic languages fall into this category, as do quite a few of the indigenous languages of the Caucasus (e.g. Abaza, Bats, Hinukh, Lak, Tindi, Khvarshi, Udi, Ubykh), three of the Finnic (Karelian, Livonian, Votic), several of the Romance (e.g. Aragonese, Asturian, Corsican, Dalmatian, Franco-Provençal), three of the Balto-Slavic (Old Prussian, Polabian and Lower Sorbian), two of the Iranian (Talysh and Tati) as well as Oscan and Umbrian. We cannot vouch for the correctness of the data for these languages.
For Pontic and Tsakonian we have been unable to collect any data at all. And for Etruscan we have simply reiterated the current conjectures. The twelve word order variables presented in the appendix are those that have featured prominently in typological discussions of word order initiated by Greenberg's (1963) seminal work on word order universals. With the exception


of the first variable which represents the basic order of the subject, object and verb in a language, all the variables reflect the order between pairs of constituents where one of the two constituents is the head and the other the modifier. Unlike most listings of the order of such head/modifier pairs, which provide only the basic order of the two relative to each other, this appendix also specifies alternative orders and the morphological status (free, clitic, affix or compound) of the forms involved. Of the twelve word order variables, four relate to the order of what may be characterized as clause-level constituents. These are: the previously mentioned basic order of the subject, object and verb (BWO), the order of the adposition relative to the noun, i.e. whether a language has prepositions or postpositions (Adpos), the order of the lexical verb and the auxiliary verb (Aux/V), and the order of the recipient in a ditransitive clause relative to the verb (V/R). This last variable is essentially of relevance only for SOV languages. It is intended to capture whether the language in question is or is not strictly verb final. The remaining eight variables all pertain to the order of the constituents of the NP. The variables in question are: the order of the definite article and the noun (Def/N), the order of the indefinite article and noun (Indef/N), the order of the demonstrative and noun (Dem/N), the order of the cardinal numeral and the noun (Num/N), the order of the adjective and noun (Adj/N), the order of the nominal possessor and possessed noun (G/N), the order of the pronominal possessor and the possessed noun (Pro/N) and the order of the restrictive relative clause and noun (Rel/N). The identification of most of the categories comprising the twelve variables in the languages of Europe is relatively straightforward. The determination of the presence of definite and/or indefinite articles in a language constitutes the major exception.
Many languages may express definiteness or indefiniteness by means of possessive affixes or clitics (e.g. as in Turkic or Finnic), demonstratives (e. g. as in Serbian and spoken Finnish), the numeral one (e. g. as in Classical Armenian, Assyrian, Estonian, Finnish, Nenets, Turkish and Bashkir), case endings (e. g. as in Adyghe or Kabardian) or different series of adjectival endings (e.g. as in Latvian, Lithuanian and to some extent in Germanic). The use of these means of definiteness and/or indefiniteness marking, particularly via possessives, demonstratives and the numeral one, may be common enough in a language for some linguists to consider the forms in question as articles. Therefore in the case of some languages there is considerable disagreement in the literature in regard to whether they do or do not possess articles. We have adopted a rather conservative position on this issue and treated as articles only morphemes (words, clitics or affixes) used solely (or primarily) to indicate definiteness or indefiniteness. Accordingly none of the Turkic or Finnic languages, with the exception of Mordvin, are indicated in the appendix as having


a definite article. Nor are Classical Armenian, Assyrian, Estonian, Finnish, Nenets, Turkish or Bashkir treated as possessing an indefinite article. And of the Slavic languages only Bulgarian and Macedonian are considered as having a definite article. In determining the basic order of the twelve variables we have applied somewhat different criteria at the clause and NP levels. For the clause-level constituents we have considered the order obtaining in main, positive, declarative, indicative clauses with nominal as opposed to pronominal or clausal arguments. We took the basic order to be the statistically dominant order. In the case of the basic transitive order, we adopted the Greenbergian typology of S, O and V, where S stands for the agentive argument of a transitive clause and O for the patient-like argument. It must be mentioned that this characterization is not intended to carry any implications as to the syntactic as opposed to pragmatic underpinnings of the basic transitive order. Since we do not assume that the dominant configuration of the S, O and V is necessarily the product of syntactic rules, we have not made use of the labels 'free' or 'no basic order' which are two of the solutions frequently adopted for languages whose order is considered to be pragmatically determined. The vast majority of the languages of Europe have been given a basic transitive order classification of SVO, SOV or VSO. Some, however, have been assigned a dual classification. For several languages, namely Classical Greek, Classical Armenian, Upper and Lower Sorbian, Georgian and Hungarian, opinions differ as to whether the dominant order is SOV or SVO. Since there are arguments for and against either of these classifications, we have opted to remain agnostic on this issue.
The second group of languages which have been given a double basic transitive order classification are Dutch, Frisian, German and Luxembourgeois, all of which have SVO order in nonperiphrastic tenses and SAuxOV in periphrastic tenses. Given that in the former the object occurs after and in the latter before the lexical verb, a straightforward SVO or SOV classification of these languages could be considered as misleading. Two further languages with double basic transitive order are Breton and Cornish, both of which are labelled VSO/SVO. For Breton the statistics are again not clear. Moreover, the verb in VSO clauses cannot usually occur in absolute initial position, while the subject in SVO clauses takes a special particle which is distinct from that found in XVSO order. In Cornish VSO order is claimed to be dominant in verbal clauses and SVO order in nominalized clauses. The latter, however, are said to be more frequent than the former. In view of the heterogeneous nature of some of the modifiers of the noun, in determining the basic order of head/modifier pairs within the NP, in addition to the dominance criterion, we took into account certain semantic, structural


and distributional aspects of the modifiers. On the whole we took the basic order to be the dominant order displayed by the biggest class of a given modifier. Accordingly, for Pro/N order the order occurring with first and second person pronouns was treated as basic. And for Adj/N order the dominant order found with the biggest class of adjectives was taken as basic. For G/N order, however, we took the basic order to be that occurring with human possessors even if in the majority of the genitive phrases in the language the genitive occupied a different position, as is the case in English and other Germanic languages. Thus English, for example, is classified as having basic GN order as in the boy's jacket rather than basic NG order as in the car of the woman next door or the engine of the car, though the latter occurs with a bigger class of genitives than the former. In determining the basic order of cardinal numerals, we assigned priority to the order exhibited by lower ranking numerals rather than higher ranking ones, unless the former consisted of only one or two members. This latter situation occurs in Basque in which the numeral one and in some dialects two is prenominal while all the others are postnominal. In the case of relative constructions we took into account only the order of identifying as opposed to qualifying constructions. This means that we disregarded constructions such as the Dutch de in de tuin zittende man lit. 'the in the garden sitting man' where the prenominal participial phrase 'in de tuin zittende' typically serves to qualify the noun and not, like the finite relative clause de man die net nog in de tuin zat 'the man who was sitting in the garden just a moment ago', to help identify the noun. In languages which have both participial and clausal relatives both of which regularly may fulfil an identifying function, as in some of the Finnic languages, we took the order of the clausal relative as basic.
Our motivation for doing so was the fact that the participial relatives tend not to display the same range of arguments and adjuncts as the clausal. For the languages for which we lack detailed data, we opted to assign no basic order if two orders were attested. The alternative orders of head/modifier pairs specified in the appendix do not include orders which occur only if the modifier is itself modified. For example, though in English an adjective with a prepositional complement is obligatorily postnominal as in a man proud of his son rather than prenominal as in a proud man, English is not classified in the appendix as exhibiting NAdj in addition to AdjN order. Another type of alternative order which is not included in the appendix is that due solely to the attachment of a given modifier to another modifier. A case in point is that of DefN order in languages in which definite articles are enclitic to the first modifier of the NP. In languages which have such articles, for instance Rumanian, Bulgarian and Macedonian, the order of the article and noun is obligatorily N#Def as in the Rumanian om-ul


'the man'. The order DefN occurs only if the definite article is enclitic to a preceding adjective, numeral or, in Macedonian and Bulgarian, also possessive pronoun, as in the Rumanian bun-ul om 'the good man'. A similar situation is found in Modern Greek, in which the definite article can occur postnominally rather than prenominally only if it is repeated with a postnominal modifier, as in i jineka i omorfi lit. 'the woman the beautiful' as compared to i omorfi jineka or i omorfi i jineka. The alternative orders that we have taken into account include:
a) orders that involve the same class of modifiers as those in the basic order, used under different structural, semantic or pragmatic conditions than those typical of the basic order;
b) orders that involve a distinct class of modifiers to those found in the basic order;
c) orders that may be used with a sub-class of the modifiers found in the basic order;
d) combinations of the above.

An example of an alternative order used under different structural conditions to that of the basic order is that of DefN as compared to NDef order in the North Germanic languages, which use the former in conjunction with or instead of the latter in the presence of a prenominal adjective; cf. the Swedish stad-en 'the city' with den stora stad-en 'the big city'. An instance of an alternative order involving a semantic difference is that of NNum as compared to NumN order in Latvian and East Slavic, which is associated with an approximate interpretation of the numeral. An essentially pragmatic difference, contrast or emphasis, underlies the NAdj as opposed to AdjN order in most Slavic languages. An alternative order found with a complementary class of modifiers to that in the basic order is the previously mentioned NumN order with the numeral bat 'one' in Basque. The same phenomenon may be observed in Maltese and Classical Armenian. An alternative order displayed only by a subclass of the modifiers that occur in the basic order is the AdjN order in Celtic favoured by a small class of adjectives such as the Scots Gaelic deagh 'good', droch 'bad', fior 'real' which form quasi compounds with the noun. Another case in point is that of gentilitial adjectives in Basque which, unlike other adjectives which occur postnominally, may be placed both before and after the noun: amerikar hiria / hiri amerikarra 'American city'. A combination of the above underlies AdjN as compared to NAdj order in the Romance languages. In most of the Romance languages AdjN order is found only with a small class of adjectives such as the Catalan vell 'longstanding/old', pobre 'pitiable/poor' and pur 'sheer/pure'. The majority of this small class of adjectives also display NAdj order but with a


change of meaning. And just a few adjectives such as mer 'mere' can only occur prenominally. The data in the appendix are organized as follows. The languages are grouped according to their standardly recognized genetic affiliation into phyla, families and major branches and sub-branches. Within each genetic sub-grouping the languages are listed alphabetically. Under each language name the values for the twelve word order variables are specified in linear sequence beginning with BWO and ending with V/R. For languages which have more than one value for variables other than BWO, the basic order of the variable in question is accompanied by the letter (b) and the alternative order is given underneath the basic order. We have not attempted to indicate the conditions under which the alternative orders occur by means of notational conventions, since too many such conventions would be necessary to capture the diverse range of alternative orders that have been taken into account. Instead we provide relevant information, if available, in notes. In view of the fact that a given order even in closely related languages need not occur under the same set of conditions, the reader is advised to consult the notes before drawing any hasty generalizations solely on the basis of the ordering patterns indicated in the main body of the appendix. Alternative orders which are not commented upon in notes may be assumed to involve essentially the same classes of modifiers as those featuring in the basic order. If neither of the values for a given variable is indicated by a (b), this means that we did not have sufficient information to determine which order is basic, or, possibly, that none of the orders should be considered as basic. Blanks for the variables Def/N and Indef/N indicate that the specific variable is irrelevant for the language in question. A '?' underneath a variable means that we lack the relevant information. 
The information on the order of head/modifier pairs is accompanied by a specification of the morphological status (free, clitic, affix or quasi compound) of the forms involved. Orders involving free forms are indicated by the abbreviations of the forms in question beginning in upper case (e.g. NAdj or AdjN), while affixes are presented in lower case and indicated by '-' (e.g. N-def or pro-N). Proclitics and enclitics are represented by upper case with a '#' (e.g. Pro#N or N#Def). And the few instances of quasi compounds are indicated by the symbol ' = ' (e.g. Adj = N). If the free form of a modifier requires the concomitant presence of an affixal or clitic form of the same modifier or vice versa under some set of circumstances, both of these forms are specified (e.g. Pro#NPro or ProN-pro). Needless to say, since the distinction between free forms, clitics and affixes is notoriously difficult to draw and the same forms are often referred to in the literature by different labels, the distinctions that we have made in this regard may not always be correct. In distinguishing clitics
from free forms, unless we had information to the contrary, we took as a diagnostic of a free form independent word status in the writing system. Our basic criterion for distinguishing affixes from clitics was whether the article, demonstrative or possessive pronoun (the only three categories in the appendix for which we attempted to make this distinction) is necessarily attached to the noun or whether in the presence of other modifiers it attaches to one of these other modifiers. The former were classified as affixes, the latter as clitics. If we did not have enough information at our disposal to establish the distributional properties of an apparent bound form, it was assigned affixal status. The notational conventions for depicting free, clitic and affixal status that we adopted were difficult to apply in the case of pronominal possessive constructions in Celtic involving a postnominal personal form of a preposition, as in the Manx y thie aym, lit. 'the house at:1sg', meaning 'my house'. Since in such constructions the pronominal form is bound to the preposition rather than directly to the noun, either an NPro or an N-pro representation of the status of this pronominal form is misleading. Nonetheless, we have opted not to introduce further notational conventions and have represented this construction as NPro, thus signalling that the pronoun is not bound to the noun.
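The notation for morphological status can be made explicit with a short sketch. The Python fragment below is an illustration only, not a tool used in compiling the appendix: the function name `morphological_statuses` and the tokenizing pattern are hypothetical, and the sketch assumes that every abbreviation consists of one upper-case letter followed by lower-case letters (N, Adj, Pro, Def) or of lower-case letters alone (def, pro).

```python
import re

# Separators used in the appendix notation and the status each one signals.
SEPARATOR_STATUS = {"-": "affix", "#": "clitic", "=": "quasi compound"}

# An abbreviation (upper-case initial or all lower case) or a separator.
TOKEN = re.compile(r"[A-Z][a-z]*|[a-z]+|[#=-]")


def morphological_statuses(order: str) -> list:
    """Return the statuses encoded in an order label, in order of appearance."""
    tokens = TOKEN.findall(order.replace(" ", ""))
    found = []
    for i, tok in enumerate(tokens):
        if tok in SEPARATOR_STATUS:
            status = SEPARATOR_STATUS[tok]
        elif i + 1 < len(tokens) and tokens[i + 1] not in SEPARATOR_STATUS:
            # Two abbreviations with no separator between them: free forms.
            status = "free"
        else:
            continue
        if status not in found:
            found.append(status)
    return found


for label in ["NAdj", "N-def", "Pro#N", "Adj = N", "Pro#NPro", "ProN-pro"]:
    print(label, morphological_statuses(label))
```

Returning a list rather than a single label accommodates the doubled representations such as Pro#NPro, in which a clitic and a free form of the same modifier co-occur.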

Consultants

Tor A. Afarli - Norwegian
Jurji Anduganov - Mari
G. M. Awbery - Welsh
Emanuel Banfi - Albanian, Slovene
Peter Bakker - Romani
Giovanni M. G. Belluscio - Albanian
Giuliano Bernini - Italian
Vit Bubenik - Czech
Alain Christol - Ossetic
Bernard Comrie - Maltese
Ines Loi Corvetto - Sardinian
Helma Dik - Ancient Greek
Margreet Dorleijn - Kurdish, Turkish
Karen Ebert - Frisian (North)
Jack Feuillet - Bulgarian, Macedonian
Anna Gavarro - Catalan
Inge Genee - Irish
Jadranka Gvozdanovic - Serbian/Croatian
Riho Grünthal - Mordvin (Erzya)
Ian Hancock - Romani
Martin Haspelmath - Lezgian
Toomas Help - Estonian
Titi Hennoste - Estonian
Mateja Hocevar - Slovene
Anders Holmberg - Swedish
Lisbeth Falster Jakobsen - Danish
Jan de Jong - Latin
Johannes Gisli Jonsson - Icelandic
Birute Klass - Estonian
Paula Kokkonen - Komi Zyrian
Katalin Kiss - Hungarian
Chryssoula Lascaratou - Greek
Ruta Marcinkeviciene - Lithuanian
Yaron Matras - Kirmanji, Romani
Belen Lopez Meirama - Galician
Juan Carlos Moreno Cabrera - Spanish
Igor Nedjalkov - Ossetic, Karachay
Christian T. Petersen - Gothic
Beatrice Primus - High German, Rumanian
Donall P. O Baoill - Irish
Eusebio Osa - Basque
Bernard Oyharcabal - Basque
Estiphan Panoussi - Modern Eastern Armenian, Assyrian
Jan Rijkhoff - Dutch
Sirkka Saarinen - Mari
Tapani Salminen - Nenets (Tundra)
Merja Salo - Mordvin (Erzya)
Pekka Sammallahti - Sami
Gerjan van Schaaik - Turkish
Suzanne Schlyter - French
Irja Seurujarvi-Kari - Sami
Anna Siewierska - Polish
H. A. Sigurðsson - Icelandic
Rieks Smeets - Circassian
Svillen Stanchev - Bulgarian
Janig Stephens - Breton
Pirkko Suihkonen - Udmurt
Jan-Olof Svantesson - Kalmyk
Maggie Tallerman - Welsh
Yakov G. Testelec - Russian, Kartvelian, Daghestanian
Ingrid Thelin - Spanish
Jasmine Tragut - Armenian
Ludmila Uhlirova - Czech, Slovak, Polish, Bulgarian
Enric Vallduvi - Catalan
Martina Vanhove - Maltese
Maria Vilkuna - Finnish and Finnic
Jos Weitenberg - Classical and East Armenian
Bibinur Zaguljajeva - Udmurt
Tomaso Zorzutti - Sardinian
Bostjan Zupanicic - Slovene
