Grammatical Relations: A Cross-Linguistic Perspective on their Syntax and Semantics 9783110887334, 9783110137378



English Pages 181 [184] Year 1994



Grammatical Relations

Empirical Approaches to Language Typology 11 Editors Georg Bossong Bernard Comrie

Mouton de Gruyter Berlin · New York

Grammatical Relations A Cross-Linguistic Perspective on their Syntax and Semantics by Franz Müller-Gotama


1994

Mouton de Gruyter (formerly Mouton, The Hague) is a Division of Walter de Gruyter & Co., Berlin.

∞ Printed on acid-free paper which falls within the guidelines of the ANSI to ensure permanence and durability.

Library of Congress Cataloging-in-Publication Data

Müller-Gotama, Franz, 1959-
Grammatical relations : a cross-linguistic perspective on their syntax and semantics / by Franz Müller-Gotama.
p. cm. - (Empirical approaches to language typology : 11)
Revision of the author's thesis (doctoral - University of Southern California, 1991).
Includes bibliographical references and index.
ISBN 3-11-013737-2
1. Grammar, Comparative and general - Syntax. 2. Semantics. 3. Grammar, Comparative and general - Grammaticalization. 4. Typology (Linguistics). I. Title. II. Series.
P291.M33 1994 415-dc20 91-3344 CIP

Die Deutsche Bibliothek - Cataloging-in-Publication Data

Müller-Gotama, Franz:
Grammatical relations : a cross-linguistic perspective on their syntax and semantics / By Franz Müller-Gotama. - Berlin ; New York : Mouton de Gruyter, 1994
(Empirical approaches to language typology ; 11)
ISBN 3-11-013737-2
NE: GT

© Copyright 1994 by Walter de Gruyter & Co., D-10785 Berlin. All rights reserved, including those of translation into foreign languages. No part of this book may be reproduced or transmitted in any form or by any means, electronic or mechanical, including photocopy, recording, or any information storage and retrieval system, without permission in writing from the publisher. Printing: Gerike GmbH, Berlin. — Binding: Lüderitz & Bauer, Berlin. Printed in Germany.

Acknowledgements

This book is based on my doctoral dissertation, which was submitted to the Linguistics Department at the University of Southern California in 1991. Since then, it has undergone a series of revisions until it reached its present state, substantially rethought, much streamlined, and, hopefully, all the better for it.

I would like to once again thank the members of my dissertation committee, Bernard Comrie, Jack Hawkins, and Steve Lansing. This work would not have been possible without them. My thanks also go to the other members of my guidance committee, Joseph Aoun, Ed Finegan, and Jackie Schachter. I am indebted to Prof. Georg Bossong, who took an interest in my work and as editor of EALT guided it through several revisions to publishability in this series. The comments of several anonymous reviewers have been invaluable in this process.

Although much of the data for this study comes from published sources, a good part of it was provided or rechecked by native speakers who were kind enough to sacrifice their spare time to serve as "language consultants" for this project. My sincere thanks go to them all, to Caecilia and Jacintha Gotama, to Jeongdal Kim, Suchitra Sadanandan, Kaoru Horie, Tim Shi, Anji Hsieh, and Rivka Brandt, and to my anonymous Dutch informant. I would also like to thank Alan Kaye for checking the accuracy of the Hebrew data.

Finally, I owe a very special thank you to my wife Caecilia and to my parents, Anneliese and Willi Müller. Dieses Buch ist für Dich, Willi.

Contents

Acknowledgements
Abbreviations

Chapter 1 Introduction
1.1. The basic problem
1.2. Precursors
1.2.1. Evaluation of Hawkins (1986)
1.2.2. Hale (1982)
1.2.3. Comrie (1986), (1989b)
1.3. The nature of the typology
1.3.1. Transparency and grammaticization
1.3.2. Consequences and problems of the scalar model
1.3.3. Ideal-typical instantiation in the Semantic Typology
1.3.3.1. Basic grammatical relations
1.3.3.2. Manipulation of grammatical relations
1.3.3.3. Summary of the ideal types
1.4. Summary

Chapter 2 High semantic transparency: Korean
2.1. Participant marking
2.2. The semantic range of basic grammatical relations
2.2.1. Subjects
2.2.2. Objects
2.3. Word order effects
2.3.1. Scrambling
2.3.2. Extractions
2.3.3. Raising
2.3.3.1. Subject-to-subject raising
2.3.3.2. Object-to-subject raising
2.4. Clause marking
2.5. Summary

Chapter 3 The interaction with other principles
3.1. An interactionist view of language
3.2. Subject and voice in Indonesian
3.3. Low semantic transparency in Indonesian
3.3.1. Basic word order
3.3.2. Grammatical relations
3.3.3. The semantic diversity of basic grammatical relations
3.3.3.1. Objects
3.3.3.2. Subjects
3.4. High semantic transparency in Indonesian
3.4.1. Raising
3.4.2. Extractions
3.4.3. Discussion
3.5. Indonesian in the Semantic Typology

Chapter 4 The cross-linguistic survey
4.1. The sample
4.1.1. Criteria for sample construction
4.1.2. The sample used in this study
4.2. Left-branching versus right-branching
4.3. Left-branching languages
4.3.1. Malayalam
4.3.1.1. Word order and case marking
4.3.1.2. Grammatical relations
4.3.1.3. Argument trespassing
4.3.1.4. Summary
4.3.2. Japanese
4.3.2.1. Basic grammatical properties
4.3.2.2. Manipulation of the basic participant structure
4.3.3. Turkish
4.3.4. Hixkaryana
4.3.5. Dutch
4.3.5.1. A remark on word order
4.3.5.2. Basic grammatical properties
4.3.5.3. Syntactic movement
4.3.5.4. Summary
4.4. Right-branching languages
4.4.1. Jacaltec
4.4.1.1. Argument trespassing
4.4.1.2. The semantic range of the basic grammatical relations
4.4.1.3. Conclusion: Split properties in right-branching languages
4.4.2. Sawu
4.4.3. Babungo
4.4.4. Chinese
4.4.4.1. A remark on word order
4.4.4.2. Basic grammatical properties
4.4.4.3. Syntactic movement
4.4.4.4. Assessment of the Chinese facts
4.4.5. Hebrew

Chapter 5 Summary of results

Notes
References
Language Index
Subject Index

Abbreviations

1         first person
2         second person
3         third person
ABS       absolutive
ACC       accusative
ASP       aspect marker
BEN       benefactive
CAUS      causative
CL        classifier
COLL      collective
COMP      complementizer
DAT       dative
DECL      declarative
DEF       definite
DEM       demonstrative
DET       determiner
DETRANS   detransitivized
ERG       ergative
FUT       future tense
GEN       genitive
HON       honorific
INF       infinitive
INST      instrumental
INT       intransitive
LF        logical form
LOC       locative
NEG       negative
NML       nominalizer
NOM       nominative
Ø         zero
PASS      passive
PL        plural
PNE       prenominal ending
POSS      possessive
PRES      present tense
Q         question
REDUP     reduplication
SG        singular
TA        transitive active
TOP       topic
TRANS     transitive

1. Introduction

1.1. The basic problem

The relationship between the semantics and the syntax of language is a central problem in linguistic inquiry.1 Any grammatical theory must ultimately be concerned with how it handles the mapping of meaning onto grammatical form if it is to be successful as an account of human knowledge of language. Because of the tremendous variation in the surface syntax of the world's languages, much of the research, particularly in generative grammar, has shared the assumption that this mapping must take place at some abstract, "deep" level of syntactic structure. Chomsky's classical Standard Theory, for instance, had the mapping process proceed from deep structure to semantic structure (Chomsky 1965: 16), while present-day generative grammar sees a complex troika of s-structure, d-structure, and Logical Form (LF) at work: S-structure is mostly responsible for the interpretation of anaphora, d-structure for the recognition of the thematic (i.e., semantic) roles of noun phrases, and LF for the interpretation of quantifiers. Surface structure and semantics are, then, only very indirectly connected in multi-level theories, and any significant linguistic generalizations can only be expected between semantic structure and the relevant abstract syntactic level.

Clearly, each language has its own unique grammatical structure, which must be interpreted semantically; but despite the obvious variability among surface grammars and the resulting "deep" bias of the research, the present work asks whether systematic relationships cannot be established cross-linguistically between semantic structure and the surface syntax. Specifically, we deal with nominal participants, especially the core grammatical relations of subject and object, and show that there is a strong cross-linguistic correlation between the semantic content of the grammatical relations in a language and their grammatical treatment.

Briefly, we find that languages which define the semantic range of their grammatical relations narrowly, so that for example basic subject status is restricted to agents and experiencers, also have only few syntactic rules - like raising, passive, or extraction - which rearrange the surface manifestation of the grammatical relations. Conversely, languages which allow a wider range of semantic roles as basic subjects permit such syntactic processes more extensively. This cross-linguistic distribution can be represented formally as a typological continuum which arises from the interaction of two counteracting forces, semantic transparency and grammaticization.

We also observe striking differences between head-initial and head-final languages. The former can be located at any point on the continuum while the latter are invariably found near the semantically transparent end of the scale. Furthermore, of the few languages which exhibit "split" properties, none are head-final. The systematic nature of this contrast suggests that it is non-accidental and caused by the principled interaction of the Semantic Typology with another linguistic principle, namely, head-position. On an explanatory level, we attribute the contrast to the different demands made on the parser by left-branching and by right-branching languages.

That systematic relationships between surface syntax and semantic structure exist in individual languages must be assumed, of course, in any syntactic framework which does not postulate any "deep" syntactic level, however that level may be defined, because such a theory requires that semantic interpretation proceed from surface structure. If the relationship between surface syntax and semantics were completely arbitrary, the meaningful interpretation of utterances would be rendered impossible. Even multi-layered syntactic theories can accommodate significant generalizations between surface syntax and semantic structure insofar as the posited syntactic levels are differentiated by the systematic application of rules or principles. What needs to be established in this work is that this syntax-semantics interface does not vary randomly from one language to the next, but that, instead, universal constraints exist on its shape and the permitted range of variation.2

Overall, then, this work establishes a typological continuum which accounts for the cross-linguistic correlation between the degree of semantic specificity of the grammatical relations on the one hand and their syntactic behavior on the other. The different behavior of head-initial and head-final languages provides evidence for the systematic interaction of our Semantic Typology with another organizing principle of language, as expected in a modular conception of language.

1.2. Precursors

The present research arises from the typological inference drawn by Hawkins (1986) in his contrastive study of English and German. Hawkins found that the grammars of English and German contrasted systematically
over a wide range of linguistic properties, with the grammar of English consistently permitting greater deviation from the semantic configurations than that of German, leading him to speculate that the systematicity of these contrasts results from the effects of a previously unnoticed typological parameter, which he calls the Semantic Typology. This typology holds that the grammars of individual languages differ systematically from one another according to the closeness of the fit they provide between syntactic and semantic structure (Hawkins 1986: 123). Hawkins' work not only corroborates the correspondences posited by Sapir's theory of drift (Sapir 1921 [1949]: 147-170), but also claims that they extend into much wider areas of the grammar than Sapir had assumed, including inflectional morphology, grammatical relations, word order, deletion rules, WH-movement, raising, and rules of semantic interpretation. Many of the English-German contrasts discussed by Hawkins can be attributed to the effects of the Semantic Typology as it is understood in this work. The case system of German illustrates this fact. Hawkins treats case in the context of a discussion on grammatical morphology and, as in the other grammatical properties he examines, finds that the set of case distinctions made in English constitutes a proper subset of the distinctions made in German. German nouns have four different case forms, the nominative (der Mann 'the man'), genitive (des Mannes 'of the man'), dative (dem Mann 'to the man'), and accusative cases (den Mann 'the man'). This four-way distinction allows the language to grammatically discriminate different types of arguments that collapse into a single morphosyntactic category in English. This is particularly so as the only case form which remains distinct from the base form of a noun in English, the genitive, has lost the ability to mark grammatical functions and is restricted to its possessive use. 
Compare the following examples which illustrate how objects with different semantic roles are coded by distinct case forms in German while they collapse into a single direct object class in English:

(1) Hans schlug den Mann.
    John hit the/ACC man
    'John hit the man.'

(2) Hans half dem Mann.
    John helped the/DAT man
    'John helped the man.'

(3) Hans bedarf meiner Hilfe.
    John needs my-GEN help
    'John needs my help.'

Hawkins argues, following Plank (1980, 1981), that the case distinction made in the German examples derives from a semantic difference, namely the degree of affectedness of the object. The English sentences in (1) through (3) all follow the same SVO pattern, and all can undergo passivization, promoting the underlying object to surface subjecthood. In their German equivalents, on the other hand, promotion to subject is a possibility only for the accusative object. Among other semantic distinctions that can be expressed by case in German is the static/directional opposition in objects of prepositions like auf dem Tisch (DAT: static) versus auf den Tisch (ACC: directional) where English has 'on the table' for both meanings.

German utilizes the power of its case system to achieve greater clause-internal word order freedom. Grammatical relations are, thus, coded by case, in contrast to English where this task is performed by word order. As a result, German can use word order to encode pragmatic functions which are not encoded overtly in English.3 The case system can also be held responsible for the fact that the basic grammatical relations of German are semantically more focused than those of English. This is true not just for the differentiation of different types of objects, but also for the semantic range of noun phrases that become subjects, because non-agentive noun phrases frequently do not become subjects in German when they do in English, both in terms of the grammaticality of the resulting sentence as such and in terms of frequency of use (Rohdenburg 1974, reported in Hawkins 1986: 58), yielding contrasts like the following:

(4) a. The book sold 10,000 copies.
    b. *Das Buch verkaufte 10,000 Exemplare.

(5) a. This car seats four.
    b. *Dieses Auto sitzt vier (Personen).

The grammatical relations in English and in German differ not only in their respective semantic range but also in how they are treated syntactically. Clause-external movement is far freer in English than it is in German. In fact, Hawkins argues that in all cases where German permits a participant to move out of its clause, so does English, but the reverse does not hold (Hawkins 1986: 75). Both languages have apparent instances of subject-to-subject, subject-to-object, and object-to-subject raising, but in each raising type the set of trigger predicates is much smaller in German and properly included in that of English. Furthermore, their peculiar syntactic restrictions have led various linguists to question the correctness of a raising analysis for each one of these grammatical processes in German (cf. Ebert 1975: 177-187 for subject-to-subject raising; Reis 1973: 519-529 and Harbert 1977: 128-136 for subject-to-object raising; and Comrie and Matthews 1990 for object-to-subject raising).

As in the case of raising processes, WH-movement is more restricted in German than in English; English can extract wherever German can as well as in additional environments where WH-movement induces ungrammaticality in German. Hawkins (1986: 87) sets up a hierarchy of extraction possibilities which illustrates this claim.

Table 1. Extraction possibilities in English and German

Extraction environment                                     German   English
infinitival object complement of 2-place predicate         Yes      Yes
infinitival non-subject complement of 3-place predicate    Yes/?    Yes
finite non-subject complement                              No/?     Yes
infinitive adverbial clause                                No       Yes/?
finite adverbial clause                                    No       No
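
The two claims read off Table 1 (English extracts wherever German does, and acceptability only decreases going down the table) can be checked mechanically. The following is a minimal sketch, not part of Hawkins' or the author's analysis: the numeric ranking of the judgment labels and the names SCORE, TABLE_1, english_at_least_as_free, and is_hierarchy are assumptions introduced purely for illustration.

```python
# Ordinal ranking of the judgment labels in Table 1.
# Assumption for illustration only: the source gives the labels, not a scale.
SCORE = {"No": 0, "No/?": 1, "Yes/?": 2, "Yes": 3}

# Rows in the order of Table 1: (environment, German, English).
TABLE_1 = [
    ("infinitival object complement of 2-place predicate", "Yes", "Yes"),
    ("infinitival non-subject complement of 3-place predicate", "Yes/?", "Yes"),
    ("finite non-subject complement", "No/?", "Yes"),
    ("infinitive adverbial clause", "No", "Yes/?"),
    ("finite adverbial clause", "No", "No"),
]


def english_at_least_as_free(rows):
    """True if English is rated at least as acceptable as German in every row."""
    return all(SCORE[eng] >= SCORE[ger] for _, ger, eng in rows)


def is_hierarchy(judgments):
    """True if acceptability never increases from one row to the next."""
    scores = [SCORE[j] for j in judgments]
    return all(a >= b for a, b in zip(scores, scores[1:]))


german = [ger for _, ger, _ in TABLE_1]
english = [eng for _, _, eng in TABLE_1]

print(english_at_least_as_free(TABLE_1))
print(is_hierarchy(german), is_hierarchy(english))
```

Encoding the judgments as ranks keeps the check independent of the particular labels; a language whose column rose again further down the table would fail is_hierarchy and so count against the proposed ordering.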

Further analysis reveals that this hierarchy is definitely non-accidental because it is based on the principled interaction of two syntactic parameters, namely (a) subcategorization and (b) finiteness. Extraction out of subcategorized complements, i.e., lines 1 through 3 in Table 1, is always easier than extraction out of non-subcategorized complements, i.e., lines 4 and 5. In each of these two groups, extraction out of infinitives is easier than extraction out of finite clauses. Of course, this is precisely the lineup of properties that we should expect given the thrust of the argument on the nature of English and German. A subcategorized complement is by definition an integral, necessary constituent of the matrix clause, whereas a non-subcategorized complement is not. Loosely speaking, a subcategorized complement clause and its matrix are more of a single entity than a matrix with a non-subcategorized complement. Consequently, the prohibition of extraction out of the latter in German is in keeping with the general tendency in that language to keep semantically distinct entities syntactically distinct. Similarly, it can be argued that infinitival clauses are more closely tied with the matrix than finite clauses. Some evidence for this assertion comes from the fact that infinitivals typically share an argument with the matrix (cf. the rule of equi-NP deletion), a generalization which cannot be made for finite complement clauses. Whether this extraction hierarchy holds cross-linguistically is an empirical question which warrants further investigation.

Pied piping rules state that a certain phrase is moved as a unit, rather than extracting an element out of that phrase. Sentence (6) illustrates one type of such pied piping rules, the pied piping of a prepositional phrase:

(6) a. Das Auto, mit dem wir gefahren sind.
       the car with that we driven are
       'The car with which we drove.'
    b. *Das Auto, dem wir mit [ t ] gefahren sind.
       the car that we with driven are
       'The car that we drove with.'

Pied piping of prepositional phrases is required in such sentences in German. In contrast to English, it is not possible to extract a noun phrase out of a prepositional phrase, as the ungrammaticality of example (6b) demonstrates. As a result of the obligatory application of this pied piping rule, preposition stranding is avoided in German and the structural integrity of the prepositional phrase maintained. Pied piping moves a larger constituent, for instance a verb phrase, a prepositional phrase, or a noun phrase, as a whole when simple extraction without pied piping violates the semantic integrity of the larger constituent by moving a noun phrase out of such a phrase, removing it from its governor. Pied piping rules are thus tantamount to constraints on the extent to which the operation of WH-movement may break up constituents. It is therefore not surprising that pied piping is the only type of movement which applies more freely in German than in English.

Hawkins (1986: 115) adds that English has more deletions, largely as a result of tighter constraints on conjunction reduction and more extensive use of pronoun copies in German. Finally, he notes (1986: 29) that Plank (1980) confirms the traditional observation that German verbs tend to impose tighter selectional restrictions on their arguments than their English equivalents. The contrasts between German and English found by Hawkins can be summarized as in Table 2:

Table 2. German-English contrasts

GERMAN                                              ENGLISH
More inflectional morphology                        Less inflectional morphology
More specific selectional restrictions              Less specific selectional restrictions
More word order freedom                             Less word order freedom
Less semantic diversity of grammatical relations    More semantic diversity of grammatical relations
Less raising                                        More raising
Less extraction                                     More extraction
More pied piping                                    Less pied piping
Less deletion of noun phrases                       More deletion of noun phrases

1.2.1. Evaluation of Hawkins (1986)

The distribution of properties shown in Table 2 reveals a common directionality among the contrasts between English and German. Each contrasting feature contributes to a closer correspondence between surface grammatical expression and semantic structure in German in the sense that English surface forms will be more ambiguous (or vague) than their German counterparts and that grammatical processes which tend to destroy semantic clause structure apply more freely in English. Hawkins' findings therefore constitute a promising point of departure for the typology developed in this work. As a contrastive study of two languages, Hawkins' work certainly achieves what it set out to do, that is, to demonstrate that the contrasts between English and German are remarkably precise and systematic. However, there are some problems with his work as a contribution to linguistic typology. In order to explain the systematicity of the English-German contrasts, Hawkins postulates the existence of a hitherto unnoticed typological parameter which aligns languages relative to the "tightness of the fit" between semantic structure and the surface syntax. This inference of a new typological parameter represents a significant advance in the paradigm of Typological
Universal Grammar, but two issues need to be addressed before his findings can be integrated into a theoretically sound language typology. One is the question whether the set of regularities observed between English and German in fact results from the effects of a general typological parameter or whether it must be attributed to other causal factors, such as diachrony. The second issue concerns the question whether all of the observed contrastive properties are typologically relevant. The adequacy, both empirically and conceptually, of Hawkins' typological claim is, hence, yet to be established. Hawkins himself is careful to point out the tentativeness of a typological inference based on two closely related languages only. Indeed, the contrasts he observed could simply turn out to be an artifact of the close genetic affiliation of these particular languages and their peculiar diachronic development. At the time of the breakup of West Germanic into its various daughter languages, the grammatical systems of English and German must clearly have been very close to each other. As the grammars of the two languages gradually diverged, the changes tended to compound, resulting in grammatical systems which are quite different from one another. Since grammatical changes do not occur in isolation, but rather in the context of a whole grammar, un système où tout se tient [a system where everything hangs together with everything else], as Meillet's famous Structuralist dictum puts it, a systematic realignment of the component parts of the grammar results. In general, the grammatical development of German has been a conservative one. By contrast, the syntax of English has changed dramatically since the erosion of the case marking system, with each change compounding the differences between the two languages. 
As Sapir (1921 [1949]: 147-170) has so poignantly stated in his theory of linguistic drift, a single change may trigger a set of consequent changes, which in turn trigger their own consequences. The single incipient change has, thus, set in motion a chain reaction which ultimately yields, and which is sufficient to explain, the differences between the resulting grammars. Insofar as Sapir is correct in assuming that diachronic change proceeds in a regular, non-haphazard fashion, it is therefore possible to explain regular contrasts between the grammars of genetically related languages by means of such regularities of diachronic change. What is important for us here is to emphasize that it is possible to construct an explanatory hypothesis for Hawkins' observations which does not have recourse to typological considerations.

The second issue, whether all of the properties observed by Hawkins are indeed attributable to typological factors, obviously goes beyond the stated goal of his work. The purpose of a purely contrastive study is to identify all the contrasts that can be found in the data base. This is what Hawkins set out to do, and this is what he succeeds in doing. However, the task for the typologist lies not merely in listing correspondences between languages, since some of them may turn out to be accidental or relics of a genetic relationship; instead, he must identify correspondences that are indicative of the functioning of a typological parameter. It is therefore essential to ask which of Hawkins' findings could have resulted from principled typological factors and which may be accidental. Hawkins' discussion of the German case system is especially illuminating in this respect. The existence of case is inseparably intertwined with the nature of the grammatical relations in a language, and so are the semantic diversity of those relations, grammatical relation changing rules like passivization or raising, extraction processes, and word order. All of these properties have to do with either the semantic content or the syntactic realization of grammatical relations, and this shared association with two aspects of grammatical relations may form the principled basis for them to correlate cross-linguistically. Case, then, is typologically interesting because it is an important element in the syntactic treatment of grammatical relations. But Hawkins views case principally as an instantiation of inflectional morphology. For him, it represents an example where the grammar of English makes fewer distinctions than German, which are furthermore properly included in those made in the latter language. In this perspective, mood or number marking on verbs, for instance, would be just as relevant an example as case is. 
Indeed, Hawkins grapples with the "problem" of aspect, where English distinguishes the progressive from the non-progressive, a distinction which is lacking in (Standard) German (Hawkins 1986: 291, note 4). The German sentence (7) covers the semantic ground of both glosses. The former implies that John owns the car while the latter has no such implication and simply asserts that John is operating it at this moment:

(7) Hans fährt einen gelben Porsche.
    John drives a yellow Porsche
    'John drives a yellow Porsche.' or 'John is driving a yellow Porsche.'

Of course, the distinction can be expressed lexically in German when needed, as is shown in (8), but this does not detract from the fact that German has a single surface category here whereas English forces a grammatical distinction.4

(8) Hans fährt gerade einen gelben Porsche.
    John drives just now a yellow Porsche
    'Hans is driving a yellow Porsche.'

Hawkins' resolution of the aspect problem clearly reveals that his emphasis is on bound morphological expression per se and not on the grammatical categories expressed by it, for he argues that the English progressive is expressed periphrastically and not as a bound morpheme, thus leaving his strictly morphological claim intact: The grammar of English makes fewer morphological distinctions than that of German, and the distinctions that it does make form a proper subset of those found in the latter language.

In the present work, our emphasis differs fundamentally from that of Hawkins. We are conducting a typological study which focuses on the notion of grammatical relations, arguing that a significant cross-linguistic correlation exists between their semantic content on the one hand and their syntactic treatment on the other. Case marking is interesting for us not as a morphological phenomenon, but because it is an essential characteristic of how grammatical relations are expressed in many languages. Mood or aspect marking, on the other hand, like many other grammatical categories that may be coded as bound morphemes, are not of interest to us here since there is no reason to expect that they have anything to do with the syntax or the semantics of nominal participants.

In conclusion, Hawkins' work is significant because the systematic contrasts which he found between English and German led him to induce a new typological parameter, which he called the Semantic Typology of language.
Our present work establishes this typology as a valid linguistic concept by developing its theoretical foundation and subjecting it to the empirical test of a cross-linguistic investigation.

1.2.2. Hale (1982) The Semantic Typology has a striking parallel in the configurationality parameter proposed by Hale (1982). This parameter subsumes some of the same features that form the basis for Hawkins' contrastive typology. The use of a rich case system and a relatively free word order co-occur nonaccidentally according to both models. In Hawkins' typology, they constitute the basis for the greater semantic transparency in German, and their loss, following Sapir, the root cause for the increasing loss of transparency in English; for Hale, they are essential features of the non-configurational setting of the parameter (Hale 1982: 1-2). Another property of Hale's parameter which plays a role in our typology is the lack of NP-movement in non-configurational languages, which matches well with our claim in section 1.3.3 that syntactic movement of arguments is rare in languages with high semantic transparency. However, it should be noted as well that important differences between the Semantic Typology and the configurationality parameter remain. There are empirical differences since not all the linguistic properties which Hale associates with configurationality can be meaningfully included in our typology. This is immediately true, for example, of his claims regarding differences in the lexicon and in the morphological structure between configurational and non-configurational languages. Hale further claims that non-configurational languages are characterized by the use of discontinuous expressions, although some non-configurational languages may not break up constituents. There is also an important conceptual difference between our typology and Hale's system in as far as parameters in generative grammar are thought to have precisely two possible settings. This means that a language must be either [+ configurational] or [- configurational] in Hale's model, depending on whether the language has the requisite structural properties or whether it does not. 
Our conception of the Semantic Typology, on the other hand, envisages a linguistic continuum in line with the theoretical assumptions of the Cologne Universals Project and rejects binarity as empirically inadequate. A final reason for caution in the use of Hale's parameter comes from its current status within generative grammar. Despite the explicit listing of defining features in Hale's initial formulation, there has been little agreement in subsequent work as to which languages should be considered non-configurational and what criteria are actually relevant. The relation of the Semantic Typology to the configurationality parameter will, therefore, not be explicitly considered further in this work.

1.2.3. Comrie (1986), (1989b)

The analysis of Russian in Comrie (1986) and (1989b) has provided support for a typological explanation. Applying the approach of Hawkins (1986), Comrie argues that Russian contrasts with English in the same ways as German does, only more strongly so. Comrie shows that Russian combines an extensive inflectional morphology, including a rich case marking system, with a free, pragmatically determined word order. Since this word order freedom makes it possible to express pragmatic functions directly, recourse to grammatical relation changing rules is not necessary for topicalization, and the use of the passive is, consequently, quite restricted. English passives, as in sentence (9), therefore readily translate as actives in Russian, cf. example (10):⁵

(9) Victor was given a book.

(10) Viktoru dali knigu.
     Victor (DAT) gave (3PL) book (ACC)
     lit.: 'They gave Victor a book.'

Other grammatical relation changing processes are similarly constrained. Comrie argues that putative examples of raising in Russian on closer inspection turn out to be base-generated and do not involve raising. Examples (11), (12), and (13) present sentences which are potential instances of subject-to-subject, subject-to-object, and object-to-subject raising, respectively:

(11) Viktor kažetsja durakom.
     Victor seems fool
     'Victor seems a fool.'

(12) Ja sčitaju Viktora umnym.
     I consider Victor clever
     'I consider Victor clever.'

(13) Èta kniga legka dlja čtenija.
     this book (NOM) easy (FEMININE) for read
     'This book is easy to read.'

The putative instances of subject-to-subject and subject-to-object raising all require byt' 'be' as the verb of the subordinate clause, with the additional restriction that this byt' must obligatorily be deleted. Furthermore, the construction requires that byt' introduces an adjectival or a nominal, but not a locative phrase. Instead of positing a syntactic process with such strong and highly idiosyncratic constraints, Comrie argues that it is much more plausible and economical to base-generate sentences like (11) and (12). For sentences like (13), an object-to-subject raising analysis turns out to be not just implausible but false since the construction is grammatical only when the surface subject contracts an appropriate semantic relationship with the matrix predicate. A raising analysis would predict no such semantic restriction.

Even though Comrie correctly emphasizes these heavy restrictions on grammatical relation changing rules, it is not the case that arguments are always surface-aligned in Russian with the verb that they belong to semantically. There is certainly a much more stringent requirement for them to do so than in English, but sentence (14) reveals that topicalization can operate across the boundary of an infinitival clause:

(14) Ètu knigu legko čitat'.
     this book (ACC) easy (NEUTER) read
     'This book it is easy to read.'

Notice that knigu 'book' is morphologically marked as the object of the subordinate verb čitat' 'read', so that no raising has occurred. Nonetheless, knigu has been moved out of its containing clause, making legko 'easy' its closest predicate on the surface. Overall, Comrie shows that Russian contrasts with English in much the same way as German does but to a greater degree.
Russian has an extensive inflectional morphology, overt case marking, a syntactically free clause-internal word order, few grammatical relation changing rules, and tightly constrained selectional restrictions, to an even greater extent than Hawkins determined for German.

These precise contrasts between English, German, and Russian argue against a purely diachronic explanation. For, although Russian is a member of the Indo-European family of languages, as are English and German, there is a sufficient time depth between the breakup into the Slavic and Germanic branches and the present to make a historical explanation of the grammatical contrasts between Modern English, Modern German, and Modern Russian doubtful. It is also questionable whether sociolinguistic factors, such as language contact, could have brought about the profound parallel alignment that was found between the structures of English, German, and Russian since these languages have not experienced the sustained intimate contact which is necessary for such extensive areal diffusion. This is why Dryer (1989), with his great concern about the distorting effects of areal and genetic patterns on typological predictions, has classified the major subbranches of the Indo-European family as independent genera and why a typological answer to the regularities discussed in this section is called for.

1.3. The nature of the typology

1.3.1. Transparency and grammaticization

Before entering into the discussion of the relevant data, it is necessary to clarify the nature of the typology itself and of the possible language types which it defines. The previous studies by Hawkins and by Comrie have revealed that German and Russian pattern similarly as the members of a single "transparent" language type. As such, they are opposed to English which differs consistently from the other two languages by permitting substantially greater deviations from the transparent state. If these patterns extend to the relevant data from other languages, Hawkins' notion of the closeness of the fit between the syntactic and the semantic structure can be understood as a single scale similar formally to the format of the typological continua developed by the Cologne Universals Project (Seiler and Lehmann 1982), thereby possibly lending increased validity to that group's constructs.

Like the Seilerian scales, our system is constructed from two opposing forces, namely semantic transparency and grammaticization. The end points of the continuum thus define two diametrically opposed relationships between syntactic and semantic structure. At the left-most extreme, syntactic structure faithfully represents semantic structure, whereas at the right-most extreme, the syntactic structure fails to represent the semantic structure in a transparent way. The question of the potential for empirical reality of either extreme is taken up in section 1.3.3.⁶

[Figure 1: a scale with "Syntactic structure = Semantic structure" at the left end and "Syntactic structure ≠ Semantic structure" at the right end; semantic transparency increases toward the left, grammaticization toward the right.]

Figure 1. The scalar model of the Semantic Typology

The continuum with its two counteracting forces arises from the competing needs to communicate explicitly and to communicate efficiently. An explicit communication system requires that the semantic content of a message is conveyed to the hearer as completely and as accurately as possible; explicit communication, therefore, implies the maximization of the number of morpho-syntactic devices in a language in order to limit the semantic range covered by each individual device. Languages which maximize this dimension have overt case marking, which permits a syntactically free clause-internal word order; this free word order can then be used to express pragmatic functions explicitly, preempting the need for grammatical relation changing processes as a means for attaining topichood and leading to more semantically narrow grammatical relations. An efficient communication system, on the other hand, requires a maximally simple formal apparatus, which means that this force seeks to minimize the number of morpho-syntactic devices of a language. Maximization of this dimension yields languages without overt case marking; such languages tend to fix their clause-internal word order, which is then used for indicating grammatical relations, leaving pragmatic functions implicit. As a result, grammatical relation changing processes like passive or raising become necessary to ensure that non-subjects can attain topichood.

Ultimately, the Semantic Typology can therefore be traced back to the dual function of the syntax to code semantic and pragmatic information. As Givón (1984: 42) puts it: "The fact that all sentences in discourse carry a dual function, semantic as well as pragmatic, has far-reaching consequences for syntax. While the propositional-semantic contents of a sentence may remain fixed, its discourse-pragmatic function can be modified enormously, and this is associated with radical changes in its syntactic structure..."
The subject relation is at the core of this duality since it is the one grammatical relation which inherently unites both facets in itself. Objects, as well as obliques, are uniquely characterized by the semantic roles they encode, whether they are patients, recipients, locatives, or any other semantic role. Pragmatically, however, they are not prototypically associated with any one particular function; they may or may not be a grammatical topic, they may or may not be in focus, or they may simply be part of the larger rheme. By contrast, subjects are cross-linguistically characterized by the intersection of agency and topicality, but subjecthood cannot be identified with either of these two properties alone (Comrie 1989: 106). Although the precise set of properties associated with subjecthood in any given language may vary considerably from any other (Keenan 1976), this convergence of the semantic role of agency and the pragmatic function of topichood seems to capture most succinctly what subjects are prototypically about.

It is therefore not surprising that subjects, rather than objects or even less obliques, should become the prime target of grammatical rules that manipulate the grammatical coding of the participant structure of a sentence. The subject becomes a pivot for such syntactic processes because of its salience as a semantic as well as a pragmatic entity and because this dual nature opens up a way for other nominal participants to attain topic status by being promoted to subjecthood. Clearly, the use of syntactic promotion to achieve topicalization of non-subject participants is far less relevant in languages where such topicalization can be attained directly, principally because of their free word order, but it becomes attractive for languages where direct topicalization is less possible because word order is used to identify the various grammatical relations. In the latter languages, then, the association of subjecthood with semantic agency becomes more tentative as its pragmatic identity as topic gains more weight.
The subject notion becomes desemanticized and grammaticalized in such grammaticizing languages whereas its dual association with topichood and agency can remain intact in semantically transparent languages.7 The subject relation, thus, constitutes the crucial link which is ultimately responsible for producing the opposing forces of semantic transparency and grammaticization in the scalar continuum of our typology. Our model differs from the Seilerian in that individual points on the scale are not the locus of specific linguistic categories; instead, data points are now understood as potential locations for a language. Of the languages considered so far, Russian has been determined by Comrie (1986, 1989b) to be the language with the greatest semantic transparency, giving it the left-most position on the continuum. German is placed just to the right of Russian. The drift effect observed by Sapir (1921) can be represented as the movement of a language along the continuum, as in the development from Old to Modern English:

[Figure 2: the scale from "Syntactic structure = Semantic structure" (left, semantic transparency) to "Syntactic structure ≠ Semantic structure" (right, grammaticization), with the languages marked from left to right in the order Russian, German, Old English, Modern English.]

Figure 2. Drift as represented on the scalar model

1.3.2. Consequences and problems of the scalar model

This subsection presents some important consequences and problems of the scalar model as it is developed in the previous section. First, it must be asked whether two languages L1 and L2 are allowed to pattern more closely with respect to some properties than they do with respect to others. Secondly, there is the question whether L1 and L2 can actually "switch" positions on the hierarchy with respect to some property. Figure 3 illustrates these theoretical possibilities for property P2 and property P3, respectively, for languages L1 and L2:

[Figure 3: for each of the properties P1, P2, P3, ... Pn, the languages L1, L2, L3, ... Lx are placed on the scale between "Syntactic structure = Semantic structure" (semantic transparency) and "Syntactic structure ≠ Semantic structure" (grammaticization). Under P2, the distance between L1 and L2 differs from that under the other properties; under P3, L1 and L2 appear in reversed order.]
Figure 3. Crossover and varying distance in the scalar model

A strict interpretation of the scalar model requires a Principle of Relative Positioning, to ensure that the relative ordering on the hierarchy for any two languages examined is the same for any two linguistic properties, and an Equal Distance Principle, which would go beyond the relative ordering of two languages by also requiring that their absolute distance from one another be the same over all their properties, insofar as this can be determined. Relative position is at the core of the scalar model proposed here, and some version of the Principle of Relative Positioning must clearly hold. With respect to the Equal Distance Principle, however, there is clear evidence that no equal distance for the properties of two languages can be established and that this principle should, therefore, be abandoned. I will turn to the latter point first and show why the Equal Distance Principle is unacceptable.

In the typological inference he draws from his contrastive study of English and German, Hawkins (1986) appears to advocate a very strong reading of the typology, implying the validity of the Equal Distance Principle, when he expects that a rich morphology, free word order, less semantically diverse grammatical relations, fewer deletions, less raising, and fewer extractions correlate with each other "quite generally across languages" (Hawkins 1986: 125). In other words, if two languages L1 and L2 both have a considerable morphological inventory, as opposed to a highly analytic language L3, Hawkins would seem to expect not just that L1 and L2 pattern to one side of L3 with respect to, say, word order freedom, as predicted by the Principle of Relative Positioning, but also that L1 and L2 both exhibit a comparable degree of word order freedom far in excess of anything permitted in L3.

The empirical evidence does not bear out a strong reading of the typology which includes the Equal Distance Principle. Instead, if two languages are fairly close to each other for a certain grammatical property p, this does not preclude the possibility that they are far apart with respect to another property q (as long as their relative ordering is preserved).
For instance, German tightly restricts the semantic range of the basic grammatical relations, as Hawkins (1986: 56) has correctly pointed out following Plank (1980). This property puts German in close vicinity to other languages with semantically transparent grammatical relations. Nonetheless, German also tolerates a number of transparency-reducing movement processes like extraction and some raising, albeit less so than English. In this context it is interesting to note that processes like extraction and raising applied much more freely in German before the nineteenth century (Ebert 1978), making the contrast between the highly transparent semantic range of its grammatical relations and the remarkably free syntactic operation of extraction processes even more striking. Clearly, the position of German, in its present-day variety and even more so in its nineteenth-century version, is in opposition here to those highly transparent languages which largely prohibit extraction processes, and this despite the fact that it may be very close to them with regard to the tightness of the basic grammatical relations.⁸ As a result, the prediction of the Equal Distance Principle that languages must exhibit precisely the same level of (non-)transparency for all their properties is not borne out, so that this principle must be dismissed. This means that the position of a language L1 on the continuum states that it will be more transparent than a neighboring language L2 to its right, but it does not predict the precise degree of divergence over all their contrastive properties.⁹

The Principle of Relative Positioning raises some problems of its own which need to be commented on here. In particular, the question arises whether this principle applies in an exceptionless manner or whether languages may cross over positions on the hierarchy for some of their properties under certain circumstances. In other words, can a language L1 which is generally positioned to the left of a second language L2 break this pattern by having one or more properties appear to the right of the respective value for L2? Again, a strong interpretation of the Principle of Relative Positioning would lead us to expect that there exists a precise relation among comparable properties of different languages such that if a language L1 is positioned to the left of a language L2 with respect to a property P1, then it must also be positioned to the left of L2 with respect to all other properties P2 to Pn relevant to the typology. The cross-over of L1 and L2 with respect to property P3 in Figure 3 would be disallowed under such a strong reading of the scalar model.
The empirical results of Hawkins' contrastive study suggest that crossover does not occur for the relationship between English and German as the extent of grammaticization in German was shown to be consistently a subset of that found in English. As far as the relationship between those two languages is concerned, an exceptionless version of the Principle of Relative Positioning can thus be upheld. For Hawkins, the precision of the contrasts between English and German has the status of a major result and rationale for speculating further about their cross-linguistic generality. Notice, however, that this constellation, if it is correct cross-linguistically, would constitute an exceptionally strong typological claim, because it means, in effect, that we are dealing with bidirectional implications: Any property can serve as the antecedent of an implicational statement to predict the relative position of a language with respect to all other properties in question. If our Semantic Typology is indeed characterized by such bidirectional implications, it would be of a considerably stronger form than other typologies, say word order typology, where one-way implications are the norm. It should, thus, not be too surprising (or discouraging) if cross-over of certain properties must be admitted in the Semantic Typology.

It is our opinion that the typology is to be deemed successful if two things can be shown: firstly, it must be demonstrated as a cross-linguistically valid statement that the properties for any language L can be expected to cluster in the vicinity of a certain value on the scale with respect to any other language; secondly, it must be possible to ascribe gross deviations from this value, which should be rare to start with given the previous comment, to some overriding aspect in the organization of the violating language, in accordance with a statistical view of the Principle of Relative Positioning. For the typology to remain valid, we should be able to state precisely the aspect(s) of the language which cause its uneven behavior with regard to the Semantic Typology. In other words, the typology, if successful, is seen as a widespread cross-linguistic tendency to which languages will adhere, all things being equal. At the same time, however, the modular approach to language organization requires that the Semantic Typology is but one of many competing organizing principles. Each individual language represents a particular solution to the pressures exerted by the various competing (and frequently conflicting) universal principles and by pressures arising from language-specific forces. Chapter 3 describes just such an interaction between the Semantic Typology and a language-specific principle and should be instructive in this respect.
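The difference between the two principles discussed in this subsection can be made concrete with a small computational sketch. The Python fragment below is an illustration only and is not part of the analysis itself: the numerical positions are invented, the scale is assumed to run from 0.0 (semantic transparency) to 1.0 (grammaticization), and the function names crossovers and distance_spread are hypothetical. Property P2 is set up to show a changed distance between L1 and L2, and property P3 to show a cross-over.

```python
from itertools import combinations

# Invented positions on the scale (0.0 = transparent end, 1.0 = grammaticizing end).
# These numbers are assumptions for illustration, not measurements.
positions = {
    "P1": {"L1": 0.10, "L2": 0.40, "L3": 0.80},
    "P2": {"L1": 0.10, "L2": 0.15, "L3": 0.80},  # L1 and L2 at a different distance
    "P3": {"L1": 0.40, "L2": 0.10, "L3": 0.80},  # L1 and L2 in reversed order
}


def crossovers(positions):
    """List (lang_a, lang_b, prop_p, prop_q) where the relative order of two
    languages differs between two properties, i.e. where an exceptionless
    Principle of Relative Positioning would be violated."""
    found = []
    props = sorted(positions)
    langs = sorted(positions[props[0]])
    for a, b in combinations(langs, 2):
        for p, q in combinations(props, 2):
            diff_p = positions[p][a] - positions[p][b]
            diff_q = positions[q][a] - positions[q][b]
            if diff_p * diff_q < 0:  # opposite signs: the order has flipped
                found.append((a, b, p, q))
    return found


def distance_spread(positions, a, b):
    """Largest minus smallest distance between two languages across all
    properties; the Equal Distance Principle would require this to be zero."""
    distances = [abs(scale[a] - scale[b]) for scale in positions.values()]
    return max(distances) - min(distances)


print(crossovers(positions))
print(distance_spread(positions, "L1", "L3"))
```

On the statistical reading favored above, a non-empty result from crossovers would not by itself falsify the typology; it would mark the language pairs and properties for which some overriding organizing principle has to be identified.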

1.3.3. Ideal-typical instantiation in the Semantic Typology

In order to develop a clear notion of what consistent semantic transparency and consistent grammaticization mean, this section will work out what should be considered the ideal-typical instantiations of the two extremes that are defined by the scalar model of the typology. An understanding of these ideal types is especially important since, conceptually, it appears that languages which cluster towards the extreme values of the scale strive to achieve optimum system efficiency. A position closer to the transparent end of the scale most faithfully preserves the semantic distinctions that are to be communicated and thus optimizes the simplicity of the mapping process between the semantics and the syntax. A position closer to the grammaticizing end of the scale, on the other hand, achieves optimally efficient use of the surface-grammatical devices that the language employs. Of course, any real, living language will necessarily represent a compromise between the competing pressures of these extremes. We would, therefore, expect languages to be positioned on different points along the continuum, although the interaction of the Semantic Typology with other organizing principles of language may compel certain types of languages to cluster in a particular area. Section 4.2 illustrates this clustering effect for branching directionality, showing that left-branching languages tend to be positioned close to the transparent ideal type. Branching directionality, thus, in a sense "cancels out" the pull of the grammaticizing extreme in these languages.

1.3.3.1. Basic grammatical relations

With respect to the semantic range of the basic grammatical relations, the scalar model presented in section 1.3.1 predicts two ideal types of languages. The first type is characterized by grammatical relations with a semantic range that is narrowly constrained around the prototypical core content of that grammatical relation, for instance languages where subjecthood is reserved for agents and possibly other semantic relations which can be construed as metaphorical extensions of agency. It is expected that the narrow semantic range of the grammatical relations is assured in such languages by the absence of or by heavy restrictions on grammatical relation changing or grammatical relation creating operations like passivization and raising since the very essence of such processes is the promotion to subject, or to object, of a noun phrase which cannot attain that status in a basic sentence because its semantic role does not satisfy the thematic requirements of the verb. The basic grammatical relations of the second language type cover a wide semantic range, which is further expanded by the freer operation of processes affecting grammatical relations.

By contrast, the typology predicts languages to be rare in which semantically narrow grammatical relations allow a few isolated extensions that are not motivated by the Fillmore hierarchy (1968: 33; also cf. Chafe 1970: 244). For example, we would not expect languages where accessibility to subjecthood is confined to all agents and all locatives. Furthermore, languages with grammatical relations which cover a wide range of semantic roles, but which lack grammatical relation changing or grammatical relation creating processes are predicted to be strongly dispreferred.

The assignment of semantic roles to specific grammatical relations, unlike some other grammatical properties, necessarily involves some distortion of semantic subtleties since, first of all, any given semantic role incorporates an infinite set of different real-world relationships and since, secondly, it is not always clear to which semantic role a given real-world relationship belongs. For example, the role "agent" comprises not only the subjects in (15), but probably those in (16) as well:

(15) a. John hit Bill.
     b. Mike kissed Mary.

(16) a. Sam tried to sleep.
     b. The robot opened the door.

Even those in (17) can be interpreted as metaphorical extensions of agency because, in a very real sense, the flood water in (17) is the entity that performed the action of lifting the tree.

(17) a. The flood water lifted the tree.
     b. The key opened the door.

Certainly floods are far removed from prototypical agency: They are neither human nor animate and lack any concept of intentionality. However, nobody could seriously claim much animateness or intentional control for the behavior of microbes and the like, and yet we have no difficulty conceiving of statements where such barely animate microbes function as agents of a verb like attack. Even inanimate entities can fulfill this role with equal ease given the right circumstances, as when a meat-eating plant is said to devour its prey. Conversely, a human acting while under hypnosis does not thereby become "instrumental" of the psychologist who controls him. At the very least, then, the flood water in (17a) combines elements of "cause" and "agent". Its subsumption under the agentive pattern, which allows it to become a subject even in a much more restrictive language like German (cf. Die Flut hat das Land verwüstet 'The flood devastated the land') is, hence, not surprising.
Fillmore (1977: 71-74) recognizes this indeterminacy of the case notion and proposes that semantic roles can only be understood relative to particular scenes. The clear cases of (15) involve intentionally acting humans; as such, they illustrate prototypical agency. Certainly, the relationship expressed in (16a) is quite different from those in (15a) and (15b), as trying is obviously not as typically a kind of action as hitting or kissing are. In sentence (16b), the deficiency lies with the presumed agent, the robot, at least if intentionality is an important aspect of the prototype; nonetheless, the robot clearly must be an agent here. The subject in sentence (17b), finally, is normally considered an instrumental, with the key serving as a tool of a human agent for opening the door; but in some sense, the key can also be thought of as the one thing/agent that performs the action of opening the door. At the present state of technological development, one only needs to imagine a pre-programmed motor-driven key to drive home the possibility that the key can be both agent and instrument.

Three conclusions follow from this discussion. Firstly, all languages will collapse a considerable number of real world relationships onto a single form because the semantic roles, of which there are, presumably, only a limited number, are already abstracted away from the relationships that obtain in the real world or any other possible world (cf. the discussion in note 6). For instance, agency with respect to hitting is quite different, as a real-world relationship, from agency with respect to kissing, although linguistically, the same semantic role "agent" occurs in the subject of either verb. In languages where the range of the grammatical relations is not in a one-to-one relationship with the semantic roles, those already abstracted semantic roles must in turn be mapped onto the further set of grammatical relations. Significantly, the latter situation seems to be prevalent in the world's languages.
Secondly, the affinity between different semantic roles that has been demonstrated for agency and instrumental may be a clue as to how languages can go about extending or restricting the set of permitted roles for any given grammatical relationship. They can increase grammaticization by reducing the number of grammatical relations, as in the collapsing of dative and accusative objects in the history of English, or by extending the range of admissible semantic roles from core relations to relations that are intermediate between two different prototypes until all relations covered by the latter prototype become admissible (cf. our conclusion regarding the semantic range of the grammatical relations in Indonesian, section 3.3.3 below). Semantic transparency can be increased by providing a larger number of oblique cases and/or prepositions, or by reducing the range of semantic roles admissible to a grammatical relation.

Since semantic transparency in our view involves the mapping between the constructs of the semantic component and those of the syntax, rather than between real-world relations and the surface syntax (cf. note 6), high transparency with respect to grammatical relations essentially involves a one-to-one mapping between semantic roles and grammatical relations, while high grammaticization creates grammatical relations which collapse many semantic roles. In an ideal-typical perspective, then, the range of grammatical relations becomes identical to the semantic roles at the leftmost extreme of the scale. In principle, it might therefore be argued that the concept "grammatical relation" becomes irrelevant for languages located at this point. Alternatively, theories like Relational Grammar, which presuppose the universal applicability of grammatical relations, could conclude that they merely accidentally overlap with semantic roles in such languages. Of course, whether languages of this kind actually exist is an empirical question. Needless to say, though, the complete overlap of grammatical relations and semantic roles precludes the existence of grammatical relation changing rules in such a language. Thirdly, languages may choose to code the semantic space covered by the core grammatical relations differently, resulting in different case marking types. While objects are prototypically patients and subjects prototypically agents, it has long been noticed that there is a much greater variability in the semantic roles of noun phrases that can become subjects of intransitives (for a recent discussion, cf. Dixon 1989). Since intransitive subjects also form the only instance where case-marking is not needed for differentiating two arguments in a clause (Comrie 1989: 125-126), it should not be surprising that the three major case marking types that are attested among the languages of the world differ primarily in their treatment of intransitive subjects. 
Drawing on Dixon's well-known distinction between the subject of a transitive verb (A), the object of a transitive verb (O), and the subject of an intransitive verb (S) (Dixon 1979, cf. Comrie 1978 for a very similar view), nominative-accusative systems mark S and A identically in contrast to the O relation. Nominative-accusative case marking can hence be interpreted as a response to the polar thematic contrast between the O relation, which is prototypically a patient, and the A and S relations, which are prototypically agents. Ergative-absolutive case marking systems reflect the semantic variability of S. By introducing the separate ergative case for the A relation, ergative languages create a class of nominals which, at least as a surface grammatical relation, more closely matches the semantic role of agency than the noun phrases grouped together as nominative in nominative-accusative languages. Active case marking, finally, represents a different response to the semantic variability of S. This type splits the S relation, marking a volitional S in the same way as A, and a non-volitional S in the same way as O (Dixon 1989: 97). Each case marking type, thus, manages to resolve one facet of the problem posed by S, but only at the cost of papering over another. Furthermore, there is still much debate about how "deep" various case marking types actually are, or can be in principle (Bossong 1984: 341-345; Comrie 1989: 120). The empirical consequences of different case marking types are, then, far from clear and call for much further research on a wide range of languages before firm conclusions can be reached. The cross-linguistic sample used in this work includes case marking as well as non-case marking languages, with two ergative-absolutive languages (Jacaltec and Sawu) among the former group.
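The way the three case marking types group A, O, and S can be made concrete with a small sketch. The following Python fragment is only an illustration under stated assumptions: the function name `case_for`, the case labels (in particular "agentive" and "patientive" for the active type), and the boolean `volitional` flag are hypothetical conveniences, not part of the analysis above.

```python
# Toy model of the three case marking types described above.
# A = transitive subject, O = transitive object, S = intransitive subject.
# Function name and case labels are illustrative assumptions.

def case_for(alignment: str, relation: str, volitional: bool = True) -> str:
    """Return the case label a core argument receives under an alignment type."""
    if relation not in {"A", "O", "S"}:
        raise ValueError(f"unknown relation: {relation}")
    if alignment == "nominative-accusative":
        # S is marked like A, in contrast to O.
        return "accusative" if relation == "O" else "nominative"
    if alignment == "ergative-absolutive":
        # A receives a separate ergative case; S is grouped with O.
        return "ergative" if relation == "A" else "absolutive"
    if alignment == "active":
        # S is split: a volitional S is marked like A, a non-volitional S like O.
        if relation == "S":
            return "agentive" if volitional else "patientive"
        return "agentive" if relation == "A" else "patientive"
    raise ValueError(f"unknown alignment: {alignment}")


if __name__ == "__main__":
    for alignment in ("nominative-accusative", "ergative-absolutive", "active"):
        print(alignment, {r: case_for(alignment, r) for r in ("A", "O", "S")})
```

The sketch makes visible that the three types agree on keeping A and O apart and differ only in where S is placed, which is the point made in the text.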

1.3.3.2. Manipulation of grammatical relations

As for movement processes, a loss of transparency occurs whenever an element is moved away from the elements it belongs to, but some movement processes reduce the semantic transparency of an utterance more strongly than others. When we rank syntactic processes according to the degree to which they affect semantic transparency, a clear hierarchy emerges, which we can call the Grammatical Relation Manipulation Hierarchy. This hierarchy is presented in (18):

(18) Scrambling > Passivization > WH-movement > Raising > Adposition stranding

Movement which does not cross any maximal projection, for example scrambling in Korean, hardly diminishes transparency at all since it affects only the respective order of the various constituents of a clause, each of which remains clearly marked for its role by its accompanying case marker. The operation of scrambling does not allow stranding of case markers, nor is it capable of moving anything out of, or into, any maximal projection (that is, NP, PP, S, and VP, if the language in question has it). Consequently, no reduction of transparency results.

Passivization reorganizes the basic grammatical relations in a sentence and, thus, introduces a step away from the surface transparency of the corresponding active construction, but the transparency-reducing effects of passivization, whether by movement across VP as in German or without syntactic movement as in Japanese, are limited since the moved noun phrase remains morphologically identified for its role by a combination of its nominative case marking and the passive verb morphology.

WH-movement is clearly much more transparency-reducing as it can extract any argument or oblique. WH-moved elements can, therefore, only be interpreted on-line by temporarily storing them in short-term memory and looking out for the telling trace. However, the correct interpretation of WH-movement constructions is ensured by the appearance of a WH-word in the complementizer position, which uniquely cues the hearer to expect a variable-trace relationship. Moreover, the WH-word does not become syntactically integrated in the higher clause in long-distance extraction, remaining marked for its syntactic role in the clause in which it originated, as in the German example (19). Here, the WH-word wen 'whom' is in the accusative case because of its role as the direct object of the verb sehen 'see' in the lower clause, even though it has been moved into the complementizer position of the matrix clause.

(19) Wen hast du gesagt, dass der Hans [ t ] gesehen hat?
     who (ACC) have you said that the Jack seen has
     'Who did you say that Jack has seen?'

In a language like English, which has been losing the case contrast between who and whom, the saturation of the higher predicate with arguments, along with the rigid word order, ensures the same result. However, the interpretation of any WH-construction can only be unproblematic when the sentence indeed contains just one unique gap, as the extensive literature on parasitic gaps has shown (cf. Chomsky 1982: 36-78). Parasitic gap constructions, in which one WH-word is confronted with two gaps, have "marginal status at best" (van Riemsdijk and Williams 1986: 262) in English and are ungrammatical in German, as examples (20) and (21) show:

(20) ??What_i did you read [gap1]_i before filing [gap2]_i?

(21) *Was_i hast du gegessen [gap1]_i ohne [gap2]_i zu kochen?
     what have you eaten without to cook
     'What did you eat without cooking?'

Raising removes an element from the predicate that it needs to be interpreted with and surface-aligns it with another predicate with which it contracts no semantic relationship, just as extraction via WH-movement does; unlike the latter, however, raising processes take the destruction of semantic transparency one step further since the raised element actually becomes syntactically fully integrated as an argument of the higher predicate. By contrast, a WH-moved element never becomes syntactically or semantically associated with the higher predicate, as was just pointed out above, leading to less of a reduction in transparency. Raising processes always involve such a syntactic integration in the higher clause as either a subject or an object, and thus more grammaticization and less semantic transparency. Still, the destructive effect of raising on the semantic transparency of a sentence is constrained by the fact that raising is typically limited to a small set of trigger predicates which permit the application of a small number of well-defined raising processes, such as subject-to-subject or object-to-subject raising. As a result, the range of noun phrases that can be reassigned to a higher predicate by means of raising is rather limited, and the correct interpretation of raising constructions is facilitated by the hearer's recognition of a raising predicate and his knowledge of what kind of raising process this predicate permits. Adposition stranding, whether created as a result of WH-movement or of raising, exemplifies grammaticization in its purest form because stranding a preposition or postposition removes the only element which indicates the role of the noun phrase it governs. Consequently, that noun phrase is literally left stranded as a foreign body in the sentence, which at least temporarily remains without any indication of its syntactic or its semantic function. 
In this way, adposition stranding goes beyond WH-movement and raising, both of which had "built-in" mechanisms to ensure the rapid recognition of the syntactic process that has applied and of the role of the moved noun phrase. More transparently oriented languages avoid the loss of semantic transparency inherent in adposition stranding by resorting to pied piping as a means for retaining the integrity of the moved element.

The organization of our Grammatical Relation Manipulation Hierarchy is, thus, motivated by the degree to which a given process destroys the indication of the syntactic and the semantic function of an element. Independently, the transparency of the various movement types can be somewhat increased by leaving a pronoun copy in the extraction site. For these reasons, highly transparent languages restrict themselves to the first type of syntactic reorganization, using scrambling processes for coding pragmatic functions. The possibility of encoding pragmatic information makes this kind of movement the only one that can be argued to actually increase the transparency of an utterance. Interestingly, a highly grammaticizing language like English does not just relinquish pronoun copies, at least in its standard variety, but is also forced to do without exactly this transparency-inducing movement type as a result of its syntactically identified grammatical relations. Extremely grammaticizing languages like English are, then, characterized by the presence of the four other types of syntactic processes.
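The ordering in (18) can be stated as a simple ranked list. The sketch below is a minimal illustration, not a formalization proposed in this work: the integer ranks and the helper names `rank`, `reduces_transparency_more`, and `most_opaque` are assumptions introduced here, and the text commits only to the relative order of the five processes.

```python
# The Grammatical Relation Manipulation Hierarchy of (18), ordered from the
# least to the most transparency-reducing process. Only the ordering comes
# from the text; the numeric ranks and helper names are illustrative.
HIERARCHY = [
    "scrambling",
    "passivization",
    "WH-movement",
    "raising",
    "adposition stranding",
]


def rank(process: str) -> int:
    """Position of a process on the hierarchy (0 = least transparency-reducing)."""
    return HIERARCHY.index(process)


def reduces_transparency_more(p: str, q: str) -> bool:
    """True if process p lies further down the hierarchy than process q."""
    return rank(p) > rank(q)


def most_opaque(processes) -> str:
    """The most transparency-reducing process among those a language uses."""
    return max(processes, key=rank)


if __name__ == "__main__":
    print(reduces_transparency_more("raising", "WH-movement"))
    print(most_opaque(["scrambling", "passivization", "WH-movement"]))
```

Note that the list is a ranking rather than an implicational scale: as the text observes, English has the four lower processes but lacks scrambling.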

1.3.3.3. Summary of the ideal types

The properties of languages close to the two ideal types defined by the Semantic Typology are listed in Table 3:

Table 3. Ideal types in the Semantic Typology

Strongly transparent                                Strongly grammaticizing
Little semantic diversity of grammatical relations  Much semantic diversity of grammatical relations
Overt role markers for grammatical relations        No overt role markers for grammatical relations
Free scrambling                                     Fixed intraclausal word order
No grammatical relation changing rules              Many grammatical relation changing rules
(No passive, no raising)                            (Frequent passive, raising)
No extraction                                       Frequent extraction
No adposition stranding                             Frequent adposition stranding

It is evident that this typology positions English fairly close to the ideal grammaticizing type. English nouns have no case markers apart from the genitive, which does not serve as a marker of grammatical relations; remnants of a more extensive earlier case marking system are only found in the pronouns. English has a rather fixed clause-level word order along with semantically extremely diverse grammatical relations. Grammatical relation changing operations are widespread, including the passive and several raising processes, which are furthermore triggered by a large number of predicates. There is frequent WH-extraction, and preposition stranding is common. In contrast to the extreme position of English on the scale, German diverges significantly from the ideal transparent set of values in several ways. While it indeed has overt case marking and a high semantic specificity of the basic grammatical relations, it allows both out-of-clause extractions and some raising, though less so than English. In addition, its word order, while freer than that of English, does not approach free scrambling as do some other languages.

One goal of the present research has therefore been to identify a language which more faithfully represents the transparent end of the scale. This effort is necessary to verify the existence of this type as a real linguistic constellation. Chapter 2 of this work presents the results of this endeavor, a portrait of Korean as it relates to the Semantic Typology. Chapter 3 illustrates how the Semantic Typology can interact with other principles, using evidence from Indonesian. This chapter shows that our typology can shed some light on the specific co-occurrence of grammatical properties found in Indonesian and argues that some apparent violations of the predictions of the Semantic Typology can be explained independently, so that the typology remains intact. Chapter 4 contains a cross-linguistic study of the typology and attributes the different distribution patterns of head-initial and head-final languages on the typological continuum to the differing processing demands encountered by left-branching and right-branching parsing. Chapter 5, finally, presents a summary of our results.

1.4. Summary

In this chapter, we have considered preliminaries for the typological study of the syntax-semantics interface. We have pointed out the existence of precise grammatical contrasts between English, German, and Russian which result in different levels of semantic transparency in the grammatical relations of these languages. English has the lowest and Russian the highest level of semantic transparency among these three languages. Because English, German, and Russian have not experienced intense sustained language contact, areal diffusion can be discounted as the cause of the regularity of these contrasts. Neither are the three languages sufficiently closely related to attribute the contrasts exclusively to their genetic relationship. A typological account is therefore called for. This "Semantic Typology" can be captured in a scalar model defined by the two opposing forces of grammaticization and semantic transparency. The extremes of the scale represent two ideal language types while each intermediate value constitutes a certain compromise between the demands of the two forces. We have argued that this scalar typology ultimately derives from the dual nature of subjects as the grammatical relation which simultaneously expresses the semantic role of "agent" and the pragmatic function of "topic". Finally, this chapter has attempted to identify the set of linguistic properties which characterize the semantically transparent ideal and the grammaticizing ideal. In the following chapter, a language which approximates the transparent ideal will be discussed in greater detail.

2. High semantic transparency: Korean

The purpose of this chapter is to show that the typology developed in Chapter 1 has empirical content insofar as it allows us to make significant generalizations about the overall structure of a language. It outlines how the structure of Korean is characterized by a remarkably high degree of semantic transparency throughout its grammar. The notion of high overall semantic transparency thus unifies under a single heading many grammatical properties which could otherwise be seen as co-occurring accidentally in Korean, and provides a functional motivation for this co-occurrence. The grammatical properties of Korean to be discussed here are its lack of syntactic movement other than clause-internal scrambling, its pragmatically oriented word order, its use of an extensive case marking system to identify participant roles, and its semantically narrow grammatical relations. One reason why Korean does not completely reach ideal transparency as it was defined in section 1.3.3 is that the range of its grammatical relations is not identical to the various semantic roles. Still, Korean turns out to be considerably more transparent than German, and consistently so over a wide range of linguistic properties.

2.1. Participant marking

Korean signals the grammatical functions of the various participants in a sentence by means of a rich set of case particles and postpositions. Although nouns are traditionally said to lack grammatical morphology, there is evidence that the so-called case particles cliticize to the noun. For example, the forms of some case markers are phonologically conditioned by the stem phoneme that precedes them. The Yale system of transliterating Korean, which is used in this study (cf. note 10), recognizes this by representing the case markers as suffixes.

Table 4. Principal Korean case markers

-ka   Nominative    -eykey  Dative     -lo  Instrumental
-lul  Accusative    -eyse   Ablative   -ey  Locative
-nun  Topic         -uy     Genitive   -ey  Temporal
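Table 4 gives one form per marker, but several markers alternate with the shape of the stem, as the examples in this chapter show (mwun-i versus yelsoy-ka, mwun-ul versus Yunsoo-lul, mwun-un versus kay-nun, won-ulo versus yelsoy-lo). The following Python sketch illustrates this conditioning for Yale-romanized stems. It is a simplification under stated assumptions: the names `ALLOMORPHS`, `ends_in_vowel`, and `mark` are hypothetical, the vowel test looks only at the final letter, and the choice of -lo after stem-final l is a fact of Korean that is not exemplified in this chapter.

```python
# Sketch of phonologically conditioned case-marker allomorphy for stems in
# Yale romanization. Names and the final-letter test are assumptions.

# A stem-final y in Yale romanization belongs to a vowel (ay, ey, oy, uy),
# so it is counted as vocalic here.
YALE_VOWEL_FINALS = set("aeiouy")

# marker -> (form after a vowel-final stem, form after a consonant-final stem)
ALLOMORPHS = {
    "NOM": ("ka", "i"),
    "ACC": ("lul", "ul"),
    "TOP": ("nun", "un"),
    "INST": ("lo", "ulo"),
}


def ends_in_vowel(stem: str) -> bool:
    """True if the romanized stem ends in a vowel letter."""
    return stem[-1].lower() in YALE_VOWEL_FINALS


def mark(stem: str, case: str) -> str:
    """Attach the appropriate allomorph of a case marker to a stem."""
    after_vowel, after_consonant = ALLOMORPHS[case]
    # Assumption beyond the chapter's examples: the instrumental is also
    # -lo after stem-final l.
    if case == "INST" and stem.lower().endswith("l"):
        return f"{stem}-{after_vowel}"
    suffix = after_vowel if ends_in_vowel(stem) else after_consonant
    return f"{stem}-{suffix}"


if __name__ == "__main__":
    for stem, case in [("mwun", "NOM"), ("yelsoy", "NOM"), ("chayk", "ACC"),
                       ("kay", "TOP"), ("won", "INST")]:
        print(mark(stem, case))
```

Markers with a single form in Table 4, such as -ey or -uy, are left out of the mapping since they need no such choice.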

Some inflectional categories familiar from European languages are quite restricted in Korean. The expression of number is optional, for example, and agreement occurs only by means of honorifics. Nevertheless, the fact that Korean uses an extensive array of substantive case markers, rather than position in the sentence, to code participant roles, means that, if the overt marking of participants indeed correlates with the level of transparency elsewhere in the grammar, as was assumed by Sapir as well as by Hawkins, Korean should be a highly transparent language. The discussion in this chapter will show that this expectation is indeed borne out by the facts of Korean.

2.2. The semantic range of basic grammatical relations

If Korean were an ideal transparent language, we would expect the range of its grammatical relations to coincide with the various semantic roles. The Korean noun suffixes could then be interpreted as semantic case role markers. However, this is not a correct characterization of the Korean facts, although this section will show that the case markers come close to the semantic constellations due to the comparatively narrow semantic range permitted by the grammatical relations they signal. Prime evidence that the grammatical relations of Korean do not completely correspond to individual semantic roles comes from the existence of the passive, shown in (22b):

(22) a. Jeongdal-i mwun-ul yel-n-ta.
        Jeongdal-NOM door-ACC open-PRES-DECL
        'Jeongdal opens the door.'
     b. Mwun-i Jeongdal-ey uyhayse yel-li-n-ta.
        door-NOM Jeongdal-LOC by open-PASS-PRES-DECL
        'The door is opened by Jeongdal.'

Notice that the patient noun phrase mwun-i 'door' in sentence (22b) carries the same nominative suffix -i as the agent noun phrase Jeongdal-i in sentence (22a). Moreover, the verb must add the suffix -li, marking it as passive. The passive voice, therefore, exists in Korean. With it, the subject relation, which is marked by a nominative suffix, absorbs all those patient noun phrases that are promoted in the passivization process.

Comrie (1986: 1158-1161) explains the tight restrictions on the passive in Russian by arguing that grammatical relation changing processes involve the syntactic reorganization of a basic structure and, hence, constitute a distortion of the semantic constellation. Such processes should, consequently, be lacking or strongly restricted in languages with high semantic transparency. This is exactly what we find in Korean. Although passive constructions undoubtedly occur in this language, they are exceedingly rare, occurring much less frequently than in English and in Japanese (N. K. Kim 1987: 893; Hwang 1975: 46). In fact, my Korean language consultant had to go through some hesitation and false starts when he tried to come up with the correct form of several passive sentences. His performance problem may well be symptomatic of the status of the passive in his native language.

2.2.1. Subjects

Korean subjects are marked with the suffix -ka for nouns ending in vowels or -i for nouns ending in consonants. Much more typically than their English counterparts, Korean subjects are agents. Noun phrases with other semantic roles typically do not become subjects, especially not in transitive active sentences. For completeness' sake, it must again be pointed out, however, that passivization is a rarely applied but possible strategy for promoting underlying objects, that is, typically patient noun phrases, to subjecthood. Moreover, the sole argument of an intransitive will regularly be construed as a subject; in such a situation, a wider semantic range can reasonably be expected than in transitive sentences, in which the more agent-like noun phrase can be construed as subject and the more patient-like noun phrase as object. The non-agentive subjects in the intransitive sentences (23) through (25) are therefore not surprising:

(23) Kiyong-i chwuk-kess-ta.
     Kiyong-NOM die-FUT-DECL
     'Kiyong will die.'

(24) Cip-uy cipwung-i say-ko-iss-ta.
     house-GEN roof-NOM leak-PRES-PROGRESSIVE-DECL
     'The roof of the house is leaking.'

(25) Sunhee-ka mwul-ey ppaci-ess-ta.
     Sunhee-NOM water-LOC fall-PAST-DECL
     'Sunhee drowned.'

However, Korean tends to avoid patient subjects even in intransitive sentences. The literal translation of an English ergative verb, that is, of a transitive verb which can also be used intransitively with the underlying object becoming the surface subject, therefore generally results in ungrammaticality, as in (26a). The most natural rendition of such sentences is by using the patient noun phrase as grammatical topic, as in (26b), although passivization as in (26c) is, of course, a technical possibility:

(26) a. *mwun-i yel-n-ta.
        door-NOM open-PRES-DECL
        'The door opens.'
     b. mwun-un yel-n-ta.
        door-TOP open-PRES-DECL
        'As for the door, [someone] opens [it].'
     c. mwun-i yel-li-n-ta.
        door-NOM open-PASS-PRES-DECL
        'The door is opened.'

Sentence (22a) has already shown an example of an agentive subject. Such examples are the core cases and could be multiplied at length. The role of experiencer immediately follows that of agency in the hierarchy of case accessibility of Fillmore (1968: 33) and Chafe (1970: 244), which claims to establish a universal order in which the various semantic roles can become the subject of a sentence. In addition, there is an important semantic congruence between agents and experiencers since both potentially initiate or control the activity (cf. Dixon 1989: 97 and 103). Consequently, we should expect that, if a language with high overall semantic transparency allows non-agentive subjects at all, it should allow experiencer subjects, although differences in the grammatical treatment of agents and experiencers may exist in such languages. Some German predicates, for instance, allow experiencers to be coded as dative subjects, although this construction has been retreating in favor of a uniformly nominative subject category.

(27) German:
     a. Ihm ist kalt.
        he (DAT) is cold
        'He is (feeling) cold.'
     b. Er ist kalt.
        he (NOM) is cold
        'He is cold.'

Sentence (27a) has only one possible interpretation, the one which assigns the role of experiencer to the subject ihm and which is reflected in its idiomatic English translation. Sentence (27b), on the other hand, teems with all the ambiguity of the English He is cold, that is, its subject can be understood as 'one who is feeling cold' or as 'one whose own temperature is low'; accordingly, only sentence (27b) allows the figurative reading of the predicate ('to show little emotion') since this idiomatic reading is incompatible with an experiencer subject. In Korean, experiencer noun phrases unproblematically become subjects in the absence of an agent noun phrase:

(28) Yongjin-i yi chayk-ul coaha-n-ta.
     Yongjin-NOM this book-ACC like-PRES-DECL
     'Yongjin likes this book.'

(29) Junmie-ka chwup-ta.
     Junmie-NOM cold-DECL
     'Junmie is cold.'

Korean contrasts here with German, which has non-nominative subjects for a few predicates with experiencer subjects. Example (29) corresponds to the German Der (DAT) Junmie ist kalt, which has a dative subject. It can, therefore, be argued that in these experiencer subjects, German maintains a semantic distinction in the grammar which is not made in the grammar of Korean. Insofar as we are correct in arguing that Korean is, in general, a more strongly semantically transparent language than German, the experiencer subjects represent an instance of cross-over of properties on the scalar model of the Semantic Typology. Notice that in itself, the Korean system is entirely consistent with the transparent type, though, since it properly restricts subjecthood in underived sentences to agent and experiencer noun phrases.

Instrumentals are coded as such and cannot become subjects. The ungrammaticality of sentence (30a), for instance, is caused by the attempt to construct the instrumental noun phrase yi yelsoy 'this key' as subject:

(30) a. *Yi yelsoy-ka yi mwun-ul yel-swuiss-ta.
        this key-NOM this door-ACC open-can-DECL
        'This key can open this door.'
     b. Yi yelsoy-lo-nun yi mwun-ul yel-swuiss-ta.
        this key-INST-TOP this door-ACC open-can-DECL
        'As for with this key, [one] can open this door.'

(31) Il nyen ceney-nun il won-ulo twu kay na sey kay-uy pin-ul sa-l-swuiss-ess-ta.
     one year before-TOP one won-INST two CL or three CL-GEN pin-ACC buy-can-PAST-DECL
     'A year ago, one won could buy two or three pins.'

Despite the fact that sentence (31) has no overt subject, any attempt to make the instrumental won-ulo 'with one won' into a subject parallel to the structure of the English gloss makes this sentence totally ungrammatical. Sentences like (32) and (33) occur in Korean although at first glance, they look like direct counterexamples to these semantic restrictions on subjecthood; however, the only possible interpretation of such subjects is as agents. These sentences, hence, actually reinforce our claim:

(32) Yelsoy-ka mwun-ul yel-n-ta.
     key-NOM door-ACC open-PRES-DECL
     'The key opens the door.'

(33) Mangchi-ka mwun-ul kkey-ess-ta.
     hammer-NOM door-ACC break-PAST-DECL
     'A hammer broke the door.'

Example (33) is adapted from Yang (1972: 8-9). Yang emphasizes that this sentence is only grammatical if mangchi 'hammer' is interpreted as a personification. One Korean speaker probably characterized such sentences best when he stated that they can only describe fairy tale worlds where keys and hammers are alive and conscious of their actions. As such, these subjects are agents, and sentences like (32) and (33) do not constitute counterexamples to our claim that instrumentals cannot become subjects in Korean. In the same manner, locative noun phrases do not become subjects:

(34) Na-uy kita-nun cwul hana-ka pwule-ci-ess-ta.
     I-GEN guitar-TOP string one-NOM break-PASS-PAST-DECL
     'As for my guitar, a string was broken.' = 'My guitar broke a string.'

Replacing the topic marker -nun on kita with the nominative marker -ka (and the nominative marker on hana with the accusative marker -lul) results in the ungrammaticality of the sentence, as we would predict by now. This sentence also shows the inability of patient noun phrases to become subjects in Korean. The English verb break is a so-called "ergative" verb, but, as we have seen, different verb forms must be used for transitive and for intransitive structures in Korean. The only way for the patient noun phrase cwul hana 'one string' to become the subject is by applying passivization. Sentence (34) is therefore an example of the usefulness of passivization even in Korean. Some noun phrases which Case Grammar would classify as locatives can appear as subjects in Korean, as sentence (35) shows:

(35) kay-nun yi hotel-i kumchiha-n-ta.
     dog-TOP this hotel-NOM forbid-PRES-DECL
     'As for dogs, this hotel forbids [them].'

However, one must question whether such subjects are truly locatives. Rather, it seems that the hotel in (35) is understood as an organization, and hence as an agent. This is also possible and occurs commonly in German, as sentence (37) demonstrates, notwithstanding the presumed ungrammaticality of the literal equivalent of sentence (36):

(36) German:
     *Dieses Hotel verbietet Hunde.
     this hotel forbids dogs
     'This hotel forbids dogs.'

(37) German:
     Dieses Geschäft verkauft Zahnpasta.
     this shop sells toothpaste
     'This shop sells toothpaste.'

Hawkins (1986: 58) stars sentence (36) as ungrammatical, although in my own judgment it receives two question marks at most. In any case, sentence (37) shows that not all subjects which can strictly be interpreted as locatives induce ungrammaticality in German. While a shop is undoubtedly situated at a particular location, it is also an organization which can act and, in fact, can be held legally responsible for its actions, just like a human actor can. Fillmore (1977: 71) specifically allows that a given noun phrase may have several different case roles depending on the perspective of interpretation one takes. The crucial point of example (37) is, therefore, that its subject is readily construed as an agent. In general, the grammaticality of German sentences with "locative" subjects improves dramatically the more likely an agentive interpretation becomes. The subject of sentence (38), for example, at first glance appears to be a locative parallel to dieses Hotel in (36):

(38) German:
     Dieses Hotel beschäftigt 200 Personen.
     this hotel employs 200 persons
     'This hotel employs 200 people.'

However, Bernard Comrie (personal communication) points out that careful consideration of its meaning reveals that the hotel may have a certain number of employees who work at other locations, such as sales representatives, delivery services, etc. These people are employed by, though not in, the hotel. Consequently, an agentive reading of the subject is favored above a locative interpretation, and this is reflected in the perfect grammaticality of the sentence. Even for the apparently exceptional sentence (36), it can be shown that its grammaticality improves quite considerably if an agentive reading of the subject is made more likely, cf. examples (39) and (40):

(39) German:
     Die Stadt Heidelberg hat Hunde verboten.
     the city Heidelberg has dogs forbidden
     'The city of Heidelberg has forbidden dogs.'

(40) German:
     (?) Dieses Hotel hat Hunde verboten.
     this hotel has dogs forbidden
     'This hotel has forbidden dogs.'

Since governments are understood to act (occasionally), the subject of (39) is more readily interpreted as an agent despite the fact that its edicts hold for a particular locality; hence, the sentence is acceptable. Whatever level of anomaly can still be ascribed to (40) arises primarily because the sentence seems to imply that the hotel has outlawed dogs in the society at large, and the native speaker doubts whether a hotel (as opposed to the city government, which may well have been concerned about the spread of rabies, etc.) might reasonably be assumed to have such far-reaching powers. In other words, it seems that in English the presence of a locative subject is by itself sufficient to localize the field of applicability of the verb whereas in German it is not, although locatives which are coded as such have the same effect in German, as (41) shows:

(41) German:
     Hunde sind hier verboten.
     dogs are here prohibited
     'Dogs are prohibited here.'

If this is correct, the absence of this localizing effect of the subject dieses Hotel in (36) and in (40) becomes extremely significant because it would point towards a difference in the semantic roles underlying the subjects of the grammatical English sentence This hotel forbids dogs as opposed to the questionable German sentence Dieses Hotel verbietet Hunde. The English subject exerts a localizing effect on the meaning of the verb and is, thus, a locative, whereas the German subject has no such effect and must, consequently, be an agent. Notice that the sentence with the "locative" subject dieses Hotel becomes completely acceptable in German if the locative range of the verb is properly constrained:

(42) German:
     Dieses Hotel verbietet das Rauchen in der Lobby.
     this hotel forbids the smoking in the lobby
     'This hotel prohibits smoking in the lobby.'

Consequently, the subject of the Korean example (35) can be construed as an agent noun phrase. Far from being anomalous, this sentence, therefore, illustrates once more the fact that subjecthood in Korean is restricted to agents and experiencers. Yang (1972: 8) points out that no other semantic roles can become subjects in Korean so that such well-known examples as English The book sells (well) have no literal equivalent in Korean. Sentence (43) presents a further example of an unusual English subject. Our informant made it clear that the phrase han chang-i 'one chapter' can only be the subject of a passive sentence, cf. the verb morphology. Its active equivalent must either have no overt subject, or a suitable agent (the author of the book, its publisher, etc.) must be chosen as subject. No other subject is possible. Notice that the Korean equivalent of the English subject the latest edition only has the role of topic in this sentence:

(43) Yi chayk-uy choykun ho-nun han chang-i cwuka-toy-ess-ta.
     this book-GEN latest edition-TOP one chapter-NOM add-PASS-PAST-DECL
     'As for the latest edition of this book, a chapter was added.' = 'The latest edition of this book has added a chapter.'

This subsection has shown that only a narrow range of semantic roles have access to subjecthood in Korean. This result is in keeping with the consistently transparent nature of the language as it is now taking shape with our observations that Korean has an extensive set of overt case markers, a high semantic specificity of the basic grammatical relations, and restricted use of the grammatical relation changing process of passivization. The contrast between English and Korean appears to confirm that a major reason for the widely expanded set of semantic roles which can become subjects in English is their need to function as topics.
As English word order became fixed in the subject-verb-object frame, English developed other mechanisms to take over the pragmatic functions that were previously performed by simple word order rearrangements. Two developments in the English language were particularly important. Both of them had as a major consequence the reduction of semantic transparency in English: Firstly, there was the extended applicability of the passive, which now allowed the promotion of the old class of dative objects; secondly, the extension of the range of noun phrases that are eligible to become subjects in underived environments allowed more and more semantic roles to function as subjects.

2.2. The semantic range of the basic grammatical relations

According to Kirkwood (1969), expanding the set of subjects allowed the English language to retain a good deal of the pragmatic topic-comment ordering despite the now grammatically fixed word order. Faced with the same problem of ensuring the expression of topic continuity through foregrounding, Korean has opted for a strategy that is diametrically opposed to that of English. Instead of fixing word order, Korean allows free scrambling of the major maximal constituents of a sentence, except for the verb, which must remain in final position (cf. section 2.3), making it possible for any constituent to be foregrounded. In addition, the topic marker -nun can attach to any nominal expression:

(44) Na-nun yi chayk-ul coaha-n-ta.
I-TOP this book-ACC like-PRES-DECL
'I like this book.'

(45) Il nyen ceney-nun Kiyong-i Yunsoo-lul ttayli-ess-ta.
one year before-TOP NOM ACC hit-PAST-DECL
'As for one year ago, Kiyong hit Yunsoo.'

(46) Yi yelsoy-lo-nun yi mwun-ul yel-swuiss-ta.
this key-INST-TOP this door-ACC open-can-DECL
'As for with this key, [one] can open this door.'

(47) kay-nun yi hotel-i kumchiha-n-ta.
dog-TOP this NOM forbid-PRES-DECL
'As for dogs, this hotel forbids [them].'

Even multiple topics are allowed:11

(48) Jongjin-un Seoul-ey-nun ka-ess-eyo.
TOP LOC-TOP go-PAST-SEMIFORMAL
'Speaking of Jongjin, Seoul he went to (but not to...)'

As a final strategy of Korean, non-agentive subjects can be construed as agent-like by making them the subject of a causative construction since causers prototypically are intentionally acting humans, and effects are prototypically the result of an action:

(49) Chongal-i Jongjin-uy taly-lul tacy-key ha-ess-ta.
bullet-NOM GEN leg-ACC wound-COMP do-PAST-DECL
'The bullet wounded Jongjin's leg.'
= lit.: 'The bullet caused to wound Jongjin's leg.'

The combination of free scrambling, causativization, and overt topic marking ensures that virtually any nominal expression can be foregrounded or topicalized. Therefore, the corresponding strategies which perform these functions in English are much less needed in Korean and, hence, severely restricted, like the passive, or virtually non-existent, like the collapsing of a large number of semantic roles onto the grammatical function of subjecthood.

2.2.2. Objects

Korean distinguishes between dative objects, which are marked by -eykey, and accusative objects, which are marked by -(l)ul. Since only accusative objects can undergo passivization (Hwang 1975: 45; cf. sentences (22b), (26c), (34), and (43) above for examples of the passive), the promotion of the underlying dative object ai-eykey 'the child-DAT' renders sentence (50b) ungrammatical:

(50) a. Jeongdal-i ai-eykey enehak-ul kaluchi-ess-ta.
NOM child-DAT linguistics-ACC teach-PAST-DECL
'Jeongdal taught linguistics to a child.'

b. *ai-ka enehak-ul Jeongdal-ey uyhayse
child-NOM linguistics-ACC LOC by
kaluchi-eci-ess-ta.
teach-PASS-PAST-DECL
'A child was taught linguistics by Jeongdal.'

Whereas English has lost the case marking distinction between dative and accusative and allows former dative objects to become promoted to subjecthood by means of passivization, Korean strictly differentiates the two categories. So, as in the case of subjects, Korean objects preserve the respective underlying semantic distinctions more closely than their structural counterparts do in a grammaticizing language like English.

2.3. Word order effects

2.3.1. Scrambling

Korean has a highly flexible word order. It has frequently been noted in studies of Korean (for instance by A. Kim 1985: 107 and by N. K. Kim 1987: 894) that all logically possible permutations of the major constituents result in a grammatical sentence as long as the verb remains in clause-final position. Therefore, sentences (51a) through (51f) are all grammatical:

(51) a. Kiyong-i Jinhee-eykey chayk-ul cwu-ess-ta.
NOM DAT book-ACC give-PAST-DECL
'Kiyong gave Jinhee a book.'
b. Kiyong-i chayk-ul Jinhee-eykey cwu-ess-ta.
c. Jinhee-eykey Kiyong-i chayk-ul cwu-ess-ta.
d. Jinhee-eykey chayk-ul Kiyong-i cwu-ess-ta.
e. Chayk-ul Kiyong-i Jinhee-eykey cwu-ess-ta.
f. Chayk-ul Jinhee-eykey Kiyong-i cwu-ess-ta.

Sentences (51g) and (51h), however, are not possible because material has been scrambled to the right of the verb:

(51) g. *Kiyong-i Jinhee-eykey cwu-ess-ta chayk-ul.
h. *Jinhee-eykey chayk-ul cwu-ess-ta Kiyong-i.

The verb-finality constraint ensures that rightward clause boundaries can be clearly identified. It is especially important when there are embedded clauses, as in sentence (52). With this constraint, the rightward boundary of all clause level constituents is marked in the surface structure. In nominal expressions, the boundary is marked by the case marker or postposition accompanying a noun, while clauses signal their rightward edge by the position of the verb. The identification of the role of the clause is further enhanced by the fact that embedded verbs typically have a complementizer

among their affixes. By comparison, leftward clause boundaries are left unsignalled, a phenomenon which will be further scrutinized in section 2.4 below. Their locations must be inferred by the hearer.

(52) Nay-ka [Sunhee-ka yi chayk-ul coaha-ki-lul]
I-NOM NOM this book-ACC like-COMP-ACC
kitayha-ess-ta.
expect-PAST-DECL
'I expected that Sunhee likes this book.'

The permutations that occur because of scrambling do not lead to a reduction in semantic transparency, unlike other types of movement, because the grammatical roles of the constituents that are moved around remain unequivocally identified by the case marker and/or the postposition accompanying each nominal expression. Constituents can only be scrambled as a unified whole, and nothing can be scrambled into, or out of, another constituent. The violation of this constraint is the cause of the ungrammaticality of (53b), where the matrix subject nwu-ka 'who' has been scrambled into the center-embedded clause:

(53) a. Nwu-ka [Yunsoo-ka Kiyong-eykey chayk-ul
who-NOM NOM DAT book-ACC
cwu-ess-ta]-ko malha-ess-ni?
give-PAST-DECL-COMP say-PAST-Q
'Who said that Yunsoo gave Kiyong a book?'

b. *[Yunsoo-ka Kiyong-eykey nwu-ka chayk-ul
NOM DAT who-NOM book-ACC
cwu-ess-ta]-ko malha-ess-ni?
give-PAST-DECL-COMP say-PAST-Q

Conversely, sentence (53c) is ungrammatical because the noun phrase Kiyong-eykey 'Kiyong-DAT' has been scrambled out of the embedded clause into the matrix:

(53) c. *Kiyong-eykey nwu-ka [Yunsoo-ka chayk-ul
DAT who-NOM NOM book-ACC
cwu-ess-ta]-ko malha-ess-ni?
give-PAST-DECL-COMP say-PAST-Q
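The claim that every permutation of the major constituents is available as long as the verb stays clause-final can be illustrated with a small enumeration. The following Python sketch is only an illustration, not part of the original analysis; the names ARGUMENTS, VERB, verb_final_orders and is_verb_final are hypothetical helpers introduced here, and the words are taken from example (51a).

```python
from itertools import permutations

# Major constituents of example (51a). Each noun keeps its case marker,
# which is what identifies its grammatical role in any position.
ARGUMENTS = ["Kiyong-i", "Jinhee-eykey", "chayk-ul"]
VERB = "cwu-ess-ta"


def verb_final_orders(arguments, verb):
    """Return every ordering of the arguments with the verb kept clause-final."""
    return [" ".join(order) + " " + verb for order in permutations(arguments)]


def is_verb_final(sentence, verb):
    """Return True if no material follows the verb in the sentence."""
    words = sentence.rstrip(".").split()
    return words[-1] == verb


orders = verb_final_orders(ARGUMENTS, VERB)
for sentence in orders:
    print(sentence)
```

Three arguments yield six orders, which correspond to (51a) through (51f); a string such as (51g), with chayk-ul placed after the verb, fails the is_verb_final check. The sketch covers only clause-internal order and says nothing about the ban on scrambling into or out of an embedded clause shown in (53).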

Of course, the statement that clause-level constituents must only be moved as a whole and without violating the integrity of other constituents entails the claim that neither adposition stranding nor extractions exist in Korean. The stranding of a postposition implies that it has somehow been removed from the remainder of the postpositional phrase in violation of the requirement that maximal constituents can only be moved as a whole. Extractions, on the other hand, move an element out of the clause that it semantically belongs to, thereby offending the integrity of the receiving clause. Indeed, none of these offending movement processes exists in Korean, showing once more how the task of maintaining transparency lends a common directionality to a large number of seemingly unrelated structural facts about this language. If case markers are deleted, as can optionally be done with the nominative, accusative, and some datives in casual speech, Korean resorts to its basic subject-object-verb word order in order to identify the grammatical function of the various constituents: The first noun phrase in such a sentence will be interpreted as the subject and the remaining noun phrases as objects. A. Kim (1985: 127) points out that deviations from the canonical word order make a sentence without overt case markers difficult to process even when there are semantic cues present that should facilitate the interpretation of the sentence and even in the presence of some grammatical cues other than case marking. Consider sentence (54): (54) UNwukwu sathang ne cwu-lay? who candy you give-want 'Who do you want to give candies to?' Since the verb requires an animate subject and recipient, sathang 'candy' is most likely the direct object in (54). In addition, nwukwu 'who' can not be the subject as this question word must invariably occur in its nominative form nwu-ka if it is a subject. Finally, the volitional morpheme lay can not occur with a third person agent (A. Kim 1985: 124). 
Together, these cues should make the assignment of grammatical functions in this sentence unproblematic, and yet Kim insists that sentence (54) is difficult to process for most native speakers. The availability of scrambling, thus, crucially depends on the presence of case markers. In other words, Korean again opts for the more transparent solution: Only case markers will guarantee the unambiguous assignment of grammatical relations in all sentences. As a result, Korean forgoes the possibility of scrambling in all instances where

this assignment relies on less reliable pragmatic, semantic, and grammatical cues. Interestingly, the respective order of dative and accusative objects is not fixed in Korean and must be recovered from the pragmatic context. English makes a clear distinction between these on the basis of word order: If a sentence like John gave Mary Bill can plausibly be uttered in some context, it is clear that the recipient is Mary and the object received is Bill.

2.3.2. Extractions

The absence of extraction in Korean has already been noted in the preceding section. The purpose of this section is to add some examples whose ungrammaticality illustrates this fact. The formation of WH-questions does not involve movement in Korean. Instead, the WH-word simply replaces whichever noun phrase or adverbial is questioned:

(55) a. Kiyong-un [Sunhee-ka nwukwu-eykey yi chayk-ul
TOP NOM who-DAT this book-ACC
cwu-n-cul]-lo al-ko-iss-na?
give-PRES-COMP know-Q
'To whom does Kiyong believe Sunhee gave this book?'12

Like any maximal constituent, the WH-word is free to scramble within its clause, as examples (55b) and (55c) indicate, but the ungrammaticality of sentence (55d) shows that any attempt to extract the WH-word out of its clause induces ungrammaticality.

(55) b. Kiyong-un [nwukwu-eykey Sunhee-ka yi chayk-ul
TOP who-DAT NOM this book-ACC
cwu-n-cul]-lo al-ko-iss-na?
give-PRES-COMP know-Q

c. Kiyong-un [Sunhee-ka yi chayk-ul nwukwu-eykey
TOP NOM this book-ACC who-DAT
cwu-n-cul]-lo al-ko-iss-na?
give-PRES-COMP know-Q

d. *Nwukwu-eykey Kiyong-un [Sunhee-ka yi chayk-ul
who-DAT TOP NOM this book-ACC
cwu-n-cul]-lo al-ko-iss-na?
give-PRES-COMP know-Q

Finally, sentence (56b) shows the ungrammaticality of extraction from a different type of complement, namely a nominalized clause. (Note the accusative marker attached to the complementizer.)

(56) a. Jinhee-ka [nwu-ka yi chayk-ul Kiyong-eykey
NOM who-NOM this book-ACC DAT
cwu]-ki-lul wuenha-na?
give-COMP-ACC want-Q
'Who does Jinhee want to give Kiyong this book?'

b. *Nwu-ka Jinhee-ka [yi chayk-ul Kiyong-eykey
who-NOM NOM this book-ACC DAT
cwu]-ki-lul wuenha-na?
give-COMP-ACC want-Q

Based on these word order effects, we can therefore conclude that there is no WH-extraction in Korean.

2.3.3. Raising

2.3.3.1. Subject-to-subject raising

There is no evidence of subject-to-subject raising in Korean. The equivalent construction to English raising works exactly as we would expect for all embeddings: Kat 'seem' appears as the matrix verb in sentence-final position; there is no overt expletive; and the embedded clause has the same shape as its simple counterpart, with the exception of the morphology of its verb, which is nominalized and must select from a different set of tense-aspect markers than a main verb. This is shown in the contrast between sentences (57a) and (57b).

(57) a. Nay chinkwu-ka yi chayk-ul coaha-n-ta.
my friend-NOM this book-ACC like-PRES-DECL
'My friend likes this book.'

b. [Nay chinkwu-ka yi chayk-ul coaha-nun kes] kat-ta.
my friend-NOM this book-ACC like-COMP NML seem-DECL
'My friend seems to like this book.'

Sentence (57c) illustrates that scrambling applies freely within the embedded clause, including scrambling of the subject:

(57) c. [Yi chayk-ul nay chinkwu-ka coaha-nun kes] kat-ta.
this book-ACC my friend-NOM like-COMP NML seem-DECL

This last fact alone indicates that the subject has not been raised, given the constraint formulated above that scrambling into another constituent is not permitted. Add to this the fact demonstrated by (57d) that this subject can not be scrambled within the matrix clause, and the evidence from movement processes against a raising analysis is complete:

(57) d. *[Yi chayk-ul coaha-nun kes] nay chinkwu-ka kat-ta.
this book-ACC like-COMP NML my friend-NOM seem-DECL

Another way to demonstrate that no raising has taken place in sentences like (57b) is by means of honorific agreement. In Korean, subjects agree with the verb in the level of honorification. If the subject is honorified, the verb must take an honorific marker as well. This is why sentences (58a) and (59a) are both well-formed whereas (58b) and (59b) are not. In example (58a), both the subject and the verb are honorific, as marked by the suffix -nim on the subject and by the suffix -si on the verb, while in example (59a), both the subject and the verb are in their "plain", non-honorific forms. In (58b), however, the subject is honorific, but the verb is in its non-honorific form, making the sentence unacceptable due to the lack of honorific agreement between the subject and the verb. Similarly in (59b), a non-honorific subject clashes with an honorific verb, again triggering unacceptability:

(58) a. Sensayng-nim-i hakkyo-ey tochakha-si-ess-ta.
teacher-HON-NOM school-LOC arrive-HON-PAST-DECL
'The teacher arrived at the school.'

b. *Sensayng-nim-i hakkyo-ey tochakha-ess-ta.
teacher-HON-NOM school-LOC arrive-PAST-DECL
'The teacher arrived at the school.'

(59) a. Nay chinkwu-ka hakkyo-ey tochakha-ess-ta.
my friend-NOM school-LOC arrive-PAST-DECL
'My friend arrived at the school.'

b. *Nay chinkwu-ka hakkyo-ey tochakha-si-ess-ta.
my friend-NOM school-LOC arrive-HON-PAST-DECL
'My friend arrived at the school.'

When honorific agreement is used as a test in sentences like (57b), it is found that the underlying subject of the embedded verb can only agree with that verb and not with the matrix verb kat-ta 'seem':

(60) a. [Sensayng-nim-i yi chayk-ul coaha-si-nun kes]
teacher-HON-NOM this book-ACC like-HON-COMP NML
kat-ta.
seem-DECL
'The teacher seems to like this book.'

b. *[Sensayng-nim-i yi chayk-ul coaha-nun kes]
teacher-HON-NOM this book-ACC like-COMP NML
kat-usi-ta.
seem-HON-DECL

Just like the scrambling evidence, this agreement pattern therefore shows that no subject-to-subject raising has occurred.
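The agreement pattern in (58) and (59) amounts to a simple matching condition, which the following Python sketch spells out. It is a toy illustration under strong assumptions, not a description of Korean morphology: honorific subjects are recognized only by a hyphen-separated -nim, honorific verbs only by -si or -usi, and the function names are hypothetical.

```python
def subject_is_honorific(subject):
    # Toy assumption: an honorific subject carries the suffix -nim.
    return "nim" in subject.split("-")


def verb_is_honorific(verb):
    # Toy assumption: an honorific verb carries the suffix -si (or -usi).
    return any(morpheme in ("si", "usi") for morpheme in verb.split("-"))


def agrees(subject, verb):
    """Return True if subject and verb match in level of honorification."""
    return subject_is_honorific(subject) == verb_is_honorific(verb)


# The four subject-verb pairs of examples (58a), (58b), (59a), (59b).
pairs = [
    ("Sensayng-nim-i", "tochakha-si-ess-ta"),
    ("Sensayng-nim-i", "tochakha-ess-ta"),
    ("Nay chinkwu-ka", "tochakha-ess-ta"),
    ("Nay chinkwu-ka", "tochakha-si-ess-ta"),
]
for subject, verb in pairs:
    print(subject, verb, agrees(subject, verb))
```

Applied to (60), the same check pairs Sensayng-nim-i with the embedded verb coaha-si-nun, which is the agreement that the text identifies as the only possible one; the matrix verb kat-ta is not checked against that subject at all.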

2.3.3.2. Object-to-subject raising

The productive use of constructions like those in (61) suggests that object-to-subject raising may exist in Korean. Some evidence for a raising analysis

comes from the fact that the underlying object can be scrambled to the right of the embedded verb as if it were a constituent of the matrix clause, as shown in (61b):

(61) a. Enehak-i kongpwuha-ki-ka swuip-ta.
linguistics-NOM study-COMP-NOM easy-DECL
'Linguistics is easy to study.'

b. Kongpwuha-ki-ka enehak-i swuip-ta.
study-COMP-NOM linguistics-NOM easy-DECL

With regard to its effect on the position of Korean in the Semantic Typology, the apparent availability of object-to-subject raising constitutes a problem. In the properties discussed thus far, Korean has exhibited a remarkable overall consistency. Raising, however, is a process which reduces semantic transparency considerably, if the reasoning behind the Grammatical Relation Manipulation Hierarchy presented in section 1.3.3.2 is correct. It must, therefore, be asked whether a raising analysis is actually the correct derivation of sentences like (61), or whether the possibility of scrambling within the matrix is not sufficient to prove that raising has occurred. In fact, there is evidence along the lines of Comrie (1989b) for Russian and Comrie and Matthews (1990) for Serbo-Croatian that object-to-subject raising is not the correct analysis for sentences like (61).13 As in Russian, the only grammatical relation changing process in Korean other than the presumed object-to-subject raising rule, namely the passive, is limited to advancing accusative objects. However, the presumed instances of Korean object-to-subject raising are not constrained in the same way, as example (62) demonstrates. Not just objects, but many kinds of phrases for which a raising analysis does not seem plausible can appear as the matrix subject in such sentences, like the locative phrase yipen kumyoil 'this Friday' in (62):

(62) Yipen kumyoil-i kongpwuha-ki swui-wul-kesi-ta.
this Friday-NOM study-COMP easy-FUT-DECL
'This Friday is easy for studying.'
While (62) is not the preferred structure to use in such a situation, it is perfectly grammatical when studying is usually difficult on Fridays, say,

because of the noise level in the students' dormitory; sentence (62) could be uttered to refer to a Friday when the noise problem won't occur. Furthermore, a raising analysis would predict that the subject and the verb of the matrix in such sentences may not enter into any semantic relationship at all. Again, this is not true in Korean. As shown in (63), the sentences become ungrammatical when no such semantic relation exists:14

(63) a. [Sunhee-ul hwana-key ha-nun kes]-un
ACC get angry-CAUS make-PNE COMP-TOP
swuip-ta.
easy-DECL
'It is easy to annoy Sunhee.'

b. *Sunhee-ka [hwana-key ha-nun kes]-un
NOM get angry-CAUS make-PNE COMP-TOP
swuip-ta.
easy-DECL
'Sunhee is easy to annoy.'

2.4. Clause marking

So far this chapter has shown that Korean is characterized by a high level of semantic transparency throughout a wide range of its grammatical properties. The most interesting part about this consistency is the discovery of how various seemingly unrelated properties of the language interact to preserve the advantages of a consistent location on the continuum close to its transparent end. The purpose of this subsection is to discuss a property of Korean which, at first glance, appears to run counter to this generalization, namely the marking of dependent clauses. Sentences (64) to (66) present some typical examples:15

(64) Na-nun [Kiyong-i ttena-ess-ta]-ko
I-TOP NOM leave-PAST-DECL-COMP
mit-nun-ta.
believe-PRES-DECL
'I believe that Kiyong left.'

(65) John-i [Bill-i Mary-eykey senmwul-ul
NOM NOM DAT gift-ACC
cwu-ess-ta]-ko malha-yess-ta.
give-PAST-DECL-COMP say-PAST-DECL
'John said that Bill gave Mary a gift.'

(66) Chulsoo-nun [Sunhee-ka cip-ey
TOP NOM home-LOC
o-ess-nun]-ci-lul mwul-ess-ta.
come-PAST-ASP-COMP-ACC ask-PAST-DECL
'Chulsoo asked whether Sunhee came home.'

Notice that the dependent clauses in these examples are marked as such by the occurrence of a complementizer, i.e., -ko in (64) and in (65), and -ci in (66). Therefore, dependent clauses are not strictly counterexamples to the claim of consistent high transparency in Korean since any dependent clause is explicitly marked with a complementizer, as their counterparts are in English. However, only the right flank of a dependent clause is so marked while the left flank is not overtly signalled in Korean. This marking pattern is typical of left-branching languages generally (Schachter 1985: 47), but on the face of it, it is somewhat surprising from the standpoint of the Semantic Typology because it opens up the possibility that syntactic dependency in Korean consistently produces temporary ambiguities, a property which, if true, would clearly contradict the spirit of semantic transparency which otherwise pervades the grammar of Korean.

The strategy of marking subordination found in Korean presents a potential interpretive problem because, in on-line performance, a hearer generally can not rely on overt grammatical cues to signal the beginning of an embedded clause since there is no overt syntactic indication in the beginning of any new sentence whether main clause material or a dependent clause is being transmitted. For instance, the first indication for a hearer that sentence (67) begins with a dependent clause occurs only in the complementizer suffix -ko on the verb, i.e., on the very last word of that clause:

(67) [Bill-i Mary-eykey senmwul-ul cwu-ess-ta]-ko
NOM DAT gift-ACC give-PAST-DECL-COMP
John-i malha-ess-ta.
NOM say-PAST-DECL
'John said that Bill gave Mary a gift.'

To a certain extent, this problem is exacerbated by the operation of scrambling and the possibility of pro-drop in Korean: As a result of scrambling, a listener can not rely on a fixed order of maximal constituents, say S IO DO V, or on a break in such a sequence to provide a syntactic hint that an embedded clause is being constructed; and as a result of pro-drop, the hearer can not trust that each verb will have a full set of overt arguments. Arguments may, hence, be misassigned to the wrong clause if the hearer constructs, or fails to construct, a clause boundary. Example (68) illustrates this problem:

(68) John-un [Mary-eykey senmwul-ul Bill-i
TOP DAT gift-ACC NOM
cwu-ess-ta]-ko malha-ess-ta.
give-PAST-DECL-COMP say-PAST-DECL
'John said that Bill gave Mary a gift.'

In the absence of grammatical evidence to the contrary, a hearer is likely initially to interpret John, Mary, and senmwul 'gift' as arguments of the same predicate. This misinterpretation is only corrected when he encounters the nominative Bill-i, which gives cwu-ess-ta a full set of arguments preceding it, indicating that John belongs in another clause. At this point, it also becomes clear that John opens the matrix clause as the topic marker -nun is restricted to matrix constituents. The processor can, therefore, assign the correct clause boundary on the occurrence of Bill-i. While Korean subordination, thus, conforms with the predictions of the Semantic Typology when sentences are viewed globally as a whole, Korean may present a problem when sentence production is traced as it occurs on-line.

Frazier and Rayner (1988: 260-274) have argued that left-branching languages like Korean can avoid the problem of misinterpreting subordinate clauses as main clauses, and vice versa, by abandoning top-down processing (or, more precisely, their Partial Top Down Constraint, which is hypothesized for right-branching languages).
They report that Ueda (1984) discovered evidence that bottom-up processing is in fact used in Japanese, which like Korean is a consistently left-branching language. For example, in sentences like (69) there is a definite preference among speakers of Japanese to interpret the sentence-initial adverb kinoo 'yesterday' as a constituent of the embedded clause, rather than as a constituent of the main clause (Frazier and Rayner 1988: 272-273):

(69) Japanese: Kinoo John-ga kekkonsi-ta to Mary-ga it-ta.
yesterday NOM marry-PAST COMP NOM say-PAST
'Mary said that John married yesterday.'

If Frazier and Rayner are correct, dependent clauses in Korean do not induce rampant temporary ambiguities. Rather, the parser simply builds up constituents from the bottom up and does not make a decision on what constitutes the main clause until the last predicate is encountered. Right end marking of embedded clauses in Korean is, therefore, not in conflict with the semantic transparency of that language.
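The idea of building constituents from the bottom up and deferring the main-clause decision until the last predicate can be pictured with a small Python sketch. This is a deliberately simplified toy and not Frazier and Rayner's model: it assumes that an embedded verb ends in -ta-ko and a matrix verb in -ta, that topic-marked constituents (-un, -nun) always stay in the matrix, as the text states for -nun, and that every other pending constituent joins the next verb encountered. The function names are hypothetical, and the clause brackets of the examples are left out of the input.

```python
def is_topic(token):
    # Toy assumption: topic-marked constituents end in -un or -nun.
    return token.endswith("-un") or token.endswith("-nun")


def group_clauses(tokens):
    """Group tokens into clauses, closing each clause at its verb (right edge)."""
    pending = []
    for token in tokens:
        if token.endswith("-ta-ko"):
            # Embedded verb: topic-marked material is held back for the matrix.
            held = [p for p in pending if isinstance(p, str) and is_topic(p)]
            args = [p for p in pending if not (isinstance(p, str) and is_topic(p))]
            pending = held + [("EMBEDDED", args, token)]
        elif token.endswith("-ta"):
            # Last predicate: everything still pending belongs to the matrix.
            pending = [("MATRIX", pending, token)]
        else:
            pending.append(token)
    return pending


# Examples (67) and (68), written without the clause brackets.
sentence_67 = ["Bill-i", "Mary-eykey", "senmwul-ul", "cwu-ess-ta-ko",
               "John-i", "malha-ess-ta"]
sentence_68 = ["John-un", "Mary-eykey", "senmwul-ul", "Bill-i",
               "cwu-ess-ta-ko", "malha-ess-ta"]
print(group_clauses(sentence_67))
print(group_clauses(sentence_68))
```

No clause boundary is posited until a verb is reached, so the left edge of the embedded clause never has to be guessed in advance. The sketch would misgroup a sentence whose matrix subject precedes the embedded clause without topic marking, which is one reason to read it as an illustration only.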

2.5. Summary

This chapter has argued that Korean is a language with a very high degree of semantic transparency throughout its grammar. Korean is characterized by overtly marked and semantically narrow grammatical relations, a free intra-clausal word order, which allows it to encode pragmatic information flow functions overtly, and the virtual absence of argument trespassing. As such, Korean is located close to the transparent end on the scalar model of the Semantic Typology and contrasts most sharply with English, which had been determined to closely approximate the grammaticizing end of the scale. The Semantic Typology advances our understanding of how the grammar of Korean works since it shows how a large number of otherwise unrelated grammatical properties of that language combine to yield a high level of semantic transparency throughout the grammar. In turn, the consistency of this transparency of Korean provides supporting evidence for the validity of the typology as it was developed in the first chapter.

3. The interaction with other principles

3.1. An interactionist view of language

This chapter illustrates the interaction of the Semantic Typology with other principles. Clearly, the typology does not exist in a vacuum; rather, it is but one of many organizing principles that a language must respond to, ranging from general principles of communicative systems, such as the principles of information flow described by the pragmatic component of language, to specific structural concepts dominating the organization of a particular language. Consequently, we should expect that the data for any given language may not line up precisely as expected by the Semantic Typology alone, but that the effects of such other forces interfere and lead to a divergence from the predicted alignment. The cross-linguistic survey presented in the next chapter reveals several instances of such interactions.

Having described the properties of languages approaching the ideal configurations on each end of the scale in Chapters 1 and 2, this chapter attempts to isolate such an interaction of the Semantic Typology with other principles by outlining the properties of a language which displays a consistent patterning close to one of these extremes, except for the interference of a single parameter which distorts the effect of the Semantic Typology in this language. The consistency and the regularity of the resulting deviations from the expected patterning thus serve to clarify the effects of such an interaction and provide additional evidence for our typology. The language chosen for this purpose is Indonesian, which constitutes a striking example where a single principle, one that is central to the organization of that language, interacts with the Semantic Typology, leading to well-defined and consistent deviations from the predictions of the Semantic Typology over a wide range of linguistic properties.
In the case of Indonesian, this interfering principle is the centrality of the notion of subjecthood to movement processes. Besides providing further insight into the working of the Semantic Typology, the treatment of Indonesian from the typological perspective developed here also yields a unifying view of a number of grammatical properties of this language, some of which have been recognized individually in the literature and some of which have not previously been noted. The Semantic Typology provides a rationale for the co-occurrence of these properties and, thus, can be said to provide a motivation for their coexistence in Indonesian.

Chapter 3: The interaction with other principles

3.2. Subject and voice in Indonesian

The use of the notions "subject" and "voice" is critical to the analysis of Indonesian presented in this chapter. The fact that the relevancy of these notions has frequently been denied in the literature therefore makes it necessary for us to review the type of arguments that have been forwarded against them and to show why these arguments are not syntactically correct. Like many Indonesianists, Prentice (1987: 931-934) is uncomfortable with applying the terms active and passive in the analysis of Indonesian, preferring instead to speak of "agent-orientation" and "object-orientation" (for a similar view, cf. Wouk 1986). I believe that the reasons given by these critics for the rejection of an active-passive analysis for Indonesian are not compelling and that an analysis which does not treat the relevant Indonesian constructions in terms of voice only prevents a unified grammatical account of Indonesian syntax based on universal linguistic concepts. The reasons cited by the critics include the great text frequency of "object-oriented" verbs and the fact that the choice between agent-orientation and object-orientation is determined by grammatical and stylistic considerations quite distinct from those that are assumed to trigger the choice between active and passive in other languages. Prentice (1987: 934) lists three such constraints: The requirement that subjects must be definite triggers object-orientation whenever the underlying subject is non-definite; action sequences are frequently presented as a series of object-oriented verbs (cf. also McCune 1979); and the fact that yang, which occurs as a relativizer, in WH-questions, and in clefting, must be linked to a subject.
However, the discussion in this chapter shows that an analysis in terms of active versus passive yields a uniform account of the constraints on movement processes in Indonesian, thereby strongly suggesting that this analysis is superior to one which resorts to the idiosyncratic device of "object-orientation". Further arguments for using the term passive are developed in Chung (1976b). Certainly, if the recent contributions in Shibatani (1988) are any guide, the essential ingredient of an active-passive distinction universally is a rearrangement of the grammatical relations in a sentence, rather than a list of discourse circumstances, such as the choice of narrative style, under which one or the other would be used. As John Verhaar (1978: 14-15) correctly points out:

There is scarcely any imaginable view of what is here called "passive" that has not been represented in all the studies (much of it polemics)

around the "vervoegde werkwoordsvormen" [affixed verb forms, FMG] ... but when these ideas are developed in discussions it invariably ... turns out that what is meant is that Indonesian verbs are not inflectional in the same way Indo-European verbs are ... There is, from the terminological point of view, no reason not to employ the term "passive" for the forms so named in the present article. Once one does, of course, so call them, one may reject interpretations of them that are due only to the critic's understanding of the terms involved; for example, "passive" need not be understood in its Indo-European sense.

Apparently realizing that passive subjects are no longer derived after abandoning the analysis in terms of a voice opposition, Cumming (1986: 72-73) takes the next logical step and denies the applicability of the term "subject" for Indonesian altogether on grounds that most of the noun phrases which have traditionally been called subjects do not have the prototypical subject property of being agents. Of course, Cumming's argument hinges crucially on the rejection of the voice analysis. If an active-passive dichotomy were rejected for English, for instance, Cumming's arguments would lead us to deny the existence of subjects for that language as well by the same criteria. Independent support for the relevance of the notion "subject" in fact exists in the syntax of Indonesian. This evidence comes from sentences like the following, found in the Indonesian magazine Matra (June 1990: 17):

(70) Waktu itu apakah Anda sudah yakin Bung Karno
time that Q you already believe Sukarno
di-pengaruh-i oleh PKI?
PASS-influence-LOC by Communist Party of Indonesia
'At that time, did you already believe that Sukarno was influenced by the PKI?'

In Indonesian, the question particle apakah occurs sentence-initially, except that it is preceded by the grammatical topic of the sentence.
This topic is clearly waktu itu 'at that time' since this phrase appears in the sentence-initial position. The crucial point is the grammatical role of the noun phrase Anda 'you'. Anda cannot be the grammatical topic because of its position following the question particle apakah and because of the presence of the temporal noun phrase waktu itu. Yet, it is the agent of yakin 'believe' and may be a lesser discourse topic; in addition, Anda (but not grammatical topics like waktu itu) can occur in an oblique oleh 'by' phrase when the sentence is passivized ("object-oriented"). Altogether, then, Anda has few characteristics of a grammatical topic while behaving precisely like a subject. In the absence of any other possibilities, we must therefore conclude that the notion of subjecthood is relevant in the grammar of Indonesian; in fact, only when this relevance is recognized does it become possible to describe the properties of movement in Indonesian in terms of a single, overarching syntactic constraint.

3.3. Low semantic transparency in Indonesian

Some readily observable properties of Indonesian, such as its word order properties and the marking of participant roles, make it a possible contender as a member of the grammaticizing type, like English. While several languages with close affinities to the transparent end of the Semantic Typology have been identified so far, namely Korean, Russian, and to a lesser extent German, the grammaticizing type is so far represented only by English. As we will see in this chapter, however, Indonesian is actually characterized by a split between some areas of the grammar which have a low degree of semantic transparency and others with a much higher degree. What is crucial is that the distribution of these two areas is very systematic and can be attributed to the interference with the effects of the Semantic Typology by a single factor, which we call the Subjects Only Constraint on movement.

3.3.1. Basic word order

The first feature which aligns Indonesian with English is its word order. Indonesian sentences are characterized by stringent word order requirements. It is a consistent head-initial language with a basic subject - auxiliary - verb - object word order. Nevertheless, variation from the basic order can be achieved by means of topicalization, passivization, and extraction. While the latter two are illustrated in sections 3.3.3 and 3.4.2, respectively, topicalization is taken up here. The contrast between sentence (71a) and sentence (71b) illustrates this process:

(71) a. Mungkin harga-nya bisa lebih murah untuk kota-kota lain.
        maybe price-3 can more cheap for city-city other
        'The prices may be lower for other cities.'
     b. Untuk kota-kota lain mungkin harga-nya bisa lebih murah.
        for city-city other maybe price-3 can more cheap
        'For other cities, the prices may be lower.'

The oblique phrase untuk kota-kota lain 'for other cities' is moved from its original position in (71a) to the initial position in (71b) in order to make it into the grammatical topic of that sentence. However, argument scrambling is not permitted, as the ungrammaticality of the various proposed permutations in example set (72) demonstrates:

(72) a. Teman saya mem-beli mobil-nya.
        friend I TA-buy car-3
        'My friend bought his car.'
     b. *Mem-beli teman saya mobil-nya.
     c. *Mem-beli mobil-nya teman saya.
     d. *Teman saya mobil-nya mem-beli.
     e. *Mobil-nya teman saya mem-beli.
     f. *Mobil-nya mem-beli teman saya. (In this meaning.)

Chung (1976a: 44) claims that a kind of scrambling exists in Indonesian because, according to her data, "post-verbal NPs" can be scrambled. She offers the following examples in support of her contention:

(73) a. Iwan me-masuk-kan anjing itu ke truk.
        TA-enter-CAUS dog that to truck
        'Iwan forced the dogs into the truck.'
     b. Iwan me-masuk-kan ke truk anjing itu.
        TA-enter-CAUS to truck dog that
        'Iwan forced the dogs into the truck.'

None of my Indonesian language consultants accepted Chung's scrambled version. All required strict adjacency between the verb and the direct object. Scrambling is avoided even when it involves two oblique phrases, all things being equal, with locatives preceding temporal phrases:

(74) a. Saya mau beli pakaian di Pasar Baru minggu depan.
        I want buy clothes in market new week front
        'I want to buy clothes in Pasar Baru next week.'
     b. ??Saya mau beli pakaian minggu depan di Pasar Baru.
        I want buy clothes week front in market new
        'I want to buy clothes next week in Pasar Baru.'

3.3.2. Grammatical relations

Even a cursory look at the examples presented so far reveals that Indonesian is not a case-marking language. The fixed word order implies that the basic grammatical relations are identified by position, making substantive case markers redundant, and in fact, there is only one inflectional affix that can occur with nouns, the definiteness/possessive marker -nya (cf. buku-nya 'the book'), which is homonymous with the third person enclitic -nya 'his, her'. Beyond that, Indonesian has a set of verb affixes which are associated with the transitivity status of a verb and which could, on this basis, possibly be interpreted as a kind of head-marking equivalent to more conventional case markers, following the typological distinction established by Nichols (1986). Among these verb affixes, the passive marker di- is the only form which is utilized exclusively for inflectional purposes. However, several others also function grammatically as indicators of the argument structure of a sentence, i.e., they serve a strictly syntactic purpose. The prefix ber-16 indicates the intransitivity of the verb, meng- its transitivity, and di- the passive. Moreover, Chung (1976a) has demonstrated convincingly that the suffixes -i and -kan serve to create direct objects from oblique prepositional phrases marked with ke 'to' or untuk 'for', respectively. Because of these affixes, the morphological form of a verb regularly reflects its argument structure; however, there is no agreement between the verb and any of its arguments. The inflectional verb affixes of Indonesian can be represented as in Table 5:

Table 5: Inflectional verb affixes in Indonesian

Prefix           Stem      Suffix
ber-          +  STEM
meng-, di-    +  STEM   +  -kan, 0, -i

However, even this verbal marking of transitivity and argument roles is deficient, since it is characterized by many lexical exceptions. For instance, while meng- verbs are overwhelmingly transitive active, a sizable minority of them are not. To mention just a few, me-laut 'sail' and meny-alak 'bark' are intransitive. There are even a few transitive verbs with the prefix ber-: A popular dictionary lists ber-buah-kan 'result in', which is a transitive verb notwithstanding the prefix ber-; the regular transitive form mem-buah-kan exists as well and is glossed as 'produce, yield' (Wojowasito - Poerwadarminta - Gaastra, no date: 33). Chung's rule of Dative usually involves suffixation of the verb with -i or -kan, but a number of verbs (among them beri 'give', kasih 'give', and bayar 'pay') undergo this rule without any suffixation (Chung 1976a: 55). Such exceptions are widespread enough to make a clear distinction between inflectional and derivational affixes doubtful. Accordingly, it is not surprising that Prentice (1987: 919) recognizes only two inflectional affixes in Indonesian, namely meng- and di-. For Prentice, ber- derives stative or dynamic verbs, while Chung's object-creating functions of -i and -kan are included in his list of derivational affixes. Even the regular application of the affixes may still not identify the roles of all noun phrases because the suffixes -i and -kan are multifunctional: Both suffixes are transitivizers, but -kan regularly indicates the causative besides marking the benefactive (mati 'die' -> me-mati-kan 'kill', hidup 'live' -> meng-hidup-kan 'bring to life, create', etc.); the suffix -i often marks iterative verbs (me-mukul 'hit' -> me-mukul-i 'hit repeatedly', meng-gigit 'bite' -> meng-gigit-i 'bite repeatedly', etc.) in addition to serving as a marker of transitive locative verbs. The Indonesian verb affixes therefore cannot be seen as an effective head-marking equivalent to case markers.
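The gap between what the prefixes predict and what individual verbs actually do can be made concrete with a small sketch. The table of defaults and the list of exceptions below contain only forms cited in the text; the function names, the status labels, and the treatment of me-, mem-, men-, meny- as surface variants of meng- are assumptions made for this illustration, not part of the analysis itself.

```python
from typing import Optional

# Default reading of each prefix, following the description in the text.
PREFIX_DEFAULT = {
    "ber-": "intransitive",
    "meng-": "transitive active",
    "di-": "passive",
}

# Surface shapes of meng- as they appear in the hyphenated examples
# (assumption: these are all treated as one prefix here).
MENG_ALLOMORPHS = {"meng", "meny", "mem", "men", "me"}

# Lexical exceptions cited in the text, keyed by the hyphenated verb form.
LEXICAL_EXCEPTIONS = {
    "me-laut": "intransitive",            # 'sail'
    "meny-alak": "intransitive",          # 'bark'
    "ber-buah-kan": "transitive active",  # 'result in'
}


def prefix_of(verb: str) -> Optional[str]:
    """Return the normalized prefix of a hyphen-segmented verb, if any."""
    if "-" not in verb:
        return None  # bare stem such as 'tidur'
    first = verb.split("-")[0]
    if first in MENG_ALLOMORPHS:
        return "meng-"
    key = first + "-"
    return key if key in PREFIX_DEFAULT else None


def predicted_status(verb: str) -> Optional[str]:
    """Status the prefix alone would lead us to expect."""
    prefix = prefix_of(verb)
    return PREFIX_DEFAULT[prefix] if prefix is not None else None


def attested_status(verb: str) -> Optional[str]:
    """Status after consulting the list of lexical exceptions."""
    return LEXICAL_EXCEPTIONS.get(verb, predicted_status(verb))


if __name__ == "__main__":
    for verb in ["mem-beli", "di-pukul", "me-laut", "meny-alak", "ber-buah-kan"]:
        print(verb, predicted_status(verb), attested_status(verb))
```

Every verb for which the two functions disagree is a case where the morphology fails to identify the argument structure, which is the sense in which the affixes fall short of a head-marking equivalent to case.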

3.3.3. The semantic diversity of basic grammatical relations

In addition to word order and the marking of grammatical relations, a further characteristic of low semantic transparency in Indonesian is the considerable semantic diversity of both objects and subjects; however, underived subjects display relatively more semantic specificity than in English.

3.3.3.1. Objects

The class of direct objects in Indonesian actually covers an even wider range than in English. Like that of English, it includes not only strongly patient-like objects, but also other roles which more semantically transparent languages tend to treat as distinct from direct objects. Sentence (75) demonstrates the canonical patient object. Notice that the non-patient objects in (76) and (77) can unproblematically be promoted to subjecthood by means of passivization, unlike their German equivalents, which are given to reinforce the contrast:

(75) a. Ali me-mukul orang ini.
        TA-hit man this
        'Ali hit this man.'
        'Ali schlug diesen (ACC) Mann.'
     b. Orang ini di-pukul oleh Ali.
        man this PASS-hit by
        'This man was hit by Ali.'
        'Dieser (NOM) Mann wurde von Ali geschlagen.'

(76) a. Ali me-nolong orang ini.
        TA-help man this
        'Ali helped this man.'
        'Ali half diesem (DAT) Mann.'
     b. Orang ini di-tolong oleh Ali.
        man this PASS-help by
        'This man was helped by Ali.'
        '*Dieser (NOM) Mann wurde von Ali geholfen.'

(77) a. Dia mem-erlu-kan bantu-an-mu.
        he TA-need-BEN help-NML-2SG
        'He needs your help.'
        'Er bedarf deiner (GEN) Hilfe.'
     b. Bantu-an-mu di-perlu-kan-nya.
        help-NML-2SG PASS-need-BEN-3
        'Your help was needed by him.'
        '*Deine Hilfe (NOM) wurde von ihm bedurft.'

The Indonesian direct objects in (75), (76), and (77), like those of English, correspond to accusative, dative, and genitive objects in German, respectively. Benefactive noun phrases which function as a direct object, such as the noun phrase Tini in (78b), including those created by the rule of Dative (Chung 1976a), undergo passivization freely, as shown in (78d), although their equivalents in English must frequently appear in a prepositional phrase and resist the application of the passive.

(78) a. Karto mem-beli mobil itu untuk Tini.
        TA-buy car that for
        'Karto bought that car for Tini.'
     b. Karto mem-beli-kan Tini mobil itu.
        TA-buy-BEN car that
        'Karto bought Tini that car.'
     c. *Karto mem-beli Tini mobil itu.
        TA-buy car that
     d. Tini di-beli-kan mobil oleh Karto.
        PASS-buy-BEN car by
        'Tini was bought a car by Karto.'

Example (79) demonstrates that the same is true for locative phrases which have been promoted to object status:

(79) a. Pak Topo datang ke kantor-nya.
        Mr. Sutopo come to office-3
        'Mr. Sutopo came to his office.'
     b. Pak Topo men-datang-i kantor-nya.
        Mr. Sutopo TA-come-LOC office-3
        'Mr. Sutopo came to his office.'
     c. Kantor-nya di-datang-i oleh Pak Topo.
        office-3 PASS-come-LOC by Mr. Sutopo
        '??His office was come to by Mr. Sutopo.' = 'Mr. Topo came to his office.'

The locative phrase ke kantornya 'to his office' takes the form of a prepositional phrase in the underived sentence (79a) but appears as a simple noun phrase without a preposition in sentence (79b). This latter sentence is transitive (witness the verb morphology), having acquired the locative kantornya 'his office' as a direct object. Consequently, this phrase can unproblematically become the subject of the passive construction illustrated in example (79c). The class of direct objects in Indonesian is, therefore, expanded relative to the transparent state of German as well as relative to English. Interestingly, we are again dealing with a relationship of proper subsets between these languages, where the class of direct objects in German is properly included in that of English, which is in turn included in that of Indonesian.
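The subset relationship just described can be written down as a short sketch. The labels are ad hoc names tied to example numbers (75) through (79), and assigning each object type wholesale to a language is a deliberate simplification of what the text states as a tendency (English benefactives, for instance, only "frequently" resist the passive), so the sets should be read as an illustration rather than as data.

```python
# Object types that behave as passivizable direct objects, per language.
# Labels are hypothetical shorthand for the examples discussed in the text.
GERMAN = {"patient (75)"}
ENGLISH = GERMAN | {"dative-type (76)", "genitive-type (77)"}
INDONESIAN = ENGLISH | {"benefactive (78)", "promoted locative (79)"}

# Proper inclusion: German within English within Indonesian.
assert GERMAN < ENGLISH < INDONESIAN

# Object types that Indonesian admits beyond English.
indonesian_only = sorted(INDONESIAN - ENGLISH)
print(indonesian_only)
```

The chained comparison uses Python's proper-subset operator, so it would fail if any two of the three classes were identical.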

3.3.3.2. Subjects

The semantic generality found with objects in Indonesian largely carries over to subjects as well. However, the range of subjects turns out to be more restrictive in this language than in English because of syntactic reasons (cf. the discussion below). Sentence (80b) is from Matra (August 1990: 146) and sentence (83) from Echols - Shadily (1989: 1):

(80) a. Dadas sedang men-ulis surat.   [AGENT]
        PROGRESSIVE TA-write letter
        'Dadas is writing a letter.'
     b. Si Pembelot mem-bangun masjid.
        DET traitor TA-build mosque
        'The traitor builds a mosque.'

(81) a. Kunci ini bisa mem-buka pintu belakang.   [INSTRUMENTAL]
        key this can TA-open door back
        'This key opens the back door.'
     b. Waktu itu se-puluh rupiah bisa mem-beli se-buah nangka.
        time that one-ten rupiah can TA-buy one-fruit nangka
        'At that time, ten rupiah could buy a Nangka fruit.'

(82) a. Kotak ini berisi baju anak-anak.   [LOCATIVE]
        box this contain clothes child-REDUP
        'This box contains children's clothes.'
     b. Mobil ini bisa me-muat lima orang.
        car this can TA-load five people
        'This car seats five people.'
     c. Pulau itu sering meng-alam-i gempa bumi.
        island that often TA-experience shake earth
        'That island frequently experiences earthquakes.'
     d. Hotel ini me-larang tamu-nya untuk mem-bawa anjing.   [LOCATIVE/AGENT]
        hotel this TA-forbid guest-3 for TA-carry dog
        'This hotel forbids its guests to bring dogs.'

(83) Tugu itu meng-abad-i-kan jasa-nya.   [THEME]
     monument that TA-century-ITERATIVE-TRANS service-3
     'The monument immortalizes his service.'

(84) a. Saya sudah meng-alam-i gempa bumi.   [EXPERIENCER]
        I already TA-experience shake earth
        'I have experienced an earthquake.'
     b. Saya senang buku itu.
        I like book that
        'I like that book.'

(85) Bulan September akan jadi dingin sekali.   [TEMPORAL]
     month September will become cold very
     'September will be very cold.'

(86) Edisi terachir dari buku ini ber-tambah dua pasal.
     edition last from book this INT-increase two chapter
     'The last edition of this book has increased (by) two chapters.'

Indonesian naturally allows all of the semantic roles illustrated in examples (80) through (85), and even in (86), to access subjecthood in these sentences, including those instances where a language with greater surface semantic transparency like German would require or prefer to express the semantic relations in an adpositional phrase. So far, Indonesian behaves like English once again. However, it is also evident from examples like (87) that not all English subjects match an Indonesian subject:

(87) *Kemah ini tidur empat orang.
     tent this sleep four people
     'This tent sleeps four.'

The exceptionality of sentence (87) finds an explanation on independent syntactic grounds: Verbs in Indonesian, as has been pointed out above, are regularly marked for their status with respect to their syntactic argument structure, and tidur 'sleep' is unmistakably an intransitive verb. Whereas it is possible for a language like English, which lacks such marking, to easily change the transitivity of a verb, the same cannot be done in Indonesian. In fact, the collocation 'sleep somebody', Indonesian men-idur-kan, could only be interpreted as a causative:

(88) Kemah ini men-idur-kan lima orang.
     tent this TA-sleep-CAUS five people
     'This tent put five people to sleep.'

Hence, (87) is ruled out because of the intransitivity of the verb form used. For the same reason, the so-called ergative verbs of English, i.e., verbs like open, which occur both as transitives with agentive subjects and as intransitives with objective subjects, must be expressed by different forms in Indonesian:

(89) a. Hasan mem-buka pintu itu.
        TA-open door that
        'Hassan opened that door.'
     b. Pintu itu ter-buka.17
        door that open
        'The door opened.'
     c. *Pintu itu mem-buka.
        door that TA-open
        'The door opened.'

The verb mem-buka is unequivocally transitive, hence the ungrammaticality of (89c). Notice that non-agentive subjects do occur with the transitive active prefix meng-, as sentences (81) through (86) have shown. So the ungrammaticality of (87) really must be attributed to the intransitivity of the verb.

3.4. High semantic transparency in Indonesian

In this chapter, we have shown so far that Indonesian consistently patterns like English on the typological continuum with respect to its stringent word order restrictions, the lack of case marking, and the semantic diversity of its basic grammatical relations. Surprisingly, though, Indonesian takes a position much closer to the transparent end of the scale with respect to other properties. This split could represent a crucial violation of the framework, but the systematicity of the apparent violations leads to a solution which yields further insight into the nature of the Semantic Typology. The apparently exceptional properties, namely various raising processes and WH-extraction, all involve syntactic movement. Indonesian should exhibit considerable freedom in all of these properties if it were to be unequivocally analyzed as a highly grammaticizing language. Moreover, the permitted movement rules in this language should be systematically related to those of English if a strict version of the scalar model of the typology is correct. This section examines to what extent English and Indonesian are indeed related in this way.

3.4.1. Raising

Given the overall thrust of the argument in section 3.3, it may come as something of a surprise that raising is a rather restricted phenomenon in Indonesian. For one thing, an investigation of the verbs typically associated with subject-to-subject raising appears to indicate that Indonesian does not make use of this process at all (but cf. the discussion on the Indonesian "Tough" construction below). To express the standard examples with seem, the language resorts to sentential adverbs like sepertinya 'like, as, apparently', rupanya 'apparently', or kelihatannya 'it looks like, apparently':

(90) a. Sepertinya dia senang gadis itu.
        apparently he like girl that
        'Apparently he likes that girl.' = 'He seems to like that girl.'
     b. Rupanya dia senang gadis itu.
     c. Kelihatannya dia senang gadis itu.

Like other adverbs in Indonesian (Macdonald 1967: 123), they can occur in sentence-initial position or between the subject and the predicate, but sentence-final positioning is avoided:18

(91) a. Ia kelihatannya men-dapat keuntungan.
        he seemingly TA-get advantage
        'He seems to have an advantage.'
     b. *Dia senang gadis itu kelihatannya.
        he like girl that seemingly
        'He seems to like that girl.'

By contrast, Chung (1976b: 63-66) demonstrates that subject-to-object raising exists in Indonesian. In fact, this type of raising is quite productive with control verbs like kira 'think', sangka 'believe', hendak 'want', anggap 'believe, consider', etc. Still, the corresponding constructions with the subordinating complementizers supaya 'so that' or bahwa 'that' are preferred by many speakers:

(92) a. Saya meng-hendak-i Budiman untuk mem-baca buku ini.
        I TA-want-LOC for TA-read book this
        'I want Budiman to read this book.'
     b. Saya meng-hendak-i supaya Budiman mem-baca buku ini.
        I TA-want-LOC so that TA-read book this
        'I want that Budiman reads this book.'

Subject-to-object raising is, thus, attested in Indonesian. The existence of object-to-subject raising is questionable, though. Chung (1976b: 69-74) first cited some examples which look as if they are instances of such "Tough" movement. The relevant sentences are presented in (93):19

(93) a. Mobil ini sulit untuk di-per-baik-i (oleh) kami.
        car this hard for PASS-CAUS-good-LOC by we
        'This car is hard to be repaired by us.'
        'This car is hard for us to repair.'
     b. Film ini sulit untuk di-lewat-kan.
        film this difficult for PASS-go past-TRANS
        'This film is difficult to pass up.'
     c. Roti ini baik untuk di-potong dengan pisau.
        bread this good for PASS-cut with knife
        'This bread is good/easy to cut with a knife.'

However, there is a serious problem with this analysis. Contrary to what we would expect for "Tough" movement, these sentences require obligatory passivization in the dependent clause. The original object must, hence, be promoted to subject before any raising takes place, so that the raising process involved in sentences like (93) is subject-to-subject raising, rather than the direct object-to-subject raising that is typical of "Tough" movement. This is probably why Chung (1976b: 69-74) calls this construction Derived Subject Raising. The construction is restricted to a small set of trigger predicates in addition to the fact, observed above, that no other instances of subject-to-subject raising exist in Indonesian. The extent of the applicability of raising constructions, therefore, is much more tightly restricted than the more liberal range found in English. In particular, the grammaticality of sentences like (93) deteriorates rapidly as one tries to introduce other triggering predicates:

(94) Buku ini sulit/mudah/?men-arik/*mem-bosan-kan untuk di-baca.
     book this hard/easy/interesting/boring for PASS-read
     'This book is difficult/easy/interesting/boring to read.'

(95) *Computer ini mahal untuk di-beli.
     computer this expensive for PASS-buy
     'This computer is expensive to buy.'

All other things being equal, the most straightforward analysis is, therefore, that only subject-to-object raising and subject-to-subject raising exist in Indonesian. Since the raised noun phrases cannot be interpreted as arguments of the surface predicate that they stand with, they are true instances of argument trespassing constructions. Clearly, though, there are several factors which severely delimit the impact of raising as a creator of such trespassing. These are the total absence of object-to-subject raising; the preferred replacement of subject-to-object raising structures by dependent clauses without raising; the total absence of subject-to-subject raising except in "Tough" sentences; and the limited number of triggers for both subject-to-object and subject-to-subject raising. We can conclude that raising processes do exist in Indonesian and account for some loss of semantic transparency, but that their effect in this regard is rather more limited than in a language like English, with which Indonesian patterned so closely when linguistic properties other than argument trespassing were considered.

3.4.2. Extractions

Chung (1976a: 50, 56, note 9, and 68) was the first to note the existence of WH-movement in Indonesian, albeit without further elaborating on its functioning within the grammar of the language. Parallel to our observations with regard to raising processes, this section shows that WH-movement is also subject to stringent syntactic restrictions. Specifically, it is constrained by a Subjects Only Constraint, which stipulates that only subjects and subjects of subjects, i.e., possessive noun phrases, can be extracted. Nonsubject noun phrases must first become subjects, for instance via passivization or via raising, to become accessible to extraction processes.

The restriction of extraction to subjects can be shown by the topicalization of arguments. In sentence (96a), it is possible to topicalize Siti from the subject noun phrase, yielding sentence (96b), but the direct topicalization of the object kue 'cake' is blocked by the Subjects Only Constraint, accounting for the ungrammaticality of (96c):

(96) a. Teman-nya Siti mau mem-bikin kue.
        friend-3 want TA-make cake
        'Siti's friend wants to bake a cake.'
     b. Siti, teman-nya mau mem-bikin kue.
        friend-3 want TA-make cake
        'As for Siti, her friend wants to bake a cake.'
     c. *Kue itu, teman-nya Siti mau mem-bikin.
        cake that friend-3 want TA-make
        'That cake Siti's friend wants to bake.'

Example (97) exemplifies the process of WH-movement in Indonesian. In a striking parallel to the topicalization case, it again turns out that, whereas extraction of subjects is possible, cf. (97a), the extraction of an object renders (97b) ungrammatical unless the sentence is first passivized, cf. sentence (97c):

(97) a. Siapa yang mau mem-bikin kue?
        who that want TA-make cake
        'Who wants to bake a cake?'
     b. *Apa yang teman-nya Siti mau mem-bikin?
        what that friend-3 want TA-make
        'What does Siti's friend want to make?'
     c. Apa yang mau di-bikin oleh teman-nya Siti?
        what that want PASS-make by friend-3
        'What wants to be made by Siti's friend?' = 'What does Siti's friend want to make?'

Indonesian allows long-distance extraction with a few bridge verbs such as pikir 'think' and kira 'think, guess':20

(98) Siapa-kah yang kau-pikir yang [ x ] mau mem-bikin kue?
     who-Q that you-think that want TA-make cake
     'Who is thought by you that wants to bake a cake?' = 'Who do you think wants to bake a cake?'

(99) Apa yang kau-kira yang [ x ] mau Karto beli [ t ]?
     what that you-guess that want buy
     'What is thought by you that is wanted to be bought by Karto?' = 'What do you think that Karto wants to buy?'

These examples indicate that such extractions are possible out of object complements.21 Notice, however, that whenever a non-basic subject is moved, passive must obligatorily apply to satisfy the requirement of the Subjects Only Constraint that only noun phrases which are subjects at some level are extractable. The underlying object in (99) is, therefore, first promoted to subjecthood before it is moved out of the clause. (100) and (101) give the relevant examples for extraction from two-place predicates and three-place predicates, respectively:

(100) Apa yang mau di-coba Budi untuk di-curi?
      what that want PASS-try for PASS-steal
      'What did Budi want to try to steal?'

(101) ?Apa yang kau-suruh Budi untuk di-bikin?
      what that you-order for PASS-make
      'What was Budi asked by you to be made?' = 'What did you ask Budi to make?'

The bridge verbs pikir and kira usually require a complement clause which is introduced by bahwa 'that', as in (102a). However, they permit extraction only when all complementizer positions that movement occurs across are filled by yang. Extraction out of any other complements is not allowed, whether the embedded clause is passivized or not. Sentences (102b) and (102c) are ungrammatical because the complement clause from which the moved element originated has the complementizer bahwa 'that'. Notice that sentence (98) is identical to (102b) except for the occurrence of yang in place of bahwa, but unlike (102b), sentence (98) is grammatical.

(102) a. Kau-pikir bahwa Siti mau mem-bikin kue.
         you-think that want TA-make cake
         'You think that Siti wants to bake a cake.'
      b. *Siapa-kah yang kau-pikir bahwa [ x ] mau mem-bikin kue?
         who-Q that you-think that want TA-make cake
         'Who do you think [that] wants to bake a cake?'
      c. *Apa yang kau-pikir bahwa mau di-bikin oleh Siti?
         what that you-think that want PASS-make by
         'What do you think [that] Siti wants to make?'

Yang clauses (as well as untuk clauses) can be thought of as quasi-infinitivals since they are always subjectless. The obligatory conversion of the "finite" (in the sense that they have a subject) bahwa clauses into quasi-infinitival yang clauses, hence, supports the Extraction Hierarchy introduced in section 1.2 above. Extraction from any positions lower on this Extraction Hierarchy is not allowed. Example (103) shows this for quasi-infinitival purpose clauses. Such clauses are introduced by untuk, and extraction out of them immediately renders them ungrammatical, even if passive has applied in accordance with the Subjects Only Constraint to promote the object of the lower predicate to subjecthood before WH-movement takes place, cf. (103b) and (103c).

(103) a. Ibu-nya pergi ke toko untuk mem-beli udang.
         mother-3 go to store for TA-buy shrimp
         'His mother went to the store in order to buy shrimp.'
      b. *Apa yang ibu-nya pergi ke toko untuk mem-beli?
         what that mother-3 go to store for TA-buy
         'What did his mother go to the store in order to buy?'
      c. *Apa yang ibu-nya pergi ke toko untuk di-beli?
         what that mother-3 go to store for PASS-buy
         'What did his mother go to the store in order to buy?'

This is true even when the untuk clause functions as the subject of the matrix:22

(104) a. Sebelum be-rencana, untuk meng-e-tahu-i dulu medan-nya
         before INT-plan for TA-know-TRANS first field-DEF
         sudah me-makan energi dan waktu.
         already TA-eat energy and time
         'Before you make a plan, getting first to know the field already takes up energy and time.'

Applying passivization within the untuk clause makes medan-nya 'the field' into the subject of a subject, yet the sentence is ungrammatical when this noun phrase is extracted:23

(104) b. *Apa yang [untuk [ x ] di-ke-tahu-i dulu [ t ]]
         what that for PASS-know-TRANS first
         sudah me-makan energi dan waktu.
         already TA-eat energy and time
         'What does getting first to know [ x ] already take up energy and time?'

Neither is extraction possible from finite adverbial clauses, including "finite" purpose clauses introduced by supaya 'so that' or sehingga 'so that' and other adjuncts, regardless of passivization:

(105) a. Ibu-nya datang supaya dia bisa mem-bikin kue.
         mother-3 come so that 3SG can TA-make cake
         'His mother came so that she could bake a cake.'
      b. *Apa yang ibu-nya datang supaya bisa di-bikin?
         what that mother-3 come so that can PASS-make
         'What did his mother come so that she could make?'

(106) a. Siti pulang karena ibu-nya mau mem-bikin kue.
         go home because mother-3 want TA-make cake
         'Siti went home because her mother wants to bake a cake.'
      b. *Apa-kah yang Siti pulang karena mau di-bikin oleh ibu-nya?
         what-Q that go home because want PASS-make by mother-3
         'What did Siti go home because her mother wants to make?'


The Indonesian extraction facts can be summed up as follows: Extraction is possible only on a continuous segment of the Extraction Hierarchy (Hawkins 1986: 87; cf. section 1.2 above). It is permitted from quasi-infinitival complements of two-place predicates and, with a somewhat reduced acceptability, of three-place predicates. Certain bridge verbs allow extraction from "finite" object complement clauses, but only if those complements are converted to quasi-infinitival yang clauses. Finally, extraction from adverbial clauses, whether quasi-infinitival or not, is ruled out. Even in those environments which permit extraction, the grammar of Indonesian imposes a strict condition on extraction, the Subjects Only Constraint, which requires that only subjects may be extracted.
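The summary above amounts to a small decision procedure, which the following sketch spells out. It is only a toy model under stated assumptions: the class name, the field names, and the clause-type labels are invented for this illustration, the reduced acceptability of extraction with three-place predicates is not represented, and extraction of a possessor from within a subject, as in (96b), is not modeled separately.

```python
from dataclasses import dataclass
from typing import Optional

# Complementizers that introduce subjectless, quasi-infinitival clauses.
QUASI_INFINITIVAL = {"yang", "untuk"}


@dataclass(frozen=True)
class Extraction:
    """One attempted extraction (hypothetical encoding for illustration)."""
    moved_is_subject: bool         # subject of its clause, basic or derived by passive
    clause_type: str               # "main", "object_complement", "subject_clause", "adverbial"
    complementizer: Optional[str]  # e.g. "yang", "untuk", "bahwa", "supaya"; None for main clauses


def is_extractable(e: Extraction) -> bool:
    # Subjects Only Constraint: nothing but subjects may move.
    if not e.moved_is_subject:
        return False
    if e.clause_type == "main":
        return True
    # Object complements allow extraction only if quasi-infinitival.
    if e.clause_type == "object_complement":
        return e.complementizer in QUASI_INFINITIVAL
    # Subject clauses and adverbial clauses block extraction altogether.
    return False


if __name__ == "__main__":
    cases = {
        "(97a)": Extraction(True, "main", None),
        "(97b)": Extraction(False, "main", None),
        "(98)": Extraction(True, "object_complement", "yang"),
        "(102b)": Extraction(True, "object_complement", "bahwa"),
        "(103c)": Extraction(True, "adverbial", "untuk"),
        "(104b)": Extraction(True, "subject_clause", "untuk"),
    }
    for label, case in cases.items():
        print(label, is_extractable(case))
```

The order of the checks mirrors the text: the Subjects Only Constraint applies everywhere, and the Extraction Hierarchy then decides which embedded environments are accessible at all.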

3.4.3. Discussion

The prohibition against the extraction of objects has pervasive consequences for the grammar of Indonesian since it explains at the same time the ungrammaticality of object topicalization and of object WH-movement, including the exceptional status of the language in the Keenan - Comrie Accessibility Hierarchy to relativization (Müller-Gotama 1987). Moreover, this constraint also explains the curious status of the Indonesian "Tough" construction (cf. section 3.4.1 above): First, the Subjects Only Constraint is a constraint on movement, so it only applies when movement takes place. The fact that it is operative in the Indonesian "Tough" construction therefore shows that such sentences indeed involve movement. Secondly, the constraint explains why the Indonesian "Tough" construction requires passivization in the embedded clause with subsequent subject-to-subject raising since it prohibits objects from being raised, i.e., moved, directly. To conclude, the Subjects Only Constraint is a pervasive syntactic constraint in Indonesian. It limits access to all movement processes to subjects and is operative throughout the grammar.

3.5. Indonesian in the Semantic Typology

We have noted in the preceding sections that raising and extractions in Indonesian are subject to severe constraints, particularly the pervasive Subjects Only Constraint. Given the tight constraints on extractions and raising, these grammatical processes are much more limited than one would
have expected based on the low semantic transparency of the language with respect to all properties apart from movement. The location of Indonesian in the overall typology, thus, is split between a close affinity with the transparent type in all grammatical properties which involve movement and with the grammaticizing type in all other cases. This peculiar split of properties can be viewed as a direct result of the limitation of extraction to subjects. Since the extractability of a noun phrase is essential for a great many grammatical processes - including relativization, the formation of WH-questions, the "Tough" construction, and topicalization - it is extremely important for a noun phrase to gain access to extraction. For agentive noun phrases, this access is unproblematic since they are prototypical subjects. For subcategorized objects of verbs, accessibility is guaranteed by means of passivization; the unified treatment of all objects, contrary to the differentiation made in German, exploits this strategy to the largest possible extent. But what of oblique noun phrases? Indonesian offers them two solutions: First, some obliques have the opportunity to become objects through the suffixation of the verb with -kan or -i, which makes them eligible for promotion to subjecthood through passivization. Secondly, Indonesian applies a more direct solution for oblique noun phrases by loosening semantic restrictions on subjects, thus making it possible for at least some noun phrases with oblique roles to become subjects directly (cf. section 1.3.3.1 above.) Far from being a possible but accidental collocation of properties, the regularities we observed in Indonesian, thus, follow logically from one another. In particular, it seems to be the case that the restriction of extraction to only one grammatical relation, the subject, is a major cause for the liberal, English-like properties in other areas of the grammar. 
Pending the cross-linguistic confirmation of this collocation through a large scale typological survey and a comparative study of closely related languages, like the strongly subject-prominent Malagasy as well as the Philippine languages (Keenan 1976), it is therefore possible to state the following implication:

(107) If a language restricts extraction to only one grammatical relation, it will compensate for this constraint by creating strategies which allow other noun phrases to access this same grammatical relation.

Regarding the movement hierarchy introduced in section 1.2, a clear cross-linguistic pattern also begins to emerge now, as shown in Table 6,
which sums up the extraction facts of Indonesian and the four other languages discussed so far:

Table 6: Extraction possibilities in 5 languages

Extraction environment

K'

infinitive object complement of 2-place predicate infinitive non-subject complement of 3-place predicate finite non-subject complement infinitive adverbial clause finite adverbial clause

-

R

G

I

+ + + + +/? + - -/? xb - - - - + /

a. K = Korean, R = Russian, G = German, I = Indonesian, E = English.
b. x = after conversion to quasi-infinitival.

E + + + ?

4. The cross-linguistic survey

This chapter puts the Semantic Typology developed in Chapter 1 and elaborated on in Chapters 2 and 3 to the test in an empirical survey. Our purpose is to establish that there is indeed a strong tendency among the world's languages to conform to the predictions made by the typology and to determine the relationship between a language's position in the Semantic Typology and other typological indicators, particularly word order. We will not be concerned with trying to establish the absolute frequency of language types since frequency may reflect large areal groupings more than typological factors (Dryer 1989).

4.1. The sample

4.1.1. Criteria for sample construction

The issue of what constitutes proper sampling procedure in linguistic typology has not received a conclusive answer in the literature. It is clear, as Mallinson and Blake (1981: 15) state, that the goal must be a sample which is "representative of human language", but the diversity of sample sizes and compositions found in actual typological studies amply demonstrates that it is much less clear what constitutes such representativeness. It is, therefore, important for us to determine the criteria needed to establish an adequate sample for this work.

In the past, the focus of many typological studies has been to determine the frequency of various language types, and as a result of this, it was deemed desirable to use a sample which ideally would represent all areal, genetic and structural language groups proportionately to their total numbers. For instance, Bell (1978) proposes a sample consisting of a fixed percentage of the languages in each language family and linguistic area, a procedure actually implemented by Tomlin (1986) to determine the frequencies of the basic word order types. But Dryer (1989: 257-267) has argued convincingly that the frequency of a particular type is by itself not sufficient to establish a linguistic preference for (or against) that type because frequency is determined in large part by non-linguistic historical factors, primarily the distorting effects of large language families. Neither can the distortions caused by large groups of languages be eliminated by
constructing a sample of genetically and areally independent languages because too few such truly independent languages exist. These facts pointed out by Dryer present an obvious dilemma for studies which seek to establish linguistic preference; however, they do not necessarily represent an obstacle for other approaches to language typology. The construction of a typological sample is undoubtedly subject to certain universally valid conditions, such as the avoidance of extreme bias and skewing in favor of a single genetic, areal, or structural group, but the proportional representation of these groupings is not one of them, and the goals of one's study significantly influence the construction of the kind of sample that is necessary to achieve them. In this work, we are not trying to determine the cross-linguistic frequencies of, or preferences for different language types, however well motivated such preferences may be; rather, we are trying to establish the typological validity of the scalar model of the Semantic Typology per se. This model does not even claim to describe discrete language "types" whose frequency could be measured. Instead, it defines a linguistic continuum, as we have shown in Chapter 1. To achieve this goal, our focus in selecting the sample should be to attempt to discover (i) the extent of variation among languages and (ii) minimal differences between languages that are positioned close to each other on the typological continuum defined in Chapter 1. Of course, criterion (i) could only be fully satisfied if all languages were investigated, a task which is obviously beyond our capabilities. Barring that, a promising heuristic for achieving broad typological coverage seems to be to include languages from a large number of areal, genetic, and structural groups selected at random, as suggested by Comrie (1989a: 10-12). 
This approach maximizes the likelihood that a wide empirical spread is included and therefore ensures adequate typological breadth of coverage. The most promising approach to identifying minimally different languages according to criterion (ii) is to study and contrast languages which are closely related to one another genetically or typologically since closely related languages are most likely to be similarly structured. In other words, this criterion requires that we do the exact opposite of Dryer's recommendations in that we may need to consciously overrepresent small groups of languages. Proportional representation, on the other hand, is not relevant to either of our two criteria. Furthermore, it appears necessary to balance our cross-linguistic survey by introducing some lesser known languages for which few published materials exist. Our survey is necessarily skewed in favor of well-studied
languages of wider use because of the amount and the detail of the data needed to include a language meaningfully in our comparative study of the Semantic Typology. For smaller languages, this range of data is often unavailable in the literature, and the data we can obtain may be less reliable since often only one or two publications are all that exists on a language. In addition, language consultants on the more widely used languages are generally easier to come by to fill in the gaps in the data by performing some field work. Despite these problems with less documented languages, we feel it is necessary to include at least a few in this study in order to ensure that our understanding of the Semantic Typology is not distorted by the recent worldwide influence of a few dominating, mostly European languages, since such lesser known languages are also typically spoken in areas that are furthest away from the modern cultural and political centers where the influence of the dominating international languages is most strongly felt. The reality of this influence has been documented by Becker and Wirasno (1980) for Indonesian. 
Using data from grammatical "mistakes" made by native speakers, Becker and Wirasno show that several syntactic changes are occurring in Indonesian: First of all, an overt copula is introduced in previously verbless sentences; secondly, the definite-indefinite number system, which marked indefinite plurals by reduplication and definite plurals with a classifier, becomes a simple singular-plural system which uses reduplication to mark plural and abandons the use of classifiers; thirdly, an overt possessive marker -nya is used in place of the previously not overtly marked genitive relationship; fourthly, the focus-marking verb prefixes are dropped or reinterpreted as voice markers; and finally, relative clause construction is revamped totally by making non-focussed participants accessible to relativization and by the replacement of the invariable relative marker yang with question words (dari mana 'from where', kepada siapa 'to whom', etc.). These ongoing syntactic changes affect disparate constructions in many parts of the grammar. What they all have in common is that each of them makes the structure of Indonesian more like English or Dutch. Becker and Wirasno see the major motivation for the changes in the attempt to make Indonesian more like these Western prestige languages and correctly stress (Becker and Wirasno 1980: 100) that "the prestige of Western conceptual systems is high in modern Indonesia, and the kind of texts and text strategies one produces in them are also prestigious." The difference between the Standard Indonesian described in Chapter 3 above and Betawi Malay, the colloquial variety of the language spoken in
Jakarta, may also be suggestive of syntactic change that was triggered by such influence from Western prestige languages. Ikranagara (1980: 49) states categorically that there are no instrumental subjects in Betawi Malay and claims that active verbs instead choose their subjects strictly according to a (semantically based) subject choice hierarchy: If there is an agent, it must become the subject; if there is none, the verb takes a dative subject. A patient noun phrase can be the subject of an intransitive verb (Ikranagara 1980: 31). These tight restrictions on subjecthood in Betawi Malay contrast with the wider semantic range we determined for Standard Indonesian in Chapter 3.24 It is therefore essential that our sample include data from lesser studied languages even at the risk that these data may not be as reliable as that from the well known languages of wider communication. Overall, our sample should, thus, comprise languages from a variety of typological, areal, and genetic groupings, and include clusters of languages which are closely affiliated with one another with respect to any or all of these measures as well as several lesser studied languages.
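The subject choice hierarchy reported for Betawi Malay lends itself to a procedural restatement. The Python sketch below is a hedged illustration, not Ikranagara's formalization: the function name `choose_subject`, the role labels, and in particular the decision to rank a dative above a patient with intransitive verbs are assumptions made for the sake of the example.

```python
def choose_subject(roles, transitive):
    """Pick the subject role of an active verb under the hierarchy described above.

    `roles` is the set of semantic roles borne by the verb's participants.
    Returns the role that becomes subject, or None where the sketch predicts no subject.
    """
    if "agent" in roles:
        return "agent"      # an agent, if present, must become the subject
    if "dative" in roles:
        return "dative"     # without an agent, the verb takes a dative subject
    if not transitive and "patient" in roles:
        return "patient"    # patients qualify only with intransitive verbs
    return None             # instruments, for example, never become subjects

print(choose_subject({"agent", "patient"}, transitive=True))       # agent
print(choose_subject({"instrument", "patient"}, transitive=True))  # None
print(choose_subject({"patient"}, transitive=False))               # patient
```

The second call corresponds to the absence of instrumental subjects: where Standard Indonesian would admit an instrument as subject, the sketch returns no candidate at all.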

4.1.2. The sample used in this study

Based on the criteria developed in the preceding section, a set of 15 languages has been selected to be used in this work. Although we have striven to collect a complete data set for each language, information on some properties is clearly more available than on others. For example, almost all grammars will discuss the basic word order properties of the language, or, at the very least, the examples given should make a good part of this information retrievable. Word order data could, therefore, be included for all the languages in the sample. The permitted semantic range of the grammatical relations in a language, on the other hand, is more rarely discussed in the literature; as a result, this information is often unavailable, and where we were able to provide it in the sample, it is, with a few exceptions, based on primary field work. In accordance with criterion (i), the sample includes a wide areal spread. Four of the fifteen languages are European; three are African or Near Eastern languages; six languages are from Asia and Oceania; and two languages are spoken in the Americas. Because of a lack of adequate data sources and/or language consultants, no information on Australian or Papuan languages has been included.


This areal spread is counterbalanced in our sample by several clusters of languages that have experienced close areal contact, either recently or in the past. One such cluster consists of Chinese, Korean, and Japanese, with the well known long-standing historical influence of Chinese on the latter two and the more recent impact of Japanese on Korean. Another cluster exists among the European languages in the sample. Table 7 shows the areal makeup of the sample:

Table 7: Areal makeup of the sample

European languages: Dutch, English, German, Russian
African and Near Eastern languages: Babungo, Hebrew, Turkish
Asian and Oceanic languages: Chinese, Indonesian, Japanese, Korean, Malayalam, Sawu
American languages: Hixkaryana, Jacaltec

Genetically, the languages in the sample represent at least nine language families.25 Since the classification of Korean, Japanese, and Turkish into a single Altaic family is moreover quite doubtful (Comrie 1987: 7-9), these languages should be treated as genetically independent. If so, the number of independent genetic groups in our sample increases to eleven. Table 8 lists the languages in the sample according to their genetic affiliation:

Table 8: Genetic makeup of the sample

Afro-Asiatic: Hebrew
"Altaic": Japanese, Korean, Turkish
Austronesian: Indonesian, Sawu
Dravidian: Malayalam
Ge-Pano-Carib: Hixkaryana
Indo-European: Dutch, English, German, Russian
Niger-Kordofanian: Babungo
Penutian: Jacaltec
Sino-Tibetan: Chinese


Typologically, the sample includes all three dominant basic word order types approximately in proportion to their actual frequency (6 SVO languages, 6 SOV languages, and 2 VSO languages) as well as one minor type, with Hixkaryana representing the OVS type. It includes case-marking as well as non case-marking languages, including both nominative-accusative and ergative-absolutive languages, languages with relatively strict and with relatively free sentence internal word orders, as well as head-marking and dependent-marking languages. In their morphological structure, both languages with a rich inflectional morphology and languages which use morphological means sparingly are represented. Table 9 lists the distribution of the basic word order types in the sample:26

Table 9: Basic word order types in the sample

SVO languages: Babungo, Chinese, English, Indonesian, Russian, Sawu
SOV languages: Dutch, German, Japanese, Korean, Malayalam, Turkish
VSO languages: Hebrew, Jacaltec
OVS languages: Hixkaryana

Table 10 presents an overview of case marking types represented in the sample:27

Table 10: Types and form of case marking in the sample

Type of case-marking              Language

Nominative-accusative
  no overt case                   Babungo, Chinese, Indonesian, Hebrew
  overt case in pronouns only     Dutch, English
  head-marking                    Hixkaryana
  morphological/particles         Japanese, Korean
  morphological                   German, Malayalam, Russian, Turkish

Ergative-absolutive
  head-marking                    Jacaltec
  particles                       Sawu


Our sample, thus, covers a sufficiently wide areal, genetic, and typological range to ensure that criterion (i) is satisfied, as well as clusters of closely related languages in accordance with criterion (ii), particularly the European languages included here.
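The composition of the sample can be cross-checked mechanically. The following Python sketch transcribes the area, family, and basic word order of each language from Tables 7 to 9 and recomputes the totals cited in the text. The tuple layout and the variable names are illustrative only, and "Altaic" is kept as a single label here simply to mirror Table 8.

```python
from collections import Counter

# (language, area, family, basic word order), transcribed from Tables 7-9.
SAMPLE = [
    ("Dutch", "European", "Indo-European", "SOV"),
    ("English", "European", "Indo-European", "SVO"),
    ("German", "European", "Indo-European", "SOV"),
    ("Russian", "European", "Indo-European", "SVO"),
    ("Babungo", "African and Near Eastern", "Niger-Kordofanian", "SVO"),
    ("Hebrew", "African and Near Eastern", "Afro-Asiatic", "VSO"),
    ("Turkish", "African and Near Eastern", "Altaic", "SOV"),
    ("Chinese", "Asian and Oceanic", "Sino-Tibetan", "SVO"),
    ("Indonesian", "Asian and Oceanic", "Austronesian", "SVO"),
    ("Japanese", "Asian and Oceanic", "Altaic", "SOV"),
    ("Korean", "Asian and Oceanic", "Altaic", "SOV"),
    ("Malayalam", "Asian and Oceanic", "Dravidian", "SOV"),
    ("Sawu", "Asian and Oceanic", "Austronesian", "SVO"),
    ("Hixkaryana", "American", "Ge-Pano-Carib", "OVS"),
    ("Jacaltec", "American", "Penutian", "VSO"),
]

areas = Counter(area for _, area, _, _ in SAMPLE)
orders = Counter(order for _, _, _, order in SAMPLE)
families = {family for _, _, family, _ in SAMPLE}

# Treating Japanese, Korean, and Turkish as independent replaces the single
# "Altaic" label with three separate groups.
altaic_members = sum(1 for _, _, family, _ in SAMPLE if family == "Altaic")
independent_groups = len(families) - 1 + altaic_members

print(len(SAMPLE), len(families), independent_groups)                  # 15 9 11
print(orders["SVO"], orders["SOV"], orders["VSO"], orders["OVS"])      # 6 6 2 1
print(areas["European"], areas["African and Near Eastern"],
      areas["Asian and Oceanic"], areas["American"])                   # 4 3 6 2
```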

4.2. Left-branching versus right-branching

There are several factors which suggest that the position of a language in the Semantic Typology correlates with its basic branching direction. Some evidence for this asymmetry comes from the alignment of the five languages considered so far in this study. The two basic SOV languages, Korean and German, were both located on the semantically transparent part of the typological continuum; in fact, Korean has been determined to have the highest semantic transparency overall (cf. Chapter 2), and it is also the only consistently left-branching language considered so far, while Hawkins (1986) has demonstrated the consistently greater overall semantic transparency of German over English.28 Left-branching may, therefore, correlate with a higher degree of semantic transparency. In contrast to this, the three right-branching languages in our initial sample, English, Russian, and Indonesian, allow no generalization as to their position within the Semantic Typology since these three languages alone cover the whole continuum: Chapter 1 above has identified English as close to the properties of the grammaticizing extreme, while Comrie (1986, 1989b) shows Russian to approximate the transparency found in Korean. Indonesian, finally, displayed a split between high and low transparency features.

A second piece of evidence for a correlation between semantic transparency and branching direction comes from the difference in processing strategies posited for head-initial and head-final languages by Frazier and Rayner (1988: 264) and, more directly, Hawkins (1990: 229). These researchers have claimed that right-branching languages utilize a top-down processing strategy as opposed to left-branching languages, which are said to utilize bottom-up processing.
Hawkins derives this distinction from the different points at which the parser can construct a mother node: In a head-initial structure, the occurrence of a head immediately allows the parser to construct the relevant phrasal category, for example PP upon encountering a preposition, VP upon encountering a verb, etc. In a left-branching language, however, the head that determines the correct phrasal category comes at the end of a phrase, so that a noun phrase, for instance, is
compatible with a number of higher phrase types: It could be the left-branching daughter of a higher PP, NP, or VP. Clearly, this processing difference places different demands on the structures that need to be parsed, with resulting consequences for the grammars of these languages.

Hawkins (1990) has linked this distinction explicitly to the assignment of semantic roles to the various participants in head-initial and in head-final languages. Heads subcategorize for certain sets of modifiers and assign semantic roles to them. Any given head therefore activates a limited number of semantic role frames for the various combinations of modifiers it allows. In head-initial languages, the head occurs at the left periphery of a phrase; crucially, the parser can thus determine early on in the parsing process which phrase type is being processed as well as which modifiers are compatible with this head, and so activate precisely the set of semantic role frames that these modifiers permit. This is completely different in head-final languages. There, the head cannot contribute any information on-line about the syntactic or semantic roles of the various participants until very late in the parsing process. The parser of a left-branching language thus has no information to determine the possible semantic functions of a phrase that is being processed on-line. It is this uncertainty of the parser that compels left-branching languages to prefer a semantically transparent argument structure. The absence of a corresponding parsing uncertainty in head-initial languages accounts for the lack of a comparable pressure in that branching type. For example, upon encountering any noun (in English any noun or determiner) in a head-initial language, the parser can construct a noun phrase and immediately attach it to the preceding head to determine its function as the object of a preposition or a verb or as an [NP, S] (Chomsky 1965: 71).
For each of these participant types, only a limited range of semantic roles is possible, although the precise range of permitted roles may be wider in one language, as in English, or rather narrow in another, as for instance in Russian. The crucial point is that nothing forces grammatical relations to be restricted to their prototypical semantic core content in head-initial languages. In contrast to this, the parser of a head-final language cannot know right away whether a noun phrase that has been constructed is a subject, an object, or an oblique, or even whether that noun phrase will ultimately be a constituent of another noun phrase, of the main clause, or of some embedded clause. Left-branching languages therefore compensate for this uncertainty of the parser by tightly constraining the permitted semantic range of the various grammatical relations around their prototypical core content and by doing without grammatical processes that manipulate the argument structure, that is, by achieving a high level of semantic transparency.

Finally, there is some empirical evidence for the asymmetry between left- and right-branching languages in the typological literature. Greenberg's universal 41 (Greenberg 1963: 96) states that SOV languages almost always have a case system, while the 100-language survey of Mallinson and Blake (1981: 178) confirms that languages which do not mark case overtly tend to be SVO as much as case marking languages tend to be SOV. Since we have argued above that case marking is typical of high semantic transparency, these findings support the claim that OV languages tend to be semantically transparent. In a similar vein, Ultan (1978: 229) found that SOV languages are more likely to leave WH-words in situ than languages of other word order types.29 Finally, Lehmann (1978: 22) has claimed that passive is prominent in SVO languages but not at all in OV languages. Lehmann's generalization has a great many exceptions, not least of all Japanese (cf. section 4.3.2), but if it is true as a general tendency, the absence of this transparency-reducing process in left-branching languages would certainly fit the prediction made here. Furthermore, the Grammatical Relation Manipulation Hierarchy developed in Chapter 1 predicts that passive should develop before processes like WH-extraction and raising, and this prediction may be reflected in the "exceptions" to Lehmann's claim.

Based on this evidence, it is therefore predicted that left-branching languages will generally display a high level of semantic transparency whereas right-branching languages may or may not do so. In other words, a prediction based on branching-direction is possible only for left-branching languages; no general prediction is made for the position of right-branching languages. The latter, thus, may be found at any point on the hierarchy.
Consequently, our cross-linguistic survey in this chapter considers left-branching and right-branching languages in turn in order to demonstrate that this asymmetry indeed holds cross-linguistically and to empirically substantiate the Semantic Typology.
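The processing asymmetry invoked in this section can be made concrete with a toy computation. The Python sketch below is an illustration under strong simplifying assumptions and not an implementation of Hawkins's parser: the two-entry table `PROJECTS`, the function `words_until_mother_node`, and the category labels are all invented for the example. It merely counts how many words must be read before the head, and with it the phrasal category, becomes available.

```python
# Toy mapping from the category of a head word to the phrase it projects.
# The mapping and both sample phrases are illustrative assumptions.
PROJECTS = {"P": "PP", "V": "VP"}

def words_until_mother_node(phrase):
    """Count the words read before the head, and thus the phrasal category, is seen.

    `phrase` is a list of (word, category) pairs in surface order.
    Returns (count, phrasal_category); raises ValueError if no head is found.
    """
    for position, (_word, category) in enumerate(phrase, start=1):
        if category in PROJECTS:
            return position, PROJECTS[category]
    raise ValueError("no head in phrase")

head_initial = [("for", "P"), ("Siita", "N")]      # English-style preposition
head_final = [("siitekki", "N"), ("veendi", "P")]  # Malayalam-style postposition

print(words_until_mother_node(head_initial))  # (1, 'PP')
print(words_until_mother_node(head_final))    # (2, 'PP')
```

In the head-final phrase the noun has already been read before its mother node can be identified, which is the situation said to leave the parser uncertain about the noun phrase's eventual function.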

4.3. Left-branching languages

The first languages to be considered here, namely Malayalam and Japanese, have been selected because of their specific relationships with Korean. Like Korean, both Japanese and Malayalam are consistent head-final languages with great word order variability and a considerable grammatical morphology. It is therefore interesting to see whether these similarities extend to other properties to create an overall typological pattern similar to that of Korean, as predicted by the Semantic Typology, and what, if any, significant differences are found. Japanese has long had extensive areal contact with Korean and may well be genetically related to its mainland neighbor. Furthermore, Japanese is well-known to be typologically close to Korean in a number of ways, and it would, therefore, be a surprising exception if this closeness did not carry over to the posited manifestations of the Semantic Typology. In the case of Malayalam, there is no widely accepted genetic relationship to Korean,30 and the two languages have experienced little areal contact with one another. If Korean and Malayalam in fact show a great number of similarities throughout their grammars, these similarities would, consequently, support our hypothesis that attributes them to typological principles, more precisely to the effects of the Semantic Typology.

4.3.1. Malayalam

4.3.1.1. Word order and case marking

This subsection begins our empirical survey by describing the relevant aspects of the grammar of Malayalam, a Dravidian language rooted in the Southern Indian state of Kerala.31 The pre-nominal occurrence of noun modifiers, the use of postpositions, and a basic SOV word order establish Malayalam as a consistent left-branching language. These fundamental word order properties of Malayalam are illustrated in examples (108), (109), and (110), respectively:

(108) Noun modifier positions:
      ii penni-nte ii rendir veliya naayi-kal
      this woman-GEN this two big dog-PL
      'These two big dogs of this woman'

(109) Postpositions:
      ñaan siitekki veendi pustakam meeticcu.
      I Siita-DAT for book bought
      'I bought a book for Siita.'


(110) Basic SOV order:
      raaman siitaye kandu.
      Raaman Siita-ACC saw
      'Raaman saw Siita.'

Grammatical relations are indicated by means of case marking suffixation, but inanimate nouns must not be case marked as direct objects. This differential object marking (Bossong 1985) can be observed in other Dravidian languages as well. Malayalam permits free scrambling of maximal constituents, resorting to pragmatic clues when the case relations are syntactically ambiguous. The set of examples in (111) demonstrates that all possible permutations of sentence (110) are grammatical, although the verb-initial variants occur only in songs, according to my language consultant, VOS sentences even more rarely so than VSO sentences:

(111) a. siitaye raaman kandu. (OSV)
         Siita-ACC Raaman saw
         'Raaman saw Siita.'
      b. raaman kandu siitaye. (SVO)
         Raaman saw Siita-ACC
      c. siitaye kandu raaman. (OVS)
         Siita-ACC saw Raaman
      d. ?kandu raaman siitaye. (VSO)
         saw Raaman Siita-ACC
      e. ??kandu siitaye raaman. (VOS)
         saw Siita-ACC Raaman

As an inanimate entity, the object viidi- 'house' in example (112) can not be marked with the disambiguating accusative marker, yet the sentence is usually interpreted as 'Siita saw the house', rather than the unlikely 'The house saw Siita', based on the hearer's pragmatic knowledge:

(112) a. viidi siita kandu.
         house Siita saw
         'Siita saw the house.'
      b. ??viidi-ne siita kandu.
         house-ACC Siita saw
         'Siita saw the house.'

The loss of accusative case marking on inanimate noun phrases leads to unresolvable ambiguities only in those rare instances when pragmatic clues fail to clarify the roles of the participants. This occurs when two inanimates are involved. However, the majority of sentences includes at least one human participant. With its obligatory case marking of human participants, the grammar of Malayalam ensures that the bulk of the potential ambiguities is resolved. Nonetheless, the restriction of case markers to animates may represent a historical reduction from the more extensive Proto-Dravidian system, which marked eight distinct cases on animate as well as inanimate nouns and which survives intact in the neighboring Tamil language (Steever 1987: 727 and 736-737).32

4.3.1.2. Grammatical relations

The grammatical relations of Malayalam are characteristically narrow, as we would expect for a language with high semantic transparency. First, the use of the passive is very restricted. Our informant felt that it is most characteristic of written Malayalam, particularly of newspaper style. The passives that do occur are typically agentless, but passives with an oblique agent now appear occasionally. Our informant attributed this, along with the higher frequency of passives in the media, to the influence of Sanskrit and English.

The grammatical relations in Malayalam are semantically narrow in a second sense as well. Only noun phrases which closely approximate the semantic prototype of a particular grammatical relation can be coded in that role. This can be illustrated in a particularly striking way with subjects of transitive verbs. Recall that English and Indonesian permitted a wide range of noun phrases to become subjects even in underived environments, whereas Korean subjects were much more narrowly defined. As predicted by the Semantic Typology, Malayalam patterns much more like Korean here: Any attempt to construct non-agentive subjects induces ungrammaticality, as is shown in examples (113) through (116):


(113) a. *ii urulakizaŋŋigal oru veliya paattiram kari undaakkaam.
         this potatoes one big pot curry make can
         'These potatoes can make one big pot of curry.'
      b. ii urulakizaŋŋigal kondi oru veliya paattiram kari undaakkaam.
         this potatoes with one big pot curry make can
         'With these potatoes, [you] can make a big pot of curry.'

(114) a. *oru rupiya oru pustakam meeticcu.
         one rupee one book bought
         'One rupee bought a book.'
      b. oru rupiya kondi (ñaan) oru pustakam meeticcu.
         one rupee with I one book bought
         'I bought a book for a rupee.'

(115) a. *kutti janavaatili-ne potticcu.
         stick window-ACC broke
         'A stick broke the window.'
      b. [aaroo] janavaatili kutti kondi potticcu.
         someone window stick with broke
         '[Someone] broke the window with a stick.'

(116) a. *innale raatri ende viina oru kayari potti.
         last night my veena one string broke
         'Last night, my veena broke a string.'
      b. innale raatri ende viinayudo kayari potti.
         last night my veena-GEN string broke
         'Last night, my veena's string broke.'

As in other languages, intransitive verbs allow a wider range of semantic roles as their subjects than transitive verbs. This accounts for the grammaticality of sentence (116b) despite the patient role of its subject kayari 'string'. Similarly, the Malayalam verb for 'sell' can function as a transitive or as an intransitive verb, cf. (117a) and (117b), respectively. However,
only the intransitive use of this verb allows the occurrence of a patient subject; hence the ungrammaticality of (117c):

(117) a. raaman pustakam vittu.
         Raaman book sold
         'Raaman sold the book.'
      b. ii pustakam aayiram kanakkili vittu.
         this book thousand quantities-in sold
         'This book sold thousands of copies.'
      c. *ii pustakam aayiram pakarppe vittu.
         this book thousand copy-PL sold
         'This book sold thousands of copies.'

Significantly in light of the discussion in section 2.2.1 above, Malayalam straightforwardly allows sentence (118), thus providing further support that the subject is understood as an agent:

(118) ii hotel naayikkale sammatikunnu.
      this hotel dog-PL-ACC allow
      'This hotel allows dogs.'

The decisive contrast between transitive and intransitive verbs also becomes evident in the different thematic frames for verbs that can be transitivized by means of affixation. The verbs for 'open' and 'close', for example, are basically intransitive but can add the transitivizer -ikk (cf. Mohanan 1982: 132-139). Again, patient subjects can occur with the intransitive forms while they render the sentence ungrammatical when the transitive forms are used:

(119) a. vaatil adayum.
         door close-FUT
         'The door will close.'
      b. *vaatil adakkum.
         door close-TRANS-FUT
         'The door will close.'


It should be noted that (119b) is ungrammatical only if vaatil 'door' is interpreted as the subject. The sentence is perfectly acceptable with an object interpretation of vaatil, due to the possibility of pro-drop, yielding the meaning '[Someone] will close the door'. The same pattern recurs in (120):

(120) a. vaatil torayum.
          door open-FUT
          'The door will open.'
      b. *vaatil torakkum.
          door open-TRANS-FUT
          'The door will open.'
      c. vaatil torakkum.
          door open-TRANS-FUT
          '[Someone] will open the door.'

Since inanimate entities inherently lack the capacity for volitional agency, English sentences with such subjects, like The thunder frightened the children, similarly have no literal equivalent in Malayalam as simple transitive sentences; witness the ungrammaticality of sentence (121a). However, it is possible for idi 'thunder' to occur as the subject of a causative construction, since the causative marker uniquely identifies the semantic function of the inanimate subject as the causer of the action:

(121) a. *idi kuttikale peediccu.
          thunder child-PL-ACC frightened
          'The thunder frightened the children.'
      b. idi kuttikale peedippiccu.
          thunder child-PL-ACC frightened-CAUS
          'The thunder frightened the children.'

Experiencer subjects are coded in the dative case in certain expressions:

(122) a. enikkɨ sukhamilla.
          I-DAT well-NEG
          'I am not well./I am sick.'
      b. enikki cuudi edukkunnu.
          I-DAT hot feel-PRES
          'I feel hot.'
      c. enikkɨ siiteene ariyaam.
          I-DAT Siita-ACC know
          'I know Siita.'
      d. raamam avane istamaana enni jon paraññu.
          Raaman-DAT he-ACC likes COMP John said
          'John said that Raaman likes him.'

Colin Masica (1976: 159-169) identifies this as an areal feature of Indian languages, although it occurs elsewhere as well, cf. the discussion in section 2.2.1.

4.3.1.3. Argument trespassing

Malayalam lacks many of the grammatical processes, such as WH-movement and raising, which create argument trespassing structures in other languages. WH-phrases behave like other noun phrases in that they can occur in situ or be scrambled freely to any position in the sentence:

(123) a. siita aarikki veendi pustakam meeticcu.
          Siita who-DAT for book bought
          'Who did Siita buy the book for?'
      b. siita eppool pooyi?
          Siita when go
          'When did Siita go?'

However, word order in WH-questions is more stringent than in declarative sentences in that the WH-word cannot be scrambled after the verb:

(124) a. siita endi meeticcu?
          Siita what bought
          'What did Siita buy?'
      b. *siita meeticcu endi?
          Siita bought what
          'What did Siita buy?'

In addition, sentences with two WH-words obligatorily occur in the basic SOV pattern despite the transparent case marking of aarɨ 'who' as nominative. In the accusative, the form for 'who' would be aare:

(125) a. aari endi meeticcu?
          who (NOM) what bought
          'Who bought what?'
      b. ??endi aari meeticcu?
          what who (NOM) bought
          'Who bought what?'

Because of the absence of WH-movement, any attempt at extracting a WH-word out of its clause immediately renders the sentence ungrammatical:

(126) a. [raaman aare kandu enni] nii karuttunnu?
          Raaman who-ACC saw COMP you think-PRES
          'Who do you think that Raaman saw?'
      b. *[raaman kandu enni] aare nii karuttunnu?
          Raaman saw COMP who-ACC you think-PRES
          'Who do you think that Raaman saw?'
      c. *[raaman kandu enni] nii aare karuttunnu?
          Raaman saw COMP you who-ACC think-PRES
          'Who do you think that Raaman saw?'

In both (126b) and (126c), aare 'who' has been extracted into the main clause, making the sentence ungrammatical. Furthermore, no stranding of postpositions is allowed. (127) demonstrates that a postposition may not be removed from the noun phrase it governs:

(127) a. raaman siitekki veendi pustakam meeticcu.
          Raaman Siita-DAT for book bought
          'Raaman bought the book for Siita.'
      b. *siitekki raaman pustakam veendi meeticcu.
          Siita-DAT Raaman book for bought
          'Raaman bought the book for Siita.'

Similarly, there is no raising of noun phrases into another clause in Malayalam. Since a major effect of raising or WH-movement in languages like English is the topicalization of the extracted noun phrase, the lack of these processes in Malayalam can be attributed to the fact that topicalization can be achieved without resorting to syntactic extraction. More precisely, the possibility of topicalization by means of scrambling removes the functional motivation for extraction, thereby preempting its necessity. Main clause constituents are topicalized by simply scrambling them to sentence-initial position, where they are naturally interpreted as topic. Similarly, constituents of embedded clauses can appear in the sentence-initial position without being extracted out of their clause, as the result of a two-step process which scrambles the embedded clause to the beginning of the matrix while the constituent in question is scrambled to the first position within its clause. Even though this constituent is technically part of the embedded clause rather than the matrix, the bottom-up processing mechanism of left-branching languages like Malayalam does not make a decision about clause assignment until late in the sentence; as a result, any sentence-initial constituent can be interpreted as topic without resorting to syntactic raising, cf. example (128):

(128) a. siita ii pustakam meeticcu enni raaman paraññu.
          Siita this book bought COMP Raaman said
          'Raaman said that Siita bought this book.'
      b. [ii pustakam siita meeticcu enni] raaman paraññu.
          this book Siita bought COMP Raaman said
          'As for this book, Raaman said that Siita bought [it].'

In (128b), ii pustakam 'this book' occurs in the sentence-initial position, where it is interpreted as the topic of the sentence even though it was not extracted out of its clause, as the bracketing indicates.
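
The two-step fronting strategy described for (128) can be illustrated with a small sketch. It is only a toy model under stated assumptions: clauses are flat Python lists of constituents, and the function names (front_within_clause, topicalize_embedded, linearize) are hypothetical labels introduced here, not part of the original analysis.

```python
# Toy model of "fronting without extraction", assuming flat constituent lists.
# All function names are illustrative, not taken from the source.

def front_within_clause(clause, target):
    """Scramble `target` to the first position inside its own clause."""
    if target not in clause:
        raise ValueError(f"{target!r} is not a constituent of this clause")
    rest = [c for c in clause if c != target]
    return [target] + rest

def topicalize_embedded(matrix, embedded, target):
    """Place the embedded clause before the matrix material and front
    `target` inside it. The embedded clause stays a nested list, so the
    target never leaves its clause."""
    return [front_within_clause(embedded, target)] + list(matrix)

def linearize(sentence):
    """Render the structure as a string, bracketing embedded clauses."""
    parts = []
    for part in sentence:
        if isinstance(part, list):
            parts.append("[" + " ".join(part) + "]")
        else:
            parts.append(part)
    return " ".join(parts)

# Constituents of (128a); 'enni' is the complementizer.
embedded = ["siita", "ii pustakam", "meeticcu", "enni"]
matrix = ["raaman", "paraññu"]

result = topicalize_embedded(matrix, embedded, "ii pustakam")
print(linearize(result))
```

In this sketch the topic ends up sentence-initial while the bracketing shows that it remains inside the embedded clause, which is the configuration the text attributes to (128b).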
In sum, no syntactic movement processes exist in Malayalam which could generate argument trespassing structures. There is one construction in Malayalam, however, where an argument may not stand directly with its predicate; this is the serial verb construction, which is illustrated in example (129):


(129) kurukken appam caadi iduttu paaññu kalaññu.
      fox bread jumped took ran threw away (= 'ran away')
      'The fox jumped, took the bread, and ran away.'

Note that appam 'bread' functions as the direct object of iduttu 'took' in this sentence but is separated from that verb by the intercession of another verb, caadi 'jumped'. This argument "stranding" arises from the peculiar set of constraints that govern the serial verb construction. The construction consists of a series of verbs, which must occur in the same sequence as the events they refer to. Only the final verb is a full-fledged independent verb which carries the tense marker that determines the time reference of the whole series. All other verbs appear in a fixed "citation form" without independent tense markers, as the contrast between sentences (129) and (130) shows:

(130) kurukken appam caadi iduttu paaññu kalayum.
      fox bread jumped took ran throw away (FUT)
      'The fox will jump, take the bread, and run away.'

Evidently, no material can intervene between the various verb forms, so as to ensure that the tense marked on the final verb in the series can be interpreted as the tense for all the verbs in the series. As a result, the hearer has to rely on the verbs' respective subcategorization frames and on pragmatic knowledge to assign the correct argument-predicate relationships. This is why neither sentence (131a) nor sentence (131b) is grammatical:

(131) a. *kurukken caadi iduttu appam paaññu kalayum.
          fox jumped took bread ran throw away (FUT)
          'The fox will jump, take the bread, and run away.'
      b. *kurukken caadi iduttu paaññu appam kalayum.
          fox jumped took ran bread throw away (FUT)
          'The fox will jump, take the bread, and run away.'

However, when the serial verb construction is broken up into a simple coordinated sequence, each argument must stand with its respective verb, including appam 'bread', which appears adjacent to iduttu 'took' in (132).
caadi and iduttu do not form a serial verb construction in (132), as is evident from the fact that this sentence is only acceptable with a pause, represented by the comma after caadi:

(132) kurukken caadi, appam iduttu paaññu kalaññu.
      fox jumped bread took ran threw away
      'The fox jumped, took the bread, and ran away.'
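
The adjacency constraint on the serial verb construction in (129) to (131) can be restated as a simple check. The sketch below is a deliberate simplification: it assumes each word has already been tagged as a nominal ("N") or a verb ("V"), it uses the hypothetical function name is_uninterrupted_verb_series, and it does not model the pause-marked coordination in (132).

```python
# Toy check of the constraint that no material may intervene between the
# verbs of a serial verb construction. The tagging scheme is an assumption
# made for this illustration only.

def is_uninterrupted_verb_series(tokens):
    """Return True if all verbs form one unbroken block at the end of
    the clause, i.e. no nominal follows the first verb."""
    seen_verb = False
    for _form, category in tokens:
        if category == "V":
            seen_verb = True
        elif seen_verb:
            # A non-verb after a verb interrupts the series.
            return False
    return seen_verb

ex_129 = [("kurukken", "N"), ("appam", "N"), ("caadi", "V"),
          ("iduttu", "V"), ("paaññu", "V"), ("kalaññu", "V")]
ex_131a = [("kurukken", "N"), ("caadi", "V"), ("iduttu", "V"),
           ("appam", "N"), ("paaññu", "V"), ("kalayum", "V")]
ex_131b = [("kurukken", "N"), ("caadi", "V"), ("iduttu", "V"),
           ("paaññu", "V"), ("appam", "N"), ("kalayum", "V")]

for label, example in [("129", ex_129), ("131a", ex_131a), ("131b", ex_131b)]:
    print(label, is_uninterrupted_verb_series(example))
```

The check accepts the word order of (129) and rejects those of (131a) and (131b), mirroring the judgements reported in the text.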

4.3.1.4. Summary

In summary, Malayalam has the essential characteristics of a language with a high degree of semantic transparency, as the Semantic Typology predicted based on a few well-known properties of this language. It has overt case marking, semantically narrow grammatical relations, and a highly variable clause-internal word order, while it lacks movement processes like WH-movement and raising, which form the structural bases for argument trespassing in other languages. Malayalam has only a few vestiges of a less transparent organization, particularly the serial verb construction with its potential for argument stranding. In its overall organization, Malayalam is, thus, rather close to the transparent Korean type. In the absence of any close genetic relationship or areal contact that could explain this fact, we are not aware of any theory of the structure of language which could predict this alignment, other than the Semantic Typology developed here. Malayalam, therefore, provides further empirical evidence for this typology.

4.3.2. Japanese

Korean and Japanese are well known for their striking typological similarities. Both are head-final languages; both have a very variable, though strictly verb-final, clause-internal word order; both use noun suffixes and postpositions to identify the grammatical roles of the participants in a clause; both have extensive inflectional morphologies, particularly of the verb; and both are topic-prominent languages, with Japanese using wa and Korean using -nun to mark the grammatical topic, which in either language may or may not be an argument of the verb. The two grammars are so close to one another that Kim (1985: 35, note 10) found "a near perfect correspondence" between the functions of the Japanese topic marker wa and its Korean counterpart -nun. Together, these and other properties of Japanese add up to a level of semantic transparency close to that found in Korean.

4.3.2.1. Basic grammatical properties

Inflectional morphology in Japanese marks such grammatical categories as tense, voice, mood, finiteness, negation, and politeness in verbs. The basic grammatical relations are signalled formally by postnominal case markers.34 Subjects are usually marked by ga, direct objects by o, and indirect objects by ni, as shown in sentence (133) from Shibatani (1987: 870):

(133) Taroo ga Hanako ni sono hon o yatta.
      NOM DAT that book ACC gave
      'Taroo gave Hanako that book.'

As in other languages, the existence of overt case marking facilitates scrambling in Japanese. As long as the verb remains in the clause-final position, all permutations of the various participants are grammatical, although the variants with two objects preceding the subject are somewhat less well-formed (Shibatani 1987: 870):

(134) a. Taroo ga sono hon o Hanako ni yatta.
          NOM that book ACC DAT gave
          'Taro gave Hanako that book.'
      b. Hanako ni Taroo ga sono hon o yatta.
          DAT NOM that book ACC gave
      c. ?Hanako ni sono hon o Taroo ga yatta.
          DAT that book ACC NOM gave
      d. Sono hon o Taroo ga Hanako ni yatta.
          that book ACC NOM DAT gave
      e. ?Sono hon o Hanako ni Taroo ga yatta.
          that book ACC DAT NOM gave
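
The scrambling pattern in (133) and (134) amounts to permuting the case-marked participants while holding the verb in final position. The following sketch enumerates these orders; it assumes, following the judgements cited from Shibatani above, that only the orders with both objects preceding the subject are marked "?". The function name verb_final_orders and the constants are hypothetical and serve only this illustration.

```python
from itertools import permutations

# Case-marked participants and verb of (133); constant names are illustrative.
SUBJECT = "Taroo ga"
ARGUMENTS = [SUBJECT, "Hanako ni", "sono hon o"]
VERB = "yatta"

def verb_final_orders(arguments, verb, subject):
    """List every order of the arguments with the verb kept clause-final.
    Orders in which the subject comes last (both objects precede it) are
    prefixed with '?', matching the judgements reported for (134c, e)."""
    orders = []
    for order in permutations(arguments):
        mark = "?" if order[-1] == subject else ""
        orders.append(mark + " ".join(order) + " " + verb)
    return orders

orders = verb_final_orders(ARGUMENTS, VERB, SUBJECT)
for sentence in orders:
    print(sentence)
```

With three participants the sketch yields six verb-final orders, namely (133) plus the five variants in (134), two of which carry the "?" mark.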


The verb is strictly clause-final only in written Japanese, since in the spoken language other material may follow the verb. These post-verbal elements have been analyzed as afterthought phenomena attributed to discourse conditions rather than the syntax (Kuno 1978: 60-64).35 Less prototypical transitives may involve experiencer subjects rather than agents, or a verb which does not express action upon the object, e.g., a stative verb. For such less prototypical transitives Japanese often uses case marking patterns which differ from the standard system. For example, subjects of stative verbs may be marked with ga or with ni (Hinds 1986: 193):

(135) kimi ni wakatte-te mo ore ni wakaranai zo.
      you DAT understand-PARTICIPIAL even I DAT understand-NEG EMPHATIC
      'Even if you understand, I don't understand.'

In example (136) from Shibatani (1987: 871), the use of ni to mark the object reflects the semantic fact that it is not a patient:

(136) Taroo ga Hanako ni atta.
      NOM DAT met
      'Taro met Hanako.'

Double nominatives also occur if the object is not a patient (Shibatani 1987: 871):

(137) Taroo ga Hanako ga suki da.
      NOM NOM like is
      'Taro likes Hanako.'

Furthermore, Kuno (1978: 65) observes that a transitive verb with an inanimate subject is extremely unnatural in Japanese and has "a distinct flavor of direct translation from English":

(138) a. ??Taihuu ga ie no hei o kowasita.
          typhoon NOM house GEN fence ACC destroyed
          'The typhoon destroyed the house's fence.'


Japanese prefers an intransitive inchoative construction in its place, as in (138b). Note that (138b) is an intransitive sentence and that intransitives do not share in the semantic restrictions which hold for subjects of transitive verbs, as we already observed for other languages above:

(138) b. Taihuu de ie no hei ga kowareta.
          typhoon with house GEN fence NOM broke
          'Because of the typhoon, the house's fence broke.'

As a result of the semantic constraint on subjects of transitive verbs, many semantic roles are excluded from becoming subjects of these verbs. Notice the unacceptability of the transitive structures in the following example sets and their correct rendition with more semantically transparent oblique phrases. As was the case in Korean, sentences like (139a) are acceptable with an anthropomorphic interpretation.

(139) a. ??Sono hanmaa ga mado o kowasi-ta.
          that hammer NOM window ACC break-PAST
          'That hammer broke the window.'
      b. Sono hanmaa de mado ga koware-ta.
          that hammer with window NOM break/INT-PAST
          Lit.: 'With that hammer, the window broke.'

(140) a. ??Kono e ga watasi ni kodomo no koro o omoidasi-se-ta.
          this picture NOM I DAT child GEN time ACC remember-CAUS-PAST
          'This picture reminded me of my childhood.'
      b. Kono e de watasi ga kodomo no koro o omoidasi-ta.
          this picture with I NOM child GEN time ACC remember-PAST
          Lit.: 'With this picture, I remembered my childhood.'

(141) a. *Kono hako ga takusan no ningyoo o fukun-deiru.
          this box NOM great quantity GEN doll ACC include-PROGRESSIVE/RESULTATIVE
          'This box contains many dolls.'
      b. Kono hako ni takusan no ningyoo ga hai-tteiru.
          this box in great quantity GEN doll NOM fit-PROGRESSIVE/RESULTATIVE
          'In this box, a great quantity of dolls have been fit.'

(142) Kono hon ga ichi-man satsu ureta.
      this book NOM 10,000 copy sold
      'This book sold 10,000 copies.'

Sentence (142) looks superficially like a literal equivalent of the English gloss, but it has quite a different structure in Japanese since it is an intransitive construction. The word glossed 'copy' is in fact not a noun but a classifier, and any attempt to create an object "noun phrase" ichi-man satsu o '10,000 copies (ACC)' immediately renders the sentence ungrammatical. Therefore, a more faithful translation of the Japanese construction would be akin to 'This book sold in the ten thousands', since this rendition preserves the non-nominal nature of the quantificational phrase. In sum, the organization of the basic participant structure of the sentence in Japanese has all the hallmarks of a highly transparent language.

4.3.2.2. Manipulation of the basic participant structure

Of the grammatical processes that comprise our Grammatical Relation Manipulation Hierarchy, Japanese uses passivization extensively, in marked contrast to Korean (N. K. Kim 1987: 893). Japanese has two passive strategies: the neutral passive, which functions generally like the passive in English, i.e., as a device which defocuses the agent while topicalizing the patient (Cameron 1989), and the affective passive, which expresses that the derived subject is somehow (often adversely) affected by the action and is often characterized by the appearance of an additional participant that may not be present in the active version of the sentence. Note the difference between the neutral passive in (143) and the affective passive in (144):36


(143) Kozutsumi ga Taroo ni Hanako ni okurareru.
      package NOM by DAT send-PASS
      'The package is sent to Hanako by Taro.'

(144) Taroo ga sensei ni musuko o sikar-are-ta.
      NOM teacher by son ACC scold-PASS-PAST
      Lit.: *'Taro was scolded his son by the teacher.'
      = 'Taro was adversely affected by the teacher's scolding of his son.'

Although sentences (143) and (144) are superficially very similar, they in fact exemplify two quite different constructions. Unlike the neutral passive of (143), the affective passive does not rely on the promotion of a direct object to subjecthood. As a result, the latter construction is not restricted to transitive verbs, as in example (145) from Cameron (1989):

(145) Taroo ga haha ni yorokob-are-ta.
      NOM mother by be happy-PASS-PAST
      'Taro had his mother be happy about him.'

Passivization is, thus, widely used in Japanese.

With regard to WH-movement, the next position on the Grammatical Relation Manipulation Hierarchy introduced in Chapter 1, there is no evidence for extraction of WH-words in Japanese; instead, WH-words simply remain in situ (Hinds 1986: 26):

(146) Kaoru wa nani o kaimasita ka?
      TOP what ACC bought-POLITE Q
      'What did Kaoru buy?'

(147) Takashi wa dare ni kare no kagi o watasimasita ka?
      TOP who to he POSS key ACC handed-POLITE Q
      'Who did Takashi give his keys to?'

Furthermore, there are no syntactic processes in Japanese that can separate a noun phrase from the postposition governing it. No postposition stranding, thus, occurs (Hinds 1986: 81).

Despite the lack of WH-movement, Japanese has some instances of argument trespassing. Kuno (1978: 133) notes that these can arise from the fact that scrambling is not strictly clause-bound in Japanese. Compare (148a) and (148b):

(148) a. Boku wa [Yamada sensei ni anata o syookaisitai] to omotte imasu.
          I TOP teacher to you ACC introduce-want that thinking am
          'I have been thinking that I want to introduce you to teacher Yamada.'
      b. Anata o boku wa [Yamada sensei ni syookaisitai] to omotte imasu.
          you ACC I TOP teacher to introduce-want that thinking am

Notice that the direct object anata o of the embedded verb appears to the left of the topic boku wa in (148b) and, thus, constitutes an instance of argument trespassing. However, it should be added that sentences like (148b) do not seem to be grammatical for Hinds (1986: 160), in whose view only adverbs and interjections may precede the topic. These apparently discrepant judgements and the status of sentences like (148b) per se warrant further investigation.

The second source of apparent argument trespassing in Japanese consists of sentences like (149b):

(149) a. Watasi ga [kare ga baka da] to omou.
          I NOM he NOM fool is that think
          'I think that he is a fool.'
      b. Watasi ga kare o baka da to omou.
          I NOM he ACC fool is that think
          Lit.: 'I think him to be a fool.' = 'I think that he is a fool.'

Kuno (1976) has argued that such sentences involve a rule of subject(-to-object) Raising, and as such they would be true instances of argument trespassing. However, Kuno himself points out (1976: 17) that the presumed raising process is grammatical only when the verb of the complement clause is a form of the copula, as is the case in (149), in a striking parallel to the evidence adduced by Comrie (1989b) against a raising analysis for Russian.


Significantly also, no other types of raising occur in Japanese. There is no subject-to-subject raising, since the head-finality of the language obviates the need for topicalizing the embedded subjects in constructions with yoo da 'seem':

(150) a. Taroo ga kuruma o katta yoo da.
          NOM car ACC bought seem is
          'Taro seems to have bought a car.'

Both scrambling and honorification provide evidence that Taroo is a constituent of the embedded clause. If Taroo had been raised to become the subject of the matrix, we would expect that it could scramble freely within that clause as long as its verb remains in the clause-final position; but sentence (150b) shows that Taroo cannot scramble behind the dependent verb:

(150) b. *Kuruma o katta Taroo ga yoo da.
          car ACC bought NOM seem is
          'Taro seems to have bought a car.'

Honorification can serve as a test of whether raising has taken place, since verbs agree with their subject in the level of honorification. When honorification applies to sentence (150a), Taroo must agree with the dependent verb, while honorific agreement with the matrix verb is not grammatical. Note the grammaticality difference between (150c) and (150d):

(150) c. Taroo-san ga kuruma o o-kai-ni natta yoo da.
          HON NOM car ACC buy-HON became seem is
          'Taro seems to have bought a car.'
      d. ??Taroo-san ga kuruma o katta yoo de irashyaru.
          HON NOM car ACC bought seem is HON
          'Taro seems to have bought a car.'

Japanese is rich in compound verb formation (Kuno 1978: 100), resulting in monoclausal sentences which avoid object-to-subject raising in "Tough" sentences and subject-to-object raising in control structures. Sentences (151) and (152) are examples of the Japanese "Tough" construction from Farmer (1984: 83):


(151) Kono hon ga (gakusei-tati ni totte) yomi-yasu-i.
      this book NOM student-PL DAT read-easy-PRES
      'This book is easy (for the students) to read.'

(152) Kono tosyokan (kara) ga hon o nusumi-niku-i.
      this library from NOM book ACC steal-hard-PRES
      'This library is hard to steal books from.'

Similarly, note the compound verb form in sentence (153), which in English is expressed as a biclausal construction, with subject-to-object raising of the subordinate subject:

(153) Kaoru wa Taroo ni shukudai o yatte-hoshi.
      TOP DAT homework ACC do-want
      'Kaoru wants Taro to do his homework.'

Summing up, Japanese has a considerable level of semantic transparency. As a language with a rich case-marking system, it uses scrambling to encode pragmatic information and has the semantically narrow grammatical relations typical of highly transparent languages. There is no WH-extraction or postposition stranding; but to a limited extent subject-to-object raising and possibly scrambling may create argument trespassing structures; also, passivization is a frequently used means of rearranging the argument structure of a sentence. In these ways, Japanese falls somewhat short of the level of semantic transparency found in Korean or in Malayalam.

4.3.3. Turkish

After Korean, Malayalam, and Japanese, Turkish is the fourth consistently left-branching language to be examined here. Like the former languages, Turkish is characterized by an extensive inflectional morphology; it has a rich case marking system and free scrambling of the clause-level constituents, including surface orders where the verb is not clause-final (Kornfilt 1987: 628-636). In most, but not all (Kornfilt 1987: 636), actual Turkish sentences, the surface word order is, therefore, determined by pragmatic considerations: the topic is the first constituent, the focus appears in the immediate pre-verbal position, and backgrounded or afterthought material appears post-verbally (Erguvanli 1984: xi). Main clause material can only be scrambled within the matrix, and constituents of embedded clauses cannot be scrambled into higher material; however, constituents that logically belong to an embedded clause may occur to the right of the matrix verb as background or afterthought information (Kornfilt 1987: 636).

There is little information in the literature concerning the permitted semantic range of the subject relation, but Turkish grammar distinguishes accusative objects from dative objects, with many less prototypical transitive verbs governing the dative case, e.g., varmak 'reach', bakmak 'look at, watch', inanmak 'believe', yardım etmek 'help', gülmek 'smile', etc., or, less frequently, the ablative case, cf. üzülmek 'be sorry, be worried (about)', bıkmak 'be tired (of)', and korkmak 'fear, be afraid (of)' (Underhill 1976: 447-470).37

Turkish has both a personal and an impersonal passive. The former is illustrated in sentence (154) from Underhill (1976: 333) and the latter in example (155) from Siewierska (1984: 201):

(154) Pencere Hasan tarafından aç-ıl-dı.
      window by open-PASS-PAST
      'The window was opened by Hasan.'

(155) Burada çalış-ıl-ır.
      here work-PASS-AORIST
      'Here it is worked.'

As the focus of an information question, WH-words are positioned in the preverbal position, cf. sentence (156), although sentence-initial WH-words occur when they are contrasted; in the latter case, they require special stress, as indicated in sentence (157) (A. Kim 1985: 55). However, there is no evidence of any WH-extraction in Turkish; rather, the position of a WH-word, like that of other nominals, follows directly from its pragmatic function and can be syntactically derived by means of scrambling.

(156) Bu film-i kim gör-dü?
      this film-ACC who see-PAST
      'Who saw this film?'

(157) Kim sinemá-ya git-mek isti-yor, kím tiyatró-ya?
      who movies-DAT go-INF want-PROGRESSIVE who theater-DAT
      'Who wants to go to the movies and who to the theater?'


So far, our discussion has shown Turkish to be a language with high semantic transparency, as our typology predicted for left-branching languages. This view is strengthened by the fact that adposition stranding is disallowed in Turkish (Kornfilt 1987: 637). However, the situation is less clear with respect to raising because there are examples like (158) from Kornfilt (1987: 641):

(158) a. Herkes [(ben) üniversite-ye başla-yacağ-ım] san-ıyor.
          everybody I university-DAT start-FUT-1SG believe-PRES PROGRESSIVE
          'Everybody believes that I shall start university.'
      b. Herkes ben-i üniversite-ye başla-yacak san-ıyor.
          everybody I-ACC university-DAT start-FUT believe-PRES PROGRESSIVE
          'Everybody believes that I shall start university.'

In sentence (158a), the pronoun ben 'I' is unequivocally the subject of the embedded clause since it is in the nominative case and the verb of that clause carries the first person singular agreement marker -ım. By contrast, the embedded verb exhibits no agreement marking in sentence (158b), and the pronoun ben-i 'me' is in the accusative case. The contrast between examples (158a) and (158b) may, therefore, point towards the possibility of subject-to-object raising in Turkish, even though Kornfilt presents sentence (158b) with a clause boundary in the same position as in (158a), i.e., preceding ben-i. Kornfilt (1977) explicitly investigates the question of subject-to-object raising in Turkish, finding that native speakers disagree on some of the facts relevant to this problem. Kornfilt concludes that for some speakers there is a rule of subject-to-object raising, while for others there is not. At least the Turkish of those speakers whose idiolect has such a rule therefore presents a problem for the exceptionlessness of the Grammatical Relation Manipulation Hierarchy. Overall, though, Turkish grammar clearly reflects the high level of semantic transparency predicted for left-branching languages.


4.3.4. Hixkaryana

Hixkaryana is a Carib language spoken by approximately 350 people in Northern Brazil. It has become unusually well known for an Amazonian language, thanks in large part to the efforts of Derbyshire (1979, 1985), which form the basis of all the information in this section. With its OV order and its postpositions, Hixkaryana is undoubtedly a left-branching language, but of an unusual kind because of its basic OVS word order. It is therefore interesting to see whether Hixkaryana displays an overall transparency similar to that of the other left-branching languages we have investigated. If the hypothesis regarding the position of left-branching languages in the Semantic Typology is correct, we would certainly expect Hixkaryana to be located in the semantically transparent range of the typological continuum.

On the clause level, this prediction is indeed borne out. Hixkaryana has a rich inflectional morphology, which marks possession on nouns (Derbyshire 1979: 83) and an array of functions on verbs, including tense, aspect, mood, changes in valency, and agreement with both the subject and the (direct) object. Although word order is an important factor in the identification of the participants functioning as subject and as (direct) object because of the absence of nominal case marking, it is supplemented by obligatory agreement markers on the verb for both of these grammatical relations, and all other participants, including the indirect object, appear in postpositional phrases (Derbyshire 1985: 33-35). Example (159) from Derbyshire (1985: 35) demonstrates the use of postpositional phrases and agreement marking in Hixkaryana:

(159) yawaka yɨmyako bɨryekomo rowya.
      axe he-gave-it boy me-to
      'The boy gave the axe to me.'

The formal marking of subjects and objects by means of agreement markers on the verb squarely places Hixkaryana with the head-marking type in the typology of Nichols (1986).
In accordance with this type, there is not much variability in word order, except for the possibility of moving any one constituent, including subordinate clauses, into sentence-initial position for emphasis (Derbyshire 1985: 74),38 cf. sentence (160) from Derbyshire (1985: 75):


(160) ohetxe wya woto wɨmno enmahriro.
      your-wife to meat I-gave-it early-in-the-day
      'I gave the meat to your wife early in the day.'

As we have seen for WH-movement in German and for scrambling in Korean and in Japanese, all languages with high semantic transparency, this fronting process in Hixkaryana requires that a constituent be pied-piped in toto, so that no postposition stranding is allowed (Derbyshire 1985: 16). The same process applies obligatorily in WH-questions (Derbyshire 1979: 75), as seen in example (161) from Derbyshire (1979: 8):

(161) onokɨ bɨryekomo komo yonyetxkonɨ?
      who child COLL he-was-eating-them
      'Who used to eat the children?'

The fact that the fronting operation is limited to only one constituent also explains Derbyshire's finding (1979: 12) that multiple WH-questions are not permitted. Furthermore, since noun phrase fronting for emphasis and for the formation of a WH-question are generated by the same operation, there is no need for stipulating a separate rule of WH-movement in Hixkaryana. In fact, there is no evidence in Derbyshire's work of WH-extractions of a noun phrase into another clause. Derbyshire notes a general preference to use question words only in main clauses, but if an element of a subordinate clause is to be questioned, usually in echo questions, the WH-element must occur in the sentence-initial position. This can be achieved in Hixkaryana without extracting the WH-word out of its clause by means of a two-step process which moves the embedded clause in front of the matrix and the WH-word to the initial position within its own clause, as shown in the conversational sequence cited in (162) (Derbyshire 1979: 9).

(162) Speaker A: kokahtɨmno [rowya kamara yonyetoko].
                 I-ran-away by-me jaguar when-seeing-of
                 'I ran away when I saw the jaguar.'
      Speaker B: [onok yonyetoko owya] oyokahtɨmno?
                 who when-seeing-of by-you you-ran-away
                 'When you saw what you ran away?'


In the sentence uttered by Speaker B, the WH-word onok 'who', which here refers to the jaguar, appears in the first position of the sentence, i.e., the normal position for WH-elements in Hixkaryana, but the bracketing shows that it does not have to be extracted out of its clause to get there.

At this point, it should be recalled that Malayalam, another language with a high overall level of semantic transparency, also utilized a fronting-without-extracting strategy for bringing embedded material to the sentence-initial position. In the case of Malayalam, this strategy achieved the topicalization of an embedded constituent, while Hixkaryana uses it to front the WH-focus, but in both of these cases we observe a language with high semantic transparency developing a similar syntactic strategy for completely different purposes. It stands to reason, therefore, that the pressure to avoid argument trespassing, which is induced by the position of these languages in the Semantic Typology, is precisely the factor that ultimately motivates these rules.

Besides extraction and adposition stranding, Hixkaryana lacks clause-internal processes that change grammatical relations, such as the passive, as well (Derbyshire 1985: 122). However, Hixkaryana does have a good number of valency-changing affixes which derive causative verbs from intransitive stems and causative or intransitive verbs from transitive stems (Derbyshire 1979: 133-135). One effect of detransitivizing is the derivation of "pseudo-passives", i.e., derivations which create an effect like a passive, for instance the derivation of an adverbial as in (163) or of an intransitive as in (164), both from Derbyshire (1985: 91):

(163) tonoso naha kyokyo.
      can-be-eaten it-is parrot
      'The parrot is edible.' or 'The parrot can be eaten.'

(164) neramano Waraka.
      he-turned-around Waraka
      'Waraka got turned around.'

In detail, the verb structure in (164) is n-e-rama-no, i.e., 3SG-DETRANS-turn-IMMEDIATE PAST.
Because the intransitivizer is homophonous with the reflexive marker e-, the sentence can also be read as 'Waraka turned (himself) around.'

Given the lack of proper passivization, extraction, and adposition stranding, the existence of raising in Hixkaryana would be surprising from the standpoint of our typology. However, Derbyshire (1985: 137-140) argues for an obligatory rule of subject-to-subject raising which must apply in all negative sentences.

(165) apaytara yarhira nexeye wekoko.
      chicken not-taking it-was hawk
      'The hawk did not take the chicken.'

Assuming that (166) is the "deep structure" of this sentence, Derbyshire observes (1985: 139) that the subject wekoko 'hawk' is raised out of the embedded clause to become the subject of the matrix verb -exe- 'be':

(166) [NEG [[apaytara -ari- wekoko] -exe-]]
      chicken take hawk be

Unfortunately, Derbyshire's argument for a raising rule is not very convincing. His argument crucially depends on the assumption that (166) is indeed the deep structure of sentence (165), but he does not substantiate this claim. Specifically, he does not present any evidence to show that (165) is a biclausal structure, as the representation in (166) claims. This omission is particularly striking as Derbyshire goes to great lengths to emphasize the parallels between the negative construction in Hixkaryana and that in English. He explicitly states that the Hixkaryana copula -exe- is "equivalent to the 'dummy auxiliary'" (Derbyshire 1985: 138) do. But in English negation, do is clearly merely an auxiliary in a monoclausal structure, which means that no raising rule needs to be postulated to make the underlying subject the subject of do. Derbyshire does not explore the monoclausal account for Hixkaryana. The existence of subject-to-subject raising in Hixkaryana is, hence, strongly in doubt. Even so, there is one instance of upward movement into a higher clause.
This rightward movement operation involves the subject of a subordinate clause, which is placed to the right of the matrix verb (Derbyshire 1985: 78-79):

(167) tan+hnohtoho komo ywenyeke rma
      fact-of-their destruction COLL not-knowing SAME-REFERENT
      hak nehxatxkon ham+, tuna ymo wya.
      yet they-were DEDUCTION water AUGMENTATIVE by
      'They did not yet know about their (coming) destruction by the flood.'

According to Derbyshire, this rightward movement is preferred when the subordinate clause is "heavy", as for example in the sentence just cited. As an isolated out-of-clause movement rule in Hixkaryana, this rule clearly constitutes an exception with respect to the Grammatical Relation Manipulation Hierarchy, so unless this process can be understood as a paratactic construction (cf. below) rather than syntactic movement, the hierarchy must be viewed as a statistical tendency instead of a universally true statement.

Derbyshire provides no explicit information on the range of semantic roles that can be coded as subject or object, but examples like (168) from Derbyshire (1979: 86) indicate that subjects of intransitives can have semantic roles other than agency, as we have noted for other languages:

(168) anaro wewe nemokotono.
      other tree it-fell
      'Another tree has fallen.'

Given the high overall degree of semantic transparency in Hixkaryana, we would predict that the semantic range of the grammatical relations in transitive sentences is rather more narrow. Additional research is necessary to investigate this claim.

Interestingly, Hixkaryana maintains this transparency only within the clause unit; beyond it, we find frequent paratactic sequences, many of which are not overtly marked for their grammatical role. For instance, parataxis is the primary means of expressing coordination as Hixkaryana lacks the formal means for generating coordinate structures at the sentence, clause, or phrase levels (Derbyshire 1985: 120). Because of this, paratactic structures often involve discontinuous constituents, as in example (169).
Sentences (169) and (170), from Derbyshire (1985: 135) and (1985: 67), respectively, are examples of paratactic coordination:

(169) kuraha tho t+ hn+nkaye waywi-heno
      bow DEVALUED HEARSAY he-put-it-down arrow set
      komo.
      COLL
      'He put down the bow and set of arrows.'

(170) hohtyakon+ hat+, txetay.
      she-was-picking-it HEARSAY picking (DEVALUED)
      nenahyakon+ hat+. nar+r+kekon+ hat+
      she-was-eating it HEARSAY she-was-tossing-it-down HEARSAY
      t+nyo hyaka.
      her-husband to
      'She was picking (the fruit), eating it, and tossing it down to her husband.'

In addition to coordination, parataxis is also a primary mechanism for expressing attribution since there are no adjectives and no modifier forms that can occur within a noun phrase (Derbyshire 1985: 122), cf. sentence (171) from Derbyshire (1985: 133):

(171) rom+n hoko rakoronometxoko, hawana komo enyhoru
      my-house occ.-with they-helped-me visitor COLL good-one
      komo.
      COLL
      'Those good visitors helped me build my house.'

Summing up, the Hixkaryana data generally fit the hypothesis that left-branching languages have a high degree of semantic transparency. The language has explicit marking of participant roles by means of agreement or postpositional relators. No extraction or adposition stranding exists even though word order variation is limited in accordance with the head-marking language type. Verbs are strictly either transitive or intransitive, but a number of valency-changing affixes exist. One effect of these affixes is the derivation of "pseudo-passives", but Hixkaryana lacks a true active-passive dichotomy. However, this high level of semantic transparency does not carry over to the paratactic sequences outside the boundaries of the clause proper, which are often unmarked for their role within the sentence.

4.3.5. Dutch

4.3.5.1. A remark on word order

Based on the position of the verb in most sentences, it can be argued that Dutch has an underlying syntactic SOV word order, even though, apart
from the sentence-final position of the verb complex, none of the phrasal categories of Dutch are head-final. Dutch uses prepositions rather than postpositions, and the word order in noun phrases is similar to English, i.e., determiner - adjective - noun - prepositional phrase. This fact is significant in view of the finding of Hawkins (1983) that adposition type is a much more reliable indicator of the basic branching directionality than the respective orders of the clause-level constituents subject, object, and verb. Even for the position of the verb, a more precise statement of the facts would note characteristics of both the SOV and the SVO types. As in German, main clauses are verb-second, requiring an auxiliary or, in the absence of any, the verb itself to appear in the second position of the sentence; only subordinate clauses have no verbal element in the second position, and even there Dutch allows greater leakage behind the verb than German does. These word order facts are illustrated in examples (172) to (174), all adapted from Donaldson (1981):

(172) Helma heeft het boek aan Wim gegeven.
      has the book to given
      'Helma has given the book to Wim.'

(173) a. Gisteren is hij gevlogen naar Amsterdam.
         yesterday is he flown to
         'Yesterday, he flew to Amsterdam.'
      b. German: *Gestern ist er geflogen nach Amsterdam.
         yesterday is he flown to
         'Yesterday, he flew to Amsterdam.'

(174) Zij ging vroeg naar bed, omdat ze die dag een lange
      she went early to bed because she that day a long
      wandeling had gemaakt.
      walk had made
      'She went to bed early because that day she had gone for a long walk.'

Sentence (172) shows the verb-finality in main clauses which have an auxiliary while example (173a) cautions that leakage to the right of the verb is possible irrespective of the ungrammaticality of its German equivalent
with the same word order. Sentence (174), finally, shows the typical verb-second position in main clauses without an auxiliary and the clause-final position of the verb complex in a subordinate clause.

Whatever the merits of the syntactic arguments for a basic SOV order are, from a parsing perspective most phrase level categories in Dutch can be uniquely identified at their left edge, either because they are head-initial, such as prepositional phrases, or because they are typically introduced by another element which only occurs with a given phrase type. Dutch noun phrases, for example, are like their counterparts in English in that they are neither head-initial nor strictly head-final; however, determiners occur in the initial position of noun phrases, so that the parser can construct an NP upon encountering the determiner. The verb-second feature has a similar effect, establishing the fact that a main clause is being parsed and separating the topical sentence-initial constituent from the remainder of the sentence. Example (175) from Donaldson (1981: 97) shows how the verb-second feature makes a top-down parse possible:

(175) Vanavond moeten jullie dit hoofdstuk lezen.
      tonight must you this chapter read
      'Tonight you must read this chapter.'

In (175), the auxiliary moeten 'must' demarcates the end of the topical vanavond 'tonight'. Since vanavond is an adverb, the nominal constituent following moeten, i.e., jullie 'you', must be the subject. The determiner dit 'this' establishes a second post-auxiliary noun phrase. As a noun phrase following the subject, this constituent is interpreted as the object of a transitive predicate. The whole process is somewhat more complex than a comparable English parse, involving more of an interaction between several elements rather than the strict reliance on word order typical for English, but a top-down parse is feasible for Dutch.
As a result, Dutch is not compelled to follow the semantically transparent organization typical of left-branching languages. In fact, Dutch has a good number of low transparency characteristics, as we will show in the remainder of this section.
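
The parsing steps described for example (175) can be made concrete in a small sketch. The following Python fragment is only a toy illustration under stated assumptions: the six-word lexicon, the tag names, and the role labels are invented for this example, and the function handles only verb-second main clauses whose initial constituent is adverbial, as in (175). It is not a model of any actual parser discussed in the literature.

```python
# Toy sketch: the lexicon, tag names, and role labels below are illustrative
# assumptions, not part of the analysis in the text.
LEXICON = {
    "vanavond": "ADV",   # 'tonight'
    "moeten": "AUX",     # 'must' (finite verb in second position)
    "jullie": "PRON",    # 'you'
    "dit": "DET",        # 'this'
    "hoofdstuk": "N",    # 'chapter'
    "lezen": "V",        # 'read'
}


def parse_verb_second(sentence):
    """Assign rough roles to a verb-second main clause with an adverbial topic."""
    words = sentence.rstrip(".").lower().split()
    tags = [LEXICON[w] for w in words]

    # The finite auxiliary marks the right edge of the initial constituent.
    v2 = tags.index("AUX")
    topic = words[:v2]

    # Build noun phrases from their left edge: a pronoun is a complete NP,
    # a determiner opens an NP that runs through the following noun(s).
    noun_phrases = []
    i = v2 + 1
    while i < len(words):
        if tags[i] == "PRON":
            noun_phrases.append(words[i:i + 1])
            i += 1
        elif tags[i] == "DET":
            j = i + 1
            while j < len(words) and tags[j] == "N":
                j += 1
            noun_phrases.append(words[i:j])
            i = j
        else:
            i += 1

    # With an adverbial topic, the first post-auxiliary NP is read as the
    # subject and the next one as the object.
    return {
        "topic": " ".join(topic),
        "finite_verb": words[v2],
        "subject": " ".join(noun_phrases[0]) if noun_phrases else None,
        "object": " ".join(noun_phrases[1]) if len(noun_phrases) > 1 else None,
        "final_verb": " ".join(w for w, t in zip(words, tags) if t == "V"),
    }


print(parse_verb_second("Vanavond moeten jullie dit hoofdstuk lezen."))
```

The point of the sketch is that every decision is taken at the left edge of a constituent (the auxiliary, the pronoun, the determiner), so no role assignment has to wait for the clause-final verb. A sentence with a nominal topic would need an additional rule, which is deliberately left out here.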

4.3.5.2. Basic grammatical properties

The theory of drift formulated in Sapir (1921) places Dutch between English and German. In its "drift towards the invariable word" (Sapir 1921
[1949]: 168), English has lost gender and case as grammatical categories of the noun while German noun phrases still distinguish three genders and four cases. Dutch is left with two genders, having collapsed the masculine and the feminine into a common gender. In word order as well, a contrastive analysis of these three Germanic languages reveals a clear hierarchy: German has considerable word order freedom at the sentence level, Dutch has less, and English least.

Although Modern Dutch has retained more of the original Germanic morphology than English, the loss of case has the result that the basic grammatical relations are identified primarily by word order. Not surprisingly, Dutch therefore joins the English language in collapsing many of the less prototypical objects that used to be marked by non-accusative cases into one large direct object class (cf. section 3.3.3.1 above for these contrasts between English and German). The grammaticality of passivization in sentence (176b) demonstrates that such noun phrases are now treated as instances of the general category "object":

(176) a. Hans heeft mij geholpen.
         has me helped
         'Hans helped me.'
      b. Ik ben door Hans geholpen.
         I am by helped
         'I was helped by Hans.'

However, not all objects can be promoted in this way. Apart from some lexical exceptions, this is true for the indirect object in a double object construction, cf. example (177) from Donaldson (1981: 163). The English sentence I was given a book can be rendered in three ways in Dutch: by topicalizing the recipient in a personal passive construction as in (177a), without topicalization in a personal passive as in (177b), or in an impersonal construction with the expletive er as in (177c). The latter possibility is by far the most common among these (Donaldson 1981: 163).

(177) a. (Aan) mij werd een boek gegeven.
         to me became a book given
         Lit.: 'To me was given a book.' = 'I was given a book.'
      b. Een boek werd aan mij gegeven.
         a book became to me given
         'A book was given to me.'
      c. Er werd een boek aan mij gegeven.
         it became a book to me given
         Lit.: 'It was given to me a book.' = 'A book was given to me.'

Like objects, the subject relation in Dutch also accommodates a greater variety of semantic roles than its counterpart in German, though less so than English subjects. This distribution again extends Sapir's idea that Dutch occupies an intermediate position between English and German. The following examples illustrate the ability of non-agentive semantic roles to become subjects:

(178) Deze sluitel opent de achterdeur.
      this key opens the back door
      'This key opens the back door.'

(179) Zijn nieuwe boek heeft 10000 copien verkocht.
      his new book has copies sold
      'His new book sold 10,000 copies.'

(180) De donder heeft de kinderen bang gemaakt.
      the thunder has the children afraid made
      'The thunder frightened the children.'

(181) Een steen heeft het raam gebroken.
      a stone has the window broken
      'A stone broke the window.'

Not surprisingly, there are some English subjects which have no literal Dutch counterparts:

(182) TiVroeger kon een dubbeltje een ijsje kopen.
      earlier could a (small coin) a ice cream buy
      'Long ago, a dubbeltje could buy an ice cream cone.'

(183) *Opeens opent de deur.
      suddenly opens the door
      'Suddenly, the door opens.'

4.3.5.3. Syntactic movement

Dutch permits all grammatical operations on the Grammatical Relation Manipulation Hierarchy. There are two passivization strategies, a personal passive which applies to transitive verbs, cf. example (184) from Donaldson (1981: 161), and an impersonal passive, which is illustrated in example (185) from Siewierska (1984: 94). Both are used with high frequency.

(184) a. Hij wast het raam.
         he washes the window
         'He washes the window.'
      b. Het raam wordt door hem gewassen.
         the window becomes by him washed
         'The window is washed by him.'

(185) a. De jongens fluiten.
         the boys whistle
         'The boys whistle.'
      b. Er wordt door de jongens gefloten.
         there becomes by the boys whistled
         'There is whistling by the boys.'

WH-movement can extract a noun phrase into a higher clause, as examples (186) and (187) illustrate:

(186) Wat denk je dat de politie in zijn huis gevonden heeft?
      what think you that the police in his house found has
      'What do you think that the police found in his house?'

(187) Wat vroeg Rudi je te kopen?
      what asked you to buy
      'What did Rudi ask you to buy?'

When the object of a preposition is moved, pied piping may optionally apply, or the preposition may be stranded. Sentence (188) is from Donaldson (1981: 63-64):

(188) a. De tafel waarop het brood ligt, is van mij.
         the table which-on the bread lies is from me
         'The table on which the bread is lying is mine.'
      b. De tafel waar het brood op ligt, is van mij.
         the table which the bread on lies is from me
         'The table the bread is lying on is mine.'

Dutch uses various raising processes. Sentences (189) and (190) are examples of subject-to-subject raising while example (191) shows that object-to-subject raising exists as well:

(189) Gerrit schijnt van tee te houden.
      seems from tea to hold
      'Gerrit seems to like tea.'

(190) Dat konijn lijkt ziek te zijn.
      the rabbit seems sick to be
      'The rabbit seems to be sick.'

(191) a. Het is onmogelijk (om) dit boek te verkrijgen.
         it is impossible INF this book to get
         'It is impossible to get this book.'
      b. Dit boek is onmogelijk (om) te verkrijgen.
         this book is impossible INF to get
         'This book is impossible to get.'

The relationship between (191a) and (191b) is to be analyzed as an instance of object-to-subject raising for two reasons: First, the surface matrix of (191b) does not exist as such in an underived structure:

(191) c. *Dit boek is onmogelijk.
         this book is impossible
         'This book is impossible.'

Secondly, the construction occurs only with certain trigger predicates. Most strikingly, mogelijk 'possible' is not one of them, so that sentence (192b), which was formed in analogy to (191b), is ungrammatical:

(192) a. Het is mogelijk (om) dit boek te verkrijgen.
         it is possible INF this book to get
         'It is possible to get this book.'
      b. *Dit boek is mogelijk (om) te verkrijgen.
         this book is possible INF to get
         'This book is possible to get.'

If sentences like (191b) were base-generated, the ungrammaticality of (192b) and of (191c) would be totally surprising and unmotivated. At the same time, object-to-subject raising has fewer trigger predicates in Dutch than in English, and some sentences that appear to be instances of this "Tough" construction are better analyzed as base-generated. Sentence (193a), for instance, seems at first glance to exemplify the same construction as (191b):39

(193) a. De chef is moeilijk (?om) tevreden te stellen.
         the boss is difficult INF satisfy to put
         'The boss is difficult to satisfy.'

However, (193b) reveals that moeilijk 'difficult' behaves like an adverb here and not like an adjective which could be the matrix predicate of an object-to-subject raising construction:

(193) b. een moeilijk tevreden te stellende chef
         a difficult satisfy to put boss
         Lit.: 'A difficult to satisfy boss' = 'A difficult boss to satisfy'

"Tough" sentences are, therefore, yet another area of the grammar where Dutch takes an intermediate position between German, which prefers the base-generated type, and English, where object-to-subject raising is widespread.

4.3.5.4. Summary

Summing up, it can be said that our exploration of Dutch reinforced Sapir's view that the historical drift in Germanic has left Dutch in an intermediate position between German and English. This is certainly true with respect to the inflectional morphology and the semantic range of participants that can become subjects, although Dutch appears definitely closer to English with respect to argument trespassing processes.

4.4. Right-branching languages

4.4.1. Jacaltec

Jacaltec is the first right-branching language to be discussed here. All the information listed on Jacaltec, an Amerindian language spoken in Guatemala, is based on Craig (1977). Like Hixkaryana, Jacaltec has a considerable inflectional morphology. Verbs in particular are morphologically complex, carrying affixes for aspect, tense, voice, and transitivity, in addition to subject and object agreement markers.40 Verb agreement functions on a nominative-accusative basis in aspectless embedded clauses and on an ergative-absolutive basis in all other clauses (Craig 1977: 115). Even though the grammatical functions of the various noun phrases are thus marked overtly by means of agreement markers, the nouns themselves do not carry any case markers. As is expected for such head-marking languages, the clause-internal word order is rather rigid. In fact, Jacaltec is consistently head-initial; it is prepositional, has a basic VSO order, and the noun precedes its modifiers in a noun phrase.

4.4.1.1. Argument trespassing

Only a small set of movement processes exists which can generate structures that deviate from the basic VSO pattern. One of these processes, clefting, involves the movement of a noun phrase to the pre-verbal position, as in example (194). Since word order is the only indicator of the grammatical functions of the subject and the object noun phrases, the grammar responds to the resulting ambiguity by preposing the clefting
element ha' to the moved noun phrase and by placing a special marking on the verb whenever the subject of a transitive verb has been clefted.41

(194) a. x-Ø-(y)-il naj ix.
         ASP-A3-E3-see CL/he CL/her
         'He saw her.'
      b. ha' naj x'il-ni ix.
         CLEFT CL/he saw-SUFFIX CL/her
         'It is he who saw her.'

The disambiguating suffix -ni appears in all operations where a transitive subject has been removed, be they WH-questions, relative clauses, or cleft sentences. It is possible for clefting to operate across a clause boundary to extract a noun phrase from an embedded clause as in sentence (195) (Craig 1977: 223):

(195) ha' naj xal ix ta sloko' hin cheh.
      CLEFT CL/him said CL/she that will-buy my horse
      'It is he that she said will buy my horse.'

However, the disambiguating mechanism only applies in finite clauses, leaving the ambiguity unresolved in non-finite, i.e., aspectless embedded, clauses (Craig 1977: 224-225):

(196) ha' naj latjan Ø-y-il-ni ix.
      CLEFT CL/him PROGRESSIVE A3-E3-see-SUFFIX CL/she(her)
      'It is he that she is looking at.' or 'It is he that is looking at her.'

Significantly for Craig's argument, the suffix -ni in (196) is not the disambiguating suffix but a marker which is characteristic of aspectless transitive verbs in general.

In addition to clefting, Jacaltec has other grammatical processes which freely operate across clause boundaries. In fact, the surface word order properties of Jacaltec clearly identify it as a strongly grammaticizing language. At the same time that it has a much more rigid word order than Hixkaryana, Jacaltec also permits all out-of-clause movement types described by the Grammatical Relation Manipulation Hierarchy. For instance,
information questions are formed by a rule of WH-movement which can extract a nominal constituent into a higher clause. Compare sentences (197a) and (197b) from Craig (1977: 48):

(197) a. xal ix chubil xil ix xcolwa naj yiŋ yustaj.
         said CL/she that saw CL/she gave help CL/he to his brother
         'She said that she saw him give a hand to his brother.'
      b. mac yiŋ xal ix chubil xil ix xcolwa naj?
         whom to said CL/she that saw CL/she give help CL/he
         'To whom did she say that she saw him give help [ x ]?'

In sentence (197b), the prepositional complement of the lowest verb has been extracted to the sentence-initial position by WH-movement. Example (197c) from the same source demonstrates that the preposition can be stranded in such extractions:

(197) c. mac xal ix chubil xil ix xcolwa naj yiŋ?
         whom said CL/she that saw CL/she give help CL/he to
         'Whom did she say that she saw him give help to [ x ]?'

Subject-to-subject raising of noun phrases into a higher clause also exists (Craig 1977: 287-310), but Jacaltec has no object-to-subject raising (Craig 1977: 273). A typical example of subject-to-subject raising in Jacaltec is sentence (198b), which derives from (198a) (both examples are taken from Craig 1977: 291):

(198) a. mastic'a s-Ø-taŋ-oj Ø y-al-ni ix hun tu'.
         never ASP-A3-stop-FUT A3 E3-say-SUFFIX CL/she one that
         Lit.: '(It) will never stop she says that.' = 'She will never stop saying that.'
      b. mastic'a s-Ø-taŋ-oj ix Ø y-al-ni hun tu'.
         never ASP-A3-stop-FUT CL/she A3 E3-say-SUFFIX one that
         'She will never stop saying that.'

Notice that ix 'CL/she', the underlying subject of the embedded clause, has been realized grammatically as the surface subject of the matrix in sentence (198b). To round out the claim that Jacaltec permits all the syntactic operations described by the Grammatical Relation Manipulation Hierarchy, note that it has four different ways of forming the passive, one of which is shown in example (199) (Craig 1977: 77):

(199) x-Ø-tz'ah-ot te' ŋah y-u naj.
      ASP-A3-paint-PASS CL/the house E3-by CL/him
      'The house was painted by him.'

The operation of syntactic rules across clause boundaries is evidently not limited to movement rules alone. Relative clause formation, for instance, which in Jacaltec constitutes a deletion rule, can equally apply across clauses. Craig (1977: 215) presents the following example to make this point:

(200) w-ohtaj naj x-Ø-(y)-al ix ta s-Ø-s-lok-o'
      E1-know CL/him ASP-A3-E3-say CL/she that ASP-A3-E3-buy-FUT
      [__] no' cheh.
      GAP CL/the horse
      'I know the man that she said will buy the horse.'

Overall, then, the properties of syntactic movement in Jacaltec identify it as a strongly grammaticizing language. Although its surface syntax, with its basic VSO word order and its extensive ergative-absolutive agreement system, is far different from English, the two languages share a basic syntactic orientation which allows for the systematic grammatical violation of the structural integrity of clauses. In this way, both of these languages differ from consistently transparent languages like Korean.

4.4.1.2. The semantic range of the basic grammatical relations

Jacaltec deviates from the expected pattern of a strongly grammaticizing language elsewhere in its grammar. Though it lacks nominal case marking, its agreement markers are an effective head-marking equivalent, and the semantic range of nouns that are allowed to be coded as subjects is narrowly constrained: Only animate noun phrases can be the subject of most transitive verbs. This is why Jacaltec does not have a literal equivalent of English sentences like The wind closed the door, accounting for the ungrammaticality of sentence (201a). Instead, such inanimate noun phrases
must appear in an "agentive prepositional phrase" as in sentence (201b) (Craig 1977: 73-75):

(201) a. *speba cake te' pulta.
         close wind CL/the door
         'The wind closed the door.'
      b. xpehi te' pulta yu cake.
         close CL/the door by wind
         'The wind closed the door.'

As in other languages, intransitive verbs do not impose such tight selectional restrictions on their subjects, witness example (202) from Craig (1977: 75):

(202) xtaŋiloj ixim ixim.
      finished CL/the corn
      'The corn ran out.'

Jacaltec also draws a strict distinction between direct objects and indirect objects. While the former are core arguments and obligatorily coindexed by an agreement marker on the verb, the latter must occur in a prepositional phrase without being coindexed on the verb (Craig 1977: 9). Furthermore, Craig (1977) gives no indication that indirect objects are accessible to any rules whose structural description refers to objects. For example, she provides no evidence that indirect objects can be passivized.

4.4.1.3. Conclusion: Split properties in right-branching languages

The realization of the basic grammatical relations as a primarily semantically rather than syntactically based concept in Jacaltec clearly runs counter to the consistently low transparency which this language displays in its movement properties. This is an inconsistency which is reminiscent of the split properties we found in Indonesian (cf. Chapter 3). In the latter language, a high degree of transparency in the properties relating to the manipulation of the argument structure of a sentence coexisted with a low degree of transparency elsewhere, most notably in those properties that are concerned with the codification of the various arguments. In essence, the
situation in Jacaltec is a mirror image of that in Indonesian: Here, a low degree of transparency with regard to the syntactic treatment of the argument structure, particularly in as far as movement processes are concerned, occurs in conjunction with high semantic transparency elsewhere, i.e., in the grammatical marking and the semantic content of the grammatical relations. In Indonesian, on the other hand, a high degree of semantic transparency vis-à-vis syntactic movement coexisted with low semantic transparency with regard to the semantic range of the grammatical relations.

It is significant that these split properties both occur in right-branching languages. This is consistent with our argument that the directionality of parsing restricts only left-branching languages to one end of the typological scale. While top-down parsing obviates the necessity for semantic transparency in right-branching languages, it does not preclude its existence in such languages. What the examples of Indonesian and Jacaltec seem to indicate is that head-initial languages may choose to utilize semantic transparency not just for the overall organization of their grammars but also as a characteristic of certain modules of their grammars only. Such splits would have to be considered counterexamples to the claims of the Semantic Typology when they occur in head-final languages, at least in so far as we have invoked parsing differences as the causal factor which assigns them to the transparent area of the typological scale. By contrast, the fact that head-initial languages can appear at any position on the continuum, cf. the extremes typified by English and Russian, and that they may even exhibit a split distribution of properties provides support for our theory that parsing is indeed the decisive factor.
It should be noted, though, that the fact that head-initial languages are not inherently bound into a specific range within the Semantic Typology does not mean that this typology has no bearing on them. On the contrary, the splits observed in Jacaltec and Indonesian are not random but follow the organization of the grammar into distinct modules, with some modules going one way and others the other way. In the case of Indonesian, it was possible furthermore to identify independent principles which led to the assignment of higher transparency to a particular module of the grammar. Finally, head-initial languages conform to the Semantic Typology insofar as they comply with its auxiliary hypotheses like the Grammatical Relation Manipulation Hierarchy (18), the extraction hierarchy discussed in section 1.2, and the Case Transparency Principle (204).

4.4.2. Sawu

The Sawu language is a member of the Central Malayo-Polynesian branch of the Austronesian family (Grimes 1987: 525). It is presently spoken by about 80,000 people, mostly on the island of the same name in the Indonesian province of Nusa Tenggara Timur and in several emigré communities on the larger neighboring islands of Timor and Flores as well as on Java (Walker 1982: 1). Walker (1982) is the only reliable description of the language to date, and all the data in this section come from this source. Sawu is interesting for us here not only because it represents a small, little studied language but also because, like Jacaltec, it operates on an ergative-absolutive basis. Furthermore, Sawu can serve as a less "contaminated" counterpoint to the heavily European-influenced Bahasa Indonesia (Becker and Wirasno 1980; cf. the discussion on pages 152-153 above.)

Sawu has little in the way of morphology apart from number agreement between the verb and the absolutive noun phrase, the causative prefix pe-, the reciprocal prefix pe-, and reduplication (Walker 1982: 22-23).42 Clauses are structured by the use of sixteen case prepositions which indicate the semantic functions of the various noun phrases, such as locative, benefactive, source, goal, instrumental, etc. Only absolutive noun phrases have no overt marking (Walker 1982: 16). Sentence (203) from Walker (1982: 37) illustrates the use of these case prepositions:

(203) ta wèbe Ø noo ri j'aa pa kètu ne
      NON-PAST hit (SG) ABS 3SG ERG 1SG LOC head DEM/1SG
      ri aj'u ne.
      INST stick DEM/1SG
      'I will hit him on the head with this stick.'

Despite the ergative case marking system, the absolutive noun phrase in an intransitive clause and the ergative noun phrase in a transitive clause cover the semantic ground typical of subjects cross-linguistically, while transitive absolutives have the typical semantic content of objects (Walker 1982: 13-15).
Walker's detailed list of the kinds of participants that are coded as ergatives and as intransitive absolutives shows that both are typically agents or experiencers, ergatives invariably so.43 A similar list of transitive absolutive referents reveals that they are patients, recipients, or "referents that come into being as the result of an action" (Walker 1982: 13). Assuming the correctness of Walker's findings, we can conclude that the distribution of the various semantic roles in Sawu operates on a nominative-accusative basis and that the nominative and the accusative referents have the narrow semantic range that is characteristic of high semantic transparency.

Syntactically, Sawu can be characterized as a head-initial language because it is strictly prepositional and because verb-initial sentences predominate. It is possible for one noun phrase, which must be in the ergative or the absolutive case, to precede the verb (Walker 1982: 40), but any auxiliaries must be sentence-initial (Walker 1982: 46). Otherwise, the order of the various noun phrases is syntactically free and governed by such considerations as the animacy hierarchy (Comrie 1989a), Foley's Referentiality Hierarchy (Foley 1976), and definiteness (Walker 1982: 52-53); Walker does not discuss what happens when these criteria identify different participants. A syntactic constraint exists only in intransitive clauses, where the absolutive noun phrase must be the left-most noun phrase (Walker 1982: 52).

Despite the preponderance of these head-initial properties, the word order within noun phrases deviates somewhat from the standard right-branching pattern. The head noun is obligatorily followed by the genitive and by any demonstratives, as one would expect, but articles precede the noun. Quantifiers and numerals (for cardinal numerals, the numeral plus classifier combination) can either precede or follow the head noun (Walker 1982: xv). Unfortunately, Walker contradicts himself with respect to the position of relative clauses: page xv announces that relatives may precede or follow the head noun, but his section describing the relative construction in Sawu states categorically that they follow the noun (Walker 1982: 44). All of his examples present postnominal relatives.
In any case, these noun phrase internal word order properties, though surprising from the point of view of a rigid right-branching X-bar schema, become readily plausible when they are seen in the perspective of the different processing assumptions made for head-initial and for head-final languages by Frazier and Rayner (1988: 266-273) and by Hawkins (1990: 228-232). Since Sawu is essentially a right-branching language, it will use a top-down processing strategy. Consequently, the auxiliary can be relied upon by the hearer to construct an S-node since all sentences other than imperatives and copulative sentences seem to require an auxiliary, judging from Walker's data. The variety found in the noun modifier positions of Sawu will not lead to frequent misinterpretations for two reasons: first, because the occurrence of a determiner on-line unambiguously constructs a noun phrase; and secondly, because Sawu noun phrases are generally introduced by case marking prepositions which, therefore, serve to set the boundaries between the various maximal noun phrases within a clause. Misassignments are likely only in a limited number of circumstances, for example when an absolutive noun phrase which does not immediately follow the verb is preceded by a quantifier or numeral. Because absolutive noun phrases do not have a case preposition, such a quantifier could refer to the preceding or to the following noun phrase; Walker does not mention whether Sawu uses any disambiguating devices, such as intonation, to resolve such conflicting parses. In any case, once a noun phrase has been unambiguously constructed by a preposition, no disadvantage arises from permitting noun-preceding as well as noun-following modifiers.

There is no evidence of syntactic movement in Sawu. Walker (1982: 24) claims that passivization does not exist in Sawu. Moreover, there is no WH-movement, with WH-phrases simply occurring in situ wherever the corresponding noun phrase could occur, although certain adjunctive WH-words, e.g., talèki 'why', have syntactically prescribed positions, usually in clause-initial position (Walker 1982: 41-43). None of the data available on Sawu in Walker (1982) and none of his remarks on the language give any indication that raising or adposition stranding, or any other type of syntactic movement, might exist in Sawu.

Sawu, thus, falls squarely in line with the transparent language type. Lacking all syntactic movement, it is characterized by a syntactically free clause-internal word order and a large number of case-marking prepositions which delineate the semantic content of the various noun phrase participants in a sentence. We can therefore place Sawu close to the transparent end of the typological continuum developed in Chapter 1. This type assignment is interesting since it clarifies the role of the morphology within the Semantic Typology. It will be remembered that both Sapir and Hawkins (cf.
pages 3-4 and 9-10 above) have claimed that semantic transparency correlates with the presence of an extensive inflectional morphology; moreover, Hawkins viewed a richer morphology per se as characteristic of the transparent type (Hawkins 1986: 125). The data from Sawu now confirm that the correct generalization should be slightly different. The fact that Sawu can maintain a high level of semantic transparency despite the virtual absence of inflectional morphology shows that the decisive factor is not the presence of any morphology per se, but the existence of some sort of overt participant role marking. In many languages, such participant marking will take the shape of morphological case markers, but not necessarily so. Sawu clearly does not have any case-marking affixation. What it does have is a set of prepositional case markers (cf. the case marking postpositions in Japanese), which can serve to delineate the semantic range of the various participants in a sentence. As we argued in section 1.2.1 above, this material marking of the grammatical relations is now confirmed to be the crucial factor, whether it is morphological or not. We expect, therefore, that a language with an otherwise rich morphology, but which does not use material case marking to identify the various grammatical relations, is not necessarily bound into the transparent mode just because of its morphology. The properties of Jacaltec, which was discussed in the preceding section, may corroborate that this prediction indeed holds. Sapir's and Hawkins' concentration on the morphology is understandable, of course, since an extensive inflectional morphology in a given language frequently includes morphological case markers. Sawu, though, uses non-morphological means to mark case within the context of a largely isolating grammar.

4.4.3. Babungo

A further question which immediately arises is, of course, whether it is possible to achieve a high level of semantic transparency in a language which lacks case marking altogether. There is evidence from Babungo, a Bantu language spoken by about 14,000 people in Cameroon, that this is indeed possible. All the information available on this language indicates a high degree of semantic transparency: Babungo lacks movement processes like WH-movement (Schaub 1985: 8) or adposition stranding (Schaub 1985: 67), and there is no voice opposition to reorganize the grammatical relations in a sentence (Schaub 1985: 209). The basic SVO order can be changed by topicalizing the object, yielding a surface OVS order, or by putting the subject into focus behind the verb; for intransitive clauses, focussing the subject simply results in a surface verb - subject order, but for transitive clauses, a verb copy without any accompanying tense or aspect markers must be repeated after the subject, resulting in a VSVO surface order (Schaub 1985: 62). However, Babungo does not have any case marking, morphological or otherwise, except for some remnants like the special locative form which remains in a few nouns (Schaub 1985: 139) and the distinct indirect object/beneficiary marker in the first person singular pronoun. The rich morphology of Babungo is limited to marking noun classes in nouns (Schaub 1985: 171-185), noun class agreement on most modifiers of a head noun (Schaub 1985: 185), and aspect and non-finite forms in verbs, among other things, none of which are indicators of participant roles.

As a result, a more complex picture of the relationship between morphology, case marking, and semantic transparency emerges: It seems to be true as a cross-linguistic generalization that the presence of material case marking is sufficient to tie a language to the transparent language type, but the converse does not hold. Languages without a system of case marking may have a high degree of grammaticization, like English, or they may not, like Babungo. This pattern can be stated as the Case Transparency Principle in (204):

(204) If a language has material case marking, it will have a high degree of semantic transparency overall.

Since the implication is non-reversible, no prediction can be made based on the absence of case marking in a given language. The implication defines four possible collocations of properties, one of which is ruled out:

Table 10: Case marking and semantic transparency

- case, - transparent    English, Dutch
- case, + transparent    Babungo, Hixkaryana
+ case, + transparent    Japanese, Sawu, Russian, etc.
+ case, - transparent    (none)

4.4.4. Chinese

4.4.4.1. A remark on word order

(Mandarin) Chinese has a certain number of head-final characteristics, most notably its prenominal noun phrase modifiers and the option to place a bǎ-marked object before the verb, which led Li and Thompson (1981) to claim that Chinese has been undergoing a slow, long-term shift from being head-initial to being head-final. This view has recently been challenged by P. Li (1990), who shows that Chinese syntax has remained remarkably stable through the centuries irrespective of its word order "inconsistencies" and that the changes that have taken place do not add up to a shift towards a basic SOV order. In fact, some word order changes can be interpreted as a change towards the SOV type, for example, the historical change from postnominal to prenominal relatives, while others can be interpreted as a change toward a more consistent head-initial syntax, e.g., the introduction of postverbal pronominal objects in interrogative and in negative sentences (Li 1990: 12). In the present-day language, noun phrases are head-final, but the head-final patterns at the clause level are marked variants, as the example of the bǎ-construction demonstrates: An unmarked SVO word order without bǎ is fine, and SOV ordering can be achieved when the object is preceded by the marker bǎ, as in sentence (205), which is adapted from Li (1990: 3). My language consultants did not accept sentence (205b) without the marker bǎ.

(205) a. Tā  dōu  chīguāng   le   yìhé    táng.
         he  all  eat-empty  ASP  one/CL  candy
         'He ate up the whole box of candy.'
      b. Tā  bǎ  yìhé    táng   dōu  chīguāng   le.
         he  BA  one/CL  candy  all  eat-empty  ASP
         'He ate up the whole box of candy.'

Chinese is, therefore, treated as a right-branching language in this work.

4.4.4.2. Basic grammatical properties

Chinese is well known as an isolating language which lacks inflectional morphology. Certainly there is no case marking, and the basic grammatical relations are identified by word order only. Example (206) from Chu (1983: 126) is a basic transitive sentence:

(206) Lǎo  Zhāng  yòng  yàoshi  kāile     mén.
      old         with  key     open-ASP  door
      'Old Zhang opened the door with the key.'

Different word orders result from topicalization, as for instance in sentence (207) where the logical object appears in the sentence-initial topic position, or syntactic processes like the bǎ construction (cf. example (205b) above) and passivization (cf. examples (211) to (213) below).

(207) Nǐ   men  de    huà       wǒ  bù   dǒng.
      you  PL   POSS  language  I   not  understand
      'I do not understand your language.'

The preverbal subject position can be filled not just by agentive noun phrases. Sentences (208) through (210) present some typical examples:45

(208) Wǒ  zhīdào  zhèibǎ   suǒ  néng  kāi   hòu   mén.
      I   know    this/CL  key  can   open  back  door
      'I know that this key opens the back door.'

(209) Èrshí   nián  qián  shí  měifēn         kěyǐ  mǎi  yíkuài  táng.
      twenty  year  ago   ten  American-cent  can   buy  one/CL  candy
      'Twenty years ago ten cents could buy a candy bar.'

(210) Hěn   buxìng,      wǒ  de    jítā    zuówǎn      duànle     yìgēn   xuán.
      very  unfortunate  I   POSS  guitar  last-night  break-ASP  one/CL  string
      'Unfortunately my guitar broke a string last night.'46

4.4.4.3. Syntactic movement

Mandarin Chinese forms the passive unproblematically by promoting the underlying object to subject status; the demoted agent is expressed in a phrase governed by the preposition bèi, as in the following example from Li (1990: 9):

(211) Yú    bèi  māo  chīle.
      fish  by   cat  eat-ASP
      'The fish was eaten by the cat.'

Indirect objects and direct objects are equally accessible to passivization (Chu 1983: 221):

(212) Wǒ  bèi  jǐngchá  fále      sānshíkuài  qián.
      I   by   police   fine-ASP  30-dollar   money
      'I was fined 30 dollars by the police.'

(213) Tā  kāi    chē  de    shíhòu,  cháng  bèi  rén     àn    lǎba.
      he  drive  car  POSS  time     often  by   person  blow  horn
      'When he drives, he is often blown the horn on by people.'

Although this construction figures prominently in Li and Thompson's theory of ongoing word order change in Chinese, Wang (1956, cited by Li (1990: 9)) found that the bèi passive was popularized as early as the fifth century A.D. Earlier in its history, Chinese used two personal passive constructions, one marked by jiàn ... yú and the other by wèi ... suǒ (Li 1990: 8; examples from the same source):

(214) Chén  kǒng  jiàn  qī       yú  Qín.
      I     fear        deceive
      'I was afraid to be deceived by Qin.'

(215) Hàn  jūn   wèi  Chǔ  suǒ  jí.
           army                 chase
      'The Han army were chased by the Chu (soldiers).'

Data from Chinese have played an important role in recent research on Logical Form beginning with Huang (1982) because of the absence of extraction in that language. In the formation of WH-questions, the question word simply remains in situ wherever it is base-generated. Sentences (216) and (217) illustrate this fact:

(216) Zhāngsān  yào  nǐ   mǎi  shénmo?
                ask  you  buy  what
      'What did Zhangsan ask you to buy?'

(217) Nǐ   rènwéi  jǐngchá  zài  tā  de    wūzili        zhǎodàole  shénmo?
      you  think   police   in   he  POSS  house-inside  find-ASP   what
      'What do you think the police found in his house?'

However, Mandarin Chinese, like so many other languages, has sentences which look like potential instances of object-to-subject raising. This construction has been discussed in detail by Shi (1990), who gives the following example:

(218) a. [Chóngfù  zhèigè   gùshì]  hěn   nán.
          repeat   this-CL  story   very  difficult
         'To repeat this story is very difficult.'
      b. Zhèigè   gùshì  hěn   nán        chóngfù.
         this-CL  story  very  difficult  repeat
         'This story is very difficult to repeat.'

Proponents of a raising analysis have put forward several different syntactic mechanisms to account for this construction in Chinese, but Shi (1990: 3-6) demonstrates convincingly that its peculiarities argue against any raising analysis: First of all, a one-to-one relationship between two sentences like those in (218) does not always exist; secondly, any posited raising process would have to be able to operate across intervening clauses; thirdly, the raising process would be able to apply optionally in the sense that it cannot be triggered by the failure of the subject slot to be filled; and, finally and most strikingly, the "raised" noun phrase does not behave like the surface subject of the matrix because it can co-occur with an underived subject of the matrix verb and because it does not control reflexivization with ziji 'self'. These facts make it very difficult to argue that this noun phrase has become a matrix subject, which an analysis in terms of object-to-subject raising is obliged to do. But all of these facts fall out straightforwardly when the initial constituent in sentences like (218b) is analyzed not as a raised noun phrase but as a (base-generated) topic constituent. We can, therefore, state that of the grammatical processes in the Grammatical Relation Manipulation Hierarchy only passivization exists in Chinese.

4.4.4.4. Assessment of the Chinese facts

The peculiarities of Chinese syntax have long made it a sticking point for linguistic typology. Because it forms an exception to the predictions of word order typology, a tempting analysis has been to claim that it is in the process of changing from one basic type to another, as Li and Thompson have done, but the observed diachronic changes in its syntax appear to have little directionality towards any one type (Li 1990).

For our processing-driven typology, the same syntactic peculiarities of Chinese present some difficulties. As an essentially right-branching language without material case marking, Chinese lacks the two structural factors which would compel a high degree of semantic transparency according to the criteria developed above. To this extent, Chinese is therefore free to align on any position of the typological continuum so that its highly transparent properties with respect to argument trespassing are not surprising. Moreover, we have already observed for Indonesian (cf. Chapter 3) and for Jacaltec (cf. section 4.4.1) that it is possible for right-branching languages to exhibit split properties between their positions regarding the codification of the basic grammatical relations and argument trespassing. This split distribution in Chinese is very much comparable to that in Indonesian, with the main difference that argument trespassing is severely constrained in Indonesian but totally ungrammatical in Chinese.

A problem arises only in connection with the processing of Chinese relative clauses. By any measure we have seen so far, a right-branching syntax would coincide with a top-down processing mechanism; but the application of top-down parsing would necessarily induce massive garden path misreadings when it comes to interpreting prenominal relative clauses. Since there is no overt marker which signals the fact that an embedded sentence is about to begin, the parser will frequently misassign material from the embedded relative to the matrix until a monoclausal reading becomes untenable, often well into the relative clause. Such considerations identify prenominal relatives as dispreferred for right-branching languages, as indeed they are cross-linguistically, and predict that diachronic change would tend to eliminate them in favor of postnominal relatives. Paradoxically, though, prenominal relative clauses are a syntactic innovation in Chinese while an earlier postnominal strategy was lost (Li 1990: 11). At present, we have no solution to this paradox. Certainly, any final resolution would have to go beyond our problem of the processing of relative clauses.
To be satisfactory, such a resolution would have to account for all the peculiar word order collocations in Chinese and explain the directionality of their syntactic change as well as clarify the processing mechanisms needed. Such an encompassing resolution is, obviously, beyond the scope of the present typological survey.

4.4.5. Hebrew

Hebrew is a Semitic language which remains strictly head-initial even though it has undergone a shift in basic word order from VSO to SVO. Although Hebrew lost its morphological case markers in pre-biblical times, Modern Hebrew uses the particle et as a marker of definite direct objects, thus providing a crucial material distinction between subjects and many objects. However, et is often deleted in colloquial Israeli Hebrew (Alan Kaye, personal communication). The suffix -a indicating 'direction towards', cf. ha-ira 'to town' and yerusalayma 'to Jerusalem', is the major remaining reflex of the Semitic morphological case marking system that survives in nouns (Berman 1978: 75). Otherwise, nouns have no overt case marking in Modern Hebrew, but agreement is marked on verbs (Hetzron 1987: 698-699) and on prepositions (Berman 1978: 77-79).

Brief work with a Hebrew language consultant indicates that subjects of transitive verbs are typically agents. Generally, my consultant's grammaticality judgements for sentences with non-agentive subjects ranged from marginally acceptable to completely ungrammatical, and a version with an agentive subject was invariably the preferred way of expressing a proposition. Sentence (219a), for example, represents the basic transitive pattern with an agentive subject. Its grammaticality contrasts sharply with that of (219b), which differs only in the choice of a non-agentive subject. If the agent is not to be expressed, Hebrew prefers the construction shown in sentence (219c), which codes the instrument ha-mafteaH 'the key' as an oblique and utilizes the option of pro-drop for the subject:47

(219) a. rivka  patHa   et   ha-delet.
                opened  ACC  the-door
         'Rivka opened the door.'
      b. ??ha-mafteaH  pataH   et   ha-delet.
           the-key     opened  ACC  the-door
         'The key opened the door.'
      c. pataH   et   ha-delet  im    ha-mafteaH.
         opened  ACC  the-door  with  the-key
         '[Some person] opened the door with the key.'

In a similar vein, an intransitive construction is required in sentence (220) because of the non-agentive subject sfaro 'his book':

(220) sfaro     ha-Hadash  nimkar  10000  pe'amim.
      book-his  the-new    sold           times
      'His new book sold 10000 copies.'

Locatives can become the subject of a few verbs like hama 'hum', nataf 'drip with', and zalag 'flow' even while the agent becomes "an apparent object" in "a minor construction functioning like a passive" (Glinert 1989: 139-140):

(221) a. nemalim  shartsu  ba-ir.
         ants     swarmed  in-city
         'Ants swarmed in the city.'
      b. ha-ir     shartsa  nemalim.
         the-city  swarmed  ants
         'The city swarmed with ants.'

Glinert (1989: 166-167) points out that the construction of examples like (221b) differs from transitive sentences not just in the choice of a non-agentive subject but also in the fact that the postverbal noun phrase does not behave like other direct objects. Specifically, it cannot be promoted to subject via passivization. The fact remains that some non-agentive participants can become subjects at least of quasi-transitive clauses, although this occurs only rarely.

With respect to the properties relating to the Grammatical Relation Manipulation Hierarchy, Hebrew has the passive and WH-extraction, but raising is quite limited. There is no adposition stranding (Cole et al. 1977: 43; Glinert 1989: 273). Passivization is common in formal or technical contexts but less used in casual speech where topicalization by word order shifting and impersonal verbs are preferred (Glinert 1989: 140). Example (222) from Glinert (1989: 138) illustrates Hebrew passivization:

(222) a. rafi  histir  et   ha-dolarim.
               hid     ACC  the-dollars
         'Rafi hid the dollars.'
      b. ha-dolarim   husteru      al-yedey  rafi.
         the-dollars  were-hidden  by
         'The dollars were hidden by Rafi.'

WH-words invariably move to the sentence-initial position as shown in example (223).48 Notice the obligatory application of pied piping in sentence (223) to avoid the stranding of the preposition im 'with':

(223) im    mi    at   nosaat?
      with  whom  you  are-going
      'Who are you going with?'

Sentence (224) demonstrates that WH-movement can extract material from embedded clauses as well:

(224) efo    Hashavt      she-eshev,     al  ha-gag?
      where  you-thought  that-I'd-sit   on  the-roof
      'Where did you think that I'd sit, on the roof?'

Raising is either non-existent as a grammatical process in Hebrew or restricted to sentences like (225a), which Glinert (1989: 330-332) discusses in a section entitled "Raising in object clauses":

(225) a. yosef  alul    lenatseaH.
                likely  [Object Clause]
         'Yosef is likely to win.'

Unfortunately, Glinert's description of this process differs from the conventional understanding of raising: "The object clause is semantically [sic!] not subordinate to the verb or adjective governing it, and it is the subject that is subordinate" (1989: 330). In other words, Glinert assumes that the predicate of the object clause raises while its subject yosef remains subordinate. He does not motivate this idiosyncratic analysis. Still, his statement is interesting since Gundel - Houlihan - Sanders (1988: 295) have explicitly claimed that Hebrew has neither subject-to-subject raising nor object-to-subject raising. Nevertheless, the existence of gender agreement between the subject and the verb in sentences like (225b) provides some evidence for subject-to-subject raising in Hebrew. The agreement between rivka and alul-a demonstrates that rivka, the logical subject of the dependent clause, functions as the surface subject of the matrix:

(225) b. rivka  alul-a           lenatseaH.
                likely-FEMININE  [Object Clause]
         'Rivka is likely to win.'

In any case, sentences like (225) appear to be the only possible candidates for raising in Hebrew, so if raising exists as a grammatical process, it is definitely a very limited phenomenon.

In conclusion, Hebrew has a level of semantic transparency similar to that of German. Of the linguistic properties that comprise the Grammatical Relation Manipulation Hierarchy, only passivization and WH-extraction exist. The existence of raising is doubtful, and there is no adposition stranding. Modern Hebrew lacks case-marking but, as in German, non-agentive subjects are strongly dispreferred with transitive verbs.

5. Summary of results

The results of the typological survey can be summarized as follows: Our cross-linguistic study confirmed that left-branching languages generally have a high degree of semantic transparency. In fact, precisely those SOV languages which are least consistently left-branching are also the ones where we find lesser semantic transparency. Dutch, for example, turned out to have a low overall transparency, although more than English; even German, which definitely has a much greater transparency than either Dutch or English, permits WH-extraction and, possibly, some raising. Right-branching languages, on the other hand, were found at various positions on the typological continuum, in keeping with the prediction that their top-down processing system does not compel them into the transparent range. Of course, top-down processing does not preclude a high semantic transparency so that, given the propensity of natural languages for redundancy, the high transparency of languages like Russian is not surprising. It was also precisely among the right-branching languages that we discovered fundamental splits, with some modules of the grammar displaying a high level of transparency and others a high degree of grammaticization, cf. the situation in Indonesian, in Jacaltec, and in Chinese. This is possible because of the lack of pressure from the processing mechanism to adhere to a semantically transparent organization and the resulting freedom of these languages to choose, or not to choose, a high level of transparency.
What is particularly striking is that despite this freedom, right-branching languages follow the predictions of the Semantic Typology in so far as their positions with respect to the typology are not random but obey its subsidiary hypotheses like the Grammatical Relation Manipulation Hierarchy, the Extraction Hierarchy, and the Case Transparency Principle, which prohibits the co-occurrence of material case marking and low overall transparency in a language. Indeed, no such violating language was found. Our results for the grammatical manipulation of the basic grammatical relations are summarized graphically in Table 11. Table 11 shows a clear hierarchical distribution of the various syntactic processes that can rearrange the argument structure of a sentence as predicted by the Grammatical Relation Manipulation Hierarchy. The only stray data points are the limited existence of argument trespassing in Hixkaryana and in Japanese, cf. the discussion in the text. Both of these, however, are highly restricted processes.

Table 11: Typological overview: Syntactic manipulation of grammatical relations

                 Passive   WH-extraction   Raising   P-stranding^a

(A) Left-branching languages
Hixkaryana       no        no              (r)^b     no
Korean           (r)       no              no        no
Malayalam        (r)       no              no        no
Japanese         yes       no              (no)      no
Turkish          yes       no              (no)      no
German           yes       yes             (no)      no
Dutch            yes       yes             yes       yes

(B) Right-branching languages
Sawu             no        no              no        no
Babungo          no        no              no        no
Russian          (r)       no              no        no
Chinese          yes       no              no        no
Hebrew           yes       yes             (no)      no
Indonesian       yes       yes             (r)       no
Jacaltec         yes       yes             yes       yes
English          yes       yes             yes       yes

a. P-stranding = preposition stranding or postposition stranding
b. (r) = restricted.
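The hierarchical distribution claimed for Table 11 can be checked mechanically. The following is a minimal sketch, not the author's procedure: it assumes the ranking Passive > WH-extraction > Raising > P-stranding (the hierarchy itself is stated earlier in the book, outside this section), it takes the cell values as they can be read off Table 11, and it assumes that "(r)" counts as present while "(no)" counts as absent. The names `HIERARCHY`, `TABLE_11`, and `violations` are hypothetical.

```python
# Assumed ranking of the processes, highest (most widely available) first.
HIERARCHY = ["passive", "wh_extraction", "raising", "p_stranding"]

# Cell values as read off Table 11, in the order of HIERARCHY.
TABLE_11 = {
    "Hixkaryana": ["no", "no", "(r)", "no"],
    "Korean": ["(r)", "no", "no", "no"],
    "Malayalam": ["(r)", "no", "no", "no"],
    "Japanese": ["yes", "no", "(no)", "no"],
    "Turkish": ["yes", "no", "(no)", "no"],
    "German": ["yes", "yes", "(no)", "no"],
    "Dutch": ["yes", "yes", "yes", "yes"],
    "Sawu": ["no", "no", "no", "no"],
    "Babungo": ["no", "no", "no", "no"],
    "Russian": ["(r)", "no", "no", "no"],
    "Chinese": ["yes", "no", "no", "no"],
    "Hebrew": ["yes", "yes", "(no)", "no"],
    "Indonesian": ["yes", "yes", "(r)", "no"],
    "Jacaltec": ["yes", "yes", "yes", "yes"],
    "English": ["yes", "yes", "yes", "yes"],
}

# Assumption: restricted processes count as present, "(no)" as absent.
PRESENT = {"yes", "(r)"}


def violations(values):
    """List (lower, higher) pairs where a lower-ranked process is present
    although a higher-ranked one is absent."""
    found = []
    for i, value in enumerate(values):
        if value in PRESENT:
            for j in range(i):
                if values[j] not in PRESENT:
                    found.append((HIERARCHY[i], HIERARCHY[j]))
    return found


STRAYS = {lang: violations(vals)
          for lang, vals in TABLE_11.items() if violations(vals)}
print(STRAYS)
```

Under these assumptions only Hixkaryana is flagged, because its restricted raising co-occurs with the absence of passive and WH-extraction. Japanese is not flagged, since its raising cell is coded "(no)"; treating "(no)" as marginally present instead would change which languages count as stray.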

The properties relating to the codification of the participant structure are less binary and, consequently, less straightforwardly diagrammed. Table 12 represents an attempt to display these properties in tabular form, but the limitations of this endeavor should be kept in mind. For instance, case still exists as a feature of English and Dutch pronouns but has been lost in nouns generally, hence the categorization "no case" for these two languages. Similarly Hebrew, which has long lost the Semitic morphological case markers in nouns (but not in pronouns), is marked as "no case" despite the existence of the accusative marker et. The semantic range of subjects and direct objects is given as a relative scale ranging from narrow to wide. For subjects, an assignment of "narrow" implies that subjects, especially of transitive sentences, are limited essentially to agents while the classification "wide" indicates that a variety of semantic roles can become subjects. When referring to objects, the term "wide" signals that a language has one large object class whose members are treated identically for syntactic processes operating on objects, such as passivization; the assignment "narrow" means that the language differentiates different types of objects based on semantic differences, typically separating prototypical patients from other roles such as benefactive.

Table 12: Typological overview: Syntactic codification of participant structure

                 Case   Subject range   Object range

(A) Left-branching languages
Korean           yes    narrow          narrow
Malayalam        yes    narrow          narrow
Japanese         yes    narrow          narrow
Turkish          yes    n.d.^a          narrow
German           yes    narrow          narrow
Hixkaryana       no     n.d.            narrow
Dutch            no     wider           wider

(B) Right-branching languages
Sawu             yes    narrow          narrow
Russian          yes    narrow          narrow
Babungo          no     n.d.            n.d.
Jacaltec         no     narrow          narrow
Hebrew           no     narrow          n.d.
Chinese          no     wider           widest
Indonesian       no     wider           widest
English          no     widest          widest

a. n.d. = no data available.
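The one-way character of the Case Transparency Principle (204) can be illustrated against the Table 12 data. The sketch below is an illustration under stated assumptions rather than the author's test: it uses a narrow subject and object range as a stand-in for "high transparency" (the principle itself speaks of transparency overall), it treats "n.d." as providing no counterevidence, and the names `TABLE_12`, `WIDE`, and `violates_principle` are hypothetical.

```python
# (case marking, subject range, object range) as read off Table 12.
TABLE_12 = {
    "Korean": ("yes", "narrow", "narrow"),
    "Malayalam": ("yes", "narrow", "narrow"),
    "Japanese": ("yes", "narrow", "narrow"),
    "Turkish": ("yes", "n.d.", "narrow"),
    "German": ("yes", "narrow", "narrow"),
    "Hixkaryana": ("no", "n.d.", "narrow"),
    "Dutch": ("no", "wider", "wider"),
    "Sawu": ("yes", "narrow", "narrow"),
    "Russian": ("yes", "narrow", "narrow"),
    "Babungo": ("no", "n.d.", "n.d."),
    "Jacaltec": ("no", "narrow", "narrow"),
    "Hebrew": ("no", "narrow", "n.d."),
    "Chinese": ("no", "wider", "widest"),
    "Indonesian": ("no", "wider", "widest"),
    "English": ("no", "widest", "widest"),
}

# Assumption: these two labels signal low transparency of the relations.
WIDE = {"wider", "widest"}


def violates_principle(case, subject_range, object_range):
    """True only if a case-marking language has a wide subject or object range.
    Languages without case marking can never violate the implication."""
    if case != "yes":
        return False
    return subject_range in WIDE or object_range in WIDE


COUNTEREXAMPLES = [lang for lang, row in TABLE_12.items()
                   if violates_principle(*row)]
WIDE_WITHOUT_CASE = [lang for lang, (case, subj, obj) in TABLE_12.items()
                     if case == "no" and (subj in WIDE or obj in WIDE)]
print(COUNTEREXAMPLES)
print(WIDE_WITHOUT_CASE)
```

With this reading of the table there are no counterexamples, while the caseless languages split between narrow and wide ranges, which is the pattern a non-reversible implication leads one to expect.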

Just as the data in Table 11 on the possible syntactic manipulation of the basic grammatical relations did, the data in Table 12 on the syntactic codification of the participant structure reveal the preponderance of semantically transparent organization among left-branching languages. With the exception of Dutch, which is in many other ways atypical of head-final languages, all of the languages that can be considered head-final have semantically narrow grammatical relations, and most have material case marking.

As our processing-based theory predicted, right-branching languages do not pattern uniformly. Whereas Babungo, Sawu, and Russian have a highly transparent organization across all properties tested, English is characterized by the lowest overall transparency. Yet other languages display a split distribution, scoring high on one dimension and low on another. Jacaltec's low transparency with regard to movement processes shown in Table 11 contrasts with the high semantic transparency of its grammatical relations. Conversely, Chinese and, to a lesser extent, Indonesian combine a high transparency in the area of movement processes with low transparency of the basic grammatical relations. Significantly, all of these splits are found among the right-branching languages, i.e., among those languages that are not compelled into the transparent range by their processing strategy. At the same time, no strongly grammaticizing languages were found among the more consistent left-branching languages, as predicted by our processing-driven account, which attributes this distribution to the different demands made by the bottom-up processing strategies required by head-final languages and the top-down processing strategies possible in head-initial languages. Our model even predicted correctly that an inconsistent SOV language whose structure is compatible with a top-down parser will not be tied into the transparent mode required for consistent left-branching languages, as the example of Dutch shows. In sum, the Semantic Typology developed in this work has been confirmed by the cross-linguistic survey.

Notes

Chapter 1: Introduction

1. An early version of parts of this work, which represents its state as of 1989, has been published in Kefer and Van der Auwera (1992). That version differs from the present one both in its conceptual basis and in its empirical coverage. It focuses on Korean and Indonesian data.
2. The semantic component is thought to have three principal tasks: the determination of the lexical semantics, of the argument structure, and of scope relationships. In this work, we will be concerned primarily with the second of these. From the outset it should be said that the term "Semantic Typology", which we adopt from Hawkins (1986) to refer to the typology under discussion, is somewhat misleading since it may imply that the parameter is intended to typologize the semantic component of language(s) as such. This is not the case, however. What the parameter does instead is to typologize languages as to the closeness of the fit between the semantic structure and its syntactic realization. Only three minimal assumptions about the nature of the semantic component need to be made for the purpose of testing this typology: first, we assume that semantic representations are unambiguous; secondly, we assume that they contain no deletions, so that all elements are overt; and thirdly, we assume that they contain no moved arguments or predicates. Following Hawkins, we will limit our assumptions about the nature of the semantic component to this three-item set.
3. Case does not seem to be a sine qua non for achieving word order freedom comparable to that of German. Witness the situation in Dutch and, to a lesser extent, the Scandinavian languages, which have similar word order properties to German in main clauses (Dutch also in subordinate clauses), despite the fact that they underwent case loss on a comparable scale to English.
4. In their reviews of Hawkins (1986), both McKay (1987) and Lenerz (1987) have made a similar argument with respect to the systematic relationship between -ly adverbs and their corresponding adjectives in English. German lacks a morphological distinction between adjectives and adverbs.

5. All Russian examples in this section are adapted from Comrie (1989b). Sentences (10) through (13) are from Comrie (1989b: 1, 4, 8, and 10), respectively. Sentence (14) is also from page 10 of the same source.
6. The conception of the extremes defined by the typology differs from the hypothetical language discussed in Hawkins (1986: 126). For Hawkins, complete transparency means that every possible meaning distinction has its own distinct surface form, yielding an infinite number of surface forms, while complete grammaticalization is understood as the collapsing of all possible semantic differences onto a single grammatical form. Needless to say, no usable natural language could be located anywhere near the extreme values of a typology so defined. The present work takes a narrower view of the range that needs to be defined for the Semantic Typology. In our view, transparency becomes a measure of the degree to which the syntax preserves the distinctions made in the semantics. An example from the realization of semantic roles should illustrate this difference: We take it for granted that the semantic component of a language distinguishes a certain, finite number of semantic roles. In the most extreme transparent scenario, the syntax would preserve all of these semantic distinctions, but the syntax cannot increase transparency by going beyond the number of semantically distinguished roles, since any deviation from the semantics constitutes a loss of transparency. In other words, the syntax cannot reintroduce linguistically any real-world distinctions that are not made in the semantics. Introducing an infinite number of surface syntactic distinctions, therefore, would not increase semantic transparency as it is understood in this work.
7. I am grateful to Georg Bossong for directing my attention towards the centrality of subjects for the general theory.
8. For an illustrative contrast, compare these German facts with the situation in the highly transparent Korean, which is discussed in Chapter 2.
9. Of course, we would expect a strong correlation among the positions of the various properties of a language L, all things being equal. In addition, we would expect significant deviations to arise only for clearly identifiable reasons, cf. the discussion in the text.

Chapter 2: High semantic transparency: Korean

10. Proper names are given in the transliteration preferred by their bearers. Song (1988: ix-x) summarizes the correspondences between Yale spelling and other romanization systems as in the following table. In this table, column I presents Yale spelling. Column II lists the equivalent symbols in the McCune - Reischauer system, while column III illustrates the prevalent South Korean transliteration, and column IV, finally, its North Korean counterpart.

I      II       III     IV
p      p, b     b       p
ph     p'       p       ph
pp     pp       bb      pp
t      t, d     d       t
th     t'       t       th
tt     tt       dd      tt
s      s        s       s
ss     ss       ss      ss
c      ch, j    j       ts
ch     ch'      ch      tsh
cc     tch      jj      tss
k      k, g     g       k
kh     k'       k       kh
kk     kk       gg      kk
m      m        m       m
n      n        n       n
-ng    -ng      -ng     -ng
l      l, r     l, r    r
h      h        h       h
i      i        i       i

I      II       III     IV
wi     wi       wi      wi
ey     e        e       e
yey    ye       ye      ye
wey    we       we      we
oy     oe       oe      oi
ay     ae       ae      ai
yay    yae      yae     yai
way    wae      wae     wai
u      ü        eu      ü
e      ö        eo      ö
ye     yö       yeo     yö
we     wo       weo     wö
a      a        a       a
ya     ya       ya      ya
wa     wa       wa      wa
wu     u        u       u
yu     yu       yu      yu
o      o        o       o
yo     yo       yo      yo
uy     üi       eui     üi

11. Example (48) is adapted from Chang (1983: 222). The verb suffix -eyo marks this sentence as semiformal speech. Though alternating with the various mood markers in informal speech, -eyo occurs in declarative, interrogative, imperative and propositive sentences alike.
12. The verb form al-ko-iss-na consists of two verb bases, al- 'know' and iss- 'exist'. -ko, which elsewhere functions as a coordinator, is required here to connect the two verb bases.
13. In copular sentences, honorific agreement takes place only when the complement attributes a quality to the subject:

Sensayng-nim-i yengliha-si-ta.
teacher-HON-NOM smart-HON-DECL
'The teacher is smart.'

As a complement, swuip-ta 'easy' can be interpreted as a quality of the teacher himself ('an easy person') or as a quality of the courses he teaches. When honorific agreement is used, only the former interpretation is possible:

Sensayng-nim-i swuip-usi-ta.
teacher-HON-NOM easy-HON-DECL
*'The teacher is easy (as far as teaching his classes).'
'The teacher is easy (an easygoing character).'

This fact accounts for the ungrammaticality of the following sentence, where it is not the teacher himself that is said to be 'easy' but 'understanding' him is:

* Sensayng-nim-i yihayha-ki-ka swuip-usi-ta.
teacher-HON-NOM understand-COMP-NOM easy-HON-DECL
'The teacher is easy to understand.'

Significantly, sensayng-nim-i cannot agree with the lower predicate either in this reading:

* Sensayng-nim-i yihayha-si-ki-ka swuip-ta.
teacher-HON-NOM understand-HON-COMP-NOM easy-DECL
'The teacher is easy to understand.'

Of course, the same sentence is perfectly grammatical with the reading 'It is easy that the teacher understands' since, in this interpretation, sensayng-nim-i is the subject of the embedded verb and no raising has taken place. Honorific agreement can, hence, not be used as a test for whether the semantic object of the lower predicate has been promoted to matrix subjecthood via object-to-subject raising in sentences like (61).
14. The verb suffix -nun, which is glossed 'PNE' in example (42), is known as a "prenominal ending" in Korean grammar. Its precise function is unclear.


15. Examples (65) and (66) are adapted from J. Kim (1989: 7 and 26, respectively).
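
The vowel correspondences in the romanization table of note 10 can be read as a simple lookup. The following is only an illustrative sketch, not part of Song's (1988) presentation: it copies columns I (Yale) and III (South Korean transliteration) for the vowel digraphs listed in the table, and the names `YALE_TO_SOUTH_KOREAN_VOWELS` and `yale_vowel_to_south_korean` are hypothetical labels introduced here.

```python
# Illustrative lookup only: Yale spelling (column I) mapped to the prevalent
# South Korean transliteration (column III), vowels only, as listed in the
# table of note 10. The dictionary and function names are hypothetical.
YALE_TO_SOUTH_KOREAN_VOWELS = {
    "wi": "wi", "ey": "e", "yey": "ye", "wey": "we", "oy": "oe",
    "ay": "ae", "yay": "yae", "way": "wae", "u": "eu", "e": "eo",
    "ye": "yeo", "we": "weo", "a": "a", "ya": "ya", "wa": "wa",
    "wu": "u", "yu": "yu", "o": "o", "yo": "yo", "uy": "eui",
}


def yale_vowel_to_south_korean(vowel: str) -> str:
    """Return the column III spelling for a single Yale vowel symbol."""
    if vowel not in YALE_TO_SOUTH_KOREAN_VOWELS:
        raise ValueError(f"not a Yale vowel symbol in the table: {vowel!r}")
    return YALE_TO_SOUTH_KOREAN_VOWELS[vowel]


if __name__ == "__main__":
    for symbol in ("ey", "u", "uy"):
        print(symbol, "->", yale_vowel_to_south_korean(symbol))
```

The sketch deliberately handles one symbol at a time. Converting whole words would additionally require the consonant columns, whose equivalents in some systems depend on context (for example "p, b" in column II).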

Chapter 3: The interaction with other principles

16. Although both Prentice (1987) and Dreyfuss (1978), among others, use the form meN- with an archiphonemic nasal to capture the fact that the prefix-final nasal assimilates to the initial phoneme of the base that the prefix is attached to, it can be shown that the underlying form of the prefix must have a final velar nasal, since this is the allomorph which appears before vowels and is, thus, the only variant that cannot be derived by an assimilation rule.
17. The verb prefix ter- indicates involuntary action. In Indonesian, ter- verbs are generally passive, but in Malaysian active ter- verbs occur as well.
18. Of course, the use of a sentential adverb is possible in English, French, or German as well, but Indonesian lacks the option of raising. Example (91a) is adapted from Echols and Shadily (1975: 510).
19. The complementizer untuk is optional when the embedded clause has no overt material other than the verb. Sentence (93a) is adapted from Chung (1976b: 69). Sentence (93b) is found in the Indonesian magazine Popular (August 1990: 106).
20. The verb pikir 'think' in (98) and both verbs in (99) are in the so-called object-preposed passive construction (Chung 1976b), which is one of two major passive strategies in Indonesian. Unlike in the canonical di- passive, verbs with object preposing carry no overt passive affix. But, unlike active verbs, they are characterized by the cliticization of the underlying subject to the verb (hence kau-pikir in (98) and kau-kira in (99)). The position of the modal in mau Karto beli in (99) moreover shows the cliticization of the name Karto to the verb beli. Modals usually stand between the subject and the main verb but precede the logical subject in the object-preposed construction and, hence, form a test for cliticization. A second indicator that cliticization has occurred is the fact that only pronouns and, contrary to Chung's expectation (1976b: 61), modifierless names can be the subject of an object-preposed passive. The following sentence is the paradigm example of the object-preposed passive construction:

Buku itu  mau  Karto beli
book that want       buy
'This book wants to be bought by Karto.' = 'Karto wants to buy this book.'

This sequence of preposed object - modal - logical subject - verb is indicative of this construction.
21. At least one of my Indonesian consultants did not accept any examples of long-distance extraction in Indonesian.
22. Sentence (104a) is adapted from Matra, June 1990: 14.
23. The temporal sebelum 'before' clause, though present in the original example, has been left out in (104b) to make sure that the ungrammaticality of this sentence cannot be attributed to any potential interference from that clause.
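
The argument in note 16 can be made concrete with a small sketch. It assumes the standard textbook description of the prefix alternation in Indonesian, which is not spelled out in the note itself: the prefix is stored with a final velar nasal (written meng-), the nasal surfaces unchanged before vowels, and it assimilates to a following consonant, replacing initial p, t, k, and s. The function name `attach_meng` is hypothetical, and the sketch ignores monosyllabic bases, digraph-initial bases, and loanword exceptions.

```python
# A minimal sketch of the alternation discussed in note 16, assuming the
# standard description of Indonesian prefixation. The prefix is treated as
# underlying "meng-"; all other shapes are derived by assimilation.
# Simplifications (assumptions): monosyllabic bases, bases beginning with
# digraphs such as "kh" or "sy", and loanword exceptions are not handled.
VOWELS = set("aeiou")


def attach_meng(base: str) -> str:
    """Attach the active prefix to an orthographic Indonesian base."""
    if not base:
        raise ValueError("base must not be empty")
    first = base[0]
    if first in VOWELS or first in "gh":
        return "meng" + base        # velar nasal surfaces unchanged
    if first == "k":
        return "meng" + base[1:]    # k is replaced by the velar nasal
    if first in "bf":
        return "mem" + base         # nasal assimilates to a labial
    if first == "p":
        return "mem" + base[1:]     # p is replaced by m
    if first in "dcjz":
        return "men" + base         # nasal assimilates to a coronal
    if first == "t":
        return "men" + base[1:]     # t is replaced by n
    if first == "s":
        return "meny" + base[1:]    # s is replaced by the palatal nasal
    return "me" + base              # nasal is dropped before l, r, m, n, w, y


if __name__ == "__main__":
    for base in ("ambil", "beli", "pukul", "tulis", "kirim", "sapu", "lihat"):
        print(base, "->", attach_meng(base))
```

Only the first branch keeps the velar nasal without any conditioning consonant, which is the observation the note relies on: the prevocalic allomorph is the one shape that an assimilation rule cannot produce from some other nasal.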

Chapter 4: The cross-linguistic survey

24. My own fieldwork on colloquial Jakarta speech appears to contradict Ikranagara's assessment. Several Indonesian language consultants readily constructed sentences with instrumental or objective subjects such as the following:

Kunci yang besar bisa buka pintu belakang.
key that big can open door back
'The big key opens the back door.'

Pintu buka dan ada satu laki-laki ke-dalam.
door open and exist one male person to-inside
'The door opened and a man came inside.'

Further research is necessary to determine whether this discrepancy reflects a recent syntactic change, dialectal (or idiolectal) differences, or other factors.
25. Genetic classification following Ruhlen (1976).
26. Dutch and German are listed under the SOV heading in Table 9 since they are SVO only in main clauses that have no auxiliaries.
27. Although Hebrew is listed in Table 10 as having no overt case, it does possess an accusative particle et, cf. the discussion in section 4.4.5.


28. German word order has, of course, been the subject of much debate. While verb-initial structures are clearly derived, both verb-final and verb-second structures have been assumed to be basic in the literature. See Hawkins (1986) for an extensive discussion.
29. Ultan did find that languages of all word order types tend to have sentence-initial WH-words, but he does not distinguish whether this is the result of a pragmatic preference or of syntactic movement.
30. N. K. Kim (1987: 881) contends that the 19th century view that the Dravidian and "Altaic" languages may ultimately be genetically related is not taken seriously today. However, Masica (1976: 181-182) briefly entertains the possibility of such a relationship to account for the typological similarities he found between these language groups.
31. My thanks to the members of the 1987 fieldwork course on Malayalam at USC during which most of the data were collected, particularly to Prof. Bernard Comrie, who organized and led the course, and to Ms. Suchitra Sadanandan, our language consultant.
32. Accusative marking on inanimates was already optional in Proto-Dravidian, according to Steever (1987: 727).
33. The oddness of the non-agentive subject is also reflected in the informant's judgement that this sentence actually becomes somewhat less unacceptable if the direct object is marked with the accusative marker -ne. Recall that normally inanimate objects are not overtly marked for case.
34. The case markers may optionally be dropped. When this occurs, word order usually, though not always, reverts to the canonical SOV order (Hinds 1986: 188).
35. Some evidence for the non-syntactic nature of the afterthought phenomenon comes from the fact that no postverbal material may ever occur in embedded clauses, even in colloquial speech. Kuno (1978: 63-64) gives the following example to make this point:

*Kimi [Taroo ga  kekkonsita Hanako to]   koto sitte   iru.
 you         NOM married           with  that knowing are
'Do you know that Taro married Hanako?'

36. Example (143) is from Siewierska (1984: 58) and example (144) from Kuno (1978: 109).
37. As in a number of other languages, the verb meaning 'hit, strike', Turkish vurmak, does not govern the accusative but the dative case


38.

39.

40.

41.

42.

43.

44.

although the class of verbs with dative objects consists almost exclusively of verbs with low (semantic) transitivity. A similar case is known from Latvian. In rare instances, objects may appear postverbally when the subject is first or second person. In addition, two kinds of heavy shifts exist, one which right-dislocates heavy objects or subjects of embedded clauses and one which left-dislocates heavy subordinate clauses, derived nominals, and paratactic sequences (cf. Derbyshire 1985: 74-79). These observations on "Tough" sentences in Dutch were collected by the members of a seminar on linguistic typology at USC. My thanks to all participants of that seminar, particularly to professors Jack Hawkins and Bernard Comrie. In addition, prepositions agree with their object and possessed noun phrases with their possessor (Craig 1977: 105). Deviating from Craig's convention, we are using the more standard symbols /s/ for the voiceless palatal fricative and /ŋ/ for the voiced velar nasal. Sentence (194a) is from Craig (1977: 211), and sentence (194b) from Craig (1977: 213). See Craig (1977, Appendix B) for the morphophonemic rules that derive the surface form χ'il 'saw'. In the Jacaltec data, the glosses A1, A2, A3, and E1, E2, and E3 identify the first, second, and third person forms of the absolutive and the ergative agreement markers, respectively (cf. Craig 1977). Agreement is shown only where Craig's examples indicate it, due to the complex morphophonemics of the Jacaltec verb forms. Verbs like keb'ali 'ask', which have a quotation as absolutive, show agreement with the goal noun phrase. Reduplication marks plural or variety in nouns, as it does in Indonesian, and repetitive, continuous, or intensive action for verbs (Walker 1982: 22-25). All the instances of intransitive absolutives discussed by Walker are also agentive. Walker gives no indication as to how a sentence like The door opened would be expressed in Sawu. 
Head-marking morphology is problematic, though. On the one hand, it provides overt markers identifying the grammatical functions of the various grammatical relations and is, thus, comparable to case marking; yet on the other hand, these markers are not associated with the various noun phrase participants themselves but with the verb. A priori, it is therefore not possible to tell whether this head-marking kind of grammatical relation marking is sufficient to tie a language into the transparent range of the Semantic Typology. There is empirical evidence,

however, that it is not. As the section on Jacaltec has shown, Jacaltec is characterized by an abundance of low transparency properties, e.g., adposition stranding, raising, WH-extraction, a rigid clause-internal word order, etc., despite its obligatory head-marking coindexation system.
45. Li and Thompson (1981) have denied the relevance of the notion subject for Chinese, arguing that preverbal noun phrases are to be analyzed as topics. However, sentences like (208) show that the preverbal noun phrase position is available even in embedded contexts, which is highly unusual for topics, as the term is generally understood, but expected for subjects. Li and Thompson's use of the term topic therefore differs significantly from the way it is used elsewhere. In any case, further research is necessary to determine the status of the notions subject and topic in the grammar of Chinese. In the remainder of this section we will assume, based on evidence like sentence (208), that Chinese has a preverbal subject position.
46. The main verb of sentence (210), duàn 'break', is an adjective or an intransitive verb. Unlike in the English gloss, the postverbal constituent is, therefore, not a direct object but an adverbial.
47. Hebrew examples are given in the transcription used by Glinert (1989: 9). In accordance with this transcription, sentence-initial symbols and names are not capitalized.
48. Examples (223) and (224) are from Glinert (1989: 273).

References

Abasolo, Rafael
    1974 Basic semantic structures of Korean. [Unpublished Ph.D. dissertation, Georgetown University.]
Arnold, Doug - Martin Atkinson - Jacques Durand - Claire Grover - Louisa Sadler (eds.)
    1989 Essays on grammatical theory and universal grammar. Oxford: Clarendon.
Bach, Emmon - Robert T. Harms (eds.)
    1968 Universals in linguistic theory. New York: Holt, Rinehart and Winston.
Becker, Alton L. - Umar Wirasno
    1980 "On the nature of syntactic change in Bahasa Indonesia", in: Paz Buenaventura Naylor (ed.), 95-102.
Bell, Alan
    1978 "Language samples", in: Joseph H. Greenberg (ed.), 123-156.
Berman, Ruth A.
    1978 Modern Hebrew structure. Tel Aviv: University Publishing Projects.
Bossong, Georg
    1984 "Ergativity in Basque", Linguistics 22: 331-392.
    1985 Empirische Universalienforschung. Differentielle Objektmarkierung in den neuiranischen Sprachen. (Ars Linguistica 14.) Tübingen: Narr.
    1989 "Morphemic marking of topic and focus", Belgian Journal of Linguistics 4: 27-51.
Bowers, John S.
    1986 Grammatical relations. (Outstanding Dissertations in Linguistics.) New York: Garland.
Burzio, Luigi
    1986 Italian syntax: A government-binding approach. (Studies in Natural Language and Linguistic Theory 1.) Dordrecht: D. Reidel.
Cameron, Carrie
    1989 "The interaction of semantic and pragmatic voice: Data from Japanese." Paper presented at the LSA Winter Meeting, Washington, D.C.


Chafe, Wallace L.
    1970 Meaning and the structure of language. Chicago: Chicago University Press.
Chang, Sok-Chin
    1983 "Reference in Korean discourse", in: Korean National Commission for UNESCO (eds.), 219-263.
Chomsky, Noam
    1965 Aspects of the theory of syntax. Cambridge, Mass.: The MIT Press.
    1977 "On wh-movement", in: Peter W. Culicover - Thomas Wasow - Adrian Akmajian (eds.), 71-132.
    1981 Lectures on government and binding. (Studies in Generative Grammar 9.) Dordrecht: Foris.
    1982 Some concepts and consequences of the theory of government and binding. (Linguistic Inquiry Monographs 6.) Cambridge, Mass.: The MIT Press.
Chu, Chauncey Cheng-hsi
    1983 A reference grammar of Mandarin Chinese for English speakers. (American University Studies, Series VI, Foreign Language Instruction 2.) New York: Peter Lang.
Chung, Sandra
    1976a "An object-creating rule in Bahasa Indonesia", Linguistic Inquiry 7: 41-87.
    1976b "On the subject of two passives in Indonesian", in: Charles N. Li (ed.), 57-98.
Clancy, Patricia M. - Hyeonjin Lee - Myeong-Han Zoh
    1986 "Processing strategies in the acquisition of relative clauses: Universal principles and language-specific realizations", Cognition 24: 225-262.
Cole, Peter - Wayne Harbert - Shikaripur Sridhar - Sachiko Hashimoto - Cecil Nelson - Diane Smietana
    1977 "Noun phrase accessibility and island constraints", in: Peter Cole - Jerrold M. Sadock (eds.), 27-46.
Cole, Peter - Jerrold M. Sadock (eds.)
    1977 Grammatical relations. (Syntax and Semantics 8.) New York: Academic Press.
Comrie, Bernard
    1978 "Ergativity", in: Winfred P. Lehmann (ed.), 329-394.

    1986 "Contrastive linguistics and language typology", in: Dieter Kastovsky - Aleksander Szwedek (eds.), 1155-1163.
    1987 "The unity of syntactic contrasts in English and Russian." [Unpublished MS.]
    1989a Language universals and linguistic typology. (2nd edition.) Chicago: University of Chicago.
    1989b "On so-called raising in Russian." [Unpublished MS.]
Comrie, Bernard (ed.)
    1987b The world's major languages. London: Croom-Helm.
Comrie, Bernard - Stephen Matthews
    1990 "Prolegomena to a typology of tough movement", in: William Croft - Keith Denning - Suzanne Kemmer (eds.), 43-58.
Cook, Walter A.
    1989 Case grammar theory. Washington: Georgetown University Press.
Corum, Claudia T. - T. Cedric Smith-Stark - Ann Weiser (eds.)
    1973 Papers from the ninth regional meeting of the Chicago Linguistic Society. Chicago: University of Chicago.
Craig, Colette G.
    1977 The structure of Jacaltec. Austin: University of Texas Press.
Croft, William - Keith Denning - Suzanne Kemmer (eds.)
    1990 Studies in typology and diachrony: Papers presented to Joseph H. Greenberg on his 75th birthday. (Typological Studies in Language 20.) Amsterdam: John Benjamins.
Culicover, Peter W. - Thomas Wasow - Adrian Akmajian (eds.)
    1977 Formal syntax. New York: Academic Press.
Cumming, Susanna
    1986 "Variation and function in contemporary Indonesian", in: Scott DeLancey - Russell S. Tomlin (eds.), 71-88.
DeLancey, Scott - Russell S. Tomlin (eds.)
    1986 Proceedings of the second annual meeting of the Pacific Linguistics Conference. Eugene: University of Oregon, Department of Linguistics.
Derbyshire, Desmond C.
    1979 Hixkaryana. (Lingua Descriptive Series 1.) Amsterdam: North Holland.

    1985 Hixkaryana and linguistic typology. (Summer Institute of Linguistics Publications in Linguistics 76.) Arlington: Summer Institute of Linguistics and The University of Texas at Arlington.
Dixon, R. M. W.
    1978 "Ergativity", Language 55: 59-138.
    1989 "Subject and object in universal grammar", in: Doug Arnold - Martin Atkinson - Jacques Durand - Claire Grover - Louisa Sadler (eds.), 91-118.
Donaldson, Bruce C.
    1981 Dutch reference grammar. 's-Gravenhage: Martinus Nijhoff.
Dreyfuss, J. V.
    1978 "meN-, di-, and ber-: Three analyses", NUSA: Linguistic Studies in Indonesian and Languages in Indonesia 6: 1-6.
Dryer, Matthew S.
    1989 "Large linguistic areas and language sampling", Studies in Language 13: 257-292.
Durie, Mark
    1985 A grammar of Acehnese on the basis of a dialect of North Aceh. (Verhandelingen van het Koninklijk Instituut voor Taal-, Land-, en Volkenkunde 112.) Dordrecht: Foris.
Ebert, Robert P.
    1975 "Subject raising, the clause squish, and German scheinen constructions", in: Robin E. Grossman - L. James San - Timothy J. Vance (eds.), 177-187.
    1978 Historische Syntax des Deutschen. (Sammlung Metzler, Abt. C, Sprachwissenschaft M167.) Stuttgart: J. B. Metzlersche Verlagsbuchhandlung.
Echols, John M. - Hassan Shadily
    1975 An English-Indonesian dictionary. Ithaca: Cornell University Press.
Echols, John M. - Hassan Shadily
    1989 Kamus Indonesia-Inggris. [An Indonesian - English dictionary.] (3rd edition revised and edited by John U. Wolff - James T. Collins in cooperation with Hassan Shadily.) Jakarta: Gramedia.


Erguvanli, Eser E.
    1984 The function of word order in Turkish grammar. (University of California Publications in Linguistics 106.) Berkeley: University of California Press.
Farmer, Ann K.
    1984 Modularity in syntax: A study of Japanese and English. (Current Studies in Linguistics 9.) Cambridge, Mass.: The MIT Press.
Fillmore, Charles J.
    1968 "The case for case", in: Emmon Bach - Robert T. Harms (eds.), 1-88.
    1977 "The case for case reopened", in: Peter Cole - Jerrold M. Sadock (eds.), 59-81.
Foley, William A.
    1976 "Inherent referentiality and language typology." [Unpublished MS.]
Frazier, Lyn - Keith Rayner
    1988 "Parameterizing the language processing system: Left- vs. right-branching within and across languages", in: John A. Hawkins (ed.), 247-279.
Gilligan, Gary M.
    1987 A cross-linguistic approach to the pro-drop parameter. [Unpublished Ph.D. dissertation, University of Southern California.]
Givón, Talmy
    1984-1990 Syntax. A functional-typological introduction. 2 vols. Amsterdam: John Benjamins.
Glinert, Lewis
    1989 The grammar of Modern Hebrew. Cambridge: Cambridge University Press.
Greenberg, Joseph H.
    1963 "Some universals of grammar with particular reference to the order of meaningful elements", in: Joseph H. Greenberg (ed.), 73-113.
Greenberg, Joseph H. (ed.)
    1963 Universals of language. Cambridge, Mass.: The MIT Press.
    1978a Universals of human language. Vol. 1: Method and theory. Stanford: Stanford University Press.

    1978b Universals of human language. Vol. 4: Syntax. Stanford: Stanford University Press.
Grimes, Barbara F. (ed.)
    1987 Ethnologue. Languages of the world. (11th edition.) Dallas: Summer Institute of Linguistics.
Grossman, Robin E. - L. James San - Timothy J. Vance (eds.)
    1975 Papers from the eleventh regional meeting of the Chicago Linguistic Society. Chicago: University of Chicago.
Gundel, Jeanette K. - Kathleen Houlihan - Gerald Sanders
    1988 "On the function of marked and unmarked terms", in: Michael T. Hammond - Edith A. Moravcsik - Jessica R. Wirth (eds.), 285-301.
Hale, Kenneth
    1982 "Preliminary remarks on configurationality", in: James Pustejovsky - Peter Sells (eds.), 1-88.
Hall, Kira - Jean-Pierre Koenig - Michael Meacham - Sondra Reinman - Laurel A. Sutton (eds.)
    1990 Proceedings of the 16th annual meeting of the Berkeley Linguistics Society. Berkeley: Berkeley Linguistics Society.
Hammond, Michael T. - Edith A. Moravcsik - Jessica R. Wirth (eds.)
    1988 Studies in syntactic typology. (Typological Studies in Language 17.) Amsterdam: John Benjamins.
Harbert, Wayne
    1977 "Clause union and German accusative plus infinitive constructions", in: Peter Cole - Jerrold M. Sadock (eds.), 121-149.
Hawkins, John A.
    1983 Word order universals. (Quantitative Analyses of Linguistic Structure 3.) New York: Academic Press.
    1986 A comparative typology of English and German. London: Croom-Helm and Austin: The University of Texas Press.
    1990 "A parsing theory of word order universals", Linguistic Inquiry 21: 223-261.
    To appear A performance theory of order and constituency. Cambridge: Cambridge University Press.
Hawkins, John A. (ed.)
    1988 Explaining language universals. Oxford: Basil Blackwell.
Hetzron, Robert
    1987 "Hebrew", in: Bernard Comrie (ed.), 686-704.

Hinds, John
    1986 Japanese. (Croom-Helm Descriptive Grammars.) London: Croom-Helm.
Hopper, Paul N.
    1977 "Some observations on the typology of focus and aspect in narrative language", NUSA: Linguistic Studies in Indonesian and Languages in Indonesia 4, 14-25.
Huang, Chen-Teh James
    1982 Logical relations in Chinese and the theory of grammar. [Unpublished Ph.D. dissertation, MIT.]
Hwang, Shin Ja Joo
    1975 Korean clause structure. (Summer Institute of Linguistics Publications in Linguistics 50.) Norman: Summer Institute of Linguistics.
Ikranagara, Kay
    1980 Melayu Betawi grammar. (NUSA: Linguistic Studies in Indonesian and Languages in Indonesia 9.) Jakarta: Universitas Atma Jaya.
Kastovsky, Dieter - Aleksander Szwedek (eds.)
    1986 Linguistics across historical and geographical boundaries: In honour of Jacek Fisiak on the occasion of his fiftieth birthday. Vol. 2: Descriptive, contrastive and applied linguistics. (Trends in Linguistics. Studies and Monographs 32.) Berlin: Mouton de Gruyter.
Keenan, Edward L.
    1976 "Towards a universal definition of subject", in: Charles N. Li (ed.), 303-333.
    1984 "Semantic correlates of the ergative/absolutive distinction", Linguistics 22: 197-223.
Kefer, Michel - Johan van der Auwera
    1992 Meaning and grammar: Cross-linguistic perspectives. (Empirical Approaches to Language Typology 10.) Berlin: Mouton de Gruyter.
Kim, Alan Hyun-Oak
    1985 The grammar of focus in Korean syntax and its typological implications. [Unpublished Ph.D. dissertation, University of Southern California.]

Kim, Jeongdal
    1989 "On the proper treatment of complementizers in Korean." [Unpublished MS.]
Kim, Nam-Kil
    1984 The grammar of Korean complementation. (Occasional Papers of the Center for Korean Studies 11.) Honolulu: Center for Korean Studies, University of Hawaii.
    1987 "Korean", in: Bernard Comrie (ed.), 881-898.
Kirkwood, Henry W.
    1969 "Aspects of word order and its communicative function in English and German", Journal of Linguistics 5: 85-107.
Korean National Commission for UNESCO (eds.)
    1983 The Korean language. Pace International Research, Inc.
Kornfilt, Jaklin
    1977 "A note on subject raising in Turkish", Linguistic Inquiry 8: 736-742.
    1987 "Turkish", in: Bernard Comrie (ed.), 619-644.
Kuno, Susumo
    1976 "Subject raising", in: Masayoshi Shibatani (ed.), 17-49.
    1978 "Japanese: A characteristic OV language", in: Winfred P. Lehmann (ed.), 57-138.
Lee, Hansol H. B.
    1989 Korean grammar. Oxford: Oxford University Press.
Lehmann, Winfred P. (ed.)
    1978 Syntactic typology: Studies in the phenomenology of language. Austin: University of Texas Press.
Lenerz, Jürgen
    1987 "Review of Hawkins 1986", Studies in Language 11: 494-501.
Lewis, Geoffrey L.
    1967 Turkish grammar. Oxford: Clarendon Press.
Li, Charles N. (ed.)
    1976 Subject and topic. New York: Academic Press.
Li, Charles N. - Sandra A. Thompson
    1974 "An explanation of word order change SVO --> SOV", Foundations of Language 12, 201-214.
    1981 Mandarin Chinese: A functional reference grammar. Berkeley: University of California Press.

Li, Patricia
    1990 "An investigation of word order change in Chinese", University of Hawaii Working Papers in Linguistics 20, 2: 1-15.
Lieb, Hans-Heinrich (ed.)
    1980 Oberflächensyntax und Semantik: Symposium anlässlich der ersten Jahrestagung der Deutschen Gesellschaft für Sprachwissenschaft. (Linguistische Arbeiten 93.) Tübingen: Niemeyer.
Macdonald, R. Ross
    1976 Indonesian reference grammar. Washington: Georgetown University Press.
Mallinson, Graham - Barry Blake
    1981 Language typology: Cross-linguistic studies in syntax. (North-Holland Linguistic Series 46.) Amsterdam: North Holland Publishing Co.
Masica, Colin P.
    1976 Defining a linguistic area: South Asia. Chicago: University of Chicago Press.
McCune, Keith
    1979 "Passive function and the Indonesian passive", Oceanic Linguistics 18: 119-169.
McKay, Terence
    1987 "Review of Hawkins 1986", Linguistics 25: 626-629.
Mithun, Marianne
    1991 "Active/agentive case marking and its motivations", Language 67: 510-546.
Mohanan, Karuvannur P.
    1982 Lexical phonology. Bloomington: Indiana University Linguistics Club.
Müller-Gotama, Franz
    1987 "Indonesian and the accessibility hierarchy: Filling the gap." [Unpublished MS.]
    1991 A typology of the syntax-semantics interface. [Unpublished Ph.D. dissertation, University of Southern California.]
    1992 "Towards a typology of the syntax-semantics interface", in: Michel Kefer - Johan van der Auwera (eds.), 137-178.
Muysken, Pieter - Henk van Riemsdijk
    1986 Features and projections. (Studies in Generative Grammar 25.) Dordrecht: Foris.


Naylor, Paz Buenaventura
    1980 Austronesian studies. Papers from the second Eastern Conference on Austronesian Languages. (Michigan Papers on South and Southeast Asia 15.) Ann Arbor: University of Michigan, Center for South and Southeast Asian Studies.
Nichols, Johanna
    1986 "Head-marking and dependent-marking grammar", Language 61: 56-119.
Plank, Frans
    1980 "Verbs and objects in semantic agreement: Minor differences between languages that might suggest a major one." [Unpublished MS.]
    1981 "Object cases in Old English: What do they encode? A contribution to the general theory of case and grammatical relations." [Unpublished MS.]
Prentice, D. J.
    1987 "Malay (Indonesian and Malaysian)", in: Bernard Comrie (ed.), 913-935.
Pustejovsky, James - Peter Sells (eds.)
    1982 Proceedings of NELS 12. Amherst, Mass.: GLSA, University of Massachusetts, Amherst.
Reis, Marga
    1973 "Is there a rule of subject-to-object raising in German?", in: Claudia T. Corum - T. Cedric Smith-Stark - Ann Weiser (eds.), 519-529.
Riemsdijk, Henk van - Edwin Williams
    1986 Introduction to the theory of grammar. (Current Studies in Linguistics 12.) Cambridge, MA: MIT Press.
Rohdenburg, Günter
    1974 Sekundäre Subjektivierung im Englischen und im Deutschen: Vergleichende Untersuchungen zur Verb- und Adjektivsyntax. (Projekt für angewandte kontrastive Sprachwissenschaft, PAKS Arbeitsbericht 8.) Bielefeld: Cornelsen-Velhagen und Klasing.
Ruhlen, Merritt
    1976 A guide to the languages of the world. Stanford: Stanford University.


Sapir, Edward 1921 Language: An introduction to the study of speech. New York: Harcourt Brace & Co. [1949] [Reprinted New York: Harvest Book, Harcourt Brace & World.]
Schachter, Paul 1985 "Parts-of-speech systems", in: Timothy Shopen (ed.), 1: 3-61.
Schaub, Willi 1985 Babungo. (Croom-Helm Descriptive Grammars.) London: Croom-Helm.
Seiler, Hansjakob - Christian Lehmann (eds.) 1982 Apprehension: Das sprachliche Erfassen von Gegenständen. (Language Universals Series 1.) Tübingen: Narr.
Shi, Dingxu 1990 "Is there object-to-subject raising in Chinese?" in: Kira Hall - Jean-Pierre Koenig - Michael Meacham - Sondra Reinman - Laurel A. Sutton (eds.), 305-314.
Shibatani, Masayoshi 1987 "Japanese", in: Bernard Comrie (ed.), 855-880.
Shibatani, Masayoshi (ed.) 1976 Japanese generative grammar. (Syntax and semantics 5.) New York: Academic Press.
  1988 Passive and voice. (Typological Studies in Linguistics 16.) Amsterdam: John Benjamins Publishing Company.
Shopen, Timothy 1985 Language typology and syntactic description. 3 vols. Cambridge: Cambridge University Press.
Siewierska, Anna 1984 The passive: A comparative linguistic analysis. London: Croom-Helm.
Slobin, Dan I. - Karl Zimmer 1986 Studies in Turkish linguistics. (Typological Studies in Linguistics 8.) Amsterdam: John Benjamins.
Song, Seok Choong 1988 Explorations in Korean syntax and semantics. (Korea Research Monograph 14.) Berkeley: Center for Korean Studies, Institute of East Asian Studies, University of California, Berkeley.


Steever, Sanford B. 1987 "Tamil and the Dravidian languages", in: Bernard Comrie (ed.), 725-746.
Suarez, Jorge A. 1983 The Mesoamerican Indian languages. (Cambridge Language Surveys.) Cambridge: Cambridge University Press.
Sweet, Waldo E. - Ruth S. Craig - Gerda M. Seligson 1966 Latin. A structural approach. (2nd edition.) Ann Arbor: University of Michigan Press.
Tomlin, Russell 1986 Basic constituent orders: Functional principles. (Croom-Helm Linguistics Series.) London: Croom-Helm.
Ueda, M. 1984 "Notes on parsing Japanese." [Unpublished MS.]
Ultan, Russell 1978 "Some general characteristics of interrogative systems", in: Joseph H. Greenberg (ed.), 211-248.
Underhill, Robert 1976 Turkish grammar. Cambridge: MIT Press.
Verhaar, John W. M. 1978 "Some notes on the verbal passive in Indonesian", NUSA: Linguistic Studies in Indonesian and Languages in Indonesia 6: 11-19.
Walker, Alan T. 1982 A grammar of Sawu. (NUSA: Linguistic Studies in Indonesian and Languages in Indonesia 13.) Jakarta: Universitas Atma Jaya.
Wang, Li 1956 Zhong-guo yu-fa li-luen [A theory of Chinese grammar]. Beijing: Zhong-Hua Publishing.
Wojowasito, S. - W. J. S. Poerwadarminta - S. A. M. Gaastra n.d. Kamus Bahasa Indonesia - Inggeris [An Indonesian English dictionary]. Jakarta: Penerbit Cypress.
Wouk, Fay 1986 "Verbal morphology and the discourse structure of Indonesian." [Unpublished MS.]


Yang, In-Seok 1972 Korean syntax: Case markers, delimiters, complementation, and relativization. [Unpublished Ph.D. dissertation, University of Hawaii.] Zubizarreta, Maria Luisa 1987 Levels of representation in the lexicon and in the syntax. (Studies in Generative Grammar 31.) Dordrecht: Foris.

Language Index

Afro-Asiatic 82
Altaic 82, 151
Austronesian 82, 127
Babungo 82, 83, 130-131, 142-144
Betawi Malay 80-81
Carib 108
Chinese 82, 83, 131-136, 141-144, 153
Dravidian 82, 87, 88, 89, 151
Dutch 80, 82, 83, 113-121, 131, 141-145, 150, 152
English 2-11, 12, 13, 14, 16-17, 18-19, 22, 23, 26, 28-29, 33-42, 46, 47, 52, 54, 57, 58, 62-64, 66, 67, 69-70, 76, 77, 80, 82, 83, 84, 85, 89, 92, 95, 99, 101, 105, 111, 114, 115, 116, 117, 120, 121, 124, 126, 131, 141-145, 149, 153
French 149
Ge-Pano-Carib 82, 108
German 2-11, 12, 13-14, 16-17, 18-19, 22, 26, 29, 31, 34-35, 37-40, 58, 62-64, 66, 76, 77, 82, 83, 84, 109, 114, 115-117, 120, 121, 140, 141-143, 145, 146, 149, 150, 151
Germanic 8, 14, 116, 121
Hebrew 82, 83, 136-140, 142-143, 150, 153
Hixkaryana 82, 83, 108-113, 121, 122, 131, 141-143, 152
Indo-European 14, 57, 82
Indonesian 23, 29, 55-77, 80-81, 82, 83, 84, 89, 125-126, 127, 136, 141-145, 149-150, 152
Jacaltec 25, 82, 83, 121-126, 127, 130, 136, 141-144, 152, 153
Japanese 26, 33, 53-54, 82, 83, 86-87, 97-105, 109, 130, 131, 141-143, 151
Korean 25, 29, 31-54, 58, 77, 82, 83, 84, 86-87, 89, 97-98, 100, 101, 105, 109, 124, 142-143, 146-149
Latvian 152
Malagasy 76
Malayalam 82, 83, 86, 87-97, 105, 110, 142-143, 151
Niger-Kordofanian 82
Penutian 82
Russian 12-14, 16-17, 29, 33, 50, 58, 77, 82, 83, 84, 85, 103, 126, 131, 141, 144, 146
Sanskrit 89
Sawu 25, 82, 83, 127-130, 131, 142-144, 152
Scandinavian languages 145
Semitic 136, 137, 142
Serbo-Croatian 50
Sino-Tibetan 82
Tamil 89
Turkish 82, 83, 105-107, 142-143, 151-152

Subject Index

ablative case 31, 106
absolutive case 127-129
accessibility hierarchy 75
accusative case 3-4, 23, 26, 31, 37, 42, 45, 46, 47, 50, 63, 88-89, 94, 106, 107, 116, 128, 142, 150, 151
active case marking 25
active voice 12, 25, 33, 40, 56-57, 58, 61, 67, 81, 101, 113, 149
adjectival 13
adjective 113, 114, 120, 139, 145, 153
adposition stranding 6, 25, 27-29, 45, 94, 102, 105, 107, 110, 111, 113, 119, 123, 129, 130, 138, 140, 142, 153
adposition type 25, 114
adverb 53, 68, 103, 115, 120, 145, 149
adverbial 5, 46, 74, 75, 77, 110, 153
affixation 31, 32, 33, 44, 48, 52, 57, 60-61, 67, 76, 80, 88, 91, 110, 113, 121, 122, 127, 129, 137, 147, 148, 149; see also morphology
afterthought phenomena 99, 105, 106, 151
agent 1, 16, 21-23, 24, 30, 32, 33-40, 41, 45, 56-58, 64, 65-67, 76, 81, 89, 91, 92, 99, 101, 112, 125, 127, 133, 137-138, 143, 152
agreement 11, 32, 48-49, 60, 104, 107, 108, 113, 121, 124, 125, 127, 137, 139, 147-148, 152
ambiguity 7, 35, 45, 52-54, 88-89, 121-122, 128-129, 145
anaphora 1, 110
animacy 22, 45, 88-89, 92, 99-100, 124-125, 128, 151
animacy hierarchy 128
areal contact 14, 29, 78-81, 82, 84, 93, 97
argument structure 3-4, 6, 7, 11, 13, 24-25, 26, 27, 33, 53, 59, 60-61, 66, 70-71, 85-86, 95-97, 101, 125, 126, 141, 145
argument trespassing 13, 26, 27, 54, 70, 93-97, 102, 103, 105, 110, 121-124, 136, 141; see also extraction
article 128
aspect 9-10, 47, 108, 121, 122, 130, 131
auxiliary 58, 111, 114, 115, 128, 150
benefactive 61, 63, 127, 130, 143
binarity 11
branching directionality 21, 29, 84-86, 114
Case Grammar 21-23, 24, 37-40
case transparency principle 126, 131, 141
causative 22, 41, 42, 61, 66, 92, 110, 127
classifier 80, 101, 128
clause boundaries (identification of) 26, 43-44, 51-54, 107, 136
clefts 56, 121-122
clitics 31, 60, 149
Cologne Universals Project (AKUP) 11, 14, 16
complementation 5, 6, 47, 72, 75, 77, 103
complementizer 26, 43-44, 47, 52, 68, 72, 149
configurationality 11-12
conjunction reduction 6

constituency 5, 6, 11, 25, 41, 43-46, 48-49, 50, 53-54, 84-85, 88, 95, 96-97, 104, 105, 106, 108, 109-110, 112, 114, 115, 123, 128-129, 135, 153
constraint 2, 5, 6, 12, 13, 43, 44, 48, 50, 53, 56, 58, 70-73, 75-76, 85, 96, 100, 124, 128, 136
copular structure 13, 80, 103, 111, 128, 147
coordination 96-97, 112-113, 147
dative case 3, 23, 31, 34, 35, 40, 42, 45, 46, 63, 92-93, 106, 151-152
dative shift 61, 63
dative subject 34-35, 81, 92-93
deep structure 1, 2, 4, 13, 24-25, 33, 34, 42, 49, 50, 56, 72, 111, 119-120, 123, 133, 149
definiteness 56, 60, 80, 128, 137
declarative 93, 147
deletion 3, 6, 7, 18, 124, 145
demonstrative 128
determiner 85, 114, 115, 128
diachronic perspective 8, 13-14, 16-17, 29, 40-41, 115-116, 121, 131-132, 134, 135, 136, 137
direct object 3-4, 26, 45, 60, 62-64, 88, 96, 98, 102-103, 116, 125, 133, 137-138, 142, 151, 153
directional 4
disambiguating devices 88, 122, 129
discontinuous constituency 11, 112
drift 3, 8, 16-17, 115, 117, 121
equi-NP deletion 6
equal distance principle 18-20
ergative case 24, 127, 128, 152
ergative verb 34, 37, 66
ergative-absolutive language 24-25, 83, 121, 124, 127
experiencer 1, 34-35, 36, 40, 65, 92-93, 99, 127
expletive 47, 116
extraction 1, 5-7, 9, 18-19, 26-29, 45-47, 58, 67, 70-77, 86, 94-97, 102, 105, 106, 109-110, 113, 118, 122, 123, 134, 138, 139, 140, 141, 142, 150, 153
extraction hierarchy 5-6, 73, 75, 76-77, 126
Fillmore hierarchy 21, 34, 81
finiteness 5-6, 73, 74-75, 77, 98, 122, 131
focus 16, 80, 101, 105, 106, 110, 130
garden path effect 136
gender 115, 139
generative grammar 1, 2, 11-12, 128
genitive case 3-4, 28, 31, 63, 80, 128
goal 127, 152
government 6, 106
grammatical relation manipulation hierarchy 25-28, 50, 86, 101, 102, 107, 112, 118, 122, 124, 126, 135, 138, 140, 141-142
grammaticization 1-2, 14-19, 84-86, 122, 124, 125-126, 131, 141-144, 146
head-marking 60-61, 83, 108, 113, 121, 124, 152-153
heaviness 112, 152
honorific 32, 48-49, 98, 104, 147-148
impersonal construction 106, 116, 118, 138
imperative 128, 147
implicational universals 19-20, 76, 131
inanimate 22, 88-89, 92, 99-100, 124, 151
inchoative 100
indirect object 3-4, 46, 98, 108, 116, 125, 130, 133


infinitive 5-6, 13, 73-75, 77
instrumental 22-23, 31, 36-37, 65, 81, 127, 137, 150
interrogative 45, 46, 56, 57, 76, 86, 93, 106, 109, 110, 122, 123, 132, 134, 141, 142, 147
intransitive 24, 33-34, 37, 60-61, 66-67, 81, 90-91, 99-101, 110, 112, 125, 127, 128, 130, 137, 152, 153
isolating language 132
iterative 61
left-branching 2, 21, 52-54, 84-86, 86-121, 141-144
lexicon 5, 10, 11, 61, 116, 145
locative 13, 15, 21, 37-40, 50, 60, 61, 63-64, 65, 127, 130, 138
logical form (LF) 1, 134
modal 149, 150
modularity 2, 20, 126, 141
mood 9-10, 98, 108, 147
morphology 3, 7, 9-11, 12-13, 18, 26, 31, 40, 47, 60, 64, 83, 86-87, 96, 97-98, 105, 108, 116, 121, 129-130, 131, 132, 136-137, 142, 145, 152
negation 98, 111, 132
nominative case 3, 24, 26, 31, 32, 34, 37, 45, 53, 99, 107, 128
nominative-accusative language 24-25, 83, 121, 128
non-agent 4, 33-34, 41, 67, 89, 117, 137, 138, 140, 151
non-finite 122, 131
NP-movement 11
number 7, 32, 127
numeral 128, 129
object-to-subject raising 5, 12-13, 27, 49-51, 68-70, 104, 119-120, 123, 134-135, 139, 148
oblique 15-16, 23, 26, 58, 59-60, 76, 85, 89, 100, 137
parasitic gap 26
paratactic sequences 112-113, 152
particle 31, 57, 83, 137, 150
passive 1, 4, 9, 112, 21, 25-26, 28, 32-34, 37, 40, 42, 50, 56-58, 60, 62-64, 69, 70-75, 76, 86, 89, 101-102, 105, 106, 110-111, 113, 116, 118, 124-125, 129, 130, 132-135, 138, 140, 142-143, 149
patient 15, 24, 32, 33-34, 37, 62, 81, 90-91, 99, 101, 127, 143, 150
personification 22-23, 36-37, 100
pied piping 6, 7, 27-28, 119, 139
possessive 3, 60, 70, 80, 108, 152
postposition 27, 31, 43, 44, 85, 87, 94, 97, 102, 108, 113, 114, 130
postpositional phrase 45, 108
pragmatic functions 4, 12, 15-16, 27-28, 30, 40-41, 105-106
predicate 5, 26, 27, 28, 34-35, 53-54, 68, 69-70, 72-75, 77, 95-96, 115, 120, 139, 145, 148
preposition 4, 6, 23, 27, 64, 84, 85, 114, 119, 121, 123, 127, 128, 129, 133, 137, 138, 152
prepositional phrase 6, 25, 60, 63, 64, 84, 114, 115, 125
principle of relative positioning 17-20
pro-drop 53, 92, 137
processing 2, 29, 45, 52-54, 84-86, 95, 115, 126, 128-129, 135-136, 141-144
progressive 9-10
pronoun 6, 27, 28, 83, 107, 130, 132, 142, 149
quantifier 1, 101, 128-129
quasi-infinitival 73, 75, 77
question word 26, 45, 57, 80, 86, 94, 106, 109, 134


raising 1, 3, 5, 7, 9, 13, 15, 18, 21, 25, 27, 28-29, 47-51, 67, 68-70, 75, 86, 93, 95, 97, 103-104, 107, 111, 119, 123, 129, 135, 138, 139-140, 141-142, 148, 149, 153
recipient 15, 45-46, 116, 127
reciprocal 127
referent 127-128
referentiality hierarchy 128
reflexive 110, 135
Relational Grammar 24
relative clause 53, 75, 76, 80, 122, 124, 128, 132, 136
rheme 16
right-branching 2, 29, 53, 58, 84-86, 121-140, 141-144
sampling 78-84
scrambling 25, 27-28, 29, 31, 41-46, 48-49, 50, 53, 59-60, 88, 93, 95, 98, 103-104, 105-106, 109
selectional restrictions 6-7, 13
serial verb construction 95-97
short-term memory 26
source 127
stative 4, 61, 99
structuralism 8
subcategorization 5, 76, 85, 96
subjects only constraint 55, 58, 70-75
subject-to-object raising 5, 12-13, 68-69, 70, 104-105, 107
subject-to-subject raising 5, 12-13, 27, 47-49, 68, 69, 70, 75, 104, 111, 119, 123, 139
synthetic language 18
temporal 31, 57, 60, 66, 150
temporary ambiguity 52-54
tense 47, 96, 98, 108, 121, 130
theme 65
topic 15-16, 30, 31, 34, 37, 40-42, 53, 57-58, 59, 95, 97, 103, 105, 115, 132, 135, 153
topicalization 12-13, 15, 16, 42, 58-59, 71, 75-76, 95, 101, 104, 110, 116, 130, 132, 138
"tough" sentences 5, 12-13, 68-70, 75, 76, 104, 120, 152
trace 26
transitivity 24, 33-34, 37, 60-61, 64, 66-67, 89, 90-91, 92, 99-101, 102, 106, 110, 112, 113, 115, 118, 121-122, 124, 127, 130, 132, 137-138, 140, 142, 152
universals 1-2, 7-8, 10, 11-12, 14, 19-20, 24, 34, 56, 79, 86, 112, 131
voice 32, 56-58, 80, 98, 121, 130; see also passive
WH-movement 3-6, 25-28, 46, 67, 70-77, 86, 93-94, 95, 97, 102, 105, 106, 109, 118-119, 123, 129, 130, 138-139, 141-142, 151; see also extraction
word order typology 83, 135
X-bar theory 128