Explanations for Language Universals

Edited by
BRIAN BUTTERWORTH
BERNARD COMRIE
ÖSTEN DAHL

MOUTON PUBLISHERS
BERLIN · NEW YORK · AMSTERDAM
The contents of this book have been published simultaneously as volume 21-1 of Linguistics.
Library of Congress Cataloging in Publication Data
Main entry under title:
Explanations for language universals.
1. Universals (Linguistics)—Addresses, essays, lectures. I. Butterworth, Brian. II. Comrie, Bernard, 1947- . III. Dahl, Östen.
P204.E97 1984 415 84-1979
ISBN 3-11-009797-4
© Copyright 1984 by Walter de Gruyter & Co., Berlin. All rights reserved, including those of translation into foreign languages. No part of this book may be reproduced in any form — by photoprint, microfilm, or any other means — nor transmitted nor translated into a machine language without written permission from the publisher. Printing: Krips Repro, Meppel. — Binding: Lüderitz & Bauer Buchgewerbe GmbH. Printed in the Netherlands
Foreword
This collection is the second in an occasional series of special issues of Linguistics. Each special issue will be devoted to a single theme and edited by a guest editor or editors who will be able to invite contributions and/or select papers submitted in response to an announcement in the journal. Special issues will be published simultaneously as an issue of Linguistics (in this case number 1 of volume 21) and as a book available to nonsubscribers through bookshops or directly from the publishers in the usual way.

BOARD OF EDITORS
Contents
FOREWORD

BRIAN BUTTERWORTH, BERNARD COMRIE, AND ÖSTEN DAHL
Introduction

JANET DEAN FODOR
Constraints on gaps: is the parser a significant influence?

MARK STEEDMAN
On the generality of the nested-dependency constraint and the reason for an exception in Dutch

LARRY M. HYMAN
Form and substance in language universals

BERNARD COMRIE
Form and function in explaining language universals

ÖSTEN DAHL
Temporal distance: remoteness distinctions in tense-aspect systems

ÅKE VIBERG
The verbs of perception: a typological study

ALISON GOPNIK
Conceptual and semantic change in scientists and children: why there are no semantic universals

BJÖRN LINDBLOM, PETER MACNEILAGE, and MICHAEL STUDDERT-KENNEDY
Self-organizing processes and the explanation of phonological universals

PETER HOWELL and STUART ROSEN
Natural auditory sensitivities as universal determiners of phonemic contrasts

R. COLLIER
Some physiological and perceptual constraints on tonal systems

GEERT E. BOOIJ
Principles and parameters in prosodic phonology

NAME INDEX

SUBJECT INDEX

LANGUAGE AND LANGUAGE-GROUP INDEX
Introduction
This volume is the result of a workshop aimed at tackling one of the most central, most difficult, most controversial — and one of the most exciting — issues in modern linguistics: what do languages have in common, and why? It is, of course, impossible within the compass of one book to treat all the myriad facets of this problem, but we have tried to cover important and representative issues in phonetics, phonology, lexicology, syntax, and semantics to answer the question 'what?'.

To answer the 'why?' question, many proposals are offered. They divide, however, into two broad categories: 'internal' explanations and 'external' explanations. Internal explanations say, roughly, that there is something intrinsic to all language systems, not necessarily shared by other human abilities and institutions. Commonalities among languages are not reflections of commonalities among these other sorts of abilities or institutions. External explanations claim that linguistic commonalities do indeed reflect basic properties of human beings and their commonalities — for example, properties of the auditory system, of processing demands, of communicative needs.

Naturally, one would be foolish to expect simple answers to the question 'why?'. External constraints may set limits within which the internal dynamics of the evolution of a language have to work. And it is not always clear whether some widely distributed property should count as internal or external. Moreover, as Fodor points out, the shape of a language may be the result of a complicated trade-off among external needs — to communicate, to acquire the language, to process efficiently — and internal, formal demands.

One influential approach, namely that of Chomsky and his followers, comes down firmly for internal explanations of language commonalities. These stem from the postulation of a separate language faculty in the brain, common, of course, to speakers of all languages. Thus the properties of language may seem arbitrary with respect to communication, processing, and acquisitional needs, and hence commonalities cannot be accounted for in terms of these external factors. This has interesting implications for a program of research into universals: careful study of a single language and its growth in the mind of a child should be sufficient to uncover these universal properties given by the language faculty.

It would be fair to say that most contributors to this volume are engaged in a different research program, one that involves the systematic comparison of the properties of many languages. This has one very important advantage. Even if Chomsky is right about the language faculty — and no one has yet discovered its physiological instantiation — there may still be important universals dependent on external factors as well as those dependent on the internal properties of the faculty. Only a comparative program can identify both sorts and distinguish one from the other (see Comrie's paper).
Syntax

Analysis of some components of language shows obvious sources where one can try to search for external universals. For instance, in phonology one can look to the structure of the human articulatory and perceptual mechanisms. In semantics, one can look to independently verifiable properties of human conceptualization. In pragmatics, one can look to independently verifiable characteristics of discourse structure and personal interaction. At first sight, syntax might not seem amenable to such external explanations, given that it is the formal component par excellence of language; indeed, one might even wonder whether there could be external motivation for the very existence of a syntactic component. Five papers in the present collection approach the problem of explanation in syntax from three different viewpoints.

In one sense, the contributions by Fodor and Steedman can be seen as advocating internal explanations for language universals, since both propose universal restrictions on language that are formal restrictions on the syntax, namely syntaxes with a much weaker generative capacity than that of, for instance, transformational-generative grammar. However, this restriction on the formal nature of the syntactic component has the effect of considerably simplifying the problem of parsing sentences — indeed, Fodor argues that, with the syntax restricted in this way, there may be few or no SPECIFIC constraints that are required by the parser. The parser rather requests that there should be some (i.e. any) constraints, in order to facilitate its task, a request that has to be negotiated with the expressor, which would prefer that the maximum of expressive freedom be allowed. Both papers are concerned largely with the problem of unbounded dependencies, i.e. relations between the indexed elements in such examples as what_i did you say that I saw __i?, or German weil ich_i Hans_j die Nilpferde füttern_j sah_i, 'because I_i saw_i Hans_j feed_j the hippos'. Multiple dependencies (as in the second example) are characteristically nested rather than crossing, and Steedman both provides a model that will account for this property as the most natural case and also allows exceptions like the Dutch equivalent, omdat ik_i Jan_j de nijlpaarden zag_i voeren_j, where crossing results from the interaction of opposite directions for simple and partial combinations of functions.

The papers by Comrie and Hyman are those most committed to the search for external explanations for language universals. Comrie argues that the distribution of nominative-accusative syntax (roughly, parallel treatment of intransitive subjects and transitive agents) versus ergative-absolutive syntax (roughly, parallel treatment of intransitive subjects and transitive patients) across constructions is not arbitrary. Rather, some constructions have, cross-linguistically, a bias toward nominative-accusative syntax (e.g. imperatives), while others have a bias toward ergative-absolutive syntax (e.g. resultatives). Crucially, these biases do not remain as arbitrary formal generalizations but are shown to correlate highly with independent semantic or pragmatic properties of the constructions in question. Certain aspects of syntax can thus be viewed as grammatical reflections of semantic and pragmatic properties, although Comrie concedes that in certain instances formal simplicity may override such functional generalizations, and many syntactic generalizations remain for which we have as yet no promising external explanation.

The view of some syntax as the grammaticalization of semantics/pragmatics is taken up by Hyman, who notes an analogy to phonology here: many phonological processes seem to have their origin in natural phonetic processes (e.g. slight nasalization of vowels in the environment of nasal consonants), but such phonological processes come to differ from the corresponding phonetic processes in that the relevant phonetic feature is exaggerated beyond what would be required by purely phonetic considerations (and may ultimately become quite dissociated from its phonetic origin). The most original part of Hyman's discussion of syntax is his consideration of the grammaticalization, for instance in Aghem, of the pragmatic notion of focus. In certain constructions, there is a general pragmatic tendency, irrespective of the grammar of the individual language, for certain constituents to be focused (e.g. imperative verbs, negative elements), with corresponding defocusing of other constituents (e.g. noun phrases); in certain other constructions (e.g. relative clauses), there is a tendency for there to be no focused material. In Aghem, these tendencies have simply been extended to become absolute prohibitions on using the 'focus' form of a noun in constructions where, typically, some other constituent would be focused or there would be no focus at all — irrespective of the actual pragmatics of the construction in question.
Semantics

We now come to three papers which are mainly geared to semantics, namely those by Dahl, Gopnik, and Viberg. Dahl's paper discusses tense systems, more specifically one semantic dimension which can be expressed by such systems, viz. what he refers to as 'temporal distance'. The universal properties of tense systems that are discussed in the paper concern both the ways in which distance in time is measured and the interrelations between this dimension and other semantic factors. In terms of explanations, the emphasis is on possible connections with general properties of human cognition, although such explanations are hinted at rather than asserted. One point that is not stressed in the paper but that should be made here is that tense systems may also be seen as an illustration both of a competition between the needs of different components of the linguistic faculty (perhaps a better term should be chosen here) and of the process of 'grammaticalization' spoken about in Hyman's paper: tense as a grammatical category always involves more or less obligatory marking of certain features of the context of use.

In her paper, Gopnik discusses the claim that meanings can be described in terms of a universal set of semantic features and addresses a question which seems to have been neglected by most adherents of this claim, viz. is it compatible with what we know about the cognitive development of children? Her answer is negative: ontogenetic data show that there cannot be any semantic universals. Whether one agrees with this conclusion or not, it is obvious that considerations about the ontogenetic development of concepts should play a more important role in semantic theory than has been the case so far.

Another relatively neglected area is lexicology from a typological point of view. Viberg's paper treats one important semantic field, that of verbs of perception, and its realization in a wide range of languages. If the structure of the lexicon shows universal tendencies, it is within fields of this kind — that is, fields where the cultural variation can be expected to be small — that such tendencies will show up most clearly. In comparison to most other papers in the volume, Viberg's is more oriented toward description than toward explanation; this is natural in view of the relative sparseness of earlier work in the field.
Phonetics and phonology

Perhaps the most widely accepted of linguistic universals are to be found in the sound structure of languages. Although there are important disagreements as to the best descriptive apparatus, most investigators agree that the sounds of languages are to be analyzed into segments and features in such a way that a particular language is held to use a subset of the universal set of such features (Jakobson). Since features themselves have physical phonetic content (articulatory, acoustic, or auditory), this area seems ripest for an external explanation. But there are two considerations which may count against this. First, it may be that segments and features are not real properties of language but artefacts of our method of description. Second, the articulatory and auditory characteristics of humans that fit them for speech might have evolved to be specific for this purpose; the alternative, external account is that speech has exploited existing physiological and anatomical systems.

The problem of artefactuality is treated in Lindblom, MacNeilage, and Studdert-Kennedy's article. Consider a universal phonetic space in which all possible articulatory gestures — CV syllables — are represented in a three-dimensional array, coding each for its articulatory, acoustic, and auditory properties. In principle, there are an infinite number of such gestures, since the space is continuous in each dimension. A language, L, will need a finite subset, K, of gestures to encode the words of L. Each gesture in K has to be perceptually distinct from other gestures, and these distinctions will be made at some articulatory cost. The cost will be high if very precise articulatory control is needed to distinguish one gesture g1 from another gesture g2, but low if the control needs to be only rather approximate. Given the constraint of producing 'sufficient perceptible benefits at an acceptable articulatory cost', what will the subset K for L look like? There are three possibilities. First, that there are intrinsic discontinuities in the space, and that these are given by the organization of gestures in K. Second, that the emergent structure of K will be very 'diffuse' — that is, each gesture will be as different as possible from every other. Third, that the emergent structure will be 'compact', and gestures may be arranged in minimal pairs. So, for example, if there are no two gestures which differ only by voicing — that is, K contains /p/ and /d/ but not /p/ and /b/ — then K is diffuse. If it contains /p/ and /b/ AND /t/ and /d/, then it will be more compact. Compactness implies featural contrasts and segments; and Lindblom et al. show, in an ingenious simulation, that the emergent structure will tend to be compact.
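The flavor of such a simulation can be conveyed by a deliberately simplified sketch (not the authors' actual model): place candidate gestures in an abstract perceptual space, assign each a notional articulatory cost, and select the vocabulary that maximizes perceptual distinctiveness per unit of cost. All coordinates, costs, and the selection rule below are illustrative assumptions, not values from the paper.

```python
import itertools, math

# Hypothetical inventory: each gesture has 2-D 'perceptual' coordinates
# and an articulatory cost (all numbers are made up for illustration).
GESTURES = {
    'pa': ((0.0, 0.0), 1.0), 'ba': ((0.0, 1.0), 1.2),
    'ta': ((1.0, 0.0), 1.0), 'da': ((1.0, 1.0), 1.2),
    'ka': ((2.0, 0.0), 1.1), 'ga': ((2.0, 1.0), 1.4),
    'fa': ((0.5, 2.0), 1.6), 'sa': ((1.5, 2.0), 1.5),
}

def distinctiveness(subset):
    """Minimal pairwise perceptual distance within a candidate vocabulary."""
    coords = [GESTURES[g][0] for g in subset]
    return min(math.dist(a, b) for a, b in itertools.combinations(coords, 2))

def total_cost(subset):
    return sum(GESTURES[g][1] for g in subset)

def best_vocabulary(size):
    """Pick the size-member subset with the best benefit/cost ratio."""
    return max(itertools.combinations(GESTURES, size),
               key=lambda s: distinctiveness(s) / total_cost(s))

print(best_vocabulary(4))
```

In richer versions of such a simulation, the winning subsets tend to reuse a few articulatory dimensions contrastively, the 'compact', feature-like organization Lindblom et al. report, rather than scattering gestures diffusely.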
The second consideration (are there speech-specific properties of the auditory and articulatory system?) is treated by Howell and Rosen. To take their main example, most languages distinguish voiced from unvoiced obstruents — for at least some obstruents. For example, English contrasts all stops, Dutch labial and alveolar only (for velars there is /k/ but no /g/). The critical auditory property is the time difference between obstruent release and the onset of larynx activity — 'voicing' — known as Voice Onset Time (VOT). This property has been extensively investigated in many languages, for many obstruents, and in both infants and adults. One of the best-attested findings is that of 'categorical perception' for voicing; that is, we hear a syllable as either /pa/ or /ba/ but not something in between, even if the sound has a VOT intermediate between the normal VOT for /b/ and the normal VOT for /p/. One possible internal explanation of categorical perception is that there is a special feature detector which triggers when VOT is sufficiently long, and it is remarkable that neonates are able to distinguish /ba/ from /pa/ without, apparently, any learning. As against this, Howell and Rosen note that critical VOTs vary from language to language, from context to context, and that indeed the chinchilla can make the same distinction as the child. Nonetheless, the human speech system may be exploiting a natural auditory sensitivity, i.e. using a feature detector which we share with other animals and which we employ for purposes other than speech. As it happens, categorical perception has been reported for plucked versus bowed sounds — as with the violin — which could be underpinned by a similar, or the same, detector, since this distinction also depends on the time between the beginning of the sound and its maximum loudness. Howell and Rosen show that there is no need to postulate a specific detector; rather there seems to be a general decision-making procedure which imposes a distinction on a range of sounds. Or, in Lindblom et al.'s terms, a meaningful contrast 'emerges' from part of a continuous distribution of sounds.
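A minimal way to picture such a criterion-based decision procedure (an illustration, not Howell and Rosen's model): a continuous VOT value is compared against a language-particular criterion, with perceptual noise, so that responses are nearly categorical away from the boundary and uncertain only near it. The criterion value and noise level below are invented for the example.

```python
import random

CRITERION_MS = 25.0   # hypothetical /b/-/p/ boundary for English-like stops
NOISE_SD_MS = 5.0     # hypothetical perceptual noise

def label(vot_ms):
    """Categorize one token: compare the noisy percept against the criterion."""
    percept = vot_ms + random.gauss(0.0, NOISE_SD_MS)
    return 'pa' if percept > CRITERION_MS else 'ba'

# Identification function: proportion of 'pa' responses at each VOT step.
for vot in range(0, 61, 10):
    pa = sum(label(vot) == 'pa' for _ in range(1000)) / 1000
    print(f'VOT {vot:2d} ms -> "pa" {pa:.0%}')
```

Because the same procedure works for any continuous dimension (pluck versus bow onsets as readily as voicing), it captures the point that no speech-specific detector need be postulated; only the placement of the criterion is language-particular.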
Beyond the segment and featural contrasts are the structures that they compose: syllables, words, and phrases. Are these structures, at some level of analysis, universal, such that, like universal feature contrasts, a language exploits some limited range of possibilities? Booij reviews a number of proposals that have been offered for syllable and word structure and evaluates them against a wide range of phenomena from many languages. On the whole, Booij argues that external explanations in terms of perceptual distinctiveness and articulatory efficiency are not sufficient to account for the complex geometry of these structures.

On the other hand, over longer stretches of speech — tone groups — the suprasegmental features that are found do seem highly constrained by these factors, as Collier demonstrates. These features concern, acoustically, pitch movements. Declarative clauses in all languages, including tone languages, start at a higher pitch than they end. This seems to have something to do with the decrease in subglottal air pressure with time. Listeners are sensitive to this, and in tone languages they expect the critical pitch of all tones to be lower at the ends of sentences. Similarly, the speed of pitch movements and the number of significant levels of pitch (which, incidentally, differs among languages) are constrained by our articulatory ability to produce them and by our ability reliably to distinguish, in any context and in any part of the sentence, level pitch differences and the length of pitch drop and pitch rise. However, within these constraints, languages exploit the possibilities rather differently, as Collier elegantly shows.

Although the papers in this volume explore diverse areas from various perspectives, we believe that individually and collectively they point to exciting new directions in the study of language universals. We would like to take this opportunity to thank the many people who have helped to make this first Linguistics workshop an intellectual success and a personal pleasure. We are grateful to the many Consulting Editors and several visitors who participated in the workshop, contributing to the discussions and presenting papers. Hotel Cidadela in Cascais provided pleasant surroundings, efficient organization, and excellent food. Marion Smith, then Managing Editor, was responsible for most of the preliminary organization and the extensive correspondence necessary for the international gathering: to her our heartfelt appreciation. Finally, our thanks go to Mr. Arie Bornkamp, Managing Director of Mouton Publishers, who was convinced that this workshop would be good for linguistics as well as for Linguistics. We hope and believe that this conviction was justified.

July 1982
BRIAN BUTTERWORTH
BERNARD COMRIE
ÖSTEN DAHL
Constraints on gaps: is the parser a significant influence?

JANET DEAN FODOR
1. Functional explanations
I have been concerned for some years with the question, why are the syntactic constructions of natural languages restricted as they are? In this paper I want to try out a new answer.1 We can sort the kinds of answers that are available in terms of the components of the human language faculty that they appeal to. They may refer to formal limitations on the class of grammars that people can mentally represent, to the limited powers of the mechanisms for learning or using language, or to the adequacy of language as a code for human communication. Thus we can select among the answers:

(i) Representation: because of the innate peculiarities of the grammar-representation centers of the human brain.
(ii) Acquisition: because children couldn't learn languages that were not so restricted.
(iii) Production: because otherwise people couldn't use their grammars efficiently to encode meanings into sentences.
(iv) Perception: because otherwise people couldn't efficiently decode meanings from word strings.
(v) Communication: because any other set of constraints would unduly limit people's ability to express the sorts of messages they typically want to express.
(vi) Pot pourri: some combination of the above.

This list seems to exhaust the alternatives. Unless there is some other potential influence that we haven't yet thought of, the viable human languages will be just the representable, learnable, speakable, understandable, and communicatively useful languages. What we would like to be able to do (at least, those of us with the temerity to consider why-questions about language at all) is to determine whether each of these factors does in fact have a significant influence on the shape of languages and, if so, what their relative strengths are, which ones are responsible for which aspects of language design, and whether there are any significant interactions between them (a trade-off, for example, between ease of processing and ease of learning).
Not all of these factors have received equal attention. So little is known at present about the details of sentence production that there has been no strong move to attribute syntactic constraints to practical limitations on these processes. In this paper I will have nothing to say about sentence production at all. The expressive function of language doesn't lend itself to precise description and has understandably been neglected. I have tried in a small way (Fodor 1981) to remedy this. Despite the problems of studying it, the expressiveness factor is of special interest because its influence is inevitably toward richer languages, subject to fewer constraints. As I will show below, it thus has the potential for interacting in interesting ways with various pressures toward a more restricted class of syntactic constructions.

There has been some enthusiasm during the past few years about the explanatory potential of limitations on the sentence-parsing mechanism. As will become clear, my own opinion now is that only rather minor, peripheral constraints on language can be specifically related to parsing problems. Whether there is some more global way in which the exigencies of sentence parsing might have contributed to language design is the topic of this paper, and I will return to it below. More recently, interest has turned toward functional explanations based on language acquisition, led by the innovative work of Culicover, Hamburger, and Wexler within a standard-theory framework (see Wexler and Culicover 1980, and references therein), and since extended to other theoretical frameworks (e.g. Baker 1979; Pinker 1981). Despite all of this, I think most linguists still favor the view that functional influences on language are minimal and that virtually all universal tendencies are attributable to accidents of human evolution, whose legacy has been a collection of odd limitations on rule format and application programmed into our genes.

A couple of points of clarification are in order here. First, as Chomsky and Lasnik (1977) have observed, functional pressures on language structure might operate either on the development of a particular language or on the evolution of the grammar-representation system. In the latter case, innately determined properties of linguistic competence would have their source in practical problems of language acquisition, sentence production, and so on. It might therefore seem to be something of a category mistake to set off answer (i) above as an alternative to (ii)-(vi). However, I have tried to formulate (i) in a way that emphasizes the arbitrariness of evolution, the opportunism of nature.
Thus (i) is intended to convey that there are no practical reasons behind the limitations on possible grammars other than reasons having to do, for example, with the availability and adaptability of certain preexisting neurophysiological structures when the grammar-representation system was evolving.

The second point to be made about (i) is that, construed in this way, it is bound to be difficult (at least, given the present state of knowledge about the evolution of the brain) to come up with anything that would count as positive evidence in its favor. One consequence of this is that it makes excellent sense to investigate (ii)-(vi) even if one is a firm believer in (i); (i) will be supported to the extent that we find no evidence for (ii)-(vi), as long as we can convince ourselves that this is not due simply to our own shortsightedness. (I don't mean to underestimate this latter point. Undoubtedly there is a chance that we may overlook genuine functional explanations, just as there is a chance that we will be taken in by spurious ones. But the only way of avoiding these risks is to give up all search for explanations.)

In previous work I have argued for a version of (vi), a 'negotiation' model according to which the shape of a language is the result of interactions between parsing considerations, expressive considerations, and innate limits on grammar representation. Unlike some attempts to implicate sentence parsing, this model suggests that the parser has little qualitative influence on the structural properties of a language. The qualitative details of syntactic constraints are determined primarily by the grammar and the expressor (where I use these terms as a loose shorthand for the grammar-representation system and for whatever aspect of the language faculty concerns itself with protecting the expressive potential of the language). Apart from a few specific cases, the role of the parser is to provide a general negative pressure, pressure toward some (i.e. any) constraints on syntactic constructions.

In section 4 below, I will reconsider this model and suggest that this negative pressure from the parser is required only because current theories of grammar are too rich. On different theoretical assumptions, which give up both transformational rules and the array of constraints needed to limit their application, we can see the grammar itself as the negative influence. The 'negotiations' can then be reduced to a battle between the grammar and the expressor, with the parser dropping out almost completely. Despite appearances, this new model doesn't really amount to a rejection of the idea that performance considerations can shape linguistic competence.
The reason that it is difficult to pinpoint any specific effects of the parser is simply that, with the shift in the role of the grammar, the concerns of the parser and the grammar (and of the acquisition device too) now virtually coincide; only the expressor stands out in opposition to the others. A great deal more work is needed to validate the theory of grammars on which this new 'cooperative' model is based, but I consider the general idea to be very encouraging. There is obviously no a priori guarantee that selection pressures designed the human language faculty so elegantly that one and the same kind of language is optimal for all of its components. But to the extent that this is in fact the case, it can certainly help to explain how one kind of language could have gained a significant selective edge over all other contenders.
2. The negotiation model

Research on sentence processing has identified a variety of constructions in English and other languages which are difficult for the parsing mechanism to cope with. By extrapolation, especially where we have been able to develop a fairly specific model of how parsing proceeds, we can often plausibly argue that certain other imaginable constructions which don't occur in the language would be difficult to parse if they did occur. It is these latter cases which support the idea that the parser can have an impact on the grammar (insofar, of course, as they cannot be accounted for in terms of other components of the language faculty). But the former kind of case, in which a difficult-to-parse construction is NOT excluded from the language, is also important. It argues that the parser is not the only, or even the most powerful, influence on the shape of the language. Some parsing problems apparently do have an impact on the grammar, but others do not, and an adequate theory must account for the distinction. As long as the parsing problems themselves do not differ in any obviously relevant way (e.g. in severity), the theory must appeal to other factors which could screen any constraints that the parser may propose and approve some while rejecting others. Thus we arrive at the negotiation model, with the grammar and the expressor as the most likely adjudicators of the parser's demands.

Before turning to the 'filler-gap constructions' which are the main focus of this paper, I will present some simple examples to illustrate this point and will show how the grammar and the expressor can temper the proposals of the parser. I will begin with a clear case, where there is a fairly convincing parsing problem that could be resolved if the grammar contained a constraint to exclude the offending constructions, and also little doubt that the grammar does indeed contain the constraint in question.
Bever (1970) and Chomsky and Lasnik (1977) have suggested (with slight differences of detail) that the exclusion of constructions like (4) facilitates sentence parsing by permitting the parser to rely on the general hypothesis that a sentence-initial clause is the main clause unless explicitly marked as subordinate.

(1) It is clear that he loves her.
(2) That he loves her is clear.
(3) It is clear he loves her.
(4) *He loves her is clear.
All the ingredients of a plausible functional explanation are here. There is independent reason to believe that the excluded constructions would garden-path the parser. (They are temporarily ambiguous, and the correct analysis is the less probable analysis, given the existence of single-clause sentences and also the preponderantly right-branching structure of English.) There is a rather close match between the class of constructions excluded by the constraint and the class of constructions that would give rise to this confusion in parsing.2 There is no reason to believe that this class of constructions is particularly troublesome to any other component of the language faculty. (The grammar itself is not concerned with ambiguity; it provides a derivation for each sentence of the language, and the formal mechanisms for doing so will be no more complex — in many cases even simpler — if two or more derivations are allowed to converge on a similar surface string. The language-acquisition device and the sentence-production mechanism are presumably concerned with ambiguity only at second hand, insofar as it may obscure the data for learning, or confound the intent of the speaker to be understood.) There is even some evidence that the constraint is absent in languages in which the parsing problem would not arise (e.g. in left-branching languages, for which a main-clause-first guessing strategy would be inappropriate). Last and certainly not least, the constraint does look to be a genuine constraint in the grammar, rather than merely a practical restriction on how a speaker will choose to express what he has to say if he cares about being understood. (A sentence like [4] is rejected even by people who know what it is supposed to mean. The judgement is confident and does not waver even when attention is drawn to pairs like [1] and [2]. Thus there is a strong consensus that the meaning of [3] cannot be expressed as in [4], even when all potential performance problems have been defused.)

The case of (4) contrasts, as Chomsky and Lasnik note, with that of (5).
(5) The horse raced past the barn fell.
Sentence (5) is extremely difficult for uninitiated subjects to understand, and what is especially interesting about it is that the misanalysis is exactly comparable to that attributed to the ungrammatical (4): the first few words, which do not constitute the main clause, are construed as the main clause. In explaining why (5) is allowed to stand, incomprehensible as it is, Chomsky and Lasnik endorse a limited version of the negotiation model. They observe that (5) differs from (4) in that its ambiguity turns on the morphological indistinguishability of the past-tense main verb raced and the passive participle raced; there is no such ambiguity in (6).

(6) The horse ridden past the barn fell.
Assuming that a constraint to exclude (5) would have to take the form of a surface filter, they propose a theory of surface filters which precludes sensitivity to morphological ambiguities. The parser will want to be rid of (5) just as much as it wants to be rid of (4), but because of innate constraints on grammars the parser's needs cannot be accommodated.3

It is worth considering one more example, which is more typical in that the status of the proposed grammatical constraint is unclear. Sentence (7) is almost invariably interpreted as in (8), and not as in (9).

(7) Nobody saw the policeman who was sitting in the back row.
(8) The policeman who was sitting in the back row was seen by nobody.
(9) Nobody who was sitting in the back row saw the policeman.
That is, the relative clause in (7) is not typically construed as having been extraposed from the subject phrase, even though there is a relative-clause extraposition rule in English. It has been proposed that this rule is subject to a transderivational no-ambiguity constraint, whose function, just like that of the constraint against (4), is to facilitate parsing by guaranteeing the correctness of the favored analysis of an ambiguous word string. (In this case the operative parsing strategy is either to assume the minimal number of transformational operations or, as proposed in Frazier and Fodor 1978, to make local structural associations between neighboring words.)

That the parser would benefit from a constraint on relative-clause extraposition is not in dispute. The uncertainty about this example is whether or not such a constraint exists. If it does, it is a very squishy constraint. Judgements differ considerably from one speaker to another and are sensitive to variations in number, animacy, the noun/pronoun distinction, and prosody, and even to semantic and pragmatic plausibility; any or all of (10)-(14) may be judged to be more acceptable than (7) on the extraposition analysis.

(10) Nobody saw the policemen who was sitting in the back row.
(11) Nobody saw the fistfight who was sitting in the back row.
(12) Nobody saw him who was sitting in the back row.
(13) Nobody saw the policeman, who was sitting in the back row [where the comma indicates a prosodic break].
(14) Nobody saw the explosion that was sitting in the back row.
In other words, the extraposition construction is apparently acceptable just to the extent that it is unambiguous, regardless of the grammatical or extragrammatical nature of the disambiguation. We cannot absolutely rule out the possibility that the grammar contains a sharp constraint which tends to be overridden in practice just where no ambiguity would arise, but it does seem more natural to conclude that there is after all no constraint in the grammar, but only a practical stylistic principle which tells speakers to avoid marked and potentially confusing constructions.

If this is so, we can ask whether there are any factors that would tend to oppose a grammatical constraint in this case. In fact, we can reconstruct a rather interesting and intricate debate between the parser, the grammar, and the expressor. The exclusion of just the truly ambiguous examples would call for either a transderivational constraint on the extraposition rule or a surface filter sensitive to all sorts of semantic and lexical properties of sentences; these we can confidently suppose to be either outright incompatible with innate limitations on grammars, or at least assigned a very high cost by the evaluation metric. The parser might then press for a broader constraint, which would sacrifice some harmless examples along with the ambiguous ones — for example, a filter prohibiting an extraposed relative immediately to the right of any noun phrase. But this filter too would be complex. A mere string condition would be inadequate, since it would exclude nonextraposed relatives as well (see below for further discussion). A structure-sensitive filter would have to be rich enough to exclude relatives following VP-dominated NPs, NPs in prepositional phrases under VP, and NPs in final position deep within any preceding complex constituent such as a complement clause.

Frustrated once more by the grammar, the parser might then consider requesting that relative-clause extraposition be abolished entirely. The grammar would presumably raise no objection, since this would constitute a simplification; but the parser itself might find this undesirable, since rightward-movement rules apparently facilitate parsing by reducing structural complexity at the beginnings of sentences. Whatever the precise reasons may be, there seems to be a tendency for sentences to be easier to parse to the extent that complex constituents appear late rather than early (see Yngve 1960; Frazier i.p.). To drop the extraposition rule from the grammar would therefore be to exchange the disadvantages of ambiguity for an increase in the computational effort associated with unambiguous constructions. The latter problem could be solved in turn by a constraint prohibiting relative clauses within subject phrases, but this would surely meet with strong opposition from the expressor, since now it would have no way of modifying subjects with either extraposed or nonextraposed relatives. The result, then, is a deadlock in the negotiations, leaving the parser to cope with its ambiguity problem as best it can, with the assistance only of charity on the part of speakers.

The general conclusion to be drawn from such examples is this. Some grammatical constraints can plausibly be regarded as motivated by parsing problems, but not all parsing problems succeed in motivating grammatical constraints. The reason that we have adduced for the latter point is that a constraint exactly tailored to the parsing problem is not of the kind that will fit into the grammar neatly, or at all, while a broader constraint with a simpler grammatical formulation would significantly reduce the expressive capacity of the language. It seems, then, that parsing considerations can shape languages only within the bounds set by universal grammar, and we might conclude from this that the parser is too weak an influence to have had any role in determining the properties of universal grammar. In fact, this last step is an uncertain one, for it is possible that the unresolved parsing problems that we now observe represent just those cases in which the parser happened to be the loser in long-past evolutionary battles — perhaps the parsing problems would now be far worse were it not that the parser also won some of those battles.

It is worth noting in this connection, though, that universal properties of grammar could in principle have been very different from what they are, in a way that would have guaranteed that the parser would never have to struggle with ambiguity. Suppose, for example, that the grammar of a natural language consisted exclusively of interpretive rules governing the assignment of structure to word strings and of meaning to structures. It would then effectively mimic the operations of the comprehension routines. We could also suppose that these rules were ordered and subject to 'elsewhere' conditions, such that later rules would be inapplicable if earlier rules applied successfully. By matching these grammatical 'elsewhere' rankings to the inherent preferences of the comprehension routines, it could be guaranteed that whatever analysis the parser preferred was the one and only correct analysis licensed by the grammar. Coping with relative-clause extraposition would be child's play in such a system. Natural-language grammars are in fact quite obviously not like this. Further speculation about why this is so is rather far removed from any concrete evidence; it might be that such grammars would be hopelessly unwieldy for sentence production, or it might simply be that functional influences of any kind run out at this point.
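A toy rendering of such an 'elsewhere'-ordered interpretive grammar may make the idea concrete (this is a sketch of the hypothetical system just described, not of any actual grammar; the rules and the sentence representation are invented). Each rule is tried in order, and the first one that applies blocks all later ones, so the parser's preferred analysis is by construction the only grammatical one.

```python
# Hypothetical 'elsewhere'-ordered interpretive rules: each rule either
# returns an analysis for the input or None. Earlier rules pre-empt later
# ones, mirroring the parser's own preference ranking.

def relative_modifies_adjacent_np(words):
    # Preferred analysis: attach a relative clause to the NP just before it.
    if 'who' in words:
        i = words.index('who')
        return ('attach-relative', words[i - 1])
    return None

def relative_extraposed_from_subject(words):
    # Elsewhere case: treat the relative as extraposed from the subject.
    if 'who' in words:
        return ('extraposed-relative', words[0])
    return None

RULES = [relative_modifies_adjacent_np, relative_extraposed_from_subject]

def analyze(words):
    """Return the analysis of the first applicable rule; later rules never fire."""
    for rule in RULES:
        analysis = rule(words)
        if analysis is not None:
            return analysis
    return ('no-analysis',)

print(analyze('nobody saw the policeman who was sitting there'.split()))
# -> ('attach-relative', 'policeman'); the extraposition analysis is blocked.
```

In such a system ambiguity never reaches the parser, because the ranking legislates it away; the point of the passage is precisely that natural-language grammars do not work like this.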
But we have at least established with some confidence that there are powerful influences on the composition of a grammar other than the parser. As we shall see, this conclusion is supported by consideration of a range of other examples as well.
3. Constraints on fillers and gaps

Constraints on movement and deletion phenomena have been intensively studied, and they are among the most central and universal constraints on natural languages. It would be particularly impressive if we could find functional explanations for these very general constraints, rather than just for an unsystematic collection of prohibitions against constructions that present idiosyncratic parsing problems. A movement or deletion rule creates a 'gap' in a sentence, where a constituent required by the base rules does not appear on the surface; this gap must be associated in some fashion with its 'filler' (the moved constituent, or the antecedent of the deleted constituent). Filler-gap constructions clearly impose a certain burden on the parser. Fillers and gaps must be identified, though they are often ambiguous, and they must be properly associated in accord with the constraints defined by the grammar. It has been very tempting to view these constraints not as a further complication of the parser's task but as a simplification, in that they crystallize within the grammar the parser's own inherent tendencies. The suggestion is that fillers and gaps are not permitted to appear in positions in which it would be difficult for the parser to detect them, and that patterns of filler-gap association are not permitted if they would tend to be overlooked by the parser.

Let us consider the association problem first, for this provides one of the successes of the functional-explanation project. A sentence containing two filler-gap dependencies is potentially ambiguous if either filler is of the right category to be associated with either gap. In fact, in English and other languages, this potential ambiguity is resolved by the nested-dependency constraint (NDC), which requires the two dependencies to be either disjoint or nested, not intersecting. My favorite example is still (15), which clearly has the nested store boxes in what reading but not the intersecting store what in boxes reading.

(15) What_i are boxes_j easy to store __j in __i?

We know that ambiguities of any kind can be a problem for the parser, and this kind of ambiguity could be expected to be particularly disruptive.
If we are right in assuming that even one filler-gap dependency creates extra work, and that two dependencies create at least twice as much work, then it is understandable that the parser would prefer not to have to juggle alternative hypotheses about which two dependencies it is dealing with. Postulating a parsing motivation for the NDC is thus very plausible. It is further reinforced by the observation that the NDC is a no-ambiguity constraint, i.e. that it is inapplicable when other facts about the grammar determine a unique pattern of filler-gap associations. In (16) there are two intersections, and this is permitted because there is no other way in which the fillers could be linked up with the gaps.

(16) Which crimes_i did the FBI not know how_j to solve __i __j?

Ideally, we would be able to explain not only the existence of the constraint but also its direction. Why are nested dependencies favored rather than intersecting ones, when either constraint would successfully remove the ambiguity? There have been several suggestions as to why the parser might favor the nested analysis (Kaplan 1973; Fodor 1978; Steedman, this volume). I now think that the most plausible one has to do with asymmetric probabilities of gap distribution determined by the c-command constraint.4 Assuming that two fillers must be at different heights in the phrase marker (i.e. cannot be sisters), the c-command condition entails that the domain of possible gap positions for the lower filler will be smaller (or at least no greater) than that for the higher filler. The probability that the gap for the lower filler will be in the smaller domain is 1, while the probability that the gap for the higher filler will be in this smaller domain is less than 1, since the latter gap might either precede or follow this domain. Thus for two filler-first dependencies, given that the gap for the higher filler does not precede the domain for the other gap (= disjoint dependencies), the probability that the gap for the higher filler will occur first (= intersecting dependencies) is lower than the probability that this gap will occur second (= nested dependencies), even in the absence of any constraint in the grammar. This means that if the parser has not already found the gap for the higher (first) filler before it encounters the lower (second) filler, its best bet will be that the first gap it encounters belongs to the lower filler. (Comparable considerations apply to gap-first dependencies; see Fodor 1983 for details.) A parser that adopted a guessing strategy based on these probabilities would favor nested filler-gap assignments over intersecting ones; and crystallization of this tendency in the form of a grammatical constraint would result in the NDC rather than its inverse. Thus both the existence and the direction of the constraint can be related to the operation of the parsing routines.
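The guessing strategy described here amounts to a last-in, first-out discipline for fillers, and the connection between a stack and nestedness can be shown with a small sketch (an illustration of the general idea, not Fodor's parsing model; the sentence encoding is invented): if each incoming gap is assigned to the most recently stored filler, the resulting dependencies can only be disjoint or nested, never intersecting.

```python
def associate(tokens):
    """Assign each gap to the most recent unconsumed filler (LIFO).

    tokens: a sequence of ('filler', name) and ('gap',) events in
    left-to-right sentence order. Returns filler-gap pairings.
    """
    stack, pairings = [], []
    for i, token in enumerate(tokens):
        if token[0] == 'filler':
            stack.append(token[1])
        elif token[0] == 'gap':
            # Popping the stack enforces nested (or disjoint) dependencies:
            # the most recently seen filler is consumed first.
            pairings.append((stack.pop(), f'gap@{i}'))
    return pairings

# 'What are boxes easy to store _ in _?' as a schematic event stream:
events = [('filler', 'what'), ('filler', 'boxes'), ('gap',), ('gap',)]
print(associate(events))
# -> [('boxes', 'gap@2'), ('what', 'gap@3')]: the nested reading,
#    'store boxes in what'; the intersecting reading never arises.
```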
In contrast with this success, the search for constraints motivated by problems of filler and gap identification has been a striking failure. Parsing problems exist, and constraints exist, but there is little correlation between them. I will concentrate on gap identification here, but comparable problems can arise in filler identification too.5

A gap will be ambiguous ('doubtful') if it appears in the position of an optional constituent in the base. Since the verb read has an optional direct object, examples (17) and (18) both have a doubtful gap in postverbal position; (17) has no other potential gap, so this doubtful gap is the true gap, but (18) has a doubtless gap after the preposition at the end, and so the doubtful gap is a false gap.

(17) Which book did you read __ to the children?
(18) Which book did you read to the children from __?

Only the ends of sentences (17) and (18) disambiguate their identical beginnings, and ambiguities of this sort can clearly extend over even greater distances. The lexical-expectation model of Fodor (1978) claims that the parser hypothesizes a gap in a given position, for example a noun-phrase gap in postverbal position, just in case (a) there is no noun phrase in that position in the surface word string, and (b) a deep structure with a noun phrase in that position is more probable than a deep structure without one. If this is correct, the parser will not always be led astray by doubtful gaps, but only by likely false gaps and unlikely true gaps.
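As a sketch, the model's gap-postulation decision can be written as a single test over verb subcategorization preferences (the probability figures are invented for illustration; the model itself is stated in qualitative terms):

```python
# Hypothetical probabilities that a verb takes an NP direct object in
# deep structure (illustrative values only).
P_NP_OBJECT = {'read': 0.8, 'sing': 0.3, 'devour': 0.99}

def hypothesize_gap(verb, next_is_np):
    """Posit a postverbal NP gap iff (a) no NP follows on the surface and
    (b) an NP object is the more probable deep structure for this verb."""
    return (not next_is_np) and P_NP_OBJECT.get(verb, 0.5) > 0.5

# 'Which book did you read to the children ...': no NP after 'read',
# and 'read' usually takes an object, so a (possibly false) gap is posited.
print(hypothesize_gap('read', next_is_np=False))   # True
print(hypothesize_gap('sing', next_is_np=False))   # False
```

On this sketch the model's failure modes fall out directly: a verb like read yields a false gap in (18), while an unlikely true gap after a verb like sing would initially be overlooked.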
But even so, the frequency and severity of the gap-identification garden paths would seem to be considerable, and we need some explanation for the absence of any constraint in the grammar to rule out at least (18), if not (17). The explanation I have proposed (Fodor 1978) is very similar to Chomsky and Lasnik's suggestion about surface filters — that is, that the necessary constraints would have to be highly sensitive to lexical details, though in this case the details would be a matter of subcategorization rather than of morphological form. However much the parser might want such constraints, the grammar might very well reject them as too complex.

However, just as in the discussion of the examples in section 2, we must also consider the possibility of grammatically more natural constraints which would exclude the troublesome examples and others besides. Why not, for example, prohibit ALL direct-object gaps, regardless of whether the subcategorization of the verb is ambiguous? Why not prohibit gaps altogether? The answer, presumably, is that this would cramp the style of the expressor. I will return to this point below. In any case, the grammatical-complexity considerations offer a solution to only one part of the problem. On the one hand we have troublesome gaps that are permitted by the grammar; on the other hand we have trouble-free gaps that are prohibited by the grammar. What could be easier to detect than the gap in (19) or (20)?

(19) *Who were you hoping for __ to win the game?
(20) *What did the baby play with __ and the rattle?
How could the gap in (21) be difficult to locate if the gap in (22) is identifiable?

(21) *John tried for Mary to get along well with __.
(22) John is too snobbish for Mary to get along well with __.
Why should the gap in (23) be any harder to find than the gap in (24)?

(23) *The second question, that he couldn't answer __ satisfactorily was obvious.
(24) The second question, it was obvious that he couldn't answer __ satisfactorily.
The constraints illustrated in these examples don't have any obvious motivation other than the problems of gap identification, but they also aren't well designed as solutions to those problems. The data seem to run exactly contrary to any functional considerations. If we are not prepared to give up on the functional-explanation project entirely, we need a new way of looking at the facts.

We have been looking for prohibitions on gaps in specific positions in which they would be ambiguous. Examples like (19) and (20) illustrate specific positional constraints, but it turns out that these are atypical. The fundamental idea behind Ross's (1967) characterization of the island constraints, which has been borne out by subsequent work, is that the constraints are relational: they limit the positions of gaps in relation to the positions of their fillers.6 What this suggests is that the parser does not search for fillers and gaps independently; rather (in the case of a filler-first dependency), it identifies the filler and then keeps it in mind while searching for a gap to associate with it. If so, then the grammar can assist the parser not by excluding specifically troublesome gaps, but by restricting the domain through which the parser must search for the gap that it needs. Identification problems will remain, but the number of positions in which they can arise will be significantly reduced.

The problem with this story is that it implies that the most helpful grammar of all would be one that determines a unique position for the gap corresponding to any given filler. Given such a grammar, there would be no need for the parser to search at all, and no possible danger that it would misidentify even doubtful gaps. But grammars are in fact not so obliging. In English, equi and raising have unique gap positions, but too/enough deletion and tough movement, though otherwise very similar, do not. And topicalization and WH-movement still offer a vast array of possible gap positions despite the exclusion of sentential subjects, complex noun phrases, and other islands. To flesh out the line of argument we are pursuing, we would therefore need to identify some POSITIVE pressure, which would resist the imposition of constraints strong enough to guarantee a unique position for every gap in relation to its filler.

As we have already noted, this kind of positive pressure is just what we would expect from the expressor. Filler-gap dependencies are used for various expressive purposes. They allow the topic of a sentence to be placed in a prominent position early in the sentence, where it can guide semantic integration of the message; they allow the focus of a question to be assigned a position that defines the scope of the question; they minimize redundancy by permitting repeated material to be deleted; they signal identity of argument terms (as in equi) without the ambiguities of pronominalization; and so on. Admittedly there are other ways of achieving the same communicative ends (e.g. resumptive pronouns, grammatical morphemes as scope markers), but fillers and gaps certainly constitute one good device for expressing semantic and pragmatic distinctions above and beyond the who did what to whom distinctions of the basic clause; and judging by how often they are found in natural languages, they seem to be a grammatically natural device for this purpose. Massive restrictions on filler and gap distribution, however acceptable to the parser and the grammar, would thus undermine the value of the language for communication.

As I have argued elsewhere (Fodor 1981), the particular concerns of the expressor, unlike those of the parser, exhibit a rather close correspondence with the particular profile of constraints that we observe on fillers and gaps. For instance, equi and too/enough constructions center on a predicate which expresses a relation between the matrix subject (or object) and an event or state of affairs described by the complement clause. The meanings of equi verbs and adjectives typically have to do with intentions, plans, and desires, and are such that the matrix subject (for intend, try, etc.; the matrix object for force, persuade, etc.) is the most likely agent for the complement event and hence, in the unmarked case, is most likely to be coreferential with the SUBJECT of the complement. If the gapping of the coreferential noun phrase in the complement is to be used as the signal for this identity of arguments, a highest-subject gap in the complement clause would therefore be expressively more useful than gaps in other positions.
Under sufficient pressure from the parser to economize on gap positions, the expressor might be prepared to permit the grammar to exclude all but highest-subject gaps (especially given the existence of passive and other devices for turning a to-be-gapped noun phrase into a subject if it happens not to be the deep subject). By contrast, too/enough predicates express an enabling or disenabling relation between the complement event and some entity, referred to by the matrix subject. Since it is assigned a causal influence, this entity will typically be one of the participants (agent, object, instrument, goal, etc.) in the complement event. The role of the entity in the event must be fairly central but is otherwise quite free; there is certainly no necessity, or even special likelihood, that it will be the agent. A restriction to highest-subject gaps in too/enough constructions would therefore meet with more resistance from the expressor than in the case of equi.

The gap in a WH-question corresponds to an entity about which more information is requested. Here too, semantic considerations preclude a severe restriction on gap positions. Almost any major phrase of a sentence could plausibly function as the focus of an information request, and it would be quite unnatural from an expressive point of view for the grammar to permit, say, only direct-object gaps. The only sort of hierarchy of expressive importance that we might anticipate here would favor major constituents over more deeply embedded ones, with constituents buried within modifying phrases or clauses the least likely question foci. This, of course, is just the sort of gap-distribution principle that we observe for WH-movement, with different languages establishing different cutoff points on the hierarchy.

These expressive hierarchies for different constructions are only probabilistic. We may certainly, on occasion, find ourselves wanting to express an equi-type relation with an oblique object, or to question a part of an adverbial clause. But the parser will gain greater certainty from a sharp syntactic constraint on gap positions than from a fuzzy semantic tendency, and so we can imagine that the expressor is persuaded to relinquish absolutely those gap positions which it would make use of only rarely if they were available.

As always, the grammar contributes to these negotiations too. At least in some nontransformational frameworks (such as the generalized phrase-structure system that I will discuss further below), the observed constraints are of just the kind that the grammar can most readily accommodate. The probabilistic difference between equi and too/enough can be sharpened up into a highest-subject-gap constraint on equi because predicates can be subcategorized either for clausal complements with gaps or for VP complements which are equivalent to clauses with highest-subject gaps; the cost to the grammar of this constraint is therefore negligible. The loose hierarchy of expressive importance for WH-gaps can be sharpened up into island constraints if the grammar prohibits a filler-gap association from being passed down by the phrase-structure rules through certain types of nodes (see below for details); the result will be that the stretch of a sentence from which the gap is excluded must constitute a complete phrase of some characteristic type (adverbial clause, sentential subject, complex noun phrase, etc.). In the absence of innate limitations on possible (and highly valued) grammars, other constraints could perhaps be imagined which would give a more delicate fit to the concerns of the expressor and the parser, but the evidence is that the constraints that they can draw on are all cut to a rather restricted grammatical pattern.
structure rules through certain types of nodes (see below for details); the result will be that the stretch of a sentence from which the gap is excluded must constitute a complete phrase of some characteristic type (adverbial clause, sentential subject, complex noun phrase, etc.). In the absence of innate limitations on possible (and highly valued) grammars, other constraints could perhaps be imagined which would give a more delicate fit to the concerns of the expressor and the parser, but the evidence is that the constraints that they can draw on are all cut to a rather restricted grammatical pattern.

The negotiation model that we have arrived at for filler-gap phenomena can be summarized thus:

The parser requests some (i.e. any) constraints that will reduce its uncertainty about filler and gap positions, ideally constraints that uniquely identify their structural relationship in a phrase marker.

The grammar offers certain constraints that can be formulated simply within the innately given representation system for grammars, not all of which are as restrictive as the parser might like.

The expressor tolerates those of the constraints that the grammar offers which will do least damage to the expressive value of the construction in question.

This model, I think, has its merits, but it should be observed that the role of the parser has been reduced to that of a general restrictive influence, with no effect on the specific profile of constraints that the language will exhibit. Parsing is relevant to the question, why are there any constraints at all?, but not to the question, why these particular constraints? It is the grammar and the expressor between them which qualitatively shape the language.

This is a sweeping conclusion, which needs to be moderated somewhat. We have observed that various peripheral constraints (such as the obligatory-complementizer constraint for subject clauses) seem to reflect the needs of the parser quite directly. All no-ambiguity constraints probably arise as solutions to parsing problems. Also, we have considered the possibility that certain rules (such as optional extraposition rules) which create additional freedom in a language may be devices for simplifying the parser's task. Nevertheless, the fact that the parser can be implicated only in the most general fashion in the design of the central constraints on movement and deletion phenomena suggests that we should at least consider the possibility that the parser has nothing to do with these constraints at all. Its role in the negotiations is so bland, so lacking in identifiable idiosyncrasies of the parsing mechanism, that we may have confused it with the influence of some other component of the language faculty. The new model outlined in the next section shows that, by revising
our syntactic theory, we can see the grammar itself as the source of the negative pressure and can thus dispense with all reference to the parser.
4. The grammar as negative influence

Current grammars, both transformational and nontransformational, contain overgenerating rules whose effects must be held in check by a variety of constraints (universal or language-particular) on their application. That is, a grammar (including its innate aspects) consists of a mix of positive statements and negative statements, where a positive statement is one such that, if it were omitted from the grammar, the set of derivations would be reduced, and a negative statement is one such that, if it were omitted from the grammar, the set of derivations would be increased. The general trend of recent syntactic research has been toward very general positive statements (e.g. Move α), with more and more of the distributional work to be done by a rich body of negative statements (e.g. conditions on government and binding). It is in this vein that I have referred in the previous section to CONSTRAINTS on (i.e. NEGATIVE statements about) filler and gap distribution. And it is this way of viewing grammars that requires the parser to be the prime mover in establishing the constraints.

Let us review the argument. Filler and gap positions are assumed to be limited by negative statements in the grammar. Any statement in the grammar adds to its complexity. Complexities call for explanation. Why aren't grammars simpler? We can understand why they couldn't be simplified by the omission of all positive statements, for the positive statements provide ways of expressing messages. But why shouldn't they be simplified by omission of their negative statements, whose effect, after all, is to prohibit some ways of expressing messages? Given that neither the expressor nor the grammar itself could be what favors the inclusion of these negative statements,7 our model has had to appeal to the parser (and/or, conceivably, to the sentence-production and language-learning mechanisms), which very well might prefer it if there weren't too many ways of expressing messages.

The only alternative to this reasoning is to turn the complexity metric for distributional constraints upside down. Suppose that grammars could contain only very narrow positive statements, so that every additional freedom in the distribution of fillers and gaps would require additional rules in the grammar. Then the grammar itself would favor restricted distributions, since restrictions would constitute simplifications rather than complications.
The phrase-structure component of a generalized phrase-structure grammar (GPSG; see Gazdar 1981, and elsewhere) has just this property. In addition to standard phrase-structure rules, it contains linking rules and derived rules which control the distribution of slash categories. A slash category of the form X/Y indicates a phrase of category X which contains a gap of category Y. (For example, for John to sit on is an S/NP; put the book is a VP/PP; a story about is an NP/NP.) A filler constituent is associated by a linking rule with a constituent containing a gap of the appropriate category. The slash annotation denoting the gap is passed by the derived rules down through the phrase marker to the lexical level where it is cashed out as a trace. Sentence (25), for example, is assigned the phrase marker (26) by rules (27).

(25) That boy, I punched on the nose.

(26) [S [NP That boy] [S/NP [NP I] [VP/NP [V punched] [NP/NP trace] [PP [P on] [NP [Det the] [N nose]]]]]]

(27) S → NP S/NP           (linking rule)
     S/NP → NP VP/NP       (derived rules)
     VP/NP → V NP/NP PP
     NP/NP → trace         (linking rule)
     PP → P NP
     NP → Det N
Note that these rules permit only a direct-object gap in the main clause. For NP gaps in prepositional phrases the rules in (28) must be added to the grammar. For NP gaps in subordinate clauses the rules in (29) must be added. For gaps within noun phrases the rules in (30) must be added. For PP gaps the rules in (31) must be added.

(28) VP/NP → V (NP) PP/NP
     PP/NP → P NP/NP

(29) VP/NP → V (NP) S̄/NP
     S̄/NP → COMP S/NP

(30) NP/NP → NP PP/NP

(31) S → PP S/PP
     S/PP → NP VP/PP
     VP/PP → V (NP) PP/PP
     PP/PP → trace
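The bookkeeping here is mechanical enough to check with a small program. The following sketch (in Python; it is an illustration of the arithmetic, not anything in Fodor's paper, and the rule table, lexicon, and sentences in it are invented for the purpose) treats slash categories such as S/NP as atomic symbols of an ordinary context-free grammar, so that rules in the style of (27)-(28) can be consumed by a naive recognizer.

    # A minimal sketch, assuming slash categories can be treated as atomic
    # symbols of an ordinary context-free grammar. RULES encodes (27) plus
    # one instantiation of (28); all names here are invented.
    RULES = {
        "S":     [["NP", "S/NP"]],           # linking rule of (27)
        "S/NP":  [["NP", "VP/NP"]],          # derived rule
        "VP/NP": [["V", "NP/NP", "PP"],      # derived rule
                  ["V", "PP/NP"]],           # from (28): gap inside a PP
        "PP/NP": [["P", "NP/NP"]],           # from (28)
        "NP/NP": [["trace"]],                # linking rule
        "PP":    [["P", "NP"]],
        "NP":    [["Det", "N"]],
    }
    LEXICON = {"that": "Det", "boy": "N", "I": "NP", "punched": "V",
               "sat": "V", "on": "P", "the": "Det", "nose": "N",
               "trace": "trace"}

    def parses(cat, words):
        """Naive top-down recognition of words as category cat."""
        if len(words) == 1 and LEXICON.get(words[0]) == cat:
            yield True
        for rhs in RULES.get(cat, []):
            yield from expand(rhs, words)

    def expand(rhs, words):
        if not rhs:
            if not words:
                yield True
            return
        for i in range(1, len(words) - len(rhs) + 2):
            for _ in parses(rhs[0], words[:i]):
                yield from expand(rhs[1:], words[i:])

    # (25), with the filler prefixed and the gap cashed out as 'trace':
    print(any(parses("S", "that boy I punched trace on the nose".split())))
    # A gap inside a PP is licensed ONLY because the (28) rules are present:
    print(any(parses("S", "that boy I sat on trace".split())))

Dropping the two (28) entries makes the second call return False while leaving the first unaffected, which is exactly the sense in which each additional gap position has a finite cost in rules.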
A language like Russian, whose gap positions are more limited than those of English, will have a simpler grammar (in this respect) than English. A language with more positions for its gaps would have a more complex grammar. We can imagine, then, that the drive for simplicity in the grammar is constantly at war with the pressure for expressive freedom, and that different languages represent different truces between the two. According to this new model:

The grammar yearns for simplicity, i.e. for fewer rules, with correspondingly more limited filler and gap distributions.

The expressor presses for more rules, to permit fillers and gaps in positions where they would make a significant contribution to the expressive capacity of the language.

The parser now contributes nothing, since its essential role in the previous model has been taken over by the grammar.

The weakness of this new picture is that generalized phrase-structure grammars, as currently conceived, do not in fact consist of rules like those in (27)-(31). The essential ingredients of a GPSG are a set of basic phrase-structure rules and a set of metarules defined over them which generate the slash-category rules from them. These metarules are not dissimilar to transformations, though their function is to derive rules from rules, rather than phrase markers from phrase markers. And, just like transformations, the metarules tend to overgenerate. The derived phrase-structure grammar has the property we require, i.e. the simpler the grammar, the fewer the derivations. But the metagrammar, which determines which rules the derived grammar will contain, has the opposite property. I have argued elsewhere (Fodor 1983) that there is some encouragement for the view that such a grammar will not need to contain constraints on the application of the phrase-structure rules, or filters over the phrase markers that they generate. But it MUST contain constraints on the application of the metarules, or filters over the derived phrase-structure grammars that these generate. The island constraints, for example, amount to restrictions on the operation of the general metarule
(32) (from Gazdar 1981; notation changed here), which creates the derived rules for passing slash annotations down through a tree from filler to gap.

(32) If the basic grammar contains the rule α → σ₁ ... σᵢ ... σₙ, then the derived grammar contains the rule α/β → σ₁ ... σᵢ/β ... σₙ, where 1 ≤ i ≤ n, and α, σᵢ are nonterminal symbols which can dominate β according to the rules of the basic grammar.

If constituents of category X are islands, then no slash annotation may be passed down through a node labeled X. Gazdar proposes, for example, that all left-branching phrases are islands for NP gaps, and he notes that this generalized left-branch condition can be imposed either in the form of the filter (33) over the derived grammar, or by incorporating the stipulation (34) into the metarule (32).

(33) *α/β → σ/β ..., where α and σ are any node labels, and β = NP.

(34) ¬(i = 1 ∧ β = NP)
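Metarule (32) and stipulation (34) can be put in procedural terms. In the sketch below (an illustration only, not Gazdar's own formalism; the set PHRASAL is a crude stand-in for 'can dominate NP', and the basic rules are invented), the stipulation works by removing derived rules, so the constrained metagrammar is textually more complex while its output is smaller.

    BASIC = [("S", ["NP", "VP"]),
             ("VP", ["V", "NP"]),
             ("NP", ["Det", "N"])]
    PHRASAL = {"S", "VP", "NP", "PP"}   # assumed able to dominate an NP gap

    def derive(basic, beta="NP", left_branch=False):
        """Metarule (32): from alpha -> ...sigma_i..., produce
        alpha/beta -> ...sigma_i/beta... for each eligible daughter i."""
        out = []
        for lhs, rhs in basic:
            for i, sigma in enumerate(rhs):
                if sigma not in PHRASAL:
                    continue
                if left_branch and i == 0 and beta == "NP":  # stipulation (34)
                    continue
                out.append((lhs + "/" + beta,
                            rhs[:i] + [sigma + "/" + beta] + rhs[i + 1:]))
        return out

    print(derive(BASIC))
    # [('S/NP', ['NP/NP', 'VP']), ('S/NP', ['NP', 'VP/NP']),
    #  ('VP/NP', ['V', 'NP/NP'])]
    print(derive(BASIC, left_branch=True))   # the left-branch rule is gone
    # [('S/NP', ['NP', 'VP/NP']), ('VP/NP', ['V', 'NP/NP'])]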
But either way, it is clear that the grammar would be simpler without this constraint than with it. So we have lost our answer to the question, why do such constraints exist?

In recent work with Stephen Crain (Fodor and Crain n.d.), we have explored the possibility of eliminating the metarules from a GPSG, and along with them the constraints that must be imposed on them. At first sight this appears to be a highly undesirable move, since the metarules serve two important functions. One function is to ensure that every derived rule corresponds to a basic rule (where by correspondence I mean the formal relation between rule pairs established by [32]). Without this, it would be possible for a topicalization or a WH-movement construction to be structurally quite different from its closest 'untransformed' counterpart in the language. In fact, apart from the presence or absence of the filler and its gap, the two constructions are observed to be structurally very similar. In some cases there may be a difference due to interaction with some other rule (e.g. subject-aux inversion), but there is certainly no massive failure of correspondence. It appears, then, that correspondence is the unmarked case, and that the theory should make it either impossible, or at least significantly more costly, for the grammar to capture noncorrespondence.

The other function of the metarules is to simplify the grammar. We have seen that it often takes several derived PS rules to do the work of one transformational rule. It is a traditional objection to CFPS grammars that, even if they were capable of capturing all syntactic dependencies (which they aren't without the introduction of slash categories), the
28
J. D. Fodor
complexity of the grammar would be disproportionate to the complexity of the phenomena it describes. This objection can be answered in the case of a GPSG if complexity measures are defined not over the derived grammar, but over the basic grammar and the metarule component. The metarules capture the same sorts of broad generalizations as transformations, and a single quite simple metarule can vastly increase the generative capacity of the grammar.

Both of these desiderata — simplicity and correspondence — can be achieved without postulating metarules, if we adopt (and adapt) standard abbreviatory conventions for PS rules. For example, the rules in (35) can be collapsed as in (36), with the additional convention that the contents of the empty angle brackets on the right-hand side of the rule match those of the angle brackets on the left-hand side.8
(35) S → NP VP
     S/NP → NP VP/NP
     S/PP → NP VP/PP

(36) S⟨/β⟩ → NP VP⟨ ⟩
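One way to cash out the convention computationally — purely a sketch, with the placeholder '<>' standing in for the angle brackets of (36) and the gap categories supplied as a parameter:

    def expand_schema(lhs, rhs, gap_categories):
        """Expand an angle-bracket schema: the bracketed material is
        optional, and all brackets in one rule are filled alike."""
        rules = [(lhs.replace("<>", ""), [s.replace("<>", "") for s in rhs])]
        for beta in gap_categories:
            slash = "/" + beta
            rules.append((lhs.replace("<>", slash),
                          [s.replace("<>", slash) for s in rhs]))
        return rules

    for rule in expand_schema("S<>", ["NP", "VP<>"], ["NP", "PP"]):
        print(rule)
    # ('S', ['NP', 'VP'])  ('S/NP', ['NP', 'VP/NP'])  ('S/PP', ['NP', 'VP/PP'])
    # i.e. exactly the three rules of (35), at the cost of a single schema.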
The correspondence condition might simply be stipulated as an absolute requirement on slash-category rules, but it is interesting to note that a relative-cost version of it falls out automatically from the abbreviatory conventions. A slash-category rule that cannot be collapsed with a (corresponding) basic rule would require more symbols for its formulation and thus would be disfavored by any standard evaluation metric.

With this modification to the standard theory of GPSGs, they can be seen as very minimal enrichments of traditional CFPSGs, with only as much machinery added as is absolutely essential to accommodate long-distance dependencies. There are also advantages for the theory of language learning, which will be noted below.

For present purposes, the crucial effect of the elimination of metarules is the conversion of a grammar which, like a transformational grammar, has an inherent tendency to overgenerate (if simplified by loss of the constraints on the metarules), into a grammar with an inherent tendency to undergenerate (if simplified by loss of any of its PS rules). What makes the difference is that, even with the abbreviatory conventions, there is still a finite cost associated with each rule for passing gap information through phrase markers. The conventions ensure that the cost will be small (smaller than the cost of adding a randomly selected rule), so that a language with very free extraction patterns doesn't have to be counted as vastly more complex than a language with very limited extraction patterns. But nevertheless, simplicity of the grammar will be positively correlated with
limitations on extraction. As we have seen, this means that it could be the grammar rather than the parser that contributes the negative pressure which counterbalances the expressor's desire for complete freedom with respect to extraction.
5. Cooperation and competition in language design

In developing the 'pot pourri' alternative of section 1, it is natural to begin by assuming that the various components of the language faculty will each have their own characteristic interests, and that there may be considerable conflicts between them. It seems altogether too much to hope for that the optimal grammar should permit simultaneously the simplest parsing and production operations, the easiest learning, and the greatest flexibility of expression. But a case can now be made for just this state of affairs, as long as we set aside the matter of expressiveness, which is apparently in inevitable competition with simplicity of linguistic competence and the performance mechanisms.9

This 'cooperative model' rests on the modified theory of GPSGs (sans metarules) outlined above. There is some empirical support for this theory, since it predicts (as opposed to merely stipulating) the kinds of limitations on gap distribution that natural languages exhibit (e.g. island phenomena)10 (see Fodor 1983 for more specific predictions about lexical government, boundedness, etc.).11 The modified theory also permits an account of language acquisition which shows how learning can be independent of 'negative data' (i.e. information that certain word strings are ungrammatical), but which presupposes a much more parsimonious account of the learner's innate linguistic endowment than current theories do.

The 'negative-data problem' is a problem only if the innately given representation system for grammars (together with its evaluation metric) favors overgenerating rules which must be held in check by constraints, i.e. by negative statements in the grammar. Negative statements cannot be motivated by positive data, so the only solution is for every negative statement in every possible grammar to be innate (see Chomsky and Lasnik 1977). Since many, perhaps most, of these negative statements are highly language-specific, a child would then have to go through the stage of discarding them on the basis of positive data. By contrast, a modified GPSG contains no negative statements, and hence negative statements have to be neither learned nor innately given.

With this obstacle out of the way, it may be possible to adopt an extremely restrictive theory of innate linguistic knowledge according to
which a child is equipped at birth with ONLY a grammar-representation system (a metalanguage)12 in which to formulate whatever he learns from his linguistic input. Note that every theory has to postulate some innate metalanguage; the novel aspect of this theory is that it does not have to postulate in addition that the child has innate knowledge of any specific statements (positive or negative) framed in this metalanguage (see Fodor and Crain n.d. for fuller discussion). To put it informally, this theory of innate linguistic knowledge presupposes less work on the part of evolution to equip us with the capacity for language.

A rather different line of thought supports the same conclusion. As I have already noted, a GPSG without metarules represents a minimal escalation of standard CFPSGs. Taking the expressor into consideration as well, this suggests a sensible, albeit highly speculative, account of the course of language evolution, an account which minimizes the role of accident, opportunism, and other random factors that we could never hope to identify.

Let us imagine ourselves in the role of Nature, whose task is to design a language faculty for humans which will bestow the maximum selective advantage at the minimum cost. What is the simplest kind of language that will do the job? Clearly not a finite language, given that humans are capable of an infinite range of different thoughts, any one of which they might want to communicate. A recursive finite-state language is a somewhat stronger candidate, but it too is inadequate for expressive purposes because its recursion is in the wrong places. If we assume that the fundamental perceptual/conceptual analysis imposed on states of affairs by the human mind is an analysis into entities and relations between entities, and that a speaker might wish to give a rich description either of the entities or of the relations between them, then it is clear that an ideal linguistic code should permit modification and elaboration of both argument expressions and predicate expressions, and should do so independently of where these expressions appear in sentences. A finite-state language, with its uniformly right-branching (or left-branching) structures, does not offer this kind of freedom, but a phrase-structure language does. The hierarchical structures characterized by PS rules mean that sentences can all be cut to the same few basic patterns, even though some parts of some sentences are much more richly developed than others. And the patterns can correspond in a quite natural way with the basic predicate/argument patterns of sentence meanings.

A CFPS system thus rather nicely balances expressive power with economy of means; but enriching it with the ability to handle long-distance dependencies in sentences would bring additional advantages in the expression of scope relations, coreference, and similar matters that are
beyond the range of predicate/argument relations. The modified GPSG theory I have outlined here implies that these additional advantages could be achieved without any radical restructuring of the language faculty, such as the introduction of a completely new type of rule (transformations or GPSG metarules). It was, in other words, a very small evolutionary step that gave us WH-movement.

There is presumably no way of finding out whether this evolutionary fairytale is true, but it does provide an interesting contrast to the prevailing view that the human language faculty is odd and surprising and riddled with all manner of quirks and unmotivated complications. This could be so, and certain current theories of grammar must assume that it is so, but methodologically it is surely proper to keep trying to show that it is not so.

Finally, I turn to the question of how this kind of grammar could optimize parsing, given that it is much too restricted in format to be able to accommodate many constraints that the parser might have liked as remedies for various kinds of ambiguity or computational overload. The answer, I think, lies in a trade-off between complexity of the parsing process in general, and complexity due to particular problems posed by certain sentences in a language.

In Fodor (1983), I contrast a parser based on a transformational grammar with a parser based on a GPSG. The former needs, in addition to basic phrase-structure-assignment routines, some separate mechanism for coping with fillers and gaps, e.g. for storing a filler (or its trace) in memory and retrieving it from storage at just the positions which the various constraints in the grammar identify as legitimate positions for its gap. By contrast, a GPSG permits a completely uniform parsing process, with no special storage and retrieval routines. Filler and gap associations are established by the slash-category PS rules, and these can be applied to word strings in just the same way, and at just the same time, as the basic PS rules that are used to identify noun phrases, clauses, and so forth.13 Furthermore, since the parsing routines make direct use of the PS rules (in contrast to transformational rules, whose effects must be simulated indirectly in any plausible parsing model), there is no need for information in the grammar to be reshuffled or recoded for purposes of sentence parsing. A child who has acquired a new rule in his grammar can immediately start to make use of it in parsing the sentences he hears.

According to this model, sentence parsing is a pretty straightforward business. There will still be uncertainties and complexities in particular sentences, but it is no longer surprising that the parser is robust enough to be able to cope with them (or at least with most of them). A richer format for grammars might have provided more flexibility for the resolution of
these specific problems, but it would also entail a much more intricate parsing machine.

To summarize: I have shown that the kind of functional explanation that one is led to for some aspect of linguistic competence depends very heavily on how one characterizes the components of the language faculty. This is a rather obvious point, but it is interesting to see how much difference a shift in the conception of grammars can make to the functional model. Where we once saw competition, we now see cooperation. The design of human languages may still represent a compromise between form and function, but it begins to look as if all the performance mechanisms (i.e. the mechanisms for learning, speaking, and understanding) may come down on the form side of the debate. They too, after all, involve the representation of rules and computations based on them; it is only the expressor that is unmoved by formal elegance.
Notes

1. The new answer is given in sections 4 and 5 below. The first three sections of the present paper largely duplicate previous work (particularly Fodor 1978, 1981). I include them here, with apologies for the redundancy, as background for the new model. The theory of syntax that this model incorporates is sufficiently unconventional that it seems wise to offer some indication of why one might be led to contemplate it.

2. For simplicity's sake I concede this point here, though it is not obvious that it is true. Both Bever (1970) and Chomsky and Lasnik (1977), in their different ways, subsume the ungrammaticality of (i) under the same constraint as (4). And at least for Chomsky and Lasnik, this constraint also excludes (ii), even though, for any reasonably systematic left-to-right parser, (ii) is far less misleading than (i) and (4) because it contains a left-disambiguation of the locally ambiguous word sequence.
   (i) *The man was here is my friend.
   (ii) *I met the man was here.

3. However, the lack of any direct parsing motivation for excluding (ii) would not necessarily invalidate the functional explanation for the constraint. The closer the fit between the parsing problem and the constraint, the stronger is the argument that the constraint is motivated by the parsing problem. But this is an epistemological point only. As I will stress in connection with other examples below, cases in which a superset, or only a subset, of the difficult-to-parse constructions is excluded can be accommodated by bringing the grammar into the negotiations, i.e. by recognizing that the parser can influence the grammar only insofar as the grammar agrees to be influenced. The grammar might well resist a constraint excluding (i) but not (ii), for this would be more complex than a constraint excluding both. As for other examples discussed in note 2 and below, we should consider the possibility of a grammatically simpler but more sweeping constraint, such as a prohibition on ALL reduced relative clauses, ambiguous or otherwise. In other cases it is typically the expressor that would resist such a wholesale reduction of the set of grammatical sentences, but this is not plausible here since everything expressible in a reduced relative can also be expressed by an unreduced relative. The only loss due to such a constraint would be in the economy of expression — roughly, the content conveyed per word or per node of the phrase marker. I am reluctant at this stage to throw together questions of effability and questions of economy (though one can discern some connections between them), so it may be necessary to introduce an 'economist' into the negotiation model to accommodate deletion phenomena.

4. At the meeting in Cascais, I offered an explanation for the nesting preference in which Mark Steedman spotted a serious flaw. I observed that the c-command constraint permits the gap for the higher filler to appear outside the gap domain for the lower filler, either before it (which would result in disjoint dependencies for filler-first examples), or after it (which would result in nested dependencies); but this situation can never give rise to intersecting dependencies. My suggestion was that this asymmetry is carried over to the case in which the gap for the higher filler appears WITHIN the gap domain for the lower filler (so that c-command does not disambiguate the filler-gap relations). The trouble with this idea is that it presupposes either some kind of nonformal analogizing, or that the c-command condition and the NDC are combined into a single formal constraint — which is almost certainly false. I believe the explanation I propose here does a better job of capturing the hunch that the NDC asymmetry stems from the c-command asymmetry. The vague analogy is given solid substance by the assumption (for which there is independent empirical support) that the parser is sensitive to distributional probabilities.

5. For example, a filler will be 'doubtful' if it is the antecedent for an optional deletion or has been moved by an optional structure-preserving rule.

6. In fact, the constraints at work in (19) and (20) are probably just limiting cases of relational constraints.

7. My casual use of 'the grammar' to cover both innate and acquired principles needs to be unpacked here, but I think it is still clear that competence considerations cannot plausibly be held to favor the inclusion of negative statements in grammars. For a universal negative statement, the claim that it is a part of innate linguistic competence leads to a less parsimonious theory of language evolution, other things being equal, than the claim that it is not. For a language-specific negative statement, the innate evaluation metric would assign a greater cost, other things being equal, to a grammar containing that statement than to a grammar not containing it. (I discuss below the claim that even language-specific negative statements must be innate.)

8. Even if some languages require an infinite number of derived rules (which would be handled by recursive metarules in Gazdar's theory), abbreviating conventions will be adequate as long as we allow infinite rule schemata (as in most treatments of conjunction).

9. This may not be inevitable in free-word-order languages. I have not so far attempted to apply the functional model to such languages.

10. So does the more standard theory of GPSGs with metarules.

11. Fodor (1983) also notes that global constraints like subjacency would be extremely costly in a GPSG, so that the empirical validation of this very restricted theory involves showing that such constraints are not in fact needed for observational or descriptive adequacy.

12. Gazdar's metarules can be regarded as part of the grammar for this metalanguage, rather than part of the grammar for the object language. That is, the vocabulary of the metalanguage includes slashed-category symbols, and its syntax permits certain sorts of sentences (i.e. rules for the object language) containing them.

13. Since this reference to uniformity may be reminiscent of Postal's (1972) celebration of homogeneity, I should point out two significant differences. First, I am not suggesting that there is any a priori reason for preferring a uniform grammar to a more modular one which has several components containing rules of quite different kinds; my argument is the empirical one that a uniform syntax (of this kind) demands a less elaborate parsing machine. Second, unlike Postal, I am not suggesting (and I do not believe) that syntactic structure and semantic interpretation are dealt with by a homogeneous set of rules; it is only uniformity WITHIN the syntactic component that I am advocating.
Department of Linguistics
University of Connecticut
Storrs, Conn. 06268
USA

References

Baker, C. L. (1979). Syntactic theory and the projection problem. Linguistic Inquiry 10, 533-581.
Bever, T. G. (1970). The cognitive basis for linguistic structures. In J. R. Hayes (ed.), Cognition and the Development of Language. New York: Wiley.
Chomsky, N., and Lasnik, H. (1977). Filters and control. Linguistic Inquiry 8, 425-504.
Fodor, J. D. (1978). Parsing strategies and constraints on transformations. Linguistic Inquiry 9, 427-473.
—(1981). Does performance shape competence? Philosophical Transactions of the Royal Society of London B 295, 285-295.
—(1983). Phrase structure parsing and the island constraints. Linguistics and Philosophy 6, 163-223.
—, and Crain, S. (n.d.). On the form of innate linguistic knowledge. Unpublished manuscript.
Frazier, L. (i.p.). Syntactic complexity. In D. Dowty, L. Karttunen, and A. Zwicky (eds.), Natural Language Processing: Psycholinguistic, Computational and Theoretic Perspectives. Cambridge: Cambridge University Press.
—, and Fodor, J. D. (1978). The sausage machine: a new two-stage parsing model. Cognition 6, 291-325.
Gazdar, G. (1981). Unbounded dependencies and coordinate structure. Linguistic Inquiry 12, 155-184.
Kaplan, R. (1973). A multi-processing approach to natural language. In Proceedings of the First National Computer Conference.
Pinker, S. (1981). A theory of the acquisition of lexical-interpretive grammars. In J. Bresnan (ed.), The Mental Representation of Grammatical Relations. Cambridge, Mass.: MIT Press.
Postal, P. M. (1972). The best theory. In S. Peters (ed.), Goals of Linguistic Theory. Englewood Cliffs, N.J.: Prentice Hall.
Ross, J. R. (1967). Constraints on variables in syntax. Unpublished Ph.D. dissertation, MIT, Cambridge, Mass.
Wexler, K., and Culicover, P. W. (1980). Formal Principles of Language Acquisition. Cambridge, Mass.: MIT Press.
Yngve, V. H. (1960). A model and an hypothesis for language structure. Proceedings of the American Philosophical Society 104, 444-466.
On the generality of the nested-dependency constraint and the reason for an exception in Dutch*

MARK STEEDMAN
1. Introduction
Several recent papers have argued that the phenomena which have been widely assumed to demand the inclusion of transformations in grammars of natural languages can in fact be handled within systems of lesser power, such as context-free grammar (cf. Brame 1976, 1978; Bresnan 1978; Gazdar 1981; Peters 1980). In a paper called 'On the order of words' (Ades and Steedman 1982; hereafter OOW) it was similarly argued that many of the constraints on unbounded movement1 that have been observed within the transformationalist approach could be explained within a theory consisting of a categorial grammar (Ajdukiewicz 1935; Bar-Hillel et al. 1960; Lyons 1968) augmented with a small number of simple rule schemata called (because of their resemblance to the operations of a parser) 'combination rules'. The present paper concerns the application of this theory of grammar to certain constructions allegedly involving 'crossed dependencies'. Such constructions are widely assumed to present a serious challenge for alternatives to the transformationalist approach.

The theory put forward in OOW took as its point of departure the observation concerning movement and other dependencies in natural languages that usually goes by the name of the 'nested-dependency constraint'. There is a strong tendency among the languages of the world to forbid constructions like (a), where the dependencies (indicated by subscripts) between elements of the sentence 'intersect' — that is, where one of a pair of dependent items intervenes between the members of another pair. Usually only their nonintersecting relatives are allowed, as in (b).2
(1) a. *Which sonatas₁ is this violin₂ easy to play₁ on₂?
    b. Which violin₁ are these sonatas₂ easy to play₂ on₁?
While the tendency to exclude crossed dependencies falls a long way short of being a universal constraint, it applies with striking frequency to
unbounded and bounded movements and dependencies in the languages of the world. That is to say that although there are many well-attested exceptions to the tendency, such as the Dutch infinitival complement constructions that are discussed below, or the clitic pronouns of the Romance languages, they clearly ARE exceptional. For example, while crossed dependencies are apparently allowed in 'inverted' relative clauses in French — cf. (a), below — the nested construction is allowed as well, as in (b).

(2) a. (la maison) [dans laquelle]₁ a₂ vécu₁ Brigitte₂
    b. (la maison) [dans laquelle]₁ Brigitte₂ a₂ vécu₁.
    '(the house) in which Brigitte has lived'
Similarly, while French clitic pronouns HAVE to cross dependencies in a construction like (a) below (at least, they do on a movement account of such dependencies, according to which the pronouns are restored to the canonical positions t₁ and t₂), the noncrossing construction (b) is required for all other nonpro phrases:

(3) a. Jean les₁ y₂ a mis t₁ t₂
       'Jean put them there'
    b. Jean a mis les couteaux dans le tiroir.
       'Jean put the knives in the drawer'
In short, while many languages, perhaps even MOST languages, allow a certain degree of crossed dependency, it appears to be very strictly circumscribed. There are no languages that entirely cross dependencies, nor even any that mainly cross while exceptionally allowing nesting. Such authors as Kuno and Robinson (1972), Woods (1973), Kaplan (1973), Fodor (1978), and Bach (1977, quoting Andrews) have all pointed out that such a 'constraint' (although not its exceptions) would be explained if the mechanism that were used to augment a context-free surface-structure processor in order to perform the 'movement' operations (for example in parsing sentences into a canonical form) were a 'last in, first out' store or 'push-down' stack, upon which displaced items could be placed as encountered and from which they could be retrieved when needed.

However, this account of the constraint is subject to two important objections. First, the basic context-free processor also requires a stack — but a mechanism with two stacks is equivalent to a universal Turing machine, and in itself places no constraint at all on the languages it will process. Thus it is slightly misleading to suggest that the constraint is 'explained' by the involvement of this mechanism. A further 'performance constraint', albeit a very simple and elegant one restricting the uses of the respective stacks, is required as well, just as a (somewhat more mysterious) constraint is required in the transformational account. Second, as Fodor (1978) pointed out, the CF parser could as easily be augmented with a 'FIRST in, first out' store and process the languages with
entirely crossing dependencies that are not in fact found. The proposal in OOW was simply that the SAME stack that was required for basic context-free processing could also be used to process discontinuous constituents in the way suggested by the earlier work. The effect of this proposal was to shift the responsibility for explaining the tendency toward nesting away from the domain of performance constraints and to put it in the domain of grammar. The suggestion was that extraction or unbounded movement was a phenomenon of grammars corresponding to single-stack automata, that is, of some natural generalization of context-free grammar.3 The proposal was therefore related to the base-generation hypothesis. It was distinguished from other such proposals by the inclusion of a novel syntactic operation called 'partial combination', of which more will be said below. It will be convenient to refer to such grammars as 'extended' categorial grammars.

The theory as presented in OOW laid considerable emphasis upon the initial assumption that the so-called nested-dependency constraint describes a significant tendency among natural languages. It is therefore important to show that the apparent exceptions either are illusory or can be accommodated within the proposed extension to categorial grammar. Pullum and Gazdar (1982) have recently demonstrated that many of the classic arguments for the existence of crossing dependencies in various languages (Postal 1964; Langendoen 1977; Chomsky 1963) are flawed, leaving open the question of whether the dependencies in question can be handled within a 'generalized' phrase-structure grammar of the kind proposed by Gazdar (1981). However, they also consider one case which is much less easy to dismiss. It has been noted by Evers (1975) and Huybregts (1976), among others, that certain constructions which apparently involve crossed dependencies between verbs and their arguments are permitted in Dutch. These constructions occur for example with verbs like zien [to see] and helpen [to help], which take infinitival complements, and are illustrated in the following subordinate clauses, adapted from Huybregts, in which subscripts indicate dependencies according to one widespread assumption about the corresponding deep structures.
(4) a. ... omdat ik₁ Cecilia₂ de nijlpaarden₂ zag₁ voeren₂
       ... because I Cecilia the hippos saw feed
       '... because I saw Cecilia feed the hippos'
    b. ... omdat ik₁ Cecilia₂ Henk₃ de nijlpaarden₃ zag₁ helpen₂ voeren₃
       ... because I Cecilia Henk the hippos saw help feed
       '... because I saw Cecilia help Henk feed the hippos'
The construction can embed, so that indefinitely many crossed dependencies can in principle be involved. Moreover, in most dialects the alternative, in which the verb group is in the reverse, German order (and thus makes the dependencies nest), is actually disallowed (Zaenen 1979). While Pullum and Gazdar show that alleged proofs on the basis of this construction that Dutch is not a context-free LANGUAGE are flawed, they are careful to point out first that they do not themselves offer a proof that it is a CF language, and second that such a proof would in any case not rule out the possibility that a strongly adequate GRAMMAR for Dutch would require more than context-free power. Recently, Bresnan et al. (1982) have advanced a proof that strong adequacy does indeed require more than a CF grammar.

The challenge that these constructions pose for any theory of grammar is twofold. First, can the grammars allowed by the theory capture them at all? Second, does the theory explain why such constructions should be exceptional (in the sense defined earlier)? Context-free grammar appears to be excluded on the first count. Standard theoretic TG and many of its descendants seem to be ruled out on the second.

The present paper shows that an extended categorial grammar restricted to combination rules of exactly the same type as were originally proposed for English will also account for some of the most important characteristics of the Dutch infinitival construction. The novel syntactic operation called 'partial combination', which is required in such grammars of English, German, and Dutch for quite independent reasons in order to explain extraction phenomena, plays a crucial role in composing the Dutch verbs into a verblike entity which DEMANDS its arguments in the 'crossed' order. The types of crossed dependencies that such grammars will allow appear to be strictly limited and are predicted to be relatively rare under certain independent assumptions that have commonly been adopted in accounts of language variation. While certain problems remain concerning so-called clitic pronouns and extractions, the ability to explain the comparative rarity of crossed dependencies appears to constitute an advantage of the present theory.

The argument will go as follows. First the original proposal will be briefly reviewed. A similarly brief outline of the principal facts of Dutch and German word order follows. The construction introduced in example (4) and some related constructions are then considered at some length, with particular attention being paid to the behavior of the verb group. The concluding sections briefly consider an unsolved problem to do with the other, nonverbal elements of the construction and the implications of the theory for the study of language universals.
2. An extended categorial grammar for a fragment of English
In proposing that movement and the constraints on movement are the manifestation of some more restricted form of grammar than TG, extended categorial grammar is closely allied to the base-generative hypothesis. At the same time it attempts to preserve quite directly the generalizations that have been described using certain unbounded movement transformations. It consists of two components, which can be thought of as performing roughly the same roles as the base and transformations of a standard transformational grammar. The first component is a categorial grammar which, like a standard-theory base grammar, is context-free and more or less directly related to the logical form or interpretation of the sentence. The second component consists of a number of rule schemata which, like an orthodox transformational component, define related orderings of constituents. The difference is that the 'base' is an ORDER-FREE categorial grammar, while the SOLE responsibility for defining order rests with the second component of the grammar, called 'combination rules' because of their direct relation to the operations of a parser.

The categorial base is defined as a lexicon, in which each entry includes a 'category' which defines the kind of constituent (if any) with which the word in question can combine and the kind of constituent that results. The category of a pronoun is simply NP. The category of a transitive verb is written VP/NP, identifying it as combining with an (object) NP to yield a VP. Similarly, the category of a ditransitive verb like give is (VP/NP)/NP — a thing that combines with an (indirect object) NP to yield a thing which still needs an (object) NP to yield a complete VP. In what follows it will be convenient to use a convention that (in the absence of brackets explicitly indicating the contrary) slashes 'associate to the left' and to write such categories as the above as VP/NP/NP, leaving the bracketing implicit.

Items having categories of the form X/Y, W/X/Y, and so on are to be thought of as functions over Y. For example, the category VP/NP of transitive verbs identifies them as functions from NPs into VPs, and the category VP/NP/NP of ditransitive verbs identifies them as functions from NPs into functions-from-NPs-into-VPs. In the first place, such functions can be thought of as mapping between entirely syntactic entities. However, the categories can also be thought of as a shorthand for the semantics of the entities in question, and the functions can then be thought of as mapping between semantic entities, such as interpretations, intensions, or whatever.4 The present paper will continue to use the syntactic shorthand, since for present purposes it makes no difference
whether semantic entities are thought of as expressions in lambda calculus, procedures, or anything else. The important assumption is parallel to the basic 'rule-to-rule' assumption prevalent in Montague grammar (cf. Bach 1980): there is a functional correspondence between syntactic categories and rules of semantic interpretation. (Naturally, any word, including the verbs mentioned above, is free to have more than one lexical entry. The question of WHICH of its categories will be involved in a given derivation is of course not relevant to a discussion of grammar.)

The fact that the lexical categories simply specify what combines with what, without regard to left-right order (as befits their basically semantic nature), and that the responsibility for defining constituent order rests entirely in the other component, the PS rule schemata, will prove crucial to the enterprise of capturing cross-linguistic generalizations, as other theorists committed to an unordered base (such as the relational grammarians) have pointed out (cf. Cole and Sadock 1977; Bartsch and Vennemann 1972; Peterson 1977; Sanders 1975; Saumjan 1971).

There were three combination rules in the original proposal. The first two rules allowed the simple combination of a function with an argument respectively to its right and to its left, the two trivial rules for applying a function to its argument that any unordered categorial grammar would require in order to accept the grammatical strings of an order-dependent language. The former was called 'forward combination', and is written as follows.
(5) Forward combination
    X$/Y Y ⇒ X$
In this and all of the rules that will follow, X, Y, Z, ... are to be read as variables which match any category. In the present paper (and in OOW) the categories in question are always atomic, such as NP, VP, S, and so on, but this restriction is not inherent in categorial grammar itself. The symbols $, $', $" are string symbols, which stand for sequences of alternating slashes and category symbols which begin with a slash, or for the empty string. Again, for all examples in this paper and its predecessors, the categories in question are atomic. Thus the expression X$/Y in the above formula will match the functor categories VP/NP, VP/PP/NP, VP/PP/PP/NP, and so on, with X = VP, Y = NP, and $ successively bound to the null string, /PP, /PP/PP, and so on. (The '$' convention may seem unnecessarily cumbersome, but it is necessary in order to express the rule called 'forward partial combination' below. In many cases $, $', etc., are in fact null, and the reader can just ignore them.) It is important to remember that the slash in a variable category such as
X$/Y is always the 'main' or rightmost slash. (This convention is simply a consequence of the left-associativity convention that is used here.)

The combination rules are written as a kind of backward rewrite rule schema but may also be thought of as 'reduction' rules for a 'shift and reduce' parser. Such parsers operate upon a single push-down stack by from time to time 'shifting', or adding a new symbol to the top of the stack, and from time to time 'reducing', or combining the top two items on the stack. (The nondeterministic phrase 'from time to time' points to the fact that such a processor will also need a mechanism for ambiguity resolution to tell it WHEN to do these operations. The nature of this mechanism is discussed further in Crain and Steedman 1982 but has no immediate relevance to questions of grammar.) It may be helpful to think of combination rules like the above as rules for such a parser, by imagining a stack upon which the words of the sentence are from time to time placed with their categories showing. (The stack must be thought of as lying on its side, and left-hand and right-hand sides of the rule must be thought of as representing the topmost items on the stack before and after the reduction.) The following verb phrases are accepted by two successive forward combinations. (Combination of two entities is indicated by an underline joining them, with the resulting category written underneath.)
(6) a.  Eat       the    cake.
        VP/NP     NP/N   N
                  -----------
                  NP
        ---------------------
        VP

    b.  Give       me     that.
        VP/NP/NP   NP     NP
        ---------------
        VP/NP
        -----------------------
        VP

Backward combination is similar, except that as far as the categories discussed in this paper go, in English (unlike German) it appears that nothing except a sentence-producing functor of the form S$/X can get an argument from the left.5 The rule is therefore written
(7) Backward combination
    Y X$/Y ⇒ X$
    where X = [+S, −N, +V]
The features −N and +V identify the X as a VERBAL category, as under the X-bar hypothesis of Chomsky (1970) and Jackendoff (1977). The
(nonstandard) feature +S further restricts X to a SENTENTIAL category. The convention in such rules (which differ slightly from the ones used in OOW) is that features borne by the functor category on the left-hand side of the rule are assumed to be borne by the right-hand-side entity, unless explicit indications are given of a new feature or a change in value. It is the backward-combination rule which for example allows a preposed adverbial here to be picked up by a sentence fragment he comes which by previous applications of the rules has been given the category S/ADV — that is, a sentence requiring a (directional) adverbial to complete it.
(8) Here    he comes.
    ADV     S/ADV
    -------------
    S
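Read as reduction rules, (5) and (7) are easy to put on a machine. Here is a toy shift-and-reduce recognizer (a sketch only, not anything from OOW: categories are treated as plain strings, the [+S, −N, +V] restriction on backward combination is approximated by checking that the functor begins with S, and reduction is applied greedily).

    def split_main(cat):
        """Split 'VP/PP/NP' at its main (rightmost) slash."""
        i = cat.rfind("/")
        return (cat[:i], cat[i + 1:]) if i >= 0 else (None, None)

    def reduce_pair(left, right):
        res, arg = split_main(left)
        if arg == right:                           # (5) forward: X$/Y Y => X$
            return res
        res, arg = split_main(right)
        if arg == left and right.startswith("S"):  # (7) backward, crudely
            return res
        return None

    def accepts(cats, goal):
        stack = []
        for cat in cats:                           # shift...
            stack.append(cat)
            while len(stack) > 1:                  # ...and reduce while possible
                combined = reduce_pair(stack[-2], stack[-1])
                if combined is None:
                    break
                stack[-2:] = [combined]
        return stack == [goal]

    print(accepts(["VP/NP", "NP/N", "N"], "VP"))      # (6a) Eat the cake
    print(accepts(["VP/NP/NP", "NP", "NP"], "VP"))    # (6b) Give me that
    print(accepts(["ADV", "S/ADV"], "S"))             # (8) Here he comes

All three calls print True; the stack behaves exactly as the 'lying on its side' image above describes.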
The third rule is a little less obvious. If all preposings are to be handled using backward combination, a problem arises in the following topicalized sentence:
(9) These cakes   I will   eat.
    NP            S/VP     VP/NP
We may assume (and it is at this point that the treatment of tensed verbs becomes crucial to the account) that the subject I and the tensed verb will can somehow be combined to yield an entity of category S/VP, that needs a VP to make a sentence. If eat could combine with the NP these cakes then it would make a suitable VP — but the object NP is isolated on the wrong side of [I will]S/VP. Since sentences like these cakes she says I might have been eating potentially introduce unlimited numbers of intervening items, it seems inevitable that a base-generative theory must treat such incomplete fragments as I will eat, I might have been eating, and I might have believed John thought Mary would eat as entities in the grammar itself. More precisely, they must be entities bearing the category S/NP, that is, functions from NPs into Ss. There are at least two ways to do this. The alternative chosen by Gazdar (1981) is to introduce more PS rules into a base-generative grammar for English, so that it will allow the generation of such entities. For example, in order to generate the entity [I will eat]S/NP,6 he includes in the base (via an extremely ingenious use of rule schemata) the following rules in addition to the normal ones (the notation is slightly adapted).
(10) a. S/NP → NP VP/NP
     b. VP/NP → V NP/NP
     c. NP/NP → trace
However, there is an alternative. The framework of categorial grammar ALREADY provides 'hole' categories. For example, the category VP/NP of a transitive verb like eat not only does the same job as the 'normal' rewrite rule VP → V NP. It also embodies the other rule (b), defining the verb eat as an INCOMPLETE VP lacking an NP. It follows that the same job of providing grammatical status for fragments like [I will eat]S/NP can economically be done by introducing a new kind of combination rule that will 'partially' combine the fragment [I will]S/VP with the INCOMPLETE verb phrase [eat]VP/NP in the following way.7

The semantics of simple combination was said above to be simply the application of a function to an argument. However, there is another operation that can be performed on functions. A function can be COMPOSED with a second function, provided that the second delivers a result that constitutes an appropriate argument for the first. It is therefore possible to define a syntactic operation of PARTIAL combination, whose semantics is simply the composition of two functions. For example, if the function over (semantic representations of) VPs that corresponds to I will is known, and the function from (semantic representations of) NPs into (semantic representations of) VPs eat is also known, then everything is known that is necessary to specify their composition, a function over (semantic representations of) NPs corresponding to I will eat. This composition can be applied without any further interpretation to the remainder of the sentence, or even to referents in the focus of discourse. (It is argued in OOW, and by Crain and Steedman 1982, that the ability to evaluate interpretations of incomplete phrases and select the one[s] that are consistent with the referential context is the central mechanism for local ambiguity resolution in the psychological sentence processor.) The syntactic operation corresponding to the particular instance of function composition needed for the above example can be written
(11) Forward partial combination
     X$/Y Y$'/Z ⇒ X$$'/Z
The symbols $ and $' are string variables of the kind defined earlier and are necessary to correctly compose or partially combine the more complex functor categories like ditransitive verbs, VP/NP/NP.8 As noted before, $ and $' are often null and can be ignored for the moment. (However, their involvement in the forward-partial rule is crucial to the later analysis of Dutch infinitivals.) The full derivation of (9) can be written as follows (underlines are indexed according to the rule that applied).
(12) a. These cakes   I will   eat
        NP            S/VP     VP/NP
                      ---------------- FP
                      S/NP
        ------------------------------ B
        S

It is the forward-partial rule that bears the sole responsibility in English for building a 'bridge' (Shir 1977) between an 'extracted' item and the site of its 'deletion'. On the assumption that complement-taking verbs such as believe bear the category VP/S (a function from Ss into VPs), and that as before a subject I and a tensed modal will can somehow combine to give an entity I will of category S/VP, then 'unbounded' extractions from complements can be accommodated in exactly the same way, by successive partial combinations.9
(12) b. These cakes   I can   believe   I will   eat
        NP            S/VP    VP/S      S/VP     VP/NP
                      -------------- FP
                      S/S
                      ----------------------- FP
                      S/VP
                      -------------------------------- FP
                      S/NP
        ---------------------------------------------- B
        S

If the category denoted by Y in the forward-partial rule (11) is restricted to categories which are −N under the X-bar hypothesis of Chomsky (1970) and Jackendoff (1977), the rule can be made to capture a number of the constraints upon such unbounded extractions that have been observed within the transformational framework. The restricted version is written as follows:

(13) Forward partial combination
     X$/Y Y$'/Z ⇒ X$$'/Z
     where Y = [−N]
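Computationally, forward partial combination is a one-line operation on these category strings. Extending the toy recognizer above (again a sketch; the −N restriction on Y is approximated here simply by excluding NP and AP):

    def forward_partial(left, right):
        """(11)/(13): X$/Y  Y$'/Z  =>  X$$'/Z, for atomic Y."""
        i, j = left.rfind("/"), right.find("/")
        if i < 0 or j < 0:
            return None
        y = left[i + 1:]                          # the Y of X$/Y
        if right[:j] != y or y in ("NP", "AP"):   # Y must match and be -N
            return None
        return left[:i] + right[j:]               # X$ followed by $'/Z

    # The chain of compositions in derivation (12b):
    state = "S/VP"                                # I can
    for cat in ("VP/S", "S/VP", "VP/NP"):         # believe, I will, eat
        state = forward_partial(state, cat)
        print(state)                              # S/S, S/VP, S/NP

Trying to compose into a +N argument returns None here, which is the noun-phrase constraint of the next paragraph in miniature.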
The noun-phrase constraint of Horn (1974) and Bach and Horn (1976) was captured in OOW using this entirely local rule. For if it is assumed (as in all versions of the X-bar hypothesis) that NPs are a +N category, then the above version of the rule of forward partial combination will not allow a 'bridge' to be built across an NP (or an AP) boundary, since it is only allowed to apply to verbal and prepositional categories, which are −N under the hypothesis, and may therefore 'strand' in English.10
The explanation of the above rule in terms of the operation of semantic function composition implies that it is drawn from a class of exactly two such rules. The other one would by analogy be called 'backward partial combination' and would be written

(14) Backward partial combination
     Y$'/Z  X$/Y  =>  X$$'/Z
At first glance this rule does not appear to be implicated in the grammar of English, but since it is within the domain of the machine that the grammar describes, we may expect to see it turn up sooner or later.11 The point that has been glossed over in the above account is the way in which the subject and the tensed verb became combined into a single entity [I will]S/VP, and in particular the way in which inversion of subject and tensed auxiliaries is to be accommodated. This is a question of considerable complexity, which is treated at length elsewhere. For the present paper, it will be assumed that subjects are functions from finite verb phrases into Ss, bearing the category S/FVP, and that tensed verbs are functions from the arguments of the verb into finite verb phrases, a category that can be represented as FVP$. While this is an unorthodox assumption, it has a semantically well-motivated precedent in Montague (1973). And while it is only part of the full account, in all of the derivations in the present paper it is consistent with that account. It is further assumed that tensed verbs bear the category FVP$ in the lexicon. It is also assumed that in a language like English that does not usually mark surface case, subject NPs must acquire the category S/FVP by a 'subject rule' that (nondeterministically) replaces NPs, as follows:12
(15) The subject rule
     NP => S/FVP
Under this analysis, the full derivation of a simple object-topicalized main clause is as follows (in subsequent examples the operation of the subject rule, indicated here with the annotation 'S', will not be explicitly indicated):
(16)  These cakes   Alf     will     eat
      NP            NP      FVP/VP   VP/NP
                    ----- S
                    S/FVP
                    --------------- FP
                    S/VP
                    ------------------------ FP
                    S/NP
      ---------------------------------- B
      S
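The derivation in (16) can be replayed with the same toy encoding used above (again only an illustrative sketch, not the paper's own machinery; the subject rule is modelled as a unary category change):

```python
def forward_partial(a, b):                 # rule (11)
    if len(a) > 1 and len(b) > 1 and a[-1] == b[0]:
        return a[:-1] + b[1:]

def backward(arg, f):                      # backward simple combination
    if len(arg) == 1 and len(f) > 1 and f[-1] == arg[0]:
        return f[:-1]

def subject_rule(np):                      # rule (15): NP => S/FVP
    return ('S', 'FVP') if np == ('NP',) else None

alf = subject_rule(('NP',))                # Alf:  S/FVP
s_vp = forward_partial(alf, ('FVP', 'VP')) # will: S/VP
s_np = forward_partial(s_vp, ('VP', 'NP')) # eat:  S/NP
print(backward(('NP',), s_np))             # These cakes: ('S',)
```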
Grammars of the above type embody the stringent and significant restriction that no rule may relate categories that are other than adjacent. With the important addition which follows, such a grammar can be made to fulfil the primary requirement for a grammar of English, that is to accept or generate all and only the sentences of a reasonable fragment of English, and to capture significant generalizations about their relations. As it stands, the above grammar accepts all but NOT only the sentences of the relevant fragment of English. Its most important overgeneralizations (ignoring those related to the analysis of subject and tensed verb) are of the following kinds. First, the grammar described so far does not include any distinction between WH-constituents and their non-WH counterparts. Thus while it correctly allows (a) below (given the omitted account of auxiliary inversion), it also allows (b).

(17) a. Which cake will you eat?
     b. *This cake will you eat.
(It is interesting for the present purpose that the overgeneralization is to a construction that is allowed in Dutch and German.) A second, more serious, class of overgeneralizations arises from the fact that the grammar as yet contains no limit on the number of preposed items that may precede the tensed verb in the sentence. Thus not only (a) but (b) and (c) are allowed:

(18) a. The pink one I gave him
     b. *The pink one him I gave
     c. **The pink one him Frank I think gave
It was argued in OOW that the combination rules could further be restricted to exclude these and certain other overgeneralizations (such as violations of 'DO-support' arising from the treatment of subject inversion in that paper) by the use of features such as +/-AUX, +/-WH, and +/-PREPOSED as conditions on their application and markers on their results. As in the case of subject and tensed verb, the account sketched in OOW has been extensively revised elsewhere and will not be considered in any further detail in the present paper, beyond noting that the use of such features is allowed within the theory, and that many of the overgeneralizations that they limit are in the direction of constructions that are found in Dutch and German.
3. The grammar of Dutch and German
There are two ways in which the theory originally contrived for English might be made to apply to German and Dutch. It might be that those
languages have quite different lexical categories. On the other hand, it might be the case that they have the same categories but different feature restrictions upon combination rules for ordering them, where the combination rules are restricted to simple and partial combination in the forward and backward directions. The latter assumption, which is the equivalent within the present framework of assuming a common (unordered) base grammar, is by far the more attractive possibility. Inspection of the basic facts of German and Dutch word order suggests that it is largely correct.
3.1. German word order
The following sentences illustrate in English transliteration the principal surface orders of a clause in German containing a subject, a tensed auxiliary, a transitive verb, and its objects, Sie muss Äpfel essen.
(19) a. She must apples eat ('declarative')
     b. Must she apples eat? ('subj inversion')
     c. Apples/what must she eat ('topicalization' and 'object question')
     d. ...(that) she/(a woman) who apples eat must ('that-complement' and 'subj relative')
     e. (apples) which she eat must ('obj relative')
There are a number of commonplace generalizations that help in keeping track of these constructions. First, in subordinate clauses the tensed verb comes last in the clause. Second, in subordinate clauses at least, the verbs appear in an order which is the mirror image of the corresponding English construction, and with noun phrases and other nonverbal complements to their left, as in

(20) Dass er Äpfel gegessen haben muss
     that he [apples eaten have must]
     'that he [must have eaten apples]'
(These examples and generalizations are oversimplified and will be qualified later. For example, complement-taking verbs like denken 'to think' find their complement to the right, like their English counterparts. There is also a strong tendency to allow the Dutch pattern (discussed below) in the verb complex, particularly for spoken sentences including more than two verbs — cf. Evers 1975: 51, quoting Bech 1955: 63.) Third, in main clauses, while this mirror-image construction is broadly
maintained, there is an overriding rule that the tensed verb, whether a main verb or an auxiliary, must come either sentence-initially, as in the yes-no question (19b), or immediately to the right of the leftmost constituent of the sentence, as in (19a, c). This fact has led to the introduction of 'verb-second' or 'V/2' rules in transformational treatments of Germanic languages. Its consequence is that the English-type topicalization construction, in which the tensed verb has TWO constituents to its left, is not allowed in German: topicalization must always be accompanied by subject inversion as in (19c). Despite the puzzling nonadjacency of tense and subject in subordinate constructions, the observation that the German verb order tends toward the 'mirror image' of the English one seems to suggest that the German verbal categories are almost identical but combine by backward, rather than forward, combination.13 On the assumption that MOST categories in German are the same as in English, the following conclusions follow concerning the other component of the grammar, the combination rules. First, since prepositions (PP/NP) and articles (NP/N), among other categories, always combine to their right, and since verbs basically combine to their left, it follows (not surprisingly) that German must possess the two basic rules of forward and backward combination, albeit with slightly different restrictions. Moreover, the existence of a preposing construction (19c) in German shows that it must also include the rule which in the grammar of English sketched in section 2 was called forward partial combination, as in the following derivation:
(21)  What   must she   eat?
      NP     S/VP       VP/NP
             ---------------- FP
             S/NP
      ----------------------- B
      S

But in order to account fully for German word order, we will need one more combination rule.14 As was remarked earlier, there is only one more possibility among the class of combination rules that the model allows. It is backward partial combination.15 The crucial example of a slightly more complex object question construction reveals that an extended categorial grammar of German must include this rule. (The category 'Cen' in the following example is an -en complement, or past-participial phrase.)
(22)  What   must she   eaten    have?
      NP     S/VP       Cen/NP   VP/Cen
                        ---------------- BP
                        VP/NP
             --------------------------- FP
             S/NP
      -------------------------------- B
      S

The two words eaten have must combine into a single entity of category VP/NP if they are to combine with the item [must she]S/VP to make a single entity [must she eaten have]S/NP, which needs an object and may therefore combine with the preposed WH-item. This can be brought about by introducing the rule of backward partial combination. It begins to look as though the German constructions can indeed be handled using an almost identical categorial lexicon and the same small class of combination rules and 'partial' combination rules, albeit with slightly different restrictions on the particular categories to which they may apply. The suspicion is reinforced by consideration of the related facts for Dutch.
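In the toy encoding used earlier (an illustrative assumption of this presentation, not part of the paper), the backward-partial step of (22) is simply the mirror image of forward composition:

```python
def backward_partial(b, a):
    """Backward partial combination (14): Y$'/Z  X$/Y  =>  X$$'/Z."""
    if len(a) > 1 and len(b) > 1 and a[-1] == b[0]:
        return a[:-1] + b[1:]
    return None

eaten = ('Cen', 'NP')   # past participle: an -en complement lacking an NP
have = ('VP', 'Cen')    # perfect auxiliary
print(backward_partial(eaten, have))   # ('VP', 'NP'): [eaten have]VP/NP
```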
3.2. Dutch word order
The grammar of Dutch turns out, perhaps not surprisingly, to be entirely definable in terms of the rules already introduced for English and German. The principal clause constructions of Dutch are similar to those of German (cf. (19) above), but they differ in one respect. While noun phrases and other nonverbal complements of verbs precede them, as in German, the verbs themselves are predominantly in the English order. The following examples typify the alternatives, again in English transliteration, for a simple clause Zij moet appels eten.
(23) a. She must apples eat ('declarative')
     b. Must she apples eat ('subject inversion')
     c. Apples/what must she eat ('topicalization' and 'obj. question')
     d. ...(that) she/(a woman) who apples must eat ('that comp' and 'subj. relative')
     e. (the apples) which she must eat ('obj. relative')
As in the corresponding generalizations for German verb order, Dutch is not really quite as straightforward as this. For example, the past participle
in some dialects precedes the perfect auxiliary, as in zij moet appels gegeten hebben, and again complement-taking verbs combine to the right. (These largely irrelevant complications will be briefly considered later.) In some dialects, moreover, certain German verb orders such as those illustrated in (19d) and (e) are also allowed — cf. Evers (1975: 51). Apart from the order of the verbs within the verb group, the generalizations concerning these constructions are exactly as for German. In subordinate clauses such as (23d, e), the tensed verb group is (although right-branching, on the English pattern) predominantly clause-final. The complement(s) of the verb group are in general to its left, but there is the same overriding rule that the tensed verb must in a main clause be either sentence-initial or in second position, a phenomenon which has led several authors within the transformationalist framework to postulate the same 'verb-second' rule for Dutch as for German (cf. Koster 1975). It follows that Dutch must have exactly the same rules of forward and backward simple combination as German, with exactly the same restrictions to ensure the acceptance of the above constructions and the exclusion of the English equivalents where necessary. However, the order of the verbal elements makes it clear that the PARTIAL-combination rules of Dutch are on the English, rather than the German, pattern. That is, backward partial combination is more restricted in Dutch than in German, while forward partial combination is less so. Some dialects appear to allow very little backward partial combination at all, almost like English.
3.3. An aside on the corresponding fragment of Dutch and German
Given the remarks made earlier concerning the numerous exceptions to the generalizations concerning word order in the German and Dutch verb phrase, it is not surprising that many more variations over and above the most basic constructions listed in (19) and (23) are possible for these languages than are possible for English. The possibilities for German are summarized in Figure 1, in which the possible permutations of a subject, a tensed modal, a verb, and its complement are exhaustively listed in English transliteration, with an indication of their grammaticality. (The data are collated from many sources, including Peterson 1977 and von Stechow 1979. Much the same variety is found in Dutch.)

a. She must believe [that Hans apples eats]
b. She must apples eat
c. (that) she believe must [that Hans apples eats]
d. *She eat apples must
e. (that) she apples must [[to eat][be able]]
f. (that) she apples eat must
g. Must she believe [that Hans apples eats]
h. Must she apples eat
i. *Must eat she apples
j. *Must eat apples she
k. *Must apples she eat
l. *Must apples eat she
m. *Eat she must apples
n. *Eat she apples must
o. Eat must she apples16
p. *Eat must apples she
q. *Eat apples she must
r. *Eat apples must she
s. (apples) which she must [[to eat][be able]]
t. (apples) which she eat must
u. Apples must she eat
v. *Apples must eat she
w. *Apples eat she must
x. Apples eat must she

Figure 1. The permutations of a clause containing subject, auxiliary, verb, and its complement in German

The four rules of forward and backward simple and partial combination, without any constraints at all, will allow all but two of the above arrangements to combine in the correct semantic relations. The reason is that the unordered categorial base defines 'mobiles' rather than fixed tree structures. If the two simple combination rules are unconstrained, then they will allow all of the set of trees that is obtained by rotating pairs of daughters around each other below their parent nodes. In addition, the rules of forward and backward partial combination will, if similarly unconstrained, allow the 'extraction' to the right or to the left of any subtree. The set of permutations that can be accepted is a large proportion of all those possible for this example. However, the fact that no combination rule refers to other than immediately adjacent entities means that every function X$/Y must be adjacent either to its argument Y, or to some function that can be composed with it. It then follows that the string k in Figure 1 and its mirror image n are not possible strings for a language based on the categories that are assumed here. The remaining strings d, i, j, l, m, p, q, r, v, and w, which are ungrammatical in Dutch and German,17 must be excluded with feature-based restrictions on the four rules, of the kind that were introduced at the end of section 2 in order to limit preposing in the grammar of English. It appears to be relatively simple to impose the required restrictions in this way, and they will be ignored for the
present.18 For present purposes, it is only important to note that all of these most basic constructions can be handled by combination rules of the same restricted and semantically coherent type, operating upon lexicons for the three languages that are almost identical, before turning to the much more problematic case of Dutch infinitivals and their apparent threat to the assumptions with which the account began.
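The claim that the four unconstrained rules accept all of the arrangements in Figure 1 except k and n can be checked mechanically. The sketch below is illustrative only: the lexicon, the CKY-style chart, and the treatment of the subject rule as lexical ambiguity are all simplifying assumptions of this presentation. It enumerates the permutations of she, must, eat, and apples and reports the underivable ones.

```python
from itertools import permutations

# Toy lexicon; 'she' may also become S/FVP by the subject rule (15).
LEX = {'she': [('NP',), ('S', 'FVP')], 'must': [('FVP', 'VP')],
       'eat': [('VP', 'NP')], 'apples': [('NP',)]}

def combine(a, b):
    """All four unconstrained rules over adjacent categories a b."""
    out = []
    if len(a) > 1 and len(b) == 1 and a[-1] == b[0]:
        out.append(a[:-1])                 # F:  X$/Y  Y     => X$
    if len(a) == 1 and len(b) > 1 and b[-1] == a[0]:
        out.append(b[:-1])                 # B:  Y     X$/Y  => X$
    if len(a) > 1 and len(b) > 1 and a[-1] == b[0]:
        out.append(a[:-1] + b[1:])         # FP: X$/Y  Y$'/Z => X$$'/Z
    if len(a) > 1 and len(b) > 1 and b[-1] == a[0]:
        out.append(b[:-1] + a[1:])         # BP: Y$'/Z X$/Y  => X$$'/Z
    return out

def derivable(words, goal=('S',)):
    chart = {(i, i + 1): set(LEX[w]) for i, w in enumerate(words)}
    n = len(words)
    for width in range(2, n + 1):
        for i in range(n - width + 1):
            cell = set()
            for k in range(i + 1, i + width):
                for a in chart[i, k]:
                    for b in chart[k, i + width]:
                        cell.update(combine(a, b))
            chart[i, i + width] = cell
    return goal in chart[0, n]

for p in permutations(('she', 'must', 'eat', 'apples')):
    if not derivable(p):
        print(' '.join(p))   # expected: only 'must apples she eat' (k)
                             # and 'eat she apples must' (n)
```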
4. Crossed dependencies in certain Dutch infinitivals
As was noted at the start of the paper, certain constructions which undoubtedly involve crossed dependencies between verbs and their complements are permitted in Dutch (Seuren 1973; Evers 1975; Huybregts 1976; Zaenen 1979; de Haan 1979; Bresnan et al. 1982). Verbs like zien [to see] and helpen [to help] give rise to main clauses on the following pattern:
(24) Ik zag Cecilia de nijlpaarden voeren
     I saw Cecilia the hippos feed
     'I saw Cecilia feed the hippos'
In subordinate clauses the following constructions are found, where subscripts indicate dependencies as usual. They are, as noted before,19 CROSSED dependencies, unlike those in the corresponding English and German sentences:
(25) a. ...omdat ik1 Cecilia2 de nijlpaarden2 zag1 voeren2
        ...because I Cecilia the hippos saw feed
        '...because I1 saw1 Cecilia2 feed2 the hippos2'
     b. ...omdat ik1 Cecilia2 Henk3 de nijlpaarden3 zag1 helpen2 voeren3
        ...because I Cecilia Henk the hippos saw help feed
        '...because I1 saw1 Cecilia2 help2 Henk3 feed3 the hippos3'
Because the construction can embed, indefinitely many crossed dependencies may be included.20 It is important to note that in most dialects the alternative mentioned in connection with example (24), in which the verb group is in the German order, is actually disallowed (Zaenen 1979: n. 3), and in all dialects it appears to be uncommon. That this option can be excluded or dispreferred is remarkable, for it would restore to the Dutch examples the nested dependencies exhibited in the corresponding German constructions between the verbs and their complements, thus:
(26) a. ...? omdat ik1 Cecilia2 de nijlpaarden2 voeren2 zag1
     b. ...* omdat ik1 Cecilia2 Henk3 de nijlpaarden3 voeren3 helpen2 zag1
In no dialect are sentences allowed which have the English (right-branching) verb group but the reverse order of the noun phrases.
(27) c. ...* omdat ik de nijlpaarden Cecilia zag voeren.
     d. ...* omdat ik de nijlpaarden Henk Cecilia zag helpen voeren.
The importance of these data must be obvious. They go directly against the generalization that crossed dependencies are the exception rather than the rule. They also appear to present a direct challenge to the present model, which has the nesting tendency very close to its heart.

The construction is lexically governed. The verbs involved are all verbs of perception and causation, plus a few that probably also belong under the causation heading, such as helpen 'to help' and leren 'to teach'. Second, the rather similar Raising and Equi verbs, such as besluiten [to decide], schijnen [to seem], and toelaten [to allow], which take the other Dutch infinitive with the particle te (cf. English to), behave similarly in that they allow crossing, but differently in the other alternatives that they permit (Zaenen 1979; Seuren 1973). We shall ignore these complications for the moment, returning to them briefly below. Third, these sentences conform to the general pattern of the German/Dutch subordinate clause: the entire verb group is clause-final, as in German, although the order of the component verbs is as in English.

There has been a certain amount of controversy concerning the surface structure of these sentences. All authors (Seuren 1973; Evers 1975; de Haan 1979; Zaenen 1979; Bresnan et al. 1982) agree that the verb group (zag...voeren in the above examples) constitutes a surface constituent of type V. There is less agreement about how the various noun phrases relate to this constituent and to each other. For present purposes we shall assume that the surface structure of (25a) is something like the following:21
(28) omdat ik1 Cecilia2 de nijlpaarden2 [V zag1 [V voeren2]]
Many working from a transformational standpoint agree that the deep structure underlying the above is

(29) [S NP1 [S NP2 X2 V2] V1]
— although there is considerable disagreement as to how the deep structure is mapped into surface structure. Within the present theory, the entity corresponding to base grammar is the categorial lexicon. One set of categories which would express such deep structures as the above would make all infinitival verbs functions from the verb's complement into functions-from-NPs-into-infinitival-S, of the form Sinf/NP$.22 The categories of the infinitives voeren and helpen are therefore respectively Sinf/NP/NP and Sinf/NP/Sinf. The category of the tensed verb zag, on the assumption that tensed verbs and subjects have the same categories as they do in English, is FVP/Sinf.23 Remarkably, the surface structure (28), in which the verbs form a constituent in isolation from their NPs, can be accepted using exactly these categories, with no further modification to the theory whatsoever. The reason is that the grammar already includes the partial-combination rules, whose express function is precisely that of glueing verbs together in advance of their combination with any other arguments. Furthermore, if the verbs are allowed to combine by the forward partial-combination rule (11) (whose definition, it will be recalled, follows from quite unrelated considerations to do with extraction phenomena in English and German, as well as in Dutch), then they will NECESSARILY produce as their composition a function which, IF IT IS TO FIND ITS ARGUMENTS TO THE LEFT, as in Dutch it usually must, demands those arguments in the CROSSED, rather than the nested order. The analysis of the main clause, and of sentences (25a) and (25b), is then as follows:24
(30) a.  Ik zag    Cecilia   de nijlpaarden   voeren
         S/Sinf    NP        NP               Sinf/NP/NP
                             --------------------------- B
                             Sinf/NP
                   --------------------- B
                   Sinf
         ------------------ F
         S

     b.  omdat ik   Cecilia   de nijlpaarden   zag        voeren
         S'/FVP     NP        NP               FVP/Sinf   Sinf/NP/NP
                                               --------------------- FP
                                               FVP/NP/NP
                              ------------------------------ B
                              FVP/NP
                    -------------------- B
                    FVP
         ----------------- F
         S'

     c.  omdat ik   Cecilia   Henk   de nijlpaarden   zag        helpen         voeren
         S'/FVP     NP        NP     NP               FVP/Sinf   Sinf/NP/Sinf   Sinf/NP/NP
                                                      ------------------------ FP
                                                      FVP/NP/Sinf
                                                      --------------------------------- FP
                                                      FVP/NP/NP/NP
                                     --------------------------------- B
                                     FVP/NP/NP
                              --------------------- B
                              FVP/NP
                    --------------- B
                    FVP
         -------------- F
         S'
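The crucial composition step of (30) can again be mimicked in the toy encoding used earlier (a sketch under the same illustrative assumptions as before): composing zag with voeren yields a function that must pick up its NP arguments to the left, innermost first, and that is precisely the crossed order.

```python
def forward_partial(a, b):                     # rule (11)
    if len(a) > 1 and len(b) > 1 and a[-1] == b[0]:
        return a[:-1] + b[1:]

def backward(arg, f):                          # backward simple combination
    if len(arg) == 1 and len(f) > 1 and f[-1] == arg[0]:
        return f[:-1]

zag = ('FVP', 'Sinf')           # tensed verb: FVP/Sinf
voeren = ('Sinf', 'NP', 'NP')   # infinitive:  Sinf/NP/NP

verb_group = forward_partial(zag, voeren)      # ('FVP', 'NP', 'NP')
# Leftward, the composed group meets voeren's object (de nijlpaarden)
# first and Cecilia second, while zag's dependency on ik reaches
# furthest left: the dependencies cross rather than nest.
after_object = backward(('NP',), verb_group)   # ('FVP', 'NP')
print(backward(('NP',), after_object))         # ('FVP',)
```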
In the absence of more specific restrictions on backward partial combination and the types of verbs that may be combined under the various rules, sentences like (26), with the verbs in the German order and the dependencies nested, will also be allowed. (However, even a grammar of this degree of freedom will still exclude the sentences [27], with the order of the noun phrases reversed.) And of course, in common with sentences involving other verbs taking NP complements, a restriction on the application of the forward simple combination rule is required if those noun phrases are to be prevented from occurring on the right. As for the simpler sentences of section 3.3., Figure 1, the rules as formulated here without more specific feature-based restrictions allow overgeneration of subordinate clauses with main clause order, such as

(31) *...omdat ik zag Cecilia de nijlpaarden voeren.

— together with certain other unacceptable word orders, such as

(32) *...omdat ik Cecilia zag de nijlpaarden voeren.
However, as in the earlier cases, similar word orders ARE required for certain other verbs in subordinate construction. The 'subject equi' verbs, like trachten [to try], allow analogous constructions (Seuren 1973; Zaenen 1979). These verbs take the te infinitive. The tensed verb may come second in a subordinate clause:

(33) a. ...omdat ik trachtte de kikker te eten
        ...because I tried the frog to eat
     b. ...omdat ik probeer Jan het lied te leren zingen
        ...because I try John the song to teach sing
Certain further orderings are also allowed (Seuren 1973), as follows:

(33) c. ...omdat ik probeer Jan te leren het lied te zingen
     d. ...omdat ik Jan probeer te leren het lied te zingen
A further pattern (which it will be hard to exclude under the present scheme) is not grammatical, according to Seuren:25

(33) e. ...*omdat ik Jan probeer het lied te leren zingen
The elimination of such overgeneralizations as (32) and (33e) must, again, await a more complete set of features, distinguishing the particular kinds of verbs and further restricting the operation of the combination rules.
5. An unsolved subproblem
The apparent anomaly of the dependencies in Dutch infinitival complements is, according to the above argument, an inevitable concomitant of a language and a grammar which has the basic apparatus for partial combination required by English and German but which does not, as they do, adopt a consistent rightward or leftward pattern for verbs and their arguments. However, a number of problems remain which arise when those arguments are pronouns of one kind or another. Clitic pronouns may disrupt the order of the noun phrases (Evers 1975; Zaenen 1979), as in

(34) dat ik het1 Cecilia2 [zag voeren]FVP/2/1

Extraction of a relative pronoun may achieve the same effect of reordering the argument noun phrases.

(35) de nijlpaarden die1 [ik]S/FVP Cecilia2 [zag voeren]FVP/2/1
While the present grammar allows relativization and pronominalization of the first two NPs in the sequence, these last two types of string are NOT accepted, because the verb complex is separated by intervening material from the thing that it requires as its first argument. The problem appears to lie in the account of the Dutch/German preverbal noun-phrase sequence, rather than in the account of the Dutch infinitivals themselves. The present treatment fails to account for several other cases of extraction from German and Dutch subordinate clauses, including multiple preverbal NPs — for example, simple relative clauses including ditransitive verbs:

(36) a. de appels die ik Cecilia gaf
     b. het meisje dat ik appels gaf
(Whatever the order in which the category of the verb requires the direct and indirect object, one of these sentences will block.) The problem is one of considerable extent and goes well beyond the present limited concern with infinitivals. It is probably the SAME problem as arises with the Romance clitics (cf. section 1), which also appear to be arguments of the verb but appear to its left rather than its right, and in the crossed order, as in example (3). I conjecture that the solution to these problems involves a combination of the preverbal NPs and other arguments into a constituent-like entity, similar to the combination of the elements of the verb group, a suggestion that may prove compatible with the proposals of Bresnan et al. (1982) concerning the surface structure of the noun-phrase sequence in the Dutch infinitival complements, and with the related proposal of Kirkpatrick (1981). However, this is a complex problem, and the present paper will say nothing further about it.
6. Implications for universal grammar
To the extent that rules similar to the above can be made to generate an 'x-x' (copy) language, the grammar that is presented here cannot be merely context-free. It remains to be seen how MUCH greater than context-free power is implicated by the proposed extension of categorial grammar. (Since the only rules which are not obviously CF are the partial-combination rules, the power involved in the theory depends crucially on them.) However, the extent to which the class of languages that it defines approaches the class of human languages will remain a more important consideration, and the tantalizing minority of constructions apparently involving overlapping dependencies in those languages26 promises to be an important source of information on this score.
The following generalization about the theory's capability to deal with crossed dependencies seems interesting in this connection. Suppose that there are two functions A and B, which are of category X/a1/.../an/Y and Y/b1/.../bm, respectively, and which are therefore allowed to partially combine by one or the other of the two partial-combination rules. We will notate the resulting composed function as [A·B]. By the usual rule of partial combination, its category is X/a1/.../an/b1/.../bm. Since none of the combination rules combines other than adjacent pairs of entities, it follows that the order in which the 'b' arguments must combine with the composed function [A·B], and the order in which the 'a' arguments must combine with it, are the same as the order in which they combine with their respective original functions B and A. Therefore, the only crossings that can be induced by composing functions in this way are BETWEEN the sets of arguments of the two original functions A and B. Within a set, no crossing can be induced, because of the function-composing nature of partial combination. Since the French clitic pronouns in example (3a), repeated here as (37a), DO appear to cross the arguments of the single main verb mettre, which example (b) suggests bears the category VP/PP/NP, the implication is once again that something else must be going on in the case of clitics. (For example, they may not in fact bear the same categories as the full NP and PP in [37b].)

(37) a. Jean les y a mis
     b. Jean a mis les couteaux dans le tiroir
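The observation about which crossings are possible can be seen directly in the toy encoding (again an illustrative assumption of this presentation): composition simply concatenates the two argument lists, preserving order within each.

```python
def forward_partial(a, b):
    if len(a) > 1 and len(b) > 1 and a[-1] == b[0]:
        return a[:-1] + b[1:]

A = ('X', 'a1', 'a2', 'Y')      # X/a1/a2/Y
B = ('Y', 'b1', 'b2')           # Y/b1/b2
print(forward_partial(A, B))    # ('X', 'a1', 'a2', 'b1', 'b2')
# The 'a' arguments and the 'b' arguments each keep the relative order
# they had with A and B alone; only the two SETS can come to be crossed
# with respect to the linear order of the composed functions.
```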
Moreover, crossed arguments can only be induced by partial combination where the direction of simple combination of the composed function [A·B] with some or all of the 'b' arguments is OPPOSITE to the direction of partial combination (which defines the linear order of the composed functions A and B), as is the case in the Dutch examples. In languages which (like English) keep the direction of simple and partial combination of verbs and their complements the same, the inclusion of partial combination in the grammar does not appear to induce more than a context-free language. But languages (like Dutch) which are not consistent in this way DO allow this possibility. Were there a language with the same lexical categories as German, having the German ordering of verbal elements (that is, with a BACKWARD partial-combination rule), but with the English habit of picking up objects and the like to the right (that is, by FORWARD simple combination), then crossed dependencies would also be the rule in the relevant construction. There is no such Germanic language of course, and there are independent reasons for knowing that languages like Dutch which mix leftward- and rightward-branching structures in a single category are
unusual. Most languages make all verbal categories combine in a consistent direction (Greenberg 1963; Vennemann 1973; Lehmann 1978; Comrie 1981; Hawkins 1980, 1982), a characteristic which is generally regarded as having a basis in the semantic categories that the syntax reflects and the need to simplify the grammar. It follows that, while the partial-combination mechanism will allow certain limited types of crossed dependency, there are independent reasons for predicting that this kind at least will be rare. It remains to be explained why natural languages should include such a mechanism as partial combination in the first place. It is crucial to the present enterprise that the partial-combination rules can be assigned a straightforward semantics, and it was indicated in section 2 that this semantics was to be identified with the simple notion of function composition. However, there remains the further question of WHY natural languages should incorporate such an operation in their semantics, rather than making do with function application alone. In OOW, it was suggested that partial-combination rules are compatible with the proposal that human language processors assemble semantic interpretations directly, without the mediation of any intervening representationally autonomous syntactic structure, and in many cases almost word by word, a proposal which has growing support from experimental studies (cf. Marslen-Wilson 1975; Marslen-Wilson and Welsh 1978; Tyler and Marslen-Wilson 1977; Marslen-Wilson and Tyler 1980; Crain 1980; Crain and Steedman 1982). In fact, for extensively right-branching constituents like the English verb phrase, the forwardpartial rule is essential if anything in the way of an interpretation is to be assembled before the rightmost element is reached, quite apart from questions of extraction. Such composed functions need not be internally inspected or modified in any way in the subsequent stages of processing: they can simply be applied to their arguments (or composed with further functions) without any further modification. It is tempting to speculate further that the involvement of such processors in psychological sentence comprehension stems in turn from a functional requirement, such as the need for the comparatively slow medium of spoken natural language to be understood rapidly in a hostile world which may not wait for the end of a sentence before changing in ways that may drastically affect the hearer. This proposal might explain why natural languages are so apparently complex in their syntax, when compared with the artificial languages of formal logic and computer programming. The artificial languages are not subject to any such requirement, and so their syntax is a straightforward reflection of their semantics. However, the fact that in natural languages the capability of
interpreting incomplete fragments (and hence the apparatus of partial combination) must be added to the basic apparatus of context-free syntax and semantics seems to have far-reaching consequences for their syntax.27 Most importantly, just a few extra constructions involving unbounded extractions, of just the kind that are so peculiarly widespread among the languages of the world, are allowed over and above the ones that are defined by the basic semantics and the corresponding context-free syntax. Moreover, under certain fairly unusual conditions which happen to obtain in the Dutch verb group, crossing dependencies may also be introduced.

Department of Psychology
University of Warwick
Coventry CV4 7AL
England
Notes

*   This work has been influenced at many stages by conversation with Tony Ades and Emmon Bach. They, Gerald Gazdar, Charlie Kirkpatrick, and Pieter Seuren read various drafts and made many helpful suggestions. Parts of the work were done during visiting fellowships at the Center for Cognitive Science of the University of Texas at Austin, and at the Max Planck Institut für Psycholinguistik, Nijmegen. The members of the latter institution have made heroic efforts to compensate for the manifold deficiencies in my knowledge of the languages concerned. An earlier version of the paper appeared as part of a longer manuscript entitled 'Work in progress' and was presented to the Conference on Explanations for Linguistic Universals, Cascais, Portugal, January, 1982. A fuller account will appear soon.
1.  Terms like 'movement', derived from transformational grammar, will be used throughout the paper purely descriptively, to describe constructions. Their use does not imply that the present account subscribes to such a theory — in fact it is intended to offer alternatives to ideas like movement.
2.  The star on (1a) indicates that the sentence is ungrammatical WITH THE INTERPRETATION DEFINED BY THE SUBSCRIPTS. There is of course another interpretation under which the sentence is grammatical, though silly.
3.  The phrase 'natural generalization' is used in the sense of Aho (1968, 1969), who applies it to the indexed grammars and the related 'nested-stack' automata.
4.  The present paper follows Dowty (1978) in assuming that it is this function which defines the relational role of the argument, for example as a direct as opposed to indirect object. For a dissenting opinion, see Bresnan (1982).
5.  However, it is clear from work by Flynn (1982) within a related categorial framework that certain categories that are not considered here, notably adverbials, will require a more general backward-combination rule in a full grammar of English.
6.  Although the slash notation used by Gazdar happens to look like the notation of categorial grammar, it is in fact quite different. For Gazdar, a category X/Y is not a function at all, but rather a constituent of type X which happens to contain a trace or 'hole' in place of a Y. The association of a 'hole' and the corresponding preposed constituent is mediated by a device in the semantics known as a 'distinguished variable', a tactic which differs significantly from the one adopted in OOW and the present paper. According to the present theory, the association is accomplished in the syntax, as in the movement-based account of traditional TG. Engdahl (1982) has pointed to some problems of variable binding that are inherent in the use of distinguished variables for this purpose.
7.  There is a precedent for rules like the one that follows in Geach's (1972) extension to the original scheme of Ajdukiewicz, in his use of 'recursive' rules.
8.  This feature of the partial-combination rule means that the corresponding semantic operation is a generalization of function composition in the usual sense of the term. However, the generalization is a very obvious one and will continue to be referred to simply as 'composition'.
9.  Example (13) provides a good illustration of the extremely unorthodox position that the notion of surface structure occupies in the present theory (a position, incidentally, which has no parallel in any of the alternative base-generative theories mentioned earlier). The example amounts in transformational terms to saying that the surface structure of the sentence is the left-branching structure [These cakes [[[[I can] believe] [I will]] eat]]. It implies that I can, I can believe, I can believe I will, and I can believe I will eat are all constituents of the surface structure of this sentence. Moreover, since they have the status of constituent in the grammar, the surface structure of the canonical sentence I can believe I will eat these cakes may also include such constituents. This possibility implies that in grammars of this kind, sentences with several verbs are potentially multiply ambiguous in their surface structure. Such a proposal may seem startling (although Schmerling 1983 provides some independent support for a constituent containing subject and tensed verb). However, the following points should be borne in mind in assessing its implications: (1) If one's conception of surface structure is drawn from the theories of grammar that are most directly related to parsing, such as the transition network grammars of Woods (1973, passim), then a surface structure is simply a record of the operations that a processor goes through in building a meaning representation, such as a deep structure or an interpretation. It is not a thing which needs to be built. Nor need it bear any obvious relation to the meaning representation that results. (2) There is surprisingly little unequivocal psychological or linguistic evidence for the psychological reality of traditional surface structure. Psychological results from 'click' experiments and other such paradigms show psychological correlates of SOME level of structure, for what that is worth. But they can as well be interpreted as reflecting DEEP structure, or indeed any other kind of meaning representation, as surface structure. Nor does linguistics itself offer any clearer evidence. Native-speaker intuitions, prosodic phenomena, and (in the absence of an adequate theory of gapping) evidence from conjunction are every bit as equivocal and seem quite as likely to provide evidence FOR the present position as against it. (3) It follows that about the only really clear constraint upon surface structure is that the associated operations for building the meaning representations should produce the correct result. The operation of function composition seems to do this. It will also produce equivalent end results for all the possible analyses of a canonical sentence with multiple verbs, such as the earlier example I can believe I will eat these cakes. The reason is that function composition and function application are 'associative' operations, like the operations of simple arithmetic. For the same reason, the alternative analyses of (12b), in which the three forward partial combinations are done in a different order, will produce equivalent end products.
10. Following Bach and Horn, apparent exceptions to the NP constraint, such as
    (i) What did you drink a bottle of?
    were argued to stem from an alternative analysis of the VP in which the PP is immediately dominated by the VP, and hence the preposition may strand. Languages like German in which prepositions do not strand can be captured by a marginally more stringent restriction.
11. In section 3.1 it will be shown to be implicated in the grammar of German. In Steedman (1982) it is suggested that something like the backward-partial rule is responsible for the combination of nontense affixes with verb stems, and that it may be implicated in some English constructions that are not considered here.
12. The work of this rule could also be accomplished in the categorial lexicon, by making such entities as articles bear more than one category. It does not therefore affect the power or the compositionality of the system. Since the whole of the story about tense and subject is not presented here, certain further restrictions that will be required to prevent overgeneration — for example 'finite VP preposing' as in ate apples she — are omitted.
13. The tendency for related languages to have 'mirror-image' constructions is a widespread one — cf. Culicover and Wexler (1974).
14. The necessity is in no way contingent upon the analysis of the subject and tensed verb, which will continue to be ignored in the present paper.
15. It is assumed here that constructions like the following (cf. Koster 1978b: 200) include a left-dislocation:
    (i) Die man die ken ik
        that man that know I
        'That man, I know'
    The initial NP therefore does not strictly belong to the clause, and the construction does not violate V/2.
16. Sentences like (o), in which verbs are preposed, have no equivalent in English. They are discussed by von Stechow (1979), and I am grateful to Emmon Bach and Sue Schmerling for having drawn them to my attention. (See next note also.)
17. Peter Seuren has pointed out to me that sentences of the form of Figure 1, p are grammatical in German (though not in Dutch) if (and only if) the object is a clitic pronoun, as in
    (i) Verstehen soll mich der Mann
        understand must me the man
18. A proposal concerning the restricted forms of the rules is advanced in Steedman (1982), although certain further restrictions that are required (for example in order to confine construction [a] to sentences with main verbs taking sentential complements) are not attempted there. (It has already been noted above how subtle and complex the rules are which govern the order of particular verbs in Dutch and German.)
19. The example is adapted from Huybregts (1976). There is an assumption concerning deep structure implicit in the identification of the dependencies in (26) which not all accounts share. But all agree that the dependencies do cross.
20. There is a rapidly increasing processing load which makes such multiple embeddings increasingly unacceptable. But by well-known arguments (cf. Chomsky and Miller 1963) such considerations are irrelevant to questions of grammaticality.
21. No strong claim is made that this is the correct account of the surface constituency of the noun-phrase sequence. Bresnan et al. have argued convincingly that some at least of the NPs form a constituent, a proposal which would go some way toward eliminating a problem for the present account that is discussed (but left unsolved) in section 5 below.
22. As always in the present paper it is assumed that such a category is simply given in the lexicon. It should be noted that a degree of freedom in the theory has been exploited in choosing this category for Dutch infinitival verbs. In order to account for the corresponding construction (a) in English, and the extractions (b, c) that it permits, it appears to be necessary to postulate a slightly different category for the corresponding English infinitives:
    (i) a. I saw Cecilia feed the hippos
        b. Cecilia I saw feed the hippos
        c. These hippos I saw Cecilia feed
23. Just as the account of tense and subject that was offered for English was slightly simplified, so it is an oversimplification to say that Dutch and German tensed verbs and subjects bear identical categories to their English counterparts. However, for all cases considered here, it is consistent with the fuller proposal that has already been mentioned.
24. There is a further analysis of (30c), in which the verbs partially combine in a different order. But the semantics of partial combination (cf. section 2) ensures that the end result is the same, for reasons sketched in note 9.
25. Any set of categories for these verbs and the te infinitive which will allow (b), (c), and (d) seems bound to allow (e) as well. On the other hand, there does seem to be considerable disagreement among native speakers as to whether (e) is in fact ungrammatical.
26. Further putative examples are discussed by Engdahl (1980).
27. In OOW it is also argued that the evaluation of such fully interpreted fragments is an important mechanism for the resolution of local syntactic ambiguities during processing, an idea developed further by Crain and Steedman (1982).
References

Ades, A. E., and Steedman, M. J. (1982). On the order of words. Linguistics and Philosophy 4, 517-558.
Aho, A. V. (1968). Indexed grammars — an extension of context-free grammars. Journal of the Association for Computing Machinery 15(4), 647-671.
—(1969). Nested stack automata. Journal of the Association for Computing Machinery 16(3), 383-406.
Ajdukiewicz, K. (1935). Über die syntaktische Konnexität. Studia Philosophica 1, 1-27. (English translation in Storrs McCall (ed.), Polish Logic 1920-1939, 207-231. Oxford: Oxford University Press.)
Bach, E. (1977). Comments on a paper by Chomsky. In P. W. Culicover, T. Wasow, and A. Akmajian (eds.), Formal Syntax. New York: Academic Press.
—(1979). Control in Montague grammar. Linguistic Inquiry 10, 515-531.
—(1980). In defense of passive. Linguistics and Philosophy 3, 297-341.
—(1983). Generalised categorial grammars and the English auxiliary. In F. Heny and B. Richards (eds.), Linguistic Categories: Auxiliaries and Related Puzzles, II. Dordrecht: Reidel.
—, and Horn, G. M. (1976). Remarks on 'Conditions on Transformations'. Linguistic Inquiry 7, 265-299.
Bar-Hillel, Y., Gaifman, C., and Shamir, E. (1960). On categorical and phrase structure grammars. The Bulletin of the Research Council of Israel 9F, 1-16. (Reprinted in 1964 in Y. Bar-Hillel, Language and Information. Reading, Mass.: Addison-Wesley.)
Bartsch, R., and Vennemann, T. (1972). Semantic Structures. Frankfurt am Main: Athenäum.
Bech, G. (1955). Studien über das Deutsche Verbum Infinitum, vol. I. Copenhagen.
Brame, M. K. (1976). Conjectures and Refutations in Syntax. New York: Elsevier North-Holland.
—(1978). Base Generated Syntax. Seattle: Noit Amrofer.
Bresnan, J. (1978). A realistic transformational grammar. In M. Halle, J. Bresnan, and G. Miller (eds.), Linguistic Structure and Psychological Reality. Cambridge, Mass.: MIT Press.
—(1982). Control and complementation. Linguistic Inquiry 13(3), 343-434.
—, Kaplan, R., Peters, S., and Zaenen, A. (1982). Cross-serial dependencies in Dutch. Linguistic Inquiry.
Chomsky, N. (1957). Syntactic Structures. The Hague: Mouton.
—(1963). Formal properties of grammars. In R. D. Luce, R. R. Bush, and E. Galanter (eds.), Handbook of Mathematical Psychology, vol. II. New York: Wiley.
—(1970). Remarks on nominalisation. In R. Jacobs and P. Rosenbaum (eds.), Readings in English Transformational Grammar. Waltham, Mass.: Ginn.
—, and Lasnik, H. (1977). Filters and control. Linguistic Inquiry 8, 485-504.
—, and Miller, G. A. (1963). Introduction to the formal analysis of natural language. In R. D. Luce, R. R. Bush, and E. Galanter (eds.), Handbook of Mathematical Psychology, vol. II. New York: Wiley.
Cole, P., and Sadock, J. M. (eds.) (1977). Syntax and Semantics, vol. 8: Grammatical Relations. New York: Academic Press.
Comrie, B. (1981). Language Universals and Linguistic Typology. Oxford: Blackwell.
Crain, S. (1980). Pragmatic constraints on sentence comprehension. Unpublished Ph.D. dissertation, University of California, Irvine.
—, and Steedman, M. J. (1982). On not being led up the garden path: the use of context by the psychological parser. In D. Dowty, L. Karttunen, and A. Zwicky (eds.), Natural Language Processing. Cambridge: Cambridge University Press.
Culicover, P., and Wexler, K. (1974). The invariance principle and universals of grammar. Social Sciences Working Paper 55, University of California, Irvine.
de Haan, G. (1979). Conditions on Rules. Dordrecht: Foris.
Dowty, D. (1978). Lexically governed transformations as lexical rules in a Montague grammar. Linguistic Inquiry 9(3), 393-426.
Engdahl, E. (1980). The syntax and semantics of questions in Swedish. Unpublished doctoral thesis, University of Massachusetts, Amherst.
—(1982). A note on the use of lambda conversion in generalised phrase structure grammar. Linguistics and Philosophy 4, 505-515.
Evers, A. (1975). The transformational cycle in Dutch and German. Unpublished Ph.D. dissertation, University of Utrecht. Indiana University Linguistics Club.
Flynn, M. (1982). A category theory of structure building. In G. Gazdar, E. Klein, and G. Pullum (eds.), Order, Concord and Constituency, 139-174. Dordrecht: Foris.
Fodor, J. D. (1978). Parsing strategies and constraints on transformations. Linguistic Inquiry 9, 427-473.
Gazdar, G. (1981). Unbounded dependencies and coordinate structure. Linguistic Inquiry 12, 155-184.
Geach, P. T. (1972). A program for syntax. In D. Davidson and G. Harman (eds.), Semantics of Natural Language. Dordrecht: Reidel.
Greenberg, J. (1963). Some universals of grammar with particular reference to the order of meaningful elements. In J. Greenberg (ed.), Universals of Language. Cambridge, Mass.: MIT Press.
Hawkins, J. (1980). On implicational and distributional universals of word order. Journal of Linguistics 16, 171-338.
—(1982). Cross category harmony, X-bar and the predictions of markedness. Journal of Linguistics 18, 1-35.
Horn, G. M. (1974). The noun phrase constraint. Indiana University Linguistics Club.
Huybregts, R. (1976). Overlapping dependencies in Dutch. Utrecht Working Papers in Linguistics 1, 24-65.
Jackendoff, R. (1977). X-bar Syntax: A Study of Phrase Structure. Cambridge, Mass.: MIT Press.
Kaplan, R. (1973). A multiprocessing approach to natural language. In Proceedings of the First National Computer Conference.
Kirkpatrick, C. (1981). The famous Dutch problem. Unpublished manuscript, University of Texas, Austin.
Koster, J. (1975). Dutch as an SOV language. Linguistic Analysis 1, 111-136.
—(1978a). Conditions, empty nodes and markedness. Linguistic Inquiry 9, 551-593.
—(1978b). Locality Principles in Syntax. Dordrecht: Foris.
—(1980). Proximates, locals, and dependents. In J. Koster and R. May (eds.), Levels of Syntactic Representation. Dordrecht: Foris.
Kuno, S., and Robinson, J. (1978). Multiple WH questions. Linguistic Inquiry 3, 463-488.
Langendoen, D. T. (1977). On the inadequacy of Type 3 and Type 2 grammars for human languages. In P. J. Hopper (ed.), Studies in Descriptive and Historical Linguistics: Festschrift for Winfred P. Lehmann. Amsterdam: John Benjamins.
Lehmann, W. (1978). Syntactic Typology: The Phenomenology of Language. Austin: University of Texas Press.
Lewis, D. (1971). General semantics. Synthese 22, 18-67.
Lyons, J. (1968). Introduction to Theoretical Linguistics. Cambridge: Cambridge University Press.
Maling, J., and Zaenen, A. (1982). A PSR treatment of unbounded dependencies in Scandinavian languages. In P. Jacobson and G. Pullum (eds.), The Nature of Syntactic Representation. Dordrecht: Reidel.
Marslen-Wilson, W. D. (1975). Linguistic structure and speech shadowing at very short latencies. Nature (London) 244, 522-523.
—, and Tyler, L. K. (1980). The temporal structure of spoken language understanding: the perception of sentences and words in sentences. Cognition 8, 1-74.
—, and Welsh, A. (1978). Processing interactions and lexical access during word recognition in continuous speech. Cognitive Psychology 10, 29-63.
Montague, R. (1973). The proper treatment of quantification in ordinary English. In R. H. Thomason (ed.), Formal Philosophy: Papers of Richard Montague. New Haven: Yale University Press.
Peters, S. (1980). Definitions of linked tree grammars. Lecture notes, University of Texas, Austin.
Peterson, T. H. (1977). On constraining grammars through proper generalisation. Theoretical Linguistics, 75-127.
Postal, P. (1964). Limitations of phrase structure grammars. In J. A. Fodor and J. J. Katz (eds.), The Structure of Language: Readings in the Philosophy of Language. Englewood Cliffs, N.J.: Prentice-Hall.
Pullum, G. K., and Gazdar, G. (1982). Natural languages and context-free languages. Linguistics and Philosophy 4.
Ross, J. R. (1967). Constraints on variables in syntax. Unpublished doctoral dissertation, MIT, Cambridge, Mass.
Sanders, G. (1975). Invariant Ordering. The Hague: Mouton.
Šaumjan, S. K. (1971). Principles of Structural Linguistics. The Hague: Mouton.
Schmerling, S. (1983). A new theory of English auxiliaries. In F. Heny and B. Richards (eds.), Linguistic Categories: Auxiliaries and Related Puzzles, II. Dordrecht: Reidel.
Seuren, P. A. M. (1973). Predicate raising and dative in French and sundry languages. Unpublished manuscript, Linguistic Agency, University of Trier.
Shir, N. E. (1977). On the nature of island constraints. Indiana University Linguistics Club.
Steedman, M. J. (1982). Work in progress. Unpublished manuscript, University of Warwick.
Tyler, L. K., and Marslen-Wilson, W. D. (1977). The on-line effects of semantic context on syntactic processing. Journal of Verbal Learning and Verbal Behavior 16, 683-692.
Vennemann, T. (1973). Explanation in syntax. In J. Kimball (ed.), Syntax and Semantics, vol. II. New York: Seminar Press.
von Stechow, A. (1979). Deutsche Wortstellung und Montague Grammatik. In J. M. Meisel and M. D. Pam (eds.), Linear Order and Generative Theory. Amsterdam: John Benjamins.
Woods, W. (1973). An experimental parsing system for transition network grammars. In R. Rustin (ed.), Natural Language Processing, Courant Computer Science Symposium 8. New York: Algorithmic Press.
Zaenen, A. (1979). Infinitival complements in Dutch. Proceedings of the Chicago Linguistics Society 15.
Form and substance in language universals*

LARRY M. HYMAN
Introduction

In a workshop such as this one, it is impossible to consider explanations for language universals without addressing the nature of explanation itself — either as it pertains to linguistics or in general. While philosophical and methodological questions have not been a major concern of mine — or probably of most linguists — we each operate with at least an implicit set of assumptions which allow us to either accept or reject something as an explanation, or an analysis of a particular corpus of data as 'explanatory'. When asked to reflect on the nature of explanation and on possible explanations of language universals, I am inclined to respond as most of my colleagues whom I have interrogated on the subject do, and distinguish between 'internal' vs. 'external' explanations. If the problem to be accounted for is a syntactic one, an internal explanation will propose an account in terms of the nature of syntax itself, while an external explanation will attempt to relate the syntactic problem to phenomena outside the realm of syntax (e.g. semantics or pragmatics). Similarly, if the problem is a phonological one, an internal explanation will construct a theory of phonology to account for it, while an external explanation will seek a relation with, say, articulatory or perceptual phonetics. There is a belief among certain linguists that only an external explanation is a true explanation. That is, a theory of syntax does not 'explain' syntax and a theory of phonology does not 'explain' phonology. Since internal explanations involve the construction of formal models, while external explanations normally do not, the internal/external dichotomy is sometimes referred to as one between formal vs. functional explanations. This opposition is useful, however, only to the extent that there is a clear distinction or break between the two kinds of explanation. Unfortunately, there is disagreement on the meaning of 'functional' as applied in this context. While everyone would agree that explanations in terms of communication and the nature of discourse are functional, it became
evident in different presentations at this workshop that explanations in terms of cognition, the nature of the brain, etc., are considered functional by some but not by other linguists. The distinction appears to be that cognitive or psycholinguistic explanations involve formal operations that the human mind can vs. cannot accommodate or 'likes' vs. 'does not like', etc., while pragmatic or sociolinguistic explanations involve (formal?) operations that a human society or individual within a society can vs. cannot accommodate or likes vs. does not like. Since psycholinguistic properties are believed to reflect the genetic makeup of man, while sociolinguistic properties reflect man's interaction with the external world, some definitions hold that only the latter truly has to do with 'function'. In the other view, any explanation which relates grammar to anything other than grammar is 'functional'. Since the term 'functional' is thus vague, if not 'loaded', I shall not use it in this paper. Instead I shall refer to internal vs. external explanations, as defined above.

It is of course no accident that internal explanations involving formal models are believed by their defenders to be related to cognition, while external explanations involving interactions between grammar and real speaker/hearers are believed by their defenders to be related to communication. It is an unfortunate and unnecessary consequence of this artificial division that the former group rarely addresses the communicative input, just as the latter group has in recent years abandoned the quest for a formal model of grammar and cognitive processing. What I shall therefore attempt to demonstrate in this paper is that these views, far from being contradictory of one another, together provide a fruitful avenue for the pursuit of explanations of language universals.
1. The problem: (i) phonology
Over the past ten years or so I have been intrigued by a puzzling recurrent pattern which can be summarized as in (1).

(1) a. Language A has a [phonological, phrase-structure, transformational] rule R which produces a discrete (often obligatory) property P;
    b. Language B, on the other hand, does not have rule R, but has property P in some (often nondiscrete, often nonobligatory) less structured sense.
Let me first illustrate this pattern, which frequently obtains between languages, by means of two phonological examples I have addressed in previous publications. In Hyman (1975) I considered languages which have phonological rules such as in (2).

(2) a. V → [+nasal] / N __   (perseverative assimilation)
    b. V → [+nasal] / __ N   (anticipatory assimilation)
A language may have a rule which nasalizes a vowel after a nasal consonant or before a nasal consonant; a language may also have both processes. It is usually quite obvious to the investigator when a language has either or both of these rules. On the other hand, it is less obvious that languages not having the phonological rules in (2) often allow for slight nasalization of vowels in the context of nasal consonants. These low-level, detail, or 'n-ary' effects are measurable and are due to the physiological properties of the speech organs. Speakers seem to find it more convenient, or at times less difficult, not to worry about synchronizing the raising and lowering of the velum with the change of vowel and consonant articulations. The result, of course, is that the velum may stay down too long, in which case we get perseverative nasalization (2a), or it may go down too soon, in which case we get anticipatory nasalization (2b). Languages having rules such as in (2) thus seem to formally institutionalize what the vocal tract would like to do if left to its own devices.

Or take a second phonological example, one which is of great historical interest for how languages develop tonal contrasts. It is well known that voiced obstruents have an intrinsic pitch-depressing effect on a following vowel. It is not surprising, then, that the English words in (3) tend to have the indicated pitch contours with a high-low declarative intonation:
(3) a. pin [immediate high-low fall]
    b. bin [lowered pitch onset, then high-low fall]
While the pitch fall may start immediately after the release of the initial voiceless obstruent of pin in (3a), in (3b) the voiced obstruent of bin causes a lowering of pitch before the high-low fall. While phoneticians do not all agree on the causes of this effect of voiced obstruents on pitch (see Ohala 1973, 1978; Hombert 1978), they do agree that the explanation lies in the operations affecting the larynx, where voicing contrasts and pitch distinctions are produced. In Hyman (1977a) I discussed the three stages a language may pass through in the development of a new tonal contrast, as indicated in (4).
(4) Stage I ('intrinsic'):   pá [ high ]   bá [ slightly lowered onset, then high ]
    Stage II ('extrinsic'):  pá [ high ]   bǎ [ rising ]
    Stage III ('phonemic'):  pá [ high ]   pǎ [ rising ]

    ( ´ ) = high tone; ( ˇ ) = rising tone
I start with a language which has a high vs. low tonal opposition and address what happens to high-tone syllables which begin with voiced obstruents. In stage I there is a slight lowering effect caused by the initial /b/. As in the case of low-level nasalization, the pitch lowering on the vowel /a/ is an INTRINSIC by-product of a neighboring segment and not part of the phonological tone (cf. Mohr 1971). In stage II, however, the lowering effect has been exaggerated beyond the degree we would expect from universal phonetics, to such an extent that its presence must be due to a phonological rule. Or, in other words, the low part of the tone has become an EXTRINSIC part of the signal. Finally, in stage III, the voiced obstruent becomes devoiced and we get a phonemic opposition between a high tone and a low-high rising tone.

The transition from stage I to stage II I term 'phonologization': an intrinsic property of the speech signal becomes part of the language-specific phonology. The transition from stage II to stage III I term 'phonemicization': a predictable phonological property introduced by rule becomes unpredictable, i.e. distinctive. The same stages are observable in the historical development of distinctive vowel nasalization, although in Hyman (1975) I argued that there may be as many as five distinguishable stages in such a development.

At this point it may not be clear why I introduced these facts as a puzzle, or why they are intriguing or even interesting to me. After all, I have simply demonstrated that languages acquire natural phonological rules, that is, rules which relate to natural phonetic processes. The puzzle, in my opinion, is the following, stated as the question in (5).
(5) Given that property P (e.g. vowel nasalization before/after a nasal consonant, tone lowering after a voiced obstruent) has an external (i.e. extragrammatical) origin and raison d'être, why doesn't P stay out of the grammar? I.e. why do languages acquire rule R?
In other words, why don't speakers just always nasalize vowels as suits their vocal tract? Why don't speakers just slightly lower the fundamental frequency of a vowel following a voiced obstruent? Why get carried away about it and make it a formal property of one's language? An initial response might be that in order for such phonological phenomena to occur over and over again, they must represent some kind of 'simplification', or some kind of advantageous functional or formal mutation over the previous stage. But what is the nature of this putative advantage?

The question in (5) can be reformulated without reference to either property P or rule R. In more general terms I am asking why languages must have grammars, i.e. formal systems containing a syntactic component, a phonological component, etc. There has been a tendency to view
grammar as a compromise mediator between sound and meaning (see, for example, Vennemann 1972 for a statement of this traditional view). In this view, since grammar is the result of a struggle between the two components of the linguistic sign, it acquires arbitrary (i.e. noniconic) properties of its own. This view may have relevance to describing the arbitrariness of form/content relations in the bulk of lexical items in any language, but we cannot explain the recurrent properties of GRAMMAR (syntax, phonology) without assuming that grammar has 'a mind of its own'. That is, the concerns of Grammar with a capital 'G' are not derivable from extragrammatical factors.

Let us, as a convenience, refer to Grammar as form and non-Grammar as substance. In the preceding discussion I have pointed out that the Grammar (i.e. the phonological component of the Grammar) has had a mind of its own: rather than leaving the phonetics to itself, the Grammar gets into its head the idea that nasalization or new tonal distinctions ought to be a part of it. What has begun as substance is now form. In other words, the true struggle is between the laws of substance and the laws of form (Grammar). Each set of laws wishes to control the raw material (substance). In our example, the vocal tract 'wishes' nasalization and pitch lowering to be intrinsic, while the Grammar wishes both to be extrinsic. Once extrinsic, the substance, which is now form, must conform to the laws of Grammar. In our example, once the phonetic substance has become phonological, it may become 'morphologized', 'lexicalized', and eventually dropped from individual grammars as new phonologization processes are introduced.
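Before turning to syntax, the intrinsic/extrinsic contrast can be made concrete with a small sketch. The Python fragment below is a toy model only: the segment inventories, the function names, and the numeric 'degree' of nasalization are all invented for illustration, with 0.2 merely standing in for whatever low-level amount phoneticians might measure.

```python
# Toy contrast between intrinsic (gradient) and phonologized (categorical)
# vowel nasalization. Inventories and the 0.2 'degree' are invented.

NASALS = {"m", "n"}
VOWELS = {"a", "e", "i", "o", "u"}

def intrinsic_nasalization(word):
    """Stage I: coarticulation. A vowel adjacent to a nasal consonant
    picks up a small, nondistinctive degree of nasalization."""
    degrees = []
    for i, seg in enumerate(word):
        d = 0.0
        if seg in VOWELS:
            if i > 0 and word[i - 1] in NASALS:
                d = 0.2            # velum stays down too long, cf. (2a)
            if i + 1 < len(word) and word[i + 1] in NASALS:
                d = max(d, 0.2)    # velum goes down too soon, cf. (2b)
        degrees.append(d)
    return degrees

def phonologized_nasalization(word):
    """Stage II: the Grammar has appropriated the pattern. Nasalization is
    now categorical in exactly the same environments, like rule R of (1a)."""
    return [1.0 if d > 0 else 0.0 for d in intrinsic_nasalization(word)]

print(intrinsic_nasalization("mano"))     # [0.0, 0.2, 0.0, 0.2]
print(phonologized_nasalization("mano"))  # [0.0, 1.0, 0.0, 1.0]
```

The point of the contrast is that both functions fire in the same environments; what changes between the stages is only whether the effect is a gradient by-product or a categorical fact of the grammar.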
2. The problem: (ii) syntax
In the preceding section I have attempted to justify what I think is obvious to all linguists; namely, that there are independent principles or laws of Grammar which constitute a FORCE ever-present to appropriate substance for its own purposes. I hope that it is clear that one does not 'explain' the presence of rules such as in (2) by merely pointing to phonetic facts at an intrinsic level. Similarly, one does not explain the phonologization and phonemicization processes in (4) in strictly phonetic terms. The substance (phonetics) provides the input for these processes but cannot explain why formal rules come into being.

Phonologists are well aware of the so-called naturalness of phonologization processes. What is less well understood is that the same formalization of substance occurs as frequently in the syntactic component. In this particular case the substance is pragmatics, i.e. intrinsic properties of communication. When pragmatic factors become part of a grammar, the
result is syntax and morphology. Let us consider an example. It is claimed for certain languages, e.g. Samoan, that all subject noun phrases must be definite. The case parallel to our phonological examples is one where a language has an overt definite article and either an overt indefinite article or ∅ (zero) marking indefiniteness on common nouns. In such a language the grammar will have to ensure that the determiner of the subject NP not be filled by an indefinite article (or be null, if a common noun). As seen in (6), the property P is the [+definite] specification that the subject NP must receive by rule, and the language in question is like language A seen earlier in (1).
(6) a. Language A: NP → [+definite] / [S __
    b. Language B: [S NP 'tends' to be [+definite], statistically
Language B, on the other hand, is one where, statistically speaking, the subject NP tends to be or is almost always [+definite]. Or, statistically, the proportion of [+definite] subject NPs is greater than, say, the proportion of [+definite] direct-object NPs. The frequency counts done by Givón and others indicate that this statistical bias holds across languages, wherever it can be measured. So, definite marking on subject NPs is an intrinsic byproduct of being 'subject' in exactly the same way as nasalization in language B is an intrinsic byproduct of being adjacent to a nasal consonant. In the latter type of language we are still in the realm of substance as far as the distribution of definite markers is concerned.

The question of why subjects should tend to be definite arises, just as the question arises as to why pitch should be depressed after a voiced obstruent. I am sure there is a natural external explanation as to why speakers organize discourse so that the subject position receives a greater preponderance of more identifiable or determined referents than certain other positions. However, it is not any more important to our understanding of the grammar to know at this time WHY subjects tend to be definite (or more definite) than it is to know why voiced obstruents lower a following pitch. So I need not speculate here as to what the communicative function or 'meaning' of being subject might be in a given language or languages. The Grammar need only detect that there is a clustering of definiteness and subjecthood; someone, perhaps a linguist, sociologist, or ethnologist (like our earlier phonetician for the nasalization and tone problems), may then wish to take this to an external level of explanation and account for why the substance is the way it is.

From this example and many like it we can conclude that pragmatics feeds into syntax exactly as phonetics feeds into phonology. I shall stay with the interaction between pragmatics, on the one hand, and syntax/
morphology, on the other, recognizing both the sometimes vague line between semantics and pragmatics and the possibility of considering other interactions. If we have used the term 'phonologization' for the earlier examples, the term 'grammaticalization' seems appropriate to describe the harnessing of pragmatics by a grammar. It would be convenient to have a term to cover both phonologization and grammaticalization. 'Grammaticalization' could apply to both situations, but one interpretation of the term would overlook its phonological instantiation. On the other hand, a term like 'codification' (suggested to me by Henning Andersen), used when substance becomes part of the linguistic code, does not sound linguistic enough for my taste. So I will alternate between 'phonologization' and 'grammaticalization', according to the case, but wish to emphasize that these are two instances of the same phenomenon.

I should like to return to the notion of basic conflict and of conflicting interests. The property P, being part of substance, belongs to the extragrammatical world whence it came. However, grammars, having P as a universally available formal feature, struggle to appropriate it from the world of substance and subject it to the laws of Grammar. The struggle between universal phonetics, i.e. the physiological properties of the vocal tract, and the laws of Grammar is obvious to any phonologist, but no less evident is the struggle between universal grammar ('Grammar' with a capital 'G') and pragmatics. We can differentiate three distinct situations or stand-offs which may obtain with respect to a given P in this struggle. These are listed in (7).
(7) a. Grammar takes care of its interests [linguistic form]; pragmatics takes care of its interests [nonlinguistic substance];
    b. The interests of pragmatics encroach on the interests of Grammar;
    c. The interests of Grammar encroach on the interests of pragmatics.
The situation in (7a) is the one where everything stays in its place: there is, with respect to some property P, no overlap between interests. An example might be the presence of conjugational or declension classes, as in Latin and other languages. This seems to be an area of grammar not of interest to pragmatics. The pragmatics, on the other hand, might like to have greater amplitude (loudness) on an imperative or negative form, but few grammars have this as a requirement (see below, however, for the interaction between such forms and focus marking). The situations in (7b) and (7c) are more interesting and are dealt with in the following two sections.
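The contrast between rule (6a) and tendency (6b) can likewise be sketched in a few lines: a categorical assignment on the one hand, a measurable statistical bias on the other. The clause encoding, the mini-corpus, and the proportions below are invented for the illustration and are not drawn from Givón's counts.

```python
# Rule (6a) vs. tendency (6b): grammatical control vs. raw substance.
# Clause representations and the frequencies below are invented.

def language_a_subject(np):
    """Language A: the grammar assigns [+definite] to every subject NP,
    whatever the speaker intended, as in the categorical rule (6a)."""
    return dict(np, definite=True)

def definiteness_rate(clauses, role):
    """Language B: no rule, only a bias that frequency counts can detect."""
    nps = [clause[role] for clause in clauses]
    return sum(np["definite"] for np in nps) / len(nps)

corpus = [  # toy clauses: subjects skew definite, objects much less so
    {"subject": {"definite": True},  "object": {"definite": False}},
    {"subject": {"definite": True},  "object": {"definite": True}},
    {"subject": {"definite": True},  "object": {"definite": False}},
    {"subject": {"definite": False}, "object": {"definite": False}},
]
print(language_a_subject({"definite": False}))  # {'definite': True}
print(definiteness_rate(corpus, "subject"))     # 0.75
print(definiteness_rate(corpus, "object"))      # 0.25
```

In language A the function, like the grammar, overrides the speaker's input; in language B the bias exists only as a proportion extracted from usage.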
3. Pragmatics encroaching on the interests of Grammar
In this situation the grammar of a particular language has a construction which is used for a certain grammatical purpose, and the pragmatics meddles and OVERRIDES the use of this grammatical construction for that purpose. I shall draw my examples from the syntax of body parts, an area I have investigated in several European and African languages. Consider, then, the French sentences in (8).
(8) a. j'ai lavé la chemise de l'enfant
       'I washed the child's shirt.'
    b. j'ai cassé le bâton de l'enfant
       'I broke the child's stick.'
As seen in (8), the normal possessive or genitive construction involves the preposition de 'of' plus a noun phrase. The sentences in (8) thus have an 'NP of NP' in direct-object position. In (9), however,
(9) a. j'ai lavé les mains à l'enfant
       'I washed the child's hands.'
    b. j'ai cassé le bras à l'enfant
       'I broke the child's arm.'
we see that when the possessed object NP is a body part, the expected construction is the preposition à 'to' plus an NP, i.e. identical in form to the indirect object in French. The sentences in (9) thus mean, literally, 'I washed the hands to the child' and 'I broke the arm to the child'. If we use the 'normal' possessive construction, as in (10),
(10) a. ?j'ai lavé les mains de l'enfant
     b. ?j'ai cassé le bras de l'enfant
the impression one gets is that the 'hands' in (10a) and the 'arm' in (10b) are not part of the child's body, but rather some loose objects he may have found lying around somewhere. Many languages disprefer or ban the use of a possessive construction in such cases. The details may vary and depend upon the nature of the verb, the object NP, and the possessor NP, as argued in Hyman (1977b) for Haya, a Bantu language spoken in Tanzania. Details aside, the problem is always the same. The direct object of a verb is in semantic-case terms the 'patient' of that verb. If there is any semantic unity to this notion, it is that the direct object undergoes or 'is affected by' the action of the verb. In cases where the direct object NP is a body part belonging to a human NP, a potential conflict arises, because the possessor NP is necessarily, and perhaps more critically, affected by the action than the body part is. This is not usually the case with the detached
object (body part or not) in (8) and (10). Although an idealized grammar (i.e. one without interference from the external world) would like to produce the sentences in (10), the pragmatics is not happy with this representation of the actions involved and their effect on the two NPs.

In English, the concerns of Grammar win out in the sense that the possessive construction is used freely whenever possession is involved. (Remnant sentences such as look me in the eye! and don't look a gift horse in the mouth! reveal that English once had an alternative construction.) In this particular sense of 'concerns of Grammar' we mean, following Fodor (this volume), that an idealized Grammar wishes maximal generality, simplicity, and 'tidiness'. Fodor hypothesizes, further, that the Grammar wants the fewest statements possible, which is what the grammar of English gets with respect to body-part syntax. There is another sense of 'concerns of Grammar', however, which is put into effect in the grammar of French and other languages which accommodate the pragmatics of body-part syntax in at least the two ways indicated in (11).
(11) a. The 'affected possessor NP' is expressed in the direct-object relation, with the semantic patient expressed in the chômeur relation (e.g. in most Bantu languages);
     b. The 'affected possessor NP' is expressed in the indirect-object relation, with the semantic patient left in the direct-object relation (e.g. in many European languages).
Since the situations described in (11) represent an encroachment of pragmatics into individual grammars, we assume that a 'compromise' has been reached. The concerns of pragmatics are obvious: optimal expression of what an individual, society, or culture wants to express. This corresponds to Fodor's notion of an 'expressor'. The concerns of Grammar in this process are different from those isolated by Fodor. In the grammaticalization process what we discover is the desire of the Grammar, and therefore of individual grammars, to control anything and everything they can.

In order to see that this is so, let us imagine French to be slightly different from what it is. In this slightly different French the sentences in (10) are used when the speaker is concerned with the effect of the washing on the hands and of the breaking on the arm; the sentences in (9) are used when the speaker is concerned with the effect of the washing and breaking on the child. In this case the use of the possessive vs. indirect-object construction would be an intrinsic byproduct of the pragmatics. This, then, is the only situation where the pragmatics could be said to have 'won out'. The real French as we know it, however, virtually requires the sentences in (9) when the body parts are attached to a live possessor. This,
then, represents the corresponding extrinsic stage brought about by the grammaticalization of the pragmatic substance. We can look at the stages in (4) in the following way: the intrinsic stage represents the concerns of substance winning out over the concerns of Grammar; the 'emic' stage represents the concerns of Grammar winning out over the concerns of substance. In between the extremes is the extrinsic stage: this is the true compromise, achieved because the substance is receiving formal recognition in a natural, nonarbitrary way. Pragmatics and Grammar reach a happy medium.

One of the interesting features of the encroachment of pragmatics into the realm of Grammar is that the loosening of the grammatical grip often extends beyond a single subarea of the individual grammar. When in the struggle between Grammar and pragmatics the latter finally breaks through, it is not just for one instance, but more generally. We see this particularly clearly in the French à construction, which marks an NP affected by an action. Consider the slight meaning difference in the following two sentences involving the faire causative construction (cf. Hyman and Zimmer 1976):
(12) a. j'ai fait laver la vaisselle par la bonne
        'I had the dishes washed by the maid.'
     b. j'ai fait laver la vaisselle à la bonne
        'I had/made the maid wash the dishes.'
In (12a) the normal causative construction is used: the subject/agent of the lower clause becomes a par 'by' phrase, as in the English translation. However, note in (12b) that an alternative construction is available with the agent expressed as an indirect object, i.e. the preposition à 'to' plus an NP. The reason for this is essentially the same as before. In (12a) the speaker is interested in the effect of the verb 'wash' on the dishes; in this case, the patient la vaisselle 'the dishes' gets no competition from the other NP, la bonne 'the maid', since the latter is expressed as a nonterm, i.e. by a 'by' phrase. The implication is that the agent in (12a) is only secondary, i.e. that I wanted to get the dishes washed and it happened to be the maid whom I found to do it.

In (12b), however, a different set of circumstances obtains. In this case, as seen in the gloss, I am interested in the effect of the dishwashing on the maid in addition to the effect of the washing on the dishes. (12a) does not sufficiently express this, since the par 'by' relation is too low in the grammatical-relation hierarchy, just as the de relation was in the de + NP possessive construction in (10). How does one test this hypothesis? Consider the sentences in (13).
(13) a. ?j'ai fait laver Pierrot à la bonne
        'I had the maid wash Pierrot.'
     b. ?*je t'ai fait laver à la bonne
        'I had the maid wash you.'
     c. ?*ils m'ont fait élever à ma pauvre grand'mère
        'They had my poor grandmother raise me.'
These sentences have human-patient direct objects. As indicated, it is hard to get French speakers to accept these sentences, even when precise and appropriate contexts are provided. The reason is that with a human direct object the competition between the two NP referents ('maid' vs. 'me', 'you', 'Pierrot') is keener. That is, the two referents in each sentence are (intrinsically) equally capable of being affected by the action. In the Hyman and Zimmer study it was demonstrated that the acceptability of à + NP to express the causative agent depended upon the following person/animacy hierarchy:
(14) 1st person                                    (sg. > pl.)
     2nd person                                    (sg. > pl.)
     3rd person human     (definite > indefinite)  (sg. > pl.)
     3rd person animate   (definite > indefinite)  (sg. > pl.)
     3rd person inanimate (definite > indefinite)  (sg. > pl.)
This hierarchy has been studied by a number of linguists besides myself, including Kuno (1976), Silverstein (1976), Hawkinson and Hyman (1974), Duranti (1979), Hopper and Thompson (1980), and others. Languages either do or do not allow it to influence their grammars — making cuts in different places, and perhaps making finer distinctions, as in the Navaho 'great chain of being' — but they do not reverse the hierarchy. As a final note on the French causative construction, consider the following pair of sentences (cf. Pinkham 1974):
(15) a. j'ai fait voir le film à l'enfant
        'I had the child see the film.'
     b. ?j'ai fait voir le film par l'enfant
        'I had the film seen by the child.'
Since the direct object 'film' is inanimate, both constructions should be possible. However, with such objects, perceptual verbs such as voir 'see' do not readily allow the causative agent to be expressed with a 'by' phrase, as seen in (15b). Given what has been said, it is not hard to explain this fact. The film in (15) is not affected by the action of the verb; instead, only the 'experiencer' l'enfant 'the child' can be said to be affected. There is, however, a reading of (15b) which makes that sentence acceptable. This
is the sense where the child is acting as an agent getting the film seen by others. The translation might then be 'I had the film shown by the child'.

Returning to the hierarchy in (14), one might ask why this or any other such array of pragmatic features should become so involved with the giving out of grammatical relations and other grammatical properties. The idea that I have been developing here is one of constant conflict. Real speaker/hearers want to have the communicative freedom to exploit linguistic material at will. Thus, they want to be able to nasalize when it is convenient to them and to use à vs. de/par according to the intended message. Let us refer to such freedom in the area of syntax as pragmatic control. To this we oppose the ever-present pressure by grammars on substance. An idealized Grammar existing without any real-world constraints would like all choices to be controlled by individual grammars, not by speakers: choices such as whether to nasalize or not, whether to use à vs. de or par, etc. Let us refer to this force as grammatical control. The result is compromise: what speakers wish to exploit is encoded, but with resultant grammatical control.

What is achieved, in effect, is an ICONICITY between degrees of speaker concern or 'empathy', in Kuno's terminology, and the feature hierarchy in (14). This iconicity extends as well to a hierarchy of semantic roles (or the grammatical cases and grammatical relations the semantic roles tend to receive). Thus, languages show a tendency to associate the higher feature values in (14) with the higher semantic roles in the hierarchy: agent > recipient/benefactive > patient > instrument, etc.
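The role the hierarchy in (14) plays in the giving out of grammatical relations can be sketched as a simple ordered scale. Only the ordering itself comes from (14); the encoding of NP types and the acceptability cutoff below are invented stand-ins for where one language might make its cut.

```python
# The person/animacy hierarchy of (14) as an ordered scale. Only the
# ordering is from (14); the cutoff in a_agent_acceptable is invented.

HIERARCHY = ["1st person", "2nd person", "3rd human",
             "3rd animate", "3rd inanimate"]    # highest to lowest

def rank(np_type):
    return HIERARCHY.index(np_type)             # lower index = higher rank

def a_agent_acceptable(patient, agent):
    """In the faire causative, the a+NP agent competes with the patient for
    affectedness; this sketch lets it through only when the patient sits
    lower on the hierarchy than the agent, cf. (12b) vs. (13b)."""
    return rank(patient) > rank(agent)

print(a_agent_acceptable("3rd inanimate", "3rd human"))  # True, like (12b)
print(a_agent_acceptable("2nd person", "3rd human"))     # False, like (13b)
```

Because the function only ever compares positions on the scale, it can model different languages' cuts while never reversing the hierarchy, in keeping with the generalization above.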
4. Grammar encroaching on the interests of pragmatics
In the preceding section we saw that French speakers entered into a compromise arrangement with the idealized Grammar. In both the possessive and the causative constructions the result was that they yielded to the grammar of French some control over whether one vs. another form would be used in a given context. The trade-off was the encodability of affectedness in the grammar of French. Rather than remaining apart, the interests of pragmatics encroached upon the grammar of French and the resulting compromise was struck. What started therefore as a grammatical opposition (e.g. 'possessive' vs. 'indirect-object' construction) now has some pragmatics built into it. In other languages what starts as a pragmatic opposition comes to have grammar built into it. This is the third possibility mentioned in (7c). I would like to illustrate this reverse situation with respect to cases where the choice of focus marking, which should be pragmatically controlled, is instead grammatically controlled.
Consider the sentences in (16).
(16) a. JOHN ate rice      (ans. to Q: who ate rice?)
     b. John ATE rice      (ans. to Q: what did John do to rice?)
     c. John DID eat rice  (ans. to Q: didn't John eat rice?)
     d. John ate RICE      (ans. to Q: what did John eat?)
The focus in unmarked declarative English sentences is realized by putting the high pitch of the high-low intonation contour on the main stress of the focused material. As seen, the focus may be placed on the subject NP (16a), the verb (16b), the auxiliary (16c), or the direct object (16d) in a simple transitive sentence. The placement of focus marking in one vs. another place depends on two considerations: (a) the context; and (b) the intention of the speaker. If the context is one of the WH questions given, the only appropriate answer is as shown. Since it is the speaker who decides where focus marking will be, English has pragmatic control of focus marking. This is not to say that rules of grammar do not place the focus; only that the choice of one place rather than another is not decided by the grammar.

Now let us modify English only slightly. Assume that there is a dialect of English which differs from the standard in only one respect: if the speaker of this dialect chooses to use an imperative construction, he must put the focus on the imperative verb. Thus, we would have an exchange such as in (17).
(17) Speaker 1: What should I eat?
     Speaker 2: EAT rice!
Clearly the utterance 'Eat rice!', with stress on the verb, does not sound right to us in this context. Yet parallels to just this development are found in a number of African languages, e.g. Aghem and Somali, which treat the imperative as inherently [+focus] (Hyman and Watters 1980). Aghem also requires that [+focus] be marked on a negative auxiliary. Thus we would have the exchange in (18) in our imaginary modified English dialect:
(18) Speaker 1: Why isn't Professor Hawkins here?
     Speaker 2: Because he ISN'T in town.
Again the exchange does not seem natural, but it reflects what Aghem does in its morphological focus marking (see the studies in Hyman 1979). It is not possible for me to go into what a grammatical account of Aghem or this hypothetical English dialect might look like. Let us assume for the purpose of discussion that rules such as in (19) will become part of their grammar:
(19) a. [+imperative] → [+focus]
     b. [+negative] → [+focus], etc.
I have put 'etc.' because it is not only these two features which acquire [+focus] marking, although they are the most common. A more complete picture would be as in (20).
(20) a. 'marked' polarity = negative
     b. 'marked' mood   = imperative (possibly subjunctive too)
     c. 'marked' aspect = progressive
     d. 'marked' tense  = perfect
Any of these may attract focus marking, and since there can be only one [+focus] per clause in these languages, the [+focus] is robbed from where it might otherwise have been placed according to the context and the intentions of the speaker. Why should this be?

The answer lies in the question, what is focus? We are accustomed to seeking sophisticated responses to this question, relying on presupposition, scope of assertion, exhaustive listing vs. counterassertive focus, etc. But I think the answer is much simpler: grammatical focus is the assignment of [+focus] by a grammar, and this [+focus] is a mark of salience within the grammar. Given that not all information communicated is of equal salience, languages could conceivably choose to deal with this in a number of ways, as indicated in (21).
(21) a. languages could ignore salience altogether;
     b. languages could allow gradated and unlimited marking of salience;
     c. languages, through their grammars, could 'harness' the pragmatics and create a formal system for focus.
(21a) seems unreasonable, since it would completely ignore the needs of speakers: there simply have to be means of highlighting some information at the expense of other information, if for no other reason than to hold the attention of listeners, who are unlikely to be engaged by a monotone utterance devoid of affect. (21b) is more reasonable and represents complete pragmatic control of salience. Imagine a language which places a [+focus] marker baa after any item, and after as many items, as the speaker chooses. This is a situation which speakers might like, just as their vocal tract might like to nasalize only when it is functional to do so. But an idealized Grammar would not like it.

The driving force of Grammar is to get control of whatever it can, as in (21c). This is normally done in a simple way by reference to focus within the propositional content of an utterance: e.g. a WH element may receive [+focus] marking, as will its answer. Otherwise, [+focus] is put on a salient part of the proposition.
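A toy version of option (21c) makes the competition explicit: a marked operator captures the clause's single [+focus] slot, and only in its absence does the speaker's choice go through. The clause encoding and function names are invented, and treating (20a-d) as a fixed priority list is a simplification assumed for the sketch.

```python
# A toy [+focus] assigner for the hypothetical dialect: the marked
# operators of (20) preempt the single [+focus] per clause; otherwise
# the speaker's pragmatically chosen constituent receives it, as in (16).

MARKED_OPERATORS = ("negative", "imperative", "progressive", "perfect")

def assign_focus(clause):
    """Return the element carrying the clause's single [+focus]."""
    for op in MARKED_OPERATORS:
        if clause.get(op):
            return op                    # grammatical control: rules (19)
    return clause["speaker_choice"]      # pragmatic control survives

print(assign_focus({"speaker_choice": "object"}))                    # object
print(assign_focus({"negative": True, "speaker_choice": "object"}))  # negative
```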
What Aghem and many other African languages do is consider relative salience within the OPERATOR system, that is, within the auxiliary. It is intuitive that when one tells someone 'no', it is more salient than when one tells them 'yes'. The operator is more salient when you order someone via an imperative than when you make an indicative declarative statement. Progressive aspect focuses attention on the ongoingness of some action (e.g. 'John is playing cards'), while the nonprogressive, being unmarked, does not focus attention on the non-ongoingness of the action ('John plays cards'). Instead, the nonprogressive competes less with the propositional content of the utterance, thereby allowing [+focus] more ready access to one of its constituents outside the auxiliary. Finally, the perfect tense (or aspect, if you will) focuses attention on the effect of some prior action on some later state. The absence of a perfect does not do the reverse but, again, as with the nonprogressive, simply allows the proposition more relative salience.

What the above indicates is that there is a competition between salience within a proposition and salience within a system of operators. As a confirmation of this competitive view of salience, with different kinds of salience vying for the one [+focus] marking, let us consider constructional focus. Some clause types are inherently more salient than others. In particular, main clauses are more salient than nonmain clauses. Let us return briefly to the hypothetical English dialect. Imagine that this dialect, like many African languages, starts to forbid any [+focus] assignment within a relative clause. We now have an exchange as in (22).
(22) Speaker 1: Which book did he read?
     Speaker 2: [He read] the book that you gave him.
Again there is an unnatural focus marking, but one which we would have to have in some languages. In (23) I have indicated the kinds of clauses which do not allow [+focus] marking in certain African languages:
(23) a. cleft clauses        (it's the child that I saw)
     b. relative clauses     (the child that I saw...)
     c. adverbial clauses    (when I saw the child...)
     d. if-clauses           (if I saw the child...)
     e. consecutive clauses  (he came and — I saw him)
(23) represents a proposed hierarchy. Even in English it is rather odd to get a [+focus] within the cleft clause. The above clauses typically create islands and essentially constitute the class of nonroot Ss in Emonds's (1976) framework. Why should it be these that repel the [+focus]? A moment's reflection will reveal that (23) represents a class of 'backgrounded' clauses (cf. Schachter's [1973] discussion of focus and
relativization). They are typically used not to make new assertions but rather to provide the circumstances under which the assertions of main clauses hold true. Thus, to look only at these embedded clauses would not tell us much about the story line of a narrative, but only about the periphery of the tale. We see some indication of this in the English sentences in (24).
(24) a. he did too hit me! (~ did so)
     b. *it's John who did too hit me
     c. *the child who did too hit me
     d. *when he did too hit me...
     e. ?*if he did too hit me...
The counterassertive too/so construction goes only in assertive or main clauses. Aghem and other languages simply extend this, so that relatively less salient clauses are exempted from any [+focus] marking. Thus the [+focus], which should have been placed on the basis of the speaker's wishes relative to a given context, is largely controlled by the grammars of these languages. This does not mean, necessarily, that differences in focus cannot be indicated in the context of negation or backgrounded clauses. Besides the morphological marking which is sensitive to [+focus] specifications, these languages allow SYNTACTIC devices, e.g. word-order variations, to express differences in relative salience. Note, finally, that if one chooses, in our hypothetical English dialect, a main clause that is affirmative, indicative, nonprogressive, and nonperfect, one still, as a speaker, has some choice in placing the [+focus].
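Constructional focus can be bolted onto the same sketch: the backgrounded clause types of (23) repel [+focus] altogether, before the operators even compete. Again, the clause encoding is an invented illustration.

```python
# Extending the sketch: the backgrounded clause types of (23) allow no
# [+focus] at all, whatever the operators or the speaker would prefer.

MARKED_OPERATORS = ("negative", "imperative", "progressive", "perfect")
BACKGROUNDED = {"cleft", "relative", "adverbial", "if", "consecutive"}

def assign_focus_in_clause(clause):
    if clause["type"] in BACKGROUNDED:
        return None                      # no [+focus] available, cf. (22)
    for op in MARKED_OPERATORS:
        if clause.get(op):
            return op
    return clause["speaker_choice"]

main = {"type": "main", "speaker_choice": "object"}
rel = {"type": "relative", "speaker_choice": "object"}
print(assign_focus_in_clause(main))  # object
print(assign_focus_in_clause(rel))   # None
```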
5. Conclusion
At the beginning of this paper I distinguished internal vs. external explanations for language universals. Given the preceding discussion, it should be clear that the totality of language will be accounted for only by a combination of 'explanations'. If we take as a major goal the explanation of grammatical properties recurrent in languages, we shall have to postulate, on the one hand, abstract principles of Grammar, as Chomsky has argued most forcefully, which constrain what individual grammars can do, and, on the other hand, principles, abstract or not, of communication, cognition, etc., to predict the kinds of substance which may become grammaticalized — and in what ways, order, etc. The discussion of focus in section 4 was, admittedly, sketchy, but some very basic predictions can be derived from it. First, the notion of marked ( = salient) operators in
(20) predicts that negatives, imperatives, progressives, and perfects may 'associate with focus' (Jackendoff 1972) but that no language will have such an association only in the case of the unmarked values (affirmative, indicative, nonprogressive, nonperfect). This association may result in a [+neg] attracting the one [+focus] marking of a clause. Second, the notion of backgrounded clauses in (23) predicts that main clauses will always make as many focus distinctions as, or more than, cleft, relative, adverbial, 'if', and consecutive clauses.

External explanations thus relate properties of Grammar to substance (phonetics, psychology, sociology, etc.). Since substance is taken to be 'real', e.g. often physically measurable or capable of statistical analysis, external explanations are in a sense more concrete and accessible than internal explanations. In this paper I have tried to place the two kinds of explanations in perspective. Phonetics provides much of the substance of phonology, and pragmatics provides much of the substance of syntax. However, the ever-present phenomena of phonologization and grammaticalization cannot be explained by reference to the origin of the substance. Grammar has its own laws which, whether innate or learned, are species-dependent. While it is universally asserted that LANGUAGE is present for communicative purposes, it is harder to demonstrate that GRAMMAR exists for the same reason. Or, restated as a problem in language acquisition, it is as if the need of the child to communicate is subordinated to his need to develop a formal system, that is, to grammaticalize as much substance as possible. Phonologization and grammaticalization become, then, an overformalization of substance by the child which gradually works its way into the adult language.

We can thus reformulate Chomsky's (1965) distinction between substantive and formal language universals as follows. Substantive universals provide statements, hierarchies, implicational universals, etc., on how substance comes to be grammaticalized. Formal universals provide statements, hierarchies, implicational universals, 'parameters', etc., concerning the internal formal operations of grammars. All of the generalizations reported in this paper fall, then, into the category of substantive universals. It is hoped that the years ahead will see the convergence of substantive and formal universals into a unified theory of language structure.

Department of Linguistics
University of Southern California
Los Angeles, Calif. 90089
USA
Note

* This paper represents a compromise between the presentation made at the Workshop, which I entitled 'Universals of focus', and a Professorial Address made January 20, 1982, at the University of Southern California, entitled 'Language [read: Grammar] has a mind of its own'. I would like to thank participants at both events for their helpful comments and discussion, especially Bernard Comrie, Edward Finegan, Janet Fodor, Jack Hawkins, Osvaldo Jaeggli, Stephen Krashen, and Elinor Ochs.
References

Chomsky, Noam (1965). Aspects of the Theory of Syntax. Cambridge, Mass.: MIT Press.
Duranti, Alessandro (1979). Object clitic pronouns in Bantu and the topicality hierarchy. Studies in African Linguistics 10, 31-45.
Emonds, Joseph E. (1976). A Transformational Approach to English Syntax. New York: Academic Press.
Hawkinson, Anne K., and Hyman, Larry M. (1974). Hierarchies of natural topic in Shona. Studies in African Linguistics 5, 147-170.
Hombert, Jean-Marie (1978). Consonant types, vowel quality, and tone. In Victoria A. Fromkin (ed.), Tone: A Linguistic Survey, 77-112. New York: Academic Press.
Hopper, Paul, and Thompson, Sandra (1980). Transitivity in grammar and discourse. Language 56, 251-299.
Hyman, Larry M. (1975). Nasal states and nasal processes. In Charles A. Ferguson et al. (eds), Nasálfest: Papers from a Symposium on Nasals and Nasalization, 249-264. Stanford: Stanford University Press.
—(1977a). Phonologization. In Alphonse Juilland (ed.), Linguistic Studies Presented to Joseph H. Greenberg, 407-418. Saratoga, Calif.: Anma Libri.
—(1977b). The syntax of body parts. In Ernest Rugwa Byarushengo et al. (eds), Haya Grammatical Structure, 99-117. Southern California Occasional Papers in Linguistics No. 6. Los Angeles: University of Southern California.
—(ed.) (1979). Aghem Grammatical Structure. Southern California Occasional Papers in Linguistics No. 7. Los Angeles: University of Southern California.
—, and Watters, John Robert (1980). Auxiliary focus. Paper presented at the Conference on Auxiliaries, Fourth Annual Round Table in Linguistics, University of Groningen, Netherlands, July 4-8, 1980.
—, and Zimmer, Karl E. (1976). Embedded topic in French. In Charles N. Li (ed.), Subject and Topic, 189-211. New York: Academic Press.
Jackendoff, Ray S. (1972). Semantic Interpretation in Generative Grammar. Cambridge, Mass.: MIT Press.
Kuno, Susumu (1976). Subject, theme, and the speaker's empathy — a reexamination of relativization phenomena. In Charles N. Li (ed.), Subject and Topic, 417-444. New York: Academic Press.
Mohr, Burckhard (1971). Intrinsic variations in the speech signal. Phonetica 23, 65-93.
Ohala, John J. (1973). The physiology of tone. In Larry M. Hyman (ed.), Consonant Types and Tone, 1-14. Southern California Occasional Papers in Linguistics No. 1. Los Angeles: University of Southern California.
—(1978). Production of tone. In Victoria A. Fromkin (ed.), Tone: A Linguistic Survey, 5-39. New York: Academic Press.
Pinkham, Jessie (1974). Passive and faire-par causative construction in French. Senior essay, Harvard University.
Schachter, Paul (1973). Focus and relativization. Language 49, 19-46.
Silverstein, Michael (1976). Hierarchy of features and ergativity. In R. M. W. Dixon (ed.), Grammatical Categories in Australian Languages, 112-171. New Jersey: Humanities Press.
Vennemann, Theo (1972). Phonetic analogy and conceptual analogy. In Theo Vennemann and Terence H. Wilbur (eds), Schuchardt, the Neogrammarians, and the Transformational Theory of Phonological Change, 181-204. Frankfurt am Main: Athenäum.
Form and function in explaining language universals*
BERNARD COMRIE
1. Form and function
In this section, I want to contrast two possible approaches to explaining language universals. The first, which is most closely associated with mainstream transformational-generative grammar, especially Chomsky, I will call the formal approach; it argues that explanations for language universals are to be sought purely within the formal system of language description, for instance by trying to establish higher-level formal generalizations from which many of the particular properties of individual languages will follow logically. Of course, claims may be made relating these purely formal properties to phenomena beyond the formal properties of language, for instance by claiming that these formal properties represent internal properties of the human being as a species ('innate ideas'); however, unless independent evidence can be adduced in favor of these properties as innate mechanisms, it remains the case that the only explanation actually demonstrated is a linguistic-internal, formal explanation.

The second kind of explanation argues that universal properties of language hold for functional reasons, i.e. because language would be a less-efficient communication system if the universal in question were not to hold. Obviously, this kind of explanation is only viable in contrast with the first kind if some independent measure of communicative efficiency can be given. In the body of this paper, I will argue that for a significant set of constructions cross-linguistically, a functional explanation of this kind can be established.

Functional explanations can be of several different kinds; for instance, they may relate properties of language structure to aspects of human cognition, or to language as a system for efficiently relating meaning to form. In the present paper, I will be concentrating on the second of these kinds of functional explanation, i.e. on explanations that might also be characterized as semantic-pragmatic. Although I will be arguing that, in certain specific cases, a functional
explanation can and should be given, I will not adopt the extreme position that all language universals can be given a functional explanation (except in the rather uninteresting sense that they may directly reflect limitations on the human organism). For instance, one might consider one universal frequently cited by transformationalists, namely that transformations are structure-dependent, i.e. are limited to performing operations in terms of constituent structure, rather than on arbitrary strings.1 This allows, for instance, a language to form yes-no questions by inverting the subject and predicate, or the subject and finite verb, or the subject and auxiliary, all of which possibilities are found in different languages.2 It prevents a language, however, from forming yes-no questions by simply providing a left-right inversion of the word order for a string of arbitrary length, so that the question corresponding to (1) would be (2):
(1) The man that killed the cat had an old gun.
(2) Gun old an had cat the killed that man the?
As far as I can tell, there is no functional explanation for why transformations in natural language should be structure-dependent (except, perhaps, that human beings are built that way). Certainly, there is no a priori reason why a language with a left-right inversion process as illustrated above should not be a perfectly functioning communication system. This seems, therefore, to be a universal for which there is no functional explanation. I am therefore not questioning the existence of some language universals for which there is no (known) functional explanation. What I will argue for in the body of the paper is that some language universals do have viable and, I believe, correct functional explanations.

Before leaving structure-dependence, I would make one final comment. It seems to me misleading to present structure-dependence as being a specifically linguistic universal (whereas the universals discussed below will be specifically linguistic). Rather, it seems to be a general characteristic of human processing that it is difficult to provide a left-right inversion of a string that is presented in sequence. For instance, try taking any reasonably long sequence that is well known forward but which has not been learned backward (e.g. a telephone number, or the alphabet), and then try to recite it backward. At best, the result is dysfluent, often characterized by errors (perhaps subsequently corrected) and false starts, even for a sequence that can be recited forward at incredible speed and with complete accuracy.3 Structure-dependence may thus be a general property of human processing capabilities, albeit with no functional explanation other than to say that that is the only way humans are constructed to operate.
By contrast, one might consider the universal presented as (3) below, which is universal number 15 in Greenberg (1966: 111):

(3) In expressions of volition and purpose, a subordinate verbal form always follows the main verb as the normal order except in those languages in which the nominal object always precedes the verb.
If, to start off with, we omit from consideration the exception given at the end of the universal, then it is clear that, from a purely formal viewpoint, there is no reason why expressions of volition and purpose should typically follow rather than precede the verb of the main clause. Functionally, however, there is an explanation, related to the concept of iconicity, whereby the form of an expression is brought into direct correspondence with some aspect of its meaning. For word order, this involves arranging the linear order of constituents to match the chronological order of the events they refer to. Since the content of a wish or purpose is necessarily subsequent to the expression of that wish or the event designed to bring about that purpose, from iconicity one would expect that the constituent encoding the wish or purpose would follow the constituent encoding the expression of that wish or the event designed to bring about that purpose, i.e. that in (4) want would precede leave and that in (5) came would precede see:4
(4) I want to leave.
(5) I came to see you.
The Greenberg universal presented in (3) is therefore an example of a universal for which we can provide a functional explanation, and for which there seems to be no formal explanation (though there is, of course, a formal statement — namely (3) itself — which, however, lacks explanatory power).

Before leaving this example, let us, however, consider the exception noted by Greenberg, namely that rigidly verb-final languages may have expressions of volition and purpose normally preceding the verb of the main clause.5 This illustrates an interesting interplay between functional and formal factors in language structure. Functionally, one would expect iconicity to override form, so that normal clause order would always reflect chronological order. Formally, however, in a language which is otherwise rigidly verb-final, it is simpler to have a rule that the verb of the main clause follows all constituents, irrespective of their semantics (and, in particular, irrespective of their time reference). The exceptions to Greenberg's universal, captured by the exception clause, demonstrate that any comprehensive account of explanations for
language universals must pay attention to both formal and functional factors. My reason for concentrating on functional factors in this paper is that this is the side of the coin that has typically been neglected in recent work.

Finally, before leaving universal (3), it should be emphasized that I am not claiming that, where a given functional motivation can be given, every sentence in every language necessarily corresponds with that explanation. Thus there are clearly many sentences in many languages — which moreover do not fall under the exception clause in universal (3) — where clause order does not correspond to the chronological order of the events described in the clauses. What I am claiming is that in such instances there is a statistically significant bias in the direction of adherence to the functionally explainable universal, whereas in purely formal terms one would expect adherence to and violation of the universal to be equally probable.
2. Control
The examples with which I will be concerned in the body of this paper for the most part involve the notion of control, more specifically constructions where a noun phrase that is absent from the surface structure of a sentence has to be interpreted as coreferential with some overtly expressed noun phrase in the sentence (or, in the case of imperatives, as coreferential with the addressee).6 Examples (6)-(8) illustrate the phenomenon under consideration:
(6) I want to leave.
(7) I persuaded the girl to leave.
(8) I promised the girl to leave.
In (6), the subject of leave is absent from surface structure, but is readily interpreted as being I. In (7), the missing subject is equally readily interpreted as being the girl. In (8), at least for most speakers of English — a point to which I return below — the missing subject is again I. I will refer to the overt coreferential noun phrase as the controller (or trigger), and to the missing noun phrase as the target (or victim).

If we look for the moment at examples (6) and (7), each of which represents a large class of sentences with other verbs behaving in precisely the same way as want and persuade, and temporarily omit from consideration type (8) (since only a small number of verbs behave like promise), then it is possible to give a formal characterization of the difference
between the two types. In type (6), the main verb has no object other than the infinitive; in type (7), the main verb has an object other than the infinitive; thus, if the main verb has an object other than the infinitive, the missing subject of the infinitive must be coreferential with the object of the main verb, while if there is no such object, the missing subject must be coreferential with the subject of the main verb.7 If we accept this formal explanation, however, or at least if we accept it as the whole story, then the type represented by (8) remains an exception. Although the number of main verbs possible in type (8) is small, it is suspicious that almost exactly this same set of verbs recurs as an exception in a large number of languages, suggesting that there is something more systematic about the unusual control properties of the verb promise and its close synonyms.

A way out of the dilemma is provided by the theory of speech acts: here I will use the classification of speech acts given by Searle (1976). The main verbs that pattern like persuade in (7) all express directives, i.e. 'attempts... by the speaker to get the hearer to do something'. The object of the main verb thus refers to the hearer; the agent (and therefore typically the subject) of the infinitive also refers to the hearer. Since these two noun phrases are typically, if not invariably, coreferential, languages which permit omission of the subject of the infinitive are simply permitting the omission of a noun phrase whose referent can (nearly) always be predicted on pragmatic grounds from the nature of the speech act described by the main verb.8 We can therefore provide a functional explanation for the control properties of persuade as a main-clause verb: the most usual situation (hearer and agent are coreferential) is expressed by the most succinct means (omission of one of two coreferential noun phrases).

This line of explanation also provides us, with no further mechanism, with an explanation for the 'exceptional' behavior of promise, a member of the class of verbs expressing commissives, i.e. speech acts 'whose point is to commit the speaker...to some future course of action' (sc. by the speaker). In other words, the speaker is subject of the main verb and is also agent (therefore typically subject) of the infinitive. Again, the formal syntactic properties of verbs in this class simply reflect the typical speech-act value of such verbs: one normally makes promises relative to one's own actions, and the control properties of promise and other commissives give a more concise expression for the normal state of affairs. The crucial point of the examples in this section is thus that an explanation relating to the pragmatic properties of directives and commissives as control verbs can provide an explanation for what, from a purely formal point of view, seems to be completely exceptional behavior on the part of the latter set of verbs.9
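The speech-act account lends itself to a small sketch in which a verb's control behavior falls out of its speech-act class rather than being stipulated verb by verb. The mini-lexicon and function below are invented illustrations of that idea, not a proposal about English grammar.

```python
# Control resolved from speech-act class: directives yield object control,
# commissives subject control. The mini-lexicon is invented.

SPEECH_ACT = {
    "persuade": "directive",    # the hearer is to act, cf. (7)
    "order":    "directive",
    "promise":  "commissive",   # the speaker commits himself, cf. (8)
}

def understood_subject(verb, subject, obj=None):
    """Return the NP interpreted as subject of the infinitive."""
    if obj is None:
        return subject                   # no object, as with want in (6)
    if SPEECH_ACT[verb] == "directive":
        return obj
    if SPEECH_ACT[verb] == "commissive":
        return subject

print(understood_subject("want", "I"))                  # I
print(understood_subject("persuade", "I", "the girl"))  # the girl
print(understood_subject("promise", "I", "the girl"))   # I
```

The design choice mirrors the argument: promise is not listed as an exception; its behavior follows from its classification as a commissive.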
3. Control and syntactic ergativity
So far, we have been considering the control properties of the controller in constructions with missing noun phrases. It is now time to bring the target noun phrase into the picture, in order to introduce the notion of syntactic ergativity, since sections 3 and 4 will be concerned with functional explanations for the distribution of syntactic ergativity versus syntactic accusativity.

In coordinate clauses in English, it is possible to omit the subject of the second clause (target) if it is coreferential with the subject of the first clause, so that (9) below can be interpreted as the conjunction of (10) and (11), but not as the conjunction of (10) and (12):
(9) The man hit the woman and came here.
(10) The man hit the woman.
(11) The man came here.
(12) The woman came here.
Note that the two clauses conjoined are, respectively, transitive and intransitive, and my statement above presupposed that the term 'subject' is equally applicable to the man in the transitive clause (10) and the intransitive clause (11). This does indeed seem justified for English, but there are other languages where this claim is at best questionable, and perhaps incorrect. It is useful therefore to have a more neutral terminology, which avoids begging the question of which noun phrase is subject in the transitive construction. In the intransitive construction, the sole argument will be referred to as S. In the transitive construction, we start from a set of canonical (prototypical) transitive constructions, referring to actions where an agent acts upon a patient, and use A for the agent in such a construction and P for the patient. The A/P terminology can be extended, however, to other transitive constructions with the same syntactic behavior, but where the participants are not, strictly, semantic agent and patient. Thus, just as the man is A in (10) and the woman is P, we can make the same assignments in (13):
The man saw the woman.
In English, then, we can say that, in conjunction reduction, the controller must be either S or A and likewise the target must be either S or A (or, equivalently, they must both be subject, where 'subject' for English means S or A). This can be contrasted with the behavior of conjunction reduction in Dyirbal, an Aboriginal language of Australia described by Dixon (1972).
In Dyirbal, sentence (14) is interpreted as the conjunction of (15) and (17), not as the conjunction of (15) and (16):
(14) Jugumbil yaranggu balgan, baninyu.
     woman-ABSOLUTIVE man-ERGATIVE hit came-here
     'The man hit the woman, and she came here.'
(15) Jugumbil yaranggu balgan.
     woman-ABSOLUTIVE man-ERGATIVE hit
     'The man hit the woman.'
(16) Yara baninyu.
     man-ABSOLUTIVE came-here
     'The man came here.'
(17) Jugumbil baninyu.
     woman-ABSOLUTIVE came-here
     'The woman came here.'
In Dyirbal, it turns out that the controller for conjunction reduction must be either S or P, and likewise the target must be either S or P (equivalently, both must be subject, where subject in Dyirbal means either S or P).¹⁰ Where S and A are treated alike in contrast to P, we refer to nominative-accusative (or simply: accusative) syntax. Where S and P are treated alike in contrast to A, we refer to ergative-absolutive (or simply: ergative) syntax. With conjunction reduction, there seems to be no functional explanation for preference of one kind of syntax over the other, and we even find languages like Chukchee where sentences with omitted noun phrases in the second conjunct can be given either the accusative syntactic or the ergative syntactic interpretation (Nedjalkov 1979):¹¹
(18) Ətləge ekək talayvənen ənkʔam ekvetgʔi.
     father-ERGATIVE son-ABSOLUTIVE hit and left
     'The father hit the son and the father/the son left.'
(19) Ətləge ekək talayvənen.
     father-ERGATIVE son-ABSOLUTIVE hit
     'The father hit the son.'
(20) Ətləgən ekvetgʔi.
     father-ABSOLUTIVE left
     'The father left.'
(21) Ekək ekvetgʔi.
     son-ABSOLUTIVE left
     'The son left.'
Let us therefore turn to some constructions where there is a pragmatic
expectation of cross-linguistic bias in favor of one particular kind of syntax, in fact in favor of accusative syntax. In imperative constructions in many languages, the noun phrase referring to the addressee may (or must) be omitted. In English, for instance, second person subjects of imperatives are usually omitted:
(22) (You) come here!
(23) (You) hit the man!
As the sequence of intransitive and transitive examples shows, in English either an S or an A may be omitted in this way; a P may not be omitted (though see note 14 on the passive imperative):
(24) *Let/may the man hit! (sc. you)
Since English has accusative syntax even where there is no functional reason to expect accusative syntax, one might think that this is simply another manifestation of the language's formal accusative syntax. However, in Dyirbal, with ergative syntax where there is no functional explanation, we find precisely the same property in imperatives: a second person S or A can be omitted, but not a second person P:¹²
(25) (Nginda) bani!
     you-NOMINATIVE come-here-IMPERATIVE
     'Come here!'
(26) (Nginda) yara balga!
     you-NOMINATIVE man-ABSOLUTIVE hit-IMPERATIVE
     'Hit the man!'
Thus in Dyirbal the accusative syntax of imperative addressee deletion goes against the general ergative nature of Dyirbal syntax. Once again, a functional explanation can be given for the apparently exceptional nature of Dyirbal imperative addressee deletion, making use of Searle's classification of speech acts and following essentially the same lines as Dixon (1979: 112-114). An imperative expresses a directive, i.e. an instruction is given to the addressee (hearer) for that addressee to carry out a certain course of action. The only felicitous directives are those where the speaker believes that the addressee has the capability of carrying out the course of action. In general, agents are most likely to have this capability; subjects of some intransitive verbs (e.g. come-here, but not be-tall) are likely to have this capability; patients of transitive verbs are unlikely to have this capability.¹³ Given the high correlation between A and agent, and between P and patient, all that Dyirbal does here is to provide a more succinct expression for the more usual state of
affairs, where the addressee is the participant with greater capability with regard to completion of the action required by the speaker.¹⁴
Given this discussion of imperatives, one would expect the same generalization to carry over to indirect commands, by which I mean reports of directives in indirect speech, such as (27)-(28) in English:
(27) I told the girl to come here.
(28) I told the girl to hit the man.
This is, indeed, the case in English, where the target for noun-phrase omission is either S or A of the infinitive clause.¹⁵ I am aware of at least one language, however, where this generalization does not hold for indirect commands, namely Dyirbal. In Dyirbal, the S of the subordinate verb form can be omitted in sentences parallel to (27):
(29) Ngana yabu gigan banagaygu.
     we-NOMINATIVE mother-ABSOLUTIVE told return-INFINITIVE
     'We told mother to return.'
Likewise, the P of the subordinate verb form can be omitted in sentences like (30):¹⁶
(30) Ngana yabu gigan gubinggu mawali.
     we-NOMINATIVE mother-ABSOLUTIVE told doctor-ERGATIVE examine-INFINITIVE
     'We told mother to be examined by the doctor.'
In Dyirbal, however, it is impossible to omit the A in sentences parallel to (28), i.e. Dyirbal here evinces its usual syntactic pattern of syntactic ergativity.¹⁷
This last point brings us back to one of the points mentioned with regard to Greenberg's universal (3) in Section 1, namely that language reflects an interplay between functional and formal factors. Functional factors would lead us to expect accusative syntax in both imperatives and indirect commands, and Dyirbal does indeed provide us with evidence here in imperatives. Formal properties of Dyirbal syntax, however, would lead us to expect ergative syntax everywhere, and the syntax of indirect commands provides us with evidence for this formal generalization. What would be a strong counterexample to the functional explanation would be some language where there is no formal reason to expect ergative syntax in indirect commands, but where nonetheless one finds ergative syntax there. As far as I know, no such language exists.
4. Resultative constructions
In this final section of the body of the paper, I want to turn to the syntax of resultative constructions. A resultative construction describes a certain state which holds as the result of some preceding event. In general, the resulting state is predicated of one of the participants in that preceding event. The question that will concern us here is, which participant? From a functional viewpoint, we might formulate certain expectations, on the basis of which participant is most likely to undergo a change of state as the result of some event. If the event is encoded as a one-place predicate, with only an S, then the change of state will be predicated of that S, as in (31):
(31) The glass fell.
With two-place predicates, however, it is usually the case that the P, rather than the A, necessarily undergoes a change of state, as in (32)-(33):
(32) The boy smashed the glass.
(33) The boy killed the lizard.
While the boy may have undergone some change of state in each of these examples (e.g. by starting to suffer pangs of conscience, by having cut himself on the glass, by having covered himself with lizard's blood), it is not necessary that he should have done so; whereas the glass is inevitably in a new state (it was whole, now it is broken), likewise the lizard (it was alive, now it is dead).¹⁸ Thus our expectation is that, from a functional viewpoint, in resultative constructions S and P should behave alike, and differently from A, in terms of the attribution of change of state to one of the participants. Data from two languages, Nivkh (Gilyak) and (Modern Eastern) Armenian, are presented in support of this expectation.¹⁹
In Nivkh, the basic word order is subject-object-verb, and there is in general no evidence of syntactic ergativity. Direct objects immediately preceding their verb have the interesting property that, in certain cases, they trigger morphophonemic change in the initial consonant of that verb. Thus the transitive verb rad' 'roast' changes its initial consonant to tʰ when immediately preceded by the direct object t'us 'meat' in (35):
(34) Anaq yod'.²⁰
     iron rusted
     'The iron rusted.'
(35) Umgu t'us tʰad'.
     woman meat roasted
     'The woman roasted the meat.'
The resultative in Nivkh is expressed by adding a suffix -yəta. Where the verb is intransitive, there is no change in the syntactic structure of the sentence:
(36) Anaq yoyətad'.
     iron rusted-RESULTATIVE
     'The iron has rusted.'
Where, however, the verb is transitive, a number of syntactic changes take place. It is no longer possible to specify the A. The original P, though immediately preceding the verb, cannot trigger initial consonant change, i.e. it is no longer direct object, but functions rather as subject of the construction:²¹
(37) T'us jayətad'.
     meat roasted-RESULTATIVE
     'The meat has been roasted.'
It should be noted emphatically that (37) means 'the meat has been roasted', and not 'the meat has (perhaps spontaneously) roasted'. Nivkh distinguishes rigidly between transitive and intransitive verbs, and rad' is unequivocally a transitive verb. The syntactic generalization is easy to state formally: the subject of the resultative may be either the S or the P, but not the A, of the corresponding nonresultative, i.e. this is an instance of syntactic ergativity. However, the formal statement provides no hint of why we should find syntactic ergativity here in a language which otherwise provides little or no evidence of syntactic ergativity. Functionally, however, the Nivkh construction makes perfect sense: resultatives describe changes of state that are attributable primarily to S or P, rather than to A, and Nivkh simply capitalizes on this generalization to provide a formal means of attributing changes of state to P without any overt change in voice comparable to the English passive. (Note that in the English translation of [37], use had to be made of the passive.)
The resultative in Modern Eastern Armenian provides a similar picture, although here there is usually overt indication of passive voice, by means of the suffix -v. Sentences (38) and (39) are nonresultative:
(38) Ašotə kʰənel e.
     Ashot fall-asleep-PERFECT is
     'Ashot has fallen asleep.'
(39) Ašotə namakə gərel e.
     Ashot letter-the write-PERFECT is
     'Ashot has written the letter.'
In the resultative corresponding to (38), there is no change in syntactic structure, or in voice:
(40) Ašotə kʰənac e.
     Ashot fall-asleep-RESULTATIVE is
     'Ashot is in a state of having fallen asleep, is asleep.'
In the resultative corresponding to (39), however, there are several changes. The verb is obligatorily passivized, by addition of the suffix -v.²² The P appears as surface-structure subject, while the A is usually omitted, although it can be expressed as a passive agent, with the postposition koymicʰ (which takes the genitive-dative case):
(41) Namakə gərvac e (Ašoti koymicʰ).
     letter-the write-PASSIVE-RESULTATIVE is Ashot-GENITIVE by
     'The letter is in a state of having been written (i.e. is ready) (by Ashot).'
Formally, the Armenian data are bewildering: why should the passive, otherwise productive and optional for transitive verbs, suddenly become obligatory for transitive verbs in the resultative construction? Once one invokes the functional explanation for resultative constructions, however, these data are no longer surprising, but simply fall into place alongside those from Nivkh.
In Modern Eastern Armenian, a small number of transitive verbs have resultatives that do not undergo this change of voice, or at least do not undergo it obligatorily. For instance, the resultative participle of xəmel 'drink', in the active voice, is xəmac, which means 'drunk', in the sense 'inebriated' (whereas 'in a state of having been drunk' would be xəmvac); likewise the resultative participle of kardal 'read', in the active voice, is kardacʰac, which means 'well-read' (i.e. of a person who has educated himself by reading a lot). Perhaps surprisingly, these seemingly idiosyncratic exceptions actually provide evidence in favor of our analysis. First, note that these examples are lexicalized: xəmac means not just 'having drunk', but rather 'having drunk sufficient to become inebriated'; kardacʰac means not just 'having read', but rather 'having read sufficient to become educated' — so that, in the worst analysis, we could dismiss them as isolated lexical exceptions. However, we can go beyond this, because the way in which they are lexicalized is such as to attribute a change of state, atypically, to the A rather than to the P of an event: a person who is inebriated is one who has undergone a change of state by drinking; a person who is well-read is one who has undergone a change of state by reading. The pragmatic generalization therefore remains: the only felicitous resultatives are those which attribute a change of state to their
subject. Modern Eastern Armenian is not alone in having these lexical exceptions. In English, drunk would be, regularly, a past passive participle ('having been drunk'), and can be used in this way, but has also, exceptionally, taken on the idiosyncratic lexicalized active interpretation; likewise, well-read would be regularly, and can be used as, a past passive participle ('having been read well'), in addition to the idiosyncratic lexicalized interpretation 'having read widely'. Once one recognizes the functional basis of resultatives, even such exceptions are less surprising than from a purely formal point of view.
5. Conclusions
In this paper, I have tried to present some cases where functional explanations can be given for cross-linguistic regularities, where formal statements provide no explanation for the observed regularity. The various cases discussed are all linked by one factor in common: they are all instances where a given construction is exceptional in terms of the usual formal behavior of grammatical relations in the language in question, but where a pragmatic explanation can provide insight into the motivation behind that formal idiosyncrasy. Although I believe that the paper does report some solid results, I would emphasize that it outlines the very beginning of a research project, rather than anything like a completed line of investigation. In this final paragraph, I will outline some of the avenues that remain to be investigated more fully, and which will, I hope, link my paper to some of the others presented at the workshop. First, it is necessary to establish more clearly the domain of functional versus formal explanations/statements. Throughout, I have been at pains to note that, although functional explanations do have their domain, there are still many areas where no viable functional explanation is forthcoming, and where formal statements therefore still rule; moreover, many aspects of language can only be appreciated in terms of the interaction of formal and functional factors. Second, it is necessary to relate functional explanations to the whole issue of innateness, which has been so crucial in the development of formal explanations in linguistics. For instance, it is not excluded that functional principles might be innate (although I see little evidence for or against this — see, however, note 9 for a suggestion that some functionally motivated universals might involve principles acquired late in first-language acquisition); this raises the interesting question of how innate ideas turn out to be 'correct' (more accurately, functionally valuable) ideas — as the result of selectional pressure in evolution? And if such principles are not innate,
then an even broader perspective of investigation of their acquisition opens before us.

Department of Linguistics
University of Southern California
Los Angeles, California 90089
USA
Notes
* Versions of this paper were presented to audiences at the Arizona State University, Tempe; the Fifth Inter-American Institute of Linguistics, Campinas, Brazil; the University of California, Los Angeles; the University of Southern California, Los Angeles; and the University of Utah, Salt Lake City, in addition to the Linguistics Workshop on Explanations for Language Universals, Cascais, Portugal. I am grateful to all those who offered comments after my presentations at these meetings.
1. More accurately, one should say that transformations are almost without exception structure-dependent, since some rules in some languages violate the strict characterization of structure-dependence. For instance, many languages have a rule placing clitics in sentence-second position, irrespective of the nature of the first word, which may be a major constituent (e.g. the subject or main verb), but may equally be just part of a major constituent (e.g. a noun within a larger noun phrase, or an adjective within a noun phrase). For further discussion and examples, see, for instance, Comrie (1981b: 21-22). These exceptions are, however, very restricted: the generalization that transformations are structure-dependent still holds as the unmarked case.
2. The formulations given here are relatively informal, but could easily be translated into most of the major current frameworks, including transformational-generative grammar and relational grammar.
3. Indeed, the easiest way to recite backwards is to avoid the problem of sequential presentation by writing the sequence down (so that everything is presented simultaneously) and then reading from the end.
4. Some, but not all, speakers of English accept sentences like (i) below, where the purpose is apparently chronologically anterior to the situation designed to bring that purpose about:
(i) I will destroy your creation, so that you might have worked in vain.
Here, 'your working' clearly precedes 'my destruction of your creation'. Note, however, that while your working is indeed anterior to my action, the fact that you worked in vain can only be established subsequent to my destroying your creation, i.e. the evaluation of the truth of 'you worked in vain' is subsequent to my destroying your creation.
5. Note that the exception clause allows verb-final languages to have the exceptional word order, but does not require them to have it.
6. Use of the term control to refer to the interpretation of zero-subjects of imperatives, e.g. (you) come here, is an extension of the usual meaning of control, but a natural one. However, nothing hinges crucially on the term, so that those working in frameworks that eschew this extension may regard this as a convenient, even if misleading, terminology. Throughout, I remain neutral as to whether the noun phrase
absent from surface structure is present in underlying structure with its referent specified, is present in underlying structure without a specified referent and has its reference assigned interpretatively, or is absent at all levels of syntactic derivation and is present only in terms of semantic interpretation.
7. This condition can be stated more rigorously; see, for instance, the classic formulation within the standard theory of transformational grammar by Rosenbaum (1967: 6).
8. In the sentence, the adverb typically and the reservation nearly always are necessary because it is possible to issue a directive not to the person who is to carry it out, but rather to someone else who has control over that person, as in (i) below. Likewise, one can promise an action by some other person, provided that other person is under the speaker's control, as in (ii):
(i) The teacher told Mr. Smith that his son should study harder.
(ii) Mr. Smith promised that his son would study harder.
9. Note that I have left open the question of whether there is any validity to the formal account, under which promise would simply be an exception. Chomsky (1969) notes that young children frequently misinterpret sentences like (8), taking the main clause object rather than the subject as missing subject of the infinitive. My own observations suggest that some adults make the same judgement. If it is indeed the case, however, as argued by Chomsky, that there is a significant difference here between child and adult language, this would provide a fascinating area for further speculation and investigation: is it, for instance, the case that young children are typically guided more by formal than by functional generalizations? If so, this would be strong evidence for an innatist approach to first-language acquisition, although it would not affect the viability of functional explanations for universals of adult language. See further section 5.
10. In examples (14)-(17) it might seem that the ergative syntax simply mirrors the ergative morphology, i.e. that both controller and target must be noun phrases in the absolutive case. However, Dixon demonstrates that this is not the correct analysis: personal pronouns have nominative-accusative morphology (i.e. one form for S and A, another form for P), but still under conjunction reduction a pronoun controller or target must be S or P, and cannot be A.
11. Since language overall seems to have a bias toward the identification of subject and agent, there is, however, a bias in favor of accusative rather than ergative syntax even with conjunction reduction, though this is independent of the factors considered here. Note that in conjunction reduction it is necessary that the same topic be carried along in both clauses, so that in a language like Chukchee with potentially ambiguous instances of conjunction reduction, where one noun phrase is clearly topicalized (e.g. by fronting of a nonsubject), this noun phrase is interpreted as controller, as noted by Nedjalkov:
(i) Ekək ətləge talayvənen, ekvetgʔi.
    son-ABSOLUTIVE father-ERGATIVE hit left
    'The son, the father hit, and the son left.'
12. In fact, in Dyirbal only imperatives with second person S or A are well formed.
13. Elsewhere, rather than 'capability' I have used the term 'control'. In the present paper, I avoid the potential confusion between this sense of 'control' and the syntactic use of the same term in talking about controller and target.
In fact, in Dyirbal only imperatives with second person S or A are well formed. Elsewhere, rather than 'capability' I have used the term 'control'. In the present paper, I avoid the potential confusion between this sense of 'control' and the syntactic use of the same term in talking about controller and target. Passive imperatives in English, where the addressee is a Ρ underlyingly rather than an A, might seem to go against the functional explanation. However, passive imperatives are rarely given an imperative interpretation, but function rather as conditional apodoses, as in (i) below:
102
Β. Comrie (i)
15.
Come to the Ruritanian State Circus and be amazed by the world's greatest liontamer!
The fact that a special interpretation is given is actually evidence in favor of the functional explanation: where an imperative form violates the functional explanation, it cannot be interpreted as an imperative. Again, passive infinitives might seem to give an exception, since the noun phrase omitted is an underlying P, as in (i): (i)
I persuaded the girl to be examined (by the doctor).
However, again, a special interpretation is required, namely that the girl had the capability of allowing herself to be examined. Thus it is not simply the case that English allows any surface-structure subject to be deleted in such constructions, since this deletion is only permitted where such an interpretation of capability is allowed. It should be noted that some other major European languages, such as German and Russian, do not permit literal translations of (i), but require overt expression of this capability, as in German sentence (ii): (ii) 16.
17.
This example was collected for me personally by R. M. W. Dixon, to whom I am very grateful for his trouble — and also for the opportunity of presenting a Dyirbal example that is not from Dixon's own published work! Sentences like (28) can be expressed in Dyirbal by using the antipassive voice, by which the underlying Ρ is presented in surface syntax as a nondirect object and the underlying A as an S, as in (i): (i)
18.
Ich überzeugte das Mädchen, sich untersuchen zu lassen. Ί persuaded the girl to have herself examined.'
Ngana yabu gigan ngumagu we-NOMINATIVE mother-ABSOLUTIVE told father-DATIVE buralngaygu. watch-ANTIPASSIVE-INFINITIVE 'We told mother to watch father.'
This is parallel to the use of the passive in English in example (i) of note 15. But note that in this respect, Dyirbal does not make use of the most succinct expression for the more normal state of affairs: Dixon notes that while sentences like (29) and (i) are frequent in text, he has no textual attestations for examples like (30). Indirect commands ('jussives') are discussed for Dyirbal by Dixon (1979: 114-115, 128-129). There are some transitive predicates where the change of state is attributable to the A rather than to the P, as in (i): (i)
The mountaineer reached the summit of the mountain.
However, these are less typical than those discussed in the text: note, for instance, that the verbs in question do not refer to actions involving a semantic agent and patient.
19. Nivkh is a language isolate spoken at the mouth of the River Amur and on Sakhalin island; data are from Nedjalkov et al. (1974). Modern Eastern Armenian is the variety of Armenian spoken in Soviet Armenia and in Iran; data are from Kozinceva (1974). For a somewhat fuller discussion of my interpretation, see Comrie (1981a).
20. All Nivkh verb forms in these examples end in -d', which indicates a finite verb; tense is otherwise unmarked in these verb forms for nonfuture. Noun phrases functioning as subject and direct or indirect object are not case-marked.
21. Note, incidentally, that it is not the case that initial consonant mutation is inconsistent with the resultative form. Some verbs take two objects, and when one of them becomes subject in the resultative construction, the other remains as direct object and triggers the mutation:
(i) If kʰuva nux tʰəd'.
    she thread needle inserted
    'She inserted the thread into the needle.'
(ii) Kʰuva nux tʰəyətad'.
     thread needle inserted-RESULTATIVE
     'The thread has been inserted into the needle.'
The citation form of the verb 'insert' is fed'.
22. Kozinceva notes that there are some instances where the resultative verb form does not take the passive suffix -v, but is still constructed syntactically as a passive and so interpreted. It is the syntactic construction and the interpretation that are crucial to my use of these examples. Indeed, the fact of passive interpretation and syntax even in the absence of passive morphology is a further piece of evidence in favor of my analysis. Note, incidentally, that the resultative in Modern Eastern Armenian has specifically the meaning of a state resulting from a preceding event; the perfect may express this concept, but has a much wider range of interpretations.
References
Chomsky, Carol (1969). The Acquisition of Syntax in Children from 5 to 10. Research Monograph No. 57. Cambridge, Mass.: MIT Press.
Comrie, Bernard (1981a). Aspect and voice: some reflections on perfect and passive. In Philip J. Tedeschi and Annie Zaenen (eds), Tense and Aspect, vol. 14 of Syntax and Semantics, 65-78. New York: Academic Press.
—(1981b). Language Universals and Linguistic Typology: Syntax and Morphology. Oxford: Basil Blackwell; Chicago: University of Chicago Press.
Dixon, R. M. W. (1972). The Dyirbal Language of North Queensland. Cambridge Studies in Linguistics 9. Cambridge: Cambridge University Press.
—(1979). Ergativity. Language 55, 59-138.
Greenberg, Joseph H. (1966). Some universals of grammar with particular reference to the order of meaningful elements. In Joseph H. Greenberg (ed.), Universals of Language, second edition, 73-113. Cambridge, Mass.: MIT Press.
Kozinceva, N. A. (1974). Zalogi v armjanskom jazyke. In A. A. Xolodovič (ed.), Tipologija passivnyx konstrukcij: diatezy i zalogi, 73-90. Leningrad: Izd-vo 'Nauka'.
Nedjalkov, V. P. (1979). Degrees of ergativity in Chukchee. In Frans Plank (ed.), Ergativity: Towards a Theory of Grammatical Relations, 241-262. London: Academic Press.
—, Otaina, G. A., and Xolodovič, A. A. (1974). Diatezy i zalogi v nivxskom jazyke. In A. A. Xolodovič (ed.), Tipologija passivnyx konstrukcij: diatezy i zalogi, 232-251. Leningrad: Izd-vo 'Nauka'.
Rosenbaum, Peter S. (1967). The Grammar of English Predicate Complement Constructions. Research Monograph No. 47. Cambridge, Mass.: MIT Press.
Searle, John R. (1976). A classification of illocutionary acts. Language in Society 5, 1-23.
Xolodovič, A. A. (ed.) (1974). Tipologija passivnyx konstrukcij: diatezy i zalogi. Leningrad: Izd-vo 'Nauka'.
Temporal distance: remoteness distinctions in tense-aspect systems

ÖSTEN DAHL
Mais parce que dans le passé, on peut marquer que la chose ne vient que d'estre faite, ou indéfiniment qu'elle a esté faite: De là il est arrivé que dans la pluspart des Langues vulgaires, il y a deux sortes de preterit; l'vn qui marque la chose précisément faite, & que pour cela on nomme définy, comme, j'ay écrit, j'ay dit, j'ay fait, j'ay disné; & l'autre qui la marque indéterminément faite, & que pour cela on nomme indéfiny, ou aoriste; comme j'écrivis, je fis, j'allay, je disnay, &c. Ce qui ne se dit proprement que d'vn temps qui soit au moins éloigné d'vn jour de celuy auquel nous parlons. Car on dit bien par exemple; j'écrivis hier, mais non pas, j'écrivis ce matin, ni j'écrivis cette nuit; au lieu dequoy il faut dire, j'ay écrit ce matin, j'ay écrit cette nuit, &c. Nostre Langue est si exacte dans la propriété des expressions, qu'elle ne souffre aucune exception en cecy, quoy que les Espagnols & les Italiens confondent quelquefois ces deux preterits, les prenant l'vn pour l'autre.
Arnauld and Lancelot 1667: 108-109
Introduction

The above quotation, in which it is claimed that the categories passé simple and passé composé in 17th century French differ in that the former can only be used of events that took place more than one day ago, may sound a bit strange from the point of view of the modern language, in which no such distinction can be found, and one might even suspect that it is a result of the same mentality which gave rise to the French classicist rule that the action of a drama should not encompass more than 24 hours. A priori, one would not expect the semantics of inflectional categories to
be dependent on an exact time measure. However, it turns out that the rule postulated by Arnauld and Lancelot is far from unique as a rule of tense choice in human languages; on the contrary, it is fairly characteristic of the ways in which the parameter 'temporal distance' is reflected in tense-aspect systems. In this paper, we shall have a closer look at this parameter.
The paper builds on work carried out within the research project 'Universal grammar and language typology' at the Departments of Linguistics of the Universities of Göteborg and Stockholm, which was funded by the Swedish Research Council for the Human and Social Sciences. In spite of its all-encompassing name, this project was almost entirely devoted to the study of tense-mood-aspect (TMA) systems. I have chosen 'temporal distance' as an example of a notional category the realizations of which have not been extensively studied before but which can illustrate the kind of generalizations that it is possible to make about TMA systems.
Within the project, data were collected basically in two ways: (i) by consulting extant descriptions of different languages; (ii) by the 'TMA Questionnaire', which contained a number of sentences and short connected texts in English together with indications of the contexts the sentences or texts were assumed to be uttered in. The questionnaire has been completed by native speakers and analyzed by us for about 60 languages from at least 15 unrelated language families and covering all major continents of the world. For easier access, the analyses of the questionnaires have been put on a computer.
The consultation of extant language descriptions has not in general been performed in a systematic fashion, except for the 60 languages mentioned above. The reason is that grammars vary to such an extent with regard to reliability, exhaustiveness, and theoretical approaches as to be virtually incommensurable. However, since I was especially interested in the parameter 'temporal distance' and it seemed to be particularly well reflected in languages of the Bantu group, we decided to make a special study of the TMA systems of the languages in that group, based on extant grammars. This investigation, which covers about 75 languages, was carried out by Maria Kopchevskaya and will be reported in Kopchevskaya (forthcoming).
Pierre Javanaud, who acted as our informant for Limouzi (a dialect of Occitan, a Romance language spoken in the South of France), also wrote a paper on the TMA system of his native language (Javanaud 1979). It was mainly this paper that drew my attention to the possible importance of temporal distance as a semantic parameter underlying tense systems.
Before going further, we shall try to make the terminology somewhat more precise. 'Temporal distance' involves, by definition, a measurement of the distance between two points or intervals in time; this implies that for the parameter to be relevant, at least two such time points should be involved in the interpretation of a sentence. We shall use the terminology introduced by Reichenbach (1947), since it is by now fairly well known and suitable for our purposes. According to Reichenbach, there are three points in time that may be relevant for tense: the speech time (S); the event time (E), i.e. the time at which the event talked about is supposed to take place; and in addition the reference time (R), a point in time relative to which E can be defined. In the unmarked case, R coincides with S or E. In those cases, which constitute the overwhelming majority in any text, the only possible 'temporal distance' will be between S and E, that is, 'distant' will mean 'distant from the time of speech'. If R is separate, however, we have two intervals to measure: on one hand, the distance S-R, on the other, the distance R-E. In principle, both these might be relevant in a tense system. The tendency, however, seems rather to be for remoteness distinctions to be neutralized in such contexts; many languages do not even have a separate category which, like the English pluperfect, is used for events that take place before an R which in its turn precedes S. I also have relatively little information concerning these cases — being conceptually more complex, they are rather hard to elicit reliable information about — and shall just note one fairly clear example of a minimal pair differing in the distance between a past R and a preceding E. Morolong (1978) quotes the following Sotho sentences, saying that the (a) sentence with the ne + tsoa form 'is felt to be nearer to the reference point than is the case with ne + stem + ile sentence in' (b):
(1) (a) Ha letsatsi le-likela re-ne re-tsoa tloha Maseru
        'At sunset we had just left Maseru.'
    (b) Ha letsatsi le-likela re-ne re-tloh-ile Maseru
        'At sunset we had left Maseru.'
The discussion in the following will mainly concern distinctions connected with the S-E distance. The E point may be both in the past and in the future: accordingly, languages make distinctions both between 'remote' and 'close' pasts and between 'remote' and 'close' future tenses. However, in general, the distinctions in the past appear to be better developed — that is, more numerous and more well defined — than those in the future; as Ultan (1978) notes, this may be explained by the general tendency for future tenses to be more marked than past tenses. Whatever the facts may be, my material is more
extensive on distinctions in the past than in the future, and the bulk of the ensuing discussion will concern the past.
So far, I have mainly used the term 'temporal distance' as a label for the problems that interest me in this paper. Since this term is a bit unwieldy, I shall often use 'remoteness distinction' to denote the grammatical categories that are used to mark how far time points are from each other.
Examples of remoteness systems

Remoteness distinctions can be found in languages from most parts of the world and a large number of unrelated genetic groups, although they are more salient in some, such as the Bantu languages. It is possible that temporal distance is at least marginally relevant for TMA categories in the majority of human languages; at present, such a hypothesis is impossible to evaluate due to lack of reliable data. I shall now give a couple of examples of typical remoteness systems.
A relatively representative example of a well-developed Bantu system of past tenses is found in Kamba (Whiteley and Muli 1962), which distinguishes between three degrees of remoteness in non-narrative contexts: (i) an 'immediate' past, 'which refers to an action taking place earlier on the day of speaking': ningootie 'I pulled (this morning)'; (ii) a 'recent' past 'which refers to an action taking place on the day prior to the day of speaking, or even to a week previously': 'I pulled (e.g. yesterday)'; (iii) a 'far past' tense, 'which occurs for actions having taken place ... not earlier than some months past': Akamba maia.tua vaa tene 'The Kamba did not live here in the past'. In narrative contexts, there is a vaguer distinction between two tenses, Narrative I, e.g. na.tata 'I tried', and Narrative II, e.g. nina.tata 'the same', the latter of which 'connotes a rather less remote time in the past'. In addition, there is a perfect, e.g. ninakoota 'I have pulled', 'which may be translated by "has" or "has just", so that it may have both an immediate or — less commonly — a general perfect connotation'.
As for reference to the future, Whiteley and Muli claim that the same tense, called 'present continuous', e.g. nunu(k)ukoota 'he is pulling, about to pull', is used for referring to the present and to events that will take place within the next 24 hours. There are, in addition, two proper future tenses, one simply called 'future', e.g. aka.koota 'he shall pull', which is 'for events occurring subsequent to the time of speaking up to a
period of some months ago' (41) (it is not quite clear if this is supposed to exclude 'today': in another place the future tense is defined as 'from 24 hours beyond the time of speaking' [Whiteley and Muli n.d.: 49]) and a 'far future', e.g. nitukaatata 'we shall try', 'used for actions taking place at some point after a few months, though it is clear that there is some looseness in this'.
For an example of a system from another genetic group and another part of the world we may quote Derbyshire's description of Hixkaryana, a Carib language (1979). According to Derbyshire, Hixkaryana has — like Kamba — three degrees of remoteness in the past: 'immediate past', e.g. kahatakano 'I came out', which 'refers to actions done the same day or the previous night'; 'recent past', e.g. ninikyako 'he went to sleep', which 'refers to actions done on the previous day or any time earlier up to a period of a few months (this is the norm, but it is relative to the total situation, and sometimes an event of only a few weeks ago will be expressed with the distant past suffix)'; and 'distant past', e.g. wamaye 'I told it', which 'refers to actions done any time earlier'. (Hixkaryana has no future tenses.) As we can see, the similarity between the ways in which the two languages 'cut up' the past is striking.
A much more complicated system is found in Kiksht (Hymes 1975). Two sets of morphemes, one with four members and one with two, interact to yield at least seven (possibly more) past tenses. In addition to the distinction 'today : before today', Kiksht has a distinction 'this year : before this year' ('seasonal round' according to Hymes) and a possibility in each 'slot' to have a finer gradation between 'near' and 'far'. Thus, in the 'before this year' range one may, according to Hymes, distinguish between what is within the realm of personal experience and what belongs to the 'age of myth'. Within 'this year' one can distinguish 'not more than a week ago' (or some equivalent time measure) from 'more than a week ago'. In the future, there is simply a distinction between 'near' and 'far', although no indication of what that means is given. Until evidence to the contrary appears, it seems fairly safe to regard Kiksht as an example of maximal differentiation in remoteness systems, at least with regard to the past.
Objective and subjective judgments of temporal distance

When someone assesses the distance of an event from the present time-point, his judgment may of course be influenced by a number of more or less conscious factors. We can range judgments of temporal distance on a
'subjectivity scale', where the zero point is a measurement which is made purely in physically definable terms. Most (if not all) cultures employ some kind of physically defined time measures, usually based on the observable reflections of the movements of the earth, the sun and the moon. Day, month, and year are all definable as 'one cycle' in different cyclical astronomical processes. Terms like hour, minute, and second, which are defined as subdivisions of days, depend on the existence of reasonably reliable instruments for the measurement of time. Adverbial expressions which denote time-points or intervals defined in terms of these measures are of course extremely common in English and probably most other languages, e.g. yesterday, four months ago. A sentence such as the following would then exemplify an objective time measure:
(2) I arrived here exactly two years ago.
As a contrast, the time referred to in the following sentence could — probably depending on the circumstances and the mood of the speaker — vary between, say, ten minutes and 60 years:
(3) I've been here for an awfully long time already!
As we have already seen, objective time measures do play an important role in determining the choice between different tenses in various languages. However, it appears that there are differences between languages as to how important they are. In general, there seems to be some possibility for the speaker to treat something as close even if it is objectively remote and vice versa, that is, there is a possibility to give weight to subjective factors. In some languages, however, a 'contradictory combination' of, say, a 'today' tense with a time adverbial meaning 'last year' results in an ungrammatical sentence. This appears to be the case for instance in Kom (a Bantoid language; Chia 1976). In other languages, e.g. Sotho (a Southern Bantu language), it seems in general possible to combine any time adverbials with any tense (Morolong 1978), as in the following sentence, where a recent past is used:
(4) Morena Moshoeshoe ofalletse Thaba Bosiu ka-1824
    'Chief Moshoeshoe moved to Thaba-Bosiu in 1824.'
In other words, it appears to be possible to distinguish between those languages that give more weight to objective factors and those which leave more room for subjective factors in judgments of temporal distance.
It is sometimes possible to identify the factors that may influence what we have referred to as subjective judgments of temporal distance. Such factors may be for instance spatial distance or personal involvement. Consider the following two Limouzi sentences from Javanaud (1979):
(5) I m'an letsa quant j'ai paia quo qu'i devio
(6) I me latseren quant i'agui paia quo qu'i devio
    'They released me when I had paid what I owed.'
The verbs in the main clauses of (5) and (6) are in the passé composé and the passé simple, respectively. According to Javanaud, (5) is appropriate if 'we are still at the same place'. Similar intuitions were elicited for the distinction between hodiernal and hesternal pasts in Sotho. Intuitively, it is not too hard to accept that distance in time and space will not always be differentiated in people's minds. In a parallel way, events which you have witnessed yourself or which concern you as a person in a direct way might be felt as being 'closer' in a general way and thus be more likely to be reported in a nonremote past tense. (Notice that spatial distance may be measured either relative to the point where the speech act takes place or relative to where the speaker or some other protagonist of the conversation was situated when the event took place. In the second case, there is a clear connection between what is spatially close and what is witnessed by the speaker.)
Colarusso (1979) discusses the use of certain prefixes in some Northwest Caucasian languages, which he takes to mark what he calls 'horizon of interest'. A sentence's main verb is marked for 'horizon' when 'the action referred to takes place or originates' in a zone 'lying at a variable, culturally determined distance' from some central locus. This may be interpreted either in concrete spatial terms in such a way that 'marked for horizon' means 'spatially distant from the speech act' or in abstract terms, in which case 'horizon' refers to, for instance, what directly concerns the speaker. More specifically, it may indicate that one is speaking of a person who is not within the speaker's primary social group, i.e. his consanguineal kin group (hence the reference to 'kinship' in the title of the paper). The relevance of Colarusso's paper for the present discussion is that it shows how a category which primarily marks one type of distance may be extended to other types. The prefixes discussed by Colarusso do not have temporal uses, as far as one can judge from the data given in the paper. An example of a language where morphemes with a deictic spatial function have acquired temporal meaning is Kiksht as described by Hymes, where the prefixes t- and u- have the primary meaning 'hither' and 'thither' and the secondary use of marking the time of reference as being relatively closer or more distant, respectively.¹ As Hymes notes, it is a universal tendency for expressions with primary spatial meaning to be extended to a temporal use. In the case of a language like Limouzi, insofar as the passé simple : passé composé opposition can be reinterpreted in spatial terms, we are apparently dealing
with a development in the opposite direction, i.e. a spatial extension of a primarily temporal meaning.
The hodiernal : nonhodiernal distinction

As was hinted at in connection with the discussion of 17th century French in the beginning of the paper, there is evidence that the following generalization can be made:
(7) If there are one or more distinctions of remoteness in a tense-aspect system, and reference can be made to objective time measures, one of the distinctions will be between 'more than one day away' and 'not more than one day away'.
The distinction between 'today' and 'before/after today' tenses, which we shall refer to as the hodiernal : nonhodiernal distinction (from Latin hodie 'today'), is well known from grammars of Bantu languages but is by no means restricted to this group. We have already seen illustrations of it from 17th century French, Hixkaryana, and Kiksht. To get a further example from another geographical area, we may refer to Davies's (1981) description of Kobon, a New Guinea language. The fact that the hodiernal : nonhodiernal distinction tends to crop up in grammars widely separated by time and space makes it possible to state with some confidence that it is not a figment of the grammarians' imagination. Its existence is also corroborated by the fact that the exchange of the word yesterday for this morning in sentences like I met him this morning resulted in a different choice of tenses in languages as diverse as Bengali (Indo-European), Catalan (Indo-European), Kikuyu (Bantu), Limouzi (Indo-European), Quechua (Andean), Spanish (Indo-European), and Zulu (Bantu). Sometimes a more specific delimitation of what is counted as 'today' is given, e.g. when it is said in a grammar of Ewondo (a Bantu language) that 'aujourd'hui commence au dernier coucher du soleil' ('today begins at the last sunset') (Angenot 1971). When a day is supposed to begin is clearly a culture-bound phenomenon, and statements like the one quoted may be regarded as language-specific specifications of a universal but vague boundary (to the extent that they are not just constructions of the grammarian, of course). Typically, then, there are one or more hodiernal tenses² and one or more prehodiernal. As to delimitations among the prehodiernal tenses, they are most often much more vague. A typical description is the one quoted above for Kamba: 'on the day prior to the day of speaking, or even to a week previously'. It appears that if there are two prehodiernal
tenses, the marked member of the opposition is a 'distant past' which typically refers to things that happened several months or years ago. The unmarked member is then the tense often referred to as the 'yesterday tense', although, as we have already seen, it tends to go further back than that. For such forms, I have coined the label 'hesternal' (from the Latin hesternus 'related to yesterday').³ Since the delimitation between hesternal and distant pasts is usually rather vague, one might question whether it is at all relevant to think of it in terms of objective time measures. It would then be tempting to strengthen (7) to state that the only relevant objective time measure is that of 'one day'. However, Hymes's account of Kiksht, related above, suggests that, at least for some languages, the year may be another relevant measure. Interestingly, even in Kiksht, with its rich remoteness system, something like hodiernal : prehodiernal may be the fundamental distinction. Hymes notes that the hodiernal past i(g)- 'appears to be far and away the preferred tense for recent past and to be used as such in conversation and narrative' and that it as such 'is contrasted with ga(l)- as the preferred tense for distant past'. 'One gains the impression that the first "cut", so to speak, made by speakers in terms of times past is recent (i(g)-) : remote (ga(l)-).'
Statement (7) seems to hold also about remoteness distinctions in the future, although it is relatively more seldom that objective time measures are found to be relevant there at all. In Kamba, as we have seen, the hodiernal future is identical to the present continuous. As an example of a language which has an opposition between hodiernal and posthodiernal future, where the hodiernal future is distinct from the present tense(s), we may mention Aghem (Bantu; Hyman 1979).

The interaction between remoteness and other semantic parameters of tense-aspect systems

It is not always possible to separate out remoteness distinctions as an independent dimension of a tense-aspect system. Temporal distance may be only one of the semantic factors that underlie an opposition between two verbal categories. The combination of various degrees of remoteness with the values of other semantic parameters is far from arbitrary. Thus, the following statement expresses a fairly well-known tendency:
(8) Categories which can be used as a 'perfect of result' (i.e. to say that something has happened that has a result at the moment of speech) tend to be nondistinct from hodiernal pasts.
This is perhaps more often formulated in another way:
In many languages, the perfect may be used where the present relevance of the past situation referred to is simply one of temporal closeness, i.e. the past situation is very recent (Comrie 1976: 60).
Welmers (1973: 348) goes as far as to claim that many so-called recent or hodiernal pasts in African languages are really what he calls 'completive', i.e. a perfect of result. In making this claim Welmers does not consider the possibility that temporal closeness and having a result at the point of speech may well both be relevant factors for one and the same category, something which is at least fairly clear for the languages that I have had a chance to look at more closely, such as Limouzi. It should be pointed out here that the identification of perfects and hodiernal pasts is by no means an absolute universal: cf. e.g. the Bantu language LoNkundo, which is mentioned by Welmers (1973) and which is supposed to have a 'completive' which is distinct from a 'today past'. The nondistinctness of these categories is, however, frequent enough to merit an explanation. An obvious one would be that a recent event is more likely to have a persistent result than a distant one. A category which is used as a perfect of result will thus automatically be used more frequently of recent events.
Even languages which are not normally thought of as having marking for temporal distance, such as English, may exhibit a tendency to use a perfect more easily for recent events. It may be somewhat difficult to distinguish such a tendency from another possible restriction on the use of perfects, namely that if they are used to state that a certain kind of event has taken place within a certain temporal interval, that interval must not be ended before the point of speech (cf. in English the impossibility of saying I have met him last year as compared to I have met him this year). This is reminiscent of formulations sometimes found in grammars of Bantu languages where reference is made to 'current' or 'preceding units of time', where the unit of time may be days, months, years, or even wars (Appleby 1961; quoted in Johnson 1981). Thus, a hodiernal past would refer to 'the current unit of time', variously interpreted as 'today', 'this week', 'this year', etc., whereas a hesternal past would refer to 'yesterday', 'last week', etc. The day as a unit of time would then presumably represent the default value. Another tendency is the following:
(9) Categories which are used like the English pluperfect will tend to be nondistinct from remote past.
Statement (9) is in our material mainly manifested in Indic languages, such as Hindi-Urdu and Bengali, where the tenses referred to as pluperfects (for both historical and synchronic reasons) have acquired a
use as remote past tenses, but is found also in e.g. Amharic (a Semitic language; Titov 1976: 115). (Incidentally, it explains the overuse of the pluperfect in English by speakers of these languages.) The traditional definition of the meaning of the pluperfect in English and other European languages is that it is used to refer to an event that took place before a definite point in time (Reichenbach's R) in the past. It should be pointed out — since this is sometimes the cause of some confusion — that this is not in itself tantamount to a remote past: an event referred to by the pluperfect may be very close to the point of speech, provided that it is looked upon retrospectively from another point in time which is between it and the point of speech. As was pointed out by McCawley (1971), given that the perfect in English is used to state that something has taken place within a current period of time (see above), one would expect that the pluperfect — being the 'past' of the 'present' perfect — should be used when we say that something has taken place within a past period of time. However, we do not say for instance (10) but rather (11):
(10) Henry VIII had been married six times.
(11) Henry VIII was married six times.
Such a use of the pluperfect form is possible, however, in some languages of the Indo-Iranian group, such as Modern Persian (Mohammad Hariri, personal communication). It appears plausible that such an extension of the pluperfect is the first step to a situation where it is used as a general remote past. According to Katenina (1960), the pluperfect in Hindi is used to express 'the completion of an action before a definite moment (or action) in the past, and also emphasizes the remoteness of the action from the present moment, its belonging to a finished segment of time — last year, yesterday, yesterday morning, etc'. Here (although mentioned in the wrong order) we see all three steps in the hypothesized development represented at once.
The connections between the perfect and recency, on the one hand, and the pluperfect and remoteness, on the other, suggest two possible ways in which remoteness distinctions may get into tense-aspect systems. The Romance languages and the Indic languages would be examples of these two possible developments. It is interesting to compare the results: in both cases, the other member of the resulting remoteness opposition is an unmarked past tense, which means that the value of the latter form is different: in the Romance languages, the passé simple is opposed to a perfect with a hodiernal past meaning, and thus itself obtains a prehodiernal value; in the Indic languages, the simple past is opposed to a pluperfect with prehodiernal interpretation and thus itself becomes hodiernal (or at least recent).
Narrativity and remoteness

The following appears to be a good candidate for a universal of remoteness systems:

(12) If narrative and non-narrative contexts differ with respect to the marking of temporal distance, it will be the non-narrative contexts that exhibit the largest number of distinctions.

Let us first consider what a 'narrative context' is. There are actually two concepts that must be kept apart here: 'narrative discourse' and 'narrative context'. As in Dahl (1977; and elsewhere), I define a narrative discourse as one where the speaker relates a series of real or fictive events in the order they took place. As an example of a maximally short narrative discourse, Julius Caesar's famous statement (13) may be quoted:

(13) Veni, vidi, vici
     'I came, I saw, I conquered'
In actual texts, such ideal or pure narrative discourses are of course relatively seldom found. Normally, the main story-line is continuously interrupted by various kinds of flashbacks and background information. This fact does not diminish the value of the concept of narrative discourse. For our present purposes, however, we are more interested in the concept of a narrative CONTEXT. We shall say that a sentence occurs in a narrative context if the temporal point of reference (in Reichenbach's sense) is determined by the point in time at which the last event related in the preceding context took place. Thus, the event referred to by vidi in (13) is understood to have taken place directly after that referred to by veni. Basically, this means that in a pure narrative discourse, every sentence except the first is in a narrative context. Veni in (13) is thus not in a narrative context. This makes sense since the function of the first sentence is quite different from that of the others: it has to provide the temporal anchoring for the rest of the discourse by e.g. an explicit time adverbial (one day last week, once upon a time, etc.). A reflection of this is found in languages (e.g. a large number of the Bantu languages) which have special narrative tenses, where these tenses are typically used from the second sentence on in narrative texts.

The distinctions between narrative and non-narrative discourses and contexts, respectively, are important for the study of tense-aspect systems in several different ways (see Dahl 1980 and forthcoming). For the topic of the present paper, the important observation is that the marking of temporal distance may be different in narrative and non-narrative contexts. We have already seen that Kamba has three distance distinctions among its non-narrative tenses, whereas it has only two in the narrative ones. Furthermore, it was noted that the distinction between the two narrative tenses seems much vaguer. Another example would be Limouzi (an Occitan dialect). Here, as in the surrounding Romance dialects and in older forms of French, passé composé tends to be used as a hodiernal past and passé simple as a prehodiernal past. This holds only for non-narrative contexts, however: in narrative contexts, only passé simple is possible. This suggests the generalization formulated above as (12): the number of distinctions as to temporal distance tends to be smaller in narrative contexts. A stronger statement — that the number of distinctions is always smaller in those contexts — is not possible, since languages like Bengali, where the old pluperfect is used as a prehodiernal past, make the distinction in the same way irrespective of narrativity (at least according to our data).

There are at least two explanations of (12). The first and perhaps most obvious one has to do with the communicative need for marking location in time. As is well known, tense marking is very often redundant, in particular when we are referring to points or intervals in time that have already been located. In narrative contexts, time is by definition provided by context, and the choice of a certain tense carries very little information indeed. It is not possible to say that it is entirely devoid of information — compare the following texts:

(14) (a) We arrived at the airport. The airline representative took care of our baggage.
     (b) We arrived at the airport. The airline representative had already taken care of our baggage.
Thus, since it is always in principle possible to jump out of the main story-line, the relating of events in temporal sequence represents the default choice rather than an obligation. Even so, it appears fairly natural that a large number of languages (examples from our sample: Eskimo (Eskimo-Aleut), Indonesian, Javanese, Sundanese (Malayo-Polynesian), Kammu (Mon-Khmer), Thai (Kam-Tai), Wolof (West Atlantic), Yoruba) use completely unmarked verb forms in narrative contexts. In Dahl (1980), the following tentative generalizations were made:

(15) If there are separate forms for narrative and non-narrative past contexts, the non-narrative forms tend to be more highly marked than the narrative ones.

(16) It is almost always possible to use the least marked indicative verb form in a narrative past context.
(As exceptions to (16), languages with a marked category of perfectivity, such as Russian and Chinese, should be noted.)
We could thus regard the tendency to neutralize distance distinctions in narrative contexts as another reflection of the tendencies just mentioned. There is another possible explanation, however. Above, we discussed the connection between recency and perfects. One noteworthy property of perfects is that they are by and large a non-narrative category; it could even be claimed that this is the most general characteristic of such categories (see Dahl, forthcoming, for a more detailed discussion). To the extent that recent pasts have developed out of perfects, it would thus be expected that they too are more naturally used in non-narrative contexts. It is interesting to note in this connection that the languages where the source of the remoteness distinction is rather the opposition between a pluperfect and a simple past, such as Bengali, seem to behave differently from e.g. the Romance languages in this regard: our Bengali informant translated the narrative texts in our questionnaire variously with the simple past or the pluperfect, depending on their degree of remoteness.
Remoteness distinctions and the organization of memory

Do the semantic correlates of the morphological categories we are discussing in this paper correspond to any important features in the organization of the human mind? A natural place to look is the organization of memory. Chafe (1973) tries to provide 'linguistic evidence for three kinds of memory, called surface, shallow, and deep'. Information that is being held in surface memory is supposed to be either conscious or 'very close to the surface of consciousness'; this can last for anything 'from several seconds to several weeks or more'. Chafe notes the similarity of this concept to the traditional 'short-term memory', although, as he notes, it cannot be identified with it, at least not if short-term memory is supposed to have a maximal time span of 15-20 seconds, as is sometimes claimed. 'Information which is no longer so constantly or insistently present in consciousness as to belong to surface memory passes through a state' referred to as 'shallow memory'. 'Shallow memory extends back only a few days at most', whereafter information that is considered worth remembering goes into the final storage place, 'deep memory'. The linguistic evidence for this kind of process — rather reminiscent of the treatment of waste from nuclear power plants — comes from restrictions on the distributions of different kinds of time adverbials, the possibilities being 'strong', 'weak', and zero time adverbials: 'material from deep memory must be reported with a strong adverb, ... material from shallow memory may be reported with either a strong or weak adverb, and ... material from surface memory may be reported with a strong adverb, a weak adverb, or no adverb at all'. The following examples would illustrate the six possibilities:

(17) a. From surface memory:
        Steve fell in the SWIMMING pool.
        Steve fell in the SWIMMING pool a couple minutes ago.
        A couple MINUTES ago, Steve fell in the SWIMMING pool.
     b. From shallow memory:
        Steve fell in the SWIMMING pool yesterday.
        YESTERDAY, Steve fell in the SWIMMING pool.
     c. From deep memory:
        Last CHRISTMAS, Steve fell in the SWIMMING pool.
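Chafe's distributional claim is in effect a small implicational table: each memory level licenses a subset of adverb types, and the subsets shrink as the level gets deeper. A minimal sketch of that logic, purely as an illustration (the dictionary and function names are ours, not Chafe's):

```python
# Chafe's (1973) claimed licensing of time adverbials by memory level,
# as quoted above: deep memory requires a strong adverb; shallow memory
# allows strong or weak; surface memory allows strong, weak, or none.
LICENSED = {
    "surface": {"strong", "weak", "zero"},
    "shallow": {"strong", "weak"},
    "deep": {"strong"},
}

def is_licensed(memory_level: str, adverb_type: str) -> bool:
    """True if a report from this memory level may carry this kind of
    time adverbial, on Chafe's account."""
    return adverb_type in LICENSED[memory_level]

# The 'six possibilities' of example (17): 3 + 2 + 1 licensed combinations.
assert sum(len(v) for v in LICENSED.values()) == 6
```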
The important point, then, is the alleged nonoccurrence of sentences with zero time adverbials referring to events which are not very close in time. Chafe cites the following dialogue:

(18) Father: Hi, Dennis! What's the news?
     Dennis: Somethin' terrible! Did ya know Mr. Wilson broke his arm?
     Father: No!
     Mother: How awful!
     Dennis: He fell down his cellar stairs!
     Mother: The poor man!
     Father: When did this happen, Dennis?
     Dennis: When he was a little kid my age. He jus' told me about it today!

arguing that the sentence Did ya know Mr. Wilson broke his arm? is inappropriate since it does not contain the proper time adverbial (when he was a little kid, say). As evidence for the proposed three levels of memory, this seems rather scanty. To start with, in order to explain the relations between time adverbs and the memory levels, Chafe has to use as an interface the concepts of 'old and new information' in ways that he himself admits give only a partial answer to the question. Furthermore, it seems that the claim about the nonoccurrence of this kind of zero adverbials for reference to distant time points is too strong. A sentence like the following seems quite appropriate even out of context, and even if the time referred to is, say, 20 years away:

(19) Did ya know Mr. Wilson had syphilis?4
Also, a Gricean explanation based on the relevance of different possible statements in the given context does not seem impossible.
An obvious place to look for evidence for the different levels of memory is in the choice of tenses. Noting that 'everything that comes out of ... surface, shallow, or deep memory, is reported in English with the past tense', Chafe (1973: 276) goes on to say that he strongly suspects 'that the distinction between surface, shallow, and deep memory does show up' in other languages. 'For example, it would not be surprising to find that some languages identify what is in surface memory by means of a special "aspect" or the like'. As the data we have discussed earlier show, Chafe would not have to go very far to find such languages (leaving aside his nonstandard use of 'aspect'). Even if Chafe's characterization of the time spans of the different memory levels does not correspond in every detail to the divisions made in remoteness systems, it is rather striking that the sentences he gives in his example 14 referring to material from surface, shallow, and deep memory would translate very naturally into a language with a three-member remoteness category as hodiernal, hesternal, and distant past, respectively. Indeed, one might say that evidence of this kind is exactly what Chafe would need as more substantial support for his theory. It may be wise, however, not to make too bold claims at this point. I think it would be a bit rash to say that the existence of morphological categories marking temporal distance provides definite evidence for the kind of structuring of human memory that Chafe proposes. What can be said, though, is that it would be rather strange if these categories did not reflect any general properties of human cognition, and that the uniform character of remoteness systems from different parts of the world constitutes a challenge for cognitive psychology. It should certainly be possible for an experimental psycholinguist with an interest in the structure of memory and access to a sufficient number of informants for languages with remoteness distinctions to construct suitable experiments to find any possible connections. It seems that psychologists have paid relatively little attention to the question whether there is any qualitative differentiation of information within long-term memory. As far as I know, there has been no reaction to Chafe's paper in the ten years that have elapsed since it was written. The time may now be ripe. The dominant role of the hodiernal : nonhodiernal distinction in remoteness systems makes one wonder if anything special happens to things when they have been stored in memory for 24 hours. In the absence of evidence for such a hypothesis, however, it is probably safer to hypothesize that the explanation is simply that the day is the most salient and constant time measure in just about any human culture. Of course, as a linguistic universal, the hodiernal : nonhodiernal distinction is amazingly concrete.
Conclusion

I hope that this paper has shown that remoteness distinctions — in spite of the fact that they have been largely neglected in general linguistic theory — are interesting in several respects. First, they are another example of a 'semantic field' which is not — as the linguistic relativists would assume — cut up in arbitrary fashions by different languages but is rather structured in strikingly similar ways from language to language. Second, they offer a possible point of departure for studying the connection between grammatical structures and the organization of the human mind, in particular memory. Third, they illustrate the subtle interplay between different but closely related semantic factors that determine the choice between morphological and syntactic categories.

Department of Linguistics
University of Stockholm
S-106 91 Stockholm
Sweden
Notes

1. This statement is actually an oversimplification, holding only for past tenses. See Hymes's paper (1975) for an account of the superficially contradictory behavior of the prefixes t- and u-.
2. In many languages, also such languages as do not otherwise mark remoteness distinctions systematically, there are constructions that may be used to translate the English perfect with the adverb just. In the Romance languages we thus find constructions like the French venir de + infinitive (literally 'to come from doing something'). The semantics of these constructions is not quite clear; although it might be tempting to assume that they express a stronger closeness than a hodiernal past, it rather appears that the 'immediacy' involved is generally not measurable in objective terms, which would mean that these constructions are, strictly speaking, outside of the system of more objective remoteness distinctions. Consider e.g. a sentence like The age of computers has just begun.
3. By using labels of this kind, I am avoiding the confusion that arises from the varying use of terms like 'immediate', 'recent', 'close', 'near', and 'remote'. What I have labeled 'hesternal' is thus 'recent past' for some people and 'remote past' for others.
4. But compare Did ya know Mr. Wilson caught syphilis?, which seems to have the implication that he still has the disease. In other words, choosing an inchoative rather than a stative construction seems to induce the interpretation for which other languages (and maybe other varieties of English) would use a 'perfect of result'.
References

Angenot, Jean-Pierre (1971). Aspects de la phonétique et de la morphologie de l'ewondo. Brussels: Print-Express.
Appleby, L. L. (1961). A First Luyia Grammar. Nairobi: East African Literature Bureau.
Arnauld, A., and Lancelot, C. (1676). Grammaire générale et raisonnée. Paris.
Chafe, W. (1973). Language and memory. Language 49, 261-281.
Chia, E. N. (1976). Kom tenses and aspects. Unpublished dissertation, Georgetown University, Washington, D.C.
Colarusso, J. (1979). Verbs that inflect for kinship: grammatical and cultural analysis. Papiere zur Linguistik 20, 37-66.
Comrie, B. (1976). Aspect. Cambridge: Cambridge University Press.
Dahl, Ö. (1977). Games and models. In Ö. Dahl (ed.), Logic, Pragmatics and Grammar. Department of Linguistics, University of Göteborg.
—(1980). Tense-Mood-Aspect Progress Report. Göteborg Papers in Theoretical Linguistics 41. Department of Linguistics, University of Göteborg.
—(forthcoming). Tense, Mood, and Aspect Systems.
Davies, J. (1981). Kobon. Lingua Descriptive Studies 3. Amsterdam: North-Holland.
Derbyshire, D. C. (1979). Hixkaryana. Lingua Descriptive Studies 1. Amsterdam: North-Holland.
Fillmore, C., and Langendoen, T. (eds) (1971). Studies in Linguistic Semantics. New York: Holt, Rinehart and Winston.
Greenberg, J. (ed.) (1978). Universals of Human Language, vol. 3: Word Structure. Stanford: Stanford University Press.
Hyman, L. (1979). Aghem Grammatical Structure. Southern California Occasional Papers in Linguistics 7. Los Angeles: University of Southern California Press.
Hymes, D. (1975). From space to time in tenses in Kiksht. International Journal of American Linguistics 41, 313-329.
Javanaud, P. (1979). Tense, Mood and Aspect (Mainly Aspect) in Limouzi. Göteborg Papers in Theoretical Linguistics 39. Department of Linguistics, University of Göteborg.
Johnson, M. R. (1981). A unified temporal theory of tense and aspect. In P. Tedeschi and A. Zaenen (eds), Syntax and Semantics 14: Tense and Aspect. New York: Academic Press.
Katenina, T. E. (1960). Jazyk chindi. Moscow: Nauka.
Kopchevskaya, M. (forthcoming). Some Bantu Tense Systems.
McCawley, J. D. (1971). Tense and time reference in English. In C. Fillmore and T. Langendoen (eds), Studies in Linguistic Semantics, 97-114. New York: Holt, Rinehart and Winston.
Morolong, 'M. (1978). Tense and aspect in Sesotho. Unpublished dissertation, Simon Fraser University, Burnaby, British Columbia.
Reichenbach, H. (1947). Elements of Symbolic Logic. New York: Macmillan.
Tedeschi, P., and Zaenen, A. (eds) (1981). Syntax and Semantics 14: Tense and Aspect. New York: Academic Press.
Titov, E. G. (1976). The Amharic Language. Moscow: Nauka.
Ultan, R. (1978). The nature of future tenses. In J. Greenberg (ed.), Universals of Human Language, vol. 3: Word Structure, 83-124. Stanford: Stanford University Press.
Welmers, W. (1973). African Language Structures. Berkeley: University of California Press.
Whiteley, W. H., and Muli, M. G. (n.d.). Practical Introduction to Kamba. Nairobi and London: Oxford University Press.
The verbs of perception:1 a typological study

ÅKE VIBERG
0. Introduction The number of studies devoted to the lexicon from a universal or typological point of view is very sparse in comparison to the — by now — vast literature dealing with syntax and phonology. One exception is the many studies that have followed in the wake of Basic Color Terms (Berlin and Kay 1969). Like this study, lexical studies with a universal aim have in general, it seems, been concerned with the lexicalization patterns within a specific semantic field. Such studies have dealt with fields such as body parts (Andersen 1978), ethnobiological taxonomies (e.g. Berlin 1978), cooking verbs (Lehrer 1974: ch. 8) and verbs of motion (Talmy 1975). Scovel's (1971) comparison of the verbs of perception in five languages is the closest precursor of the present study. From a somewhat different perspective, Dixon (1977) has looked at which fields have the strongest tendency to lexicalize as adjectives.
1. The verbs of perception in English

The structure of a semantic field may be looked upon as the outcome of the interaction of a set of more or less field-specific semantic components and a number of general field-independent components that cut across all verbal semantic fields. As for the field of perception, the most important field-specific components are the five sense modalities: sight, hearing, touch, taste, and smell. The most important general components are called activity, experience, and copulative. The distinction between an activity and an experience is illustrated by pairs such as look at vs. see and listen to vs. hear. Activity refers to an unbounded process that is consciously controlled by a human agent, whereas experience refers to a state (or inchoative achievement) that is not controlled. The distinction between an activity and an experience on the one hand and a copulative expression on the other hand is dependent on a phenomenon called base selection. Base selection refers to the choice of a grammatical subject among the deep semantic case roles associated with a certain verb. An experiencer-based verb takes the animate being that has a certain mental experience as a subject (i.e. both activities and experiences are experiencer-based). A source-based (alternatively, phenomenon-based) verb takes the experienced entity as a subject (e.g. A looks funny). A copulative expression is defined as a source-based state. The components presented so far can be arranged as in Table 1 to show the structure of what will be called the basic paradigm of the verbs of perception. The analysis given there might be looked upon as the standard one. In essence, it has appeared already in Buck (1949: 1017-1018). Apart from terminological differences, it also agrees with the thorough analysis given by Rogers (1971, 1972, 1974) and by Scovel (1971) and Lipińska-Grzegorek (1977). The assumptions behind the terminology adopted in this paper are further discussed in Viberg (1981). What has been said so far will be enough for the purpose of the typological investigation to follow. The analysis will be concentrated on what might be called the basic verbs of perception. The verbs in Table 1 form part of a hierarchy containing both superordinate and subordinate terms (hyponyms), roughly as shown in Figure 1.
2. Data collecting The typological investigation is mainly based on a questionnaire containing sentences equivalent to the ones presented in Table 1 (the basic paradigm). At the time when the present report was completed, the questionnaire had been translated into 53 languages representing 14 different language stocks from all the major parts of the world. The complete sample is presented in Table 2. Although this is a fairly good sample compared to the samples in the studies mentioned in the Introduction, it is not satisfactory, since European languages are overrepresented and some areas, such as North and South America and Oceania, are highly underrepresented. The present study should be seen as a report on work in progress and the sample is continually improving as more questionnaires are translated. In addition to the material collected with the questionnaires, a number of bilingual dictionaries have been consulted for information about verbs of perception.
[Figure 1. Hierarchy of the verbs of perception: the basic verbs with superordinate and subordinate terms (hyponyms), e.g. witness, spot, catch a glimpse of, glimpse (under see); overhear, eavesdrop, hearken, hark (under listen to/hear); palpate, grope (under feel); savor, take a taste of (under taste); sniff at, take a smell (under smell); and source-based expressions such as reek of, stink, have a smell, smack of, have a taste.]
Table 2. Genetic and areal classification of the languages in the sample*

North America
  Eskimo-Aleut: West Greenlandic
  Macro-Siouan: Seneca

South America
  Andean-Equatorial: Quechua, (Guarani)

Africa
  Afro-Asiatic
    Semitic: Standard Arabic, Egyptian Arabic, Amharic, Tigrinya
    Chadic: Hausa
    Cushitic: Oromo (= Galla)
  Nilo-Saharan: Luo
  Niger-Kordofanian: Swahili, Chibemba, Mambwe, Setswana, Wolof, Birom

Oceania
  Indo-Pacific: Kobon
  Austro-Tai: Malay, Indonesian, (Tagalog)

Asia
  Austro-Asiatic: Vietnamese
  Sino-Tibetan: Mandarin Chinese
  Dravidian: Kannada
  Altaic: Japanese, Korean, Turkish
  Indo-European
    Indo-Iranian: Hindi, Punjabi, (Bengali), Persian, Kurdish
    Armenian: (Armenian)

Europe
  Greek: Modern Greek
  Slavic: Polish, Serbo-Croat, Bulgarian, Russian
  Celtic: Modern Irish
  Italic: French, Spanish, Portuguese, Italian, Romanian
  Germanic: English, German, Icelandic, Danish, Swedish
  Uralic: Finnish, Estonian, Hungarian
  Basque: Basque

( ) Data incomplete or only partly analyzed.
*The classification follows Ruhlen (1976).
3. Presentation of the data

In all of the languages in the sample there is some way of expressing all of the meanings represented by one of the 15 'boxes' that make up the basic paradigm (Table 1). But in many cases, one verb covers more than one meaning. The semantic range of each verb in a certain language can easily be read off in diagrams such as Table 3 that have been worked out for the languages in the sample. In English there are different lexical items for activity, experience, and copulative in the first two sense modalities, sight and hearing, while there is only one lexical item for each sense modality in the last three, i.e. touch, taste, and smell. According to the terminology to be adopted here, there is a complete lexical differentiation (English lexically differentiates completely) between the five sense modalities. There is no lexical differentiation between the dynamic meanings with respect to touch, taste, and smell. In the following two sections, data will be presented from a representative number of the languages in the sample. The first of these sections focuses on the differentiations in the dynamic system and the second on the differentiation between the sense modalities. In order to present the data from each language in one place, a great deal of overlap is allowed between the two perspectives.
4. The realization of the dynamic system The dynamic system may be realized lexically by unrelated lexical items such as listen to, hear, and sound in English. A high degree of polysemy may be allowed as for feel, taste, and smell. The verbs in the basic paradigm are not always represented by simple verbs in languages other than English. Specifically, two other types of (surface) predicates are found in several languages: serial verbs and compound verbs (verb + noun). In some languages, one type of predicate is favored to the extent that it might be called a 'pet predicate'.
Table 3. English

          Activity    Experience   Copulative
sight     look at     see          look
hearing   listen to   hear         sound
touch     feel (one lexical item covers all three dynamic meanings)
taste     taste (likewise undifferentiated)
smell     smell (likewise undifferentiated)
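These diagrams lend themselves to a simple mechanical reading: the basic paradigm is a 5 x 3 grid, and a language's lexicalization pattern is a mapping from grid cells to verbs. A minimal sketch of that reading, using the English data of Table 3 (the data structure and function names are our own illustration, not part of the original study):

```python
# The basic paradigm as a mapping from (modality, dynamic type) to a verb.
ENGLISH = {
    ("sight", "activity"): "look at", ("sight", "experience"): "see",
    ("sight", "copulative"): "look",
    ("hearing", "activity"): "listen to", ("hearing", "experience"): "hear",
    ("hearing", "copulative"): "sound",
    # One lexical item covers all three dynamic meanings for the last
    # three modalities, as in Table 3.
    **{(m, d): ("feel" if m == "touch" else m)
       for m in ("touch", "taste", "smell")
       for d in ("activity", "experience", "copulative")},
}

def semantic_range(paradigm, verb):
    """All cells of the basic paradigm covered by one verb."""
    return {cell for cell, v in paradigm.items() if v == verb}

# English differentiates the five modalities completely: feel covers
# only the touch row, however many dynamic meanings it spans.
assert semantic_range(ENGLISH, "feel") == {
    ("touch", "activity"), ("touch", "experience"), ("touch", "copulative")
}
```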
4.1. Serial verbs

In Vietnamese, serial verbs are favored.

(1) Nam (đã) xem chim.
         ASPECT look bird
    'Nam looked at the birds.'
(2) Nam (đã) xem thấy chim.
                 perceive
    'Nam saw the birds.'
(3) Nam (đã) nghe đài.
             listen radio
    'Nam listened to the radio.'
(4) Nam (đã) nghe thấy đài.
                  perceive
    'Nam heard the radio.'
There is one specific activity verb for each sense modality. The corresponding experience is formed by the addition of a resultative verb thấy 'perceive'. The complete diagram for Vietnamese looks like Table 4. In addition to xem, there are two more verbs that correspond to 'look at' or 'watch': nhìn and trông.

Table 4. Vietnamese

          Activity        Experience                 Copulative
sight     xem 'look'      xem thấy [look perceive]   trông
hearing   nghe 'listen'   nghe thấy                  có giọng [have voice]
touch     sờ              cảm thấy [feel perceive]
taste     nếm 'taste'     nếm thấy                   có vị [have taste (N)]
smell     ngửi 'smell'    ngửi thấy                  có mùi [have smell (N)]

In Standard (Mandarin) Chinese, experiences are also derived from activities by the addition of a serial verb, although different verbs are used depending on the sense modality. Another difference is that the activity verb is usually reduplicated:

(5) a. Wáng kàn-le-kàn niǎo.
         look ASP look bird
       'Wang looked at the birds.'
    b. Wáng kàn-jiàn-le niǎo.
         look perceive ASP bird
       'Wang saw the birds.'
(6) a. Wáng tīng-le-tīng shōuyīnjī.
         listen ASP listen radio
       'Wang listened to the radio.'
    b. Wáng tīng dào-le línjū-de shōuyīnjī.
         listen reach ASP neighbor ATTR radio
       'Wang heard his neighbor's radio.'
The complete diagram for Chinese is shown in Table 5. According to Teng (1975: 56), the choice of a resultative verb is subject to variation; jiàn and dào alternate after kàn and tīng in the speech of many Mandarin speakers.

Serial verbs are also used to some extent in other languages, but not to express meanings on the basic level. In Swedish, to take one example, the two verbs se 'see' and höra 'hear' are unmarked with respect to the distinction state vs. inchoative. An inchoative meaning is explicitly signalled by the addition of the verb få 'get':

(7) Peter fick plötsligt se en älg komma gående på myren.
          got suddenly see an elk come walking on the swamp
    'Peter suddenly caught sight of an elk walking on the swamp.'
Table 5. Chinese

          Activity   Experience                      Copulative
sight     kàn        kàn-jiàn [look-perceive]        yǒu yàngzi [have appearance]
hearing   tīng       tīng-dào [listen-reach]         (paraphrase)
touch     mō         gǎn-jué-dào [feel-think-reach]  —
taste     cháng      cháng-dào [taste-reach]         yǒu wèidao [have taste (N)]
smell     wén        wén-dào [smell-reach]           yǒu wèr [have smell]

4.2. Compound verbs (verb + noun)

Persian favors compound verbs consisting of a verb with a rather general meaning (make, give, come) combined with a noun that signals the sense modality:
(8) Parvin be radio goš dad.
           to        ear gave
    'Parvin listened to the radio.'
(9) Parvin sigar ra bu kard.
                 DEF OBJ smell (N) made
    'Parvin smelled the cigar.' [activity]
The verb does not belong to the field of perception. Its main function is to express the dynamic meaning. Although there exist compound verbs of this type even in English (catch sight of, cast an eye on, take a smell), they do not belong to the basic level as in Persian. The degree to which such verbs are favored on the basic level in Persian can be seen in Table 6. Kurdish is very similar to Persian as regards compound verbs. But the data from that language are presented in section 5.1.2, since the most striking characteristic of that language is the pattern of polysemy with respect to the sense modalities.
Table 6. Persian

sight    Activity: negah kardan [look (N) make]. Experience: didan. Copulative: be nazar residan [to sight arrive] 'seem'.
hearing  Activity: goš dadan [ear give]. Experience: šenidan.
touch    Activity: dast zadan [hand beat]. Experience: (1) ehsas kardan [feeling make], (2) didan 'see'.
taste    Activity: cešidan. Experience: maze tašxis dadan* [taste (N) discern give]. Copulative: maze dadan [taste (N) give].
smell    Activity: bu kardan [smell (N) make]. Experience: (1) bu tašxis dadan*, (2) bu šenidan 'hear'. Copulative: bu dadan [smell (N) give].

*Tašxis dadan [discerning give]: 'discern', 'distinguish'.

4.3. The use of morphological markers

In many languages some of the dynamic distinctions are signalled only morphologically. Japanese is very interesting in this respect, as shown in Table 7.

Table 7. Japanese

sight    Activity: miru. Experience: mieru (see-PASS). Copulative: dative theme + ADJ ni mieru (see-PASS).
hearing  Activity: kiku. Experience: kikoeru (hear-PASS). Copulative: dative theme + ADJ ni kikoeru (hear-PASS), alt. paraphrase.
touch    Activity: sawaru. Experience: ki ga tsuku [mind SUBJ fasten] 'notice'.
taste    Activity: (1) azi o miru [taste (N) OBJ see], (2) azimi o suru [taste-NOM OBJ make], (3) aziwau 'taste (V)'. Experience: azi ni ki ga tsuku 'notice', alt. copulative expression in subordinate clause + ki ga tsuku 'notice'. Copulative: azi ga suru [taste (N) SUBJ make].
smell    Activity: nioi o kagu [smell (N) OBJ smell (V)]. Experience: nioi ni ki ga tsuku 'notice', alt. copulative expression in subordinate clause + ki ga tsuku 'notice'. Copulative: nioi ga suru [smell (N) SUBJ make].

Look and see have only one lexical equivalent, miru. The same applies to listen and hear, which correspond to kiku. The following sentence is ambiguous:
+ ki ga tsuku 'notice'
(10) Taroo wa tori o mita. THEME OBJECT V-PAST 'Taro looked at the birds.' 'Taro saw the birds.' If the progressive form is used, the verb is interpreted as an activity (as in English), since states and achievements (i.e. experiences) cannot in general be combined with the progressive (although in actual fact combinations like the following are common enough in English and used to express various activity meanings: 'He is looking happy'; cf. 'She is being nice'). (11)
Taroo wa tori o mite ita. V-PROGRESSIVE was 'Taro was looking at the birds.'
Aspect shift is a rather trivial possibility. What is more interesting is the possibility to passivize the verb in order to signal an experience in an unambiguous way.
The verbs of perception (12)
(13)
Taroo ni wa tori ga DATIVE T H E M E bird SUBJECT 'Taro saw the birds.' Taroo ni wa tonari no razio ga neighbor of SUBJECT 'Taro heard his neighbor's radio.'
133
mieta. V-PASSIVE-PAST kikoeta. V-PASSIVE-PAST
The marking of the experiencer with a postposition that may be interpreted as a kind of dative is reminiscent of the use of the dative subject to signal an experience in some of the sense modalities, which seems to be common in South Asian languages. In Hindi, an experience is derived from the corresponding copulative by adding a dative subject. This is the only option for taste and smell: (14)
(15)
khänä pyäz ka lagä. food onion of seem-PAST 'The food tasted of onions.' Pitar ko khäne më namak lagä. DATIVE food in salt seem-PAST 'Peter tasted salt in the food.'
Compare: (16)
Pitar
ne khänä AGENT food 'Peter tasted the food.'
cakhä. taste-PAST
As you can see in Table 8, there is no obligatory differentiation between an activity and an experience in the sense modalities sight and hearing. The pairs look at/see and listen to/hear have only one translation equivalent
Table 8.
Hindi Activity
Sight
Experience
Copulative
lagnä
dekhnä
Hearing
sunna
Touch
chünä
cubhnä
Taste
cakhna
dative subject
Smell
süghnä
dative subject
'seem' + lagna + xusbü änä 'smell come'
134
À. Viberg
each. But there is one construction involving the dative subject that can be used to signal an experience in an unambiguous way. A dative subject is added to the expressions dikhäi denä 'be = visible give' and sunai denä 'be = audible give': (17)
mujhe vo dikhät diya. me-to he be = visible gave Ί saw him.'
An interesting possibility is described in Dixon (1979: note 54). In Lesghian (a Caucasian language), there is a verbal root akun; either it can take the subject in an ergative case and the object in the absolutive, in which case it means 'look at'; OR the object can still be in the absolutive case, but with the subject in a dative inflection, in which case it is translatable as 'see'. Similarly, the form van akun can occur in either framework, meaning 'listen to' and 'hear' respectively. The case shifts studied up to now have exploited the possibilities of marking the different degrees of involvement or controlability on the part of the experiencer as a means of signalling the distinction between an activity and an experience or a copulative and an experience. There is also another possibility, viz. to use different markers on the object. In Swedish, all the verbs of perception that designate an activity take an object marked by the preposition pâ 'on'. In this particular function, the preposition pâ is used to mark an incompletive (partitive, irresultative) reading, something which has close parallels in other languages (Anderson 1976: 23). Although Swedish has specific verbs meaning 'look at' and 'listen to', it is possible to use even se 'see' and höra 'hear' in these meanings if the object is marked by pa:
(18) 'Peter looked at David.' Peter sâg David. 'Peter saw David.' (On verbs of perception and case marking, see Tasaku 1981.) Another possibility is to derive one verb from another by adding a derivational morpheme. In Quechua, a copulative verb is derived from the corresponding experience (or activity/experience) by the addition of a morphological marker -ku that is traditionally called a reflexive, as shown in Table 9. A similar pattern is found in the Bantu languages included in the sample, e.g. Mambwe (Zambia), shown in Table 10. The most striking phenomenon in this language, however, is the high degree of polysemy, something that will be dealt with systematically in 5.2. (19)
Table 9. Quechua

          Activity   Experience   Copulative
sight     qhaway     rikuy        riku-ku-y
hearing              uyariy       uyari-ku-y
touch                sientey      siente-ku-y
taste                llamiy       llami-ku-y
smell                muskhiy      q'apay

Siente(ku)y is a Spanish loan; -ku = reflexive.
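The Quechua pattern is regular enough to state as a string operation. A hedged illustration only (the function is ours, and real morphophonology is of course richer than this):

```python
def quechua_copulative(experience_verb: str) -> str:
    """Derive the copulative from an experience verb by inserting the
    reflexive marker -ku before the infinitive ending -y (Table 9)."""
    assert experience_verb.endswith("y")
    return experience_verb[:-1] + "kuy"

assert quechua_copulative("rikuy") == "rikukuy"    # 'see' -> copulative
assert quechua_copulative("uyariy") == "uyarikuy"  # 'hear' -> copulative
```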
Table 10. Mambwe

          Activity    Experience   Copulative
sight     -lola       ?-wene       -lol-eka
hearing   -uvwa       -uvwa        -uvw-ika
touch     -lema       -uvwa
taste     -lila       -uvwa
smell     -nunshya    -uvwa        -nun-ka

-ka (basic form uncertain) = stative.
4.4. A partial lexicalization hierarchy

Most languages use fewer than 15 verbs to cover the 15 meanings of the basic paradigm. And at first sight, there is a bewildering range of possibilities to divide up the field. But to a certain degree, it is possible to predict which verbs will appear in a certain language, viz. for the part of the field that is covered by the verbs look at, see, listen to, and hear. There are some languages, such as Hindi, that have only one verb for look at/see and one for listen to/hear. There are also some languages, such as Egyptian Arabic and Modern Greek, that have special verbs for look at and see but only one verb for listen to/hear. Finally, there are languages such as English that have four special verbs. There is no language that has one verb for look at/see and two special verbs for listen to and hear. These relationships are summed up in Figure 2.

Figure 2. The hierarchy

Type   Semantic distinction                  Perception                  Distribution
1.     (none)                                look = see, listen = hear   Australian, South Asian languages, Seneca
2.     activity/experience (sight only)      look, see, listen = hear    many Niger-Kordofanian languages, Modern Greek
3.     activity/experience (sight, hearing)  look, see, listen, hear     most European languages, Swahili, Wolof
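Figure 2 encodes an implicational claim: a lexical split for hearing implies a split for sight, never the reverse. As a purely illustrative sketch (the encoding and function are ours, not Viberg's), the attested and excluded inventory types can be checked mechanically:

```python
# Each inventory records whether the activity/experience distinction is
# lexicalized for sight (look vs. see) and for hearing (listen vs. hear).
def is_attested(split_sight: bool, split_hearing: bool) -> bool:
    """Viberg's Figure 2: a split for hearing implies a split for sight.
    Types 1-3 are attested; a hearing split without a sight split is not."""
    return split_sight or not split_hearing

assert is_attested(False, False)     # Type 1: Hindi
assert is_attested(True, False)      # Type 2: Egyptian Arabic, Modern Greek
assert is_attested(True, True)       # Type 3: English, Swahili, Wolof
assert not is_attested(False, True)  # excluded by the hierarchy
```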
In my original paper (Viberg 1981), I also showed how the hierarchy is reflected by the morphological complexity of the lexical items. If look at (or in a few cases see) is expressed by a morphologically complex form, so is listen to (or hear). But the opposite does not necessarily hold. (Another reflection of the hierarchy, which I have not checked systematically, seems to be that see is represented by a cognate in genetically related languages more often than hear, which indicates that see is more original and probably more stable. Borrowing does not seem to be involved.)

There are several other regularities in addition to the ones expressed by the hierarchy. Among them, one stands out more clearly than the others. With respect to the experiences, equivalents to see and hear are found in most languages of the sample, in spite of the fact that straightforward equivalents are often lacking for feel, taste, and smell (as experiences). This could be captured by a lexicalization hierarchy that predicts which meanings are lexicalized by a special lexical item. But this hierarchy would only hold for experiences, since it seems partly to be reversed for the copulatives. Among them, smell seems to be one of the first to be lexicalized as a simple verbal root. There is, however, a more interesting way of capturing these regularities, to which I will turn in the next section.

5. Patterns of polysemy with respect to the sense modalities

One of the most striking characteristics of the lexicalization patterns of the verbs of perception is the large amount of polysemy with respect to the sense modalities that is found in many of the languages in the sample. As I will try to show in this section, it is possible in most cases to establish a basic or prototypical meaning connected to one of the sense modalities. Furthermore, it is possible to predict which extended or secondary meanings can appear. My main concern will be to show that the following hierarchy of sense modalities applies when a verb has a prototypical meaning connected to one sense modality and that meaning is extended to cover another modality.

The modality hierarchy: sight > hearing > touch > taste, smell

The hierarchy should be interpreted as follows: a verb having a basic meaning belonging to a sense modality higher (to the left) in the hierarchy can get an extended meaning that covers some (or all) of the sense modalities lower in the hierarchy. The hierarchy is weakened somewhat by the fact that it is not always applied contiguously, i.e. a certain modality may be skipped. A similar idea has been proposed independently by Paul Kay (personal communication) as a comment on a paper on the verbs of perception in Russian by Andy Rogers (n.d.) (see 5.5 below). Most of the polysemy with respect to the sense modalities is found in the experiences, where it appears more often than not. For that reason, only the experiences are considered in Table 11, which sums up all the types of polysemy that appear in the sample. In the following sections, the sense modalities will be dealt with one by one. The relatively few cases of polysemy appearing in the activities and the copulatives will be presented here as well. It turns out that the hierarchy correctly predicts which extended meanings are possible even in this case.
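Like Figure 2, the modality hierarchy can be read as a checkable constraint on possible polysemies. A small illustrative sketch follows; the ranking dictionary and function name are ours, and the equal rank assigned to taste and smell (unranked relative to each other) is our assumption, with the non-contiguity caveat built in by allowing modalities to be skipped:

```python
# Rank in the modality hierarchy: sight > hearing > touch > taste, smell.
RANK = {"sight": 0, "hearing": 1, "touch": 2, "taste": 3, "smell": 3}

def extension_possible(basic: str, extended: str) -> bool:
    """A verb with a basic meaning in one modality may extend only to
    modalities ranked lower (to the right); skipping is allowed."""
    return RANK[extended] > RANK[basic]

assert extension_possible("sight", "taste")        # e.g. Swahili ona (5.1.1)
assert not extension_possible("touch", "hearing")  # excluded by the hierarchy
```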
5.1. Sight

Sight holds the top position in the hierarchy. The patterns of polysemy studied below are not the only reflections of this. It may also be reflected by the lexical elaboration on the level below the basic one. It seems to be a very widespread phenomenon that look at has a lot of hyponyms (say, 10-20), and see seems in general to be the only experience (if any) that has any hyponyms at all. (The picture is less clear, though, for the copulative expressions.)

5.1.1. Swahili. In Swahili there are two polysemous verbs that signal an experience, ona and sikia. According to my informant who translated the basic paradigm, the division of labor between ona and sikia is as shown in Table 12. According to Johnson's Standard Swahili-English Dictionary (1939 and subsequently), the following applies: 'Ona alone and unqualified by context usually means, see with the eyes, as contr. with other senses, e.g. kusikia si kuona, hearing is not the same as seeing.' The example also shows that sikia is interpreted as 'hear' when it is unqualified by context. It is thus possible to establish a prototypical meaning for these verbs. As for the extended meanings, you can see from the diagram that ona 'see' has an extended meaning (indicated with an arrow) connected to taste. That this is an extended meaning is obvious also from the presence of a noun ladha 'taste', which directly signals the sense modality. Actually, it seems that ona may occasionally be used even of all the other sense modalities, although the possibilities given in the diagram were the only ones that were in general use according to the informant. The following examples
[Table 11. The types of polysemy with respect to the sense modalities attested among the experience verbs of the sample; the table itself is not legible in the source.]