Cognitive Linguistics in the Redwoods: The Expansion of a New Paradigm in Linguistics 9783110811421, 9783110143584


English Pages 1023 [1028] Year 1995

Table of contents :
Introduction
Section I: At ground level
What’s cognitive about cognitive linguistics?
Neurological evidence for a cognitive theory of syntax: Agrammatic aphasia and the spatialization of form hypothesis
Cost in language acquisition, language processing and language change
From cognitive psychology to cognitive linguistics and back again: The study of category structure
Historical aspects of categorization
Unpacking markedness
Section II: Within morphology and the lexicon
The cognitive frame of a set of cricket terms
Towards a cognitive account of the use of the prepositions por and para in Spanish
What are copula verbs?
The semantics of “empty prepositions” in French
Getting at the meaning of make
Motion metaphorized: The case of coming and going
Liegen and stehen in German: A study in horizontality and verticality
The semantics of the Chinese verb “come”
Touching: A minimal transmission of energy
Section III: Some of the architecture
Complement construal in French: A cognitive perspective
Typology of if-clauses
Boundedness in temporal and spatial domains
Case markers and clause linkage: Toward a semantic typology
The thing is is that people talk that way. The question is is Why?
A cognitive grammar account of bound anaphora
Sequential conceptualization and linear order
Section IV: Higher levels of the architecture
Cognitive aspects of verbal interaction
The interaction of folk models and syntax: Case choice after prepositional verbs of cognition in German
Computer modelling of text comprehension
Section V: The varieties in Native America
The radial structure of the Wanka reportative
Chiquihuitlán Mazatec postverbs: The role of extension in incorporation
Frames and semantics of applicatives in Tepehua
List of contributors
Index

Cognitive Linguistics in the Redwoods

Cognitive Linguistics Research 6

Editors René Dirven Ronald W. Langacker John R. Taylor

Mouton de Gruyter Berlin · New York

Cognitive Linguistics in the Redwoods The Expansion of a New Paradigm in Linguistics Edited by

Eugene H. Casad

1996

Mouton de Gruyter Berlin · New York

Mouton de Gruyter (formerly Mouton, The Hague) is a Division of Walter de Gruyter & Co., Berlin

Printed on acid-free paper which falls within the guidelines of the ANSI to ensure permanence and durability.

Library of Congress Cataloging-in-Publication Data

Cognitive linguistics in the redwoods : the expansion of a new paradigm in linguistics / edited by Eugene H. Casad p. cm. - (Cognitive linguistics research ; 6) Rev. papers originally presented at the general sessions of the 2nd International Cognitive Linguistics Conference, which was held July 29-Aug. 2, 1991, University of California, Santa Cruz. Includes bibliographical references and index. ISBN 3-11-014358-5 1. Cognitive grammar. I. Casad, Eugene H. II. International Cognitive Linguistics Conference (2nd : 1991 : University of California, Santa Cruz). III. Series. P165.C645 1995 415-dc20 95-34480 CIP

Die Deutsche Bibliothek - Cataloging-in-Publication Data

Cognitive linguistics in the redwoods : the expansion of a new paradigm in linguistics / [2nd International Cognitive Linguistics Conference, which was held July 29-Aug. 2, 1991, University of California, Santa Cruz]. Ed. by Eugene H. Casad. Berlin ; New York : Mouton de Gruyter, 1995 (Cognitive linguistics research ; 6) ISBN 3-11-014358-5 NE: Casad, Eugene H. [Hrsg.]; International Cognitive Linguistics Conference ; G T

© Copyright 1995 by Walter de Gruyter & Co., D-10785 Berlin All rights reserved, including those of translation into foreign languages. No part of this book may be reproduced or transmitted in any form or by any means, electronic or mechanical, including photocopy, recording, or any information storage and retrieval system, without permission in writing from the publisher. Printing: Gerike GmbH, Berlin Binding: Lüderitz & Bauer, Berlin Printed in Germany

Preface

This volume would never have been possible without the willing and enthusiastic support of both the authors who wrote the papers that appear here and the colleagues who refereed them. I am deeply indebted to each of them. The authors have delighted my heart by their vote of confidence in me as first shown by their contributing papers to this volume and then by their revising those papers on the basis of comments generously given by the referees, both those on the Editorial Advisory Board of the Series of Monographs on Cognitive Linguistics Research and those with other affiliations on whom I called for assistance. I extend my deepest gratitude to each of the following who read one or more papers for me: John Barnden, Bill Bright, Ken Cook, Claudia Brugman, Paul Deane, Nicole Delbecque, René Dirven, Wolfgang Dressler, Gilles Fauconnier, Dirk Geeraerts, Ray Gibbs, Louis Goossens, Joe Grimes, Gottfried Graustein, Bruce Hawkins, Dick Hudson, Yoshihiko Ikegami, Laura Janda, Zoltán Kövecses, Thomas Krzeszowski, Adrienne Lehrer, Odo Leys, Suzanne Kemmer, Ludo Melis, Peter Mühlhäusler, Brygida Rudzka-Ostyn, Gary Palmer, Mava Jo Powell, Günter Radden, John Rager, Sally Rice, Paul Saka, Rainer Schulze, Mike Smith, Elzbieta Tabakowska, John Taylor, Sandra Thompson, David Tuggy, Mark Turner, Willy Van Langendonck, Claude Vandeloise, James Watters, Anna Wierzbicka, and Margaret Winters. Their judgments almost always reinforced my own, and, more importantly, without their assistance, I could not have made the decisions that I did.

Beyond these, I would like to mention four other people whose support was crucial to the emergence of this volume. In the first place, Ron Langacker's influence is pervasive both in the background to this volume and in its very contents. His extensive knowledge of Uto-Aztecan languages found application to my own research into a single Southern Uto-Aztecan language, Cora, which I have been able to describe in detail in a number of publications. He was also my thesis advisor and patiently endured my struggles in trying to become a grad school student at the age of thirty-seven. Finally, he has also been hovering in the background during the entire process of editing this volume; his suggestions and encouragement have helped keep this project alive. Beyond that, several of the papers presented here are written by his students, both present and former ones. His work is also reflected in most of the other papers of this volume.

Brygida Rudzka-Ostyn has been at least equally supportive over the last six years. She organized my first lecture tour abroad and has opened the way for me to participate in the academic arena in several other ways, twice, for example, by serving as editor of volumes in which my own papers appear. Besides reading a couple of papers for me, she has given me numerous helpful suggestions regarding the editing of this volume. Both on the professional level and the personal, Brygida has been one of the most wonderful people I have ever met. I cannot express sufficiently my thanks to Brygida and her husband Paul for the hospitality they have shown to me on several occasions and the encouragement and good times that they have given to me.

René Dirven has also been very supportive for the last several years. In his role as Conference Coordinator of the International Cognitive Linguistics Association, he made the choices that led to my undertaking the editorship of this volume. It has grown out of the Second International Cognitive Linguistics Association Conference that was held at the University of California at Santa Cruz, July 29-August 2, 1991.
Whereas almost all of the plenary session lectures have been reserved for publication in the journal Cognitive Linguistics, the papers included in this volume are a selection of the general session papers which have been refereed and revised for publication.

The influence of the irascible George Lakoff is seen firstly in the title of this volume; he came up with it. In addition, his work with Mark Johnson and Mark Turner on categorization and metaphorization is reflected in many of the papers found here. He also provided me with a copy of one of his papers that I refer to in the introduction to this volume. In a similar vein, I would like to thank Melissa Bowerman, Dedre Gentner and Len Talmy for the papers they also sent to me. My special thanks also go to Östen Dahl for having provided me with technical assistance at one stage in the writing of the introduction to this volume in Stockholm.

In addition to all the above, two gifted ladies have graciously put their expertise to work in early stages of the copy editing of this volume; I am very much indebted to the late Verna Glander of the Technical Services Department of the Mexico Branch of the Summer Institute of Linguistics in Tucson, Arizona. I am equally indebted to Birgit Smieja, René Dirven's aide at the University of Duisburg, for all her work leading to the final round of the copy editing. Both of these colleagues have made this volume much more readable and presentable than I could ever have done on my own. Hermann Cölfen was responsible for the final round of copy-editing with all the nitty-gritty of preparing special characters and handling the graphics for the diagrams, in addition to entering all the final editorial changes that I dumped on him at the last moment.

Finally, I would like to express my thanks to the directorate of the Mexico Branch of the Summer Institute of Linguistics for giving me the leeway to take the time out from other activities in order to carry out this editorial task, which took away a full year from other important tasks.

Eugene H. Casad

Contents

Introduction
Eugene H. Casad

Section I: At ground level

What's cognitive about cognitive linguistics?
Raymond W. Gibbs

Neurological evidence for a cognitive theory of syntax: Agrammatic aphasia and the spatialization of form hypothesis
Paul Deane

Cost in language acquisition, language processing and language change
Dorit Ravid

From cognitive psychology to cognitive linguistics and back again: The study of category structure
Barbara C. Malt

Historical aspects of categorization
Gábor Györi

Unpacking markedness
Laura A. Janda

Section II: Within morphology and the lexicon

The cognitive frame of a set of cricket terms
Willem J. Botha

Towards a cognitive account of the use of the prepositions por and para in Spanish
Nicole Delbecque

What are copula verbs?
Bruce Horton

The semantics of "empty prepositions" in French
Suzanne E. Kemmer and Hava Bat-Zeev Shyldkrot

Getting at the meaning of make
Keedong Lee

Motion metaphorized: The case of coming and going
Günter Radden

Liegen and stehen in German: A study in horizontality and verticality
Carlo Serra Borneto

The semantics of the Chinese verb "come"
Ya-Ming Shen

Touching: A minimal transmission of energy
Claude Vandeloise

Section III: Some of the architecture

Complement construal in French: A cognitive perspective
Michel Achard

Typology of if-clauses
Angeliki Athanasiadou and René Dirven

Boundedness in temporal and spatial domains
Hana Filip

Case markers and clause linkage: Toward a semantic typology
Toshio Ohori

The thing is is that people talk that way. The question is is Why?
David Tuggy

A cognitive grammar account of bound anaphora
Karen van Hoek

Sequential conceptualization and linear order
Arie Verhagen

Section IV: Higher levels of the architecture

Cognitive aspects of verbal interaction
Jacqueline Lindenfeld

The interaction of folk models and syntax: Case choice after prepositional verbs of cognition in German
Johanna Rubba

Computer modelling of text comprehension
Inger Lytje

Section V: The varieties in Native America

The radial structure of the Wanka reportative
Rick Floyd

Chiquihuitlán Mazatec postverbs: The role of extension in incorporation
Carole Jamieson Capen

Frames and semantics of applicatives in Tepehua
James K. Watters

List of contributors

Index

Introduction Eugene H. Casad

Central to the endeavor of Cognitive Linguistics is the idea that language use is grounded in our daily experience. A typical case in point is Cora, a Uto-Aztecan language of Northwest Mexico in which the grammatical structure is highly influenced by both the geographic environment and the social structure that constitute the matrix of life for the Cora people. Topographic adverbs, locative particles, demonstrative pronouns, definite articles and an elaborate set of verbal prefixes of location and direction permeate Cora linguistic structure. Not only does a close study of the semantics of these elements tell us a lot about how the Coras themselves view the world around them, it also tells us much about the kind of theoretical constructs that one must invoke in order to give a credible and satisfying account of these complex data. In particular, it suggests strongly that Cognitive Grammar, as it is being elaborated by Langacker, Lakoff, Geeraerts, Rudzka-Ostyn, Sweetser, Talmy, Taylor, Wierzbicka and their associates, is an appropriate and powerful framework for linguistic analysis and description. This framework is applicable, of course, to a much broader range of phenomena than Cora locationals, as all the papers in this volume testify. Janda, for example, explains the bewildering variety of markedness phenomena in terms of concepts central to cognitive linguistics. In addition to the studies presented here, the papers in volumes such as Paprotté and Dirven 1985, Rudzka-Ostyn 1988, Geiger and Rudzka-Ostyn 1993 and Sweetser and Fauconnier 1994 amply illustrate the utility of this approach. Obviously, one's idea of what "cognitive" means differs from person to person and the role that "cognitive" phenomena are accorded in linguistic theory may differ greatly from framework to framework. For some investigators, what is "cognitive" is outside the domain of linguistics proper and can thereby be singularly consigned to some other workplace. 
On the other hand, a basic assumption of Cognitive Linguistics as it is presented in this volume and in the other volumes of the series Cognitive Linguistics Research is that linguistic descriptions and explanations must accord with what we know about human mental processing as a whole (Langacker 1987: 12-13; Lakoff 1990: 40). The "cognitive commitment", as Gibbs and Lakoff call it, carries heavy implications for how the overall research is carried out, the kind of data that are collected, what the investigator chooses to say about those data and the choice of both the theoretical constructs and the notational devices he/she uses for presenting data and explaining them. All of these points are addressed in various ways by the twenty-eight papers included here. They are divided into five sections roughly framed according to distinct implicatures of the metaphor "Cognitive Linguistics in the Redwoods".

Section I: At ground level

The six papers contained in this section treat topics that explore the basis of cognitive linguistics - the phenomena that occur at ground level, if you please, the neurological, mental, developmental, environmental, functional and societal matrix which gives rise to the conventionalized usages in language, that which we call "grammar" in the widest sense of the term, including morphology, the lexicon and discourse. In the leadoff article of this volume, "What's cognitive about cognitive linguistics?", Ray Gibbs nicely contextualizes for us what is cognitive about our approach. He notes, first of all, that cognitive linguistics is especially cognitive because of the way that it incorporates empirical findings from other disciplines into linguistic theory. The research strategy employed is an interdisciplinary one and the works included in this volume are in part selected precisely to illustrate this point. All the papers in Section I, as well as Radden's and Serra Borneto's papers in Section II and Lindenfeld's and Lytje's papers in Section IV, link Cognitive Linguistics to a number of psycholinguistic, neurological, developmental and sociocultural issues. In addition, Gibbs points out that Cognitive Linguistics seeks to examine the specific contents of human knowledge and not just its architecture. Most of the papers in this volume illustrate his second point to one extent or another. Gibbs also provides us with an additional call to further research by addressing some interesting questions with which Cognitive Linguistics must concern itself. For example, he cites four possible ways that conceptual knowledge can influence language use. The demonstration as to which ones do influence it and the kinds of influence that they exert is an empirical issue still to be settled.

Paul Deane's paper addresses the question as to why children possess such a strong sense of linguistic structure. Deane's account, in strong contrast to the Chomskyan approach, presents arguments for an explicit linguistic theory that makes specific predictions about how linguistic knowledge is instantiated in the brain. In particular, Deane explores the neurological basis for Lakoff's Spatialization of Form Hypothesis, which states that grammatical structure is organized in terms of basic spatial schemas such as LINK, PART/WHOLE and CENTER/PERIPHERY. Deane posits a variety of LINKS and shows that the disruption of these links leads to a number of types of agrammatism, which may vary greatly in how severely they affect a person's speech performance. At this point Deane's work converges nicely with that being carried out by Damasio and Tranel and their Convergence Zone Hypothesis, discussed by Lakoff in a recent paper (Lakoff 1993). This latter work in turn confirms the developmental studies of nouns and verbs detailed in Gentner (1982). In short, Deane's theory seeks to ground linguistic phenomena in general cognitive capacities and argues for a model of syntax which is neither autonomous nor strictly modular. Grammar is simply one instantiation of the general human capacity for spatial structural thought, a point underscored by recent research on American Sign Language by Armstrong, Stokoe and Wilcox (1993: 7).
Barbara Malt focuses precisely on the need for interdisciplinary efforts in discussing the questions of concepts and word meanings from the standpoints of both Cognitive Psychology and Cognitive Linguistics in her paper titled "From cognitive psychology to cognitive linguistics and back again". Malt is concerned about the increasing divergence between cognitive psychologists and cognitive linguists, in spite of the fact that both disciplines seem to be closely aligned: both seek to characterize how the human mind understands the world and encodes that understanding in language. In order to stress the fundamental overlap between Cognitive Psychology and Cognitive Linguistics, Malt presents the results of experiments that suggest that the structure of many common object categories studied by cognitive psychologists may be more like the structure of categories discussed by cognitive linguists. Folk models, for example, do not adequately constrain what entities are counted as category members, but rather multiple dimensions are important in characterizing the concepts which underlie common object categories. In support of her point is Langacker's comment that "Most concepts require specifications in more than one domain in order to characterize them" (Langacker 1987: 154). In passing, note also that a number of linguists do discuss and invoke folk models as one possible factor in some of their analyses (cf. especially Herskovits 1986; Holland and Quinn 1987 and Rubba, this volume).

In "Cost in language acquisition, language processing and language change" Dorit Ravid explores some of the cognitive principles and strategies that govern how language is acquired and how it is processed. The language of the study is Hebrew. Ravid undertakes to characterize and explain the variation in the usage of certain Hebrew verbs as evidenced by speakers of varying age and socioeconomic status. For her, a basic assumption is that linguistic change has its source in the synchronic variation found within a given speech community, a point of view very much in sympathy with Labov's widely known work, and one that fundamentally underscores Langacker's view that the interaction of grammar as a sanctioning device for actual language usage is the crucible of emerging language structures (1987: 65).
Ravid concludes that the changes which do find their way into the grammar and become part of the established standards are less "costly" than those changes that momentarily pop up, but never gain acceptance. The accepted changes have achieved their aim without disrupting the system elsewhere and creating even greater havoc. Tuggy's paper in Section III discusses a spectacular example in English of just such a change that does not cost very much in Ravid's terms. One fundamental constraint that also helps to ensure this result is the "intelligibility" requirement discussed by Györi in the paper that follows.

Redwood forests achieve their present form through a long period of growth, accommodation to the environment and diversification. The history of the forest is indelibly imprinted in the phenomena found there. This is also true of linguistic systems. Gábor Györi, in "Historical aspects of categorization", examines how categories come to be formed in a culture and the way that they become encoded in language. He holds that the process of cultural category formation is functional in nature precisely because it is based on the way that a speech community adapts to its environment. Sounding a note fully compatible with that recently expressed by Anttila (1992: 316), Györi holds that etymologies reveal much about how cultural categories are formed, since they show how conceived reality can be construed in alternate ways at different points in time to facilitate a society's adaptation to its environment. A major topic discussed in this paper is the role of a descriptive naming model as the mechanism for the coding of culturally valid categories (cf. also Armstrong, Stokoe and Wilcox 1993: 10). Finally, Györi discusses parallels between Hermann Paul's view of semantic change and that of both Geeraerts and Langacker on the contemporary scene.

In her paper titled "Unpacking markedness", Laura Janda explores the nature and phenomena of markedness. She shows that different kinds of markedness phenomena are natural by-products of the cognitive structuring of language. Janda also finds that the theoretical constructs of Cognitive Linguistics are particularly suited to her approach. Those that figure prominently in her analysis include the primarily Lakovian notions of 'radial category', 'the idealized cognitive model (ICM)', 'basic level' and 'metaphorical mapping'.
For Janda, all of human linguistic knowledge is stored in cognitive categories and the structure of those categories results in markedness phenomena. Finally, she notes that the assignment of markedness values is neither arbitrary nor predictable, a point similar to that made by Kemmer and Bat-Zeev Shyldkrot, Lee and Watters in this volume regarding the data they discuss.

Section II: Within morphology and the lexicon

The life of a redwood forest is found within the morphology of its architecture: the roots, the trunk, the branches and the leaves. The papers in this section deal with those aspects of language which illustrate its life as seen in its own morphological structures and lexicon. Included here are papers by Willem Botha, Nicole Delbecque, Bruce Horton, Suzanne Kemmer and Hava Bat-Zeev Shyldkrot, Keedong Lee, Günter Radden, Carlo Serra Borneto, Ya-Ming Shen and Claude Vandeloise. In "The cognitive frame of a set of cricket terms", Willem Botha analyzes lexicographic definitions taken from four different dictionaries of Afrikaans, viewing them against the background of the culture-based conventionalized knowledge which is encapsulated in what he calls "the cricket frame". He concludes that the conceptualization of different cricket terms takes place in relation to an intrinsic point of orientation. For example, the definition of a term such as batsman involves the fact that batsman acts as an intrinsic point of orientation and that, furthermore, that orientation is a two-sided one. An adequate lexicographic definition of this term must make explicit note of this. In other words, the lexicographer, as both a perceiver and as a conceptualizer, must go onstage with the batsman. In Langacker's terms, such lexical items are highly objective in nature (cf. Langacker 1990). Prepositions convey a variety of semantic relations. Yet they are often held to be grammatically determined and empty of semantic content. Nicole Delbecque examines these assumptions in her detailed discussion of the Spanish prepositions por and para. Her purpose is to provide a unified and cognitively satisfactory account of the uses of por and para based on usages culled from a corpus of essays. 
She arrives at single schematic meanings for each of these prepositions (a feat not always attainable) and spells out partial semantic networks in which she relates the specific meanings of por and para to the schematic meanings of each one. Other aspects of Delbecque's analysis include the role of the differential profiling of elements within a schematic structure, the influence of the speaker's perspective on the scene he/she is describing and the way in which the situation itself is construed.

Bruce Horton focuses on a different domain of grammar in "What are copula verbs?" He shows that in the inventory of English copula verbs, there is a category prototype, as well as a range of copular types that diverge from that prototype in various ways. He goes on to discuss the entire gradient of copular verb types, which range from noncopular "look-alike constructions" to quasi-copular constructions and on to the true copulas. He notes, crucially, that category membership is not an all or nothing affair, but is rather a matter of degree, a theme oft discussed by Lakoff, Langacker and Rosch, among others, and reiterated by several papers in this volume, including that of Tuggy in his analysis of the "double is construction" in English. Suzanne Kemmer and Hava Bat-Zeev Shyldkrot turn their attention to the French prepositions à and de, noting that these prepositions often appear in similar syntactic contexts, but with a distribution that seems entirely arbitrary. Thus, infinitival complements, for example, may be introduced by either à or de. Their goal is to show that semantic properties of à and de motivate their occurrence in the constructions in which they introduce infinitival complements. Their analysis encompasses both synchronic and diachronic facts and relates cases in which the semantics of à and de are clear to those in which the semantics of à and de appears to play no role whatever in the construction. They find that there is no clear dividing line between the "meaningful usages" of à and de and the "meaningless usages". Basically, even prepositions involving infinitives can be meaningful. The characterization of the meanings of these constructions is simply a matter of imposing alternative ways of viewing the situation that the speaker is discussing: the main clause and its relation to the infinitive clause are construed in different ways and this is reflected in the choice of either à or de. 
Achard in his paper on French complements (this volume), Delbecque in the preceding paper on Spanish por and para and Verhagen in his paper on linear order in complex sentences in Dutch (also in this volume) all invoke the notion of construal in their respective analyses. This conclusion also underscores Langacker's claim that grammar structure is almost entirely overt (Langacker 1987: 46; 1992: 127, 465). Finally, Kemmer and Bat-Zeev Shyldkrot note that the question of the meaningfulness of grammatical elements is essentially independent of the degree of obligatoriness in the occurrence of these elements.

The notion of a semantic network that relates specific meanings of a lexical item or those of a grammatical morpheme to more schematic meanings is the framework for Keedong Lee's paper "Getting at the meaning of make". He follows Bolinger (1977) in claiming that a word form is not a container into which different and unrelated senses can be randomly placed, but rather is one which contains related senses. An additional construct from Cognitive Grammar that figures heavily in Lee's analysis of make is the conceptual base that is necessary for characterizing a predicate and the ancillary notion of profiling distinct elements within that base, an idea also invoked in the papers of Delbecque and Kemmer and Bat-Zeev Shyldkrot discussed above. Lee notes that in the conceptual base associated with the meaning of make, there are several components. However, given entities within conceptual structure are not always profiled in the same way. Some are selected for special attention, while others are backgrounded in the base. Through this profiling, the verb make comes to have not only different senses, but also gets grammaticalized in various ways so that its variants can take distinct complement structures. Lee also concludes from his study that the different senses of make are not so arbitrary as commonly thought, but that they are not predictable either, a sentiment shared by several of our other authors.

That language use is grounded in our daily experience is the starting point for Günter Radden in his paper "Motion metaphorized: The case of coming and going." This accounts for both the persuasiveness of metaphors that describe events in terms of motion and the observations of developmental psychology that motion verbs are the ones that children learn earliest, are the most frequently used ones and are conceptually dominant (cf. Miller and Johnson-Laird 1977).
Behind all this is a fundamental schema whose properties allow it to serve as the base for numerous metaphors and whose properties have been discussed by a number of authors in a variety of contexts (cf. Casad 1982; 1993; Johnson 1987; Lakoff 1987, Langacker 1987; Lindner 1981).

In this paper Radden addresses the problem of the metaphorical mappings from the source domain of motion onto the target domain of change of state. He notes that the conceptual metaphor CHANGE OF STATE IS CHANGE OF LOCATION is highly motivated, is probably universal and is an entailment of the general metaphor STATES ARE LOCATIONS. The conceptual metaphor CHANGE IS MOTION is probably an excellent candidate for Lakoff's Invariance Hypothesis, according to which "metaphorical mappings preserve the cognitive topology (that is, the image-schema structure) of the source domain" (1990: 54). The topological elements of the motion schema, SOURCE, PATH, GOAL and, possibly, DIRECTION, are directly mapped onto the structure of changes of states. Spatial SOURCE and GOAL correspond to the states before and after transition, respectively. Spatial PATH corresponds to the transitional phase of a change of state and spatial DIRECTION may be related to the "direction" of a change of state.

Lakoff's notion of an 'image-schema' is a powerful tool used widely in semantic analyses within Cognitive Grammar. Image-schemas provide a natural way for representing notationally such factors as the speaker's vantage point on a scene, the speaker's involvement in the scene that he/she is describing and the orientation of foregrounded entities vis-à-vis backgrounded ones within a given context. In his paper "Liegen and stehen in German", Carlo Serra Borneto invokes all these aspects of image-schemas to relate the basic usages of each verb to their figurative and metaphorical usages. He also contrasts these two verbs with each other in considerable detail. He concludes that one cannot treat the notions 'horizontality' and 'verticality' as simple universal semantic features. Instead, he finds at work complex, almost 'Gestalt-like schemata', which are linked to basic perceptual and psychological experiences, but which are not necessarily derived from them.
Vandeloise reaches a similar conclusion with respect to the feature 'contact' in his discussion of the French verb toucher (this volume). Serra Borneto also finds a 'semantic continuum' that can be associated with each verb, relating the figurative usages to the more concrete ones. The utility of the framework of Cognitive Grammar as a tool for the description and explanation of linguistic phenomena is evidenced

10

Eugene H. Casad

by the ease with which its concepts and notational devices can be applied to diverse languages with equal appropriateness - what is different remains so, what is cognitively the same is revealed as such. In "The semantics of the Chinese verb 'come'", Ya-Ming Shen presents an integrated analysis of the main verb usages of lai 'to come'. She discusses the distribution of lai in different sentence patterns as well as the different relationships between lai and its postverbal complements. Considering both semantic structures and syntactic ones, Shen describes how the semantic structures of lai differ from each other in different sentence types and how these meanings are interrelated by means of a semantic network. In her analysis, Shen particularly invokes the following notions of Cognitive Grammar: (1) the base-profile distinction, (2) the "degree of prominence" scale, (3) the "setting-participant" asymmetry, (4) the ability to shift mentally from one domain to another, (5) the "subjectivity-objectivity" distinction and (6) the notion "active zone". In summary, Shen notes that the diverse usages of lai group into those that specify spatial motion versus those that specify abstract motion. These two major senses are related by both a shift from the domain of physical space to that of mental space as well as a shift from an objective to a subjective perspective. Claude Vandeloise discusses certain facts about transitive usages of the French verb toucher 'to touch' in his paper "Touching: A minimal transition of energy". In doing so he adds to our literature on the nature of Force Dynamics, a topic brought to the fore by Len Talmy (1988; 1993). To be more precise, Vandeloise notes that the transitive usages of toucher often forbid any transmission of energy. Curiously, in the present tense, such sentences cannot be passivized. Vandeloise accounts for this in terms of the concept of minimal physical action. 
The subject in such sentences is neither a prototypical agent nor a prototypical experiencer, but rather stands midway between the two along the energy chain. He notes that the ban on passivization may be waived only if the external object makes contact particularly difficult. The main lesson that Vandeloise draws from all this concerns the role of the notion 'contact' in language analysis. This concept is often presented as an important semantic feature in the componential analysis of the lexicon. Although the feature [± contact] provides a convenient way of dividing all spatial relationships into two classes, one class which allows contact, the other which excludes it, linguistic categorization seems more complex. Categorization relies on complex bundles of attributes, conceptualized globally, whereas the feature [± contact] is only one such attribute. It may sometimes be a necessary condition, but it is never a sufficient one. This also holds true for the transitive usages of toucher: they are better described in terms of the concept minimal physical action rather than in terms of the topological concept of 'contact'. This in turn explains why A touche B cannot be used as a paraphrase of either certain kinesthetic predicates or certain static ones.

Section III: Some of the architecture

In the terms of Cognitive Grammar, all grammatical units fall somewhere along a continuum of symbolic structures ranging in size from morphemes to lexical items to phrasal structures and then on to sentence and discourse level structures, a point that is prominent in the writings of Pike and Longacre. As Langacker observes, the higher up one goes on the complexity scale, the more schematic the patterns tend to be and the less conventionalized (Langacker 1987: 36; 1991: 117; 1992: 6,152). Nonetheless, all conventionalized units, even those considered syntactic, are meaningful and the meanings of all such grammatical constructions can be modelled in much the same terms as those of lexical items.

The papers in Section III are grouped here because they discuss higher level grammatical constructions, some instantiated by simple sentence structures, others by complex sentence structures. These papers are by Michel Achard, Angeliki Athanasiadou and René Dirven, Hana Filip, Toshio Ohori, David Tuggy, Karen van Hoek and Arie Verhagen, respectively.
They all tie meaning to their syntactic analyses in substantial ways and invoke many of the same constructs that were employed in the analyses of individual lexical items and grammatical morphemes given by the papers in Section II.

Michel Achard, in "Complement construal in French: A cognitive perspective", provides a semantic account of the distribution of modal marking in sentential complements. He finds that whether a speaker of French uses indicative marking or subjunctive marking on the verb in the subordinate clause of a complex sentence is a matter of how the speaker construes the content of that subordinate clause. He accounts for the use of the indicative mood vis à vis the subjunctive mood in terms of a compatibility condition between the main verb of the sentence and the meaning of the indicative mood. He states that the use of the indicative mood in French means that the content of the complement clause is viewed as a proposition, a distinct part of a conceptualizer's dominion. The main verbs of indicative sentences tend to be verbs of perception, communication and propositional attitude. The kinds of verbs found to be incompatible with the meaning of the indicative include verbs of volition. In Achard's terms, these verbs are "solely concerned with the event described in the complement". We can likely conclude that indicative complements construe their content objectively, whereas subjunctive complements construe their content subjectively. Since the objective-subjective asymmetry is a matter of degree, it is no surprise that Achard also finds that verbs of emotional reaction are "potentially compatible" with the meaning of the indicative.

In "Typology of if-clauses", Angeliki Athanasiadou and René Dirven provide a detailed description of English if-clauses, focussing on the relationship between the various types that they discovered in a sample of 400 sentences culled from the Cobuild Corpus. These types include three classes of Course of Event Conditionals (CEC), a class that instantiates a gradient along the probability scale of Hypothetical Conditionals (HC) and two classes of Pragmatic Conditionals (PC).
They find that distinct cognitive needs are associated with the use of each kind of conditional as well as different degrees of cognitive salience. The use of Course of Event Conditionals resides in the fact that speakers have firm knowledge of real situations, expect them to occur regularly and make generalizations on the basis of those expectations. On the other hand, Hypothetical Conditionals arise from the speaker's need to make predictions about possible future events based on his everyday experiences. Such "predictions", of course, are stated in terms of a sliding scale of probability of outcome. Pragmatic Conditionals, in contrast, relate to the domains of logic and conversation. The role of Logical if-clauses is to understate the strong certainty that the speaker has regarding a given situation, whereas the Conversational if-clauses function to background the use and expression of too obvious a reason for some event. Finally, given the semantic transparency of if in the Hypothetical Conditionals, Athanasiadou and Dirven suggest that this use of if is the prototypical one.

Hana Filip's contribution, titled "Boundedness in temporal and spatial domains", presents an analysis within the framework of Construction Grammar to show how Slavic languages employ verbal operators to allow speakers to interpret nominal complements as either definite or indefinite. Filip characterizes Construction Grammar as a "mono-stratal, non-transformational and unification-based framework". Important to her analysis is the idea of an 'Incremental Theme', which applies to those cases in which a simple NP is associated with the participant that "measures out" an event. Here Filip follows the theories of Krifka and Dowty, who link the Incremental Theme to the direct object NP's in such expressions as to build a house. Filip links the Incremental Theme to the domain of an entire sentence and places it within an Incremental Schema, which is one of the interpretive schemas (or frames, in the sense of Fillmore) that is associated with sentences. Certain Aktionsart and aspect properties of sentences are interpreted against this schema. Filip also points out that this schema allows one to characterize the interaction between predicate operators and nominal arguments in terms of the system of categories that make up the "disposition of a quantity" (Talmy 1986: 16ff.).
The meaningfulness of even highly grammatical morphemes comes out in Toshio Ohori's paper "Case markers and clause linkage". Ohori draws on data from a variety of languages from distinct stocks, keeping in view the need to remain descriptively adequate while seeking to make the pertinent generalizations. He cites a number of parallels between case markers and clause linkage markers and concludes that these parallels are motivated on semantic grounds: in part, this motivation comes from the figure and ground distinction that is operative in semantic extension, and, in part, from the interplay of localism and the Gestalt preserving nature of semantic extension. He finds that case markers for the peripheral relations are more likely to be extended to become clause linkage markers than those from the core grammatical relations. In Ohori's terms, peripheral NP's are those that serve as datives, ablatives and instrumentals. Peripheral relations also involve a variety of subordinate clauses. All of these share the property of serving as ground in a relational predication. On the other hand, nominatives and accusatives do not fit the pattern because they are either selected as figure within a complex predication or they are indeterminate with respect to the figure-ground distinction. Nonetheless, as Ohori himself notes, there are sufficient problems and there is sufficient fuzziness in all the data that the statement of particular parallels one hopes to discover can only emerge from a pair by pair study of languages for selected grammatical features.

A central theme of Langacker's formulation of Cognitive Grammar is that grammar sanctions usage, but that this sanctioning is not strongly determinative of the form that an expression assumes in a given case. David Tuggy's paper on the "double is" construction in English illustrates quite well some of the implications of this point of view. The "double is" construction is characterized by a short definite noun phrase whose head is ordinarily the word thing. This noun phrase is followed by two occurrences of the word is. These are in turn followed by the complementizer that and a finite clause. The finite clause itself may be quite long. Tuggy notes that this construction is marginal in English in several respects and mentions that many people who actually use it consider it to be erroneous and to be a deviation from the similar English copular construction which has a single is.
Sanctioning can be multiply motivated and that is the answer that Tuggy gives: the double is construction has apparently arisen from a number of sources, most of them anomalous or erroneous. In particular, this construction is sanctioned by parallelism with the "legitimate" double is construction, by solidification of the phrase the thing is with the concomitant loss of the analyzability of its parts and by the use of a unit complementizer is that. In short, Tuggy presents us with a snapshot of an erroneous construction being partially sanctioned by a few established patterns of English grammar and becoming grammaticalized to take its own position within the grammar. In this position, then, it is now beginning to sanction its own use. It is hard to see how any of this could even take place if language really was rule-ordered as the generativists would have us believe.

Karen van Hoek presents us with a cognitive analysis of bound anaphora in English, and in so doing, shows us in precise terms how a cognitive analysis contrasts with a generative one in accounting for similar data. The generative account that van Hoek has in mind is Reinhart's 1983 solution, which invokes the notion of c-command, i.e. in a syntactic tree structure, the first branching node which dominates an element X must also dominate an element Y in order for the relation X c-commands Y to hold. For bound anaphora in particular, the first branching node that dominates the antecedent must also dominate the pronoun. Van Hoek notes that while this condition accounts for much of the data, it does not account for a number of construction types.

Van Hoek's analysis argues that the antecedent for a pronoun functions as a conceptual reference point. It is an element which is highly salient within the context in which the pronoun is embedded and it shapes the semantic construal of the pronoun by specifying its referent. This analysis further argues that the constraint on bound anaphora follows from the antecedent's function as a reference point within a conceptual context set up by the quantifier, i.e. a mental space in the sense of Fauconnier 1985. The result of this analysis is a model that places severe limitations on the range of possible bound anaphoric configurations, a range that is nonetheless not as restrictive as that allowed by Reinhart's c-command analysis, but one that accommodates the facts more easily.
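The c-command condition as paraphrased above can be made concrete with a small sketch. It assumes a toy constituency tree; the `Node` class, the helper names and the example sentence "Every boy loves his mother" are illustrative assumptions and are not taken from Reinhart or van Hoek. The code implements only the condition as stated here (the first branching node dominating X must also dominate Y) and ignores refinements found in other formulations.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(eq=False)
class Node:
    """A node in a toy constituency tree (hypothetical helper)."""
    label: str
    children: List["Node"] = field(default_factory=list)
    parent: Optional["Node"] = field(default=None, repr=False)

    def add(self, *kids: "Node") -> "Node":
        # Attach each child and record its parent link.
        for kid in kids:
            kid.parent = self
            self.children.append(kid)
        return self


def dominates(a: Node, b: Node) -> bool:
    """True if a is a proper ancestor of b."""
    cur = b.parent
    while cur is not None:
        if cur is a:
            return True
        cur = cur.parent
    return False


def first_branching_ancestor(x: Node) -> Optional[Node]:
    """Lowest ancestor of x that has at least two children."""
    cur = x.parent
    while cur is not None and len(cur.children) < 2:
        cur = cur.parent
    return cur


def c_commands(x: Node, y: Node) -> bool:
    """X c-commands Y if the first branching node dominating X also dominates Y."""
    anc = first_branching_ancestor(x)
    return x is not y and anc is not None and dominates(anc, y)


# Assumed example: [S [NP every boy] [VP [V loves] [NP [Pron his] [N mother]]]]
subj = Node("NP: every boy")
pron = Node("Pron: his")
obj = Node("NP").add(pron, Node("N: mother"))
vp = Node("VP").add(Node("V: loves"), obj)
s = Node("S").add(subj, vp)

print(c_commands(subj, pron))  # antecedent c-commands the pronoun
print(c_commands(pron, subj))  # the reverse relation does not hold
```

In this assumed tree the subject's first branching node is S, which dominates the pronoun, so the bound reading is licensed under the condition; the pronoun's first branching node is the object NP, which does not dominate the subject.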
In "Sequential conceptualization and linear order", Arie Verhagen examines the question as to how the ordering of elements in a sentence is related to the sequencing of individual conceptualizations in a complex one. More precisely, how may linear order be used in order to justify the possible interpretations of a sentence? Verhagen considers two sets of data - (a) a set of verbs that may be viewed either subjectively or objectively and (b) extraposed relative clauses. His answer is partly based on the notion of independence: whenever two elements in a sentence are distinguished as separate, the one that comes first is to be conceptualized independently with respect to the one that follows, whereas the reverse does not apply. Verhagen's analysis is highly reminiscent of Achard's account, given earlier in this section, of the contrast between French complement constructions involving perception verbs vis à vis those embedded under volitional verbs. Both analyses, moreover, may well reflect distinct aspects of construal, a concept that Talmy has recently called "the windowing of attention in language" (Talmy 1993).

Section IV: Wider connections in the forest

Section IV contains papers by Jacqueline Lindenfeld, Johanna Rubba and Inger Lytje, all of which relate language use to a broader context, either social, conceptual or paralinguistic.

In "Cognitive aspects of verbal interaction", Jacqueline Lindenfeld seeks to employ a cognitively oriented approach to the study of verbal interactions in order to better understand the link between purposive and verbal behavior within the sociocultural context. For her analysis, she draws on insights from the ethnography of communication tradition of Hymes and Gumperz, as well as those of the communication goals studies of Craig (1986) and the work of Schank and Abelson (1977) on scripts. She characterizes communicative competence in terms of relations between actors' goals and their discourses as observed at a Southern California fruit stand. She notes that marketplace encounters are goal directed, that goal fulfillment is achieved in part through discourse, which varies in relation to the participants' specific goals, and that this, in turn, results in the diversity of conversational structures.

Johanna Rubba looks into an area of grammar usually held to provide crucial data for demonstrating the autonomy of syntax - the choice of case markers in a sentence.
Instead, Rubba proposes a direct link between conceptualization and syntax. Her study "The interaction of folk models and syntax: Case choice after prepositional verbs of cognition in German" impinges on a number of complex areas including preposition semantics, case semantics, the semantics of mental experience verbs and the German folk model of the mind. The particular proposal is that the metaphorical structuring of an area of experience in a folk model motivates case choice. Her analysis draws on the work of both Langacker and Lakoff and supports the work by Smith (1987) on German case marking, as well as that of Holland and Quinn (1987) on folk models. Rubba notes that folk models are complex schemas which people use to understand the world around them and to manage their own experience. Folk models are used to categorize, to reason, to form expectations and to guide behavior, among other things (cf. Lakoff 1982; 1987; Holland and Quinn 1987; Herskovits 1986). Rubba concludes that for prepositional verbs in which the preposition allows potentially for the selection of either accusative or dative case, the selection of a particular case marker is determined by the conceptualization of the event chain encoded by the verb. For some verbs, a scenario more closely approximating the transitive prototype in the realm of concrete action is found. For other verbs, a scenario is found which is much less like the transitive prototype. The case whose semantics best matches each conceptualization will be used to mark the prepositional object.

Presently there are few projects exploring the possible implementation of Cognitive Linguistics within the framework of Artificial Intelligence. The paper by Inger Lytje, titled "Computer modelling of text comprehension", represents one of only two efforts that I presently know of that attempt to employ Langacker's approach in a computer simulation of natural language processing (for the other, see Holmqvist 1992). As Lakoff states so clearly, the mind does a lot more than simply compute (1987: 348-9).
Thus it is encouraging to see someone express the view that the computer modelling of natural language understanding can be a methodology for gaining insight into language regarded as a multifaceted array of synergistic cognitive processes rather than as a set of autonomous formal rules (cf. Bowerman 1994). The project that Lytje describes in this paper is in its early stages of implementation and is based on a Danish lexicon of 4,000 words. The computer model that she is suggesting is construed as a research tool for studying the relation between semantic structures and the cognitive processes of understanding and comprehension. She and her associates are suggesting methodologies that seem to cope with some of the classical problems concerning ambiguity and undecidability. The method consists in rejecting classical categories in favor of invoking categorizing principles based on the roles of prototypes and schematic units.

Section V: The varieties in Native America

The Amerindian languages provide a genuine testing ground for the development and validation of Cognitive Linguistics because of the kinds of categories that are encoded in their grammars and the rich morphological structures that characterize their word, phrase and sentence patterns. Given that there are approximately 800 such languages in the Americas, many of which are rapidly passing off the scene, the need to collect data from them and document them as fully as possible is of paramount importance. This was stated forcefully by several authors in a recent issue of the journal Language, and was more recently underscored by the Symposium on Endangered Languages at the 48th International Congress of Americanists held in Stockholm, Sweden. To date, cognitive analyses of selected grammatical patterns of Amerindian languages have been published by Brugman for Mixtec, Palmer, Ogawa and Occhi for Coeur d'Alene, Tuggy for Nahuatl and Casad for Cora. In this section we add three more languages and three more authors to the roster. I include here papers by Rick Floyd, Carole Jamieson Capen and James Watters.

In "The radial structure of the Wanka reportative", Rick Floyd explores the domain of the reportative evidential suffix -shi in the Wanka dialect of Peruvian Quechua. Floyd assumes a view compatible with those expressed by Langacker 1987; 1991; 1992 and Lakoff 1987, i.e. the forms that linguistic structures take are motivated by human cognitive processing. He shows that the usages of the Wanka reportative suffix -shi fall into a radially structured category in which the extended usages are motivated by a central prototypical usage or by one or more of the extensions of that prototype.

Floyd finds four distinct usages of -shi. In its prototypical use, -shi indicates that an utterance is based on hearsay. In a second use, -shi marks the authoritative source for folklore. A third use occurs in riddles, whereas the fourth is one that Floyd labels "a challenge construction". Not all the uses of -shi can be adequately analyzed as hearsay. However, all of its uses do involve variations on the schematicity of the speaker-external information source. The central point is that there is no single characteristic that all the uses of -shi hold in common, but that conjointly they constitute a radial category.

The role of grammar as a sanctioning device for language use, discussed in Tuggy's paper, comes into the picture again in Carole Jamieson's paper "Chiquihuitlán Mazatec postverbs: The role of extension in incorporation". In addition, a number of other points crucial to Cognitive Grammar are illustrated by this paper. Chiquihuitlán Mazatec is an Oto-Manguean language spoken in the state of Oaxaca, Mexico. Its lexicon contains only about 300 simple verb stems. However, there is a set of approximately fifty postverbs which undergo incorporation into simple verb stems to create a rich lexicon of compound verbs. The postverbs include both optionally possessed nouns and inherently possessed body part nouns. Jamieson argues that the present schemata are neither basically syntactic nor semantic. Some of the characteristics of Chiquihuitlán Mazatec incorporation appear to have been sanctioned by the extension of existing syntactic patterns, while others appear to have been sanctioned by the extension of existing lexical patterns and may involve an interplay between them. Jamieson's comment here jibes very well with the accounts of multiple motivation already given in the papers by Tuggy and Floyd, among others. In Jamieson's view, the syntax of a language and its lexicon must be simultaneously available to the speaker.
It is this interplay or multiple motivation between the two processes which accounts for much of the lexical richness and grammatical complexity in the Chiquihuitlán verb. Her view also finds strong support in the psycholinguistics literature (Gentner 1993). Jamieson concludes that Chiquihuitlán Mazatec postverbs appear to be the result of the interaction or networking of the grammatical rules and ideals (cf. Herskovits 1986) and the building of the lexicon. Taken together, these factors show clearly how a seemingly small inventory of units may well combine into a very productive linguistic system, a point similar to that recently made by Pawley for Kalam, a language of the Highlands of Papua New Guinea (cf. Pawley 1987: 337).

The final paper in this volume presents an analysis of a set of constructions that occur in Tepehua, a Totonacan language of Eastern Mexico. In "Frames and the semantics of applicatives in Tepehua", James K. Watters discusses the ideas related to accounting for the morphology and the semantics of applicative constructions in this language. There are actually four affixes that figure in these constructions; Watters discusses the three of them that are the most recalcitrant semantically. The suffix -mi takes an argument that may be the goal, source, benefactee or causee in an event. The prefix li- takes an argument that may be the direction, the secondary theme, or the reason for which something is done (among other things). The prefix pu- takes an argument that may be either the route, instrument, means, contained location or manner in which something is carried out. Watters shows that any satisfactory account of the semantics of such constructions, including the assignment of semantic roles, must appeal to notions such as frames (Fillmore 1978, 1982, 1992) and image-schemas (Langacker 1987 and Lakoff 1987). He argues that in virtually all cases the resulting meaning is motivated by, although not necessarily predicted by, the image schema of the applicative suffix or prefix and the semantic frame associated with the verb to which it attaches. Watters uses the term "image-schema" to refer to the configuration imposed by the applicative affix and "frame" to refer to the scene (and the set of lexical stems) associated with the verb stem. Both are instances of what Lakoff 1987 calls "idealized cognitive models", but they differ significantly in the elaborateness of the information that each conveys.
Returning to the first paper in this volume, Gibbs comments that the focus of cognitive linguists on some of the possible ways that conceptual thought might influence language use and understanding has led to deeper analyses of human conceptual thought than was traditionally provided by generative linguists and that this appears to be the level at which Cognitive Linguistics makes its unique contribution to linguistics. He concludes that the studies coming out of this rapidly developing field are leading the way to new theoretical understandings of how the mind, body and language interact. And this is why cognitive scientists must pay close attention to the developments in Cognitive Linguistics.

In closing, we offer this selection of papers to cognitive linguists, cognitive psychologists and readers in all areas of Cognitive Science and Linguistics for their own study, benefit and use.

References

Anttila, Raimo
1992 "The return of philology to linguistics", in: Martin Pütz (ed.), Thirty years of linguistic evolution. Philadelphia and Amsterdam: John Benjamins, 313-332.

Armstrong, David F., William C. Stokoe & Sherman Wilcox
1993 Signs of the origin of syntax. Revised version of a paper presented at the annual meeting of the Language Origin Society, Cambridge University, Cambridge, UK. September 1992.

Bolinger, Dwight
1977 Meaning and form. London: Longman.

Bowerman, Melissa
1994 "The origins of children's spatial semantic categories: Cognitive versus linguistic determinants", in: John J. Gumperz & Stephen C. Levinson (eds.), Rethinking linguistic relativity.

Brugman, Claudia
1984 "The use of body-part terms as locatives in Chalcatongo Mixtec", Report No. 4 of the survey of California and other Indian languages. Berkeley: University of California at Berkeley, 235-290.

Casad, Eugene H.
1982 Cora locationals and structured imagery. Ph.D. dissertation, University of California at San Diego.
1988 "Conventionalization of Cora locationals", in: Brygida Rudzka-Ostyn (ed.), Topics in cognitive linguistics. Amsterdam and Philadelphia: John Benjamins, 345-378.
1992 "Cora yee and the conversational schema", Cognitive Linguistics 3.1: 151-186.
1993 "'Locations', 'paths' and the Cora verb", in: Richard Geiger & Brygida Rudzka-Ostyn (eds.), Conceptualizations and mental processing in language. Berlin and New York: Mouton de Gruyter, 593-645.

Casad, Eugene H. & Ronald W. Langacker
1985 "'Inside' and 'outside' in Cora grammar", International Journal of American Linguistics 51: 247-281.

Craig, Robert
1986 "Goals in discourse", in: Donald Ellis & William Donohue (eds.), Contemporary issues in language and discourse. Hillsdale, N.J.: Lawrence Erlbaum.

Fauconnier, Gilles
1985 Mental spaces. Cambridge, MA: MIT Press.

Fauconnier, Gilles & Eve Sweetser (eds.)
1994 Mental spaces, grammar, and discourse. Chicago: University of Chicago Press.

Fillmore, Charles J.
1978 "On the organization of semantic information in the lexicon", in: Donna Farkas, W. M. Jacobsen & K. W. Todrys (eds.), Papers from the parasession on the lexicon. Chicago: Chicago Linguistic Society.
1982 "Frame semantics", in: Linguistic Society of Korea (ed.), Linguistics in the morning calm. Seoul: Hanshin, 111-137.

Fillmore, Charles J. & Beryl T. Atkins
1992 "Toward a frame-based lexicon: The semantics of RISK and its neighbors", in: Adrienne Lehrer & Eva Feder Kittay (eds.), Frames, fields and contrasts: New essays in semantic and lexical organization. Hillsdale, N.J.: Lawrence Erlbaum, 75-102.

Geiger, Richard & Brygida Rudzka-Ostyn (eds.)
1993 Conceptualizations and mental processing in language. Berlin and New York: Mouton de Gruyter.

Gentner, Dedre
1982 "Why nouns are learned before verbs", in: Stan Kuczaj (ed.), Language development: Language, cognition and culture. Hillsdale, N.J.: Lawrence Erlbaum, 301-334.

Gumperz, John & Dell Hymes (eds.)
1964 The ethnography of communication. Washington, D.C.: American Anthropological Association.

Herskovits, Annette
1986 Language and spatial cognition: An interdisciplinary study of the prepositions in English. Cambridge: Cambridge University Press.

Holland, Dorothy & Naomi Quinn
1987 Cultural models in language and thought. Cambridge: Cambridge University Press.

Holmqvist, Kenneth
1992 Implementing cognitive semantics. Lund: Department of Cognitive Science.

Lakoff, George
1982 Categories and cognitive models. Cognitive Science Report No. 2. University of California, Berkeley: Institute for Cognitive Studies.
1987 Women, fire and dangerous things: What categories reveal about the mind. Chicago: University of Chicago Press.

1990 "The Invariance Hypothesis: Is abstract reason based on image-schemas?", Cognitive Linguistics 1.1: 39-74.
1993 "Convergence zones and conceptual structure" (Commentary on: Antonio Damasio and Daniel Tranel: Nouns and verbs are retrieved with differently distributed neural systems), Proceedings of the National Academy of Sciences (to appear).

Langacker, Ronald W.
1987 Foundations of cognitive grammar. Vol. I: Theoretical prerequisites. Stanford, CA: Stanford University Press.
1990 "Subjectification", Cognitive Linguistics 1: 5-38.
1991 Concept, image and symbol: The cognitive basis of grammar. Cognitive Linguistics Research 1. Berlin: Mouton de Gruyter.
1992 Foundations of cognitive grammar. Vol. II: Descriptive application. Stanford, CA: Stanford University Press.

Miller, George & Philip Johnson-Laird
1977 Language and perception. Cambridge, MA: Belknap Press of Harvard University Press.

Occhi, Debbie J., Gary B. Palmer & Roy Ogawa
1993 Like hair, or trees: Semantic analysis of the Coeur d'Alene prefix ne' 'amidst'. Ms.

Ogawa, Roy H. & Gary Palmer
1994 Langacker semantics for three Coeur d'Alene prefixes glossed as 'on'. Ms.

Paprotté, Wolf & René Dirven (eds.)
1985 The ubiquity of metaphor: Metaphor in language and thought. Amsterdam and Philadelphia: John Benjamins.

Pawley, Andrew
1987 "Encoding events in English and Kalam", in: Russell S. Tomlin (ed.), Coherence and grounding in discourse. Amsterdam and Philadelphia: John Benjamins, 329-360.

Reinhart, Tanya
1983 Anaphora and semantic interpretation. Chicago: University of Chicago Press.

Rudzka-Ostyn, Brygida (ed.)
1988 Topics in cognitive linguistics. Amsterdam and Philadelphia: John Benjamins.

Schank, Roger C. & Robert P. Abelson
1977 Scripts, plans, goals and understanding. Hillsdale, N.J.: Lawrence Erlbaum.

Smith, Michael B.
1987 The semantics of dative and accusative in German: An investigation in cognitive grammar. Ph.D. dissertation, University of California at San Diego.

Talmy, Leonard
1983 "How language structures space", in: Herbert L. Pick, Jr. & Linda P. Acredolo (eds.), Spatial orientation: Theory, research, and application. New York: Plenum Press, 225-282.
1988a "The relation of grammar to cognition", in: Brygida Rudzka-Ostyn (ed.), 165-205.
1988b "Force dynamics in language and cognition", Cognitive Science 12: 49-100.
1991 "Path to realization: A typology of event conflation", in: Laurel A. Sutton and Christopher Johnson with Ruth Shields (eds.), Proceedings of the Berkeley Linguistic Society, Vol. 17: 480-519.
1993 "The windowing of attention in language". Paper presented to the Third International Cognitive Linguistics Conference, Leuven, Belgium. July 21-25, 1993.

Tuggy, David
1981 The transitivity-related morphology of Tetelcingo Nahuatl: An exploration in Space Grammar. Ph.D. dissertation, University of California at San Diego.

Section I At ground level

What's cognitive about cognitive linguistics? Raymond W. Gibbs

1. What's cognitive about cognitive linguistics?

The emergence of the International Cognitive Linguistics Association (ICLA) and the new journal Cognitive Linguistics reflects the growing desire by many linguists to study language from a cognitive perspective. Under this approach, linguistic structures are seen as being related to and motivated by human conceptual knowledge, bodily experience, and the communicative functions of discourse. The great variety of papers in this volume, along with those published in Cognitive Linguistics, provides good evidence that doing linguistics from a cognitive perspective leads to rich insights into many linguistic phenomena, ranging from studies in phonology and syntax, to those in semantics, pragmatics, and psychological aspects of language use.

Cognitive linguistics has blossomed in part through the adherence to two different commitments (Lakoff 1990). The Generalization commitment emphasizes that we seek general principles in our theoretical descriptions of linguistic phenomena. For example, in syntax there are generalizations about the distributions of grammatical morphemes, categories and constructions. In semantics there are generalizations about inferences, polysemy, semantic fields, conceptual structure, and so on. In pragmatics there are generalizations about communicative functions such as speech acts, implicatures, deixis, and language use in context. Cognitive linguistics takes as an empirical issue whether these domains are autonomous or related (e.g., is the distribution of grammatical morphemes influenced by semantics and pragmatics?). The Cognitive commitment stresses the importance of incorporating a wide range of data from other disciplines into our theoretical description of language. This commitment compels cognitive linguists to recognize the empirical findings from related disciplines such as cognitive and developmental psychology, psycholinguistics, anthropology, neuroscience, and so on.

Defining what is special about cognitive linguistics in terms of the Generalization and Cognitive commitments has caused quite a bit of consternation within the linguistic community. For example, adopting the cognitive commitment as primary suggests a different meaning of the notion "generalization" than that assumed by most formal linguists and philosophers. Generalizations are statements about categories (Lakoff 1991). Yet much of the empirical research in cognitive science suggests the inadequacy of the traditional view that categories are defined by necessary and sufficient conditions. Most of our everyday conceptual system is based on basic-level categories and prototypes that are graded, radial, and metaphoric (Lakoff 1987). Consequently, what counts as a generalization within linguistic theory will depend on whether one adheres to the cognitive commitment.

There has been great debate as to whether the cognitive commitment is a defining or characteristic feature of cognitive linguistics. As one of the coordinators for the 2nd ICLA conference held at Santa Cruz in July 1991, I sent out an announcement in January 1991 over the computer network Linguist List about the ICLA, the journal Cognitive Linguistics, and the upcoming ICLA conference. My message introduced readers to the ICLA in the following way: "The ICLA has an explicit interdisciplinary orientation, not only in the sense that cognitive linguistics tries to incorporate relevant research from other cognitive disciplines, but also because it hopes to highlight the contribution of linguistics to Cognitive Science." Readers did not comment on this idea that linguistics should make a significant contribution to Cognitive Science.
But a tremendous debate arose when I went on to say: "Within cognitive linguistics, the analysis of the conceptual and experiential basis of linguistic categories and constructs is of primary importance: the formal structures of language are studied not as if they were autonomous, but as reflections of general conceptual organization, categorization principles, and processing mechanisms." This last line struck a very sensitive nerve in some of the people who responded to my message (the purpose of which, again, was to inform people about the 2nd ICLA conference held at the University of California, Santa Cruz, in the summer of 1991, not to start a full-blown debate). Various people complained that linguistics has always been cognitive and that research in other cognitive disciplines confirms many of the ideas touted by generative linguists, particularly in regard to the autonomy of language from cognition (cf. Chomsky 1980; Fodor 1983). Some critics stated that there is nothing special about cognitive linguistics to warrant calling itself cognitive. One well-known linguist even commented in the Linguist List that the adoption of the term cognitive by cognitive linguists was "arrogant." Over the past year many psychologists have conveyed to me their skepticism about the term cognitive linguistics because it implies that there is something missing from what they do as cognitive psychologists in studying language and language users. As one psychologist asked me recently (spoken in a stern tone of voice): "What exactly is it that cognitive linguists are doing that is so special and deserving of the title 'cognitive linguistics'?"

My aim in this chapter is to suggest what is especially cognitive about cognitive linguistics. Linguists and psychologists who are skeptical of the cognitive linguistics enterprise miss several significant features of cognitive linguistics that make it unique, not only as a linguistic science, but in relation to all of the cognitive sciences.
I will suggest that cognitive linguistics is especially cognitive because (a) of the way that it incorporates empirical findings from other disciplines into linguistic theory, and (b) it seeks to examine the specific contents, and not just the architecture, of human conceptual knowledge. Thus, Langacker places a "content requirement" on the organization of grammar, which specifies that the only structures constituting the grammar of a given language are (a) the linguistic structures attested in the data; (b) the structures that are schematic for the attested ones; and (c) the categorization relationships that occur both within grammatical constructions and within schematic networks (Langacker 1987: 53-54; 1990: 290-291). I will also argue that cognitive linguistics needs to recognize the limitations of its methods and should remain open to complementary research strategies in studying thought and language. Research in experimental psycholinguistics adds significant, and much needed, constraints to theory construction in cognitive linguistics. This experimental work uses the ideas from cognitive linguistics to explore what is indeed cognitive, and "psychologically real," about cognitive linguistics.

2. Linguistics, Psycholinguistics, and the Cognitive Commitment

The uproar over the emergence of cognitive linguistics, as witnessed in the heated debate in the first few months of 1991 on the computer network Linguist List, focused in part on whether generative linguistics has always been cognitively oriented. Chomsky has often stated that linguistics is a branch of cognitive psychology, and many linguists of the generative persuasion claim to pay close attention to research in cognitive psychology, psycholinguistics, and cognitive neuroscience. In fact, many linguists argue that research findings from these disciplines support the autonomy of language position (see Fodor 1983; Gardner 1985; Garfield 1987 for reviews). For this reason, so goes the argument, cognitive linguistics cannot define itself as that branch of linguistics adhering to the cognitive commitment because even generative linguistics looks closely at the work in related cognitive disciplines.

Cognitive linguists enthusiastically embrace the cognitive commitment because there is much research in related cognitive disciplines that bears directly on the relationship between thought and language that has completely been ignored by both linguists and cognitive psychologists in formulating theories of linguistic structure and behavior. For instance, work by Rosch and her colleagues on categorization is rarely seen by linguists or psychologists as relevant to linguistic issues. As evident from the large body of research that has emerged from cognitive linguistics in recent years, ideas about basic-level categories and prototypes have greatly informed theories in all areas of linguistic research (e.g., phonology, morphology, syntax, semantics and pragmatics). The findings of this research point to important possibilities that linguistic categories are not autonomous from general conceptual organization and processing mechanisms.

So what are generative linguists thinking of when they argue that there is much evidence from cognitive psychology that supports the autonomy of language view? I briefly describe below some of the evidence that is thought to favor the autonomy of language, or modularity, view because its existence is seen by some linguists as removing the need for any subfield of linguistics called cognitive linguistics. At the same time, this experimental work from contemporary psycholinguistics illustrates what is both important and missing from cognitive linguistics.

The question of whether linguistic structures are autonomous or reflections of general conceptual organization has motivated a variety of research in psycholinguistics on language processing. Psycholinguists ask to what extent linguistic and non-linguistic information interact during the immediate, on-line comprehension of sentences. One view, the autonomy or modularity position, suggests that contextual and conceptual information is evaluated only after various linguistic sources of information (e.g., phonology, syntax) have been accessed and evaluated. Advocates of the modularity position tend to adhere to the philosophical belief that linguistic structures are autonomous from more general conceptual structures, with the language faculty being its own special mental organ or module. The alternative position argues that contextual and conceptual information immediately influences analysis of specific linguistic information during on-line sentence processing.
Psycholinguists contend that non-speeded tasks, such as grammaticality and semantic judgements, are not informative about the sources of information, both linguistic and non-linguistic, that are evaluated during the immediate parsing of linguistic expressions. To better understand whether language understanding involves linguistic knowledge that is autonomous from general conceptual knowledge, psycholinguists have devised many clever techniques to intrude on the silent, unconscious activities of reading and listening. Consider two examples of experimental work whose results are thought to bear directly on the autonomy of language processing debate.

One focus of recent psycholinguistic research considers whether lexical access (the initial activation of semantic information attached to a lexical representation) can be affected by the context in which the word occurs. Context clearly influences the meaning of a word; the question is when context has its effect. For example, is listeners' processing of the ambiguous word bug immediately influenced by their understanding of the context in which the word appears? If context has its effect only after a lexical ambiguity has been accessed, then a variety of meanings or senses for a word, such as bug, should be momentarily activated. Evidence for this initial activation of multiple senses for an ambiguous word is seen as support for a model of language processing where certain linguistic information (e.g., from the lexicon) is analyzed apart from other conceptual knowledge (i.e., the autonomy view).

The most influential work on ambiguity resolution employed a cross-modal lexical decision paradigm (Swinney 1979; Onifer & Swinney 1981). Participants in one study listened over headphones to sentences such as the following (Swinney 1979):

The man was surprised when he found several spiders, roaches, and other bugs in the corner of his room.

Precisely at the offset of the ambiguity (bugs), a target letter string was presented visually on a computer screen facing the participants. The target was either related to one meaning of the ambiguity (e.g., ANT or SPY) or was an unrelated control (e.g., SEW). The participants' task was to quickly decide whether the visually presented letter string formed an English word (i.e., to make a "lexical decision").
One of the advantages of this paradigm is that the visual target is presented simultaneously with the offset of the ambiguity so that the lexical decision reflects immediate access processes rather than post-access phenomena. If there is a point in time at which all meanings of the ambiguous word are activated, then hearing this word should prime or facilitate people's lexical decision response to a visual target word related to any of its meanings. In fact, lexical decisions to targets related to either meaning of the ambiguity (e.g., ANT or SPY in the above example) were faster than were responses to the control words (e.g., SEW), regardless of the contextual bias. However, when these targets were displayed 750 milliseconds after the offset of the ambiguous word, lexical decisions were faster only to targets that were contextually related (e.g., ANT). It appears that people momentarily activate all the meanings of an ambiguous word, with context then working to disambiguate its meaning.

These and other results have led to a view in which the access of an ambiguous word, or more generally lexical access, is an exhaustive, context-insensitive sub-process of the language comprehension system (Burgess, Tanenhaus, & Seidenberg 1988; Kintsch & Mross 1985). Context becomes operative only at a post-access stage, guiding the selection of the contextually relevant meaning of the ambiguous word. These findings have been interpreted as support for the autonomy of lexical processing position.

Linguists who favor the autonomy of language position often cite work by Swinney and others when they claim that experimental evidence exists in other cognitive disciplines to support their view of language as a formal, autonomous system of the mind. Their acknowledgment of this work leads them to believe that they are adhering to the cognitive commitment (the commitment which makes one's account of human language accord with what is generally known about the mind and brain from disciplines other than linguistics). Unfortunately, these same linguists rarely acknowledge the considerable evidence on processing of lexical ambiguity that contradicts the autonomy of language view of lexical access (Glucksberg, Kreuz, & Rho 1986; Simpson 1981; Tabossi, Colombo, & Job 1987; Tabossi 1989; Van Petten & Kutas 1987). This work employs different experimental tasks or stimuli and provides results that support an interactive view of lexical processing.
The debate continues over whether lexical processing is best characterized by a modular or an interactive view. At the very least, it is safe to say that a good deal of evidence supports both major positions on this issue.

Another well-researched topic in contemporary psycholinguistics that bears on the autonomy of language debate concerns the interaction of syntax and semantics in human sentence processing. This work focuses on the question of why, and under what conditions, the human sentence processing mechanism adopts one syntactic analysis of an ambiguous sentence rather than another. The rationale for asking this question is that systematic preferences presumably reflect the human sentence processing mechanism's use of particular kinds of information at particular points in the processing sequence. One idea is that language comprehension follows a strategy of taking the first available structural option for attaching new input to old. This minimal attachment strategy represents a bias toward attaching a new phrase to an old node rather than building a new phrasal node (Frazier 1979). Sentences (a) and (b) are examples of minimal and nonminimal representations of prepositional phrases beginning with with (from Rayner, Carlson, & Frazier 1983).

(a) The spy saw the cop with binoculars but the cop didn't see him.
(b) The spy saw the cop with a revolver but the cop didn't see him.

In sentence (a) the phrase with binoculars is directly attached to the verb phrase, whereas in sentence (b) a new noun phrase is constructed and the prepositional phrase is attached to it. According to the minimal attachment hypothesis, sentences such as (b) should be more difficult to parse in the binoculars/revolver area than (a) because of the initial bias to simply attach the prepositional phrase to the verb phrase. One notable study recorded eye movements of people while they read sentences like these and found longer reading times in the binoculars/revolver area for the sentences where the correct interpretation violated minimal attachment, as (b) does (Rayner, Carlson & Frazier 1983). Thus, the human sentence processing mechanism's initial decisions, when faced with surface structure ambiguities, are made on the basis of purely structural criteria. If the initial decision subsequently turns out to have been wrong (as in the case of a garden path), then a thematic processor guides the sentence processor on to the correct analysis.

Psycholinguists continue to debate the independence of syntactic processing in human sentence comprehension (Altmann & Steedman 1988; Ferreira & Clifton 1986; King & Just 1991; Taraban & McClelland 1988). Similar to the debate on the independence of lexical access, there is much evidence to support the interactive view that semantic/pragmatic information constrains syntactic processing. And, once again, linguists often ignore this other work when they embrace certain evidence from cognitive psychology as support for their views on the autonomy of linguistic processing.

There are two things to be said about the psycholinguistic work on modularity in sentence processing and its relation to cognitive linguistics. First, critics of cognitive linguistics are at least partially correct when they say they are aware of findings in different cognitive disciplines that might be said to support the autonomy of language position. For this reason, it is incorrect to define cognitive linguistics simply in terms of that branch of linguistics that pays attention to related work in other cognitive disciplines (i.e., as that discipline that adheres to the cognitive commitment). But, secondly, cognitive linguistics would not, in my estimation, argue against the idea that (a) there is a mental lexicon that might be independently accessed during sentence processing, or (b) certain syntactic information might be evaluated independently of semantics in sentence comprehension.

It is important here to distinguish between two kinds of questions on the autonomy of language. First, is the language faculty autonomous of non-linguistic, cognitive abilities? Second, within the grammar, are different linguistic components truly independent of each other in their organization, function, and processing (e.g., is syntax autonomous of phonology, semantics, etc.)? Both of these questions demand empirical evidence to be answered, but probably require different kinds of evidence. Part of the confusion about the role of cognitive structure in language use and processing results from the failure to distinguish between different levels at which cognition and language interact.
Let me suggest four possible ways that conceptual thought might influence ordinary language use and understanding.

(1) Conceptual thought plays some role in changing the meanings of words and expressions over time, but does not motivate contemporary speakers' use and understanding of language.

(2) Conceptual thought motivates the linguistic meanings that have currency within linguistic communities, or may have some role in an idealized speaker's/hearer's understanding of language. But conceptual thought does not actually play any part in an individual speaker's ability to make sense of language or to process it.

(3) Conceptual thought motivates individual speakers' use and understanding of why various words and expressions mean what they do, but does not play any role in people's ordinary on-line production or comprehension of everyday language.

(4) Conceptual thought functions automatically and interactively in people's on-line use and understanding of linguistic meaning.

Many linguists, including those of the cognitive persuasion, play fast and loose between these different possibilities when they claim that cognition either does or does not play a role in language understanding. For example, the evidence from cognitive linguistics that supports possibilities (1), (2), and (3) does not necessarily indicate that possibility (4) is true. It is incorrect to suppose that conceptual knowledge has an automatic, immediate role in people's on-line processing of language until the appropriate on-line experiments have been conducted. These experiments will have to use methodologies similar to those employed in the research described above on lexical access and sentence processing.

At the same time, it is incorrect to claim, as some generative linguists and psycholinguists do, on the basis of on-line processing experiments such as those described above, that linguistic structures reflect specific, autonomous linguistic knowledge that is separate from conceptual knowledge. Very few studies explicitly examine whether conceptual knowledge is automatically instantiated during lexical and/or sentence processing (see Potter, Kroll, Yachzel, Carpenter, & Sherman 1986 for one notable exception). Psychologists and linguists cannot argue that the psycholinguistic evidence supports the autonomy of language view until the psychological reality of possibility (4) has been experimentally examined. Although much work needs to be done here, there exists some evidence that possibility (4) is true, especially in regard to how people understand figurative language (Gibbs 1984, 1989). In any event, cognitive linguists' focus on possibilities (1), (2), and (3) has provided a deeper analysis of human conceptual knowledge than is traditionally given by generative linguists. It is at this level of analysis that cognitive linguistics appears to make a unique contribution to the study of linguistics.

3. What is especially cognitive about cognitive linguistics?

Linguistic theory attempts to capture significant generalizations about language structure, generalizations that are often thought to reflect underlying linguistic universals. There are two very different working assumptions about the origins of language universals, and these have led to a good deal of misunderstanding (Clark & Malt 1984). Most linguists and some psychologists work from what can be called the generative wager.

It is highly likely that most aspects of language that are universal are a result not of general cognitive constraints, but of constraints specific to language functions, that is, specific to an autonomous language faculty. It is therefore appropriate a priori to assume autonomous psychological constraints and to leave it to others to prove otherwise.

Many psychologists and some linguists, on the other hand, make the opposite bet, which might be called the cognitive wager.

It is highly likely that most language universals are a result not of linguistically autonomous constraints, but of constraints general to other cognitive functions. It is therefore appropriate a priori to assume that language universals derive from general cognitive constraints and to leave it to others to prove otherwise.

Which of these two wagers would you put your hard-earned money on? Each wager reflects different working strategies that motivate linguists and psychologists to study linguistic structure and behavior from either the generative or cognitive perspective. The generative wager seems unsound because it encourages investigators not to look for structure-independent explanations of language universals, but to be satisfied with a purely linguistic description of a universal, assuming that it is also a description of a feature of the human language faculty. To take one example, if Kay and McDaniel (1978) had accepted the generative wager, they never would have sought an explanation for color terminology in the workings of the human visual system (Clark & Malt 1984).

Advocates of the generative wager will often miss cognitive/functional explanations of linguistic structure because they assume a priori that linguistic constructs are autonomous from general conceptual knowledge. A good example of this is found in Newmeyer's (1991) discussion of the work by Lakoff (1987) and Lee (1988) on the Japanese classifier hon. Lakoff and Lee argued that hon was historically extended via regular conceptual principles such as metaphor and metonymy from referring to long, thin, rigid objects like baseball bats, writing scrolls, and swords and staffs to refer to letters, home runs, and wins in martial arts contests. Hon now exists as a radial category whose internal state reflects these historical extensions. But Newmeyer incorrectly interprets these various historical extensions as resulting in a classifier that lacks a coherent function. This interpretation of the data on hon completely ignores the possibility that this Japanese classifier could reflect anything about the cognitive/functional organization of language, particularly principles of human categorization. Newmeyer's misunderstanding demonstrates the price paid by those who adhere to the generative wager. Most generally, it seems impossible a priori to distinguish between those universals whose explanations probably lie within an autonomous language faculty, if there is one, and those whose explanations lie without.
To really demonstrate that some aspects of linguistic structure are autonomous, one must show how this idea contrasts with cognitive explanations. Embracing the generative view simply doesn't allow for such contrasts, and linguistic theory has suffered as a result.

Linguists who bet on the cognitive wager open themselves up to a whole new range of theoretical explanations that are rarely considered by those adhering to the generativist assumptions. Again, the papers in this volume, those published in the journal Cognitive Linguistics, those in various edited collections (Rudzka-Ostyn 1988; Tsohatzidis 1989; Geiger and Rudzka-Ostyn 1993), and the many books by Haiman, Langacker, Lakoff, Taylor, Sweetser, and others all provide impressive evidence supporting the idea that language is not autonomous but a product of various, general cognitive mechanisms.

Some critics of cognitive linguistics who engaged in the debate on the Linguist List argued that most of the evidence from the cognitive view comes from work in semantics and pragmatics on topics such as preposition choice and metaphor. What is needed, so the critics cry, is work of interest to generative linguists such as "locality conditions on Norwegian pronouns or the possibility of explicative do participating in inverted counterfactuals." These scholars suggest, to take another example, that studies on the modularity of metaphor don't help explain the antecedents of Norwegian pronouns. Cognitive linguists should therefore provide cognitive re-analyses of these topics in syntax and phonology (e.g., verb-negation-clitic order, the distribution of parasitic gaps) that are primarily understood in terms of autonomous linguistic knowledge.

Anyone familiar with the work of cognitive linguistics in the last decade knows that there is significant research on topics in phonology and syntax from the cognitive perspective (perhaps generative linguists need to pay as much attention to work in their own discipline as they do to findings from neighboring fields). Although this work may not have touched on all the topics studied by generative linguists (e.g., locality conditions on Norwegian pronouns), there is enough evidence to suggest that many aspects of phonology and syntax, in addition to topics in semantics and pragmatics, can be understood in relation to general cognitive principles (two very recent examples of this work are found in Heine, Claudi, and Hünnemeyer 1991; and Ruwet 1991).
This is not to say that all aspects of language reflect conceptual structure, because there is probably a fair amount of linguistic knowledge that is indeed autonomous from general cognitive mechanisms. The advantage of the cognitive approach, and one reason why it makes cognitive linguistics especially cognitive, is that it does not accept the generative wager, but explicitly looks to possible links between cognition and language.

Beyond its significance for the practice of linguistics, cognitive linguistics plays a unique role in cognitive science because it aims not just at studying language as it relates to other aspects of cognitive knowledge, but at actually contributing to a deeper understanding of the conceptual contents of the human mind. Much of the work in cognitive linguistics is unique because it attempts to infer something about conceptual knowledge based on the analysis of systematic patterns of linguistic structure. These analyses of systematic patterns in language suggest a variety of conceptual and pre-conceptual structures including idealized cognitive models, image schemas, metaphoric and metonymic mappings, mental spaces, radial structures, and so on.

This emphasis on the contents of what people know and the bodily experiences that give rise to such knowledge is quite different from the major focus in cognitive science on the general architectural form of human thought and language. For example, cognitive psychologists historically attempt to characterize the different structural stores through which information is processed and transformed from input (i.e., when information enters the system from the environment) to output (i.e., behavioral response). Psychologists work on identifying information processing stores such as sensory, short-term, and long-term memories or try to distinguish between different systems within long-term memory such as semantic and episodic memory stores. Most recently, cognitive psychologists argue whether human cognition is best characterized as a symbolic or sub-symbolic (e.g., neural net) system. In most cases, cognitive psychologists focus on the architecture of mind and on the mental processes that operate within this representational system. They do not worry about the kind of knowledge people have or how people come to know what they do about themselves and the world.
Consistent with this information processing approach to the study of mind in cognitive psychology, psycholinguists traditionally concentrate on specifying the general architecture of the language processor. For example, do people possess separate linguistic processors representing their knowledge of phonology, morphology, the lexicon, syntax, semantics, and pragmatics? This theoretical concern with general architectural features of the human language processor and the processes that operate on these linguistic representations has provided cognitive science with a rich array of insights into the structure of mind. However, as in the case of cognitive psychology in general, there is very little impetus in psycholinguistics to study the contents of the mind in terms of the actual beliefs and conceptions that people have of themselves and the world around them or how such knowledge specifically motivates different linguistic behavior. Understanding what people actually know and what motivates how they know what they do is viewed as less theoretically interesting than being able to characterize the overall architecture of the mind.
But cognitive linguistics (and its allies in philosophy, anthropology, and psychology) views knowledge as arising out of people's bodily interactions with the world. Knowledge is seen not as static, propositional, and sentential, but as grounded in patterns of bodily experience. These patterns emerge throughout sensorimotor activity as we manipulate objects, orient ourselves spatially or temporally, and direct our perceptual focus for various purposes (Johnson 1991). To take just one example, we have a SOURCE-PATH-GOAL schema that develops as we learn to focus our eyes and track forms as they move throughout our visual field. From such experiences, a recurring pattern becomes manifest of tracking a trajectory from point A to another point B. The pattern itself may vary considerably (e.g., many objects, shapes, types of paths), but the emergent image-schematic structure of a SOURCE-PATH-GOAL can be projected into more abstract domains of understanding and reasoning (Johnson 1987). Thus, the SOURCE-PATH-GOAL schema gives rise to conceptual metaphors such as PURPOSES ARE DESTINATIONS, and English is replete with systematic expressions that illustrate this underlying metaphorical conceptualization. For instance, we start off to get our Ph.D.s, but along the way we get sidetracked or led astray, and are diverted from our original goal. We try to get back on the right path and to keep the end in view as we move along. Eventually we may come a long way and reach our goal (Johnson 1991).
This way of talking about experience shows how the PURPOSES ARE DESTINATIONS metaphor, resulting from a very basic image-schematic structure, is constitutive of our understanding of intentional action. Cognitive linguistic work over the past 15 years has identified many image-schematic structures that serve as the foundation for concepts and thought. My argument is that this work demonstrating explicit links between bodily experience and the actual content of what people know and understand is unique within the cognitive sciences. Looking for the conceptual and bodily foundations of knowledge through systematic analyses of linguistic expressions provides for a comprehensive understanding of both mind and language that makes cognitive linguistics seem especially cognitive.

4. Why cognitive linguistics needs psycholinguistics

As is the case with all scientific methods, there are limitations to the strategy of trying to infer something about conceptual structure from a systematic analysis of linguistic structure and behavior. The primary limitation is that shared by most linguistic research, namely the problem of drawing conclusions about phenomena based on the individual analyst's own intuitions. Distrust of the practice of relying on private, unverifiable intuitions as a source of data is an important reason for the growing interest in functionalist approaches to linguistics (Hopper 1991). Cognitive psychologists simply do not accept hypotheses about human conceptual knowledge, or anything for that matter, that are based on a theorist's intuitive speculations, even when such speculations are based on a systematic analysis of linguistic structure and behavior. To many, the idea, for example, that conceptual metaphors underlie our everyday experience or motivate our use and understanding of different linguistic expressions cannot be accepted as "psychologically real" because such a theory is based on intuitive explanation. They seek "objective" evidence that is elicited from experimental participants who have no pre-conceived notion about the phenomenon of interest. Although cognitive psychologists have in recent years begun to study people's introspections in a variety of domains (e.g., reading, problem-solving, decision-making), there remains a strong belief that linguistic analyses of human mental activities are unreliable because of their reliance on introspective judgments by theorists.
Another complaint about trying to infer aspects of conceptual knowledge from an analysis of systematic patterns of linguistic structure is that the resulting theories appear to have a rather post hoc quality to them. For instance, the claim that the systematicity in expressions such as He's wasting our time, I save an hour doing my paper on the computer, and I can no longer invest that much energy into my marriage is due to the presence of an independent, pre-existing conceptual metaphor TIME IS MONEY provides only a motivated explanation for linguistic behavior. Cognitive psychologists and psycholinguists wish to be able to predict behavior in advance according to the hypothetico-deductive method of scientific inference. What they seek is empirical, objective evidence that people's conceptual knowledge somehow predicts the existence of different linguistic behavior, not that people's linguistic behavior can be explained by positing theoretical entities such as conceptual metaphor.
I do not entirely agree with this characterization of how cognitive linguistics does its work, nor do I look negatively on motivated explanations of human behavior (see Casad 1988; Wierzbicka 1985 for two good examples of motivated explanations of data that have hitherto been considered arbitrary or unpredictable). Yet I recognize the need to provide empirical demonstrations for many of the ideas about human conceptual knowledge that have been proposed in cognitive linguistics (e.g., for notions such as conceptual metaphor, image schemas, radial structures, mental spaces, and so on). Much of my work over the past six years has been devoted to showing how people's conceptual systems motivate their use and understanding of everyday and literary language, particularly in regard to the role of conceptual metaphor in idiomaticity. Earlier publications describe much of this work (especially see Gibbs 1990). I now want to briefly report the findings of some new experiments that should be especially comforting to cognitive psychologists seeking more objective evidence on the role of conceptual metaphors in everyday language use.
This new work provides a complementary way of doing cognitive linguistics that partially helps to eliminate the strict reliance on the individual analysts' own intuitions in assessing different kinds of linguistic phenomena. It also shows how motivated explanations of linguistic structure can be used to predict people's linguistic behavior in experimental situations. Do people possess tacit metaphorical knowledge that influences their use and understanding of language (possibilities 3 and 4)? My research has examined this question in regard to people's understanding
of idiomatic phrases in English. The strategy employed in this work can be stated as follows: Use cognitive linguistics to suggest aspects of human conceptual knowledge that can then be studied independently from language and from which certain predictions can be made about linguistic structure and behavior. For instance, cognitive linguistic work suggests that people make sense of idioms such as blow your stack, flip your lid, and hit the ceiling because of their metaphorical understanding of certain concepts to which these idioms refer. People metaphorically conceptualize anger, in this instance, in terms of heated fluid in a container (i.e., ANGER IS HEATED FLUID IN A CONTAINER), and this tacit knowledge partially motivates why speakers create and use expressions such as blow your stack and flip your lid to talk about their anger experiences (Kövecses 1986; Lakoff 1987). How can these ideas about conceptual metaphor in idiomaticity be verified as "psychologically real?" A variety of experimental studies examined whether people possess metaphorical knowledge to enable them to make sense of idioms. For example, previous work on mental imagery for idioms indicated that people have very similar intuitions about the actions that are described by idiomatic expressions (Gibbs & O'Brien 1990). Consider anger idioms such as blow your stack, flip your lid, and hit the ceiling. Participants in earlier studies strongly agreed about the causes, intentionality, and manner in which stacks are blown, lids are flipped, and ceilings hit when they formed mental images for these anger idioms. This consistency in people's intuitions about their mental images for idioms was attributed to the constraining presence of specific conceptual metaphors that motivated the figurative meanings of these idioms. 
For the anger idioms studied, the conceptual metaphor ANGER IS HEATED FLUID IN A CONTAINER provides part of the link between an idiom and its figurative meaning and also constrains the inferences people make about what these idioms mean. Psycholinguistic research has also shown that people's knowledge of the metaphorical links between different source and target domains provided the basis for the appropriate use and interpretation of idioms in particular discourse situations (Nayak & Gibbs 1990). Participants in one study, for example, gave higher appropriateness ratings to blew her stack in a story that described the woman's anger as being like heat in a pressurized container, while bit his head off was seen as more appropriate in a story that described the woman's anger in terms of a ferocious animal. Thus, readers' judgments about the appropriateness of an idiom in context are influenced by the coherence between the metaphorical information depicted in a discourse situation and the conceptual metaphor reflected in the lexical make-up of an idiom.
A recently completed series of experiments extended these earlier studies to show that people's understanding of idioms reflected the particular entailments of the underlying conceptual metaphors motivating these phrases (Gibbs 1992). My hypothesis was that the metaphorical mappings that motivate idiomatic meanings preserve the structural characteristics or cognitive topology of the source domain. This idea follows directly from the invariance hypothesis (Lakoff 1990). For example, people's understanding of the causes, intentionality, and manner of physical events, such as heating fluid in containers (i.e., source domains), should be similar to their understandings of the causes, intentionality, and manner of the anger to which idioms such as blow your stack, flip your lid, and hit the ceiling refer. That is, people's knowledge of a particular source domain (e.g., their understanding about the behavior of heated fluids in containers) should predict their understanding of a dissimilar target domain (e.g., what they understand about anger) that is partially structured by that source domain. The first experiment simply assessed people's understanding of the causes, intentionality, and manner of the actions in the source domains (e.g., heated fluid in a container, the behavior of brittle objects in containers, and so on).
These events corresponded to particular source domains in various conceptual metaphors (e.g., ANGER IS HEATED FLUID IN A CONTAINER, IDEAS ARE PHYSICAL ENTITIES IN CONTAINERS, THE MIND IS A BRITTLE OBJECT, CONTROL IS POSSESSION OF SOME OBJECT) that have been seen in previous research as motivating the figurative meanings of idioms such as blow your stack, spill the beans, lose your grip, and lay down the law (Gibbs & O'Brien 1990). A short scenario was written to depict the basic elements in each of the four source domains. For example, the scenario for the source domain of heated fluid in a container stated "Imagine that you are looking at a container that is shaped like a cylinder. The top of the container is sealed. The container is completely filled with some sort of fluid." Following each scenario were three questions that queried participants about various events relevant to these source domains. One question assessed people's intuitions about the causation of some event (e.g., "Describe something that would cause the fluid to come spontaneously out of the container"). A second question assessed people's intuitions about the intentionality of that event (e.g., "Imagine that something caused the fluid to come out of the container. Do you think that the fluid comes out on purpose or does the fluid just somehow get out by accident?"). A final question assessed people's intuitions about the manner in which the event is performed (e.g., "Imagine again that the fluid comes out of the sealed container. Do you think the fluid comes out in a gentle manner or does it explode out?").
The participants' responses to each question were analyzed by determining the single most frequent answer to each question for each source domain. On average, collapsed across the three types of questions and the four source domains examined, 89% of the participants' responses were in agreement. For fluid in a container, 87% of the participants suggested that the fluid is heated and/or under pressure (causation), 87% suggested that the escape of fluid is unintentional (intentionality), and 84% said that the action is performed violently and abruptly (manner). For a fragile object in a container, 97% of the participants stated that some severe external stress is applied to the container that causes breakage of the fragile object, 95% suggested that breaking of the object in the container in this situation is unintentional, and 97% stated that the action causing the breakage of the object is forceful and quick.
For small objects in a container, 92% of the participants stated that the cause of the objects' escape is applied pressure or stress, 71% stated that escape of objects is intentional, and 73% suggested that the action is performed forcefully. Finally, for taking control of some object, all the participants stated that desire to control the object causes it to be grasped, 97% stated that the controlling action is intentionally performed, and 84% suggested that this action is performed with force.
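The scoring procedure just described, and the overall agreement figure, can be checked with a short sketch. It is illustrative only: the function name modal_agreement and the four sample coded answers are hypothetical and are not taken from the study, and only the twelve percentages come from the text above.

```python
from collections import Counter
from statistics import mean


def modal_agreement(responses):
    """Return the most frequent answer and the percentage of responses giving it."""
    answer, count = Counter(responses).most_common(1)[0]
    return answer, 100 * count / len(responses)


# Hypothetical coded answers to one causation question (not data from the study).
sample = ["heat", "heat", "pressure", "heat"]
print(modal_agreement(sample))

# Agreement percentages as reported in the text, in the order
# (causation, intentionality, manner) for each source domain.
reported = {
    "fluid in a container": (87, 87, 84),
    "fragile object in a container": (97, 95, 97),
    "small objects in a container": (92, 71, 73),
    "taking control of an object": (100, 97, 84),
}

# Average over all twelve cells (4 source domains x 3 question types).
overall = mean(p for triple in reported.values() for p in triple)
print(f"mean agreement: {overall:.1f}%")
```

Averaging the twelve reported cells gives about 88.7%, which rounds to the 89% overall agreement stated in the text.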

These responses show that people were quite consistent in their intuitions regarding the causation, intentionality, and manner of events for the four different source domains studied. The important finding in this experiment concerns the similarity between people's ideas about the source domains and other people's understanding of the causation, intentionality, and manner of the actions in their mental images for idioms (Gibbs & O'Brien 1990). For instance, the earlier studies showed that when people imagined the phrases blow your stack and flip your lid they consistently viewed the cause of the stacks blowing and lids flipping as being due to some internal pressure, and viewed these actions as unintentional and performed in a forceful manner. As shown in this most recent experiment, people have similar ideas about the objects and events in the source domain of fluid in a sealed container. The responses for the other source domains were also identical to those found by Gibbs and O'Brien (1990) in their participants' imagery protocols for revelation, control, and insanity idioms (e.g., spill the beans, crack the whip, and lose your grip). Most generally, then, the present findings clearly show how the metaphorical mappings between source and target domains, which motivate the figurative meanings of idioms, preserve the critical structural characteristics or cognitive topology of the source domains.
Further experiments in this series demonstrated that people viewed idioms as having more complex meanings than they did literal paraphrases of idioms. These idiomatic meanings could be predicted based on the independent assessment of people's folk understanding of particular source domains that are part of the metaphorical mappings which motivate these idioms' interpretations. Participants in a second study, for example, read short stories ending in either idiomatic phrases (e.g., blow your stack) or corresponding literal paraphrases (e.g., get very angry).
The participants then rated their agreement with different statements regarding the cause of the action (e.g., the cause of the anger), its intentionality (e.g., whether the person got angry intentionally), and the manner in which the action was performed (e.g., whether the person exhibited his anger in a forceful or gentle manner). The findings demonstrated that people viewed idioms as having very specific entailments compared to literal paraphrases, entailments that could be predicted by the source to target domain mappings of the idioms' underlying conceptual metaphors. Literal paraphrases of idioms (e.g., to get very angry, to reveal the secret, and so on) did not possess the same kind of specificity about the causation, intentionality, and manner of the human actions referred to by the idioms considered here. An important control study showed that the entailments people drew when they read idioms were not simply due to the meanings of the individual words in each phrase but could be attributed to the underlying conceptual metaphors that partially motivate these phrases' figurative meanings. A final series of reading time experiments showed that people found idioms more appropriate and easier to understand when they were seen in discourse contexts that were consistent with the various entailments of these phrases.
This set of studies on the conceptual basis for idiomatic meaning provides experimental evidence in support of cognitive linguistic analyses of idiomaticity. Such data specifically support the idea that the mappings of source-to-target domain information in conceptual metaphors preserve the structural characteristics or cognitive topology of the source domains (Lakoff 1990). This research strategy of predicting idiomatic meaning based on people's conceptualizations of various source domains differs from that employed by cognitive linguists who make inferences about the conceptual foundations of meaning from analyses of linguistic expressions (cf. Kövecses 1986; Lakoff 1987). My work takes advantage of the cognitive linguistic analyses to pick idioms and source domains that are likely to be motivated by conceptual metaphor. But the data from these studies are important because they provide an independent, non-linguistic way of partially predicting what specific meanings some idioms are likely to possess based on the analyses of certain metaphorical concepts in long-term memory.
As such, this experimental work is an important, perhaps necessary, complement to cognitive linguistic analyses, and suggests that such complementary research efforts will be essential to demonstrating what's really cognitive about cognitive linguistics.
The experimental evidence on the conceptual basis for interpreting idioms is not representative of research in contemporary psycholinguistics. As my earlier discussion made clear, psycholinguists have in recent years attempted to formulate theories of linguistic understanding to account for the moment-by-moment processes used when people ordinarily process language. Only experimental methodologies that tap into what people actually, and unconsciously, do on-line are thought to be appropriate in studying normal utterance comprehension. Although my data are quite suggestive of the possibility that people make use of various kinds of conceptual knowledge when understanding idioms, it is inappropriate to conclude from my work that people normally and automatically instantiate conceptual metaphors or image schemas when understanding language. It might very well be the case that people tacitly recognize that idioms have meanings that are motivated by different kinds of conceptual knowledge (possibility 3). But this does not mean that people always tap into this conceptual knowledge each and every time they hear certain idioms (possibility 4). It might even be true that people rarely make use of this conceptual knowledge during ordinary language understanding. For example, many idioms are highly conventionalized and their status as idioms is not salient perceptually, until, that is, some situation arises in which the literal counterpart to the idiom is used. Further research employing on-line methodologies can best provide the critical evidence of these different possibilities on the thought/language interface.

5. Conclusion

Cognitive linguistics is especially deserving of the term cognitive not solely because of its commitment to incorporating a wide range of data from other cognitive disciplines, but because it (a) actively seeks correspondences between conceptual thought, bodily experience, and linguistic structure, and (b) seeks to discover the actual contents of human cognition. At the same time that cognitive linguists are using ideas from related fields such as psychology and anthropology to inform their theoretical understanding of linguistic phenomena, they are discovering a good deal about the substantive content of human conceptual knowledge. This emphasis on specifying the actual knowledge, the various cognitive models, image schemas, radial structures, conceptual metaphors, and so on, that are constitutive of people's everyday experience, makes cognitive linguistics a unique discipline within the cognitive sciences. In many ways, cognitive linguistics is more cognitive in its orientation than other fields in cognitive science precisely because it seeks to understand the contents, and not just the architecture, of human cognition.
Despite the uniqueness of how cognitive linguistics approaches the interaction of thought and language, it is important not to prejudge the ultimate answer to the question of how language and mind are related. It might very well be that some linguistic knowledge is autonomous from the rest of our conceptual system. Many generative linguists and psycholinguists favor the autonomy of language view as an a priori assumption (the generative wager) or philosophical belief rather than as a result of empirical research. In the same way, most cognitive linguists tend to believe that there is much interaction of language with human conceptualization (the cognitive wager). But it seems best to view the generative and cognitive approaches to linguistics as research strategies rather than as a priori philosophical commitments. For this reason, rejection of the autonomy of language view should not be the defining feature of cognitive linguists. I say this noting some of the significant advances in the philosophy of mind that result from adopting the cognitive wager as an a priori philosophical commitment (cf. Johnson 1987, 1991).
Finally, cognitive linguistics must recognize some of the limitations in its methods, particularly in regard to the claims it sometimes makes about the role of conceptual knowledge in language use and understanding. There are at least four possible ways that conceptual knowledge can influence language use and we must be especially aware of how cognitive linguistics research does or does not bear on these possibilities. To say that conceptual knowledge permeates people's understanding of language (and experience) demands a closer analysis of what is meant by "understanding."
It may be that cognition influences how people make sense of language but not how they immediately comprehend linguistic expressions each and every time they encounter language. Much work is needed to tease apart these different possibilities and cognitive linguistics itself may be unable to provide the answers to some of these questions, such as whether possibility (4) above is true. At the very least, though, cognitive linguistics is leading the way to new theoretical understandings of how mind, body, and language interact. For this reason alone, cognitive scientists must pay close attention to developments in cognitive linguistics.

References

Altmann, Gerry & Mark Steedman
1988 "Interaction with context during human sentence processing", Cognition 30: 191-238.

Burgess, Curt, Michael Tannenhaus & Mark Seidenberg
1989 "Context and lexical access: Implications of nonword interference for lexical ambiguity resolution", Journal of Experimental Psychology: Learning, Memory & Cognition 15: 620-632.

Casad, Eugene H.
1988 "Conventionalization of Cora locationals", in: Brygida Rudzka-Ostyn (ed.), Topics in cognitive linguistics. Amsterdam: John Benjamins.

Chomsky, Noam
1980 Rules and representations. New York: Columbia University Press.

Clark, Herbert & Barbara Malt
1984 "Psychological constraints on language: A commentary on Bresnan, Kaplan, and on Givon", in: Walter Kintsch, James Miller & Peter Paulson (eds.), Methods and tactics in cognitive science. Hillsdale, N.J.: L. Erlbaum.

Ferreira, Fernanda & Charles Clifton
1986 "The independence of syntactic processing", Journal of Memory and Language 25: 348-368.

Fodor, Jerry
1983 Modularity of mind. Cambridge, Mass.: MIT Press.

Frazier, Lynn
1979 "On comprehending sentences: Syntactic parsing strategies." Bloomington, Ind.: Indiana University Linguistics Club.

Gardner, Howard
1983 Frames of mind. New York: Basic Books.

Garfield, Jay (ed.)
1987 Modularity in knowledge representation and natural language processing. Cambridge, Mass.: MIT Press.

Geiger, Richard A. & Brygida Rudzka-Ostyn (eds.)
1993 Conceptualizations and mental processing in language. Cognitive Linguistic Research 3. Berlin and New York: Mouton de Gruyter.

Gibbs, Raymond
1984 "Literal meaning and psychological theory", Cognitive Science 8: 275-304.
1989 "Understanding and literal meaning", Cognitive Science 13: 243-251.
1990 "Psycholinguistic studies on the conceptual basis of idiomaticity", Cognitive Linguistics 1: 417-451.
1992 "What do idioms really mean?", Journal of Memory and Language 31: 485-506.

Gibbs, Raymond & Jennifer O'Brien
1990 "Idioms and mental imagery: The metaphorical motivation for idiomatic meaning", Cognition 36: 35-68.

Glucksberg, Sam, Roger Kreuz & Susan Rho
1986 "Context can constrain lexical access: Implications for models of language comprehension", Journal of Experimental Psychology: Learning, Memory & Cognition 12: 323-335.

Heine, Bernd, Ulrike Claudi & Frederike Hünnemeyer
1991 Grammaticalization: A conceptual framework. Chicago: University of Chicago Press.

Hopper, Paul
1991 "Functional explanation in linguistics and the origins of language", Language & Communication 11: 3-28.

Johnson, Mark
1987 The body in the mind. Chicago: University of Chicago Press.
1991 "Knowing through the body", Philosophical Psychology 4: 3-20.

King, Jonathan & Marcel Just
1991 "Individual differences in syntactic processing: The role of working memory", Journal of Memory and Language 30: 580-602.

Kintsch, Walter & Ernest Mross
1985 "Context effects in word identification", Journal of Memory and Language 24: 336-349.

Kövecses, Zoltan
1986 Metaphors of anger, pride, and love. Amsterdam: John Benjamins.

Lakoff, George
1987 Women, fire, and dangerous things. Chicago: University of Chicago Press.
1990 "The Invariance Hypothesis: Is abstract reason based on image-schemas?", Cognitive Linguistics 1: 39-74.
1991 "Cognitive versus generative linguistics: how commitments influence results", Language & Communication 11: 53-62.

Langacker, Ronald
1987 Foundations of cognitive grammar, Vol. 1: Theoretical prerequisites. Stanford: Stanford University Press.
1990 Foundations of cognitive grammar, Vol. 2: Descriptive application. Stanford: Stanford University Press.

Lee, Mark
1988 "Language, perception, and the world", in: John Hawkins (ed.), Explaining language universals. Oxford: Basil Blackwell, 23-46.

Nayak, Nandini & Raymond Gibbs
1990 "Conceptual knowledge in the interpretation of idioms", Journal of Experimental Psychology: General 119: 315-330.

Newmeyer, Frederick
1991 "Functional explanation in linguistics and the origins of language", Language & Communication 11: 3-28.

Onifer, William & David Swinney
1981 "Accessing lexical ambiguities during sentence comprehension: Effects of frequency of meaning and contextual bias", Memory & Cognition 9: 225-236.

Potter, Mary, Judith Kroll, Barbara Yachzel, Elizabeth Carpenter & John Sherman
1986 "Pictures in sentences: Understanding without words", Journal of Experimental Psychology: General 115: 281-294.

Rayner, Keith, Mary Carlson & Lynn Frazier
1983 "The interaction of syntax and semantics during sentence processing: Eye movements in the analysis of semantically biased sentences", Journal of Verbal Learning and Verbal Behavior 22: 358-373.

Rudzka-Ostyn, Brygida (ed.)
1988 Topics in cognitive linguistics. Amsterdam: John Benjamins.

Ruwet, Nicholas
1991 Syntax and human experience. Chicago: University of Chicago Press.

Simpson, Greg
1981 "Meaning dominance and semantic context in the processing of lexical ambiguity", Journal of Verbal Learning and Verbal Behavior 20: 120-136.

Swinney, David
1979 "Lexical access during sentence comprehension: (Re)consideration of context effects", Journal of Verbal Learning and Verbal Behavior 18: 645-659.

Tabossi, Patrizia
1988 "Accessing lexical ambiguity in different types of sentential context", Journal of Memory and Language 27: 324-340.

Tabossi, Patrizia, Lucia Colombo & Remo Job
1987 "Accessing lexical ambiguity: Effects of context and dominance", Psychological Research 49: 161-167.

Taraban, Roman & James McClelland
1988 "Constituent attachment and thematic role assignment in sentence processing: Influence of content-based expectations", Journal of Memory and Language 27: 597-632.

Tsohatzidis, Savas L. (ed.)
1989 Meanings and prototypes: Studies in linguistic categorization. London: Routledge and Kegan Paul.

Van Petten, Cynthia & Marta Kutas
1987 "Ambiguous words in context: An event-related potential analysis of the time course of meaning activation", Journal of Memory and Language 26: 188-208.

Wierzbicka, Anna
1985 Lexicography and conceptual analysis. Ann Arbor: Karoma.

Neurological evidence for a cognitive theory of syntax: Agrammatic aphasia and the spatialization of form hypothesis

Paul Deane

1. Introduction

Central to syntax is the notion of structure: our perception that the expressions of language are organized into a complex but precisely articulated array of constituent structures and grammatical relations. It is this, more than anything else, which seems to separate grammar from other aspects of thought, which seem by contrast to be fluid and everchanging, dominated by the vicissitudes of family resemblance structure, prototypes, relevance, and the like. Thus arises one of the central questions of modern linguistic theory: why children are able to discern such intricate structures in the realm of language when most other intellectual abilities are as yet undeveloped. In a sense, linguistic theory is an attempt to explain why children possess such a strong sense of linguistic structure. We are all familiar with the standard, Chomskyan answer to this question: that linguistic competence is innate, triggered by experience, but emerging from a genetically endowed language faculty whose properties are unrelated to other, more general cognitive capacities. In this paper, I will seek to explore an alternative: a view which roots our sense for grammatical structure in an underlying capacity to appreciate the structure of physical objects. This view is termed the Spatialization of Form Hypothesis in Lakoff (1987: 283,290), and it has a certain intuitive appeal. It is certainly natural, when we speak of the constituent structure of sentences, to speak of constituents as parts of the sentence; when we speak of grammatical relations, we speak of


connections between elements in the sentence; and likewise, when we speak of head-adjunct relations, it is natural to describe the head of a phrase as its central part, or to speak of an adjunct as an element placed at the periphery of the phrase. It is possible, in short, to describe grammatical structure in physical terms. By itself, of course, this proves nothing about the nature of grammar: all that it demonstrates is that it is easy for adult humans to describe abstract concepts by the use of metaphor.

However, according to the Spatialization of Form Hypothesis, grammatical structure is organized in terms of abstract spatial schemas (LINK, PART/WHOLE, CENTER/PERIPHERY). If this is true, various predictions follow. First, we would not expect grammatical competence to be a strictly encapsulated cognitive system, but would expect to find interactions between grammatical and nongrammatical modes of thought. Second, we would expect to be able to develop a syntactic theory in which purely grammatical concepts were derived from properties of the LINK, PART/WHOLE, and CENTER/PERIPHERY schemas. And finally, we would expect to find an association between grammatical and spatial thought in the brain from which it would be possible to make accurate and insightful predictions about the interaction of linguistic structure and brain damage.

An earlier work (Deane 1992) develops a syntactic theory based upon the Spatialization of Form Hypothesis. Various arguments are presented. One series of arguments focuses on syntactic constraints like subjacency, presenting evidence for extensive interaction with semantic, functional and attentional variables. Ultimately, an attentionally-based theory of extraction is developed (see Deane 1991 for further explication).
Another series of arguments focuses on the predictive consequences of the Spatialization of Form Hypothesis, arguing that it yields a highly constrained account of syntactic structure in which apparently arbitrary aspects of grammar are predictable from extrinsically motivated properties of spatial cognition. A final argument is based upon facts of neuroanatomy and speech pathology. In particular, there is evidence for a neurological association between spatial and grammatical processing; when this evidence is combined with the grammatical theory developed in the rest of the work, it becomes possible to predict properties of various aphasic syndromes, including agrammatic Broca's aphasia.

The present paper will examine neurological evidence for the Spatialization of Form Hypothesis, with particular emphasis upon the theory of agrammatism developed in Deane (1992: 271-292). The paper will begin by reviewing evidence that spatial and grammatical thought employ homologous structures in the brain; it will then review the syntactic theory developed in Deane (1992: 95-186) and explicate its application to agrammatism. After these preliminaries the resulting theory will be compared to other accounts; evidence will be presented that the Spatialization of Form Hypothesis provides a more accurate account of agrammatism than competing theories.

2. The Spatialization of Form Hypothesis

2.1. On the neurological association between grammatical and spatial thought

In the brain, there are a variety of language areas. Most familiar, perhaps, are Broca's area in the frontal lobe of the brain, critical for speech production, and Wernicke's area in the temporal lobe, critical for auditory comprehension. However, other brain areas are also critical for language function. Among these is the inferior parietal lobe, located above Wernicke's area and behind Broca's area. There are a variety of long-established results which suggest that the inferior parietal lobe is a critical processor of both spatial and linguistic information.

The inferior parietal lobe comprises two regions: the supramarginal gyrus and the angular gyrus. These regions are specifically human parts of the brain (Geschwind 1965, Critchley [1966]: 15); ontogenetically, they are elaborations of brain regions whose primary function is the integration of information from different sensory modes in order to model bodily interaction with extrapersonal space (Critchley [1966]: 53, Le Doux et al. 1977, Le Doux 1983, Mountcastle et al. 1975). In addition, the inferior parietal lobe is a critical aspect of linguistic function; damage to it can cause a variety of aphasic syndromes, including anomia (word finding difficulties) and global aphasia (general disruption of linguistic functioning) (Hecaen 1967).

This evidence is quite suggestive: for the Spatialization of Form Hypothesis predicts the existence of a region like the inferior parietal lobe in which primarily spatial brain centers also function as critical processors of linguistic information. This is particularly true in light of the fact that the inferior parietal lobe displays different functions in the left hemisphere of the brain than it does on the right. In the left hemisphere of normal right-handed individuals, it is a linguistic center; in the right hemisphere, it is a purely spatial center. A variety of authors have argued that linguistic function is a secondary specialization of the left inferior parietal lobe, since in nonhuman primates the inferior parietal lobe is a spatial center on both sides of the brain (Young and Ratliffe 1983, Le Doux 1983, Corballis and Beale 1983).

These considerations lead naturally to the idea that our sense of grammatical structure is based upon our sense of spatial structure, with the implication that the inferior parietal lobe is the true seat of grammatical competence. Similar views have been advanced before; of particular note is Geschwind (1965), who argues that the crossmodal integrating capacities of the inferior parietal lobe must underlie such basic linguistic functions as naming, and hence more advanced aspects of linguistic structure. There appears to be considerable evidence to back up this view, as argued in Deane (1992: 257-270). Moreover, recent studies indicate that the inferior parietal lobe has a privileged role in linguistic processing. It seems to be active under practically any type of linguistic processing, unlike other brain areas (such as Broca's area) which are most active for particular linguistic tasks.
Smith and Fetz (1987), for instance, note that PET scans reveal that the inferior parietal lobe but not Broca's area is active during visual language processing (reading and writing). Lechevalier et al. (1989) report measurements of regional cerebral blood flow during linguistic processing in which the inferior parietal lobe again plays a critical role. During auditory comprehension tasks in which subjects listened to connected speech, increases in blood flow were concentrated in the inferior parietal lobe. On oral expression tests in which subjects were required to perform a


word-definition task, increases in blood flow were more general, with strong rises in Broca's area and speech motor areas, and weaker rises in Wernicke's area, as might be expected - but once again Lechevalier et al. note (1989: 7) that the greatest increases in blood flow were in the left inferior parietal zone. Samar and Berent (1986) tested syntactic priming effects during word recognition tasks, and report higher evoked response potential (ERP) measurements in temporoparietal areas, near the inferior parietal lobe, than at either inferior frontal sites (near Broca's area) or at superior temporal sites (near Wernicke's area). Such results seem to be general in the literature (cf. Larsen et al. 1977, Kohlmeyer 1975, Gur et al. 1983, Risberg 1980).

Evidence of this sort indicates that the Spatialization of Form Hypothesis is a viable hypothesis which raises very interesting possibilities. In what follows we shall explore these possibilities, demonstrating ways in which the Spatialization of Form Hypothesis generates interesting and correct predictions about grammar and its neurological instantiation in the brain.

2.2. Metaphor theory and its implications for grammar

At this point it is necessary to lay the theoretical groundwork for the discussion that will follow and to clarify the conceptual foundations of the Spatialization of Form Hypothesis as it is presented in Deane (1992: 45-54, 95-127). In particular, the Spatialization of Form Hypothesis claims that grammatical competence is based upon a spatial metaphor in which linguistic expressions are treated as if they were physical objects. It is thus necessary to consider the nature of metaphor and its implications for the analysis of grammar. It has been observed that metaphor is a normal attribute of language and thought, one which provides the capacity to extend familiar and richly structured concepts to make sense of as yet unfamiliar and poorly understood domains of experience.
Thus there is a general tendency for metaphor to proceed from the immediate and concrete world of individual experience down the scale to the distant and abstract realm of logical categories. The workings of metaphor can be observed in ordinary uses of language, as discussed e.g. in Lakoff and


Johnson (1980), where we observe that metaphor operates through the construction of mappings which equate concepts in the domain which functions as a model (the source domain) with concepts in the domain to be understood (the target domain). For example, the metaphor THEORIES ARE BUILDINGS establishes a systematic parallelism between the two domains. If theories are buildings, then propositions are building blocks, and the evidence one proposition provides for another is equivalent to the support that a lower block provides to another block placed on top of it. Thus, to elaborate a theory is to build a theory; to seek to raise doubts about a theory is to undermine it; to disprove a theory is to demolish it. One important property of this sort of mapping is that it results, not merely in the passive recognition of similarity, but in the active transfer of thought patterns. Once we understand theories metaphorically as buildings, it becomes possible to reason about theories as if they were buildings: for instance, we know that if a theory is to be undermined, we must attack its foundations; conversely, we know that a theory whose foundations are firm can take quite a lot of damage to its superstructure and yet remain sound. As a result, there are empirical consequences to analysing a particular turn of phrase as a metaphor: it yields metaphorical entailments whose conceptual consequences are often directly observable. For example, once we accept that theories are buildings, and that baroque, ornately decorated buildings are an offense to good taste, we must admit the same of baroque theories whose soaring vaults are upheld only by massive external buttresses. Another important property of metaphor may be summed up in what Lakoff (1990) terms the topological Invariance Hypothesis. 
This hypothesis is based upon the observation that metaphoric mappings seem to preserve certain basic aspects of conceptual structure: a part is mapped onto a part, a cause onto a cause, an effect onto an effect. These abstract concepts (PART/WHOLE, CAUSE/EFFECT) are obviously basic to any theory of conceptual structure; in Lakoff's theory, they are basic image schemata, abstract conceptual structures whose purpose in the first instance is to model fundamental perceptual invariants, topological constants in the sense of Talmy (1983), which organize our perception of bodily interaction with the world. Lakoff (1990) argues that metaphor respects this topological structure, establishing correspondences between domains which mirror their basic schematic structure.

It is important to observe that basic image schemas are not mere conceptual primitives, but have instead a rich internal logic. For example, the concept of part-whole relationships has intricate connections with other basic schemata: parts of the same whole must be linked; we are more likely to distinguish a part if its connections to the whole are relatively weak and flexible, and so forth. We will have occasion to examine many of these properties in detail later. For now, it is important to note that if basic schemata have fundamental structural and inferential properties, and if these are carried over from the source to the target domain, then metaphoric mappings are richly structured and highly predictive.

Let us now consider the consequences of the Spatialization of Form Hypothesis within the theory sketched above. According to this hypothesis:

i) Grammatical relations are organized in terms of the LINK schema;
ii) Constituent structure is organized in terms of the PART/WHOLE schema;
iii) Head-adjunct relations are organized in terms of the CENTER/PERIPHERY schema.

If this hypothesis is correct, quite a lot follows, for the properties of the LINK, PART/WHOLE and CENTER/PERIPHERY schemas can be determined independently of grammar. The following properties can be noted:

Properties of the LINK schema. Linkage relationships appear to have the basic structure of a functional dependency. If, for example, I characterize my hat as being linked to my head, that is because the location of my hat is dependent on the location of my head: if I move the latter, the former is likely to shift position along with it. Linkage may be asymmetrical, as in the former case, or it may be symmetrical through the presence of links in either direction, as with the arm and the hand, where one cannot move without a corresponding displacement of the other.


Properties of the PART/WHOLE schema. There are many properties which determine that a collection of parts forms an integrated whole, including perceptual closure, but physical objects have one especially salient property: their parts are mutually linked. A whole object moves as a whole, which means that the position of any one part is a function, to some degree, of the position of the other parts. For example, if one lifts a puppy by the scruff of the neck, the tail is sure to follow - and vice versa. One may reasonably define a physical whole as a configuration of parts each of which is linked, directly or indirectly, to all the other parts.

Properties of the CENTER/PERIPHERY schema. When a physical object is articulated into parts, there is often one part which has a special status as the central, or core part, compared to which other parts are peripheral. For example, the torso is the central part of the body just as the trunk is the central part of a tree and the palm region is the central part of the hand. Centrality of parts seems to be a function of linkage: the central part is that part to which other parts are linked; peripheral parts are those which link up to the center without being attached to other (peripheral) parts. Thus the fingers and thumb are peripheral to the hand, for they all attach to the central region of the hand just as branches each attach individually to the trunk. The central part of an object is typically the most massive and salient part.

As the above comments indicate, the LINK, PART/WHOLE and CENTER/PERIPHERY schemas function as an integrated system which, when applied to physical objects, characterizes certain of their fundamental properties. The entire system is grounded on the LINK schema, since the PART/WHOLE and CENTER/PERIPHERY schemas are largely defined in terms of linkage patterns.
It follows that if grammatical structure is a metaphoric projection of this system, then linguistic expressions are being processed as if they were physical objects, that constituency and head-adjunct organization is being perceived as if it were an object's internal configuration, and that the entire grammatical system is based on the processing of fundamental grammatical relationships as if they were linkages between physical objects. Thus, to give the hypothesis substance, what is needed is a precise definition of just which grammatical relationships are treated as the metaphoric equivalent of physical linkage relationships.
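The way the PART/WHOLE and CENTER/PERIPHERY schemas reduce to linkage patterns can be made concrete with a small sketch. It is only an illustration under stated assumptions: the function names (`neighbours`, `is_whole`, `central_parts`) and the encoding of a link as a `(part, part)` pair are hypothetical and do not come from Deane (1992), and the direction of dependency is ignored for simplicity.

```python
from collections import defaultdict, deque


def neighbours(links):
    """Build an undirected adjacency map from (part, part) link pairs."""
    adj = defaultdict(set)
    for a, b in links:
        adj[a].add(b)
        adj[b].add(a)
    return adj


def is_whole(parts, links):
    """PART/WHOLE: each part is linked, directly or indirectly, to all others."""
    parts = set(parts)
    if not parts:
        return False
    adj = neighbours(links)
    start = next(iter(parts))
    seen, queue = {start}, deque([start])
    while queue:
        # Follow links outward, staying inside the candidate set of parts.
        for nxt in adj[queue.popleft()] & parts:
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen == parts


def central_parts(parts, links):
    """CENTER/PERIPHERY: parts to which every other part attaches directly."""
    parts = set(parts)
    adj = neighbours(links)
    return {p for p in parts if parts - {p} <= adj[p]}


# The hand example from the text: each digit attaches to the palm region.
hand = ["palm", "thumb", "index", "middle", "ring", "little"]
links = [(digit, "palm") for digit in hand[1:]]

print(is_whole(hand, links))            # True
print(central_parts(hand, links))       # {'palm'}
print(is_whole(hand + ["hat"], links))  # False: the hat is linked to nothing
```

On this encoding, wholeness is simply connectivity in the link graph, and the center is the part adjacent to all the others, which matches the observation that both schemas are largely defined in terms of linkage.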


Before we examine this question in detail, one point must be clarified. The Spatialization of Form Hypothesis can be interpreted in at least two ways, one of which is obvious and almost certainly false; the other, less obvious but also potentially more fruitful. It would be easy to interpret the Spatialization of Form Hypothesis as claiming that all grammatical concepts must be explicitly modeled as spatial concepts. There are obvious problems with such an account, for it is far from clear that spatial and linguistic processing operate in tandem. If anything, a negative correlation could be claimed, since linguistic processing is primarily a left-brain function, whereas ordinary spatial reasoning is if anything localized more in the right brain than the left (Le Doux, Wilson and Gazzaniga 1977). However, such objections apply only if we assume that the metaphor is an explicit metaphor.

There is, however, another possibility. Given the biological importance of spatial cognition, it is plausible to postulate that the brain is hardwired to process spatial information, with specific brain regions specifically adapted to process spatial information. However, hardwiring creates processing structures, not representations; a particular region of the brain is a processor of spatial information only because it receives information about spatial position. The very same region, connected differently, could function as a processor of linguistic information - but it would process that information in spatial terms, using the same schematic structures and the same inferential routines it would have applied to spatial patterns. The result would be an implicit spatial metaphor: a pattern in which grammatical information is organized as if it were spatial information without any explicit connection to spatial concepts. Such an account is implicit in the neurological evidence cited above and presented in detail in Deane (1992: 251-295).

2.3. Linkage and the prediction of grammatical properties

Let us now briefly examine how the Spatialization of Form Hypothesis can be elaborated into a fully predictive syntactic theory. The details of the approach have been presented elsewhere (Deane 1992), so our purpose at this juncture will be to explain the theory well enough


to lay bare its logical structure and to clarify its implications. The first priority, therefore, will be to discuss how the LINK schema is projected onto grammatical structure. Spatial linkage is at root a functional dependency with respect to spatial position; projected onto grammar, it should lose its spatial character but retain its asymmetric, functional character. Primitive grammatical links must therefore be asymmetric dependency relations whose role in syntactic structure is indisputable. There appear to be four such linkage types: cooccurrence dependencies, the dependency of predicates on their arguments, referential dependencies, and dependencies of sense. Each of these has an undisputed role in the analysis of grammar.

Cooccurrence. However it may be recognized, whether through a theory of Case Marking, subcategorization frames, or the like, it is clear that grammatical structure enforces restrictions on cooccurrence. This is, in a sense, a phonological dependency, in which the phonological realization of one element is dependent on the phonological realization of another. In the discussion which follows, dependencies of this sort will be termed cooccurrence links, or C-links.

Predication. Predicating elements, such as verbs, verb phrases, and prepositions, are semantically dependent on their arguments, without which they cannot be evaluated for validity. This is at root a semantic dependency, but one of assured syntactic relevance, recognized in every syntactic theory. One of the most interesting emphases of Government-Binding theory is its separation of predicate-argument relations from cooccurrence. This distinction between Theta Marking and Case-Marking has its analog in the present theory in a distinction between C-links and predication links, or P-links. It should be noted, however, that C-links occur wherever there is a cooccurrence restriction, and thus are a more general concept than Case-Marking is in Government-Binding theory.
Similarly, the concept of P-link subsumes a wide variety of predicate-argument relations, including those between modifiers and heads. Referential Dependency. Pronouns and a variety of other grammatical elements are dependent on their antecedent for their reference. Such dependencies are recognized in a variety of syntactic theories


through the use of referential indices and binding relationships. The present theory will therefore also postulate referential links, or R-links. Sense Dependency. Identity of sense anaphors have no fixed sense of their own, but retrieve their sense from their antecedents. Similar relationships may be observed with so-called VP deletion phenomena, where an auxiliary element represents the sense, not the reference, of a preceding VP. The present theory will therefore postulate sense links, or S-links, as a final projection of the LINK schema onto grammar. C-links, P-links, R-links and S-links do not in and of themselves constitute a theory; however, when we combine them with the Spatialization of Form Hypothesis, a variety of predictions ensue. Some of the most important concern the relation between linkage, constituency, and the X-bar structure of phrases. Types of Grammatical Relations. One consequence of the theory is that all grammatical relationships must be composed from combinations of the four basic types of grammatical linkage. This leads to entirely reasonable results. Consider the following diagram:

Figure 1a

(1b) eat an apple, in the store, etc.

(1a) represents the relationship between the HEAD OF A PHRASE (the X element, such as eat and in in (1b)) and its complement (the Y element, such as the objects an apple and the store in (1b)): while the head is predicated of (P-linked to) the complement, the complement cannot appear unless the head is also present (i.e., it is C-linked to the head). This pattern follows automatically from the linguistic facts: it simply states that a complement is a subcategorized argument.
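The four link types and the head-complement configuration in (1a) can be written down as a small data structure. The sketch below is illustrative only: the names `LinkKind`, `Link`, and `is_complement_of` are hypothetical, and a link is read as running from the dependent element to the element it depends on. The pronoun entry is an invented example added to show a one-way referential link.

```python
from dataclasses import dataclass
from enum import Enum


class LinkKind(Enum):
    C = "cooccurrence"
    P = "predication"
    R = "reference"
    S = "sense"


@dataclass(frozen=True)
class Link:
    kind: LinkKind
    dependent: str  # the element that depends ...
    target: str     # ... on this element


def is_complement_of(comp, head, links):
    """Pattern (1a): head is P-linked to comp, and comp is C-linked to head."""
    return (Link(LinkKind.P, head, comp) in links
            and Link(LinkKind.C, comp, head) in links)


links = {
    Link(LinkKind.P, "eat", "an apple"),   # the verb is predicated of its object
    Link(LinkKind.C, "an apple", "eat"),   # the object cannot appear without the verb
    Link(LinkKind.R, "it", "an apple"),    # hypothetical pronoun: reference only
}

print(is_complement_of("an apple", "eat", links))  # True
print(is_complement_of("it", "eat", links))        # False
```

Keeping the link kind separate from its direction makes it straightforward to state other relation types as different pairings of kinds over the same two elements.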


Next, consider (2):

Figure 2a

(2b) this house, will leave, so happy, etc.

(2a) represents the relationship between a specifier (the X element, such as this, will and so in (2b)) and the head of its phrase (the Y element, such as house, leave, and happy in (2b)). Specifiers (determiners, auxiliaries, intensifiers) can often also function as pro-forms (specifically, as identity-of-sense anaphors). In VP deletion phenomena, for example, the auxiliary element has its own time reference but stands for a VP having the sense of its antecedent. Similar points apply to determiners like many or his, which often function as (identity of sense) pronouns, and to a variety of other specifier elements. Since there is no reason to think that these words are all that different semantically when they function as specifiers, it is reasonable to assume that specifiers are S-linked to the head of their phrase. On the other hand, specifiers function to restrict (qualify) the reference of their head: thus there is an R-link in the opposite direction. Next, consider a diagram like (3):

Figure 3a

(3b) old men, walk happily, etc.

Disregarding linear order, (3a) represents the relation between a restrictive modifier (the X element, such as old and happily) and its


head (the Y element, such as men and walk in (3b)). A modifier is predicated of its head, hence the P-link from X to Y. On the other hand, a restrictive modifier serves to limit the reference of its head, hence an R-link from Y to X. (1), (2) and (3) represent the three most common patterns available under the Spatialization of Form Hypothesis. Many other patterns can be described, and are given in detail in Deane (1992: 102-114). It should be noted, however, that the theory provides a principled typology of grammatical relations in which concepts like complement and specifier are given precise definitions grounded in underlying phonological and semantic dependencies. They are thus doubly constrained: they cannot be assigned arbitrarily, for they are extrinsically motivated. On the other hand, they constitute metaphoric projections of the LINK schema and must conform to its properties as well. Constituent Structure. This last point has wide ramifications for constituency. Since the four primitive grammatical relations are analyzed as metaphoric projections of the LINK schema, we may draw a variety of metaphoric inferences. For example, since parts of the same object must be mutually linked, we predict that mutual linkage will produce the same effect in grammar. In other words: If two nodes are mutually linked, then they unite to form a phrase. This prediction appears to be very well founded. There are asymmetric relations, links in one direction only, such as those which hold between a pronoun and its antecedent. These fail to establish coconstituency. Compare symmetric grammatical relations like (1) through (3). These involve mutual linkage and the result is a unified phrase. As illustration, consider the contrast between restrictive and nonrestrictive relative clauses: (4)

a. The man whom I saw smiled at me. (restrictive relative)
b. The man, whom I saw, smiled at me. (nonrestrictive relative)


In either case, the relative clause is predicated of the noun man. In either case, therefore, the relative clause will be P-linked to man. But there is a critical difference between (4a) and (4b). Only restrictive relatives limit the reference of the head noun; hence, as (5) illustrates, only they are mutually linked with the nouns that they modify:

Figure 5a: restrictive modifier

Figure 5b: nonrestrictive modifier

As a result, the theory predicts (correctly) that restrictive clauses form a single NP with the noun they modify, but that nonrestrictive relatives do not.

Head-Adjunct Structure. Another, rather different set of predictions concerns the concept head of a phrase. For example, consider a phrase like the old man from Paris. On the present theory, each of the adjunct elements connects directly to the head noun, yielding the following linkage structure:

Figure 6


According to the Spatialization of Form Hypothesis, heads of phrases are core parts - that is, salient and structurally central parts. In (6), for example, the head noun man is obviously central since it is the only constituent directly linked to the other parts of the phrase. Moreover, as Deane (1992: 117-127) argues, a head noun should be salient on purely semantic grounds, since it is potentially referential and names the same general concept as the entire phrase.

When this analysis of head-adjunct structure is transferred from phrases to clauses, very interesting results fall out. Independent clauses in English require a tense morpheme which has strong connections with both subject and predicate. If we analyze the relation between a tense morpheme and the verb phrase which follows, the tense morpheme acts like a specifier. That is, when it is realized as the periphrastic auxiliary do or did it functions as an identity-of-sense anaphor. At the same time, the tense morpheme serves to restrict the time reference of the VP. On the other hand, the tense morpheme necessarily cooccurs with a subject NP and is capable of carrying agreement features. These features suggest the presence of a schematic specification of NP within the tense morpheme's cognitive representation (a point also developed in Langacker's (1992: 174-177) theory of E-SITES). Deane (1992: 222-224) ultimately analyzes English tense morphemes as schematic clauses, containing within themselves a schematic representation of subject and predicate position. In the end, after considering opacity effects with reflexive pronouns, the following analysis of basic clause structure in English is proposed (op. cit.):

Figure 7


According to this analysis, the English tense morpheme contains two schematic elements: VP', which carries the tense features, and functions as a specifier of VP, and NP', which carries agreement features. The analysis assumes that NP' is characterized as an argument of V'. It (and NP) are inferred arguments of VP, but these links are not represented as they can only be inferred if the linkage relations in (7) are intact. Finally, the subject NP functions as an obligatory antecedent for NP', and is not directly related to the verb at all.

Several points should be noted at this juncture. First, we should note that the specific analysis presented above is extrinsically motivated by various properties of tense morphemes in English: their auxiliary-like relation to VP, their association with tense morphemes, and various other facts, including the opacity of subject position, which is argued (ibid.) to be a consequence of the fact that subject NPs are not arguments, and hence cannot accept reflexives and other anaphoric elements whose prototypical characteristics invoke a coargument relation between the pronoun and its antecedent. Given the specifically English motivation for much of the analysis, we would expect significant cross-linguistic variation. For example, pro-drop languages will not contain a C-link from INFL to the subject NP; similarly, languages without agreement would require an analysis in which subjects are much more like objects. These cross-linguistic ramifications, however, go beyond the scope of the present paper.

Second, we should note certain similarities between this analysis and various aspects of GB theory. Note that (7) agrees with the claim of GB theory that the agreement morpheme counts as a kind of SUBJECT element. Note further that (7) treats the tense morpheme as the structural center of the clause, which means that the present theory also analyzes INFL as the head of S.
These are real similarities, involving similar empirical claims about clause structure, although these claims fit very differently into the two theories. The two theories give very different accounts of opacity effects and they analyze the tense morpheme as a clausal head for very different reasons. Notice that in the present theory INFL is the head of S because it is central in the linkage structure (though it deviates from prototypical heads in not being a particularly salient element grammatically).
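Two inferences drawn in this section lend themselves to a compact sketch: that mutual linkage yields coconstituency, as in the restrictive versus nonrestrictive contrast in (4), and that the head is the part linked to all the other parts, as in the old man from Paris. The code is a hypothetical illustration, not the formalism of Deane (1992). A link is a `(kind, dependent, target)` tuple, the function names are invented, the PP from Paris is treated as a single part, and it is assumed to be a restrictive modifier.

```python
def mutually_linked(a, b, links):
    """True if some link runs from a to b and some link runs from b to a."""
    pairs = {(dep, target) for _kind, dep, target in links}
    return (a, b) in pairs and (b, a) in pairs


def head_of(parts, links):
    """Return the first part mutually linked to every other part, else None."""
    for candidate in parts:
        if all(mutually_linked(candidate, other, links)
               for other in parts if other != candidate):
            return candidate
    return None


# (4a): the clause is predicated of "man" and also restricts its reference.
restrictive = [("P", "whom I saw", "man"), ("R", "man", "whom I saw")]
# (4b): the clause is predicated of "man" but does not restrict its reference.
nonrestrictive = [("P", "whom I saw", "man")]

print(mutually_linked("whom I saw", "man", restrictive))     # True
print(mutually_linked("whom I saw", "man", nonrestrictive))  # False

# "the old man from Paris": each adjunct connects directly to the noun.
np_parts = ["the", "old", "man", "from Paris"]
np_links = [
    ("S", "the", "man"), ("R", "man", "the"),                 # specifier pattern
    ("P", "old", "man"), ("R", "man", "old"),                 # modifier pattern
    ("P", "from Paris", "man"), ("R", "man", "from Paris"),   # modifier pattern
]
print(head_of(np_parts, np_links))  # man
```

Nothing in `head_of` mentions category labels: the noun is picked out purely by its position in the linkage structure, which is the sense in which headship is said to follow from centrality.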


The idea that so-called functional heads are nonprototypical heads is exploited in Deane (1992: 143-151) to account for their distribution. Consider, for example, the pattern which obtains when a clause is introduced by a complementizer. In modern GB theory, such as Chomsky (1986), it is argued that complementizers are functional heads, so that the category S' is really a "complementizer phrase". This has some plausibility, since particular verbs can select for complementizers, much as they can select for complements whose heads belong to particular lexical categories. Within the present theory, however, it has further consequences. Complementizers are function words, and function words by definition are not particularly salient. Thus, if complementizers are heads, they are nonprototypical heads: and, if they are heads at all, they must be structurally centred. But there is a problem: complementizers combine with only a single argument, i.e. the clause they introduce. The result is a diagram like (8), assuming, following Deane (1992: 145), that complementizers are clausal specifiers:

Figure 8

In a diagram of this form, the complementizer can hardly be termed central. It is by nature not linked to any other element in the clause I saw him, and yet to be central it must be linked to yet another item in the sentence. But if the complementizer is linked to an element outside the clause, the clause itself must be a subordinate clause. In other words: if complementizers are heads, they can only appear in subordinate clauses. Predictions of this type are one of the benefits of the Spatialization of Form Hypothesis. They are generated by the fact that metaphor imposes on grammar the inferential patterns associated with the LINK, PART/WHOLE, and CENTER/PERIPHERY schemas.

It should be noted that the structure of the resulting theory is fundamentally different from that of GB theory or other constituency-based theories. The present theory is by nature a dependency theory, in which constituency is derived from grammatical relations. Constituency relations are still present, and need to be represented for a variety of purposes, but they can be inferred from the properties of linkage and head-adjunct organization. The present theory is also unique in its analysis of grammatical relations in terms of four types of primitive links. These two properties interact, since coconstituency depends upon the presence of mutual linkage.

2.4. The R-link theory of agrammatism

The Spatialization of Form Hypothesis as outlined above has interesting consequences when it is combined with the idea that grammatical competence is primarily seated in the inferior parietal lobe. Among these is the consequence that when aphasics display syntactic deficits, this should entail damage either to the inferior parietal lobe or to its connections with other language centers in the brain. This is far more plausible than might appear at first glance. Certain aphasic syndromes, including global aphasia, anomic aphasia and conduction aphasia, typically involve damage in the inferior parietal region itself. And while classical syndromes such as Wernicke's aphasia and Broca's aphasia need not directly involve the inferior parietal lobe, they still seem to fit the general pattern. Wernicke's aphasia typically involves damage near the so-called POT junction where the parietal, occipital, and temporal lobes connect (cf. Schwartz 1983). It is thus likely that damage to these areas, in addition to damaging the centers which process auditory input, also damages necessary connections of these regions to the inferior parietal lobe. Broca's aphasia seems typically to involve damage which extends beyond Broca's area into the inferior parietal lobe (cf. Mohr 1976; Mohr, Funkenstein, Finkelstein, Pessin, Duncan and Davis 1978); in any case, there is a good case to be made that damage in Broca's area interrupts necessary connections between the frontal lobe and the inferior parietal lobe, as is argued in Deane (1992: 271-277).

As an illustration of the value of the approach, let us consider specifically the syndrome of agrammatic Broca's aphasia. This syndrome is believed to occur after damage to Broca's area and the regions behind and around it; it is characterized by nonfluent, telegraphic speech in which grammatical function words tend to be omitted, leaving only the content words. Its characteristics may be summarized as follows, as specified in the literature review in Deane (1992: 287):

i) Reduced sentence length: utterances tend to be phrases rather than clauses, with a marked preference toward producing nouns rather than verbs (Caramazza and Berndt 1985; Marin, Saffran and Schwartz 1976; Goodglass 1976; Schwartz, Linebarger and Saffran 1985; Tyler 1983).

ii) A tendency to omit inessential grammatical elements, especially grammatical function words and inflections. Function words tend to be dropped in the order determiner < auxiliary < pronoun < preposition < connective; tense inflections are dropped more often than aspect and number inflections; noun modifiers such as adjectives tend to be omitted (Miceli, Mazzucchi, Menn and Goodglass 1983; Caramazza and Berndt 1985; Schwartz 1983).

iii) Difficulties in articulation, with speech typically being effortful and misarticulated, with disruptions in prosody. This is sometimes argued to result in the omission of unstressed or clitic elements (Goodglass 1968; Kean 1977).

Agrammatism has been the focus of intensive study, especially since it was discovered that agrammatic aphasics have syntactic difficulties in comprehension (Berndt 1983; Tyler 1983; Grodzinsky, Swinney and Zurif 1985; Schwartz, Linebarger and Saffran 1985). The comprehension difficulties in question have to do with the interpretation of active and passive sentences and other sentence types, such as relative clauses, which present multiple NPs in varying orders. Agrammatic aphasics show a strong tendency to interpret such sentences in a pragmatically plausible manner, overriding the assignment of thematic roles dictated by the syntax. For example, an agrammatic patient might interpret a sentence like John was hit by Mary as if it meant that John hit Mary. This tendency is particularly marked in reversible sentences, where either assignment of argument structure is pragmatically plausible, and in passive rather than active sentences. However, it has since been discovered that such difficulties occur in a variety of aphasic syndromes, suggesting that they reflect generalized processing difficulties rather than a specific failure in syntactic competence (Caplan and Hildebrandt 1988; Bates, Wulfeck and MacWhinney 1991). In fact, agrammatic aphasics display a general preservation of grammatical competence on offline tasks: they are capable, for example, of making fairly accurate grammaticality judgements despite their failures in production and comprehension (Schwartz, Linebarger and Saffran 1985).

Nonetheless, there are selected aspects of grammar in which agrammatic aphasics display striking deficiencies in competence. These center on the interpretation of specifiers (auxiliaries and determiners) and pronouns, especially anaphoric elements like reflexive pronouns. For example, Zurif and Caramazza (1976) studied the ability of aphasic patients to give judgements about word-relationships in sentences. From such judgements it is possible to infer constituency relations: thus it is important to note that agrammatic Broca's aphasics had serious problems with constituency. Consider the following diagrams:

Figure 9a: The man will see the woman

Figure 9b: The man will see the woman

(9a) represents the sort of pattern Zurif and Caramazza obtained with normal controls: while the constituent structures did not always match those postulated by linguists, they followed similar patterns; in particular, normal controls always grouped auxiliaries and determiners with the following noun or verb. In contrast, the agrammatic Broca's aphasics displayed a very different pattern: they recognized relations among content words, but treated specifiers as if they were loose adjuncts of the whole sentence. However, these patterns did not extend to all grammatical function words. In particular, prepositional phrases remained intact, a pattern which distinguished Broca's aphasics from other aphasic groups in the study. In addition, Zurif and Caramazza found that the Broca's aphasics appeared incapable of using determiners to disambiguate reference.¹

This loss of integration is confirmed in more recent work. For example, Blumstein, Milberg, Dworetzky, Rosen and Gershberg (1991) analyzed syntactic priming effects among normal individuals and agrammatic aphasics. For normal subjects, auxiliary verbs primed main verbs: that is, they significantly speeded reaction times for words related to the main verb on a lexical decision task. This effect failed to hold for agrammatic aphasics, who displayed only inhibitory effects usually interpreted as reflecting conscious rather than automatic processing. When these results are combined with Zurif and Caramazza's study, they strongly suggest that agrammatic aphasics are unable to integrate specifiers with the rest of their phrase.

Another pattern of syntactic deficit appears in Schwartz, Linebarger and Saffran (1985). This study examined agrammatic aphasics' ability to make grammaticality judgements for a variety of syntactic constructions. They examined the following constructions:


(1) strict subcategorization; (2) particle movement; (3) subject-auxiliary inversion; (4) interpretation of gaps; (5) tag questions; (6) the left branch condition on extraction; (7) sensitivity to basic phrase structure patterns; (8) reflexives.

Their results demonstrated that agrammatic aphasics are able to make reasonably accurate judgements of grammaticality on most syntactic patterns. There were two exceptions, however: the agrammatic aphasics had particular difficulty with reflexives and tag questions. These results hold up in other research. Blumstein, Goodglass, Statlender and Biber (1983) report serious problems for agrammatic patients when asked to interpret reflexive pronouns. These difficulties are most pronounced when they must interpret the sentence on purely syntactic grounds. Thus, while agrammatic patients perform reasonably well where they can use lexical semantic cues like gender and animacy to interpret reflexives (Wulfeck 1988; Friederici, Weissenborn and Kail 1991), they fail to distinguish the referential possibilities of sentences like (10a) and (10b):

(10) a. She washed her.
     b. She washed herself.

Blumstein et al.'s results (1983: 121) indicated significantly greater problems for Broca's (agrammatic) aphasics, with different (and less severe) patterns for Wernicke's and conduction aphasics. Berndt, Salasoo, Mitchum and Blumstein (1988) replicated Schwartz, Linebarger and Saffran's results. Their agrammatic patients also displayed generally preserved ability to make grammaticality judgements; as before, they displayed serious difficulties with tag questions. In Berndt et al.'s experiment, this pattern held for both agrammatic patients studied, with a similar pattern for a patient who displayed agrammatic patterns in visual but not auditory tasks. Reflexive patterns were not examined.

These results may be summarized very simply. In the studies cited, tag questions and reflexives were the only constructions studied which intrinsically involved anaphoric pronouns and specifier-head relations. It is thus not surprising that these are among the grammatical elements that agrammatic aphasics most often omit in production. According to Miceli, Mazzucchi, Menn and Goodglass (1983) and Caramazza and Berndt (1985: 34-35), agrammatic aphasics tend to omit function words in the following order of frequency:

(1) determiners
(2) auxiliaries
(3) pronouns
(4) prepositions
(5) connectives

In short, agrammatic aphasics appear to display consistent difficulties with anaphoric elements and with specifiers, with other function words typically being better preserved. But why? Deane (1992: 281-292) presents an analysis of classical Broca's agrammatism which analyzes these difficulties as involving problems in the processing of R-links; it also suggests that other syndromes may involve the compromising of other specific linkage types. Within the theory sketched in section 4, specifiers connect to their head by the following pattern of linkages:

Figure 11
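The role R-links play in such a pattern can be made concrete with a small sketch. Since the figure is not reproduced here, the sketch assumes that a specifier and its head are joined by one link in each direction, one of them an R-link running from the head to the specifier (the text later describes an R-link from VP to the tense element). The label OTHER for the second link, and all function and variable names, are hypothetical. Mutual linkage is taken as the condition for forming a phrase, following the earlier statement that coconstituency depends upon mutual linkage.

```python
from typing import List, NamedTuple


class Link(NamedTuple):
    source: str
    target: str
    kind: str  # "R" marks an R-link; "OTHER" is a hypothetical placeholder label


def mutually_linked(a: str, b: str, links: List[Link]) -> bool:
    """True if there is at least one link from a to b and one from b to a."""
    forward = any(l.source == a and l.target == b for l in links)
    backward = any(l.source == b and l.target == a for l in links)
    return forward and backward


def without_r_links(links: List[Link]) -> List[Link]:
    """Model an R-link processing deficit by dropping every R-link."""
    return [l for l in links if l.kind != "R"]


# Assumed specifier-head pattern: one link each way, one of them an R-link.
spec_head = [
    Link("SPEC", "HEAD", "OTHER"),  # assumption: link type not given in the text
    Link("HEAD", "SPEC", "R"),      # assumption: R-link runs from head to specifier
]

intact = mutually_linked("SPEC", "HEAD", spec_head)
impaired = mutually_linked("SPEC", "HEAD", without_r_links(spec_head))
print(intact, impaired)
```

Under these assumptions the specifier and head count as mutually linked only while the R-link is present, which is the kind of dependence on R-links that the analysis of agrammatism appeals to.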

According to the same theory, pronouns (or at least bound pronouns like reflexives) are linked to their antecedent by an R-link:

Figure 12: PRONOUN, ANTECEDENT

These patterns share a common element: the presence of an R-link. It is thus possible to hypothesize that agrammatism reflects an underlying incapacity or inefficiency in the processing of R-links (cf. op. cit. for further discussion). It should be noted, by the way, that such an analysis does not claim that agrammatic patients have difficulty processing reference, but rather that they have difficulties calculating referential dependencies, defined rather broadly.

As is pointed out (ibid.), this hypothesis entails a variety of consequences. For example: if agrammatic patients are unable to process R-links, they should behave as if the R-links were absent. The result: a loss of phrasal integration. Without R-links, the specifier-head pattern looks like (13):

Figure 13: SPEC, HEAD

But in (13), there is no mutual linkage between the specifier and the head; thus, they cannot join to form an integrated phrase. In short, the Spatialization of Form Hypothesis predicts the failure of agrammatic aphasics to integrate auxiliaries and determiners with the rest of the sentence. Note that this result is specific to the present theory, in which constituency and linkage are interrelated; it does not appear to be derivable in an account based on the (Government-Binding) concept of binding, where constituency is primary and binding is defined against constituency structures.

Another noteworthy consequence follows. As discussed above, clauses appear to be integrated via the tense morpheme, as illustrated in (7). But this integration is achieved through the use of R-links: an R-link from VP to the tense element (V') to represent its function in restricting verbal time reference and another R-link from the agreement morpheme (NP') to the subject NP itself. The result: the theory predicts correctly that agrammatic aphasics will have particular difficulty with tense inflections and hence with syntactic integration of clause-level structures.

Other properties of agrammatism also follow. For example, the omission of restrictive modifiers is not too surprising given that these also depend crucially on R-links as shown in (5a). Even the special difficulties with tag questions make sense. Consider a tag question like (14):

(14) John is coming, isn't he?

To begin with, tags must include a subject pronoun bound by the matrix subject. There is thus an R-link from he to John. Moreover, tags indicate the mood of the entire sentence: without the tag, (14) is declarative; with it, it is interrogative. Since mood is analyzed in Deane (1992: 141) as a restriction of the class of situations to which a sentence can be validly applied, tag questions involve a second R-link, this time from the matrix tense to the verbal tag. As a result, it is not surprising that tag questions prove particularly difficult for agrammatic individuals.

It should be noted, however, that the present theory does not attribute all the deficits of agrammatic Broca's aphasics to a single source. Classical Broca's aphasia (characterized by agrammatic, nonfluent speech) is probably not produced by damage to Broca's area alone. This point is demonstrated in Mohr (1976), who shows that agrammatic Broca's aphasia usually results from extensive brain damage over a wide region including Broca's area and parts of the inferior parietal lobe. Thus, Deane (1992: 284-285) analyzes Broca's aphasia as involving two deficits: (i) a general problem with speech production associated with damage to Broca's area and nearby motor areas; (ii) a specific problem with comprehension and production characterized by damaged capacity to process R-links. The former is responsible for the difficulties agrammatic Broca's aphasics have with speech production and is presumably associated with damage to Broca's area, the speech production center. The latter is responsible for agrammatics' special problems with specifiers and referential elements. It is argued (op. cit.) that R-link deficits are associated with Broca's aphasia because Broca's area lies between the inferior parietal lobe - specifically, the supramarginal gyrus - and the frontal lobe as a whole. Extensive damage in and around Broca's area is thus likely to disconnect (or partially disconnect) the supramarginal gyrus from frontal association areas. The argument relies on evidence that the frontal association areas handle information about the relevance of stimuli in the immediate situational context (cf. Sommerhoff 1974: 312-317). Such information would presumably be needed to process reference. The model thus predicts that R-link processing deficits could ensue from damage to Broca's area, the supramarginal gyrus, or intervening areas as long as the result was a disruption of efferent fibers connecting the frontal to the inferior parietal lobes.

The expressive and syntactic deficits outlined above should interact. For example, the model being contemplated incorporates the traditional hypothesis (cf. Goodglass 1968) that difficulties in speech production make it more difficult to produce low salience items such as unstressed function words. However, it postulates that determiners, auxiliaries, and pronouns are selectively vulnerable to omission because the disruption of R-links eliminates much of the semantic load which triggers their use in normal subjects. For instance, there is little point in producing the definite article the if the aphasic cannot distinguish the resulting NP from an indefinite noun phrase. Similarly, production problems might exacerbate difficulties with clause-level syntax by reducing the resources available to integrate tense morphemes with subject and predicate elements and thus yield worse performance in speech production than in comprehension.

Additional Evidence. In addition to the patterns analyzed in Deane (1992: 284-292), agrammatic aphasics display certain other deficits of syntactic comprehension.
While other types of aphasics display similar deficits, I now turn to considering how such patterns could be derived in the present theory, especially since they have formed the basis for alternate theories of agrammatism such as that presented by Yosef Grodzinsky (1984, 1986, 1990). To be precise, agrammatic aphasics tend to display the following patterns: good comprehension of active sentences and adjectival passives but chance performance with syntactic passives; above chance performance on subject relative clauses but chance performance on object relative clauses; above chance performance on subject clefts but chance performance on object clefts.

Grodzinsky interprets these results as evidence for a Government-Binding theory based interpretation of agrammatic comprehension according to which agrammatic patients cannot interpret traces. In particular, he notes that the contrasts listed above all involve a failure with sentences involving object traces. His theory requires postulation of a nonsyntactic heuristic to avoid falsely predicting poor results on subject relatives and subject clefts, where the trace would be in subject position.

It is worth noting that the present theory can account for these difficulties, although the connections are less direct in some cases than in others. Passive sentences depend specifically on the auxiliary verb (passive be) to signal the correct assignment of thematic relations to the subject NP, but auxiliaries involve R-links. Disruption of R-links should therefore disrupt processing of passive be and so is likely to cause difficulties assigning the correct interpretation to the subject of passive sentences. If the auxiliary is ignored, the subject will be interpreted as if the sentence were active, yielding an account of the unacceptability of passives that relies, like Grodzinsky's, on a conflict between the subject and the by-phrase, both of which require an agentive interpretation.

In fact, the theory presented in Deane (1992) seems to predict an asymmetry between lexical and syntactic passives. The theory predicts, to begin with, a different structure for syntactic and lexical passives. The form of be which occurs in syntactic passives is an auxiliary, but the form of be which occurs in lexical passives is a main verb. The result is a different linkage structure:

Figure 15a: be (Aux), past prt.

Figure 15b: be (V), A (deverbal)

Only in the syntactic passive (15a) does the structure involve an R-link; thus, only syntactic passives present difficulties.

Note further that relative clauses and cleft sentences involve a clause functioning as a restrictive modifier, another construction involving R-links. Disruption of the R-link connecting a relative clause to the NP that it modifies will disrupt the hierarchical structure of the sentence, yielding a structure in which the relative clause is an independent proposition. And the relative clause, being a clause, is itself subject to disruption. Failure to process R-links will disassociate the tense morpheme and any auxiliaries from the verb and eliminate its semantic relation to the subject, breaking the clause into smaller, phrasal units. Sentences involving relative clauses ought therefore to present major difficulties through loss of critical structural information.

Here too the theory of Deane (1992: 178-179) can account for the contrasts. This theory adopts an account of extraction processes in which there is a crucial difference between subject and object relatives. Subject relatives can be assigned an interpretation by independently motivated processes, because the relations among head noun, relative pronoun, and the verb phrase are essentially local, allowing the relations to be computed on a word-to-word basis, without presupposing a hierarchical relation among them. On the other hand, object relatives cannot be interpreted without invoking a search process which scans the VP for the gap (more precisely, a verb with an unsatisfied argument slot) embedded within it. Such a search presupposes the hierarchical structure of the clause; if this is disrupted, as the R-link theory of agrammatism predicts, then the search has nothing to operate on. Thus object relatives and object clefts will not be correctly interpreted despite the potential for preserved comprehension of subject relatives and subject clefts.
This is assuming, however, that the agrammatics have retained enough awareness of grammatical morphemes to be able to process them despite the loss of R-links; if they are unable to process tense elements or relative pronouns, the theory would predict complete failure on these complex structures.

It should be noted that the contrast between subject and object clefts and relatives is a basic property of the theory's account of grammatical structure; it is not specific to agrammatism. Moreover, in the account sketched above, the loss of R-links is not directly responsible for the difficulties: rather, it sets up the difficulty by disrupting the processing of clause-level structures. The theory thus allows for a variety of possibilities. If, for example, a patient had a fairly mild disturbance of R-link processing, combined with a strong production deficit, it is quite possible that they would perform quite well on passives, clefts, and relative clauses despite severe agrammatic symptoms. Such a case could be interpreted as involving an interference effect, with mild R-link difficulties in comprehension being exacerbated by the processing strain associated with difficulties in production. For a case which might fit this pattern, see Frazier and Friederici (1991). This hypothesis predicts, incidentally, the possibility of modality-specific productive agrammatism, a possibility which appears to be borne out by a patient discussed in Berndt et al. (1988), who patterns with agrammatic patients on tests of comprehension but who displays productive agrammatism only on visual use of language.

3. Comparison with other theories

Thus far the argument has focused on plausibility: on demonstrating that the Spatialization of Form Hypothesis yields interesting and useful predictions about grammatical structure. It should be noted, however, that agrammatism is a phenomenon which has attracted considerable attention, with a wide variety of theories being advanced. It is thus necessary to compare the present theory with its competitors.

Certain general points should be noted. The Spatialization of Form Hypothesis as developed above entails what neuropathologists term a connectionist account of brain function; that is, it postulates functional centers located in different brain regions and argues that aphasia results from damage to particular centers or to the connections between them.² It is, simultaneously, a linguistic hypothesis. It assumes, in other words, that deficit patterns in aphasia are not random, but reflect impairments in the underlying linguistic system. As a result, impairments ought to fall into natural classes defined by the linguistic theory. The theory can therefore be evaluated on two grounds: on the accuracy of its connectionist implications or on the validity of its linguistic predictions.

3.1. On connections between symptoms and lesion sites

One recurrent criticism of connectionist theories of neurolinguistic function is that their generalizations are spurious, representing statistical rather than causal generalizations, without any consistent correlation between symptoms and lesion sites. Such criticisms can be found in reaction to the initial formulations of connectionism (Marie 1906; Head 1923) and we find similar objections being raised in the modern neurolinguistic literature (Caplan and Hildebrandt 1988; Miceli, Silveri, Romani and Caramazza 1989; Badecker and Caramazza 1985; Martin, Wetzel, Blossom-Stambach and Feher 1989). Such criticisms obviously raise serious questions for the present theory, for if there is no unified syndrome to be studied, theories of agrammatism are pointless.

In what follows, the discussion will focus on two recent studies: Miceli et al. (1989) and Caplan and Hildebrandt (1988). These studies provide a wealth of detail about individual patients and their deficits, making it possible to evaluate the effectiveness of their arguments against the present hypothesis. Close examination reveals that their data do not provide evidence against the R-link theory of agrammatism; if anything, they tend instead to confirm it.

Miceli, Silveri, Romani and Caramazza 1989. Miceli et al. collected extensive data on 20 "so-called agrammatic patients".
Their experiment focused on grammatical function words, inflectional morphemes, and agreement patterns in Italian. Other variables were also documented, such as the mean length of their utterances, and the proportion of errors that were omissions (normally considered agrammatic errors associated with Broca's aphasia) or substitutions (normally considered paragrammatic errors associated with Wernicke's aphasia). They then analyzed the results for consistency.

The data were highly variable. Many patients displayed opposite patterns of errors. Some had very high percentages of substitution errors, others very low; the same held for errors of omission. Nor was there any particular relation between the patterns: in some patients substitution and omission errors were high or low together; in others, the two types of errors varied independently. Nor were there consistent error patterns when different types of function words were compared. One patient had a high error rate for determiners and a low one for auxiliaries, but with another patient, the reverse pattern held. While there was a general correlation between errors with function words and errors with inflections, there were many exceptions. The most systematic patterns occurred with agreement, where there were two clearly distinguished groups of agrammatic patients, but with a large intermediate population. One set erred most often by producing the citation form in the paradigm instead of the appropriate inflected form; the other group, by contrast, tended to use an inappropriate inflected form.

On the basis of this analysis, Miceli et al. argue that agrammatism is not a useful category for theory construction. They conclude:

There are not only metatheoretical and methodological reasons for doing away with patient categories such as agrammatism, but also empirical demonstration of the futility of holding on to such categories. Here we have provided one such demonstration.

How are we to evaluate this analysis? One response is provided in Grodzinsky (1991), who argues that the results prove that "either the group was not homogeneous or the performances that were measured were irrelevant".
He then argues that the variability in results is irrelevant because agrammatism as a syndrome has been characterized qualitatively, without reference to exact numerical patterns. Thus, he claims, the quantitative variation in Miceli et al.'s study is simply irrelevant. However, this response misses the point, which is that a large proportion of the patients displayed error patterns so different as to be qualitatively distinct.

Within the present theory, however, it is not important whether agrammatism as such is homogeneous: what matters is that there be a clear correlation between deficit patterns and patterns of brain damage. In fact, Deane (1992: 278-292) argues that there are at least two types of agrammatism: one characterized by a combination of frontal and parietal damage, to which the R-link hypothesis applies, and another, characterized by parietal but not frontal damage, where agrammatism is a consequence of on-line processing difficulties without specific grammatical deficits. Lack of homogeneity should be no surprise if it correlates with different patterns of brain damage. Let us therefore examine Miceli et al.'s data in this light.

Close examination of the information Miceli et al. provide in table 2 (p. 455) and appendix I (pp. 475-481) demonstrates great variation in the site of brain damage. Of twenty patients, four have lesions restricted to the left frontal lobes, and six have lesions of the sort described in Mohr (1976), in which the damage extends across left frontal and parietal regions, in the part of the brain supplied by the left middle cerebral artery. Three have damage in left parietal and temporal regions. Of the remaining seven patients, one had transient left parietal lobe damage, one had bilateral damage to the temporal lobe, one had damage to frontal and parietal areas in the right hemisphere of the brain, one had no sign of brain damage on a CT scan, and for three patients no information about brain damage was available. In a theory like the present one, such results practically guarantee a lack of homogeneity. While such varied patterns of brain damage might yield superficially similar results, these must on a connectionist account be the consequence of distinct underlying deficits.
In particular, it should be noted that the R-link theory should apply specifically to the six patients with lesions in frontal and parietal regions, in the area supplied by the left middle cerebral artery, since damage so extensive would certainly have damaged efferent fibers connecting the frontal lobe with the inferior parietal lobe. The other two groups should have agrammatic symptoms for entirely different reasons. Within the model proposed in Deane (1992: 271-278), purely frontal damage would be expected primarily to produce difficulties in speech production. The expected consequence of parietal damage would be a reduction in the available working space for syntactic processing.

Let us therefore examine the errors of the six patients to whom the R-link theory ought most directly to apply. Consider Table 1, which presents the cumulative percentage of errors for the six aphasics with frontoparietal damage. The table presents the following categories: auxiliaries (AUX), prepositions (P), clitics (CL), subject-verb agreement (SV AGR), definite articles (DA), indefinite articles (IA), noun-auxiliary agreement (NA AGR) and determiner-noun agreement (DN AGR):

Table 1: Grammatical errors and aphasics

AUX Ρ CL SV AGR DA IA NA AGR DN AGR

Aphasie Patient

G.D.C.

A.A.

F.S.

BP.

F.G.

F.B.

100% 83.3% 44.4% 30.0% 33.9% 55.6% 22.2% 5.0%

100% 66.7%

80.0% 54.7% 50.0% 47.7% 35.7% 25.0% 20.0% 14.5%

62.5% 53.3% 57.2% 52.2% 27.5%

50.0% 16.6% 14.3% 4.4% 7.9% 7.1% 3.1% 1.4%

30.8% 15.3% 18.2% 3.5% 3.5%

-

43.5% 25.0% -

-

8.0%

-

3.8% 9.3%

-

11.1% -
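The two categories that Table 1 reports for every patient, AUX and P, allow the ordering of patients by severity to be checked directly. The following is a minimal sketch, assuming the rows of Table 1 are read in the order in which the patients are listed; the function name is illustrative only.

```python
# Error rates (%) from Table 1 for the two categories reported for all six
# patients, in the table's top-to-bottom patient order.
patients = ["G.D.C.", "A.A.", "F.S.", "B.P.", "F.G.", "F.B."]
aux = [100.0, 100.0, 80.0, 62.5, 50.0, 30.8]
prep = [83.3, 66.7, 54.7, 53.3, 16.6, 15.3]


def is_non_increasing(values):
    """Return True if no value is larger than the value before it."""
    return all(later <= earlier for earlier, later in zip(values, values[1:]))


# Severity ordering: error rates should not rise from one row to the next.
aux_ordered = is_non_increasing(aux)
prep_ordered = is_non_increasing(prep)

# Rank order within each patient: auxiliaries at least as error-prone
# as prepositions.
aux_above_prep = all(a >= p for a, p in zip(aux, prep))

print(aux_ordered, prep_ordered, aux_above_prep)
```

On these figures all three checks hold. The remaining columns contain empty cells and are left out of the sketch.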

This is not a random pattern.³ To begin with, rank order of errors is generally consistent. With the freestanding grammatical morphemes, auxiliary errors are most frequent, with preposition and clitic errors next, followed by errors with definite and then indefinite articles. The only exception to the pattern involves the high proportion of indefinite article errors in G.D.C.'s speech. With regard to the morphological errors, subject-verb agreement is the most error-prone; determiner-noun agreement, the least error-prone. Going from top to bottom, we find that the number of errors drops. This is fully reflected in the most common errors (auxiliary verbs and prepositions); other errors pattern less consistently but reflect a clear difference in overall impairment between F.B. and F.G., with low error rates, and the other patients, whose error rates are much higher. For the most part, these patterns make sense in terms of the R-link theory of agrammatism. First, the high number of problems with auxiliaries arguably reflects the double load involved in processing R-links both to the subject and to a following verb. The relatively high proportion of errors on S-V agreement and clitic pronouns would also follow from difficulties in processing R-links. Such results are certainly reasonable given the huge range of uncontrolled variables likely to affect the frequency with which word types are produced, although certain patterns, such as the high frequency with which prepositions are omitted and the relatively low frequency of determiners, certainly require explanation. This pattern might be taken as an argument against the R-link theory of agrammatism, since the theory does not predict any particular problem with prepositions and does predict serious problems with determiners. However, there are two factors needing to be considered. One is a complicating factor with regard to the structure of Italian: prepositions frequently contract with a following article. Miceli et al. do not indicate how they scored an omission of a form like della consisting of a preposition and an attached article. If they scored them as omissions of a preposition, that would artificially elevate the number of preposition omissions in the data.
Also, even if they were scored as omissions of preposition and article, the word could have been omitted purely because of difficulties with the article, which would also elevate the number of prepositions being omitted. That this is plausible can be confirmed by examining earlier studies of Italian, such as Miceli et al. 1983. This study found that combinations of preposition plus determiner were omitted more often even than auxiliaries, consistent with the pattern in other languages in which determiners were omitted more often than auxiliaries. Berndt and Caramazza (1985: 36) cite an Italian patient of Miceli's, F.E., whose omissions of preposition plus determiner combinations (49%) lie between his omissions of determiners alone (72%) and prepositions alone (26%). There is also evidence in the literature that prepositions are deleted when they are subcategorized or at least governed by the verb (Friederici 1982, 1985; Grodzinsky 1984, 1988). This raises the question of whether the deletion of prepositions might in some way be related to difficulties with the verb, especially considering the close relation between auxiliary and prepositional errors in Table 1. Examining appendix 2 in Miceli et al., where they provide samples of patients' speech, yields very suggestive results: there appears to be a strong tendency for deletions of governed prepositions to occur specifically in sentences where an error is made with the verb, and for governed prepositions to be retained in sentences where the main verb is correctly formed. These considerations can be illustrated from the sample of patient G.D.C.'s speech (Miceli et al.: 484). G.D.C. retains the preposition a in the sentence Allora andavo a Roma "Then I was going to Rome", where the verb form andavo is also retained and correctly inflected. He deletes it in Lavora [a] Milano, meaning contextually, "I work in Milan", where the correct verb form would be lavoro. In fact, he is so likely to drop prepositions that the only preposition which occurs in the sample of his speech is the example given above in which the governed preposition a follows a correctly inflected verb. While such observations are at best suggestive, when combined with the other considerations adduced above, it seems likely that the deficit patterns for patients with frontoparietal lesions conform to the R-link theory of agrammatism. Other data that Miceli et al. provide are also suggestive. Let us compare the three major groups of patients in Miceli et al.'s study, comprising thirteen of the twenty agrammatic patients.
The first group consists in patients whose lesions affect both frontal and parietal areas - that is, in the general territory of the left middle cerebral artery (LCMA), which Möhr (1976) argues is responsible for normal patterns of Broca's aphasia. The second group consists in patients with frontal lesions (F). Finally, the third group consists of patients with parietal and temporal but not frontal damage (PT).


Consider their average performance on the following measures: pronunciation difficulties (dysarthria, abbreviated D in Table 2, and coded as a fraction of the form number dysarthric/number in the group), reduced speech rates (labeled SR in Table 2), presence of fragmented speech without recoverable grammatical structure (abbreviated FS in Table 2) and the mean length of utterances produced (labeled MLU in Table 2):

Table 2: Pattern disturbance ratios in aphasics

Group   LCMA    F       PT
D       5/6     3/4     1/3
SR      44.33   46.25   41.33
FS      22.5%   8.75%   8.33%
MLU     4.19    5.45    3.72

Already on these rather general measures there are noticeable differences among these anatomically defined groups. The six patients with combined frontal and parietal damage display the classical profile for agrammatic Broca's aphasics: nonfluent with reduced and fragmented utterances. This is exactly what the model presented in Deane (1992: 271-292) would predict. Frontal damage should yield expressive difficulties, while disruption of frontoparietal connections should disrupt R-links and hence the integrity of grammatical structure. While the frontal and parietal/temporal groups are rather small, different trends emerge. The purely frontal patients are dysarthric and have longer utterances with a much smaller incidence of fragmented speech. This is consistent with the hypothesis that they suffer from serious speech production difficulties with relatively little damage to syntactic competence. It should be noted that Caplan and Hildebrandt (1988), to be discussed in the next section, report two patients with strictly frontal lesions; these patients showed relatively minor impairments of syntactic comprehension as would be predicted under the proposed model.


On the other hand, the patients with parietotemporal lesions tend not to be dysarthric, yet speak at rates comparable to the patients with frontoparietal lesions. Their speech rates may be even slower than the average indicates, since two of the three had extremely slow speech (20 and 23 words per minute, respectively). They produce a small proportion of fragmented speech and yet have the shortest utterances. This pattern makes sense on the hypothesis that their major problem is a reduction in syntactic processing capacity: low capacity would reduce the speed of processing and the length of utterances but would not disrupt the integrity of those phrases that were produced. This is also the conclusion reached in the two case studies of which I am aware which examined agrammatic patients with parietal or parietotemporal lesions (Kolk, van Grunsven and Keyser 1985; Nespoulous et al. 1988); thus, the available evidence is consistent with the model. While the restricted number of patients makes it unwise to extrapolate too far from such data, it is thus far entirely consistent with the theory. In short, as this discussion indicates, Miceli et al.'s arguments against agrammatism do not apply to the present theory, and in fact their data, when examined closely, are largely consistent with it.

Caplan and Hildebrandt 1988.

We may now focus our attention on Caplan and Hildebrandt (1988). This work argues explicitly that deficits in syntactic comprehension cannot be correlated with the location of brain damage. They base these arguments on a series of group studies in which they gave an unselected group of aphasics a battery of tests of syntactic comprehension. They also provide in-depth case studies of the syntactic comprehension of several aphasic individuals. These studies confirm the existence of several very different patterns of specific linguistic deficit, evidence which they take to provide further confirmation of their basic conclusions.
Caplan and Hildebrandt's group studies examined performance on nine sentence types: active, passive, cleft-subject, cleft-object, dative, dative passive, conjoined, subject-object relative, and object-subject relative. Comprehension was tested by ascertaining "who did what to whom", that is, by testing the accuracy of their assignment of thematic roles. They analyzed the resulting patterns statistically, demonstrating the following points: first, that syntactic comprehension disorders were extremely common among aphasic patients (they estimated a


rate in excess of 90%); second, that most of the variation among patients could be handled purely as an effect of the severity of aphasia. In other words, it was possible to show that certain factors made sentences harder to comprehend, including sentences with two or more argument slots to be filled and sentences in which NPs are not arranged according to the usual order of thematic roles. They also showed that there were identifiable subgroups whose performance was impaired more on some sentence types than on others. However, when they attempted to correlate this data with aphasic syndromes and the site of brain damage, they found great heterogeneity. The patient clusters were mixed in their characteristics. They found no correlation between patient clusters and either aphasic syndromes or lesion sites. In fact, most of the variation in performance (60-65%) could be accounted for by a general measure of overall impairment. Similar degrees of impairment in syntactic comprehension could be found among patients whose brain damage was restricted to the frontal, parietal, or temporal lobes. While they are cautious about assessing the implications, Caplan and Hildebrandt argue that such results imply a model in which syntactic comprehension is primarily impaired by reductions in the overall "syntactic workspace" available for sentence comprehension, and that reductions in the available workspace can be induced by damage anywhere in the language regions of the brain. Moreover, they suggest that the great variability observed could result from variations among patients in the brain sites dedicated to syntactic comprehension. Such conclusions might be taken to raise doubts about the connectionist approach under consideration. It should be noted, however, that these experiments examined how effectively patients could infer thematic roles. They did not test directly for comprehension of function words, pronouns, or other functional elements.
In fact, Caplan and Hildebrandt's analysis suggests that the variation among patients could be a consequence of such factors as the extent to which they use heuristic strategies to assign thematic roles and their ability to deal with the processing strain of multiverb structures. It is not particularly surprising that a variety of different types of patients should employ similar heuristic strategies to overcome difficulties in comprehension, even if the difficulties derived from very different causes. Nor would it be surprising if a variety of patients had particular trouble with relatively difficult structures, such as those with multiple verbs. Even more crucially, we should note that Caplan and Hildebrandt do not seek correlations between brain lesion sites and difficulties with function words or pronouns. This is so despite their having performed a sentence-picture matching test which examined errors of this type, and which isolated at least one group which had difficulties with both types. It would have been very interesting to see whether agrammatic Broca's aphasics with frontal and parietal damage were members of this group, as the R-link theory of agrammatism would predict. Thus, if we are to obtain information about lesion sites, we must examine Caplan and Hildebrandt's case studies, which yield very suggestive results. There are nine case studies. The patients displayed many different deficits of syntactic comprehension. After analyzing the results on an extensive battery of tests, Caplan and Hildebrandt concluded that their patients displayed specific impairments of syntactic comprehension due to difficulties with the following kinds of syntactic complexity (p. 277):

- complexity due to the demands of building hierarchical structure, specifically complex NPs (patients A.B. and C.V.),
- complexity due to the demands of holding an NP in the syntactic structure without a thematic role (patient K.G.),
- complexity due to the transmission of a thematic role from an empty [+anaphoric, -pronominal] NP to its antecedent (patient J.V.),
- complexity due to searching for an antecedent (patients A.B. and C.V.),
- complexity due to searching for an antecedent over a sentence boundary (patient G.G.),
- postsyntactic complexity due to the need to hold propositions in memory and assign thematic roles to real-world referents (patient B.O.).


From the perspective of the present theory, these are striking results; two patients - A.B. and C.V. - display the precise pattern of results that would be expected if they were unable to process R-links. Their difficulties with complex NPs occurred with phrases like Bill's friend or a friend of Bill's, in which the adjunct functions as a specifier or a restrictive modifier. On the present theory such phrases involve an R-link from the head noun to the adjunct, and so it is no surprise that the same patients also have difficulties finding the antecedent of a pronoun. Nor is it surprising that both patients, A.B. particularly, evinced difficulties in the interpretation of complex phrase structure, as this would follow from the loss of R-links. If the present theory is correct, A.B. and C.V. should be suffering from lesions involving frontal and/or parietal damage likely to sever connections between the frontal lobe and the supramarginal gyrus. The other patients should not. Consider Table 3. This chart shows the usual correlation between brain lesion and syndrome diagnosis: the patients with anterior or frontal lobe damage display various nonfluent aphasias, including agrammatic symptoms and dysarthria; conversely, J.V. shows the typical pattern for Wernicke's aphasia, with posterior damage, correlating with paragrammatism and paraphasia. But it also confirms the predictions made above. To begin, patients C.V. and A.B. have brain damage in the correct brain regions. C.V. is classified as an agrammatic aphasic and has extensive frontal, parietal, and temporal lobe damage: exactly the pattern Möhr (1976) argues to underlie the classical pattern of Broca's aphasia. A.B., on the other hand, has parietal lobe damage. He is also a nonfluent aphasic with apraxia of speech (practically the same thing as Broca's aphasia).
The literature on apraxia of speech generally concludes either that it results from frontal lobe damage or that it is a consequence of damage to the supramarginal gyrus which disrupts its ability to program speech production in the frontal lobe (Buckingham 1991). Thus it seems safe to conclude that A.B. either has frontal lobe damage in addition to the primary parietal damage, or that the parietal damage specifically involves the supramarginal gyrus. Either pattern is consistent with the hypothesis that A.B. suffers from damage to his ability to process R-links.


Table 3: Caplan and Hildebrandt's case studies

Patient  Lesion                                        Diagnosed syndrome
G.G.     lesion in the region of the left anterior     mildly nonfluent
         thalamus and possibly the left putamen at
         the border of the external capsule (CT scan)
K.G.     left frontal (CT scan)                        mild agrammatism, apraxia of
                                                       speech, marked anomia
B.O.     left internal capsule, left middle cerebral   dysarthric, dyspraxic, anomic
         artery territory (CT scan)
S.P.     ? (subarachnoid hemorrhage)                   expressive dysphasia, mildly
                                                       agrammatic
C.V.     parietal and frontal (CT scan), focal         nonfluent and anomic, with
         slowing of temporal lobe on both sides (EEG)  agrammatic, paragrammatic, and
                                                       paraphasic errors
A.B.     right parietal                                nonfluent aphasia with apraxia
                                                       of speech
J.V.     left posterior (PET scan)                     paragrammatism, paraphasia
G.S.     ? (right hemisphere cardiovascular accident)
R.L.     ?                                             reproduction conduction aphasia

There is a third patient who arguably should be grouped with A.B. and C.V.: S.P., who is clinically classified as agrammatic with expressive dysphasia. S.P. was studied in earlier work (Caplan and Futter 1986),


and so was not tested on the same battery of tests as A.B. and C.V. Caplan and Futter examined S.P.'s interpretation of the usual range of basic patterns, e.g. active and passive sentences with a single verb, cleft sentences, sentences with two conjoined verbs, and sentences containing a relative clause. S.P. displayed such serious difficulties in interpretation that Caplan and Futter argue she is incapable of assigning hierarchical structures to sentences - in particular, that she fails to appreciate head-adjunct relations (Caplan and Hildebrandt 1986). The combination of agrammatic, nonfluent speech output with failure of hierarchical sentence structure is consistent with the hypothesis that her disorder includes damage to the capacity to process R-links. Moreover, her brain damage followed a subarachnoid hemorrhage requiring ligature of the left internal carotid artery - a massive hemorrhage at the base of an arterial system that supplies practically the whole left cerebral hemisphere. The resulting brain damage is certainly consistent with the hypothesis of damage to the region extending from Broca's area to the supramarginal gyrus. Contrast the other patients: two are anterior aphasics (G.G., K.G.) with frontal lobe but no parietal lobe damage. While K.G. is classified as mildly agrammatic, his symptoms are much less severe than would result from a disruption of R-linking - a result entirely consistent with the absence of parietal damage. Of the remaining patients, only one (B.O.) has damage in the correct general region - although without evidence about precise localization of brain damage. She demonstrates preserved comprehension of syntactic structure and Caplan and Hildebrandt argue that her comprehension deficits result from a reduction in short term memory.
Since she displayed no agrammatic symptoms and it is not clear that her specific lesions affected the connections between the frontal lobe and the supramarginal gyrus, her pattern of syntactic comprehension is not relevant to the R-link model of agrammatism. As a result, Caplan and Hildebrandt's case studies present an excellent correlation between expressive agrammatism, frontoparietal lesions, and grammatical deficits consistent with disruption of R-links. Here, as with Miceli et al.'s work, the data are not conclusive in themselves. Nonetheless, they are at the least consistent with and arguably support the theory presented in Deane (1992). As it predicts,


the typical symptoms of agrammatic Broca's aphasia can be accounted for as a disconnection syndrome in which damage around Broca's area and the supramarginal gyrus disrupts the processing of R-links.

3.2. A formal linguistic approach: Grodzinsky's theory of agrammatism

Various theories of agrammatism have been proposed (see Kean 1980 for discussion); however, only one other attempts a detailed grammatical characterization of agrammatism comparable to that argued for in the present paper. This is the theory developed by Grodzinsky (1984, 1986, 1990). Grodzinsky argues for the following characterization of agrammatism within Government-Binding theory:

(16) a. At S-structure the representation underlying agrammatic speech production differs from the representation underlying normal speech production in the following respects:
        (i) Nonlexical terminals are deleted.
        (ii) Governed prepositions are deleted.
     b. The S-structure representation underlying agrammatic comprehension lacks traces. In interpretation, a Default Principle is invoked that is defined as follows:
        If a lexical NP has no theta-role (that is, if it is in a nonthematic position), assign it the theta-role that is canonically associated with the position it occupies, unless this assignment is blocked. In this case assign it a role from the next lower level in the Thematic Hierarchy (agent, goal/source/instrument/experiencer/theme).

This theory accounts for difficulties in speech production with determiners, auxiliaries, and prepositions on the grounds that these are nonlexical elements - i.e., elements not assigned phonological representations at S-structure. If they are deleted at this level, of course they tend to be omitted in speech production. On the other hand, it accounts for


difficulties in agrammatic comprehension by postulating a separate deficit: lack of information about traces, and hence problems with passives, relative clauses, and presumably a variety of other construction types. In what follows I shall argue that Grodzinsky's account is less satisfactory than the R-link theory of agrammatism both on conceptual and on empirical grounds. One general conceptual point is that Grodzinsky's theory makes no predictions about the relation of agrammatism to the brain. That is, it describes the symptoms of agrammatism without accounting for their origins or explaining why particular patterns of brain damage produce agrammatism where others do not. In fact, Grodzinsky goes so far as to state that "It is hardly imaginable that these physical variables will play an explanatory role in an abstract theory of language structure. From the point of view of this theory, the anatomical facts are completely arbitrary". The argument, apparently, is that brain damage is typically produced by ruptured blood vessels, whose patterns of distribution have no logical connection to how the brain processes grammatical structure. However, the resulting brain damage most certainly does. Any theory which postulates that specific grammatical deficits follow from brain damage requires a causal theory which explains how the brain damage leads to the deficit. Less than that is not a fully explanatory theory, though it may be adequate in a more limited domain. In other words, Grodzinsky's theory is only concerned with providing a theory of the deficit patterns in agrammatic aphasia and using these to motivate his choice of grammatical theory; no broader account is attempted. By contrast, the R-link theory of agrammatism presupposes a model of how linguistic knowledge is instantiated in the brain. The R-link model is thus of greater theoretical interest as it entails far more about the relation of grammar to mind and brain.
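The Default Principle in (16b) is stated as a procedure, and its logic can be sketched in a few lines. What follows is an illustration under stated assumptions, not Grodzinsky's own formalization: the grouping of the Thematic Hierarchy into two levels follows the punctuation of (16b) as printed, the table of canonical roles by position is hypothetical, and the conditions under which an assignment is "blocked" are left to a caller-supplied test because (16b) does not spell them out.

```python
# Thematic Hierarchy as printed in (16b): "agent" on the top level and the
# slash-separated roles on one lower level. This grouping is an assumption.
THEMATIC_HIERARCHY = [
    ["agent"],
    ["goal", "source", "instrument", "experiencer", "theme"],
]

# Hypothetical canonical roles by position; not taken from the source.
CANONICAL_ROLE = {"subject": "agent", "object": "theme"}


def default_roles(position, theta_role=None, is_blocked=lambda role: False):
    """Return the candidate theta-roles for a lexical NP.

    An NP that already has a theta-role keeps it. Otherwise the canonical
    role for its position is assigned unless is_blocked rejects it, in
    which case the roles of the next lower hierarchy level are returned.
    """
    if theta_role is not None:
        return [theta_role]
    canonical = CANONICAL_ROLE[position]
    if not is_blocked(canonical):
        return [canonical]
    for index, level in enumerate(THEMATIC_HIERARCHY):
        if canonical in level and index + 1 < len(THEMATIC_HIERARCHY):
            return list(THEMATIC_HIERARCHY[index + 1])
    return []  # no lower level is available


# An NP in subject position that has lost its theta-role defaults to agent.
print(default_roles("subject"))
# If that assignment is blocked, the next lower level supplies the candidates.
print(default_roles("subject", is_blocked=lambda role: role == "agent"))
```

The function returns a list because (16b) says only that a role is taken "from the next lower level" without saying which one.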
Another, related, conceptual point should be raised. Grodzinsky (1990: 8-11) criticizes connectionist theories such as those offered by Geschwind (1965, 1979, 1983) on the grounds that they are linguistically inadequate. He characterizes connectionist theories as ignoring fine-grained patterns of linguistic deficit which require specific reference to grammatical rules or elements. He claims, in fact, that "knowledge (linguistic or otherwise) plays no role in the explanation".


Finally, he argues that "this approach assumes a trivial, one-to-one relation to hold between the functional characterization of behavior and the neural tissue supporting the mechanisms that underlie behavior". In other words, he argues, it does little good to specify that a particular function, such as speech production, is fully localized in a particular brain region if that function interacts with knowledge and resources which might also be needed for other linguistic functions. It should be noted that this negative characterization amounts largely to the observation that connectionist theories were proposed by neurologists, not linguists, and so lack an explicit theory of linguistic structure. By contrast, the present theory combines a connectionist view of neurological function with an explicit linguistic theory. It can thus be compared to Grodzinsky's theory and evaluated for the adequacy of its linguistic predictions, but its connectionist underpinnings provide a causal account which yields further predictions about the correlation of linguistic deficits with lesion sites and normal patterns of brain activity. Let us now compare Grodzinsky's theory to the R-link model and see where they yield different predictions. The following differences are worthy of note:

(i) Grodzinsky's account predicts that difficulties with function words are restricted to production. The R-link model predicts that problems with specifiers (determiners, auxiliaries) will also occur in comprehension.
(ii) Grodzinsky's model predicts that constituent structure is fully preserved in comprehension; the R-link model predicts that constituent structure is differentially impaired. For example, it predicts that determiners and restrictive modifiers may not be recognized as parts of the noun phrase to which they belong. On the other hand, it predicts that prepositional phrases and the core of VP (verb-object structures) remain intact.
(iii) Grodzinsky's model predicts that comprehension is only impaired in constructions which involve traces; thus it predicts, for example, that comprehension of reflexives and pronouns should not be impaired. By contrast, the R-link model predicts difficulties with both classes to the extent that their interpretation requires processing of an antecedent.

In each of these cases, there is evidence that the R-link model yields more accurate predictions. Let us begin with the comprehension of determiners. As noted above, there is evidence that agrammatic aphasics do not make use of information about articles to disambiguate nominal reference (Zurif and Caramazza 1976: 269; Goodenough, Zurif and Weintraub 1975; see also Zurif 1982, 1984). On the other hand, it is clear that they do preserve some knowledge of determiners; for example, Grossman, Carey, Zurif and Diller (1986) report that agrammatic aphasics are sensitive to the fact that determiners introduce common but not proper nouns. Such a split pattern cannot be accounted for in Grodzinsky's theory. As it stands, it predicts complete comprehension of determiners, and thus is falsified by the failure of agrammatics to use articles to restrict the reference of the noun. If the theory is generalized to include comprehension, it would predict, falsely, a complete failure to respond to the syntactic cues provided by determiners. By contrast, the R-link model correctly predicts this split in performance. To begin with, it directly predicts that agrammatic aphasics will not make use of articles to determine the reference of the noun: that is simply another way to describe the absence of an R-link. Conversely, the fact that determiners cooccur with common nouns is unrelated to reference. In fact, when a determiner is used as an identity of sense pronoun, it displays exactly the same pattern:

(17) a. *Bill_i is better than his_i
     b. My friends_i are better than his_i

The fact that determiners occur with common nouns thus reflects the fact that a determiner must be S-linked to a noun which supplies its sense; since S-links are not disrupted in the R-link model of agrammatism, this aspect of grammatical competence ought to be preserved. With regard to the next issue, constituency, Grodzinsky's account predicts preserved constituency structures, at least in comprehension. It thus fails entirely to account for the evidence adduced by Zurif and


Caramazza (1976), who show that agrammatic aphasics fail to associate determiners and auxiliaries with the head of the phrase to which they should belong. By contrast, the R-link model is able to explain various contrasts in Zurif and Caramazza's data. For example, it predicts (correctly) that agrammatic aphasics will associate prepositions with their objects. One of the most striking results in Zurif and Caramazza's study is the contrast between to as a preposition and as the marker of the infinitive. In a sentence like (18a), where to is a preposition, agrammatics assigned the correct constituency; in (18b), where it is an infinitive marker, they fail to associate it with the verb:

(18) a. John gave the book to Mary.
     b. John wanted to leave.

The R-link model directly accounts for the contrast: the relation between preposition and object does not involve R-links; the infinitive marker, by contrast, is a specifier element which marks the verb as referring to hypothetical or unrealized situations and which occurs in the same syntactic position as a modal auxiliary. The verb is therefore R-linked to the infinitive marker much as it would be to a preposition; disruption of R-links thus entails a failure to integrate the infinitive marker with the correct constituent. Finally, with regard to reflexives and bound pronouns, Grodzinsky's theory predicts preserved competence. Grodzinsky (1990) reports agrammatic performance on sentences like the following:

(19) a. Is Mama Bear washing herself?
     b. Is Mama Bear washing her?
     c. Is every bear washing her?
     d. Is every bear washing herself?

Grodzinsky's patients performed nearly perfectly on (19a), with a reflexive, at chance on (19b), and above chance on (19c) and (19d). Such results appear at first to suggest that Grodzinsky's agrammatic patients are able to comprehend anaphoric binding relations. However, it should be noted that Grodzinsky's theory fails to predict the errors with (19b); at the very least, the theory would have to be modified to


stipulate a second comprehension deficit, this time involving pronominal elements but not anaphora. The picture is complicated even further when we consider other data. Blumstein et al. (1983) used a picture matching procedure to test agrammatic and other aphasics on their ability to interpret pronouns and to distinguish correctly between sentences like (20a) and (20b):

(20) a. She washed her.
     b. She washed herself.

The Broca's aphasies performed perfectly when they had to identify which individual in a picture was the referent of a pronoun. By contrast, they performed worse than any other group of aphasies on their interpretation of reflexives. This much is arguably consistent with Grodzinsky's results, since the difficulties on (20) could have been due to a tendency to assign pronouns a reflexive interpretation across the board. However, when agrammatic aphasies were tested with more complex sentences like (21), their performance was poor; Blumstein et al. interpret the results as indicating that agrammatic aphasies were simply treating reflexives as coreferential with the nearest preceding NP: (21)

a. The boy watched the chef bandage himself. b. The boy watching the chef bandaged himself.

These results are not consistent with Grodzinsky's theory. According to his account, constituent structure is fully preserved in aphasic comprehension, with the only comprehension difficulties resulting from the absence of traces. If this were correct, agrammatic patients would have no problems with sentences like (21), since the interpretation of the reflexive does not involve any traces. Problems with the reflexives could not even be explained as due to a mistaken assignment of thematic roles, for if (21b) contains a trace, it will be a subject trace, for which Grodzinsky's theory predicts correct thematic role assignment. In order to predict the difficulties encountered, Grodzinsky would need, at the least, to give up the assumption that constituency is preserved in agrammatic comprehension, a move which would undermine the rest of his theory - for how can one tell if traces are absent if there is no syntactic context within which they could be bound?

However, the data are not consistent with the R-link model in its present form, either: for in at least the simplest cases agrammatics seem to be capable of interpreting reflexive sentences, which by definition involve an R-link. Let us consider how the R-link model could accommodate such data. Thus far we have assumed that R-links are simply deleted from the representation of sentences. However, nothing in the theory requires that loss of R-links be an all-or-nothing process: it is entirely possible that agrammatic aphasics are only partially disabled in their processing of R-links.

Within the model developed in Deane (1992: 55-84, 164-178), linkage relationships are established by cognitive routines of the sort labeled productions in Anderson (1983). Such productions are condition-action structures which are triggered by pattern matching within a spreading activation model. For instance, Deane (1992: 211) proposes a production whose action is to R-link reflexive pronouns to their antecedents. Its condition is a complex pattern which characterizes the relation a NP must bear to the reflexive pronoun in order to qualify as its antecedent. However, pattern matching is driven by attentional processes, which are analyzed as involving spreading activation: the more active an element is, the more quickly pattern matching applies; conversely, the less active an element is, the more pattern matching is delayed, if not stopped altogether. Thus, if the assignment of R-links is compromised, the most probable reason would be failure of the appropriate productions to apply. The theory postulates that productions apply most strongly to salient patterns and apply slowly and uncertainly to patterns low in activation.
Thus, if the productions which process R-links are degraded but not destroyed, they will continue to apply - but only for the most salient patterns; i.e., only if a salient NP is chosen as antecedent. It seems reasonable, therefore, to adopt the following hypothesis: (22)

If their capacity to process R-links is partially degraded, agrammatic aphasics will construct R-links only if the R-link attaches to a salient constituent.


Since Deane (1992: 34-54, 78-84, 187-250) devotes considerable space to defining what makes a constituent salient, this hypothesis makes clear empirical predictions. The following predictions are relevant:

(i) C-command is analyzed as a pattern of spreading activation: that is, syntactic priming is argued to occur strongly from part to whole in the syntactic tree, but more weakly from whole to part. In consequence, c-commanding phrases are available for processing whenever the constituent they c-command is in the focus of attention.

(ii) Referentiality is a major factor: independently referential phrases are relatively salient; the less referential the phrase, the less it is intrinsically able to attract attention. Thus proper nouns are more salient than definite NPs, which are more salient than indefinites.

(iii) Semantic role hierarchies are also analyzed as involving intrinsic salience: they involve salience gradients, with agentive NPs intrinsically attracting more attention than NPs whose semantic roles are more oblique.

(iv) Elements which function as sentence or discourse topics also intrinsically attract attention.

(v) Recency is, of course, another factor in salience, with recent NPs more salient (ceteris paribus) than less recent NPs.

(vi) Grammatical status also plays a role: lexical items are intrinsically more salient than phrases, NPs are more salient than PPs, which are more salient than VPs and clauses, and content words are intrinsically more salient than function words, particularly those which are minimally referential.
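The interaction between hypothesis (22) and the salience factors just listed can be sketched computationally. The Python fragment below is a minimal illustration only, not an implementation of the productions in Deane (1992): the numeric weights, the threshold values, and all names (`Candidate`, `salience`, `r_link`) are assumptions introduced here, since the text ranks the factors without assigning them numbers. Grammatical status, factor (vi), is left out for brevity. A degraded capacity to process R-links is modelled, again by assumption, as a raised activation threshold.

```python
from dataclasses import dataclass

# Hypothetical weights: the text orders these values but gives no numbers.
REFERENTIALITY = {"proper": 3, "definite": 2, "indefinite": 1, "nonreferential": 0}
ROLE = {"agent": 2, "patient": 1, "oblique": 0}


@dataclass
class Candidate:
    label: str
    referentiality: str       # key into REFERENTIALITY, factor (ii)
    role: str                 # key into ROLE, factor (iii)
    is_topic: bool            # factor (iv)
    distance: int             # intervening NPs; 0 = most recent, factor (v)
    c_commands_target: bool   # factor (i)


def salience(c: Candidate) -> int:
    """Sum the illustrative contributions of factors (i)-(v)."""
    score = REFERENTIALITY[c.referentiality] + ROLE[c.role]
    score += 1 if c.is_topic else 0
    score += max(0, 2 - c.distance)          # recency bonus fades with distance
    score += 2 if c.c_commands_target else 0
    return score


def r_link(candidates, threshold):
    """Return the most salient candidate if it reaches the threshold, else None.

    A higher threshold stands in for a partially degraded production:
    it still applies, but only to sufficiently salient patterns.
    """
    if not candidates:
        return None
    best = max(candidates, key=salience)
    return best if salience(best) >= threshold else None


# A referential, agentive, topical, recent subject NP with c-command lost,
# versus a minimally referential function element.
subject = Candidate("Mama Bear", "proper", "agent", True, 0, False)
marker = Candidate("to", "nonreferential", "oblique", False, 0, False)

for threshold in (0, 5):
    linked = [c.label for c in (subject, marker) if r_link([c], threshold)]
    print(threshold, linked)
```

Under these assumed values, both elements can be linked when the threshold is 0 (the intact case), whereas with a threshold of 5 only the subject NP can, which mirrors the claim that degraded productions continue to apply, but only for the most salient patterns.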

Let us now consider the consequences that (22) has within the R-link model of agrammatism. Among other things, it entails that specifiers and inflections are more vulnerable to R-link disruption than pronouns, since their low levels of salience will reduce the probability that an R-link will be attached to them. Pronouns, on the other hand, are far more likely to have a salient antecedent and thus to achieve an interpretation. But the interpretation they achieve is not likely to be normal. Here is why: the normal interpretation of pronouns is heavily constrained by structural c-command relationships, which (the theory argues) are defined as spreading activation domains within the syntactic tree. But spreading activation presupposes constituency. Thus, if clause-level constituency is disrupted, as the theory predicts, c-command relations are too. Critical consequences follow. If, as the theory argues, the subject is discontinuous from the predicate, no c-command relation holds, and the subject is (for all practical purposes) a NP within a separate clause. Furthermore, the loss of c-command information entails that the most salient antecedent will be defined by semantic and pragmatic variables of the sort listed above. It will not be defined in terms of syntactic configurations (see note 4).

The consequences for ordinary third person pronouns are quite direct. To begin, bound-anaphora readings of third person pronouns seem to depend specifically on c-command: a bound-anaphora reading is ordinarily possible only if the antecedent c-commands the pronoun (cf. Reinhart 1983). A comparable analysis of bound-anaphora can be stated naturally within the present theory, simply by assuming a production which searches for an intrasentential antecedent for bound-anaphora; c-command follows automatically since (usually) only c-commanding NPs will be made accessible by spreading activation from the pronoun. Since subject NPs no longer c-command the predicate, the theory correctly predicts that bound-anaphora will not occur in sentences like (19c).

A different pattern should hold for ordinary coreference. The typical subject NP has all the properties which make a constituent salient: it is a referential, agentive, topical NP. It is also quite close to an object pronoun, and hence recent. In short, there are overwhelming reasons to prefer subject NPs as antecedents.
Normally, third person pronouns cannot be coindexed to the subject due to noncoreference effects (Condition B effects in Government-Binding theory). But, however one handles such effects, they clearly apply intraclausally. They do not apply when a pronoun takes an antecedent in another clause. But the R-link model of agrammatism disrupts clausal structure, in effect splitting the clause into two grammatical units. Thus there is nothing to prevent the pronoun from taking the subject as its antecedent, and every reason for it to do so. As a result, the theory predicts that sentences like (19b) will be given reflexive interpretations.

Slightly different consequences ensue for reflexive pronouns. According to Deane (1992: 211), prototypical reflexive pronouns are R-linked to their antecedents by a production which is sensitive (in prototypical uses) to argument status, not constituent structure: i.e., prototypical reflexives must be a coargument of the verb. This is the part of the pattern that has to be salient in order to trigger the production which interprets reflexive pronouns. Deane (1992) also argues that in prototypical cases reflexive pronouns are R-linked, not to their antecedent but to the agreement schema within tense, as shown in (7). While a functional element, the agreement morpheme has several properties which conspire to make it salient: (i) it is assigned a salient semantic role, such as agent; (ii) it is an intrinsically referential element which is R-linked in turn to the subject NP; (iii) its referent is prototypically topical; (iv) it is a coargument of the verb; (v) since it is morphologically associated with the verb, it is by definition recent. In other words, the antecedent of a reflexive pronoun is intrinsically salient, and so the R-link can be assigned.

This is not the last step, though. To derive the correct interpretation, an R-link must also be assigned which links the agreement schema to the subject NP. In this case, almost exactly the same factors come into play: the subject is a NP, typically (though not necessarily) referential, and it is recent by definition. However, another factor plays a role: since subject NPs necessarily cooccur with the tense morpheme, the two elements are mutually c-linked. As a result, they form a constituent.
The subject NP therefore c-commands the tense morpheme (even though it does not c-command the main verb) and so the subject NP is salient on practically every count. The correct interpretation of reflexives therefore falls out automatically. Note, moreover, that because of the c-command relation between the subject NP and tense, it is possible to derive a bound-anaphora reading in sentences like (19d).

Finally, the theory also appears capable of explaining Blumstein et al.'s data with complex reflexive clauses: like its account of third person pronominal elements, it depends on the loss of syntactic integrity consequent upon imperfect assignment of R-links. If a sentence like (21a) or (21b) is encountered, the theory predicts that it will be processed as a series of phrasal chunks (NP V NP VP), with consequent ambiguity of structure; interpretation of the reflexive then depends upon the semantic roles that the agrammatic has actually assigned to the NPs in the sentence.

As the preceding discussion indicates, the R-link model of agrammatism can be extended to account for agrammatics' deficits with pronouns and reflexives. In fact, the resulting theory relies on the model's key elements - difficulties with R-links and subsequent degradation of constituent structure. By contrast, Grodzinsky's theory predicts that agrammatic aphasics fully understand reflexives and pronouns; while it could be modified by stipulating difficulties with pronominal elements, such a move would simply complicate the theory.

4. Conclusion

The arguments presented above represent a rare conjunction: an explicit linguistic theory which also makes explicit predictions about the instantiation of linguistic knowledge in the brain. As such, it has an intrinsic interest to the extent that its theoretical foundations are well supported and its empirical consequences are both interesting and correct. The arguments adduced thus far suggest that the theory stands a good chance of being on the right track. But the theory is of interest for another reason: the support it gives for a cognitive approach to linguistic theory. The whole point of the theory is that it seeks to ground linguistic phenomena in general cognitive capacities; it argues for a model of syntax which is neither autonomous nor strictly modular, but is founded upon and interacts with other human mental capacities. The analysis which results suggests that grammar, far from being an independent module of mind, is simply one instantiation of the general human capacity for spatially structured thought.


Notes

1. Zurif and Caramazza's study of metalinguistic judgements correlated with constituency has been extended by Herman Kolk (Kolk 1978, Kolk and van Grunsven 1984). Kolk (1978) found that agrammatic patients had difficulty not only with function words but with attributive adjectives - results consistent with the analysis outlined in the text. But then Kolk and van Grunsven (1984) reported results which might seem to call the usefulness of the technique into question; closer examination, however, reveals a pattern consistent with Zurif and Caramazza's original results. To be precise, Kolk and van Grunsven (1984) observed a group of "agrammatic" patients with normal constituency judgements, and a group of "normal" patients with "agrammatic" constituency judgements. Closer examination resolves the paradox, however. The agrammatic patients studied in Kolk and van Grunsven (1984) failed to display dysarthria (articulatory difficulties). The presence of such difficulties in speech production indicates damage to the frontal lobe of the brain. Their absence suggests that Kolk's agrammatic patients are not classical Broca's aphasics with extensive damage to Broca's area. Rather, their syndrome may resemble the strictly "parietal" agrammatism discussed in Deane (1992: 278-281), where patients suffer not from a loss in syntactic competence but solely from a slowdown in processing. In fact, this is precisely the analysis proposed by Kolk (cf. Haarmann and Kolk 1991). It is thus not surprising that Kolk's patients continued to evince the ability to judge constituency relations. Kolk's other patients were not normal controls, but aphasic speakers who "only manifested occasional word-finding difficulties". In other words, they were mild anomic aphasics. Their constituency judgements were unreliable for all function words: for articles, auxiliaries and prepositions. But in Zurif and Caramazza's original study, agrammatic aphasics did not have difficulties with all function words - in particular, they had no difficulty with prepositions. But the anomic aphasics in Zurif and Caramazza's study displayed exactly the pattern reported by Kolk. Kolk's results thus fail to be germane to the analysis proposed in the text; they illustrate other syndromes, not classical agrammatism.

2. Connectionism in this sense is not equivalent to connectionism of the sort advocated in McClelland and Rumelhart (1986). It deals with connections among brain regions, i.e., a fairly high level of functional analysis, rather than the microanalysis advocated by McClelland and Rumelhart.

3. The rank order of errors is statistically significant beyond a level of .001 as determined by the use of the Kendall W Coefficient of Concordance.

4. This very point is substantiated by van Hoek's paper on bound-anaphora (this volume).

References

Anderson, John
  1983 The architecture of cognition. Cambridge, Mass.: Harvard University Press.
Badecker, William & Alfonso Caramazza
  1985 "On considerations of method and theory governing the use of clinical categories in neurolinguistics and cognitive neuropsychology: The case against agrammatism", Cognition 20: 97-126.
Bates, Elizabeth, Beverly Wulfeck & Brian MacWhinney
  1991 "Cross-linguistic research in aphasia: An overview", Brain and Language 41: 123-148.
Berndt, Rita Sloan
  1983 "Symptom cooccurrence and dissociation in the interpretation of agrammatism", in: Max Coltheart, Giuseppe Sartori & Remo Job (eds.), 221-234.
Berndt, Rita Sloan, Aita Salasoo, Charlotte C. Mitchum & Sheila E. Blumstein
  1988 "The role of intonation cues in aphasic patients' performance of the grammaticality judgement task", Brain and Language 34: 65-97.
Blumstein, Sheila E., Harold Goodglass, Sheila Statlender & C. Biber
  1983 "Comprehension patterns determining reference in aphasia: a study of reflexivization", Brain and Language 18: 115-127.
Blumstein, Sheila E., William P. Milberg, Barbara Dworetzky, Allyson Rosen & Felicia Gershberg
  1991 "Syntactic priming effects in aphasia: An investigation of local syntactic dependencies", Brain and Language 40: 393-421.
Buckingham, Hugh W.
  1991 "Explanations for the concept of apraxia of speech", in: Martha Taylor Sarno (ed.), 271-312.
Caplan, David (ed.)
  1980 Biological studies of mental processes. Cambridge, Mass.: MIT Press.
Caplan, David & Christine Futter
  1986 "Assignment of thematic roles to nouns in sentence comprehension by an agrammatic patient", Brain and Language 27: 117-134.
Caplan, David & Christine Hildebrandt
  1986 "Language deficits and the theory of syntax: A reply to Grodzinsky", Brain and Language 27: 168-177.
  1988 Disorders of syntactic comprehension. Cambridge, Mass.: MIT Press.
Caramazza, Alfonso & Rita Sloan Berndt
  1985 "A multicomponent deficit view of agrammatic Broca's aphasia", in: Mary-Louise Kean (ed.), 27-63.
Chomsky, Noam
  1986 Barriers. Cambridge, Mass.: MIT Press.
Coltheart, Max, Giuseppe Sartori & Remo Job (eds.)
  1987 The cognitive neuropsychology of language. Hillsdale, N.J.: Lawrence Erlbaum Associates.
Corballis, Michael C. & Ivan L. Beale
  1983 The ambivalent mind: The neuropsychology of left and right. Chicago: Nelson-Hall.
Critchley, Macdonald
  1953 The parietal lobes. London: Arnold. [1966] [Reprinted New York: Haffner]
Darley, Frederic (ed.)
  1967 Brain mechanisms underlying speech and language. New York: Grune and Stratton.
Deane, Paul D.
  1991 "Limits to attention: a cognitive theory of island constraints", Cognitive Linguistics 2(1): 1-63.
  1992 Grammar in mind and brain: Explorations in cognitive syntax. Berlin: Mouton de Gruyter.
Frazier, Lyn & Angela D. Friederici
  1991 "On deriving the properties of agrammatic comprehension", Brain and Language 40: 51-66.
Friederici, Angela D.
  1982 "Syntactic and semantic processes in aphasic deficits: The availability of prepositions", Brain and Language 15: 249-258.
  1985 "Levels of processing and vocabulary types: Evidence from on-line comprehension in normals and agrammatics", Cognition 19: 133-166.
Friederici, Angela D., J. Weissenborn & M. Kail
  1991 "Pronoun comprehension in aphasia: A comparison of three languages", Brain and Language 41: 289-310.
Geschwind, Norman
  1965 "Disconnection syndromes in animals and man", Brain 88: 237-294, 585-644.
  1979 "Specializations of the human brain", Scientific American: September.
  1983 "Biological foundations of language and hemispheric dominance", in: M. Studdert-Kennedy (ed.), Psychobiology of Language. Cambridge, Mass.: MIT Press.

Goodenough, C., Edgar B. Zurif & Sandra Weintraub
  1977 "Aphasics' attention to grammatical morphemes", Language and Speech 20: 11-19.
Goodglass, Harold
  1968 "Studies in the grammar of aphasics", in: Sheldon Rosenberg & James H. Koplin (eds.), 177-208.
  1976 "Agrammatism", in: Haiganoosh Whitaker & Harry A. Whitaker (eds.), 237-260.
Goodglass, Harold, F. A. Quadfasel & W. H. Timberlake
  1964 "Phrase length and the type and severity of aphasia", Cortex 1: 133-153.
Grodzinsky, Yosef
  1984 "The syntactic characterization of agrammatism", Cognition 16: 99-120.
  1986 "Language deficits and the theory of syntax", Brain and Language 27: 135-159.
  1988 "Syntactic representations in agrammatism: The case of prepositions", Language and Speech 31: 115-134.
  1990 Theoretical perspectives on language deficits. Cambridge, Mass.: MIT Press.
  1991 "There is an entity called agrammatic aphasia", Brain and Language 41: 555-564.
Grodzinsky, Yosef, David Swinney & Edgar Zurif
  1985 "Agrammatism: structural deficits and antecedent processing disruptions", in: Mary-Louise Kean (ed.), 65-82.
Grossman, Murray, Susan Carey, Edgar Zurif & Lisa Diller
  1986 "Proper and common nouns: Form class judgements in Broca's aphasia", Brain and Language 28: 114-125.
Gur, Ruben C., Raquel E. Gur, Allyson D. Rosen, S. Warach, A. Alavi, J. Greenberg & M. Reivich
  1983 "A cognitive-motor network demonstrated by positron emission tomography", Neuropsychologia 21: 601-606.
Haarmann, Henk J. & Herman J. Kolk
  1991 "A computer model of the temporal course of agrammatic sentence understanding: The effects of variation in severity and sentence complexity", Cognitive Science 15: 49-87.
Head, Henry
  1923 "Speech and cerebral lateralization", Brain 46: 355-528.
Hecaen, Henry
  1967 "Brain mechanisms suggested by studies of parietal lobes", in: Frederic Darley (ed.), 146-255.
Johnson, Mark
  1987 The body in the mind: The bodily basis of meaning, imagination, and reason. Chicago: University of Chicago Press.

Kean, Mary-Louise
  1977 "The linguistic interpretation of aphasic syndromes: Agrammatism in Broca's aphasia, an example", Cognition 5: 9-46.
  1980 "Grammatical representations and the description of language processes", in: David Caplan (ed.), 239-268.
Kean, Mary-Louise (ed.)
  1985 Agrammatism. Orlando: Academic Press.
Kohlmeyer, K.
  1975 "Dynamic speech studies by measurement of regional cerebral blood flow in aphasic and nonaphasic cases", in: M. Harper, B. Jennet, D. Miller & J. Rowan (eds.), Proceedings of the 7th international symposium on cerebral blood flow and metabolism. London: Churchill Livingstone.
Kolk, Herman H. J.
  1978 "Judgment of sentence structure in Broca's aphasia", Neuropsychologia 16: 617-625.
Kolk, Herman H. J. & Marianne J. F. van Grunsven
  1984 "Metalinguistic judgments on sentence structure in agrammatism: A matter of task misinterpretation", Neuropsychologia 22(1): 31-39.
Kolk, Herman H. J., Marianne J. F. van Grunsven & Antoine Keyser
  1985 "On parallelism between production and comprehension in agrammatism", in: Mary-Louise Kean (ed.), 165-206.
Lakoff, George
  1987 Women, fire and dangerous things: What categories reveal about the mind. Chicago: University of Chicago Press.
  1990 "The invariance hypothesis: Is abstract reason based on image schemas?", Cognitive Linguistics 1(1): 39-74.
Lakoff, George & Mark Johnson
  1980 Metaphors we live by. Chicago: University of Chicago Press.
Langacker, Ronald
  1991 Concept, image and symbol: The cognitive basis of grammar. Berlin: Mouton de Gruyter.
Larson, B., Erik Skinhoj, K. Soh, H. Endo & Niels A. Lassen
  1977 "The pattern of cortical activity provoked by listening and speech revealed by rCBF measurements", Acta Neurologica Scandinavica Supplementum 56: 268-269.
Lechevalier, B., M. C. Petit, F. Eustache, J. Lambert, F. Chapon & F. Viader
  1989 "Regional cerebral blood flow during comprehension and speech (in cerebrally healthy subjects)", Brain and Language 37: 1-11.
Le Doux, Joseph E.
  1983 "Cerebral asymmetry and the integrated function of the brain", in: Andrew W. Young (ed.), 203-216.
Le Doux, Joseph E., Donald H. Wilson & Michael S. Gazzaniga
  1977 "Manipulo-spatial aspects of cerebral lateralization: clues to the origin of lateralization", Neuropsychologia 15: 743-750.

Marie, P.
  1906 "Revision de la question de l'aphasie: la troisième convolution frontale gauche ne joue aucun role speciale dans la fonction du langage", Semaine Médicale 26: 241-247.
Marin, Oscar S. M., Eleanor M. Saffran & Myrna F. Schwartz
  1976 "Dissociations of language in aphasia: Implications for normal functions", Annals of the New York Academy of Science 280: 868-884.
Martin, Randi C., W. Frederick Wetzel, Carol Blossom-Stambach & Edward Fehrer
  1989 "Syntactic loss versus processing deficit: An assessment of two theories of agrammatism and syntactic processing deficits", Cognition 32: 157-191.
McClelland, James L., David Rumelhart & the PDP Research Group
  1986 Parallel distributed processing: Explorations in the microstructure of cognition, 1. Cambridge, Mass.: MIT Press.
Miceli, Gabriele, Anna Mazzucchi, Lise Menn & Harold Goodglass
  1983 "Contrasting cases of Italian agrammatic aphasia without comprehension disorder", Brain and Language 19: 65-97.
Miceli, Gabriele M., Catarina Silveri, Cristina Romani & Alfonso Caramazza
  1989 "Variation in the pattern of omissions and substitutions of grammatical morphemes in the spontaneous speech of so-called agrammatic patients", Brain and Language 36: 447-492.
Mohr, Jay P.
  1976 "Broca's area and Broca's aphasia", in: Haiganoosh Whitaker & Harry A. Whitaker (eds.), 201-236.
Mohr, Jay P., H. Funkenstein, S. Finkelstein, M. Pessin, G. W. Duncan & K. Davis
  1978 "Broca's aphasia: Pathological and clinical", Neurology 28(4): 311-324.
Mountcastle, V. B., J. B. Lynch, A. Georgopoulos, H. Sakata & C. Acuna
  1975 "Posterior parietal association areas of the monkey: Command functions for operations in extrapersonal space", Journal of Neurophysiology 38: 871-908.
Myerson, Rosemarie & Harold Goodglass
  1972 "Transformational grammars of three agrammatic patients", Language and Speech 15: 40-50.
Nespoulous, Jean-Luc, Monique Dordain, Cecile Perron, Bernadette Ska, Daniel Bub, David Caplan, Jacques Mehler & Andre Roch Lecours
  1988 "Agrammatism in sentence production without comprehension deficits: Reduced availability of syntactic structures and/or of grammatical morphemes? A case study", Brain and Language 33: 273-295.
Pick, Herbert & Linda Acredolo (eds.)
  1983 Spatial orientation: Theory, research and application. n.p.: Plenum Press.

Reinhart, Tanya
  1983 Anaphora and semantic interpretation. Chicago: University of Chicago Press.
Risberg, Jarl
  1980 "Regional cerebral blood flow measurements by 133Xe-inhalation: Methodology and applications in neuropsychology and psychiatry", Brain and Language 9: 9-34.
Samar, Vincent J. & Gerald P. Berent
  1986 "The syntactic priming effect: Evoked response evidence for a prelexical focus", Brain and Language 28: 250-272.
Sarno, Martha Taylor (ed.)
  1991 Acquired aphasia. New York: Academic Press.
Schwartz, Myrna F.
  1983 "Patterns of speech production deficit within and across aphasia syndromes: Application of a psycholinguistic model", in: Max Coltheart, Giuseppe Sartori & Remo Job (eds.), 163-200.
Schwartz, Myrna F., Eleanor M. Saffran & Oscar S. M. Marin
  1980 "The word order problem in agrammatism: I. Comprehension", Brain and Language 10: 249-262.
Schwartz, Myrna F., Marcia C. Linebarger & Eleanor M. Saffran
  1985 "The status of the syntactic deficit theory of agrammatism", in: Mary-Louise Kean (ed.), 83-124.
Smith, Wade S. & Eberhardt E. Fetz
  1987 "Noninvasive brain imaging in humans", in: Steven P. Wise (ed.), 310-346.
Sommerhoff, Gerdt
  1974 Logic of the living brain. London: John Wiley and Sons.
Talmy, Leonard
  1983 "How language structures space", in: Herbert Pick & Linda Acredolo (eds.), 225-282.
Tyler, Lorraine K.
  1983 "Spoken language comprehension in aphasia: A real-time processing perspective", in: Max Coltheart, Giuseppe Sartori & Remo Job (eds.), 145-162.
Whitaker, Haiganoosh & Harry A. Whitaker (eds.)
  1976 Studies in neurolinguistics, 1. New York: Academic Press.
Wise, Steven P. (ed.)
  1987 Higher brain functions: Recent explorations of the brain's emergent properties. New York: Wiley.
Wulfeck, Beverly
  1988 "Grammaticality judgements and sentence comprehension in agrammatic aphasia", Journal of Speech and Hearing Research 31: 72-81.
Young, Andrew W. (ed.)
  1983 Functions of the right cerebral hemisphere. Orlando: Academic Press.

Young, Andrew W. & Graham Ratliffe
  1983 "Visuospatial abilities of the right hemisphere", in: Andrew W. Young (ed.), 1-32.
Zurif, Edgar B.
  1982 "The use of data from aphasia in constructing a performance model of language", in: M. A. Arbib, D. Caplan & J. C. Marshall (eds.), Neural Models of Language Processes. New York: Academic Press.
  1984 "Psycholinguistic interpretation of the aphasias", in: David Caplan, Andre Roch Lecours & Anne Smith (eds.), Biological perspectives on language. Cambridge, Mass.: MIT Press.
Zurif, Edgar B. & Alfonso Caramazza
  1976 "Psycholinguistic structures in aphasia: Studies in syntax and semantics", in: Haiganoosh Whitaker & Harry A. Whitaker (eds.), 261-292.

Cost in language acquisition, language processing and language change

Dorit Ravid

1. Introduction

This study investigates features of inflectional morphology in spoken Israeli Hebrew across a wide range of native speakers. Its purpose was to explore cognitive principles and strategies governing the acquisition and processing of language, and to characterize and explain variations in the usage of Hebrew in speakers of different ages and socioeconomic backgrounds in the context of broader universal questions about the nature of language variation and change. The Saussurean tradition established the analysis of linguistic structures as either synchronic, and thus relating to a particular state of the language at some point in time, or diachronic, treating its development through time (de Saussure 1955). This view does not permit the investigation of changes taking place in a language such as modern Hebrew, too young to allow the time-depth perspective necessary for such an analysis (see note 1). And the generative model of an idealized homogeneous community of speakers focuses interest on abstract linguistic competence, leaving no room for the study of variation in error (Chomsky 1965). In fact, the Chomskyan model seeks to account for the human language faculty as a phenomenon properly studied only indirectly, by developing abstract systems of rules (Chomsky 1980). The language behavior of actual people is considered completely inappropriate as a source of data for theory development, since "...no individual speaks a well-defined language... In fact, each individual employs a number of linguistic systems in speaking. How can one describe such an amalgam?" (Chomsky 1979: 54).


I adopt a different position here, one that runs counter to both the synchronic/diachronic dichotomy and the homogeneous speech community hypothesis. A basic assumption of this work is that linguistic change has its source in the synchronic variation in the speech community (Weinreich, Labov & Herzog 1968). Language, while certainly an inherently human faculty bestowed equally upon all human beings, is essentially diverse in nature: each language community displays dialects, sociolects, registers, child language, the language of adolescents and that of adults, educated and uneducated speech, spoken and written codes, formal and colloquial as well as ethnic variants. This variation reflects ongoing processes of change constantly being initiated, promulgated and completed in the various groups that make up the speech community (Bailey 1973; Blount & Sanches 1977; Labov 1972, 1980, 1981; Ravid 1988). In Vygotskian terms, language is inherently fluid in four basic dimensions. In evolutionary terms, language developed in Homo sapiens sapiens. On a smaller scale, each language undergoes change during its historical development. From a third point of view, language undergoes change in every child as she develops and acquires her mother tongue. Finally, language undergoes change as we speak, during on-line conversation. Thus the study of historical language change is intrinsically related to a number of fundamental queries. Its first business is with the nature of the relationship between language and cognition: the brain uses the same machinery to represent language that it uses to represent any other entity, yet language arose as a secondary representational system, representing a primary representational level of symbols and concepts (Bickerton 1990).
Both neuroscientific and linguistic studies indicate that the entities and processes of any interaction are represented in an interwoven fashion in cognition, and that aspects of neural representation bear a strong resemblance to primitives of conceptual structure (Damasio et al. 1990; Lakoff & Norvig 1987; Langacker 1991). The study of language change may shed light on the nature of this relationship by indicating to what extent linguistic change is dependent upon structural, rule-bound determinants as opposed to general cognitive constraints.

Cost in language acquisition, processing and change


Next, the examination of general cognitive and language-specific components in the causes and trends of historical language development is associated with psycholinguistic analyses of cognitive principles and derived strategies that operate in language acquisition and in adult speakers as they produce and perceive language in its spoken and written modes, and with the study of the same principles and strategies as they interact with sociolinguistic factors governing the diversity found in varieties of adult speech (Baron 1977; Berman 1981; Hale 1973; Hooper 1979; Naro & Lemle 1976; Ravid 1988; Slobin 1977, 1985). It is by no means clear how far these principles are due to general cognitive abilities rather than being geared specifically to the acquisition, processing and monitoring of language.

Given this background, a number of specific questions arise which focus on language change. At the fundamental level, we might ask why language changes in the first place. At the lexico-semantic level, it is clear that change in the conceptual network automatically entails alterations in the secondary representational system. However, the more arbitrary components of language - phonology and grammar - also change in time, and this change is not readily interpretable in general cognitive terms, but rather in biological or evolutionary terms. The natural state of language as an evolutionary entity is to undergo change, in a manner analogous to the way living organisms undergo change. In the same way that existing gene pools always contain a number of coexisting variants, so does language. Changes in language may occur spontaneously, like biological mutations; spontaneous change is mostly phonetic in nature, in the classical neogrammarian sense (Hockett 1965; Labov 1981; Martinet 1960; Wang 1969; Wang and Cheng 1977). But the kind of change that is of interest to us here is in the morphosyntactic domain, and (to continue the biological analogy) results from natural selection, i.e. evolutionary pressure forcing a natural selection of the most appropriate variation (Gould 1977). This kind of change (e.g. the dialectal variation and historical change in the past tense form of the English verb dive from dove to dived) follows certain well-known trends. I will present these inclinations in the form of questions.


Dorit Ravid

1. Why are certain linguistic structures more prone to change than others?

Grammatical change does not occur everywhere. Some structures (like the Hebrew noun pattern CiCCa²) are particularly liable to change, while others are change-resistant. Obviously, change-prone forms differ in some fundamental way, and consequently evoke different choices by speakers. Theorists of language change claim that change-prone structures are opaque in the sense that the underlying motivation for their alternation is inaccessible to the speaker, and therefore he/she renders them more transparent by reanalyzing them, following Slobin's 1977 maxims of "being clear" and "being processible" (Lightfoot 1981, 1991). This is an example of change by "natural selection" of the most appropriate variation. However, studies show that an opposing linguistic force erodes the tendency to simplify by demanding that language be maximally expressive, with the result of increasing the surface density of the message and the degree of complexity of grammatical nuances (Slobin 1977). This tendency towards markedness or polarization results in rule addition and in forms which have greater surface complexity, and in the retention of irregular but unique forms (e.g. went), which are more salient (Andersen 1974).

2. Why are some sections of the population, namely children and disadvantaged speakers, more apt to come up with nonstandard forms than others?

3. Why are some changes restricted to these segments of the population, while others spread quickly throughout it?

Alternative forms arise in language all the time. Hebrew-speaking children frequently come up with backformed nouns such as bana 'girl' for adult bat (from irregular plural banot), but such forms never survive early childhood (Ravid 1990, to appear); older children utter variations such as mimé-xa for adult mim-xa 'for-you' during play, but abandon these forms as they grow older (Ravid 1990); nonstandard forms such as nisé-ti 'I tried' for standard nisiti occur with very high frequency among less literate speakers such as children, on the one hand, and uneducated adults, on the other, but do not characterize the literate adult population; but forms such as yoSénet '(she is) sleeping', considered a solecism by the Hebrew Language Establishment, occur in both the young pre-literate and adult non-literate speech community, as well as among literate adult speakers, though with lesser frequency. This distribution of linguistic varieties indicates disparate construals of grammatical structures by populations with specific psycho- and sociolinguistic attributes: The "Grammar" of a language is thus not a unique entity, but rather a cluster of parallel constructions matching the different understandings of the structured unit by the speaker. I assume that the incomplete mastery of both rule-bound behavior and conventionalized lexical knowledge in children leads to uniquely childish forms; and that choice of a more transparent alternative by children and uneducated adults stems from a looser grasp of rote-learned forms in addition to the limited metalinguistic ability of less literate populations. However, the fact that certain non-normative forms (that is, forms considered "incorrect" by the Language Establishment) occur under certain circumstances in the mature literate speech community indicates that register plays a significant role in variation and change as well (Ochs 1979).

4. Why are certain changes "successful", that is, why do some become established in the speech community, while others are blocked?

Language change is a fact, but only certain forms finally make it to become part of the general linguistic knowledge shared by all speakers at the end of a process of consolidation. For example, a completed change in Modern Hebrew is the failure to shift the stress in past-tense verbs to the second person plural suffix (e.g. current sipártem vs. historical sipartém 'you, Pl. told'), thus leading to the retention of the vowel a in binyan Pa'al (e.g. current katávtem vs. normative/historical ktav-tem 'you, Pl. wrote') (Ravid & Shlesinger 1987). Successful changes must possess certain features that facilitate their assimilation in the established language patterns.


As I am going to show below, changes that find their way into the established standards are less "costly" than those that do not, in that they achieve their aim without disrupting the system elsewhere and creating greater havoc.

I sought the answers to these questions in the variation patterns of Modern Hebrew. Hebrew is a Semitic language, rich in bound inflections and lexicalization devices, displaying both fusional and agglutinating structures (Blau 1981). Modern Hebrew was revived as a spoken language roughly one hundred years ago, and it has been termed a "fusion language" as it is a simultaneous descendant of at least three historical periods stretching over 4,000 years, as well as of various literary sources across the centuries (Ben Hayyim 1953; Kaddari 1983; Kutscher 1982). Hebrew further underwent lexical and, in some areas, syntactic reorganization due to borrowing from Jewish languages such as Yiddish and Judeo-Spanish, as well as from contemporary European languages (Bar-Adon 1977; Fellman 1973; Rosén 1977; Wexler 1990). The constant waves of immigration to Israel - approximately every ten years since 1880 - have resulted in an intensification of the "languages in contact" situation. Another central factor contributing to the acceleration of change in Modern Hebrew is the constant friction between the puristic stipulations of the Hebrew Language Establishment³, on the one hand, and the creole-like tendency of native speakers to lean heavily on innate linguistic principles, on the other (Manzur 1962). In the morphophonological domain, the erosion of historical distinctions in the phonological system of Hebrew has resulted in a simple phonetic system which masks multiple underlying forms (Medan 1953). These nowadays find their expression only in the orthographic system, yet they provide the motivation for an array of morphophonological operations (Blanc 1957; Ornan 1973; Rosén 1956).
Note, for example, the different behaviors of the a⁴ in sadin/sdin-im 'sheet/s' vs. sakin/sakinim 'knife/s'; or the stop/spirant alternations in saxar/yisxar 'traded/will trade' (root s-H-r)⁵ vs. saxar/yiskor 'rented/will rent' (root s-k-r); and sakar/yiskor 'reviewed/will review' (root s-q-r). In the purely morphological domain, speakers face multiple allomorphy bordering on the idiosyncratic, whose formal historical motivation is again inaccessible to native speakers. For example, the pronominal suffix on case-bound prepositions in the second person feminine singular is determined according to the historical origins of the preposition, thus: b-ax 'in-you', be- 'in', true preposition; al-áyix 'on-you', al- 'on', historically plural preposition; biSvil-ex 'for-you', be+Svil 'for', historically Svil 'path' - a noun (Ben-Asher 1974; Glinert 1982). For Hebrew speakers, who are naturally unaware of the historical facts, these are opaque systems.

The measures that speakers of Modern Hebrew of various ages and backgrounds take to remedy opacity involve language change processes of selection and modification of forms, resulting in what looks to the puristic establishment like a torrent of solecisms and deviations from historical norms (Akavia 1957; Goshen-Gottstein & Eitan 1952). These can roughly be arranged into three classes:

(i) Transient deviations or true mistakes, which occur solely in the speech of very young children and disappear with maturation, tantamount to English *holded or *mans. These deviations disappear completely by age four.
(ii) Literacy-related nonstandard forms, e.g. the use of niftéxet for niftáxat 'is opening'. These frequently occur in the speech of children up to age 8 and in that of low-SES [socioeconomic status] adolescents and adults, but are infrequent in educated adult speech.
(iii) Language change or fossilized forms, considered deviations only by the Language Establishment, which characterize the population as a whole.

The experimental design described below set out to explain this three-way distribution of childish deviations, nonstandard forms and language change phenomena in relation to the general questions raised above.

2. The research experiment

2.1. The experimental design

The test covered 10 categories: inflectional morphology categories, such as Tense Marking and Lexical Exceptions, and domains where morphophonological processes take place, e.g. Stop/Spirant Alternation and Vowel Alternation (Table 1). It consisted of 61 items assembled in the form of 5 types of tasks which subjects were asked to carry out, and which concealed the true nature of the test categories. For example, when asking the test subjects to change the adjective rax 'soft' to the feminine form raka, I was interested in the change from spirant to stop, rather than in the grammatical change; when the subjects were presented with dma'ot 'tears', the relevant morphophonological process was vowel metathesis leading to singular dim'a, although the formal task was the shift from plural to singular. Picture cards and dolls were used for elicitation with the young and also with the less literate subjects.

Table 1. The research categories (Ravid 1988)

Category               N    Example of test item and response*
Weak Syllable          10   menaka 'is cleaning, Fm': nikiti/niketi 'I cleaned'
Stop/Spirant Alt.      1    cahov 'yellow': cehuba/cahova 'yellow, Fm'
Stem Change            3    iparon 'pencil': efronot/iparonim 'pencils'
Vowel Alternation      4    hipil/herim 'dropped/raised': mapil/mepil 'is dropping'; merim/marim 'is raising'
Verb Tense             7    nizhar 'is careful': nizhar/mizaher 'is careful'
Lexical Exceptions     5    iSa 'woman': naSim/iSot 'women'
Governed Prepositions  5    tetapel be/im- 'take care in/with'
Agreement              5    ko'évet/ko'ev ha-béten 'aches, Fm/Masc the-stomach, Fm'
Backformation          6    cdafim 'shells': cédef/cdaf 'shell'
Case-Marked Pronouns   6    al-av/al-o 'on-him'
Total                  61

* The first response is the normative one, the second is considered a solecism.


2.2. Population

The test population (Table 2) consisted of 188 native speakers of Hebrew, divided into 6 age groups from age 3 to 60: 3-year olds, 5-year olds, 8-year olds, 12-year olds, 16-year olds and adults.

Table 2. The test population: Distribution by age and socioeconomic status

Group ID          Range           Mean    Level of Schooling   SES    N
3-year olds       3;0 - 3;11      3;4     Nursery School       High   23
5-year olds a     5;0 - 6;6       5;7     Kindergarten         High   21
5-year olds b     5;1 - 6;2       5;7     Kindergarten         Low    20
8-year olds       8;0 - 9;7       8;11    Third Grade          High   24
12-year olds a    12;3 - 13;4     12;10   Seventh Grade        High   21
12-year olds b    12;6 - 14;2     13;2    Seventh Grade        Low    21
16-year olds      15;11 - 17;11   16;10   Eleventh Grade       High   20
Young Adults a    22 - 35         29                           High   11
Older Adults a    49 - 61         55                           High   10
Adults b          19 - 50         32                           Low    17
Total                                                                 188

Every second group was a double group, consisting of one middle/high SES group and one low SES group. The rest of the population was of middle/high SES background.

2.3. The free speech samples

The experimental design was supplemented by spontaneous speech samples from a 7-year longitudinal study of 2 children, plus recorded free speech samples of the test subjects and observational data. The experimental and non-structured evidence yield insight into the nature of language variation and language change.

Table 3. Mean percentage of Normative scores on test as a whole, arranged in ascending order by age and SES [N=188]

Group ID            SES    % Normative
3-year olds         high   27.4
5-year olds b       low    33.8
5-year olds a       high   51.1
Adults b            low    55.9
12-year olds b      low    69.6
8-year olds         high   72.5
16-year olds        high   86.5
Older adults a      high   86.6
12-year olds a      high   87.2
Younger adults a    high   88.7

2.4. Results

The older, more educated population always did better in adhering to Standard or Normative forms than the younger or less educated subjects (Tables 3, 4). By age 12, the high-SES population reaches a very high plateau, while the low-SES subjects start out lower and actually go down in adulthood. Both show the effect of schooling. A clear statistical correspondence is found between younger high-SES subjects and older low-SES subjects. This is very consistent throughout the test categories.
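The correspondence between younger high-SES and older low-SES groups can be made concrete with a small sketch over the Table 3 scores. The function name `nearest_high_ses` and the nearest-score pairing are illustrative assumptions: the pairing is purely descriptive and is not the statistical test behind Table 4.

```python
# Mean % Normative scores copied from Table 3: (group, SES, score).
TABLE_3 = [
    ("3-year olds", "high", 27.4),
    ("5-year olds b", "low", 33.8),
    ("5-year olds a", "high", 51.1),
    ("Adults b", "low", 55.9),
    ("12-year olds b", "low", 69.6),
    ("8-year olds", "high", 72.5),
    ("16-year olds", "high", 86.5),
    ("Older adults a", "high", 86.6),
    ("12-year olds a", "high", 87.2),
    ("Younger adults a", "high", 88.7),
]


def nearest_high_ses(low_group_name):
    """Return the high-SES group whose score is closest to the named group's score.

    Hypothetical helper for illustration; not part of the original study.
    """
    scores = {name: score for name, _, score in TABLE_3}
    target = scores[low_group_name]
    high = [(name, score) for name, ses, score in TABLE_3 if ses == "high"]
    return min(high, key=lambda item: abs(item[1] - target))[0]


# Pair every low-SES group with the closest-scoring high-SES group.
for name, ses, _ in TABLE_3:
    if ses == "low":
        print(name, "->", nearest_high_ses(name))
```

On these figures each low-SES group is paired with a younger high-SES group, which is the pattern the paragraph above describes.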


Table 4. Statistically significant differences in amount of Normative responses on the test as a whole between groups of subjects, by age and SES.

[Group-by-group matrix of X marks]
Columns (ascending order of age): 3, 5a, 5b, 8, 12a, 12b, 16, Aday, Adao, Adb
Rows (ascending order of Normative responses): 3, 5b, 5a, Adb, 12b, 8, 16, Adao, 12a, Aday

Xs indicate statistically significant differences at the .05 level. Groups are ranked horizontally in ascending order of age; vertically, the test groups are arranged in ascending order of Normative responses, from the lowest to the highest.

Although this general tendency is maintained throughout the test, the subjects do better on some categories - the stable language domains - and poorly on some other categories - the language change domains. This is shown in Table 5, which ranks the test categories in terms of stability: the top domains are the least stable, i.e. undergoing language change, and as we go down their stability increases.


Table 5. Mean percentage of Normative responses on the 10 test categories ranked in ascending order of achievement for the population as a whole, and for the double hi/lo SES groups.

                                    Mean Normative Responses
Category               # of items   Total (N=188)   High SES (N=63)   Low SES (N=58)
Vowel Alternation      4            46.8            58.3              35.3
Lexical Exceptions     7            50.2            62.6              39.4
Verb-Subject Concord   5            51.5            60                40
Stop/Spirant Alter     7            58.4            73.9              42.6
Case-Marked Pronoun    5            60.5            74                49.3
Backformation          6            63.4            77                50
Weak Final Syllable    10           68.7            82.5              55.3
Verb Tense             7            75.5            86.2              71.2
Verb-Governed Prep     7            76.2            82.1              75.9
Stem Change            3            78.4            90.5              67.2
Total                  61           63.5            75.3              53.2
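The figures in Table 5 lend themselves to two quick arithmetic checks, sketched below. The first assumes that the Total row is an item-weighted mean of the category means, which is an inference from the arithmetic and not something the chapter states. The second computes the high/low SES gap per category. The names `weighted_mean` and `ses_gap` are hypothetical.

```python
# (category, number of items, total %, high-SES %, low-SES %), copied from Table 5.
TABLE_5 = [
    ("Vowel Alternation", 4, 46.8, 58.3, 35.3),
    ("Lexical Exceptions", 7, 50.2, 62.6, 39.4),
    ("Verb-Subject Concord", 5, 51.5, 60.0, 40.0),
    ("Stop/Spirant Alter", 7, 58.4, 73.9, 42.6),
    ("Case-Marked Pronoun", 5, 60.5, 74.0, 49.3),
    ("Backformation", 6, 63.4, 77.0, 50.0),
    ("Weak Final Syllable", 10, 68.7, 82.5, 55.3),
    ("Verb Tense", 7, 75.5, 86.2, 71.2),
    ("Verb-Governed Prep", 7, 76.2, 82.1, 75.9),
    ("Stem Change", 3, 78.4, 90.5, 67.2),
]


def weighted_mean(column):
    """Item-weighted mean of one percentage column (2 = total, 3 = high SES, 4 = low SES)."""
    total_items = sum(row[1] for row in TABLE_5)
    return sum(row[1] * row[column] for row in TABLE_5) / total_items


def ses_gap(row):
    """High-SES minus low-SES percentage for one category, rounded to one decimal."""
    return round(row[3] - row[4], 1)


# Compare with the Total row of Table 5 (63.5, 75.3, 53.2).
print([round(weighted_mean(col), 1) for col in (2, 3, 4)])

# Categories with the widest and the narrowest SES gap.
widest = max(TABLE_5, key=ses_gap)
narrowest = min(TABLE_5, key=ses_gap)
print(widest[0], ses_gap(widest))
print(narrowest[0], ses_gap(narrowest))
```

Under that weighting assumption the three column means reproduce the printed Total row, and the widest gap falls on Stop/Spirant Alternation while the narrowest falls on Verb-Governed Prepositions.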

2.5. Summary of findings

1. The rise in the amount of Normative responses⁶ is a direct result of increasing age and of higher literacy and socio-economic status.
2. Age and SES were found to interact. Specifically, older low-SES groups performed on the same Normative level as younger high-SES groups.
3. Some of the test categories⁷ were found to be less stable than others: Hif'il Vowel Alternation, Subject-Verb Concord, Lexical Exceptions and Stop/Spirant Alternation.
4. The acquisition of a number of categories was clearly dependent on developmental factors. These were Weak Final Syllable, Stop/Spirant Alternation, Stem Change, Verb Tense, Case-Marked Pronouns, Verb-Governed Prepositions, and Backformation.
5. Some categories and specific items were more affected by socioeconomic factors. These were Weak Syllable, Stop/Spirant Alternation, Lexical Exceptions, Case-Marked Pronouns and Backformation.
6. Finally, some of the test categories were especially stable and conformed to Normative requirements. These were Stem Change, Verb Tense and Verb-Governed Prepositions. These were strongly influenced by maturational factors and showed rapid acquisition completed around school age.

These results will now be used to discuss the research questions raised above.

3. Discussion

The data presented above have certain implications with regard to the notion of "grammar" within the framework of cognitive linguistics. In contrast to the generative view of grammar as a highly economical entity, consisting primarily of general rules with little or no exception, I will follow the outlines of "the usage-based model" as presented in Langacker (1991). As I perceive this model, rather than being entirely rule-bound, cognitive grammar is a continuum that runs from the highly general to the completely idiosyncratic. Knowledge of grammar by a native speaker includes both schemas (= rules) and rote-learned forms, with all possible variations between. A form can be either recalled ("activated directly"), or computed by employing the schema. My data show that economy and rote-learning compete, governed by considerations that may be either language-internal or language-external (age-related, sociolectal, situational, etc.), in most cases both. Language change results from this competition. The main grammatical components are memorized units and schemas, or generalizations made by the speaker on the basis of his/her knowledge of these units and using his/her understanding of their analyzability - that is, their compositionality - and of both their components and the composite (but not automatic) meanings of these components. These generalizations provide the speaker with symbolic resources that can be used to make up new composite forms. However, two important points that are directly relevant to our discussion should be made here:

(i) "...the schema describing a pattern of composition is not itself responsible for actually constructing an expression. Instead it serves a categorizing function: It furnishes the minimal specification an expression must observe to be categorized as a valid instantiation of the pattern it embodies." (Langacker 1991: 265)
(ii) "...schematic relationships are dynamic reflections of the interplay between established convention and the pressures of on-going language use. The forms they take are thus produced by the specific linguistic experience of the individuals, starting during primary language acquisition and continuing throughout life". (Langacker 1991: 117)

Why are certain linguistic structures more prone to change than others?

When speakers do not perceive an expression as an instantiation of the pattern it is supposed to embody, we have a case of opacity, or a violation of transparency. This study has shown that opaque structures, which violate the transparency principle, are more prone to change than others. The transparency principle (Kiparsky 1982; Lightfoot 1981; Slobin 1985) requires that underlying representations should be easily recoverable by having a 1:1 relationship with their surface manifestations. Within the cognitive model, transparent forms are easily perceived as instantiations of their pattern.
For example, Hebrew metathesis interchanges the t of the hitpa'el verbal pattern with the adjacent sibilant root radical, e.g. hit+sabex 'became entangled', root s-b-k, -> histabex (cf. hitkadem 'made progress', root q-d-m). The process occurs only in hitpa'el, and involves only sibilants. As a maximally transparent rule it is never violated in any age or SES group. The less stable linguistic domains in the study violate transparency, as will be shown below. The speaker's choice will be either to access the pertinent forms directly as memorized units, or to activate well-known schemas to remedy opacity by relating the offending structures to them.

Naive speakers (i.e. children and uneducated adults) employ strategies deriving from innate operating principles which constitute the innate language-making capacity. In addition to rote learning of given forms in early childhood and of exceptions later on, much evidence has been found for the use of allomorphy-reducing strategies promoting regularity and transparency (Clark & Berman 1984; Slobin 1977, 1985), such as formal simplicity - leaving the stem unchanged when morphologically modified; formal consistency - applying a rule wherever it seems appropriate; semantic transparency - promoting a 1:1 relationship between form and meaning; and saliency - attending to perceptually distinct forms. These principles lead to the creation of schemas that reduce opacity. Adherence to and loosening of the hold of such operating principles is triggered by maturation, the natural process of cognitive growth, specifically referring here to the development of language knowledge and linguistic abilities in interaction with language input.

The employment of the operating principles per se does not, however, account for a number of facts. First, the deviations that young children produce must be essentially different from the nonstandard forms of older children and uneducated adults, since they spontaneously disappear from everybody's speech, while nonstandard forms are fossilized in older age groups. Secondly, if indeed the operating principles were the only dominant factors in language acquisition and processing, why do languages still display so much allomorphy and idiosyncrasy? Obviously, there must be some contradicting forces that control or check language change.

Why are some sections of the population, namely children and disadvantaged speakers, more apt to come up with nonstandard forms than others? Why are some changes restricted to these segments of the population, while others spread quickly throughout it?


Table 6. Variation in Modern Hebrew

[Diagram labels: Childhood; Adulthood; Level of Education; Children's Errors; Synchronic Variation; Nonstandard Forms; Fossilized Forms; Diachronic Change]

I would like now to propose two criteria that govern the loss of Transient phenomena and the distribution of Standard and Nonstandard forms: cost and literacy (Table 6). These (i) regulate the application of strategies deriving from operating principles to follow transparent and regular schemas when conforming to general rules fails to yield the conventionalized grammatical units as construed by the adult; and (ii) direct the speaker to retrieve memorized units instead of activating schemas. Deviations from the norm, such as in Modern Hebrew, reflect speakers' effort to "correct" subsystems that do not convey grammatical information according to the internal organizing principles that shape morphosyntactic space.

Children are the least tolerant of irregularity because they are least able to perceive underlying relationships, having a "local" perception of grammatical systems, and being unaware of orthography (Bentur 1978; Ehri 1985; Karmiloff-Smith 1986). They most frequently re-arrange linguistic units in patterns more compatible with operating principles, which are their only guidance towards the selection of a learnable grammar. Their attempts to remedy inherent opacity are doomed to early disappearance, however, because they "cost" too much: Saliency is lost, markedness is increased, and particular typological features and rote-learned information are transgressed. Most important, although they exercise therapeutic influence on the subsystem involved, they cause opacity to related, major systems. Children's reanalyses adhere to a black-and-white linguistic model which is modified with maturation. Cost, recognized by the maturing cognitive capacity, blocks Transient deviations from moving on to the Nonstandard category.

Forms that survive to adolescence and adulthood must have some beneficial effect upon the particular subsystem with no adverse effect on the global system - this is low cost. These corrective efforts made by older speakers result in re-analyzed structures that deviate from the educated norm without impinging on other systems, to yield a situation of synchronic variation.

The factor that separates nonstandard forms from Language Change phenomena is literacy. Literacy is the result of education: It is the ability to find one's way explicitly in both the written and spoken language. It is only fully literate speakers who have access to all possible registers and levels of linguistic usage, are sensitive to school-taught norms and have an enhanced awareness of abstract underlying structures which otherwise appear idiosyncratic. Indeed, naive speakers have less tolerance towards violations of transparency, while literate ones easily accept the fact that much of language is idiosyncratic and has to be learnt by rote. In Hebrew, moreover, access to the orthographic system is especially valuable, since distinctions are retained in the orthography which have disappeared from the spoken language and can be directly referred to in the selection of a form. In the absence of literacy, naive speakers employ strategies deriving from operating principles much more than educated speakers, leading to a shallower grammar with more surface motivation. For example, my data show that 52% of the (high-SES) 5-year olds, 71% of the low-SES 12-year olds and 77% of the low-SES adults produce the nonstandard form nikéti 'I cleaned' by following (i) formal consistency (nonstandard open-syllable stem nike, instead of standard nika, by following the regular schema of siper 'told'), and (ii) formal simplicity (attaching the suffix -ti expressing 1st person singular without making any change in the stem vowel). This is why older children and uneducated adults produce more deviations from the standard than literate adults.

Why are certain changes "successful", that is, why do some become established in the speech community, while others are blocked?


The line dividing Nonstandard from Language Change phenomena is not as sharp as the one that separates Transient childish deviations from other forms. The reason is that certain types of Nonstandard forms do occur in less controlled varieties of educated Hebrew, such as relaxed, informal, intimate discourse, or the speech of adolescents. An educated person has both the Standard and Nonstandard forms at his/her disposal to use with varying degrees of metalinguistic control as appropriate to the communicative situation. Manipulating this knowledge is readily explainable within a network model, where the various forms available to the speaker are linked together with the default-case form serving as the prototype from which they extend. The prototypical, or default, form will vary with the educational, cultural and sociolectal background of the mature speaker. One piece of evidence is the fact that the high-SES subjects were able to falsify normativity⁸ in the Nonstandard but not in the Language Change categories and items, precisely because they were aware of the existence of a norm. Faced with the experimental situation, educated adults presented a carefully monitored, self-conscious linguistic image to the tester, which the low-SES subjects were not able to do.

Language change is inevitable when literate speakers can no longer perceive the motivation for an opaque structure and access the norm. This is when Operating Principles are triggered in full force and act to select and modify forms in such a way that they should be "beneficial" from the viewpoint of the system as a whole: They repair local opacity by therapeutic re-analysis (Lightfoot 1979), bringing the underlying structure close to the surface for speakers to perceive, without causing opacity elsewhere. Language Change phenomena, pervasive in the sense that they characterize the entire population across the board, serve as a link between synchronic variability and diachronic change.
They reach levels where they can no longer be regarded as deviations, but rather as representing a new standard (kyriolexia - as defined by Householder 1983). The interaction of maturation, cost and literacy with the principle of semantic transparency will be demonstrated in the domain of Tense Marking. In four separate cases, Hebrew speakers have modified tense-marked forms to keep homophonous forms apart and allow for semantic transparency.
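The way cost and literacy sort deviant forms into the three classes discussed in this section can be summarized as a small decision procedure. The sketch below is only an illustrative reading of the argument: the function `classify_form`, its two boolean parameters and the feature values assigned to the example forms are assumptions made for the illustration, not measurements reported in the study.

```python
def classify_form(high_cost: bool, norm_accessible_to_literate: bool) -> str:
    """Sort a deviant form into one of the three classes by cost and literacy.

    high_cost: the form repairs local opacity but disrupts related systems.
    norm_accessible_to_literate: literate speakers can still perceive the
        motivation for the normative form and retrieve it.
    Both parameters are hypothetical simplifications of the discussion.
    """
    if high_cost:
        # Cost blocks the form from surviving beyond early childhood.
        return "Transient deviation"
    if norm_accessible_to_literate:
        # Low cost, but literacy keeps the norm available to educated speakers.
        return "Literacy-related nonstandard form"
    # Low cost and the norm is no longer accessible: the form spreads to all.
    return "Language change"


# Illustrative feature assignments for forms cited in the chapter.
EXAMPLES = {
    "mikanes": (True, True),     # child present tense with redundant mV-
    "nikéti": (False, True),     # regularized 'I cleaned'
    "katávtem": (False, False),  # unshifted stress, a completed change
}

for form, features in EXAMPLES.items():
    print(form, "->", classify_form(*features))
```

The ordering of the two tests mirrors the text: cost is checked first because it separates Transient deviations from everything else, and literacy then separates Nonstandard forms from Language Change.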

4. Modification of tense-marked forms

4.1. Transient deviations

In 5 out of 7 verb patterns (binyanim), present tense is marked by a mV- prefix, e.g. mastir 'is hiding', mitkadem 'is making progress', while in a sixth, the Pa'al conjugation, a change in the vocalic pattern reflects the change in tense, e.g. katav/kotev 'wrote/is writing'. There are only two places in the verb system of Hebrew where past and present tense are not differentiated: in the Nif'al conjugation (e.g. nizkar 'recalled/is recalling') and in an irregular conjugation of glide-medial roots in the Pa'al conjugation (e.g. kam 'rose/is rising'). Very young children use the highly salient prefix mV- to mark present-tense verbs in precisely these two verb conjugations where the past and present forms happen to be identical (Tables 7, 8), so as to follow transparency. In both cases, the form created by children resembles the future-tense pattern in the same conjugation and thus follows formal consistency as well as semantic transparency, since in 5 out of 7 binyanim, present and future tense forms are identical but for the prefix, e.g. melamed/yelamed 'is teaching/will teach', or maklit/yaklit 'is recording/will record'.

Table 7. Transient deviations: Children's construal of the Nif'al and Pi'el paradigms: nixnas 'enter', diber 'talk'

           NIF'AL                  PI'EL
Tense      Adults     Children    Adults     Children
Past       nixnas     nixnas      diber      diber
Present    nixnas     mikanes     medaber    midaber
Future     yikanes    yikanes     yedaber    yedaber

However, despite the elegant childish solution of eliminating opacity by using the mV- prefix, its use is too costly: Both the experimental and the spontaneous speech evidence, as well as the literature, show that forms with redundant mV- disappear by age 3 years, 6 months. Creating the miCaCeC form results in a disparity between the adult and childish systems which is too great to preserve communicability, since not only is the internal vowel structure altered radically, but so is the unique external prefix mV-, which violates saliency. Moreover, as noted by Manczak (1980), present tense is among the forms that are more frequent in discourse and which remain unchanged diachronically. As maturation reduces the strict adherence to transparency, children are eventually able to come to terms with fused, portmanteau inflectional morphs (Bybee 1985), which are typical of Hebrew morphology.

Table 8. Transient deviations: Children's construal of the Pa'al and Hif'il paradigms: Sar 'sing', herim 'raise'

           PA'AL                   HIF'IL
Tense      Adults     Children    Adults     Children
Past       Sar        Sar         herim      herim
Present    Sar        maSir       merim      marim
Future     yaSir      yaSir       yarim      yarim

In terms of cost, there is a further problem: The resulting miCaCeC form (Table 7) is identical to the childish rendition of the present tense form of Pi'el, e.g. misaxek for adult mesaxek 'plays' (Berman 1983); while the MaCiC form (Table 8) is identical to the present tense form of the Hif'il pattern. Thus greater transparency in the local inflectional paradigm gives rise to opacity across derivational systems. In terms of cost, it is not worth retaining a strategy which keeps tenses distinct while merging two lexeme-creating conjugations. This cost-effective result is also in line with diachronic research showing that more opacity is to be found within inflectional paradigms (Thomason 1976).


4.2. Literacy-related nonstandard phenomena (Table 9)

The Pa'al conjugation has two present forms: CoCeC, denoting mainly action verbs, e.g. gorer 'is dragging'; and another pattern, CaCeC, denoting mainly state verbs, often used as adjectives, e.g. ra'ev 'hungry'. There is a strong tendency among children up to age 10 and among older less educated speakers to preserve transparency by assigning the active CoCeC pattern to non-stative CaCeC forms. Such transfers occur quite often among teenagers, and may sometimes be found in educated adult speech.

Table 9. CaCeC verbs: From error to kyriolexia*

           Literate form            Naive form               Classification
+state     kveda 'heavy'            kveda 'heavy'            Full adjective; no regularization
           smexa 'is happy'         somáxat 'is happy'       Transient form; childish regularization
           yeSena 'is sleeping'     yoSénet 'is sleeping'    "Jarring" deviation
           gdela 'is growing'       godélet 'is growing'     Tolerated deviation; "kyriolexia"
-state     lomédet 'is studying'    lomédet 'is studying'    Established form; completed change

* Verbs are given in the Feminine form CoCeCet where the difference from CCeCa is more salient perceptually than Masculine CaCeC/CoCeC.

This constitutes an example of an ongoing change which is spreading from young and less literate to older and more literate groups. It is a direct continuation of a historical trend which is attested even in Biblical Hebrew, as exemplified by the modern form of lomed 'is studying' rather than historical lamed (Table 9) which still occurs in formulaic expressions (Bergstrasser 1982). The existing CaCeC/CoCeC forms can be ranked on a continuum according to Householder's (1983) notion of kyriolexia - the speaker's construal of what is a "standard form", varying semantically from stative to more dynamic forms. The less stative the CaCeC verb is, the more likely it is to cross over to the major CoCeC pattern, while in terms of cost, nothing prevents this shift. The difference between the distribution of these forms in the population is made by literacy: For less educated speakers it is enough for a form to cross the middle line of stativeness to be classified as CoCeC (Table 9). Literate speakers are more sensitive to the semantic nuances of this continuum, possibly because they are more familiar with obsolete CaCeC forms in texts from various historical periods, as well as because of their preference for markedness.

4.3. Language change phenomena (Table 10)

The modal verb yaxol 'can, is able to' has an idiosyncratic paradigm within Pa'al, with identical 3rd person singular forms in past and present tense. All speakers invariably keep the two tense forms apart, either by using the copula haya 'be' to carry tense, which is the strategy favored by educated adults, as in hu lo haya yaxol li-sbol et ze 'He wasn't able to tolerate this'; or by assigning the root the Pa'al past tense form CaCaC, yielding yaxal, which is favored by the young and/or less sophisticated speakers, as in hu lo yaxal li-sbol et ze 'He could not (to) tolerate this'.

Table 10. The yaxol 'can' paradigm compared with a regular Pa'al verb and a comparable adjective

                 historical form   literate adults          immature/less literate speakers   Pa'al verb            adjective
Past Tense       yaxol             haya yaxol 'was able'    yaxal                             katav 'wrote'         haya gadol 'was big'
Present Tense    yaxol             yaxol                    yaxol                             kotev 'is writing'    gadol 'big'

This is an example of a successful linguistic change. Unlike the childish miCaCeC present-tense re-analysis which, if retained, would obliterate the characteristics of the juvenile Pi'el pattern (see Table 7), yaxol is a unique exception within Pa'al, an idiosyncratic form with no group coherence to back it up (see Table 11). Given this, it is "inexpensive" to follow transparency in creating a regular and distinct past tense form in this case, with no danger of creating problems in other systems. One might ask then how speakers tolerate the idiosyncratic present tense form of yaxol. My answer would be that it is a highly frequent, familiar, daily verb, whose entire exceptional paradigm is learnt by rote early on and thus resists change; and that it belongs to the ancient conjugation of Pa'al, which typically includes a large number of exceptional forms (Blau 1981). Moreover, it is a modal verb, akin to adjectives in meaning, sharing the same basic pattern with them (compare karov 'close', gadol 'big'). By using the copula haya, older high-SES speakers preserve transparency while holding on to a marked form. Younger/less educated speakers who adopt the regular past tense form yaxal adhere to both semantic transparency and formal consistency without being hindered by literacy.


4.4. Language change completed (Table 11)

CaCeC state verbs in Pa'al historically had the same 3rd person singular structure in past and present tense, e.g. zaken 'old/grew old'. In Modern Hebrew the past form takes the regular Pa'al pattern to yield zakan 'grew old' (Goshen-Gottstein, Livne & Span 1977). This established change of splitting CaCeC forms in past and present tense is an example of following semantic transparency in the evolution of language which is "worthwhile" in terms of cost: The change from present tense to past tense form is in the vowel (e.g. gadel/gadal 'is growing/grew'), in line with other Pa'al forms (for example, kotev/katav 'is writing/wrote'). Collapsing the a-a and a-e vowel patterns in past tense allows speakers to preserve formal consistency as well as semantic transparency by regularizing a minor form to the predominant vowel pattern within the same binyan form. No structural changes are occasioned elsewhere in the grammar outside the Pa'al paradigm, as was the case with the childish miCaCeC forms, since the a-a past tense pattern is unique to Pa'al. The change is thus purely beneficial, or "cheap", and therefore successful.

Table 11. Splitting homophonous tense forms: Stages in language change

Type of Change   Past Tense          Present Tense
Transient        niCCaC              miCaCeC
Ongoing          haya yaxol/yaxal    yaxol 'is able'
Established      yaSan               yaSen 'is sleeping'
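The consonant-and-vowel template notation used throughout this section (CoCeC, CaCaC, miCaCeC) can be read as a slot-filling procedure in which each C is replaced, in order, by a root consonant. The Python sketch below is an illustration only and not part of the original analysis: the function name fill_template is hypothetical, roots are written as plain consonant strings in the chapter's broad transcription, and morphophonological alternations (for instance the k/x alternation seen in nixnas vs. mikanes) are deliberately ignored.

```python
def fill_template(root: str, template: str) -> str:
    """Insert the consonants of `root` into the C slots of `template`.

    Hypothetical helper for illustration: every uppercase 'C' in the
    template is a slot for one root consonant; all other characters
    (vowels, prefixes such as 'mi') are copied unchanged.
    """
    slots = template.count("C")
    if slots != len(root):
        raise ValueError(f"template {template!r} needs {slots} consonants, got {len(root)}")
    consonants = iter(root)
    return "".join(next(consonants) if ch == "C" else ch for ch in template)


# Regular Pa'al verb: present CoCeC vs. past CaCaC
print(fill_template("ktv", "CoCeC"))    # kotev 'is writing'
print(fill_template("ktv", "CaCaC"))    # katav 'wrote'

# Established change: stative CaCeC present, regularized CaCaC past
print(fill_template("zkn", "CaCeC"))    # zaken
print(fill_template("zkn", "CaCaC"))    # zakan

# Transient childish form: mV- prefix added to the Nif'al root
print(fill_template("kns", "miCaCeC"))  # mikanes
```

Seen this way, the "cheap" changes discussed above swap one template for another within the same binyan, whereas the childish miCaCeC form produces output that coincides with the template of a different conjugation.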


Notes

1. Modern Hebrew is about 90 years old. It was revived as a spoken language in (then) Palestine around the turn of the century, mostly by East European Jews, and became the language of a first generation of native speakers within as little as 20 years (Bachi 1957; Blau 1981b; Kutscher 1982).
2. The Hebrew pattern CiCCa is losing its historical plural form CCaC-ot, and this is reflected, for example, in late acquisition (children go on saying dim'a/dim'ot 'tear/s' instead of dma'ot until age 6 and later), and in that it takes adults' familiarity with a lexicalized form to produce the plural form required by the Language Establishment, so that Hebrew speakers would usually say cir'a/cir'ot 'wasp/s' for Normative cra'ot. In contrast, the CéCeC pattern, which is historically related to CiCCa, is an early acquisition and is not undergoing change despite being just as complex structurally (e.g. pérek/prakim 'chapter/s') (Ravid 1988).
3. A general term for that section of the Hebrew speaking community which regards itself as guardians of historical language norms, mostly consisting of Hebrew teachers, some journalists and the Academy for the Hebrew Language, the official body that is appointed by law to uphold the Hebrew language.
4. The vowel a usually deletes or reduces in Hebrew two syllables before the stressed vowel, e.g. karov/krov-im 'relative/s' (Blau 1981a).
5. Hebrew words appear in broad phonetic transcription; however, Semitic roots are represented in a more abstract way, with root radicals indicating the underlying phonemes which determine the morphophonological alternations the root undergoes, and reflecting the Hebrew letters corresponding to them.
6. There was also a very marked rise in Appropriate, that is, nonnormative but grammatically and contextually pertinent, responses. For details see Ravid (1988).
7. See Ravid (1988) for specific test items which were especially unstable as well.
8. Please recall that "normativity" in the context of Hebrew studies refers to conforming to the dictates of the Language Establishment by adhering to rules of historical grammar.


References

Akavia, Uriel 1957 "On solecisms", Hachinuch 58: 168-175. (In Hebrew)
Andersen, Henning 1974 "Toward a typology of change: Bifurcating changes and binary relations", in: J.M. Anderson & C. Jones (eds.), Historical Linguistics. Amsterdam: North-Holland.
Bachi, Roberto 1957 "Statistics of the revival of Hebrew", Leshonenu 20: 65-82; 21: 41-68. (In Hebrew)
Bailey, Charles-James N. 1973 Variation and linguistic theory. Arlington, Virginia: Center for Applied Linguistics.
Baron, Naomi 1977 Language acquisition and historical change. Amsterdam: North-Holland.
Bar-Adon, Aharon 1977 S. Agnon and the revival of Hebrew. Jerusalem: The Bialik Institute. (In Hebrew)
Ben Hayyim, Ze'ev 1953 "An ancient language in a new reality", Leshonenu La'am 35-37. (In Hebrew)
Ben-Asher, Mordechai 1974 "Prepositions in Modern Hebrew", Leshonenu 38: 285-294. (In Hebrew)
Bentur, Esther 1978 Some effects of orthography on the linguistic knowledge of Hebrew speakers. Ph.D. dissertation, University of Illinois.
Bergstrasser, Guttholff 1982 Hebrew grammar. Jerusalem: Magnes. (In Hebrew)
Berman, Ruth A. 1981 "Language development and language knowledge: Evidence from the acquisition of Hebrew morphophonology", Journal of Child Language 9: 169-190.
Berman, Ruth A. 1983 "Establishing a schema: Children's construal of verb-tense marking", Language Sciences 5: 61-78.
Bickerton, Derek 1990 Language and species. Chicago: University of Chicago Press.
Blanc, Hayyim 1957 "Hebrew in Israel: Trends and problems", The Middle East Journal 11: 374-410.
Blau, Joshua 1981a Hebrew phonology and morphology. Tel Aviv: Hakibbutz Hame'uchad. (In Hebrew)

Blau, Joshua 1981b The renaissance of Modern Hebrew and modern standard Arabic: Parallels and differences in the revival of 2 Semitic languages. University of California Publications Series: Near Eastern studies Vol. 18.
Blount, Ben G. & M. Sanches (eds.) 1977 Sociocultural dimensions of language change. New York: Academic Press.
Bybee, Joan L. 1985 Morphology: A study of the relation between meaning and form. Amsterdam: John Benjamins.
Chomsky, Noam 1965 Aspects of the theory of syntax. Cambridge, Mass: The MIT Press.
Chomsky, Noam 1979 Language and responsibility. New York: Pantheon Books.
Chomsky, Noam 1980 "On the biological basis of language capacities", in: Noam Chomsky (ed.), Rules and representations. New York: Columbia University Press.
Clark, Eve V. & Ruth A. Berman 1984 "Structure and use in the acquisition of word formation", Language 60: 542-590.
Damasio, Antonio R., Hanna Damasio, Daniel Tranel & J.P. Brandt 1990 "Neural regionalization of knowledge access: Preliminary evidence", in: Cold Spring Harbor Symposia on Quantitative Biology, Vol. LV: The Brain. Cold Spring Harbor: Laboratory Press.
de Saussure, Ferdinand 1955 Cours de linguistique generale. 5th edition. Paris: Payot.
Ehri, L.C. 1985 "Effects of printed language acquisition on speech", in: D.R. Olson, N. Torrance & A. Hildyard (eds.), Literacy, language and learning: The nature and consequences of reading and writing. Cambridge: Cambridge University Press.
Fellman, Joshua 1973 "Concerning the 'revival' of the Hebrew language", Anthropological Linguistics, May: 250-57.
Glinert, Lewis H. 1982 "The preposition in biblical and Modern Hebrew: Toward a redefinition", Hebrew Studies 23: 115-125.
Goshen-Gottstein, Moshe & Eli Eitan 1952 "Language corrections", Leshonenu La'am 26: 3-10. (In Hebrew)
Goshen-Gottstein, Moshe, Z. Livne & S. Span 1977 Functional Hebrew grammar. Jerusalem: Shoken. (In Hebrew)
Gould, Stephen J. 1977 Ever since Darwin. New York: W.W. Norton.
Hale, Kenneth 1973 "Deep-surface canonical disparities in relation to analogy and change: An Australian example", in: Thomas S. Sebeok (ed.), Current trends in linguistics. The Hague: Mouton.

Hale, Kenneth 1981 On the position of Walbiri in a typology of the base. Reproduced by the Indiana Linguistic Club, Bloomington, Indiana.
Hockett, Charles F. 1965 "Sound change", Language 41: 185-204.
Hooper, Joan B. 1979 "Child morphology and morphophonemic change", Linguistics 17: 21-50.
Householder, Fred W. 1983 "Kyriolexia and language change", Language 59: 1-17.
Kaddari, Menachem Z. 1983 Research in Israeli Hebrew. Min Hasadna. Jerusalem: The Council for Teaching Hebrew. (In Hebrew)
Karmiloff-Smith, Annette 1986 "Some fundamental aspects of language development after five", in: P. Fletcher & M. Garman (eds.), Language acquisition: Studies in first language development (2nd ed.). Cambridge: Cambridge University Press.
Kiparsky, Paul 1982 Explanation in phonology. Dordrecht: Foris Publications.
Kutscher, Yechezkel 1982 A history of the Hebrew language. Jerusalem: Magnes.
Labov, William 1972 Language in the inner city: Studies in the black English vernacular. Philadelphia: University of Pennsylvania Press.
Labov, William 1980 "The social origins of sound change", in: William Labov (ed.), Locating language in time and space. New York: Academic Press.
Labov, William 1981 "Resolving the neogrammarian controversy", Language 57: 267-308.
Lakoff, George & Peter Norvig 1987 "Taking: A study in lexical network theory", Proceedings of the annual meeting of the Berkeley Linguistics Society 13: 195-206.
Langacker, Ronald W. 1991 Concept, image and symbol: The cognitive basis of grammar. Berlin: Mouton de Gruyter.
Lightfoot, David W. 1979 Principles of diachronic syntax. Cambridge: Cambridge University Press.
Lightfoot, David W. 1981 "Explaining syntactic change", in: N.K. Hornstein & D.W. Lightfoot (eds.), Explanation in linguistics. London: Longman.
Lightfoot, David W. 1991 How to set parameters: Arguments from language change. Cambridge, Mass: The MIT Press.
Manczak, Witold 1980 "Laws of analogy", in: Jacek Fisiak (ed.), Recent developments in historical phonology. The Hague: Mouton.
Manzur, Ya'akov 1962 "Language corrections", Leshonenu La'am 13-17. (In Hebrew)

Martinet, André 1960 Elements of general linguistics (Translated by E. Palmer). Chicago: University of Chicago Press.
Medan, Meir 1953 "Hebrew and related languages", Leshonenu La'am 33-41. (In Hebrew)
Naro, Anthony & Miriam Lemle 1976 "Syntactic diffusion", in: S.F. Stever et al. (eds.), Papers from the parasession on diachronic syntax. Chicago: Chicago Linguistic Society, 221-239.
Ochs, Elinor 1979 "Planned and unplanned discourse", in: Talmy Givón (ed.), Syntax and semantics. Vol. 12: Discourse and syntax. New York: Academic Press.
Ornan, Uzi 1983 "Regular rules and the bkp problem", in: Uzi Ornan (ed.), Readings in phonology. The Hebrew University in Jerusalem. (In Hebrew)
Ravid, Dorit 1988 Transient and fossilized phenomena in inflectional morphology: Varieties of spoken Hebrew. Ph.D. dissertation, Tel Aviv University.
Ravid, Dorit 1990 "Internal structure constraints on new-word formation devices in Modern Hebrew", Folia Linguistica 24/3-4: 289-346.
Ravid, Dorit to appear "The acquisition of morphological junctions in Modern Hebrew", in: H. Pishwa (ed.), The development of inflectional morphology.
Ravid, Dorit & Yizhak Shlesinger 1987 The case of a-deletion in Modern Hebrew. Talk given at the annual conference of the Israeli Association for Theoretical Linguistics, Bar-Ilan University, 13.5.87.
Rosén, Hayyim 1956 Our Hebrew: A linguistic analysis. Tel-Aviv: Achad Ha'am. (In Hebrew)
Rosén, Hayyim 1977 Good Hebrew: Readings in syntax. Jerusalem: Kiryat Sefer. (In Hebrew)
Slobin, Dan I. 1977 "Language change in childhood and in history", in: J. MacNamara (ed.), Language learning and thought. New York: Academic Press.
Slobin, Dan I. 1985 "Crosslinguistic evidence for the language-making capacity", in: Dan I. Slobin (ed.), The crosslinguistic study of language acquisition. Hillsdale, NJ: Erlbaum.
Thomason, Sarah G. 1976 "What else happens to opaque rules?", Language 52: 371-381.
Wang, William S. Y. 1969 "Competing residues as a cause of change", Language 45: 9-25.

Wang, William S. Y. & Chin Chuan Cheng 1977 "Implementation of phonological change: The Shuang-Feng Chinese case", in: William Wang (ed.), The lexicon in phonological change. The Hague: Mouton.
Weinreich, Uriel, William Labov & Marvin I. Herzog 1968 "Empirical foundations for a theory of language change", in: W.P. Lehman & Y. Malkiel (eds.), Directions for historical linguistics. Austin: University of Texas Press.
Wexler, Paul 1990 The schizoid nature of Modern Hebrew: A Slavic language in search of a Semitic past. Wiesbaden: Otto Harrassowitz.

From cognitive psychology to cognitive linguistics and back again: The study of category structure

Barbara C. Malt

1. Introduction

The study of concepts and classification in psychology has long been intertwined with the study of word meaning in linguistics, and both fields have drawn from, and contributed to, consideration of these same issues in anthropology and philosophy. To take just one example of this mutual influence, the psychologist Rosch's early studies of the mental representation of categories (e.g., 1973; Rosch & Mervis 1975; Rosch, Mervis, Gray, Johnson, & Boyes-Braem 1976) were heavily influenced by previous anthropological work by Berlin and his colleagues (e.g., Berlin & Kay 1969; Berlin, Breedlove, & Raven 1974), by philosophical writings by Wittgenstein (1953), and by linguistic analyses provided by Lakoff (1972). The work of Berlin et al. was itself inspired in part by linguistic hypotheses advanced by Sapir (1949) and by Whorf (1956), while the outcomes of Rosch's studies have in turn been influential in subsequent work in linguistics (e.g., Lakoff 1987; Taylor 1989). Thus this general area of investigation has been one of the most truly interdisciplinary ventures in the study of mind and language. The recent rise of an orientation within linguistics known as cognitive linguistics would seem to suggest an even closer alignment between the work of linguists and cognitive psychologists interested in concepts and word meanings. In fact, however, the trend in psychological theorizing about these topics has been in a rather different direction from the trend within cognitive linguistics. Below, I will briefly describe recent suggestions from psychology about the nature of concepts, and I will discuss the motivation behind these suggestions.

These will be contrasted with the view pursued by cognitive linguists in describing word meaning. I will then present some data addressing psychological issues that I think argue for drawing the psychological and the linguistic theorizing closer together.

1.1. Suggestions from psychology about the nature of concepts

Before the mid-1970s, an influential idea in both psychology and linguistics was that concepts and word meanings might best be described in terms of a set of necessary and sufficient features. Thus, for instance, the meaning of a word like "bachelor" would be described as consisting of defining features such as "unmarried" and "male" (e.g., Katz & Fodor 1963). Similarly, studies of concept acquisition assumed that artificial categories based on defining features provided a useful model of natural categories (e.g., Bruner, Goodnow, & Austin 1956). However, most psychologists and at least many linguists abandoned this idea following the seminal work of Rosch (e.g., 1973; Rosch & Mervis 1975) and others (e.g., Labov 1973; Lakoff 1972; Posner & Keele 1970), which suggested that defining features for most common concepts/word meanings could not be found, and that concepts/word meanings could better be described as having a prototype or family resemblance structure. Through the 1980s, linguists expanded on this new conception of word meanings, describing not only the use of common nouns in terms of a prototype structure, but also the use of other sorts of words such as prepositions (Brugman 1983; Lakoff 1987), verbs (Coleman & Kay 1981), and adjectives (Dirven & Taylor 1988). Further, they moved beyond appeals to categories based simply on similarity to a prototype and described more complex sorts of family resemblances among the uses of a word, including relationships such as metonymical and metaphorical ones (Lakoff 1987; Taylor 1989; see also Nunberg 1979). Such analyses are now applied to categories of other sorts as well, including morphological categories such as tenses and grammatical categories such as "noun" and "verb" (e.g., Lakoff 1987; Taylor 1989).


In contrast to the trend in cognitive linguistics, though, psychologists gradually moved away from, rather than toward, fully embracing the idea of concepts consisting of prototypes or family resemblance structures. The motivation for this movement has been at least threefold. First, experimental results that had been taken as evidence for prototypes in the mental representation (e.g., typicality ratings and reaction times to category membership judgments that seemed to reflect graded category membership) turned out to be less definitive than originally thought. Armstrong, Gleitman, & Gleitman (1983) showed that similar patterns of results could be obtained for categories presumably based on defining features such as "bachelor". Formal models of categorization showed that these sorts of results could be produced by representations involving neither a prototype nor a family resemblance pattern of similarity, given appropriate processing assumptions (e.g., Estes 1986; Hintzman & Ludlam 1980; Medin, Dewey, & Murphy 1983). These developments indicated that the nature of the mental representation was still an open question. Second, psychologists began to question whether family resemblance accounts of the mental representation were sufficiently constrained to explain categorization of common objects such as dogs, cats, and chairs. One problem was that existing accounts had no way to specify what features were to be counted in determining whether two objects were sufficiently similar to one another or to a category prototype to be considered category members. In the absence of some principled method, any two objects can be described as arbitrarily similar or different from one another or from a category prototype (Murphy & Medin 1985). Another problem was that the accounts gave no explanation of why some things that might seem to be sufficiently similar to the category prototype are not, in fact, counted as category members. 
For example, they could not explain why people think that a raccoon that has been surgically altered to look and act just like a skunk is not a skunk (Keil 1989). Finally, evidence began to accumulate for the existence of "folk" or naive theories (e.g., Carey 1985; Malt 1990; Murphy & Medin 1985) or "psychological essentialism" (Medin & Ortony 1989) in people's concepts of many categories. So, for example, people appear to have a folk theory of what makes a skunk a skunk or a dog a dog that centers on the idea of genetic structure or parentage (Keil 1989). Folk theories were seen as potentially providing the missing constraint on what animals people will accept as a skunk or a dog. Similarly, a folk theory of artifacts that centers on the function that various artifacts carry out might constrain what objects belong to artifact categories (see Malt & Johnson 1992). The development of such theories was seen as a potentially critical part of the child's conceptual development; thus, studying them has become a central part of research on concept acquisition (e.g., Carey 1985; Gelman 1988; Keil 1989).

1.2. Contrasts with the view pursued by cognitive linguists

In essence, then, while cognitive linguists have moved farther and farther from the idea of there being features or other types of properties that are common to all category members, psychologists have moved back in the direction of this idea (though the "theory" form of the current conception is quite distinct from the earlier defining features notion). This divergence may seem odd given that both groups ultimately seek to characterize the same thing: how the human mind understands the world and encodes that understanding in language. However, it may reflect a fundamental difference in the research strategies of psychologists vs. linguists. Psychologists are directly interested in the mental representation and processing of categories. To investigate these issues, they take as their primary data observations about human behavior; that is, the responses of subjects engaged in various types of categorization tasks. Linguists, on the other hand, take as their primary data observations about patterns of word use in a language. These data are used most directly to draw conclusions about the nature of language (or some part of language) as a system. The conclusions about language may then lead to inferences about the nature of the mind. These differences in both proximal goal and methodology may at times lead to conflicting perceptions about category structure. The psychologists' interest in representation and processing has led, among other things, to a greater emphasis on probing for explanations of classification decisions, and a correspondingly greater interest in the potential role of folk theories in determining category membership (e.g., Carey 1985; Gelman 1988; Keil 1989). The psychologists' use of human subjects with limited time and attention spans has also led them to most often deal with relatively small samples of category members (though cf. Rosch & Mervis 1975 for one exception). They may, in fact, sample only one or a few members of a given category in order to be able to test members of several different categories (e.g., Keil 1989; Malt & Johnson 1992). To achieve greater experimental control, they have also often used artificial stimuli rather than members of real-world categories (e.g., Estes 1986; Malt 1989; Medin & Shaffer 1978). In contrast, cognitive linguists study only categories given by natural languages, and they frequently examine a wide range of examples of uses of a given word, bringing to light an almost astonishing degree of variability in uses of a particular word (e.g., Brugman 1983). Both these aspects of research strategy chosen by psychologists may help produce the impression of relatively constrained category membership. Limited stimulus samples may not reflect all the complexity that exists within the full range of real-world category members, and subjects' explanations of their laboratory classification choices may not reflect all the factors that influence categorization in general. The divergence between psychological and linguistic theorizing could, in principle, also be due to the fact that cognitive linguists have generally been interested in a different set of categories than cognitive psychologists have: Cognitive psychologists have remained mainly interested in classification of objects in the world such as dogs, cats, and chairs, whereas cognitive linguists have examined many other sorts of categories.
It is possible that the structure of categories labelled by nouns such as "dog" and "chair" is different from the structure of categories labelled by prepositions or the structure of grammatical or morphological categories. However, I want to argue that, in fact, the structure of many common object categories may be similar to the structure of categories more often discussed by cognitive linguists. Adopting the linguists' strategy of examining as large a sample of category members as possible for a given category may reveal complexities in membership that have not been noted before. Further, this examination suggests that although folk theories of category membership may be an important component of the contents of a concept, their presence need not fully constrain what entities are counted as category members. Below, I present data to support these two claims.

2. Data collection

I have recently gathered data on the members of categories labelled by a number of common English nouns, and I have been using these data to explore various aspects of the mental representations of the categories. Some of the categories that we have been studying include: belt, book, bottle, collar, can, juice, milk, seed, tea, tape, and water. My students and I have compiled extensive lists of the things in the world that get called by each of these words. We used three sources in constructing our lists. The first was observation of use of the target nouns in ordinary conversations, newspapers, television, etc. We noted when the noun occurred and what sort of object it referred to. The second was a computer search of the Brown text corpus for all occurrences of the target nouns. This corpus, which was the source for the Kucera and Francis (1967) counts of word frequency in American English, consists of a selection of newspaper articles, novels, and other ordinary prose. We used the presence of adjective and noun modifiers (e.g., orange juice, Coke bottle) along with the surrounding context where necessary to determine what sort of object the noun was referring to. The third source of examples was a laboratory task aimed at getting subjects to generate examples other than the standard examples of the named categories used in many psychological studies. Fourteen Lehigh University undergraduates were asked to play a game in which they were to try to guess the same example of the target word that someone else was thinking of. They were told that the trick to the game was that their opponent was not thinking of an easy, obvious example. So, for instance, a subject might be asked to try to guess the same type of juice that someone else was thinking of, but he or she should assume it was not an obvious one like apple juice or orange juice. Subjects simply recorded their guesses for each target noun on a piece of paper.
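The corpus step described above, finding every occurrence of a target noun and using the word immediately before it (e.g., orange juice, Coke bottle) as a first clue to the kind of object named, can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the procedure actually used in the study: the function name modifier_counts is hypothetical, the input is any plain-text string rather than the Brown corpus itself, plurals are handled only by stripping a final -s, and no part-of-speech information is used, so verbal uses such as modal "can" would still need to be weeded out by hand from the surrounding context.

```python
import re
from collections import Counter

# Target nouns listed in the text.
TARGET_NOUNS = ["belt", "book", "bottle", "collar", "can", "juice",
                "milk", "seed", "tea", "tape", "water"]


def modifier_counts(text, targets=TARGET_NOUNS):
    """Count (preceding word, target noun) pairs in `text`.

    Hypothetical helper: the preceding word is only a rough proxy for
    an adjective or noun modifier.
    """
    target_set = set(targets)
    words = re.findall(r"[a-z]+", text.lower())
    counts = Counter()
    for prev, word in zip(words, words[1:]):
        # Naive plural handling: "bottles" is treated as "bottle".
        noun = word[:-1] if word.endswith("s") and word[:-1] in target_set else word
        if noun in target_set:
            counts[(prev, noun)] += 1
    return counts


sample = ("She bought orange juice and a Coke bottle. "
          "Two Coke bottles held tomato juice.")
for (modifier, noun), n in modifier_counts(sample).most_common():
    print(modifier, noun, n)
```

A tally of this kind only proposes candidate category members; deciding what sort of object each phrase refers to remains a matter of reading the context, as the text notes.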


We have also collected what I call "non-examples" of the same set of nouns. Non-examples are things that are similar in some ways to examples of the target nouns but are labelled differently; that is, they do not appear to be members of the category named by the noun. Nonexamples are important for constraining hypotheses about concepts. For instance, a reasonable description of the concept of a sweater might at first seem to be something like "a soft garment that is worn over a shirt or blouse for additional warmth in cool weather." Two non-examples, though, immediately indicate that this description is not an adequate one. Both sweatshirts and shawls meet the definition, but they are not considered to be sweaters. Therefore, the mental representation of the category must include some other sort of information that in some way yields a separation of the things we call "sweater" from the things we call "sweatshirt" and "shawl." We used two sources of non-examples. First, we simply observed entities that are similar to examples but are labelled differently in conversations, newspapers, etc. For instance, if a target noun were "sweater," we would note items of clothing that share some properties of sweaters but that have other names such as "sweatshirt" or "shawl." Second, we used a laboratory task in which fourteen Lehigh University undergraduates were asked to try to generate non-examples. They were asked, as in the task to generate examples, to play a game in which they should try to guess something that their opponent had in mind. In this version, though, they were told that the thing to try to guess was not the target category, but was something similar to it. For instance, subjects might be asked to try to guess what object someone else was thinking of that wasn't a sweater, but was similar to a sweater. The purpose of this task was to elicit a range of objects that are given category names contrasting with the target category. 
The examples that we studied did not include cases where a category name appears to be applied to an object metaphorically or metonymically. Although it is difficult (perhaps impossible) to draw a clear dividing line between literal and non-literal uses of a word, I am most concerned here with uses of words in which the objects being referred to are of the same ontological kind. It is these uses that have been of concern for theories of concepts in cognitive psychology. Accounting for these uses in itself is a large task, and relationships
among the examples of a category even when limited in this way may be quite complex.

3. Discussion of the data

3.1. Water

I will discuss data on the category picked out by the word "water" first and in most depth. "Water" is an especially interesting noun to consider because it has featured prominently in recent psychological theorizing about concepts. The discussions within psychology derive in part from work on word meaning by the philosopher Hilary Putnam. According to Putnam (1975), natural kind terms such as "water" function to pick out sets of things that share a common nature or "essence," such as a particular chemical composition. Under this view, the correct use of a natural kind term is not determined by a representation of meaning in a person's head at all; rather, the set of things that the term applies to is a matter of empirical investigation of the world. However, the view also assumes that language users strive to use words in a way that is consistent with this claim. Users of a term may sometimes have mistaken or incomplete beliefs about the properties that the word picks out and so they may sometimes use a term incorrectly. To the best of their ability, though, they use it to label things that share a common essence.

Psychologists have not generally accepted Putnam's view of word meaning as a whole. As discussed earlier, however, many psychologists have adopted a view of concepts consistent with the more psychological part of Putnam's proposal. They have suggested that, whether or not there really are sets of things sharing a common essence, people believe that there are, and people use category names to pick out sets of things that they think share a common essence (Carey 1985; Keil 1989; Malt 1990; Medin & Ortony 1989). Clearly, this idea suggests a direct connection between people's beliefs about category membership and the things in the world that they consider to be members of a category. Thus, this is a case where careful examination of the things that get called by a particular category name should
be useful in evaluating whether the proposal about concepts and category membership is correct. With respect to "water," most Americans with at least a high school education are familiar with the idea that compounds such as water have a particular chemical composition, and they are also aware that water itself is considered to be H2O. The prediction of this point of view, then, is that the liquids people will choose to label "water" will be heavily determined by their belief that the liquids have the right essence (the particular chemical composition, H2O). If someone believes a liquid is mainly H2O, he or she should call it water, and if the person believes it is not H2O, he or she should not call it water. To investigate whether this prediction holds true, we compiled a list of 43 examples of liquids that people call "water," using the three sources I described before. Some of the examples that we collected include tap water, dish water, swamp water, salt water, rain water, and swimming pool water. We also compiled a list of 55 liquids that are similar in some way to water but that are not normally called "water," also as I described earlier. Some of these are eye drops, windshield wiper fluid, Sprite, lemonade, tea, and chicken broth. More examples and non-examples are given in Table 1 below, along with the data from the task I will now describe. In order to test whether it is a belief about the essence of the substance that separates out the liquids people call "water" from liquids they do not call "water," 23 Lehigh University undergraduates were asked to estimate the proportion of H2O that they felt was contained in each instance of water. The instructions for rating the "water" examples pointed out that many liquids are not pure, but rather are a mixture of different substances; subjects were then asked to carefully consider each liquid and estimate how much H2O was actually contained in it. 
An additional 23 Lehigh students were asked to estimate the proportion of H2O they felt was contained in the non-examples. The instructions for ratings of the non-examples pointed out that the name of a liquid does not necessarily fully reveal its composition; subjects were then asked to carefully consider each example and estimate how much H2O was contained in it. (Separate groups of subjects rated the examples and the non-examples so that they could not simply use
presence or absence of the word "water" in a liquid's name as the basis for their judgment.)

Table 1 gives the average amount of H2O judged to be contained in some of the liquids. As the table shows, all the things called "water," and all the things not called "water" as well, were judged to have at least some H2O in their composition. Although the overall percent of judged H2O was higher for the things called "water" than for the things called by other names, there are two critical observations that undermine the idea that belief in a particular essence, in this case H2O, is the basis for separating members of the two categories. First of all, most liquids NOT considered to be water were judged as being composed of more than 50% H2O (in fact, the average was 67%). This result indicates that people do not consider "something" to be water simply because they believe that the dominant substance in it is H2O. If this were so, they should call most of the non-waters "water" instead. Second, and even more importantly, there was noticeable overlap in the amount of judged H2O between the liquids that are called "water" and those that are not called "water". That is, some non-waters were actually rated higher in H2O than some waters. For instance, tea was judged on average to contain 91% H2O, yet it is not called "water." In contrast, swamp water was judged on average to contain only 69% H2O, but it is considered to be water.

Table 1. Some water and non-water examples and mean judged percent H2O

Water example           Mean judged    Non-water example       Mean judged
                        percent H2O                            percent H2O

pure water              98.1           tea (cup of)            91.0
purified water          94.8           saliva                  89.3
natural spring water    92.6           coffee (cup of)         89.1
bottled water           92.3           tears                   88.6
rain water              90.9           sweat                   87.3

ice water               90.4           lemonade                86.9
soft tap water          89.9           chicken broth           81.3
drinking water          89.4           saline solution         80.1
fresh water             89.1           urine                   79.1
water fountain water    88.8           cranberry juice         76.9

pond water              78.8           transmission fluid      54.2
ocean water             78.7           corn syrup              52.8
chlorinated water       78.1           aftershave lotion       52.6
dish water              77.1           bleach                  51.3
polluted water          70.6           creme rinse             50.7
muddy water             70.3           vodka                   48.5
unpurified water        69.2           tree sap                48.2
swamp water             68.8           nail polish remover     46.8
radiator water          67.3           lighter fluid           42.3
sewer water             67.0           Cool Whip               41.8
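The overlap between the two categories can be checked directly against the means printed in Table 1. The following is only a minimal sketch: the values are transcribed from the table, which lists a subset of the liquids in the study, and the names `WATERS`, `NON_WATERS`, and `overlapping_items` are hypothetical labels introduced for illustration, not part of the original analysis.

```python
# Mean judged percent H2O, transcribed from Table 1.
# Table 1 shows only some of the liquids rated in the study, so these
# figures illustrate the overlap rather than reproduce the full data set.
WATERS = {
    "pure water": 98.1, "purified water": 94.8, "natural spring water": 92.6,
    "bottled water": 92.3, "rain water": 90.9, "ice water": 90.4,
    "soft tap water": 89.9, "drinking water": 89.4, "fresh water": 89.1,
    "water fountain water": 88.8, "pond water": 78.8, "ocean water": 78.7,
    "chlorinated water": 78.1, "dish water": 77.1, "polluted water": 70.6,
    "muddy water": 70.3, "unpurified water": 69.2, "swamp water": 68.8,
    "radiator water": 67.3, "sewer water": 67.0,
}
NON_WATERS = {
    "tea (cup of)": 91.0, "saliva": 89.3, "coffee (cup of)": 89.1,
    "tears": 88.6, "sweat": 87.3, "lemonade": 86.9, "chicken broth": 81.3,
    "saline solution": 80.1, "urine": 79.1, "cranberry juice": 76.9,
    "transmission fluid": 54.2, "corn syrup": 52.8, "aftershave lotion": 52.6,
    "bleach": 51.3, "creme rinse": 50.7, "vodka": 48.5, "tree sap": 48.2,
    "nail polish remover": 46.8, "lighter fluid": 42.3, "Cool Whip": 41.8,
}


def overlapping_items(waters, non_waters):
    """Return the items whose mean rating falls inside the other category's range."""
    lowest_water = min(waters.values())
    highest_non_water = max(non_waters.values())
    # Non-waters judged to contain more H2O than the lowest-rated water.
    non_waters_above = sorted(n for n, v in non_waters.items() if v > lowest_water)
    # Waters judged to contain less H2O than the highest-rated non-water.
    waters_below = sorted(n for n, v in waters.items() if v < highest_non_water)
    return non_waters_above, waters_below


non_waters_above, waters_below = overlapping_items(WATERS, NON_WATERS)
print(len(non_waters_above), "non-waters were rated above the lowest-rated water")
print(len(waters_below), "waters were rated below the highest-rated non-water")
```

If any cutoff on judged H2O separated the two categories, both lists would be empty. With the Table 1 means neither list is empty, which is the overlap the text describes.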

This result suggests even more strongly that the set of things called "water" is NOT determined heavily by what people believe about the presence of H2O in the liquid. To provide additional information about the concept of water, in another part of this study we collected typicality ratings for the liquids considered to be water. Twenty-six Lehigh undergraduates rated each example of water for how good or typical an example of water it was. These data are consistent with the outcome of the first part of the study. The ratings show that the waters judged closest to pure H2O in the previous task (pure water, purified water, and natural spring water) are only eighth, fifth, and fifteenth in judged typicality, respectively.
The example of water rated the best example was actually drinking water, with tap water, rain water, and water fountain water coming next in line. In other words, the liquids that people think of as the best examples of water are not necessarily those they believe have the greatest amount of H2O. So, the typicality results also suggest that the concept of water has influential components other than that of a particular chemical composition, H2O. If people are not using the word "water" to label liquids that share the composition H2O, what IS the word "water" capturing? In a third phase of the study, we selected a subset of twenty examples of water from our list of 43 (avoiding ones that were near synonyms of each other; e.g., since salt water and ocean water were both in the original list, only salt water was included). Twenty Lehigh undergraduates judged how similar each of those 20 examples of water was to each of the others (for a total of 190 judgments per subject). The average similarity judgments for each pair were then entered into a computer program that arranged the examples in an "extended tree" structure (Corter & Tversky 1986) reflecting the judged similarity. Table 2 shows the tree that resulted. Distance in the tree reflects distance in the similarity ratings. Segments that are marked with the same letter indicate features shared by non-adjacent nodes. The tree shows that three distinct clusters of waters are present. At the bottom of the tree are what I call "domestic waters" — waters that come out of the tap and are used in the home, along with waters that are also used in or around the home but come from other sources. In the middle are "wild" waters — waters that occur outdoors, in nature, and are deposited there by nature. Finally, at the top of the tree, there is a group of unhealthy or dirty waters —waters that have a large amount of ingredients other than H2O and that have some sort of negative connotation attached to them. 
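The figure of 190 judgments per subject follows from the design of the similarity task: each of the 20 waters is compared once with each of the others, which gives 20 × 19 / 2 unordered pairs. A minimal sketch of that count is given below. It assumes that the 20 leaf labels of Table 2 are the 20 examples used in the task, and the variable names are hypothetical.

```python
from itertools import combinations
from math import comb

# Leaf labels of the tree in Table 2, assumed here to be the 20 waters
# that subjects compared in the similarity task.
WATERS = [
    "swamp", "sewer", "stagnant", "auto radiator", "stream", "river",
    "flood", "rain", "puddle", "pond", "lake", "salt", "well", "mineral",
    "distilled", "swimming pool", "chlorinated", "tap", "bath", "dish",
]

# Every unordered pair of distinct waters is judged exactly once.
pairs = list(combinations(WATERS, 2))

print(len(pairs))            # number of similarity judgments per subject
print(comb(len(WATERS), 2))  # the same count from the binomial coefficient
```

The sketch covers only the number of pairwise judgments; it does not attempt the extended tree analysis itself.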
This structure suggests that at least four dimensions of similarity are salient in people's concept of water: the immediate source (from nature or via human intervention); the current location (around the home or in the wild); the function with respect to humans (domestic or not); and the relative purity of the liquid. Only the last of these corresponds to the notion of the composition of the substance. The other three are more superficial characteristics in the sense of being readily
observed by the average person, and they are ones that depend much more heavily on how humans interact with the substance. I would like to suggest that the three dimensions other than composition are relevant to explaining why some liquids are considered to be water and others are not, when there is no obvious dividing line based on beliefs about their chemical composition. Why is tea, for instance, not considered to be water when it is judged to be 91% H2O? It may be because it is drunk by humans as a hot beverage, not cold, unlike typical examples of water (its use), and because it is a mixture created through deliberate action by humans, not the processes of nature (its source). Why is grapefruit juice not considered water, when it is judged to be 76% H2O? It may be because it comes from inside a fruit (its source), and it occurs inside the fruit (its location), unlike things that are more typical waters. And why is swamp water considered water, even though it is judged to be only 69% H2O? It is in a swamp because it fell from the sky into that place like more typical examples of water (its source); it lies in a natural body (its location), and it serves the same basic life support functions for wildlife that other, purer, bodies of H2O do (its use). As further evidence for the importance of the dimensions of source and use, consider an illustration of water in a news item from the April, 1991 issue of Greenpeace Magazine (brought to my attention by P. O'Seaghdha). The news item included a photograph of the Toronto skyline and Lake Ontario that was developed entirely in water taken from Lake Ontario. Horribly enough, the water in this lake contains enough chemicals that it has the same effect on photosensitive paper as does developing solution used by photographers. Why is this liquid, which is very similar to developing solution, commonly considered to be water (rather than developing solution)? 
I suggest that it is because the liquid lies in a lake, as it was put there by nature, and it is used by humans primarily for boating or turning a water wheel or other standard lake uses. The fact that in recent years, it has acquired the ability to serve as a developing solution, is purely accidental and is incidental to its more major role in relationship to the people who live near it. For all these reasons, it seems perfectly reasonable to consider the liquid to be water, albeit quite polluted water.

Table 2. Extended tree structure representing similarity among waters

[Tree diagram not recoverable from the source; branch lengths and the lettered segments marking shared features are lost. Leaf labels, in the order they appear from the top of the tree to the bottom: swamp, sewer, stagnant, auto radiator, stream, river, flood, rain, puddle, pond, lake, salt, well, mineral, distilled, swimming pool, chlorinated, tap, bath, dish.]

If, however, this same liquid were taken from the lake, bottled, and sold in a photography store for developing pictures, it would seem misleading to label it as a bottle of water. In that circumstance, it
would be more relevant and more natural to label it as a weak developing solution. In the photography shop, its location is no longer the one originally given by nature; its use is no longer that of the more typical uses of water by humans; and its ability to develop pictures is not incidental to some other more primary use. The possibility of considering the same liquid to either be or not be water is, in fact, illustrated within the news item itself. The caption that identifies the photo refers to it as having been developed in water from Lake Ontario, but the photographer's note written on the photograph states that the film developer was Lake Ontario. Thus, this example illustrates that the status of the liquid as either being water or not being water seems to be only in part determined by the contents of the liquid. The values on the other dimensions identified in the extended tree solution are also critical to what it is perceived to be.

In sum, this detailed consideration of what liquids are examples of the category named "water" argues against there being a single dominant piece of information in the concept, in particular a folk theory of the composition of water, that heavily constrains which familiar liquids are considered to be water and which ones are not. The data by themselves do not, of course, absolutely prove that there is no single type of information or folk theory that can account for what liquids are considered to be water. However, they do clearly indicate that the one widely taken to be the most plausible one will not do the trick. Furthermore, the extended tree solution presented above has identified several other salient dimensions of water, and it appears that none of these three alone can fully separate liquids that are considered to be water from ones that are not water. In the absence of any other viable candidates for a single unifying factor, we can at this point conclude that it is unlikely that such a factor will be identified.
It is worth considering whether the word "water" might be polysemous, and whether this fact would influence the conclusions that can be drawn from the data. It is certainly possible that people (at least, those who are relatively educated in Western science) have a restricted sense of the word "water" that they use in some contexts, that is something like "pure H2O." However, the data indicate that whether or not this is so, they also use the word "water" in a more general sense to encompass many liquids with less pure compositions. Given that they
do, the original theory of what the concept consists of can at best account for one (probably infrequent) sense of the word. Clearly, a general theory of concepts or word meanings should account for the common use of the term as well as any more specialized use. As for whether the more common use might itself be viewed as constituting several different senses of the word, this possibility likewise would not be compatible with the idea that there is some property shared by all category members (unless one takes this to mean that there is a separate shared property for each sense; but then we are left with no way to motivate why they are all senses of the same word; see, e.g., Taylor 1989: 105). My consideration of the factors influencing category membership has not been detailed enough to argue for particular relationships among the different exemplars or to lay out an overall picture of the category structure. However, it is clear that the description I have given of the factors influencing membership is suggestive of a family resemblance among members: Some share similarities of source, others of location, and others of use. This family resemblance is more in line with suggestions from cognitive linguistics about category structure than it is with recent views from within cognitive psychology.

3.2. Seed

I now want to consider, more briefly, data that I have collected on two other nouns, "seed" and "tea." I will discuss "seed" first. Table 3 provides examples of things that get called "seed," along with things that are similar to the ones called seeds but that are not called "seed." What is the basis for the distinction between things we consider to be "seed" and things we do not? Botanically, a seed is a plant embryo along with food storage tissue and a protective outer coating. Most people probably know at least the fact that a new plant grows from a plant seed. However, neither the botanical definition nor the more common knowledge version actually seems to determine which things people normally think of as seeds and which things they do not. Most notably, many things are not called seeds in everyday language that
many people would, in fact, recognize as the thing that is put into the ground to produce a new plant. Those excluded from "seed" but probably often recognized as sharing the growing function include nuts, beans, avocado pits and other pits, and kernels of corn.

Table 3. Examples and non-examples of "seed"

Seed               Not Seed

apple seed         walnut, acorn, other nuts
bird seed          peach, apricot pits
cucumber seed      avocado pit
flower seed        peas
grapefruit seed    beans
grass seed         flower bulb
poppy seed         corn kernels
sesame seed
lemon seed
cotton seed
dandelion seed

What is it, then, about the everyday concept of a seed that encompasses some things with the botanical seed function and not others? Part of the distinction may be in terms of aspects of appearance: The ones we consider to be seeds tend to be small and to be encountered in groups of several or more, such as flower seeds and poppy seeds; the ones called by other names tend to be large and to occur alone (e.g., avocado pits). However, this dimension by itself does not fully account for the distinction, since peas and corn generally occur in groups and are relatively small, but they are not called seeds. Another important dimension seems to be how people typically interact with them. Many of the things we call seeds are planted for new plants and are not directly eaten by humans. Many of the things we do not call seeds, on the other hand, may be recognizable to us as sprouting new plants, but our own interactions come mainly in the context of eating rather than planting (e.g., beans, nuts, and corn are things we eat; avocado, peach, and apricot pits are things we find in the middle of fruits that we eat, and our usual interaction with them is just to throw them away).

Of interest here is that, just as with water, there is a potential definition for the word in terms of familiar aspects of a scientific definition. However, in actuality what constitutes the associated concept seems to be at best only partly driven by the information given in that definition. Instead, the category membership is to a great extent influenced by dimensions more directly related to the human experience with the referent objects, including their size, typically encountered quantity, and use as a food. This finding, in turn, suggests that the concept must involve these dimensions, not only the botanical role of the entity. Therefore, the concept of a seed does not seem to be aimed at capturing any single sort of underlying commonality nor to reflect a coherent folk theory of botany. This suggestion again is inconsistent with the emphasis on constraints on category membership recently popular in the psychological literature. It is also interesting to note that the dimension of edibleness for humans can broadly be considered to be the function of the items to humans. Thus a factor that was important for water emerges as important here also, and this fact provides a suggestion that the function dimension may be an important one that cross-cuts a variety of domains (see also Lakoff 1987).

Of course, as with water, it is possible that two concepts of seeds exist, one that conforms more closely to the botanical definition and one that diverges from it in the way I have described. People who recognize the common role in plant growth of avocado pits and beans, for instance, may find it acceptable to say things such as "Avocado pits and beans are types of seeds."
This acceptability suggests that there is a concept of seeds (for at least the botanically more sophisticated layperson) that encompasses things that are not labelled with the word "seed" in English, and this concept may indeed be relatively well-defined in terms of a botanical role. Nevertheless, it seems that there is also a subordinate concept of seeds that contrasts with concepts named in the non-example list given in Table 3. For instance, people who find the sentence above acceptable would probably at the same time find nothing odd about a sentence such as "After the trash bags ripped,
the ground was littered with leftover nuts, fruit pits, and seeds of various sorts, along with other garbage." Similarly, even sentences that assume the common botanical role, such as "Fruit pits, beans, nuts, and seeds can all be planted in pots to grow fun houseplants," seem perfectly acceptable. The acceptability seems to be because the terms are being used to pick out sets of things that would be conceptualized separately if one were looking around the house for something to sprout in a pot. Here again, then, although there may be a sense of the word "seed" that picks out a relatively well-defined category, it also appears to label a subordinate category that does not have the same structure.

3.3. Tea

Finally, I want to briefly discuss some data on the noun "tea". Table 4 provides some examples and non-examples of tea that we collected.

Table 4. Examples and non-examples of "tea"

Tea                   Not Tea

black tea             bouillon
green tea             beef broth
sassafras tea         coffee
camomile tea          hot cider
peppermint tea        chicken broth
lemon tea             cocoa
apple-cinnamon tea    hot toddy
                      vegetable broth

What distinguishes liquids that fall into the tea category from ones that are not teas? There is a tea plant (or, actually, several different plants) grown for tea as a cash crop, and the most typical teas are probably those that are brewed from the leaves of these plants. However, not all the beverages that are commonly considered to be tea are
made from these tea plants. Other teas include those made from petals or roots of various other plants (such as sassafras, rose, peppermint, etc.), and the peels or other parts of various fruits and spices such as lemon, apples, cinnamon, and so on. The category of tea, then, does not seem to be based on any obvious sort of common composition. What does separate teas from things not considered to be teas? Clearly, teas are set apart from coffees by virtue of the fact that coffees all derive from coffee beans. However, teas share with cider, cocoa, and vegetable broth the properties of being made from vegetable matter other than coffee beans, and they share with cider, broth, and various other non-teas the properties of being drunk hot and being thin in texture. They also overlap substantially in color with non-teas; teas range from light, clear colors to dark colors similar to coffee. The most consistent differences between teas and non-teas may be in the texture of the raw materials and in the method of preparation. Teas typically come in the form of dried, shredded matter that is steeped in water and then removed. Other hot beverages tend to come in powders, granules, liquids, etc., and are prepared for drinking in other ways. However, even these properties cannot alone fully distinguish teas from non-teas. For instance, coffee now comes in single-serving bags that are steeped in water just like tea bags. Until relatively recently, many teas were made from fresh leaves, roots, peels, etc., that were prepared by cooking as other liquids called broths might be, and occasionally tea is still made this way. For tea, then, as for water and seeds, there is no single dimension that provides a clear distinction between things that are considered to be category members and things that are considered to be something else.
Again, since describing the nature of category membership requires reference to multiple properties, this suggests that the associated mental representation also involves these properties. An alternative to the possibility of multiple relevant dimensions would again be that people use the word to label things that they believe share a common "essence" of some sort. However, the discussion just given suggests that this is not a strong possibility. The most likely candidate for an essence for "tea" would probably be commonality of the substance used in making tea, but the analysis shows that a variety of different substances can be used to make teas as well as non-teas. It is
fairly safe to assume that people are familiar with the varied contents of the teas and non-teas since many display their contents in their name (e.g., apple-cinnamon tea; apple cider). Thus, for "tea," too, the category membership data suggest that the concept involves multiple dimensions without one of special status that can be identified as the "essence" underlying category membership. Once again, the pattern of category membership seems to reflect a category structure more in line with recent suggestions from cognitive linguistics than from cognitive psychology.

4. Conclusions

Although I have discussed category membership data for only three particular nouns, these data suggest several more general conclusions about the nature of the associated concepts and the value of studying as many category members as possible.

4.1. The role of multiple dimensions in categorization

First, the data provide evidence that detailed information about real-world category members can be helpful for understanding the concepts underlying common object categories. In particular, the data provide evidence against the idea that the categories studied include only objects that are perceived as sharing a common characteristic. Although the category membership data themselves cannot directly tell us the full content of the concept, they do limit its possible contents. For instance, they strongly suggest that beliefs about the simple presence or absence of H2O cannot underlie the distinction between waters and non-waters. They also suggest that the most plausible single properties for separating things considered to be teas and seeds from members of contrasting categories do not seem to constitute defining features in the associated concepts. In agreement with Rosch & Mervis' (1975) and Wittgenstein's (1953) analyses and the recent extensions by cognitive linguists (e.g., Lakoff 1987; Taylor 1989), and contrary to more recent suggestions in the psychological literature, then, the data suggest that multiple dimensions are important in the
concepts underlying common object categories. Thus I want to argue that cognitive psychologists should pay more attention to category membership data, following the lead of cognitive linguists. It is worth noting that several of the same dimensions emerged as important across water, tea, and seed. In particular, the nature of the human experience with the entities seems to be important for all three of the categories studied; use and source/method of obtaining helped differentiate waters from non-waters, teas from non-teas, and seeds from non-seeds. In addition, relatively gross physical qualities were important for two of the categories (texture for tea and size and number for seed), and composition of the substance was important for two (water and tea). As more categories are analyzed, detailed data on category membership hold the promise of helping to identify dimensions that may be universally important to concepts or at least relevant to a large number of concepts across different domains.

4.2. The role of folk theories in categorization

The data on water also suggest that although people may indeed hold folk theories of the basis for membership in many categories, these folk theories alone cannot account for category membership. One explanation of this outcome is that folk theories may be part of the conscious knowledge that people have about how they determine category membership, but other factors may also exist, the influence of which is less readily accessible to consciousness (see also Malt 1991). Laboratory tasks in which people are asked to explain the basis for their category membership judgments, or even in which they do not give explanations but make category judgments in a relatively deliberate, conscious fashion, may tend to shed light on the more conscious aspects of categorization.
In contrast, studies of the range of objects that are given a particular category name under natural, non-laboratory circumstances may tend to shed light on the factors that are less accessible to consciousness. The possible interaction of the folk theories with other sorts of influences on category membership remains to be explored. Here again, it may be useful for cognitive psychologists to adopt in part the methodology of cognitive linguists.

Despite this caveat, there is a sense in which the idea of a folk theory forming the basis for category membership is compatible with the sorts of complex category structures that I have been discussing. Cognitive linguists have spoken of "idealized cognitive models" (Lakoff 1987) that may underlie category structure and determine what members of categories are typical ones and what ones are peripheral ones. Similarly, category membership for categories such as given by Dyirbal noun classifiers (Dixon 1982; see Lakoff 1987) can be described in terms of a central model and extensions of that model. These cognitive models may be similar in content to the sorts of folk theories that psychologists have talked about (see Gibbs 1993). However, there are at least two ways in which the conception of these models seems to differ from the way that psychologists have described folk theories in categorization. First, the models as described in linguistics are not taken to fully define by themselves what sorts of entities will become category members. Instead, they provide a basis for extension from central category members to things that are more distantly related. As such, they support rather than oppose the idea of family resemblance category structures. Second, these models are not necessarily accessible to consciousness. Though Lakoff (1987) has argued that speakers of Dyirbal, for instance, can verbalize the models that underlie their noun classifier system, similar analyses have also been given for the historical development of a category rather than a speaker's current knowledge (e.g. Geeraerts 1985). Thus it might be argued that the model or models that motivate the various uses of some words provide an account of how the word came to have the uses that it does, but that this account is not always explicit in the mind of the individual speaker of the language. 
Although I have just made some suggestions about similarities and differences between the psychological and linguistic conceptions of folk theories or models, it is difficult to be precise about the extent to which they overlap. Both sides have tended to use the terms in a variety of different ways and to be somewhat variable in the assumptions made about the relationship of theories or models to conscious knowledge and mental representation in general. Psychologists and linguists alike may benefit from greater clarification of what terms like "folk theory" and "idealized cognitive model" mean and where these theories or models are assumed to reside.

4.3. Category structure and mental representation

Finally, examining category membership data raises interesting questions about the relationship between category structure and mental representation. Unlike cognitive psychologists, cognitive linguists have generally not been concerned with describing the particular processing that takes place during categorization. As such, they have phrased their analyses in terms of category structure per se, and have not made strong claims about the nature of the act of categorizing (e.g., Brugman 1983: 66; Taylor 1989: 119). Despite this caution, it is not always clear to what extent the category structures described are assumed to correspond directly to mental representations. One possibility is that the categories described are represented in toto (that is, all the individual examples are represented) in the mind of a language user. An alternative, though, is that the mental representation of a category per se is something else (e.g., only a prototype or set of prototypes or "models"), and various processes operate on the representation to produce the acceptance of an individual example at a particular time. As noted earlier, cognitive psychologists have learned that it can be difficult to pin down the nature of a category representation since the representation is always acted on by some (also not fully known) cognitive processes in producing categorization at any given moment (see, e.g., Estes 1986; Medin 1986; Medin, Dewey & Murphy 1983). The complex category structures that have recently been discovered pose new challenges for cognitive psychologists who strive to understand the nature of the mental representations and the processes that act on them.

Acknowledgments

The work reported here was supported by NSF grant BNS-8909360. I thank James Corter for carrying out the ExTree analysis shown in Table 2, and Eugene Casad, Raymond Gibbs, Gregory Murphy, and an anonymous reviewer for helpful comments on a previous draft of this paper.

References

Armstrong, Sharon, Lila Gleitman & Henry Gleitman 1983 "What some concepts might not be", Cognition 13: 263-308.
Berlin, Brent & Paul Kay 1969 Basic color terms: Their universality and evolution. Berkeley: University of California Press.
Berlin, Brent, Dennis E. Breedlove & Peter H. Raven 1974 Principles of Tzeltal plant classification. New York: Academic Press.
Brugman, Claudia 1983/88 The story of OVER: Polysemy, semantics, and the structure of the lexicon. Bloomington: Indiana University Linguistics Club. Reprinted 1988: New York: Garland (Garland Outstanding Dissertations in Linguistics Series).
Bruner, Jerome S., Jaqueline Goodnow & G. Austin 1956 A study of thinking. New York: Wiley.
Carey, Susan 1985 Conceptual change in childhood. Cambridge, MA: MIT Press.
Coleman, Linda & Paul Kay 1981 "Prototype semantics: The English verb LIE", Language 57: 26-44.
Corter, James E. & Amos Tversky 1986 "Extended similarity trees", Psychometrika 51: 429-451.
Dirven, René & John Taylor 1988 "The conceptualization of vertical space in English: The case of TALL", in: Brygida Rudzka-Ostyn (ed.), Topics in cognitive linguistics. Amsterdam and Philadelphia: John Benjamins.
Dixon, Richard M. W. 1982 Where have all the adjectives gone? Berlin: Walter de Gruyter.
Estes, William K. 1986 "Array models for category learning", Cognitive Psychology 18: 500-549.
Geeraerts, Dirk 1985 "Cognitive restrictions on the structure of semantic change", in: Jacek Fisiak (ed.), Trends in linguistics: Studies and monographs 29 (Historical semantics, Historical word formation). Berlin and New York: Mouton Publishers.
Gelman, Susan A. 1988 "The development of induction within natural kind and artifact categories", Cognitive Psychology 20: 65-95.
Gibbs, Raymond W. 1993 The poetics of mind: Figurative thought, language, and understanding. Cambridge: Cambridge University Press.
Hintzman, Douglas L. & Genevieve Ludlam 1980 "Differential forgetting of prototypes and old instances: Simulation by an exemplar-based classification model", Memory & Cognition 8: 378-382.
Katz, Jerrold J. & Jerry A. Fodor 1963 "The structure of a semantic theory", Language 39: 190-210.
Keil, Frank C. 1989 Concepts, kinds, and cognitive development. Cambridge, MA: MIT Press.
Kucera, Henry & W. Nelson Francis 1967 Computational analysis of present-day American English. Providence: Brown University Press.
Labov, William 1973 "The boundaries of words and their meanings", in: Charles-James Bailey & Roger W. Shuy (eds.), New ways of analyzing variation in English. Washington, D.C.: Georgetown University Press.
Lakoff, George 1972 "Hedges: A study in meaning criteria and the logic of fuzzy concepts", Papers from the eighth regional meeting, Chicago Linguistic Society: 183-228.
Lakoff, George 1987 Women, fire, and dangerous things: What categories reveal about the mind. Chicago: University of Chicago Press.
Malt, Barbara C. 1989 "An on-line investigation of prototype and exemplar strategies in classification", Journal of Experimental Psychology: Learning, Memory, and Cognition 15: 539-555.
Malt, Barbara C. 1990 "Features and beliefs in the mental representation of categories", Journal of Memory and Language 29: 289-315.
Malt, Barbara C. & Eric C. Johnson 1992 "Do artifact concepts have cores?", Journal of Memory and Language 31: 195-217.
Medin, Douglas L., Gerald I. Dewey & Timothy D. Murphy 1983 "Relationships between item and category learning: Evidence that abstraction is not automatic", Journal of Experimental Psychology: Learning, Memory, and Cognition 9: 604-625.
Medin, Douglas L. & Andrew Ortony 1989 "Psychological essentialism", in: S. Vosniadou & A. Ortony (eds.), Similarity and analogical reasoning. New York: Cambridge University Press.
Medin, Douglas L. & Marguerite M. Shaffer 1978 "Context theory of classification learning", Psychological Review 85: 207-238.
Murphy, Gregory L. & Douglas L. Medin 1985 "The role of theories in conceptual coherence", Psychological Review 92: 289-316.
Nunberg, Geoffrey 1979 "The non-uniqueness of semantic solutions: Polysemy", Linguistics and Philosophy 3: 143-184.
Posner, Michael & Stephen W. Keele 1970 "Retention of abstract ideas", Journal of Experimental Psychology 83: 304-308.
Putnam, Hilary 1975 "The meaning of 'meaning'", in: K. Gunderson & G. Maxwell (eds.) (1973), Minnesota Studies in Philosophy of Science. Vol. 6. Minneapolis: University of Minnesota Press. Reprinted in Hilary Putnam (1975), Mind, language and reality (Philosophical papers. Vol. 2). Cambridge: Cambridge University Press, 215-271.
Rosch, Eleanor 1973 "On the internal structure of perceptual and semantic categories", in: Terence E. Moore (ed.), Cognitive development and the acquisition of language. New York: Academic Press.
Rosch, Eleanor & Carolyn B. Mervis 1975 "Family resemblances: Studies in the internal structure of categories", Cognitive Psychology 7: 573-605.
Rosch, Eleanor, Carolyn B. Mervis, Wayne D. Gray, David M. Johnson & Penny Boyes-Braem 1976 "Basic objects in natural categories", Cognitive Psychology 8: 382-439.
Sapir, Edward 1949 Language: An introduction to the study of speech. New York: Harcourt, Brace, and Company. [Original work published in 1921].
Taylor, John R. 1989 Linguistic categorization: Prototypes in linguistic theory. Oxford: Clarendon Press.
Whorf, Benjamin Lee 1965 Language, thought, and reality: Selected writings of Benjamin Lee Whorf. J. B. Carroll (ed.). Cambridge, MA: MIT Press.
Wittgenstein, Ludwig 1953 Philosophical investigations. New York: MacMillan.

Historical aspects of categorization

Gábor Györi

1. Introduction

This paper examines how categories come to be formed in a culture and the way that they become encoded in language. I claim that the process of cultural category formation is functional in nature since it is based on a speech community's adaptation to its environment.1 Etymologies reveal much about this process as they show how conceived reality can be construed in alternate ways at different points in time to facilitate this adaptation. I sketch a descriptive naming model as the mechanism for the coding of culturally valid categories in the course of semantic change.

2. Categorization

The term categorization can be understood both as a product of the operation of joint cognitive processes and as a global process of ordering various things into different groups according to certain kinds of similarities. These two aspects of categorization cannot be separated. On the one hand, an established system of categories is needed as the basis for performing the ordering process, and on the other hand, this established system of categories results from previous ordering processes involving at least two kinds of activities. We can either group entities as conforming to one degree or another to the established category system or we can group them in novel ways, which means we devise and set up new categories; in this way we end up changing the old category system. The latter option selects some novel conceptualization that prevails and becomes current. In spite of the apparent circularity inherent in this account, the existence of an established category system seems to be the more fundamental side of the coin, because the ordering process, in both of its senses, must be rooted in such a system. Still, the existence of such an established category system cannot be assumed a priori and it makes sense to ask how the system of human conceptual categories arises. Rosch has pointed out that human categorization is not "the arbitrary product of historical accident or of whimsy but rather the result of psychological principles of categorization" (Rosch 1978: 27-28). Category systems are as they are because humans tend to make categorizations in one way rather than another. Rosch proposes two basic principles of human categorization. The first one, termed cognitive economy, "asserts that the task of category systems is to provide maximum information with the least cognitive effort". The second one, which we can call structured input, "asserts that the perceived world comes as structured information rather than as arbitrary or unpredictable attributes". Rosch does not propose these principles to explain "the development of categorization in children" nor to explain "how categories are processed (how categorizations are made) in the minds of adult speakers of a language". Her purpose is rather to explain "the categories found in a culture and coded by the language of that culture at a particular point in time", i.e. "their formation in a culture" (1978: ). The issue is thus how categories come to be formed in a culture, or in other words, how the ordering process, taken in its second sense, can establish a system of categories or change an already established system. The term categorization will be used in this sense throughout this paper.

2.1. Some basic considerations

In my view, the process of category formation in a culture is closely tied to diachronic language change. For a conceptual category to spread in a culture, i.e.
to become explicitly part of the cognitive structures of the individual members of that culture, it is necessary for it to become coded into language. Coding is done by the use of a linguistic expression appropriate for the conceptualization in question. Langacker calls such an expression a target structure, which is sanctioned by the grammar and may become entrenched by linguistic convention. If this entrenchment is achieved to a sufficient degree, the conceptualization becomes available for a whole speech community (Langacker 1987: 65). For this reason, Csányi (1992) considers language to be a "social super-model" of reality, i.e. a multifaceted cognitive model (not in Lakoff's 1987 sense) shared by all members of a social group. It is only through this common model of conceived reality that humans can share their individual conceptualizations of their experiences, which are the bases for forming conceptual categories (cf. Lakoff 1987: 267). That language stabilizes conceptual structure against fragmentation is especially evident from a historical point of view (Anderson 1988: 93). This is how language can facilitate the cultural inheritance of category systems from generation to generation (cf. Csányi 1992). However, individuals of following generations never come up with exactly the same system of categories. During the course of time, the system is continually adjusted to fit the changing needs of the succeeding generations of the given culture. The language spoken in that particular culture is not only the medium of the cultural inheritance process, but also the tool of this continuous, almost imperceptible adjustment. Thus, conceptual categories of cultural significance must become encoded linguistically without exception and they have a historical character in the sense that they comprise the experience and knowledge of earlier generations made available for future ones. They are subject to alterations, however, as this experience and knowledge change over time. This paper attempts to give an explanation of the workings of this process.

2.2. Categorization as functional behavior

A functional view of categorization claims that the formation of categories is part of a conceptualizer's adaptive behavior for organizing and controlling its environment (Herrnstein 1984; Rosch 1978; Zimmerman 1979). In the process of categorization, the perceiving entity arranges different phenomena or stimuli into one group or class as long as their differences can be considered irrelevant to the behavioral purposes at hand (Rosch 1978: 29). Thus, the perceived similarity of stimuli is relative to the role that the stimuli play in a sentient being's behavior. Behavior in this context means any kind of interaction between that being and its environment. This kind of similarity is called analogy, because, as Holyoak (1984: 204) puts it, "analogy ... is structured similarity with functional import". The recognition of such an analogy is at the basis of the formation of human conceptual categories as well, but for the human mind analogy is not wholly based on perceptual attributes, as it is for organisms lower on the evolutionary scale (Csányi 1988, 1992), where function is always a matter of physical properties. To begin, I discuss the topics of concepts, categories and categorization and their functional roles. Concepts and categories are actually two sides of the same coin. Concepts are mental representations of categories and they function as recognition devices of instances of the categories (cf. Smith and Medin 1981: 8). Categories consist of a class of related phenomena, be they entities, parts of entities, properties of entities, or relations between entities, which are grouped together mentally on the basis of some kind of salient perceptual or functional similarity. Phenomena do not constitute categories in their own right, independently of anyone who conceptualizes them, but rather form categories only in accordance with the given person's cognitive structuring of his environment. Nonetheless, certain natural discontinuities necessarily do play a role in this process (cf. Rosch 1978). By perceptual similarity I mean the coincidence of the same sensory attributes in different entities. For example, it is on the basis of its shape that we identify something as a 'sphere' and contrast it with a 'cube'.
A simple example of the coding of one category as distinct from another one purely on perceptual grounds is German Rappe 'black horse', which contrasts with the more general term Pferd 'horse'. The more general term Pferd includes in its semantics no specification whatsoever of a particular color to be associated with the animal it designates, whereas the semantics of Rappe includes precisely such a specification. Functional similarity is a much vaguer and broader term than perceptual similarity. It not only implies similar usage, in the strict sense of physical manipulation of objects, but also embraces the coincidence of all those attributes of entities that exhibit the same relevance for a particular type of behavior of an organism. Most of the time, functional similarity overrides mere perceptual similarity. This allows the conceptualizer to group together physically unlike entities. In short, perception is always selective since it is determined by anticipatory schemata that are never independent of the person's adaptation to the environment (Neisser 1976: 55). Similarity thus does not reside objectively in the entities themselves, but emerges as a result of the subjective conceptualizations made by a person (cf. Taylor 1989: 60). How can one know the world, one might ask, if subjectivity plays any role in cognition? The answer is simple: cognition is not simply knowing reality, but knowing reality in a way that facilitates a person's optimal adaptation to it. Parts of reality are conceptualized according to the role that they play in this adaptation. A clear example is the group of things a person "considers" consumable. My dog categorizes the bones of a chicken that I leave after lunch as food, maybe even as a delicacy, while I categorize them as garbage. However, this is not only an interspecies phenomenon but also an intraspecies one. Different peoples (i.e. cultures) may conceptualize similar experience in alternate ways due to its different role in their lives. Casad (1988) has shown, via examples from Cora, that even the conceptualization of locations, which one would think to be universal on the basis that they are so basic to our physical perception and spatial orientation, can be functionally dependent on the idiosyncratic adaptation of the conceptualizers to their environment.
For different conceptualizing beings, and for the same entities under different environmental conditions (which not only include the physical but also social and cultural aspects), different construals of phenomena may gain validity according to their adaptive value. For example, consider the variety of places that can serve as shelter for one entity but not for another and vice versa, or just think of the fact that a given animal can be the predator of another and the prey of a third. As for humans, we can construe a piece of stone as a weapon, a tool, maybe even as a toy or simply as a formation of nature depending on the situation we have in mind.

A functional approach to semantic research should not exclude subjectivity, but view it as the basis for adaptive behavior. Of course, subjectivity has its limits, and it is just its adaptive value that will regulate it. A good example of this is the observation that people most commonly conceive of electricity in two different ways, i.e. as analogous to either flowing water or to a crowd moving through passageways. As Gentner and Gentner (1983) have shown, distinct problems of electrical circuitry are better understood by applying whichever of these two subjective conceptualizations is more appropriate for grasping the given problem. The same applies to the appropriate usage of the distinct metaphors of anger in English (cf. Kövecses 1986; Lakoff 1987), which are all different subjective conceptualizations of this psychological state. All are valid in particular circumstances in the sense that they facilitate our understanding of this psychological state in those circumstances. Within the animal world a choice that hinders optimal adaptation to the environment rather than facilitates it is not likely to gain validity because it would jeopardize the animal's survival. In the case of humans, the problem is, of course, much more complex. Adaptation and survival are by no means to be understood in the strict biological sense but rather in a socio-cultural one. The environment in which human conceptual categories emerge is always a given culture, and the attributes of phenomena considered important for categorization reflect the characteristics of this particular culture. One of Rosch's principles of category formation, viz. perceived world structure, maintains that objects are analyzed into attributes according to the structure inherent in the world and grouped together on that basis. 
Rosch (1978: 41) realized, however, that some attributes, even in the case of basic objects, were not really out there in the world and that their correlational structure had to be provided by human knowledge. She found three types of these "problematic" attributes. The first type of attribute did not seem to be meaningful prior to a subject's recognition of the object as a member of a particular category, for example, the attribute 'seat' in relation to the object 'chair'. The second type of attribute seemed to be meaningful only in terms of the superordinate category of the object; thus a piano is large only when it is compared to other pieces of furniture, but not when it is compared to buildings. The third type seemed to require knowledge about human activities, as, for example, the statement 'you eat on it' in the case of a table. These findings indicated, first, that in the formation of categories the perceived world structure is very often "influenced by the functional needs of the knower", and second, that the analysis of objects into certain kinds of attributes very often requires a system of cultural knowledge and an already developed category system (Rosch 1978: 42). In conclusion, the fact that concepts and categories have an essentially functional role in cognition influences not only their structure but also the mechanism of their encoding (cf. Zimmerman 1979).

3. Semantic change

My purpose in this paper is not to give a general explanation for all types of semantic change, but rather to explain the role that semantics has in establishing conceptual categories of cultural validity. This is seen as a historical process manifested in linguistic coding, which takes place within the domain of semantic change.2 In short, when looking at how established meanings are adapted for the expression of new concepts, what I am specifically interested in is the way that the conventional category system can give rise to new categorizations.

3.1. Semantic change as a cognitive process

Every language, at any given point in its historical development, encodes a well-defined and finite system of culturally significant conceptual categories. But language is in constant change; therefore a linguistically coded category system can never be adequately characterized by a static model. Perhaps the underlying conceptual categories are more or less stationary. However, the manifestation of the existence of any conceptual category at the level of a culture is the way that it is coded into the language. As we have already stated above, the formation of categories, at least at the level of a whole culture, cannot be separated from their coding in language, which entails the emergence of new expressions. The primary avenue for new expressions to arise in a language is through semantic change. It is within this historical linguistic process that the formation of categories in a culture becomes most easily seen (cf. Rosch 1978: 42). Thus, etymologies, which provide clues about former and later states of a language, can provide many clues to the cultural formation of categories. In the following section of this paper, I will try to show that Rosch's two basic principles in category formation, along with other characteristics of category formation, can be seen in the process of category encoding through semantic change. This change does not simply mirror the historical and cultural formation of categories, but actively influences it. In other words, a language defines the material that can be utilized in order to allow speakers to form and express new conceptualizations because it is the stock of lexical items of the language that makes a conceptual category accessible, at least at the level of a cultural community. It is a fact in the history of languages that words change their meanings in the course of time. But why should a word change its meaning? Most historical linguists agree that meaning change is primarily due to external factors: to social, economic, and cultural changes in the life of a speech community. Sometimes taboo and euphemism are mentioned as separate, psychological factors. But they are also due to external causes and respond to socio-cultural expectations. We can also identify internal causes of semantic change, but these are always due to other changes that were ultimately induced by external factors. A prominent example is the clash of homonyms, when the phonological merger of lexical items results in ambiguity, which is often resolved by replacement of one of the two lexical items.
Such cases, however, do not figure prominently in the present study. Geeraerts (1988) notes that historical-philological semantics has always had its affinity to cognitive semantics. In particular, Paul's (1920) approach to semantic change and meaning in general is much closer to a modern cognitive account than it is to later structuralist views. For one, Paul's view is very similar to today's cognitive semantics in emphasizing the encyclopedic character of meanings (cf. Haiman 1980; Langacker 1987: 154). Paul also differentiated between usual (=conventional) and occasional (=nonce) meanings of a word as a precondition for any possibility of change of meaning (Paul 1920: 75). He defined usual meaning as the total conceptual content (gesamter Vorstellungsinhalt) that is associated with a word for any member of a speech community, and nonce meaning as the conceptual content (Vorstellungsinhalt) that the speaker associates with a word while uttering it, and that he expects the hearer to associate with it also. For the production and understanding of a nonce meaning of a word, speaker and hearer respectively have to resort to the usual (i.e. original and literal) meaning of that word. Paul noticed that the basis for change of meaning was a discrepancy between these two meanings and that any novel meaning of a word was a candidate for developing into a conventional meaning on its own. The process of a novel meaning turning usual is a gradual one, but generally it can be said that a change of meaning has occurred when speakers of a language can interpret the erstwhile novel meaning of a word without resorting to the original one (Paul 1920: 84). Thus, speakers no longer recognize ENG hut as a semantic extension of ENG hide, although both words derive from Proto-Indo-European (PIE) *(s)keu- 'cover, conceal'.3 An even clearer example is that of ENG crane, which, in one sense, designates a construction device equipped with a long boom for lifting materials. This meaning is now completely separate from the meaning in which crane designates a kind of bird with a long skinny neck. This is especially true for those speakers of English who have never seen the bird.4 In Langacker's terms this change occurs when the cognitive routine connected to the occasional meaning reaches a degree of entrenchment where it achieves unit status (Langacker 1987: 100).
In Paul's view the process is accelerated when an occasional meaning is handed down to a subsequent generation (Paul 1920: 85). The reason for this must be that this is a phase in the development of a language when linguistic convention sanctions a particular meaning for a word (=target structure, cf. Langacker 1987: 66). An example of this is Hungarian tuja, a slang equivalent (used almost exclusively in the capital) of villamos 'tram'. Only the older generation knows today that tuja is derived from HUNG hátulja 'the back of something' on the basis that some decades ago poorer children in Budapest liked to travel on the back fender of trams in order not to pay a fare. As a description of the process of semantic change Paul's formulation is valid to this day, even if recent approaches to the history of language emphasize the usage of words in new contexts, changes in reference, or changes in the connection between signifiant and signifié (cf. Palmer 1978: 306; Jeffers and Lehiste 1979: 127; Bammesberger 1984: 98; Hock 1986: 300). But even these approaches will not suffice as full explanations, because they do not give us any insight into what goes on inside the heads of speakers and listeners while the change of meaning is taking place. Although in this respect Paul's explanation is not a proper explanation either, it clearly attributes to the language user a very important role in semantic change. Anticipating the usage-based theory of cognitive grammar (cf. Langacker 1987, 1990), Paul realized that all understanding between different persons is based on the correspondence of their psychological behaviors, and because of this, the understanding of novel meanings requires greater correspondence between speaker and hearer than does the understanding of ordinary meanings (Paul 1920: 78). Thus, when asking the question of why words change their meanings, we are actually asking about the speakers' reasons for using a word in a novel manner instead of using it in the conventional way. While the reasons ultimately triggering semantic change are clearly external, the speakers' reasons for reacting to these external stimuli must have cognitive grounds. The psychological motives of speakers for starting to use a word in a different way are to be found in their adaptive behavior that adjusts their language as a cognitive system to their changing environment. This is illustrated by the origin of ENG glass.
It was through the Romans that the Germanic people got acquainted with this material and they named it after something they already knew and had coded into their language, namely Common-Germanic (CGER) *glazam 'amber'. In short, language has to be adjusted to changes in perceived reality if it is to fulfil its purpose as an instrument of cognition. Thus, if we accept social, cultural and economic factors as external causes of semantic change, and I think it is reasonable to do so, the corresponding psychological motives of the speakers must primarily be their need for instituting new meanings due to these changes in the environment. A continual adjustment of language to how speakers conceive of reality can only be explained on the basis of a usage-based grammar such as that explicated by Langacker (1987, 1990).5 A grammar based exclusively on general patterns with universal validity is not flexible enough to do the job. As Paul aptly observed, new meanings of a word always develop from older ones, to which speakers can resort for interpretation. The functional reason for this is that all innovation in a language "has to pass the test of intelligibility" (Palmer 1978: 309). This simply means that a speech community will only accept and conventionalize those innovations whose understanding does not put unnecessary strain on the hearers and whose usage facilitates rather than hinders communication.6 In terms of cognitive grammar, the closer that a new meaning is to the meaning of some already established conventional unit in the grammar, the more likely the grammar is to sanction a particular target structure which conveys that new meaning (cf. Langacker 1987: 66).

Thus, when Germanic people began to refer to a wall as a woven thing, as attested in the etymology of German (GER) Wand 'wall' going back to PIE *wendh- 'to turn, wind, weave', the reference must have been quite clear, since the weaving of flexible branches and twigs was the technology that they employed in erecting walls at that point in history. That words come from other words is commonplace in historical linguistics. Hopper (1990) points out that it is quite probable that even unanalyzable morphological segments consist of highly grammaticalized residues of earlier morphemes. In many cases we know that certain forms that are considered to be single morphemes and are unanalyzable today actually contain several morphemes if we look at them diachronically, but they have become so grammaticalized that the former polymorphemic structure has become obscure in the course of time. This can be seen in GER bleiben 'to stay, remain', which goes back to Middle High German belîben, where be- was still an analyzable prefix attached to a form that comes from PIE *(s)lei- 'to be wet, to stick'. Consider also the etymology of HUNG fiók meaning 'drawer
(in any type of furniture)', which is not transparent to native speakers any longer. Diachronically, the word contains the morpheme fi, derived from fiú 'boy, son', while -ók is an old diminutive suffix. This is a metaphorical name for 'drawer.' It construes this particular object as being a 'small son (of a piece of furniture).' This source, however, is now totally obscure to native speakers of Hungarian. Hopper speaks about an ever ongoing attrition of morphemes counterbalanced by accretion "in response to the collective needs of discourse." He notes: The structuralist, cognitive notion of linguistic forms which correspond to (or 'mirror') concepts or ideas (such as 'persisting entities') is replaced by forms caught up in continual change in response to the on-going referential needs of discourse (Hopper 1990: 159). The kind of structuralist, cognitive notion of linguistic forms criticized here by Hopper is the one implied in objectivist views on language, which are based on a rationalist account of cognition in which linguistic signs are treated as abstract symbols whose meaning is based solely on their correspondence to categories of things in the world. The objectivist view cannot account convincingly for semantic change since it implies a static linguistic structure characterized by rigid semantic boundaries resulting from an alleged one-to-one correspondence to real world categories (cf. Harnad 1990: 336; Johnson 1987: x,xxii; Lakoff 1987: 162). Objectivist semantics is unable to come up with an adequate explanation of semantic change, simply because this would mean that the supposed direct and systematic correspondence between mental representations and external reality would have to be given up. Otherwise, how could we explain that the truth conditions of one and the same expression can change over time? 
This is what happened to GER Wand (from PIE *wendh- 'to turn, wind, weave'), mentioned above, which came to be used to refer to walls even after they were no longer woven. On the other hand, a non-objectivist, experientialist view of the interaction of language and cognition, which treats meanings in terms of the way people conceptualize experiential stimuli, provides an
explanation for the "continual change" by revealing the psychological backgrounds of "the ongoing referential needs of discourse" (e.g. Johnson 1987: 174; Lakoff 1987: 292; Langacker 1990: 2). As a concrete example, let us examine a semantic change in which category coding is clearly manifested. The etymology of the English word glass, meaning 'a vessel made of glass for drinking,' is quite apparent: the vessel got its name from the material it is made of. In fact, it comes from Old English (OE) glæs, meaning the material 'glass'. OE glæs is derived from the Common-Germanic (CGER) stem *glaza-, found in *glazam, meaning 'amber.' The basis for the change in meaning should be clear. Glass (the material) is similar to amber in that it is translucent, shining and can also be used as jewelry, as, in fact, Germanic people may have seen Romans employing it. Going further back into history, we have the PIE root *ghel-, meaning 'shine, glitter'. It should not be too daring to infer that Germanic people conceptualized amber as something shining. What these successive acts of category coding reveal is, in fact, a series of conceptualizations: amber was conceptualized as something shining, glass (the material) was conceptualized as something similar to amber, and glass (the vessel) was conceptualized as something made of glass (the material). With respect to semantic change and the formation of cultural categories, we can now construct a general cultural scenario for the linguistic process just described. Changes in the environment or in the cultural or economic life of a people (a speech community) can have any of the following results:

(i) create new entities, or
(ii) make hitherto unknown ones known, or
(iii) make already known ones be seen in a new light.

If these entities are in any way important at the level of a culture, i.e. in the life of a whole speech community, then they have to be designated in some way for purposes of talking about them. Etymological examples reflecting the above processes are not difficult to find: thus for (i) in addition to the already mentioned forms of ENG hut and GER Wand, we have ENG wheel going back to PIE *kwel- 'to revolve, move around' or ENG hat going back to PIE *kadh- 'to shelter, cover'; as an example for (ii) beside glass (see above) we could mention ENG gold derived from PIE *ghel- 'to shine, glitter' and ENG hawk derived from PIE *kap- 'to grasp'; and for (iii) we find ENG cloud coming from PIE *gel- 'to form into a ball'. GER Wolke 'cloud,' coming from PIE *welg- 'wet' could also be mentioned as an example of (iii), since Proto-Indo-Europeans already had a word for 'cloud,' i.e. *nebh-, actually surviving in GER Nebel 'mist, fog'.7 When the need arises to refer to entities that are not conventionally categorized in a language, the subjective application of usual meanings in novel ways in appropriate contexts appears to be the best solution for securing intelligibility among speakers (cf. Palmer 1978: 309). If such newly designated classes of entities are culturally significant enough, the novel meanings applied to them will turn into new conventionalized meanings with time. For example, we do not have to resort to the meaning 'shine, glitter' (of the PIE root *ghel-) in order to understand the meaning of the lexical item glass. The meaning of the lexical item glass has become a usual meaning for us, and thus Paul's criterion (cf. Paul 1920: 75) has been fulfilled. At the same time, a new conceptual category (i.e. GLASS) has become coded into the language, and this has been done on the basis of the existing category system.

3.2. Semantic change and the formation of cultural categories

The process of semantic change, as it relates to category coding, is basically twofold and can be easily explained with an example from evolutionary theory. A species can evolve either by anagenesis or cladogenesis. Anagenetic evolution is when a species as a whole goes through genetic changes and evolves into a different new species because of environmental changes in the niche of the species.
Cladogenetic evolution, on the other hand, is a bifurcation process in which a population gets isolated (usually geographically) from the rest of the species and evolves into a separate species as a result of different environmental influences, while the original species also goes on existing parallel to the new.

In an analogous way there can be anagenetic and cladogenetic change in meaning due to encoding. Anagenetic change is when a word with meaning x changes its meaning into meaning y (in the course of time) while meaning x ceases to exist. The process is as follows. Word w with the usual meaning x develops the occasional meaning x'. With time this latter one develops into a new usual meaning y, while the old usual meaning x disappears completely, i.e. word w now has the meaning y. Of course word w may go through certain sound changes in the meantime. An example is OE wrîtan 'scratch, carve' > MODENG write.8 On the other hand, cladogenetic change of meaning is when a word with meaning x gives rise to a meaning y but also retains meaning x. The process again starts with a word w developing the novel meaning x' from its usual meaning x. However, while the novel meaning x' develops into a new usual meaning y, the old usual meaning is also retained. In this case there is often also a bifurcation process in the vocal form, which is a basic cause for the etymology of a word to become obscure. An example is ENG hut < PIE *(s)keu- 'cover, conceal,' while we still have ENG hide < PIE *(s)keu- 'cover, conceal'. In both kinds of change the novel meaning x' can only be understood with respect to the conventional meaning x. However, after a certain period of time speakers no longer have to invoke the conventional meaning x, but understand the erstwhile novel meaning x' on its own, and as a rule, they do not even know that meaning x' is related to meaning x. This is the theoretical point at which Paul's criterion for the emergence of a new usual meaning has been reached and when it makes more sense to speak about meaning y rather than meaning x' (cf. Paul 1920: 75).
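
The two paths of change just described can be restated as a small sketch. The Python below is only an illustration under simplifying assumptions: the class `Word` and the functions `innovate` and `conventionalize` are hypothetical names introduced here, meanings are treated as plain strings, and sound change and the bifurcation of forms are ignored.

```python
from dataclasses import dataclass, field


@dataclass
class Word:
    """A word form with its usual (conventional) and occasional (novel) meanings."""
    form: str
    usual: set = field(default_factory=set)
    occasional: set = field(default_factory=set)


def innovate(word: Word, novel: str) -> None:
    """Word w with usual meaning x develops the occasional meaning x'."""
    word.occasional.add(novel)


def conventionalize(word: Word, novel: str, retain_old: bool) -> None:
    """Turn the occasional meaning x' into a new usual meaning y.

    retain_old=False models anagenetic change (x disappears);
    retain_old=True models cladogenetic change (x is kept beside y).
    """
    if novel not in word.occasional:
        raise ValueError("only an occasional meaning can be conventionalized")
    word.occasional.remove(novel)
    if not retain_old:
        word.usual.clear()
    word.usual.add(novel)


# Anagenetic (simplified): 'scratch, carve' is replaced by 'write'.
write = Word("writan", {"scratch, carve"})
innovate(write, "write")
conventionalize(write, "write", retain_old=False)

# Cladogenetic (simplified): 'cover, conceal' survives beside the new 'hut'.
keu = Word("*(s)keu-", {"cover, conceal"})
innovate(keu, "hut")
conventionalize(keu, "hut", retain_old=True)

print(write.usual)  # only the new meaning remains
print(keu.usual)    # old and new meaning coexist
```

Reducing the difference between the two paths to a single flag is a convenience of the sketch; it says nothing about why a speech community keeps or drops the older meaning.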

This account is highly schematic. Changes in meaning are usually quite complex processes and cannot be adequately treated without considering sound change as well. Also, the examples I cite relate to only one novel meaning of a lexical item, although it is more usual to find bunches of cognates clustered around several more or less related meanings. Morphological processes are not included in this schematic description, either, although the emergence of new meanings, especially in the case of coding new categories, usually goes together with some kind of affixation, word formation or compounding. As typical
examples we can mention ENG belief and GER Glaube 'belief', which are both derived from PIE *leubh- 'to care, desire, love' by attaching the prefixes be- and ge- respectively. Again, the archaic morphological structure of these words is now opaque to modern speakers of English and German. This is also true for ENG window, which is actually an obscured compound from CGER *windaz 'wind' going back to PIE *we- 'to blow' and CGER *augon 'eye,' which itself goes back to PIE *okw- 'to see'.

3.3. Motivation, metaphor and new meaning

The fact that knowledge of meaning is based on the way that speakers conceive of entities and their interrelations means that any novel meaning evolving from a conventional one must be semantically motivated in special ways. I use the term motivated in a special technical sense in this paper, as it is also used by Lakoff (1987: 448, 537), i.e. as the opposite of the term arbitrary (cf. Saussure 1959). Lakoff considers the notion of motivation to be crucial to any cognitive account of language, because language makes use of general cognitive mechanisms and motivation plays a central role in most, if not all of them. As Lakoff puts it, it is easier to learn something that is motivated than something that is arbitrary. It is also easier to remember and use motivated knowledge than arbitrary knowledge (1987: 345). An expression is motivated when a clue can be found as to why just that and not any other expression has been used for expressing a certain meaning, and linguistic forms may be motivated in a variety of ways. For example, in onomatopoeia we see phonetic motivation at work, and in the case of transparent etymology or any type of derived meaning we are dealing with semantic motivation. Most of the time such motivated knowledge draws on the perception of some kind of analogy. The human mind seems to be biologically determined to allow the human being to perceive entities as being analogous. This does not mean, however, that analogy is out there and awaits being detected, but rather that the human mind is capable of ignoring irrelevant properties. In the conceptualizing of new entities, the perception of analogy consists in detecting familiar features in unfamiliar phenomena, since categorizing the entity is the first step in evaluating it for its role in our behavior. This analogy based on familiar features is what both speaker and hearer rely upon in creative naming or occasional usages of usual meanings. Using an attribute with high (possibly optimal) cue-validity for naming an entity will achieve just this effect because it is the most likely to cue the concept as a whole, on the one hand, and to cue just that particular concept and no other one, on the other (cf. Rosch 1978: 30). For example, knowing that ENG hut is derived from PIE *(s)keu- 'to cover', we can assume that hut was construed primarily as a 'cover' and thus the attribute COVER was the one that most readily cued the whole concept along with its other features but no other concepts within that context. It is important to emphasize the relevance of context here, since among others ENG hide and GER Haut 'skin' also go back to PIE *(s)keu- 'to cover', but clearly the contexts of hut and these latter entities exclude each other and so no cueing error occurs in that context. Motivation is not only at work in the creation of easily learned, remembered and understood names, but also has a very basic role in cognition, even in the case of perception, the most basic cognitive activity. This is obvious from Neisser's (1976) work on schemata. An organism can perceive only the phenomena for which it has preexisting schemata. But as the schemata are being learned by experience, they get modified and thus allow new schemata to arise. The new schemata are thus not arbitrarily structured but motivated by earlier ones.
In a similar vein, Langacker (1987: 105-106) explains our capacity to impose structure on our mental experience as deriving from our proclivity to interpret novel experiences with reference to previous ones. This basic principle of cognition is reflected at a higher cognitive level in the fact that in the process of concept formation the starting point for the conceptualization of something new is always its conceptualization in terms of something familiar. It is impossible to
acquire totally new knowledge that is not related to already familiar knowledge. Langacker also describes precisely how any new experience must function as a target in a comparison process and be matched to a familiar standard (Langacker 1987: 101). This is why we assign the role of features to already existing distinct concepts and use them in new concept formations (Smith and Medin 1981: 18). In this way newer categorizations indeed rely on the already existing category system (Rosch 1978: 29,42). In the case of the linguistic encoding of a category at a cultural level, concept formation is of course a historical process that may extend over generations. Seiler (1985: 117) describes the process in which new mental structures emerge from existing ones as differentiating integration . At a higher level, it is the same process as when Neisser's schemata take up (or integrate) new information and become modified (or differentiated) by this. The first step in concept formation is thus the conceptualization of new environmental stimuli (entities) in familiar terms, i.e. integrating them into already existing conceptual structures. However, if they turn out to form a salient group whose discrimination as a separate one is essential in our interaction with the environment, a differentiating process takes place and a new category is established. Thus, for example, the etymology of ENG hut (from PIE *(s)keu- 'cover') suggests that in the beginning the conceptualization was that of integration with other types of covering, but later on it turned out to be such a culturally salient type of covering that it became differentiated as a separate category and is not even conceptualized any longer as a covering in the first place. Turning to other linguistic phenomena, Lakoff and Johnson (1980) have clearly shown that figures of speech are not just decorative devices on the surface of language, but rather they reflect the basically analogical character of human thinking. 
By using figurative language we not only conceptualize something for ourselves in familiar terms, but also suggest to the hearer the same way of conceptualizing this entity. When the speaker uses a usual meaning in a novel way, there are specific underlying motivated connections because this is just the way the human mind works when trying to conceptualize something new.

Lakoff (1990) has proposed the Invariance Hypothesis as a means for detailing how motivation works in interpreting metaphorical expressions. In metaphorical extensions the mapping of the structure of the source domain onto the target domain creates ontological and epistemic correspondences between the two domains because the cognitive topology of the source domain is preserved in the target domain. This is how our knowledge of the source domain motivates our understanding of the target domain, i.e. the meaning of the metaphorical expression (Lakoff 1990: 68, 73). [In the literature on metaphor the question is often raised as to how we understand metaphors and how we know that it is not the literal meaning that is meant. In a synchronic and ontogenetic perspective the question is, of course, justified. The basis for any metaphorical meaning is a literal one and the meanings that the child first acquires are also literal. He may of course acquire conventional metaphorical meanings prior to their literal counterparts, but in the cognitive structures of the child these will function as literal meanings. As the child knows only one meaning of the word in such a case, this meaning cannot appear to him as derived from another one. This does not, of course, prevent him from learning later on that there is another meaning of the word that is considered to be the literal one by the conventional standards of his speech community. This information will cause a cognitive restructuring of the child's semantic knowledge. This primacy of literal meanings of metaphors, however, does not seem to be so self-evident in a diachronic view of language.9 On the one hand, metaphorical extension may be one of the motivational devices for deriving new meanings, as we have seen, for example, in ENG window, originally meaning 'eye of the wind' (cf. above), or HUNG fiók 'drawer', originally meaning 'small son (of a piece of furniture)' (cf. above).
These meanings may grow literal with time, whenever the metaphorical meaning gains cultural validity and its original literal basis becomes obscure. In a diachronic perspective of language neither metaphorical nor literal meanings have priority as they can be viewed as alternating phases of one and the same process, namely historical semantic development. On the other hand, literal meanings serve as a basis for metaphorical extensions in order to create new meanings but they can themselves be viewed as fossilized remnants of earlier derived meanings.] Based on our biological inclination to perceive analogy in the world, metaphor seems to be a very efficient and natural way of creating new meanings with the least possibility of causing misunderstanding. In the acquisition of new knowledge, a metaphor, through its entailments arising from our beliefs and experience (cf. Lakoff and Johnson 1980: 139), helps us understand what is meant. This is so because a new metaphor partly describes an unfamiliar phenomenon by referring to already known phenomena and relying on existing knowledge of the hearer (cf. Harnad 1987: 554). This creative role of metaphor must be the reason why it is one of the most commonly used devices for extending the lexicon (cf. Dirven 1985). On the one hand, this is precisely in accordance with Paul's conception of usual meanings being extended with novel meanings that with time become conventionalized meanings themselves. This result has been explained by Langacker (1987: 157) as the conventionalization of an expression getting used in a particular context, i.e. a usage event that becomes a conventionalized unit due to entrenchment. On the other hand, this linguistic process can also be taken as the manifestation of the fact that existing category systems influence how we define attributes of new phenomena to be categorized (Rosch 1978: 29), which is maybe even more important for our concern with cultural category formation.

4. Names, descriptions and categories

So far we have seen that the linguistic encoding of categories takes place within the ongoing process of semantic change and that the formation of new cultural categories can be analyzed generally as some kind of naming by description. Since naming is considered here as a process at the level of culture and not as an act of an individual, a slight redefinition of the term is necessary. This concerns basically two aspects of naming. In the case of semantic change, naming has to be seen first of all as an unconscious act and second as a social one. It is not characteristic for semantic change that the meaning of a word is changed consciously (though it can happen) and no individual can effect semantic change unless it is accepted by a whole social group, or by some appropriate subculture such as cognitive linguists.

4.1. Naming is descriptive and categorizing

As we have seen above, the basic reasons for semantic change are external to language. The fact that meanings can and do change is a direct consequence of the function of language as a symbolic device for the experientially based cognition of reality. New entities or new aspects of old entities have to be designated not only for the sake of discourse, but also for the purpose of thinking about them, since fixed conceptualizations and stabilized conceptual structures are essential for economical and effective thought. It is on this basis that the coding of conceptual categories, via either anagenetic or cladogenetic change in meaning can be conceived of as a naming process. Functionally, as determined under experimental conditions, the names given by subjects to unfamiliar objects are mostly descriptive and categorizing (cf. Carroll 1985: 21). These two processes actually go hand in hand. Descriptions tend to have a categorizing function and categorizing unfamiliar objects in an explicit manner can best be done by the listing of features, which amounts to a kind of description. This procedure seems to be very similar to what happens in semantic change when the purpose of the change is the linguistic coding of a culturally significant conceptual category. Of course it never happens that a whole list of attributes is used to name some class of entities, but rather the origin of a word is very often the name of one of the attributes of the entities that the word denotes, as we have seen in examples given earlier (cf. also Wierzbicka 1988: 469). At first glance, naming by description may seem to have only a superficial communicative purpose. But naming, and human linguistic communication in general, have deep cognitive bases. Human linguistic communication occurs with the purpose of being understood, and understanding is a cognitive process (cf. Johnson 1987). 
Understanding takes place when the same cognitive structures that are activated in the speaker also get activated in the hearer. A precondition for this is, of course, that the hearer share those cognitive structures
with the speaker, i.e. the event types in question have to be entrenched enough in the hearer's mind to have acquired unit status, because only then can they be executed as cognitive routines (cf. Langacker 1987: 100). But depending on the degree of novelty of the information, it can happen that speaker and hearer do not share the same cognitive structures. In this case the hearer has to build these up in his or her mind in order to understand what is being communicated, i.e. new cognitive routines have to be established. This activity in the hearer's mind is guided by the speaker's use of descriptive terms. Thus, the speaker not only can activate previously shared cognitive structures in the hearer's mind, but he or she can also convey new cognitive structures into the hearer's mind with the help of language. As the descriptive character of naming suggests, the most efficient way for conveying new cognitive structures seems to proceed in two steps. First the speaker has to activate in the hearer's mind the cognitive structures that they both share. This is done by the usage of familiar terms, as, for example, the early automobiles were called 'horseless carriages'. In the second step the hearer must recognize that it is not the familiar cognitive structure that is of interest here. It is not a team of horses at all that is being talked about. The familiar cognitive structure only serves as some kind of analog to what the speaker wants to express. The knowledge of how horses were hitched together to pull a wagon that transported people provides through some kind of mapping a starting point for setting up new knowledge about automobiles. This is actually what motivation is about.10

4.2. Symbols and new categories

The term naming refers here to the devising of labels for talking about previously unknown stimuli, i.e. that which has been called creative naming (cf. Carroll 1985: 3).
In Carroll's study, subjects were asked to make up names for various things, either unfamiliar or only lacking a conventional name. It was observed that the names generated tended to describe and categorize, because they referred in some degree to properties of the name's referent, e.g. 'molasses peanut cookie' for a cookie containing molasses and peanuts, 'tire changer' for a person who is changing a tire, or 'rayed triangle' for a triangular shape pierced by a line segment (Carroll 1985: 13, 21). When the subjects were asked to rate the names they produced according to quality, the names that were easy to learn and remember (i.e. descriptive, natural etc.) and easy to use (i.e. distinctive, brief, etc.) were rated as "good" names (Carroll 1985: 5). There is a very good reason why most of the names created are descriptive and natural. The explanation for this lies in the cognitive import of descriptions (cf. Wierzbicka 1988: 466-70). Harnad has discussed this in connection with the symbol grounding problem (Harnad 1987, 1990). The symbol grounding problem contradicts the objectivist-rationalist view of cognition which asserts that cognition is nothing other than symbol manipulation, claiming that there is only one kind of correct correspondence between mental representations and the external world, and that the symbols of language are related to reality in a strictly objective way. Because of this, symbols are said to be interpretable purely on the basis of other symbols or symbol strings and on the basis of the structure of these symbol strings (cf. Lakoff 1987: 348, 370). However, this methodology has a basic flaw to it. In Harnad's words, it would amount to a merry-go-round, passing endlessly from one meaningless symbol or symbol-string (the definiens) to another (the definiendum), never coming to a halt on what anything meant (Harnad 1990: 339). On the other hand, it is perfectly clear that not all the symbols that we can interpret are directly grounded in experience. For instance, some symbols stand for abstract entities that cannot be experienced physically, e.g. 'knowledge', 'dependence' etc. Consider also the symbols that designate entities that are products of our imagination, e.g. 'monster', 'Martian' etc. Not even all symbols standing for concrete entities are necessarily grounded in experience.
It can be the case that certain speakers have never experienced a kind of entity and have knowledge about it only via the symbol. For most of us who have never looked into an electron microscope a 'living cell' or a 'microorganism' are such entities. There are also concrete entities that probably no one
ever will directly experience as such, and whose existence will always be mediated to us via its symbol, e.g. 'universe'. Harnad refers to the symbols that are directly grounded in acquaintance with perceived reality as elementary symbols. These are based on iconic and categorical representations, which are sensory representations. Iconic representations are "analogs of the proximal sensory projections of distal objects and events" while categorical representations are "learned and innate feature detectors that pick out the invariant features of object and event categories from their sensory projections" (Harnad 1990: 335). Elementary symbols are thus symbols based on peripherally connected cognitive events, as opposed to categorial ones, which denote autonomous cognitive events (cf. Langacker 1987: 112). The fact, however, that higher order symbols can be interpreted without direct acquaintance with reality does not mean that they are not grounded in experience at all. These higher order symbols together with their underlying symbolic representations are derived from the sensory representations and thus indirectly grounded in the following way: Once one has the grounded set of elementary symbols provided by a taxonomy of names (and the iconic and categorical representations that give content to the names and allow them to pick out the objects they identify), the rest of the symbol strings of a natural language can be generated by symbol composition alone, and they will all inherit the intrinsic grounding of the elementary set (Harnad 1990: 343-344). Harnad illustrates this with the following example. The symbolic representation of zebra (for someone not directly acquainted with zebras) can be formed by describing a zebra by means of a more complex conceptualization that involves combining the semantic unit 'striped' with the meaning 'horse'. 
The generating of higher order symbols from elementary ones, to which the speaker resorts in absence of an appropriate symbolic unit (cf. Langacker 1987: 460), is done by the symbolic description system, which "operates on existing labels, and constructs categories by manipulating these labels"
(Harnad 1987: 554). Unfamiliar entities are thus described with the help of familiar ones or with the help of familiar features. An example from modern times is the use of the expression 'black hole' to designate a totally novel astronomical concept of an entity in outer space that cannot be seen through optical telescopes, but occupies an extended three-dimensional area and exerts observable effects on its surroundings.11 One cognitive import of descriptions is that the composed symbols retain the grounding of the composing symbols that constitute their elements. This makes it possible for us to conceptualize two non-experiential kinds of domains of reality: physical domains that we have not yet experienced and abstract domains that are not open to sensory experience at all. Johnson (1987) has given a detailed explanation of how embodiment (which I take to be basically similar to Harnad's grounding) carries over to these domains in the form of metaphorical projections of image schemata. Metaphorical and metonymic projections are descriptive in just the above sense because they characterize the domains in question in terms of perceptual clues of other domains. The invoking of abstract domains is the very way that metaphors can, among other things, enable learning without direct experience, either because we are not in the position to acquire that experience directly or the knowledge to be acquired pertains to abstract domains. As Harnad (1987: 556) puts it, "descriptions spare us the need for laborious learning by direct acquaintance". Someone not acquainted with zebras (though a physical domain) can acquire a certain amount of knowledge about them by combining the sensory representations (iconic and categorical) of stripes and horses on the basis of a descriptive expression (Harnad 1990: 343). In the same way (and this time only in this way) we can gain knowledge about an abstract domain such as
'argument' on the basis of our physical experience with war, or even on that of another sensory experience transmitted via the domain of war (cf. Johnson 1987: 89). Many historical examples could also be mentioned here. An interesting example is GER Kopf 'head,' borrowed from Latin originally with the meaning of 'cup'. It is obvious that the conceptualization of 'head' as a container served here as the basis for the projection. Similarly, the métonymie projection in the case of GER Hahn 'rooster' de-

200

Gábor Györi

rived from PIE *kan- 'sing' is also evident. Most interesting of all are perhaps examples where the target domain invoked in such a metaphorical development is an abstract one. Thus, e.g. GER wissen 'to know' goes back to PIE *weid 'to see' and ENG life comes from PIE *leip- 'to stick, to adhere'. Another cognitive import of descriptions is that they can create new categories through symbol composition alone by drawing on a repertoire of already labelled categories, thus assigning membership on the basis of stipulated rules rather than on the basis of perceptual invariants derived from direct experience (Harnad 1987: 554). With the knowledge conveyed by the composed symbol 'striped horse', a zebra, when seen, can be readily discriminated as a member of a new category because the act of describing something is tantamount to assigning category membership to it. When we describe zebras as "striped horses" we put them in the category of striped horses. On the cognitive level the content of the composite symbol will not be identical with the sum of the contents of the latter (Langacker 1987: 450), but the component structures will nonetheless involve acts of categorization ( Langacker 1987: 466 ). We can now explain why good names are descriptive. From a cognitive point of view, the usage of descriptive terms is the most economical way to make up names for groups of entities. This is true in several respects. Naming by description is a functional procedure that: (i) facilitates the cohesion of a mental category by emphasizing relevant attributes, (ii) supports the connection between the name and the mental representation by direct reference to features, (in) helps us create new mental categories by the explicit grouping of features. Descriptions of unfamiliar entities (or new descriptions of familiar entities) are thus manifestations of new conceptualizations, which can result in the formation of new concepts.

5. Conclusion

At every historical stage in the development of a language, its lexicon defines a cultural system of categories which is stored in the minds of the individuals of the speech community. As we have seen, the formation of these categories can be traced in describing semantic change via the study of the etymology of words. We have found that, most of the time, the meanings of words denoting categories are products of historical categorization processes, i.e. they are fossilized conceptualizations of previous generations. They have outlived the speakers of the times of their emergence, and later on they impose a given categorization of the world on the coming generations. But just as these linguistically coded categories are results of previous conceptualizations on the level of a whole culture, they also provide an ever-ready source for the operation of similar cognitive processes in the future. Etymologies analyzed as the result of processes of descriptive naming exhibit the characteristic of drawing upon the existent category system. More specifically, they also show the influence of functional knowledge, which Rosch observed to be responsible for categories with correlational structures not found in the real world (cf. above and Rosch 1978: 42). Of course, some categorizations indeed depend on bundles of perceived world attributes. The categorization in the word gold, derived from PIE *ghel- 'to shine', as 'the material that shines' seems to be based on such perception. In other cases, however, correlational structure is indeed provided by human knowledge. In the case of thumb, derived from PIE *tum- 'swell', the attribute SWOLLEN is relative to the other fingers. The named attribute in hose, derived from PIE *(s)keu- 'cover', is on the other hand a functional one and thus requires prior knowledge of human activities, which means that it is indeed not inherent in the real world.
As Langacker points out repeatedly, there is a multiplicity of factors ranging from conventional status to individual idiosyncrasy that relate to why a particular image or mental model is selected as the basis for a particular instance of categorization. Etymologies give us hints about both past and present conceptualizations of phenomena and provide, through a cognitive analysis, explanations of how and why particular conceptual categories emerged in a culture. We also have to be cautious in making predictions about what features may be contained in a conceptual category later on, because they can change with novel information and knowledge about the phenomenon or with a novel function of the phenomenon. Experience with further exemplars of a category can also change the relevance of each attribute (Zimmerman 1979; Smith and Medin 1981: 53). However, attention must be drawn to the limitations of inferences that can be made from etymologies. Above all, it has to be noted that the category behind the current usage of a word is in most cases not conceptualized any more in the way suggested by the etymology, i.e. the descriptive naming that gave rise to the word. An example is again breakfast. The category BREAKFAST does not contain the feature FAST BREAKING MEAL any more, i.e. we do not conceptualize the period of night as a period of fast. As opposed to this type of development, two different naming acts, or acts of conceptualization, can also lead to the emergence of similar categories. For example, skin and hide are synonyms today, but their etymologies suggest totally different conceptualizations. Skin is derived from PIE *sek- 'cut' via the extended root *skend- 'to peel off' (though via Scandinavian transmission), while hide is derived from PIE *(s)keu- 'cover, conceal'. Thus, skin was conceptualized as something that can be cut or peeled off the body of an animal, while hide was conceptualized as something that covers the body.

Notes

1. I am very grateful to Eugene H. Casad and Doris Bartholomew for their comments and help, from which this paper has greatly benefited. I am also indebted to three anonymous reviewers for their comments and suggestions.

2. I do not claim by this that all semantic change can be interpreted as category coding, though the idea seems to be tempting. Testing this would be a worthwhile project for future research.

3. For the etymologies mentioned in the article the following dictionaries were consulted: Benkö et al. 1984, Buck 1949, Drosdowski et al. 1963, Watkins 1985.

4. I am indebted to Doris Bartholomew (personal communication) for this example.

5. A few terminological clarifications are in order here. Reality is understood throughout the paper as the totality of phenomena, including not only entities, but properties of entities and relations between entities, etc. that people are capable of recognizing. The term environment is the segment of reality that a social group or an individual experiences and interacts with. A change in the environment is not only an alteration of a part of reality or the emergence of a new entity, i.e. an objective change, but it can also be a subjective change in which a community adopts a new perspective towards some phenomenon of the environment.

6. Although language is a very flexible device that allows for the creative satisfaction of the changing communicative needs of a community, there are limits to this creativity. The limits are set by the hearer's ability to understand a new meaning. This ability is enhanced if the new meaning is easily derived from an old one with the guidance of the context that is supposed to define the new meaning, just as we see in the case of the origin of GER Wand.

7. I need to point out that, although many etymologies are clearly manifestations of conceptualizations at the cultural level, many of them cannot be unambiguously related to any of these processes. For example, ENG rain derives from PIE *reg- 'wet, moist', from which we can infer that 'rain' was conceptualized as something wet, but not that this points to a time when rain became known.

8. A description is not to be understood here as a detailed account of an entity. Theoretically, the mentioning of even one property is a description, albeit an incomplete one (cf. Wierzbicka 1988: 470).

9. From a historical point of view all literal meanings are motivated in one way or another, i.e. they are derived meanings themselves whose semantic motivation has become obscure.

10. Lakoff's (1990) Invariance Hypothesis is a detailed description of how this motivation works for metaphorical mappings.

11. A description is not to be understood here as a detailed account of an entity. Theoretically, the mentioning of even one property is a description, albeit an incomplete one.


References

Anderson, John A.
1988 "Concept formation in neural networks: Implications for evolution of cognitive functions", Human Evolution 3: 81-97.

Bammesberger, Alfred
1984 English etymology. Heidelberg: Carl Winter Universitäts-Verlag.

Benkö, Loránd et al.
1984 A Magyar nyelv történeti-etimológiai szótára. Volumes 1-3. (Second edition) Budapest: Akadémiai Kiadó. (A historical-etymological dictionary of the Hungarian language).

Buck, Carl Darling
1949 A dictionary of selected synonyms in the principal Indo-European languages. Chicago and London: The University of Chicago Press.

Carroll, John
1985 What's in a name? An essay in the psychology of reference. New York: Freeman.

Casad, Eugene H.
1988 "Conventionalization of Cora locationals", in: Brygida Rudzka-Ostyn (ed.), Topics in cognitive linguistics. Amsterdam and Philadelphia: John Benjamins, 345-378.

Csányi, Vilmos
1988 "Contribution of the genetical and neural memory to animal intelligence", in: Harry J. Jerison & I. Jerison (eds.), Intelligence and evolutionary biology. Berlin: Springer, 299-318.
1992 "The brain's models and communication", in: Thomas A. Sebeok & Jean Umiker-Sebeok (eds.), The semantic web. Berlin: Mouton, 27-43.

Dirven, René
1985 "Metaphor as a basic means for extending the lexicon", in: Wolf Paprotté & René Dirven (eds.), The ubiquity of metaphor. Amsterdam and Philadelphia: John Benjamins, 85-119.

Drosdowski, Günther et al.
1963 Das Herkunftswörterbuch. Die Etymologie der deutschen Sprache. Mannheim: Bibliographisches Institut.

Geeraerts, Dirk
1985 "Cognitive restrictions on the structure of semantic change", in: Jacek Fisiak (ed.), Historical semantics and historical word-formation. The Hague: Mouton, 127-153.
1988 "Cognitive grammar and the history of lexical semantics", in: Brygida Rudzka-Ostyn (ed.), Topics in cognitive linguistics. Amsterdam and Philadelphia: John Benjamins, 647-677.

Gentner, Dedre & Donald R. Gentner
1983 "Flowing waters or teeming crowds: Mental models of electricity", in: Dedre Gentner & Albert L. Stevens (eds.), Mental models. Hillsdale, N.J.: Lawrence Erlbaum, 325-339.

Haiman, John
1980 "Dictionaries and encyclopedias", Lingua 50: 329-357.

Harnad, Stevan
1987 "Category induction and representation", in: Stevan Harnad (ed.), Categorical perception: The groundwork of cognition. New York: Cambridge University Press, 535-565.
1990 "The symbol grounding problem", Physica D 42: 335-346.

Herrnstein, Robert J.
1984 "Objects, categories, and discriminative stimuli", in: Herbert C. Roitblat, Thomas G. Bever & Herbert S. Terrace (eds.), Animal cognition. Hillsdale, N.J.: Lawrence Erlbaum, 129-144.

Hock, Hans Heinrich
1986 Principles of historical linguistics. Berlin: Mouton de Gruyter.

Holyoak, Keith J.
1984 "Analogical thinking and human intelligence", in: Robert J. Sternberg (ed.), Advances in the psychology of human intelligence. Vol. 2. Hillsdale, N.J.: Lawrence Erlbaum, 199-230.

Hopper, Paul J.
1990 "Where do words come from?", in: William Croft, Keith Denning & Suzanne Kemmer (eds.), Studies in typology and diachrony. Amsterdam and Philadelphia: John Benjamins, 151-160.

Jeffers, Robert J. & Ilse Lehiste
1979 Principles and methods for historical linguistics. Cambridge, Mass. and London, England: The MIT Press.

Johnson, Mark
1987 The body in the mind: The bodily basis of meaning, reason and imagination. Chicago: University of Chicago Press.

Kövecses, Zoltán
1986 Metaphors of anger, pride and love. A lexical approach to the structure of concepts. Amsterdam and Philadelphia: John Benjamins.

Lakoff, George
1987 Women, fire and dangerous things. What categories reveal about the mind. Chicago: The University of Chicago Press.
1990 "The Invariance Hypothesis: Is abstract reason based on image-schemas?", Cognitive Linguistics 1: 39-74.

Lakoff, George & Mark Johnson
1980 Metaphors we live by. Chicago: The University of Chicago Press.

Langacker, Ronald W.
1987 Foundations of cognitive grammar. Vol. 1: Theoretical prerequisites. Stanford, CA.: Stanford University Press.
1990 Concept, image, and symbol. The cognitive basis of grammar. Berlin and New York: Mouton de Gruyter.

Neisser, Ulric
1976 Cognition and reality. Principles and implications of cognitive psychology. San Francisco: Freeman.

Palmer, Leonard R.
1978 Descriptive and comparative linguistics. London: Faber & Faber. (First ed. 1972)

Paul, Hermann
1920 Prinzipien der Sprachgeschichte. Halle: Max Niemeyer.

Rosch, Eleanor
1978 "Principles of categorization", in: Eleanor Rosch & Barbara B. Lloyd (eds.), Cognition and categorization. Hillsdale, N.J.: Lawrence Erlbaum, 27-48.

Saussure, Ferdinand de
1959 Course in general linguistics. New York: Philosophical Library. (Translated by W. Baskin; orig. publ. in French 1916)

Seiler, Thomas B.
1985 "Sind Begriffe Aggregate von Komponenten oder idiosynkratische Minitheorien?", in: Thomas B. Seiler & Wolfgang Wannenmacher (Hrsg.), Begriffs- und Wortbedeutungsentwicklung. Berlin: Springer, 105-131.

Smith, Edward E. & Douglas L. Medin
1981 Categories and concepts. Cambridge, MA.: Harvard University Press.

Sweetser, Eve
1990 From etymology to pragmatics: Metaphorical and cultural aspects of semantic structure. Cambridge: Cambridge University Press.

Taylor, John R.
1989 Linguistic categorization. Prototypes in linguistic theory. Oxford: Clarendon Press.

Watkins, Calvert (ed.)
1985 The American heritage dictionary of Indo-European roots. Boston: Houghton Mifflin Company.

Wierzbicka, Anna
1988 The semantics of grammar. Amsterdam and Philadelphia: John Benjamins.
1992 Semantics, culture and cognition: Universal human concepts in culture-specific configurations. New York and Oxford: Oxford University Press.

Zimmerman, Barry J.
1979 "Concepts and classification", in: Grover J. Whitehurst & Barry J. Zimmerman (eds.), The functions of language and cognition. New York: Academic Press, 57-81.

Unpacking markedness*

Laura A. Janda

0. Introduction

Markedness pervades vast sectors of the linguistic literature, unrestricted by the theory or tradition of the authors who invoke it. In the course of its history (traced back 150 years in Andersen 1989), its meaning has become increasingly diffuse, and its application virtually limitless. For example, markedness has been used to describe the relationships that hold among distinctive features, among phonemes, among allophones, among allomorphs, among semantic features, among the terms of case, number, person, tense and other morphological categories, among inflectional and derivational paradigms, among parts of speech, among syntactic constructions, among case systems, among vowel systems, and even among grammars. And markedness is said to manifest itself in a no less impressive array of phenomena, including neutralization, assimilation, reversal, syncretism, direction of language change, order and success of language acquisition, productivity, and universal ordering of elements (i.e., a language that has a marked element must have the corresponding unmarked element, but the converse is not true). Clearly, markedness plays an essential role in language, at all levels of language, both synchronically and diachronically. Its importance is inescapable.

But what is markedness? One thing that it seems not to be is a theory, and most scholars who have written on markedness from whatever theoretical standpoint agree on this. In introducing the concept Battistella (1990: 5) states that "markedness has so far resisted a satisfying treatment, and no clearly defined theory of markedness has emerged." Lapointe (1983: 228-9) comes closer to a characterization of the problem, stating that "markedness principles are ... analogous to physical laws, like the ideal gas laws of nineteenth century physics they themselves are not assumed to be fundamental statements of the theory, but are seen instead as signposts along the road in search of the more basic theoretical principles which they follow from." Indeed, if markedness is anything at all, it is a collection of empirical observations, and an increasingly varied one at that. Scholars who use the term markedness theory (and they are in the minority) have promoted empirically observed correlations to the status of theoretical constructs and laid aside all of the existential questions, such as Why should markedness exist? How are phenomena of markedness consistent with the structure of language as a whole? What is responsible for the distinction between marked and unmarked terms? Even in the markedness theory camp scholars admit that markedness as a theory is only weakly realized at best, cf. Tomic (1989: 9), who states that markedness theory "is actually nascent rather than existent."

To return to Lapointe's metaphor of markedness as a series of signposts along the road in search of basic theoretical principles, I would like to suggest that we have been taking steps down the very same road in laying out the theoretical framework of cognitive linguistics. The basic theoretical principles of cognitive linguistics are indeed those that will provide a unified theory of linguistics integrating markedness phenomena as a logical and expected result of the way in which linguistic knowledge is constructed. There have been indications of the possible ramifications of cognitive linguistics for markedness coming from various directions (cf. Battistella's (1990: 26) inclusion of prototypicality as a criterion in determining markedness relations, van Langendonck's (1989: 180) remark that "markedness theory and prototype theory are in accordance," and Lakoff's (1987: 59-61) and Mayerthaler's (1980: 26) mention of this correlation), but the issue is sufficiently complex and diverse in detail to merit closer examination.

1. Existential questions

Comrie (1983: 85) has suggested that there are two possible approaches to markedness. One is to give up and say that it is a property genetically inherited by human beings. The other is to "try to account for markedness in terms of other, independently verifiable properties of people, the world, or people's conception of the world." Comrie favors the second approach, and, because the goal of cognitive linguistics is to account for all linguistic phenomena in terms of other, independently verifiable properties of people and people's conception of the world, certainly cognitive linguistics is an appropriate framework to serve the needs of such an approach. The theoretical constructs of cognitive linguistics that will figure prominently in explicating markedness are the radial category, the idealized cognitive model (ICM), the basic level, and metaphorical mapping.1

1.1. Why and how does markedness exist?

Markedness is usually defined as an asymmetric relationship between two or more elements. Few now hold to the Jakobsonian notion that all relations are privative and binary, and most scholars will admit that markedness relations can be scalar. Thus markedness presupposes some sort of contrast among two or more elements that are somehow related. Cognitive linguistics postulates that most, if not all, linguistic information is organized in cognitive categories with a radial structure of increasingly peripheral members related to a central prototype. If two elements are related to one another, then they are either members of a single category or two categories that form a single superordinate category. The internal structure of the category (or superordinate category) provides inclusive asymmetrical relationships that contrast the elements and also assigns markedness values as a function of distance from the prototype. The relations are inclusive because they incorporate the elements into a single category; and they are all asymmetrical with respect to the center vs. periphery of the structure of the category. Thus markedness is a necessary result of the structure of cognitive categories, a structure that has been independently verified by Rosch's (1973a and b) work in psychology, and by many of the contributions to cognitive linguistics. In Janda (1993a) I have demonstrated in detail that markedness correlates with distance from the prototype of a category, with the least marked elements closest to the center, and the most marked elements in the most peripheral positions.
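The correlation just stated, that markedness grows with distance from the prototype, can be made concrete with a small illustrative sketch. It is only a toy model under loud assumptions: the member names, the links between them, and the function name markedness_by_distance are all hypothetical, and counting the links that separate a member from the prototype is merely one possible stand-in for "distance"; it is not the procedure used in Janda (1993a).

```python
from collections import deque


def markedness_by_distance(links, prototype):
    """Map each reachable member to the number of links between it and the prototype.

    Assumption for this sketch: a larger link count stands for a more
    peripheral, and therefore more marked, member of the category.
    """
    distance = {prototype: 0}
    queue = deque([prototype])
    while queue:
        current = queue.popleft()
        for neighbour in links.get(current, ()):
            if neighbour not in distance:
                # First visit in a breadth-first search gives the shortest path.
                distance[neighbour] = distance[current] + 1
                queue.append(neighbour)
    return distance


# Hypothetical radial category: every member is linked to the members
# it is directly related to; the names carry no linguistic content.
links = {
    "prototype": ["extension_a", "extension_b"],
    "extension_a": ["prototype", "extension_c"],
    "extension_b": ["prototype"],
    "extension_c": ["extension_a", "extension_d"],
    "extension_d": ["extension_c"],
}

scores = markedness_by_distance(links, "prototype")
# Least marked first; ties are broken alphabetically.
ranking = sorted(scores, key=lambda member: (scores[member], member))
```

In this toy category the prototype receives the value 0 and heads the ranking, while extension_d, reachable only through a chain of three links, comes last. Because the values are integers rather than a binary plus/minus, the sketch also reflects the scalar view of markedness mentioned above.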

This account should satisfy Comrie, and is also consistent with Andersen's (1989: 37-8) observation that "all paradigmatic relations in language ... are established as inclusive oppositions. As a consequence, they all incorporate asymmetrical value relations even though, from a functional ... point of view, many of these paradigmatic relations are non-inclusive and hence symmetrical." This account further realizes Andersen's (1989: 39) goal of recognizing "in the ubiquitous markedness values the effect of a cognitive strategy which takes precedence, ontogenetically, over the functional (and logical) analysis of the experiential dimensions encoded in language and culture." In other words, at some level the marked and unmarked elements, even if they are logical opposites, are members of a category, be it basic-level or superordinate. In the discussion that follows it is important to bear two things in mind: (a) that both basic-level and superordinate categories are responsible for constructing markedness relations, and (b) that most (if not all) linguistic elements hold membership in more than one category, thus bringing markedness values into dynamic contrast.

1.2. Where do we get the "expectations" that distinguish marked from unmarked?

In trying to define the difference between marked and unmarked elements in a relationship, scholars use two measures: either distribution or simplicity (be it phonological or semantic). I agree with Andersen (1989: 28-30), Comrie (1989: 85) and Andrews (1990: 136-165) that the distributional phenomena of markedness (neutralization, assimilation, etc., to be discussed below) are merely symptoms, not defining properties. An essential definition of what makes an element marked is far more elusive. Comrie (1989), through a series of examples, demonstrates that unmarked elements correlate with expected meanings and situations, whereas the opposite is true for marked elements. Andersen (1989: 39) compares the unmarked element to the thematic ground against which the rhematic figure stands out. In cognitive linguistics we already have a construct that tells us what to expect when dealing with a category: the idealized cognitive model. The ICM gives the category its shape, determining what is the most prototypical, or expected, and therefore unmarked, element that should occupy the central position. Geeraerts (1988) provides a clear example of how ICMs can determine the prototype and the shape of the category based on that prototype. Dutch has two factitive verbs, vernielen and vernietigen, which both denote "destroy" and appear in the same range of uses and collocations. The semantic categories of these two verbs, however, are motivated by two different ICMs and vary in both the identity of their prototype and in details of their infrastructure. Vernielen is motivated by the ICM of "throwing down" and its prototypical uses involve physical destruction and damage, whereas vernietigen, motivated by the ICM of "set to naught", expresses the more abstract concepts of annulment and cancellation in its prototypical uses.

1.3. Why does markedness recur at various levels of language?

Markedness was first observed in phonology, but has since been recognized in phenomena of morphology, syntax, and semantics. Schupbach (1984: 64) commented that "there is no a priori reason why relationships at a lower, or more basic level in a system should recapitulate themselves at higher levels of complexity." Cognitive linguistics, however, gives an a priori reason for the repetition of markedness throughout language. All of our linguistic knowledge is organized and stored in cognitive categories,2 and their structure results in markedness relations. This fact, in conjunction with restrictions on variation in certain domains of human experience, also explains markedness universals, both general and specific.

1.4. What kinds of markedness are there?

It has been observed by many that as markedness has come to be recognized at various levels of language, subsequent problems of defining markedness have arisen, due not "to the markedness relationship per se, but rather to a conflation of oppositions and indiscriminate application in different domains" (Tomic 1989: 2; cf. also Battistella 1990: 6).
What this points to is a need for a typology of markedness that recognizes the fact that markedness is not an entirely uniform phenomenon. I will outline a suggested typology constructed in the framework of cognitive linguistics. It is essential to distinguish the level of the category that produces the markedness relationship, for this affects the nature of the markedness relationship. I distinguish three main types of markedness, on the basis of the level of category membership, and there are subtypes (in which, for example, categories have zero structure) as well.

Basic level - The marked and unmarked elements are members of a basic-level category. Examples here would include: values of a distinctive feature, allophones of a phoneme, allomorphs of a morpheme, submeanings of a morpheme, lexical items that are members of a basic-level category. This is the simplest type of markedness.

Superordinate level - The marked and unmarked elements are categories in their own rights, and can be equipollent at the basic level, but are related via a superordinate category. Examples here include the terms of a vowel system, a case system, a tense/aspect system, a number system, a person system, and lexical binomials like black/white, up/down, good/bad. Here the effects of markedness are less immediate and require more careful analysis.

Conflated - The marked and unmarked elements are members of more than one category, often at more than one level. An example here would be the members of an inflectional paradigm, which participate in the categories of morphemes, and in superordinate linguistic categories such as number, gender, person, etc. Other examples include the relative markedness of syntagms, in which the markedness of each element interacts with others. Markedness in conflated categories is highly complex, and there are some languages (those that are agglutinating and isolating) that strive to avoid paradigmatic conflation at the morphological level.

The following data from Turkish (as an agglutinating language that avoids conflation) vs. Russian (as a flexional language with conflated categories) serve to illustrate this point (Turkish data here adapted from Lyons 1968: 188):

Turkish                    Russian
Nsg    ev                  Nsg    stol-Ø
Absg   ev-den              Gsg    stol-a
Npl    ev-ler              Npl    stol-y
Abpl   ev-ler-den          Gpl    stol-ov

where:                     where:
ev  = 'house'              stol = 'table'
den = Ablative             Ø  = Nsg masc
ler = pl                   a  = Gsg masc
                           y  = Npl masc
                           ov = Gpl masc

Here we see how these two types of languages handle concurrent expression of case and number. For the agglutinating language, each morpheme has one and only one grammatical value and that value is independent of context. For the language with conflated categories, a single flexional morpheme carries more than just one piece of grammatical information, and, furthermore, the interpretation of that information is frequently dependent upon context (i.e., Ø can indicate, in addition to Nsg masc, Gpl fem or Gpl neut).

In addition to basic-level, superordinate-level and conflated categories, which provide language-specific markedness relations, there appear to be some universal optimality networks structured by human experience that represent universal relations between possible elements at the superordinate level, for example, a network with optimal/'ideal' vowel systems in the center, and deviations working toward the periphery. Also consider a network with clusters of word-order patterns, with optimal/prototypical combinations of head/dependent marking for each of SOV, SVO and VSO, and a range of less preferred combinations. These, however, are only partially encoded in the linguistic conventions of any given language.

1.5. Why do markedness values vary from language to language?

Like the question about the recurrence of markedness throughout language, this fact is hard to motivate outside cognitive linguistics. The answer is that only the type of structure observed in cognitive categories is constant, but the determination of what the members of a category will be and their relative positions in the category are variable.3

Cross-linguistic similarities are attributable to similar perception in the human experience, e.g., singular is perceived as simpler (and therefore unmarked) than plural for most count nouns. Assignment of markedness values in a given language is never arbitrary, but neither is it predictable. One goal of cognitive linguistics is to explore the factors that motivate markedness relations.

1.6. If markedness tends to be lost, why does it not reduce to zero?

As long as linguistic knowledge resides in cognitive categories, we will always have peripheral, marked elements. If our cognitive categories were to collapse, there would be only symbols with no fabric of meaning to interpret them. This answer is obvious, but the answer to a weaker version of this question is more subtle: Why is there no drastic reduction in the amount of markedness in language, and what motivates the creation of new marked elements? Cognitive categories are recognized to be plastic, allowing for both the growth and the pruning of members at the periphery. We know also, however, that categories can conflict and interact (recall for example Lakoff's (1987: 132) account of the semantics of the English words thrifty and stingy, which contrast because they derive from two different ICMs concerning the management of personal finances). The evolution of new marked elements is likely motivated by the interaction of various categories. Stein (1989: 80) has documented a case in which "the cost for abolishing marked structures on one level is the creation of marked structures on another level": the rise of do-support in English. He has shown how, in order to reduce the use of marked consonant clusters in inflected verbs, a marked syntactic structure (do-support in wh-questions) was developed. Thus avoidance of markedness in the phonology of English has led to a rise in the markedness of its syntax.
In the history of the Slavic languages we see that a reduction in the complexity of the case system of Macedonian and Bulgarian correlates with an increase in the complexity of the verbal system, whereas elsewhere in Slavic the trend has been to increase case distinctions while decreasing distinctions among verbal categories. Language contact likewise plays a role in creating new marked

Unpacking markedness


elements. The historical change described in 3.1.3. below provides an example of this. Here s > χ in a highly circumscribed set of environments. Borrowings of foreign words with χ in positions not predictable by environment were very significant in helping to establish χ as a new, marked phoneme.

2. Symptomatic patterns of markedness

There are several phenomena that have been observed to depend in some way on markedness which involve an alignment of markedness values with the distribution pattern of marked and unmarked elements. The cognitive linguistic framework provides key insights in explaining the mechanism of such alignments.

2.1. Brøndal's Principle of Compensation

Brøndal's Principle of Compensation states that there is more differentiation among unmarked members than among marked members of a relationship. In a basic-level category, the reason for this is obvious. The central prototype occupies a privileged position in terms of the number of relations that it bears to other members of the category, and it is ultimately related to all other members of the category. The most peripheral and therefore marked members of the category bear fewer relations to neighboring members, and in the limiting case, a peripheral member may be related to only one other adjacent member of a category. Peripheral members bear a high cost of contextualization, restricting the amount of possible expansion at the periphery. Brøndal's principle, however, is most frequently invoked with respect to complex, superordinate-level categories, where the overall effect is similar, but more weakly felt, since the terms are entire related categories, rather than members of one category. The structure of the cognitive category both predicts the tendency named in Brøndal's principle, and allows for the exceptions occasionally observed. Thus Brøndal's principle can be translated into cognitive terms as a tendency for greater differentiation and variation in central portions of a category and a limitation of variation and multiple relationships at the periphery.

2.2. Universal ordering, acquisition, and productivity patterns

Universal ordering, acquisition, and productivity patterns all suggest that the presence of a marked term of an opposition is dependent upon the prior and more vigorous presence of the unmarked term. The position of the prototype in the radial category explains the priority of the unmarked element. Without it the category cannot exist and the marked elements have no category membership relations to anchor their meaning.

2.3. Allophony, allomorphy, and neutralization

Allophony, allomorphy, and neutralization are all symptoms of the cost of contextualization associated with the marked periphery of a category. Allophones and allomorphs exist only in specific contexts and index those contexts. Neutralization takes place in what is conceptually a zero context. Here the unmarked element is selected, because only it bears no cost and indexes no context.

2.4. Syncretism and other diachronic loss of markedness

The same forces are responsible for syncretism and other diachronic loss of markedness. Here we see that peripheral items have been pruned from a category over time. Certainly we would expect that the peripheral marked elements would be most susceptible to loss, both because they have the least essential role to play in the structure of the category, and because of the cost of contextualization with which they are associated. We would not expect the loss of terms at or near the center of a category. This explains why it is usually the marked terms that are reduced or revalued. Historical linguistics provides no shortage of examples to illustrate this phenomenon in which irregular paradigms are gradually eliminated in favor of regular paradigms; the loss of strong verbs in English is just one such example.


2.5. Diachronic markedness reversal

Diachronic markedness reversal is said to have occurred when the marked and unmarked elements of an opposition have had their markedness values switched in the course of time. An example offered by Lehmann (1989: 183-4) is taken from the history of German. Here the genitive case has been succeeded by the von-phrase as the unmarked element. Lehmann presents the following data and analysis:

(1) der  Chef      meines  Mannes
    the  boss-NOM  my-GEN  husband-GEN
    'my husband's boss'

(2) der  Chef      von   meinem  Mann
    the  boss-NOM  from  my-DAT  husband-DAT
    'the boss of my husband'

Originally, only the genitive was admissible in nominal attributes. Later, the preposition von 'from' was grammaticalized and acquired the function of English of, being used in nominal attribution instead of the genitive. At first, von was more expressive in attribution than the mere genitive. Nowadays, the genitive is becoming increasingly old-fashioned...

In terms of cognitive linguistics, what is observed here is a shift in the center of gravity of a category with a very simple structure. We have a category with two elements, bare genitive vs. von, and the interpretation of which of the two is prototypical or primary has changed over time. We would not expect to see such reversals involving central and peripheral members of a category with complex structure.


2.6. Markedness alignment

The cooccurrence of unmarked elements is frequently referred to as "markedness assimilation", and the cooccurrence of marked elements has been labeled both "markedness assimilation" and "markedness reversal", the reason being that in a marked context, the appearance of more marked elements is unmarked, so the value appears to be reversed. I will call both phenomena markedness alignment, and reserve the term reversal for the type of historical shift just described above. The frequent occurrence of markedness alignment is another example of a markedness phenomenon that has heretofore lacked theoretical explanation. Cognitive linguistics, however, has a theoretical construct called metaphorical mapping, which operates to select and connect counterparts of different domains on the pragmatic level, and we can postulate that similar mapping operates between categories, producing the alignment of central members with central members and of peripheral members with their marked counterparts (as implied by Lakoff's (1987: 283) "Spatialization of Form hypothesis"). The postulation of this type of mapping is also consistent with what we know about mapping functions between neural nets in the brain (cf. Churchland 1986: 453-456), and thus is an expected phenomenon in the framework of cognitive linguistics. This mapping is termed "metaphorical" because it establishes relationships based on abstractly perceived equivalence. In other words, metaphorical mapping entails the identification of an unmarked member of category A with an unmarked member of category B, and the identification of a marked member of category A with a marked member of category B. Lakoff (1987: 276) speaks of mapping as an essential part of metaphor and Lakoff (1989: throughout, but see 89-91 for particularly vivid illustrations) speaks more specifically about metaphorical mapping (there the genre under discussion is poetry, but grammar can operate in a parallel fashion).
Because an unmarked member of category A is thus equivalent in some sense (via mapping) to an unmarked member of category B, their cooccurrence in the same context is cognitively well-motivated, and the same holds for the marked members of the two categories. Again we see a parallel to poetic metaphor where mapping motivates the juxtaposition of equivalent items from different domains, as in: "My wife ... whose waist is an hourglass" (where mapping connects "hourglass" to "waist"; this example from Lakoff 1989: 90). Although cognitive linguistics anticipates the existence of markedness alignment, it merely provides a motivation for this phenomenon, rather than predicting its occurrence. In other words, markedness alignment must be understood as a possible, but not a necessary option in language. There are certainly other factors that come into play, such as the prevailing form-meaning patterns in the language. Thus markedness alignment is part of the conventionalized structure of a given language. It, like language-specific markedness values, can be likened to language-specific category structure, which is neither predictable nor arbitrary, but rather motivated (cf. Lakoff 1987: 96 and Janda 1993a). The intent of this discussion of markedness alignment is most definitely NOT to provide a theoretical construct that predicts the occurrence of this phenomenon. I am merely attempting to demonstrate that markedness alignment is cognitively well-motivated in language. This does not mean that we can predict that markedness alignment should appear in any specific environments, or even that it should be widespread at all. It does mean that where markedness alignment does exist it is not arbitrary but rather has a sound cognitive anchor. To return to the parallel drawn with poetic metaphor above, the fact that there is a topological similarity between a female torso and an hourglass does not mean that anybody will ever make that connection. However, once the connection is made, it is recognized as natural and well-motivated. In linguistics we see two forces at work in grouping elements: the existence of shared features and the existence of radially-structured categories. A shared feature can be responsible for the shared behavior of a natural class of sounds.
In addition, parallel position in categories (identified via markedness alignment) can be responsible for the shared behavior observed by a group of segments that lack a shared feature. See 3.1.3. below.


3. Case studies

This section contains analyses of some particularly intricate markedness phenomena that can be neatly accounted for when the explanation is informed by the theoretical framework of cognitive linguistics.

3.1. Markedness alignment

When it occurs, markedness alignment can be quite elaborate, with mapping functions spanning a range of phonological, morphological, and semantic categories. This phenomenon will be the subject of the first three case studies.

3.1.1. Nominative singular nominal stem affixes in Russian

[Figure 1. NOM sg nominal stem affixes in Russian; panel labels: Cases, Number, Gender, Desinence]

In Russian we see an alignment of the most unmarked case, the nominative, with the most unmarked number, the singular, the most unmarked gender, the masculine, and the most unmarked desinence, the zero ending, i.e., the nominative singular masculine desinence is zero. Keeping the first two terms constant, if we gradually increase the markedness of the gender term, we see a corresponding gradual increase in the markedness of the desinence. With the feminine we have the least marked non-zero desinence, which consists of the single vowel a, and with the most marked gender, the neuter, we have a desinence consisting of a vowel somewhat more marked than a, namely o (cf. Schupbach 1984: 66-67). The mapping function is responsible for aligning corresponding members of the respective categories into harmonious syntagms.

3.1.2. The second locative in Russian

"symmetry"). An additional variety of the lack of dimensional saliency also arises when the position of the object cannot be determined, as for instance when an object could be maximally extended either along the vertical axis or along the horizontal one, but the scene itself does not provide enough information for deciding which one is really more prominent. Liegen is also required in this case:

(16) Die Steine liegen auf der Straße.
     'The stones are on the road.'
(17) ??Die Steine stehen auf der Straße.

Liegen and stehen in German


Stones can be high, flat, round, with a base etc.; thus, their relative prominence with respect to the road cannot be determined a priori. Given the proper context, (17) becomes acceptable. For example, we can use stehen if we are referring explicitly to large tall stones such as the monoliths of Stonehenge. So far we have seen that the following basic schemata are paired with stehen and liegen when referring to objects located in the perceptual scene:

Stehen => VERTICALITY, BASE
Liegen => HORIZONTALITY, LACK OF DIMENSIONAL SALIENCY

A question arises: is this a completely random distribution of schemata, or is it possible to see a motivation for their convergence on each verb specifically? We have seen that the schema of VERTICALITY is basically encoded by the verb stehen, whereas liegen encodes the schema of HORIZONTALITY. But what about the other two schemata? Is there a reason why VERTICALITY matches with BASE and not with LACK OF DIMENSIONAL SALIENCY or with both? Actually, if we take for granted the basic pairings stehen => VERTICALITY and liegen => HORIZONTALITY, there are four theoretically possible combinations involving the other two schemata, i.e.:

I.   Stehen => VERTICALITY & BASE
     Liegen => HORIZONTALITY & LACK OF DIMENSIONAL SALIENCY

II.  Stehen => VERTICALITY & BASE & LACK OF DIMENSIONAL SALIENCY
     Liegen => HORIZONTALITY


Carlo Serra Borneto

III. Stehen => VERTICALITY & LACK OF DIMENSIONAL SALIENCY
     Liegen => HORIZONTALITY & BASE

IV.  Stehen => VERTICALITY
     Liegen => HORIZONTALITY & LACK OF DIMENSIONAL SALIENCY & BASE
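The count of four follows from simple combinatorics: each of the two non-basic schemata can be assigned to either of the two verbs. A minimal Python sketch of that enumeration is given here; the function name `enumerate_pairings` and the data layout are illustrative assumptions and not part of the analysis itself.

```python
from itertools import product

# Basic pairings taken for granted in the text.
BASIC = {"stehen": "VERTICALITY", "liegen": "HORIZONTALITY"}
# The two schemata whose assignment is in question.
EXTRA = ("BASE", "LACK OF DIMENSIONAL SALIENCY")


def enumerate_pairings():
    """Return every way of assigning each extra schema to exactly one verb."""
    pairings = []
    # product() picks one verb per extra schema: 2 verbs ** 2 schemata = 4 cases.
    for owners in product(BASIC, repeat=len(EXTRA)):
        pairing = {verb: [schema] for verb, schema in BASIC.items()}
        for schema, verb in zip(EXTRA, owners):
            pairing[verb].append(schema)
        pairings.append(pairing)
    return pairings


if __name__ == "__main__":
    for pairing in enumerate_pairings():
        for verb, schemata in pairing.items():
            print(f"{verb} => {' & '.join(schemata)}")
        print()
```

The enumeration produces the same four pairings as I-IV, though not necessarily in that order. It says nothing about which pairing is attested or cognitively plausible; that remains the empirical question.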

As we have seen, the actual combination is I (stehen matches with VERTICALITY & BASE and liegen with HORIZONTALITY and LACK OF SALIENCY), but from a hypothetical point of view it could be equally possible for the two non-basic schemata (BASE & LACK OF DIMENSIONAL SALIENCY) to be matched inversely as in III or for both to be assigned to one of the two verbs (as in II or IV respectively). I would argue that the actual combination (I) is the most consistent with the findings on human visual discrimination in orientation, i.e. this combination is the most likely from a cognitive point of view. Although the real-world background shouldn't be over-emphasized in linguistics, it should nevertheless be kept in mind that we are dealing in this case with locative verbs whose meaning reflects perceptual processes more than almost any other linguistic unit. It is a well-known fact that orientation along the vertical and horizontal axes is perceptually preferred in comparison to all other orientations: "vertical and horizontal lines are detected and discriminated more easily than obliquely oriented lines" (Fisher 1990: 43); "detection and discrimination of orthogonal (vertical and horizontal) lines show an advantage over left-right oblique lines in animals, infants, children, and adults" (ibid.: 45). Thus the basic "codings" of the vertical and horizontal orientation through stehen and liegen reflect relevant preferences in the perceptual setting. But the vertical axis has prominent status even over the horizontal one: "sensitivity to vertical symmetry emerges before sensitivity to horizontal and oblique symmetries and remains perceptually superior throughout adulthood" (ibid.). For instance, in Braine's three-stage model of developmental orientation (Braine 1978) the first stage clearly prefers verticality over all other dimensions: Braine maintains that in the first part of its life a child is able to identify a shape only as upright or not upright. In cognitive terms this means that the first distinction is used to separate the schema of verticality from every other one. If (some) linguistic units partially reflect perceptual experience in an iconic way (at least in this very specialized and restricted area), it is conceivable that the vertical schema will be "coded" by one specific semantic unit and the remaining schemata by a different one. This would be reflected in German by the distribution of the basic schemata between stehen (verticality) and liegen (horizontality and lack of dimensional saliency). Along this line liegen should be mapping every non-vertical schema, as a kind of "default" choice. As we shall see in the next section, that is exactly what happens. (The examples (16)-(17) of "underspecified" orientation already point in this direction.) Of course, it is also conceivable that part of the non-vertical and non-horizontal schemata are "taken over" by other verbs, reflecting further stages in the processing of orientation: for instance, a verb like to sit, which relates primarily to the human position intermediate between standing and lying, could be a good candidate for encoding the schema LACK OF SALIENCY (as applied, for instance, to round, "symmetrical" objects), and that is in fact what happens in some American-Indian languages (cf. Watkins 1976).
But in no case would the merging of the schemata of VERTICALITY and LACK OF SALIENCY into one and the same verb be consistent with this hypothesis, according to which the schema LACK OF SALIENCY is only compatible with liegen and sitzen 'to sit', but not with stehen, since stehen has to encode only the most salient perceptual schema of VERTICALITY and nothing else. On the other hand, the schema of BASE is compatible with stehen. How can this fact be accounted for? A closer look at the schema of BASE reveals that it can actually be conceived as a subpart of the schema of VERTICALITY. Verticality in space coincides with the trajectory of every physical object subjected to the action of the force of gravity. (This is perhaps also at the origin of the preference that verticality enjoys in human perception.) In contrast to the horizontal axis, the vertical (gravitational) axis is always interrupted at the point where it encounters ground level, which represents its natural plane of reference in our perception (Clark 1973: 32). This point is located at the bottom of the imaginary vertical (gravitational) axis and is functionally a very specific part of it (i.e. its origin). Similarly, the base of an object is located at its bottom and has the specific function of supporting it, i.e. of permitting the object itself to remain in a vertical position (instead of falling apart, as a newspaper would do). In this sense the base is simply the stative counterpart of the final point of a dynamic gravitational (vertical) axis. Therefore, the two schemata of BASE and VERTICALITY are not only compatible, but can be seen as two aspects of the same schema of VERTICALITY, as applied to static objects.

3. "Non-perceptual" locations

So far I have been examining the uses of liegen/stehen in relation to objects located in the real perceptual world, i.e. three-dimensional objects which can be seen in the concrete space of our perception. There are, however, instances of location in a more abstract space, which do not relate directly to sensory perception, as for instance locations in mathematical-geometrical space. In this case the required verb is liegen:

(18)  Der Punkt liegt auf der Gerade.
      'The point is on the line.'
(18a) *Der Punkt steht auf der Gerade.
(19)  Liegen die Geraden g, h, k auf einer Ebene ....
      'If the lines g, h, k are on a plane ....'
(19a) *Stehen die Geraden g, h, k auf einer Ebene...
(20)  ... da diese rotierenden Ladungen einem kreisförmig fliessenden Strom gleichkommen, erzeugen sie ein Magnetfeld, dessen beide Pole auf der Drehachse oder Spinachse des Kerns liegen. (IdS, 14/2/86)
      '... since these rotating charges are equivalent to a circular current, they generate a magnetic field, both of whose poles are on the rotation, or spin, axis of the nucleus.'

One could argue that geometrical points have no vertical extension (thus ruling out stehen), but they have no horizontal dimension either. The schema at work here seems to be LACK OF DIMENSIONAL SALIENCY, but in a particular form which can also include more abstract entities like lines and poles, which do not exhibit a particular dimensional extension or orientation (if seen as mathematical or physical constructs). Another kind of "non-perceptual" location is the so-called "geotopographical" location:

(21)  Frankfurt liegt am Main.
      'Frankfurt is on the Main.'
(21a) *Frankfurt steht am Main.

The town is conceived as being located in an abstract geographical space (like a spot on the map) and is not seen in any particular dimensional extension with respect to its point of location. The following examples illustrate this case more clearly:

(22) ...es ist ein mittlerer Berg, 614 Meter hoch, und er liegt bei Wiesbaden. (IdS, 26/5/86)
     '... it is a medium-sized mountain, 614 meters high, and it is near Wiesbaden.'
(23) ... In Osnabrück, dort, wo es am Stadtrand trist und öde wird, liegen an einem Bahndamm einige Mietskasernen. (IdS, 14/6/86)
     '...in Osnabrück, where the edge of the city turns sad and desolate, there are several apartment buildings near the train embankment.'

Clearly, a mountain (especially if its height is expressly specified, as it is here) is a good candidate for a vertical object with a base (=> stehen); the same holds for the apartment buildings. Indeed, sentences containing Berg or Mietskasernen and stehen are very easily conceivable; but what comes into play in (22) and (23) is rather the image of geographical and topographical location, which is superimposed on a scene containing "vertical" objects compatible with stehen. That this is really the case is confirmed by examples of the same (or of a very similar) locative scene placed in a different context. Imagine that you are standing with a friend in front of the apartment buildings and are pointing at them; in this case it will only be possible to use (24):

(24)  Schau mal dort, da stehen einige Mietskasernen am Damm.
      'Look, there are some apartment buildings near the embankment.'
(24a) *Schau mal dort, da liegen einige Mietskasernen am Damm.

In other words, if the objects (here, the houses) are inside the perceptual and deictic scope of the speaker, the scene is concrete and the dimensional character of the location is relevant. In this case stehen is required, since it is impossible to superimpose a non-dimensional and non-perceptual image on the scene, as the geographical location associated with liegen would require (for a similar phenomenon in Cora cf. Casad & Langacker 1985: 266, 276). Are the "non-perceptual" uses of liegen compatible with its characterization as delineated above? One could, for instance, treat the examples in this section simply as default choices of location, since an interpretation of them as instances of the schema LACK OF DIMENSIONAL SALIENCY is consistent: no particular dimension of the located object appears to be more relevant than the others in (18)-(23); indeed, the dimensional extension of the objects plays no role at all in the process of location, which rather resembles a sort of "pure" location outside the boundaries of the spatial scene. Since the perceptual input may be irrelevant, the real process reflected by the verb consists in applying a "perceptually derived" schema onto a conceptual space. As we have noted above when introducing the concept of schematization, this is the process which permits the extension of the use of these kinds of verbs to very different, even figurative, contexts.

4. Alternative schematization

As often pointed out by other authors (Talmy 1983; Langacker 1987), speakers have the possibility of profiling different aspects of a scene by alternatively applying different schemata: "Different schemas can usually be applied with equal appropriateness to the same physical configuration, capitalizing on different sets of characteristics contained in the configuration - and, correspondingly, disregarding different sets" (Talmy 1983: 264). We have already seen alternative applications of schemas in examples (23) and (24); in that case a difference in the scene ("deictic" versus "non-deictic") determined the verb selection and the choice of the schema. Let us examine now a case of alternative schematization applied to the same (perceptual) scene:

(25) Der Turm steht im Scheinwerferlicht.
     'The tower is in (the light of) the floodlight.'
(26) Der Turm liegt im Scheinwerferlicht.

Both verbs are acceptable here. The objective scene is the same and belongs to the perceptual field of the speaker. In (25), however, the height, the solidity and the bulky character of the tower are in focus, whereas in (26) the shape of the tower is backgrounded, and the intensity of the floodlight is emphasized. Recall the notions of "figure" and "ground" in their basic psychological sense, as developed by Gestalt psychologists: "1. The figure is more 'thinglike' and more memorable than the ground; 2. The figure is seen as being in front of the ground; 3. The ground is seen as unformed material and seems to extend behind the figure" (Goldstein 1984: 173). In these terms we can say that in this case liegen shifts the attention towards (or even profiles) the background (the "ground") of the scene, while stehen does the same with respect to the foreground (the "figure"). In (25) and (26) the speaker chooses the relevant schema profiled by a particular verb according to his/her intention of stressing one particular aspect of the scene (figure or ground respectively). In doing so the speaker enjoys relative freedom. Sometimes, however, a schema is not a matter of individual choice; instead, it becomes very much conventionalized in the speaker's consciousness (as a member of a cultural community) and emerges as a weak form of "frozen expression":

(27)  Die Sonne steht am Himmel.
      'The sun is in the sky.'
(27a) *Die Sonne liegt am Himmel.

The sun can certainly be regarded as a "symmetrical" object (thus, it would appear to require liegen); moreover, in the scene itself there is no apparent element containing a base or a maximal vertical extension. Nevertheless the appropriate verb here is stehen. If we consider only the characteristics of the object as a possible source of verb selection, this would appear rather odd. But if we adopt the hypothesis that in this case a particular image (that of VERTICALITY) is superimposed upon the entire perceptual scene, the selection of stehen becomes more understandable. Indeed, what seems to be categorized in (27) is not the position of the sun in the sky but the entire vertical axis along which it moves, i.e. the abstract path covered by the sun in its daily journey over the horizon: the sun is not seen as an object but as a part of an up-down trajectory in the sky (cf. "Die Sonne geht auf" 'the sun rises' and "Die Sonne geht unter" 'the sun goes down', auf and unter being particles which co-determine the vertical axis). This seems to be confirmed by the following examples:

(28)  Die Sonne steht am Horizont.
      'The sun is on the horizon.'
(28a) *Die Sonne liegt am Horizont.

Here the horizontal axis is expressly evoked in the scene (Horizont) and the sun could be easily conceived as lying on it, but since the conceptual projection of the vertical axis is superimposed over the entire scene, it does not matter at which point of the scene the sun is located or which other positional features appear in it. Once a particular image has established itself and is conventionalized, it applies to an entire set of similar situations. This is typical of image projection and applies also to the figurative extensions of image schemata (see next section).

Before turning to the discussion of the figurative uses of liegen and stehen, I would like to summarize the features and schemata attributed so far to each of the two verbs.

Stehen: human being in upright position, VERTICALITY, BASE, preference for the deictic-perceptual scene, focus of attention on FIGURE; in general, more concrete objects.

Liegen: human being in lying position, HORIZONTALITY, LACK OF DIMENSIONAL SALIENCY, indeterminate or geotopographical location, geometrical space, focus of attention on GROUND; in general, a tendency towards more abstract locations.

Figurative extensions are basically correlated with at least one of the features/schemata listed above, sometimes with more than one. In the latter case competition can arise, and the choice of one of the two verbs might appear less motivated than the choice of the other, as will be shown in the next section.

5. Figurative extensions of image schemata

Johnson (1987) devotes an entire chapter of his book to the "metaphorical extensions of image schemata". I prefer speaking here of "figurative extensions" instead of "metaphorical extensions" because I am going to analyze different kinds of examples, which range from idioms to "dead metaphors", from single (traditional) metaphors to conceptual metaphors (in the sense of Lakoff and Johnson 1980 and Lakoff 1987). These figurative extensions generally do not strictly imply any location in the real world, but rather location in some sort of mental "figurative" space. Thus, the shape and orientation of the objects involved play no role in the selection of the verb; verbs are selected, instead, on the basis of conventional schemata "superimposed" on the entire scene, as in (27), (28). At least one of these schemata (sometimes more than one) represents the motivation for the selected verb.

In the following I will examine the types of extension that I have worked out so far.

5.1. Extensions from human positions

A very simple form of "metaphorical extension" consists of applying the verb denoting a particular human position to some other object or situation, which therefore acquires the humanlike characteristics connected with the position itself:

(29) Die BW-Bank steht fest auf eigenen Beinen. (IdS 25/3/86)
     'The BW Bank is (=stands) firmly on its own two feet.'
(30) ... allein stand das G + H AG-Ergebnis mit einem Verlust von 0.2 Mio. DM noch auf etwas schwachen Füßen. (IdS 27/5/86)
     '... only the G+H AG rate, with a loss of 200,000 DM, is still in a somewhat precarious position.'

Stehen, as the verb of human upright position and "base" (feet, legs), conveys a sense of stability (29), but is used also in the opposite sense of instability (30). The contradiction is only apparent if one recalls the principle of the image extension over the entire scene (in this case over the entire metaphorical scene): once the metaphor is introduced in a situational domain, it works for all possible variants, even if they imply opposite conditions. As for the horizontal position of the human body connected with liegen, it is intuitive that conditions like sleep, illness and eventually death are connected with this position and therefore locative expressions like im Schlafe/in Krankheit/im Sarge/im Grabe liegen 'to be asleep, in illness, in the coffin, in the grave' require liegen. Also the localization of an ill or dead person can be regarded as an extension of this pattern, whereby the "concrete" sense of lying horizontally is still very much at work:

(31) ... eine Arbeitskollegin liegt schon lange Zeit im Krankenhaus. (IdS 21/6/86)
     '... A colleague has already been (lying) for a long time in (the) hospital.'
(32) ... so romantisch will es jedenfalls der Dichter Theophile Gautier erlebt haben, der heute nicht weit von seinem Freund Henri liegt. (IdS 19/4/86)
     '...in any case the poet Theophile Gautier, who is (=lies) today not far from his friend Henri, is said to have experienced it in this romantic way.'

Psychological experiences which usually accompany those conditions, such as abandonment, suffering, and destruction, are also expressed by liegen (im Kummer/in Sorge liegen 'to be in grief/in worry', but also unbenutzt/müßig liegen 'to be not utilized/idle'). In the following example the location is interpreted metaphorically as a form of discomfort:

(33) ... was die Mieterin fröhlich stimmt, liegt der Vermieterin ziemlich quer im Magen. (IdS 24/3/86)
'what puts the tenant in such a good mood is/lies across the landlady's stomach.' (= she is uncomfortable)

An extension of the idea of abandonment can also be seen in expressions of non-accomplishment or avoidance, such as:

(34) Die Arbeit bleibt liegen. (Duden)
'Work remains undone.'

(35) ... hatten in den letzten Jahren die Schüler durch das Abwahlsystem die Möglichkeit, Fächer wie Mathematik oder Deutsch links liegen zu lassen, so müssen die Abiturienten dieses Jahrganges ihr Wissen auch auf diesen Gebieten unter Beweis stellen. (IdS, 23/1/86)
'even if in the last years the students had the option of leaving alone (lit. to leave lying on the left) topics like Mathematics or German, this year's graduates have to prove their knowledge in these fields as well.'


Carlo Serra Borneto

On the other hand the upright human position (stehen) is connected with positive attitudes (cf. Bierwisch 1967; Clark 1973, among others) towards persons and towards more abstract topics as well:

(36) "Die Bewohner von Mundenheim-West müssen hinter uns stehen, wenn wir schon gegen die Stadt kämpfen". (IdS, 21/04/88)
'Inhabitants of Mundenheim-West should be (=stand) behind us when we fight against the city administration.'

(37) ... wenn den Politikern die Tapferkeit fehlen sollte, notfalls auch gegen populäre Strömungen in der Öffentlichkeit oder in der eigenen Mehrheit oder Partei zu ihren persönlichen Überzeugungen zu stehen ... dann stünde es schlecht um uns. (IdS, 1/12/88)
'should the politicians be lacking the bravery to be (= stand) by their own personal convictions, even against public sentiments within their own majority or party ... then we would be in a bad condition.'

In (36) and (37) the supportive attitude required from people is consistent with a positive "upright" posture (since upright posture generally corresponds to health, strength etc.). Likewise, in the following example the "blooming" metaphor evokes positivity and therefore an extension from the same posture (note that in this last case human beings are not expressly mentioned in the sentence):

(38) ... die Liebe zum "Historischen" und die Diskussion um grundlegende Ausführungsdetail stehen in voller Blüte. (IdS 24/5/88)
'... the passion for historical issues and the discussion about fundamental details in its achievements are in full bloom.'


5.2. Extensions of vertical axis

The examples in this sub-section are characterized by the projection of the VERTICALITY schema and are thus systematically connected with stehen. In the first set (WRITTEN TEXT AS VERTICAL ORDERING) we can see how a figurative extension, starting from a concrete spatial setting, extends step by step towards scenes in which the spatial setting no longer plays a role. The sequence does not imply diachronic development along this pattern. It only shows the possibility of tracing a conceptual path from a concrete instance of verticality towards more abstract ones in the same context. The basic idea is that written texts can be seen as a sequence of lines vertically ordered on the page:

(39) ... schrieb die Schizophrene Emma Hauck ein kleines Blatt mit Wörtern und Zahlen voll, die sich wiederholend untereinander stehen und auf diese Weise dunkle Säulen bilden. (IdS 1/2/86)
'... the schizophrenic Emma Hauck filled a small sheet with words and numbers, which were repeated one below the other and thus formed dark columns.'

(40) Unter dem Schriftstück steht sein Name. (Duden)
'His name is under the written part.'

(41) ... Namen von Patienten einer Marburger Klinik stehen im neuen Marburger Adreßbuch. (IdS 5/6/86)
'... names of the patients of a Marburg clinic are in the new Marburg directory.'

In (39) the vertical ordering of the written text is clearly underlined by the metaphor of the column and by the explicit mention of the relative position of words and numbers (one below the other); in (40) there is still an allusion to the vertical orientation of the written parts (the name is under the written part); in (41) there is no longer a lexical indicator of vertical ordering or position. The figurative extension, which started from a perceptual image, has established itself in the conventional knowledge of the speakers and is now active, independently from the original spatial image. The pattern is very consistent; for instance:

(42) Der Punkt liegt auf der Diagonale. (Duden)
'The point is on the diagonal.'

(43) Hier muß ein Punkt stehen. (Duden)
'Here there should be a period.'

The point (Punkt) in (42) is a geometric point on a one-dimensional axis (=> liegen); the period (also a Punkt) in (43) is no longer seen as a one-dimensional geometrical object, but is construed as a part of the figurative extension of WRITING. As already pointed out, once an object is conceived inside a figurative context, its intrinsic shape/position characteristics no longer determine the selection of the verb. Likewise:

(44) Das Geld steht auf dem Konto/auf dem Sparbuch. (Duden)
'The money is in the account/in the savings book.'

(44a) *Das Geld liegt auf dem Konto/auf dem Sparbuch.

(45) Das Geld liegt auf der Bank.
'The money is in the bank.'

(45a) *Das Geld steht auf der Bank.

In (44) and (45) the intended situation is very similar, but in (44) the money is seen as an amount written on the account and only stehen is allowed; in (45) the money (a dimensionally not specifiable object) is simply conceived as being in the bank and thus requires liegen. The starting point of these figurative extensions has a clear spatial analogy with the perceptual image of VERTICALITY (columnar ordering). On the other hand, in the following examples the figurative extensions apply to already existing metaphors, and a more abstract image of VERTICALITY represents a sort of mental link between them. According to Lakoff and Johnson (1980), the idea of controlling is metaphorically related to an upper position (CONTROL IS UP), lacking control to a low position (LACK OF CONTROL IS DOWN). If we imagine these two metaphors as being connected conceptually, they can be seen as being linked to an UP-DOWN line or axis, which recalls the image of VERTICALITY. In that case we could expect various kinds of control metaphors, all requiring the verb stehen:

(46) Landeshauptmann Silvius Magnago (72), der seit 1961 an der Spitze der Partei steht. (IdS 8/4/86)
'Governor S. M. (72), who has been at the top [=head] of the party since 1961.'

(47) ... die Konferenz steht unter der Schirmherrschaft des Bayerischen Staatsministers P. S. (IdS 15/5/86)
'... the conference is under the auspices of the Bavarian state minister P.S.'

(48) ... nach der klassischen Wachstumstheorie steht eine Volkswirtschaft unter Wachstumszwang. (IdS 15/2/86)
'... according to classical growth theory an economy is under pressure to grow.'

In (46) the controller is foregrounded (it appears as the subject of the sentence) and its location is high (at the top of the party); in (47) the controlled factor is foregrounded and is therefore located below the controller (under the auspices of...); in (48) the controller itself (at least the human controller) is no longer mentioned in the sentence. Thus, there is a series of variants related to the same control metaphor, with different degrees of explicitness, but all of them requiring stehen. Similarly we expect stehen to be used for expressing the LACK OF CONTROL metaphors as well, because the CONTROL notion is still salient even though its lack is specifically profiled (in other words, even if the negative of a given situation is invoked, the original situation itself is still semantically relevant and thus available for linguistic coding):

(49) ... in einer Erklärung der FDA heißt es, den Ärzten stehe es frei, Alphainterferon Patienten zu geben, die an sogenannter Haarzell-Leukämie leiden. (IdS 6/6/86)
'... one statement of the FDA reports that doctors are free to give alpha interferon to patients suffering from so-called hairy-cell leukemia.'

(50) ... wenn ja, dann stehen ihnen dafür zwei Wege offen. Sie können auf Antrag versicherungspflichtig werden oder aber freiwillig beitreten. (IdS 19/4/86)
'... if yes, then two paths are open to them: they can apply for insurance or join voluntarily.'

In (49) the FDA organisation does not exert compulsive control over the doctors (who are free to act according to their conscience); nevertheless some sort of CONTROL is still operative, insofar as the FDA is allowing the doctors freedom; in (50) the LACK OF CONTROL is more evident, since people have total freedom to choose between two alternatives (two paths). Interestingly enough the path (zwei Wege) appears here to be compatible with stehen, which was not the case in a previous example involving location in the 'perceptual world':

(6a) *Der Weg steht hinter den Bäumen.
'The path is behind the trees.'

The fact is, of course, that the path in (50) is a metaphorical path, not a concrete one as in (6); it does not belong to the 'perceptual scene' and is therefore not subject to its constraints.

5.3. Extensions of the horizontal axis

In many languages, temporal expressions represent some of the most common figurative extensions of spatial expressions. Time is seen as developing along a one-dimensional axis (notably the front-back axis; cf. Clark & Clark 1978; Traugott 1985; Johnson 1987 among others), and since liegen is the verb selected for highlighting the one-dimensional axis in the spatial domain (geometrical location), its use for temporal expressions appears to be consistent:

(51) Sorge äußerten einige Synodale, daß der Auftritt von Bundeskanzler Helmut Kohl (...) im Vorfeld der Bundestagsauswahl liege... (IdS, 17/5/86)
'Several synod members expressed concern that the appearance of federal Chancellor Helmut Kohl should be (=come) during the run-up to the Bundestag election.'

(52) ... liegt dieser Tag vor dem 1. Januar 1983, so muß das Arbeitsamt (...) die Neufestsetzung bereits im Jahre 1986 vornehmen. (IdS, 17/5/86)
'If this day is (=lies) before 1 January 1983, the labour exchange will have to fix a new figure as early as 1986.'

(53) ... nun meint Weber, man liege trotz der dreimonatigen Verzögerung noch im zeitlichen Rahmen. (IdS, 19/2/86)
'Weber now thinks that in spite of a 3-month delay we are still within our time frame.'

(51) exemplifies the shift from the spatial to the temporal domain: the expression is clearly temporal but the central lexical unit used in it (Vorfeld = "run-up", but literally "forefield/forefront") still preserves the spatial topology. By contrast, in (52) temporality is overtly expressed by mentioning the specific point of the time location (the day). In (53) time is more generally seen as a bounded region. In all these examples the linear character of the temporal axis is also suggested by evoking two related events, which could be seen as connected points ideally lying on an abstract line.

Quantitative data also extend along a one-dimensional axis. As noted by Smith, Rattermann, and Sera (1988: 353), "a quantitative dimension is mathematically a linear ordering; all items can be ordered on a single line." Every item of a quantitative series can be considered a point on a line, very much as the sequence of natural numbers is viewed as lying along a horizontal axis. For example:

(54) ... wie schon berichtet, liegen die Meßergebnisse in Mannheim vom 4. Mai bei 100 Bq pro Liter. (IdS 7/5/86)
'as already reported, the results of the measurement in Mannheim on the 4th of May are 100 Bq per liter.'

(55) ... der größte Durchmesser bei diesem Bauverfahren liegt bei 11,50 Meter. (IdS, 10/4/86)
'the maximal diameter in this kind of construction is about 11.50 meters.'


There are several examples of this kind in our corpus, involving social, economic and financial data, scientific measures, etc. They all involve the spatial distribution of some entity along a one-dimensional horizontal line (liegen). The question arises why this kind of line should be horizontal: could it not also be a vertical (one-dimensional) line (thus triggering the use of stehen)? My view is that the schema of horizontality is generally more consistent with the idea of one-dimensionality, because the horizontal line is in some sense intrinsically more one-dimensional than the vertical one, or - if seen from the reverse perspective - the vertical line is in some sense inherently more bi-dimensional. Indeed, the vertical axis is always defined in relation to a ground level or plane (see Clark & Clark 1978; but also the OED definition of vertical: "placed or extending at right angles to the plane of the horizon"), i.e. its perception is secondary with respect to another dimension (the horizontal one). By contrast, the horizontal line is simply viewed as a part or extension of the horizon itself (OED: "Of or belonging to the horizon; situated or occurring on the horizon"). Thus, the idea of horizontality seems to be primary with respect to the definition of verticality, which in its turn is intrinsically derived and more complex (as already shown in the discussion about the schema BASE). With its reference to another dimension, verticality involves something more than bare one-dimensionality (although, of course, it is still a monodimensional entity), whereas horizontality remains elemental in this respect. In my view, this explains why this axis is preferred as a figurative location for one-dimensional or zerodimensional entities. 
It is interesting to notice that the distribution of quantities along the (figurative) horizontal axis contrasts with the well-known schemata of MORE IS UP, LESS IS DOWN (Bierwisch 1967; Nagy 1974; Lakoff and Johnson 1980; Johnson 1987). Larger amounts are always seen in connection with a higher location, smaller with a lower; thus, we would expect the alignment of such quantities on some sort of vertical axis (=> stehen). But in the kind of figurative extensions discussed here the selected verb is always liegen.

(56) ... die Entwicklung am deutschen Markt wird wesentlich vom Ablauf des US-Kapitalmarktes bestimmt, wo die Dollarzinsen um etwa 2,5 v.H. über den Kapitalzinsen auf DM-Basis liegen. (IdS, 18/5/88)
'The development of the German market is essentially determined by the course of the US money market because the dollar interest rates are about 2.5% higher than the DM interests.'

(57) ... alle Werte liegen deutlich unter 0,4 Bequerel (Bq) pro Quadratzentimeter, dem Grenzwert für Gemüse. (IdS 7/5/86)
'... all figures are clearly below 0.4 becquerel (Bq) per square centimetre, which is the accepted limit for vegetables.'

This characteristic parallels the behaviour already encountered in the case of vertical schemata: the sun in (27) could be very high or very low on the horizon, but the scene always requires the verb stehen; the same applies in the case of the control metaphor (46-50), where the presence of control, as well as the lack of control, requires the verb stehen. Similarly, if in one particular type of scene, there is more than one image-schema simultaneously at work, the conventionalized system of the language "chooses" (establishes) one of them as particularly relevant and then uses it throughout every possible variant of the scene, even if it clashes with other images. Of course the presence of competing schemata in certain types of scenes makes it more difficult to motivate the choice of a particular one and sometimes leads to apparent inconsistencies. For instance, the Duden dictionary reports an example - see (58) below - of a financial figure which contrasts with dozens of similar examples I found in the IdS corpus (for instance (59)):

(58) Der Dollar stand bei 1,78.
'The dollar was at 1.78.'

(59) ... so lag der D-Mark-Kurs gegenüber dem Franc bei 3,3850 Francs... (IdS, 14/6/88)
'... so the DM exchange rate was 3.3850 against the Franc.'

Duden lists example (58) along with others involving objects having a particular position in a system or scale. Indeed, the schema of SCALE (which requires stehen; see below) may exert some influence here, although prototypically it should exhibit some additional characteristics, as we will see soon.

5.4. Scale

Scale - a very pervasive feature in human experience - can be represented as a kind of vertically oriented path with some "normative character" attached to it (Johnson 1987: 123). Norms immediately evoke the concept of value, because if something is at or above the norm, it is considered positive, otherwise negative. A value is not clearly (mathematically) measurable; rather it is connected with a judgment (good/bad). Good things are seen metaphorically as being high, or superior, bad things are low, or inferior (Bierwisch 1967). Thus, if something has to be judged in terms of value, it will be posited on a high-low scale, i.e. on a vertical scale (=> stehen). Again, the idea of verticality connects two metaphorical models (GOOD IS UP/BAD IS DOWN) and covers all possible scalar values from the top to the bottom (good, fairly good, medium, bad and so on):

(60) ... das Thema Renovieren stand auch im Badezimmer hoch im Kurs. (IdS 21/2/86)
'... the subject of renovating the bathroom was (=stood) also very high on the agenda.'

(61) ... ist es denn faßbar, daß ihnen Waffenhandel und Kommerz höher stehen als das Leben ihrer Mitbürger? (IdS 2/1/86)
'... is it then conceivable that for them arms traffic and trade are higher (= are more important) than the lives of their fellow citizens?'

(62) ... um die Zahngesundheitserziehung stehe es "miserabel". (IdS 7/2/86)
'... dental health education is in a very bad condition.'

(63) ... Abstellgleis, Verwahranstalt, letzte Station - in diesem Ruf stehen immer noch die Altersheime. (IdS 6/2/86)
'Siding, house of detention, final station - all are names associated with retirement homes.'

(64) Dieser Anzug steht dir gut. (Duden)
'This suit fits you.'

(60) expresses an absolute high value on a practical scale, (61) a relative value (high) on a moral scale, (62) an absolute value (low) on a quality scale; the same holds for (63) with the difference that the quality evaluation is implicit, not lexically expressed; (64) expresses an aesthetic value. There are of course many other examples of this type of scaling. As we saw before, quantitative, numerical data are figuratively lined up on a horizontal axis because they are seen as single one-dimensional points; value and norm-dependent factors, on the other hand, are not seen as single points but as part of a more complex structure involving judgment and evaluation, thus evoking the bipolar axis of verticality on which the upper and lower values can be located. Indeed, in most of these examples with stehen no numerical features appear in the sentence, but only lexical units expressing aspects of scalar approximations.

5.5. Extensions of "gravitational" verticality

I have already mentioned (section 2) that one of the basic features of the schema VERTICALITY is connected with the experience of a falling object along the gravitational axis. Consequently, VERTICALITY is also associated with the perception of movement: a falling object always moves toward the ground. Of course, both stehen and liegen are stative verbs and therefore seem unrelated to movement. But there is a very subtle way in which stehen reflects the notion of movement implicit in gravitational verticality:

(65) Die Uhr/der Motor steht. (Duden)
'The watch/the motor stands (=has stopped).'

(66) Der Löwe stand zum Sprung bereit.
'The lion was ready to jump.'

(67) ... "wir haben eine Umweltbewegung in Gang gebracht und stehen jetzt am Anfang einer Gesundheitsbewegung". (IdS 1/4/86)
'we have started an environmental movement and are now at the beginning of a health movement.'

(68) Das Auto bremste und stand nach hundert Meter. (Duden)
'The car braked and stood still after 100 meters.'

In (65) the static situation is seen as an "exception" to a condition in which movement is canonical (watches and engines are canonically supposed to be working and thus to be in some sort of movement); in (66) and (67) movement is "likely to arise" very soon out of a static situation (in [66] in a concrete, in [67] in a figurative way). In (68) stativity results immediately "after" movement (actually horizontal movement, and not vertical); this confirms that once a schema is established, it also remains relevant for contexts different from the original situation. The common feature in all these examples is thus, in a sense, a "taste of movement despite stativity", which reveals the dynamic factor implicit in the schema of stehen. By contrast, liegen is used only for static situations which have lasted or are likely to last for a certain time:

(69) Der Pannenhelfer des Automobilklubs machte mich darauf aufmerksam, daß ich meinen Wagen falsch abgestellt hatte, weil ein Fahrzeug, das nachts liegen bleibt, in jedem Falle von der Fahrbahn der Landstraße entfernt werden müsse. (IdS 9/4/88)
'The towtruck driver of the Automobile Club pointed out that I had stopped my car in the wrong place, because a vehicle that is (= remains standing) overnight must always be removed from the roadway.'

(70) ... bei der Anklagebehörde blieb die Angelegenheit fast zwei Jahre liegen. (IdS 16/4/88)
'The matter was (=stayed) in the district court for nearly 2 years.'


In (69) - a nice counterpart to (68) - the car has broken down and been abandoned on the side of the road and it will be towed away after a certain amount of time; in (70) the (fairly long) time of the location is explicitly mentioned in the sentence. In this respect it is also interesting to contrast the following examples (for similar examples in Dutch see van Oosten 1986: 148):

(71) Die Griechen lagen vor Troja.
'The Greeks were outside Troy.'

(72) Die Russen standen vor Berlin.
'The Russians were outside Berlin.'

In both cases the situation depicts an army located very close to an enemy town, but in the first case (liegen) the army is involved in a long-lasting siege, in the second case the army has stopped at the end of an advance and is now ready for the final attack. The inherent dynamic factor in the scene thus requires the selection of stehen, as opposed to the totally static locative liegen.

5.6. Extensions of the figure/ground distribution of attention

The extensions that involve the relative distribution of figure and ground require a certain amount of preliminary discussion because the terms figure and ground can lead to some misunderstanding, given the different contexts in which they can be used (cf. Talmy 1978; Langacker 1987 and 1991; Johnson 1987). I will try to be as clear as possible about the way I would like these terms to be understood in this specific context. First let us examine again the initial examples related to this kind of schema (I repeat them here with their original numbers):

(25) Der Turm steht im Scheinwerferlicht.
'The tower is in (the light of) the floodlight.'

(26) Der Turm liegt im Scheinwerferlicht.


As I pointed out in section 4, the bulky character of the tower is highlighted in (25), whereas in (26) the intensity of the floodlight is emphasized. The speaker alternatively emphasizes the two relevant "objects" in the scene, shifting his/our attention towards the first (25) or the second (26). But these "objects" have a particular status in the locational setting. A prototypical location involves a salient object (the one to be located) and a background object with respect to which the salient object is located. The first one, called the "figure", prototypically exhibits the characteristics already listed in section 4 (is "thing like", "stands out" etc.); the second one ("ground") is prototypically larger, lies behind it, is compact, and so on. This is true at a psychological and perceptual level. If we turn now to the linguistic expressions, we note that the most salient element in a sentence is located - again prototypically (see Langacker 1991) - at the beginning, i.e. in topic position (which is also the prototypical position of the clausal trajector). Consequently, less salient objects will appear in some other part of the sentence (the comment). Starting from this observation, I will argue that the more an object corresponds to a prototypical figure the more its "natural" position will be topical and, conversely, the more it is "groundlike" the more its natural position will be in that part of the sentence which corresponds to the comment. To take a well-known example by Talmy (1978: 628):

(73a) The bike is near the house.
(73b) The house is near the bike.

Talmy quotes these examples in order to point out that a sentence pair which in principle should be understood as synonymous by virtue of its symmetrical relation (x is near y = y is near x) is in fact not synonymous at all because of the different roles (i.e. difference in imagery) the objects assume in the sentence: in (73a) the reference point (the ground) is the house, in (73b) it is the bike (and inversely with regard to the figure). The point I am making here is that some objects are more suited for the role of figure (or ground) than others and that this fact also has some influence on the way we make judgments about the sentences. For instance, under normal circumstances (73b) sounds a little odd, because it is strange for a movable, smaller object


(as the bike) to be chosen as a reference point with respect to an immobile, larger object (the house). Even more peculiar would be the counterpart of our initial example:

(74a) The tower is in the light of the floodlight.
(74b) The light of the floodlight is around the tower.

Used as a locative sentence (for example, as an answer to: "Where is the light?" and not as a variant of the possible poetic sentence "The light surrounded the tower from all sides"), (74b) is certainly "unnatural", because the light is a highly "groundlike object" with respect to the tower and thus its natural position in the sentence should be somewhere in the comment (I will be more specific on this point below). On the other hand, the tower is the typical "standing out", bulky object that we would expect in this scene to take over the role of figure and thus its "natural" position is the topic. If we go back to our initial examples (25) and (26), we notice that this "natural" distribution of figure/topic and ground/comment is maintained in both cases; this is exactly the reason why we need some other device in order to mark the shift of attention from one object to the other. Since in this case a change in word order would infringe on the natural distribution of roles (cf. (74b)), the shift of attention is "encoded" directly in the semantics of the locative verb. In short, the claim is that, if an object is cognitively preferred as the figure, it will be sententially preferred as the topic (and basically as trajector); if it is cognitively preferred as the ground, it will be sententially preferred as a part of the comment (and basically as landmark). Why should this be so? Tentatively I would like to propose the following hypothesis: the figure stands out and is prototypically located in front of the ground; it is the object that comes to our attention first in the process of scanning the depicted scene. If we assume that there is a kind of iconicity between the cognitive locative scene and the sentential setting, we can readily expect the noun that encodes the "first" object in the scene (i.e. the figure) to also be placed in the first position of the clause, which is prototypically the topic/trajector. If this argument is valid, we can postulate the following schematic structure

for the prototypical distribution of attention connected with stehen and liegen:

Table 1. Prototypical distribution of attention in stehen and liegen

                         stehen          liegen
                         attention on    attention on
Cognitive role:          FIGURE          GROUND
Sentential ordering:     TOPIC           COMMENT
Clausal participants:    TRAJECTOR       LANDMARK

This means that stehen prototypically places the speaker's/hearer's attention simultaneously on the "figure" (if we examine its function from the point of view of the cognitive roles), on the "topic" (from the point of view of sentential ordering) and on the "trajector" (from the point of view of the clausal participants), whereas liegen places the speaker's/hearer's attention on the "ground" (as cognitive role), on the "comment" (from the point of view of sentential ordering) and on the "landmark" (from the point of view of the clausal participants). Of course, a prototypical structure is not as rigid as a "rule". Since the sentential sequence is influenced by several dimensions of organization, the prototypical distribution is violated in some cases (for instance, the ground could appear in the topic position under certain circumstances). However, as we shall see below, this schema holds for most cases, especially for the ones where the figure and the ground in the scene exhibit prototypical features.

According to Langacker, the relevant model for characterizing locative relations is the so-called "stage model". Langacker (1991: 284) speaks of it in the following terms: "The role of the perceiver is in many ways analogous to that of someone watching a play. At any moment his field of vision subtends only a limited portion of his surroundings, within which his attention is focused on a particular region (similar to a stage). As in the case of the stage, the viewer tends to organize the scene he observes into an inclusive setting populated by interacting participants". What Langacker calls the "region" (sometimes also the "main actor" or the "portion of the stage focused on") can obviously be equated with a figure; it is also prototypically located in the centre of the observer's attention. Thus, the prototypical examples illustrating the shift of attention to the figure refer to a (figurative) "stage-like" setting in which the attention of the observer is directed towards the centre of the scene. This situation is reflected also by the selection of the lexical units (im Mittelpunkt/im Vordergrund 'in the centre/in the foreground') co-occurring with stehen:

(75) multimediale Projekte, Klanginstallationen, und Produktionen im Zwischenbereich von Wort und Musik stehen im Mittelpunkt der 61. Weltmusiktage. (IdS 12/88)
'multimedia projects, sound installations and productions combining words and music are at the centre of the 61st World Music days festival.'

(76) ... vielleicht wird das Dirigieren ein wenig im Vordergrund stehen. (IdS 16/4/88)
'... perhaps the conducting will be somewhat in the foreground.'

Obviously the same metaphors (im Mittelpunkt/im Vordergrund stehen) can be used also in much more abstract, non "stage-like" contexts: (77)

(78)

... im Mittelpunkt stehen nach ihrer Darlegung Aufklärung und Vorbeugung sowie optimale Beratung und Betreuung von Infizierten und Kranken. (IdS 27/1/88) '...according to her, explanation and prevention, as well as excellent consulting and care of the infected and ill patients are at the centre of their concern.' ... was an Technik dahinter steckt, damit wirbt kein Hersteller; die Problemlösungen stehen im Vordergrund. (IdS 22/3/88)


Carlo Serra Borneto

'... the technique behind it is not publicized by the manufacturer; the problem solutions are in the front (= are focused on).'

Expressions like im Zentrum/in der Mitte/zur Seite stehen 'to be in the centre/in the middle/at the side' belong to this group. These are expressions containing a lexeme which designates a concrete spatial reference point (centre, middle, side). There are, however, more abstract expressions connectable with this pattern, which could be aligned along a sort of "chain" of increasing abstractness starting, for instance, from zur Seite stehen. We thus note the following chain of conventionalized constructions: zur Seite 'at the side' stehen > in Kontakt 'in contact' stehen > in Verbindung 'in conjunction' stehen > in Beziehung/in Verhältnis 'in relation/relationship' stehen > im Zusammenhang 'in connection' stehen. Some examples of this very common pattern are:

(79) ... übrigens stehen auch die Kinderkliniken in Mannheim, Ludwigshafen und Heidelberg in ständigem Kontakt. (IdS 28/04/88)
'... by the way, the children's clinics in Mannheim, Ludwigshafen and Heidelberg are also in constant communication with each other.'

(80) ... kann diese Schimmelbildung tatsächlich im Zusammenhang mit meinen Beschwerden stehen und was könnte ich dagegen tun? (IdS 02/04/88)
'... can this fungus outbreak really be related to my complaint, and what could I do about it?'

A similar "chain" involving an increasingly negative evaluation of the focussed relationship can be traced starting from im Zusammenhang stehen. This sequence runs as follows: im Zusammenhang 'in connection' stehen > im Wechselspiel 'in interaction' stehen > in Abhängigkeit 'in dependence' stehen > in Konkurrenz 'in competition' stehen > im Widerspruch 'in contradiction' stehen > in Gegensatz 'in contrast' stehen. Of course, as in the development of many radial categories from a semantic core, the most distal points in the chain seem to have little in common with the starting nodes (Brugman 1981; Lakoff 1987); but as long as the path spreading from the core is traceable, we can assume that the same schema (here: focus of attention on the figure) is working throughout the whole trajectory. Some examples of this group:

(81) ... aufgesaugt vom Stoff, wirken sie wie reines, farbiges Licht, stehen aber in gleichwertigem Wechselspiel zu frei gebliebenen Leinenpartien. (IdS 27/02/88)
'... absorbed by the cloth they look like pure colourful light but are in interaction (interacting) with sections of blank canvas.'

(82) ... die mittelalterlichen Trachten der Frauen, das klösterliche Ambiente stehen in scheinbarem Gegensatz zu der computergesteuerten Bewachungmaschinerie des Staates. (IdS 03/06/88)
'... the medieval costumes of the women, the convent-like environment are in apparent contrast to the computer-controlled state surveillance machinery.'

Now, turning to liegen and the corresponding focus of attention on the ground/landmark/comment, I begin the discussion with the following example:

(83) ... das liegt vor allem an Geißlers deklamatorischem Gesangsstil, seinen Melismen, seinem klar den Figuren zugeordneten, charakterisierenden Themenkomplex. (IdS 12/88)
'... this is mainly due to Geißler's declamatory style of singing, his melismas, his characteristic thematic structures, which are clearly subordinate to the ornaments.'

Sentences containing liegen an signal a causal relationship; the clausal subject coincides with the result of the causal relationship and the prepositional phrase with the origin (the causing factor or event). A causal relation differs from a causative one in many respects which I cannot go into at this point (for German causal structures cf. Serra Borneto 1983); roughly speaking, a causative construction tends to be iconic in that it yields a privileged word order which reflects the temporal (and causal) sequence of events in the world, i.e. it starts with the causer as the sentence subject/topic and ends with the causee as object/comment. The structure of a causal expression, on the contrary, tends to be "anti-iconic", in that it starts with the causee (or the resulting event) as the sentence subject and continues with the causer as the object (usually a prepositional object). This is because in the latter case the resulting event is usually already known and the speaker wants to point out the reason or motivation for that event (think of complex sentences of the type "x because y"). Attention is thus shifted towards the comment. In (83) the subject/topic position is filled by an anaphoric textual pro-form (das 'this'), which indicates that the caused event has already been mentioned in the text and is therefore known to the reader. The writer points out the new information, which coincides with the unknown causing factor and is located in the sentential comment position. It is interesting to note that out of 108 examples of liegen an constructions (from a first scanning of 2000 liegen examples in the IdS corpus) 99 had the subject/topic position filled by a pro-form, as in the example above, and only 9 had a full nominal subject - a clear sign that in this kind of construction the role of the figure is not privileged. The examples with full subjects also refer to something already known or at least implied in the text and highlight the role of the ground/comment; for instance:

(84) ... vielleicht liegt der Anstieg im deutschen Süden auch an den Aktivitäten von Prof. Eduard Gaugler, dem Ordinarius für Allgemeine Betriebswirtschaftslehre, Personalwesen und Arbeitswissenschaft an der Universität Mannheim. (IdS 30/03/88)
'... perhaps the increase in the south of Germany is also due to the activities of Prof. Eduard Gaugler, professor of Industrial and Personnel Management and Labour Sciences at the University of Mannheim.'
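
The counts reported above for the liegen an constructions can be restated as proportions. The following short Python sketch is offered only as an illustrative aid and is not part of the original analysis; the variable and function names are hypothetical, while the figures are those given in the text.

```python
# Counts reported in the text for the IdS sample.
total_liegen_examples = 2000   # liegen examples scanned
liegen_an_examples = 108       # of which liegen an constructions
pro_form_subjects = 99         # subject/topic filled by a pro-form
full_nominal_subjects = 9      # subject/topic filled by a full nominal


def share(part: int, whole: int) -> float:
    """Return part/whole as a percentage, rounded to one decimal place."""
    return round(100 * part / whole, 1)


pro_form_share = share(pro_form_subjects, liegen_an_examples)
full_nominal_share = share(full_nominal_subjects, liegen_an_examples)
liegen_an_share = share(liegen_an_examples, total_liegen_examples)

print(pro_form_share, full_nominal_share, liegen_an_share)
```

On these figures, roughly nine out of ten liegen an constructions in the sample (91.7%) have a pro-form in subject/topic position, against 8.3% with a full nominal subject.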

There are numerous examples of this kind in the IdS corpus, as well as examples with different functions and prepositions (auf, bei, in, or even simple dative constructions); it is thus not possible to review them all here. Instead I would like to address a particular instance which seems to be problematic with respect to my thesis. The Duden dictionary reports the following (not strictly causal) examples:

(85) Es liegt ganz alleine bei dir, ob du teilnimmst oder nicht.
'It's up to you, whether you take part or not.'

(86) Die Entscheidung steht [ganz] bei dir, ob und wann mit der Sache begonnen wird.
'The decision is up to you, whether and when the matter is taken up.'

The use of liegen in (85) is consistent with our characterization because the attention is obviously shifted towards the actor, i.e. the person who is going to decide ("you"), who appears in the comment position (=> liegen). But what about (86)? Actually my informants weren't happy with this sentence and tended to reject it as 'very odd', but since it is reported in the authoritative Duden as well as in other dictionaries we shall try to account for it. The point is that in (86) the situation appears to be very similar to the one in (85) and nevertheless the verb used is stehen. The first difference between the two examples is that the subject position is lexically filled in (86) and not in (85), but this does not help, since (85a) is equally acceptable (it is actually much better than (86), according to native speakers):

(85a) Die Entscheidung liegt ganz alleine bei dir, ob du teilnimmst oder nicht.
'The decision is up to you, whether you take part or not.'

Duden itself quotes a similar example of this kind:

(87) Die Verantwortung liegt bei dir.
'The responsibility is up to you.'

But the corresponding sentence with stehen is totally unacceptable:

(88) *Die Verantwortung steht bei dir.


The difference between (86) and (88) lies of course in the lexical item involved: Entscheidung 'decision' implies a voluntary participation by the actor, whereas Verantwortung 'responsibility' is passively attached to the actor (a sort of effecting/affecting polarity). Thus, the 'relative' acceptability of (86) may be due to the fact that the use of stehen highlights an actor-like aspect in the trajector, which is implicit in Entscheidung 'decision' but not in Verantwortung 'responsibility'. Along the same lines, if the actor itself appears in the topic position, the only acceptable verb is stehen:

(89) Du stehst vor einer wichtigen Entscheidung.
'You are facing an important decision.'

(90) *Du liegst vor einer wichtigen Entscheidung.

(91) Du stehst vor dieser Verantwortung.
'You are facing this responsibility.'

(92) *Du liegst vor dieser Verantwortung.

Thus, we can observe that, if the cognitive figure (prototypically an actor) appears in its "natural" topic position, the only choice available is stehen (the "figure and topic" verb), but if the cognitive figure (the actor) is in the comment position, which is not its "natural" sentential position, a conflict of schemata may arise and lead to alternative solutions, as in (85a) and (86).

6. Stehen and liegen as complex categories

As with most lexical units, stehen and liegen are best represented as "complex categories" (Langacker 1987, ch. 10; see also Vandeloise 1988 for a characterization of length and width along this line). A good representation of a complex category should contain a schematic network of all the semantic nodes implicit in the lexical unit (Brugman 1981). The reason I cannot yet construct such a network is simply that my analysis is not complete, i.e. it does not include all the semantic, cognitive and usage-based information associated with the two verbs. The meanings I have identified so far probably represent some kind of nodes or node-like cognitive information packages, but at this point in my research they lack connection and appear too scattered in semantic-cognitive space to be organized into a proper network. What I have discussed in the previous sections are mainly the central schemata and some of their projections, without any pretence to a systematic description. Besides, the very character of these units - i.e. they are locative predications - implies a wider range of relationships including all or at least most of the locative verbs which form a sort of consistent semantic field and share the same basic property (BEING LOCATED) with different subspecifications. A reasonable network would thus also have to incorporate the analysis of such verbs as sitzen, stecken, hängen, sein, sich befinden and so on.

Despite these caveats, I would like to draw some (partial) conclusions from the work done so far. If a German native speaker is asked about his attitude toward stehen and liegen, he will invariably associate stehen with "vertical" and liegen with "horizontal". "Vertical" and "horizontal" seem to be global prototypes which represent the starting-points of every categorization (for prototype categorization see below). They can also be viewed as a sort of "superschema" which subsumes common characteristics of the central meaning nodes at a highly abstract level. We know that in radial categories the range of such superschemata does not necessarily extend over all nodes of the network, but we can assume in this case that "vertical" and "horizontal" cover at least the core of the network, if not more. At the immediately subordinate level I have isolated four main (core meaning) image-schemata for each verb; they represent the starting-points of the (potential) radial distribution of the nodes. Each of these image-schemata denotes a more specific instance of the conceptual content of the superschema and at the same time evokes a more complex image.
Starting with VERTICALITY, for instance, we have first the schema of the human body position (which could be paraphrased as "vertical in the sense of a human body in an upright position"), then the geometric schema (which could be paraphrased as "vertical in the sense of an up-down geometrical axis in perceptual space"), then the gravity schema ("vertical in the sense of the trajectory of a falling body onto the earth"), finally the saliency schema ("vertical in the sense of something which rises and stands out from the rest of the perceptual scene"). In the following I shall abbreviate these schemata respectively as: HUM VERT, GEOM VERT, GRAVITY VERT, SALIENCY VERT. It is clear that each of these schemata shares the more schematic (= more abstract) idea of verticality, but they represent at the same time its first instantiation into more experiential occurrences. These schemata also share some common characteristics: clearly "gravity verticality" evokes "geometrical verticality", "saliency verticality" evokes human body position etc. Thus, although useful for the development of a potential network (they would constitute the central prototypical nodes), these schemata should not be understood as clear-cut instances, but rather as peaks in a fairly continuous landscape of mutual influences.

The following table summarizes the distribution of the instances discussed in this paper for stehen (the numbers in brackets refer to the examples). They are all listed in columns under at least one of the central (core meaning) image-schemata mentioned above; they represent possible nodes radiating from the schema itself and are potentially located in the section of the semantic network which ultimately should be built up for stehen. Some of them appear under more than one schema: in a fully developed network they would be linked with two or more other nodes, each connected with a different instantiation of the core meaning. In some cases I have added "linking nodes" (in square brackets) which do not refer to any specific example treated in this paper but which may represent areas to be worked out more deeply in the future in order to find possible missing nodes between the ones already discussed here. The ordering of the nodes from top to bottom reflects a tendency for gradation from the more concrete to the less concrete senses, but it should be pointed out that this arrangement is only intuitive.
In addition, increasing dynamism can be noted in the image-schemata from left to right: GEOM VERT is the most static image, HUM VERT and SALIENCY VERT are potentially connected with movement, and GRAVITY VERT intrinsically implies movement or pressure; that is, some sort of dynamism is involved in almost every kind of instantiation.
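
The relations between the superschema and its four image-schemata can also be written down as a small data structure. The Python sketch below is merely one possible encoding, assuming a hypothetical ImageSchema record; it records only the paraphrases and the two overlaps explicitly named above, and it does not anticipate the full network, which remains to be worked out.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class ImageSchema:
    """One core image-schema instantiating a superschema (hypothetical record)."""
    name: str
    paraphrase: str
    evokes: List[str] = field(default_factory=list)  # overlaps named in the text


SUPERSCHEMA = "VERTICALITY"

# Listed from left to right, i.e. from the most static to the most dynamic image.
STEHEN_SCHEMATA = [
    ImageSchema("GEOM VERT", "up-down geometrical axis in perceptual space"),
    ImageSchema("HUM VERT", "human body in an upright position"),
    ImageSchema("SALIENCY VERT",
                "something which rises and stands out from the scene",
                evokes=["HUM VERT"]),
    ImageSchema("GRAVITY VERT",
                "trajectory of a falling body onto the earth",
                evokes=["GEOM VERT"]),
]


def overlaps(schemata: List[ImageSchema]) -> List[Tuple[str, str]]:
    """Return the (source, target) pairs where one schema evokes another."""
    return [(s.name, target) for s in schemata for target in s.evokes]


print(overlaps(STEHEN_SCHEMATA))
```

Storing the overlaps as links between schemata, rather than as separate categories, reflects the view that the schemata are peaks in a continuous landscape and not clear-cut instances.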


Table 2. STEHEN: VERTICALLY LOCATED

GEOM VERT | HUM VERT | SALIENCY VERT | GRAVITY VERT

object in vert position (4, 8a)
human upright position (1)
perceptual location (24)
(impact to ground)
vertical trajectory (28)
human 'base'
(figure)
BASE (9-11)
(vertical axis)
BASE (9-11)
movement (65-68, 72)
scale (60-64)
stability (29-30)
focus on figure (25)
movement down
focus on topic (75-82)
(pressure down)
(column)
(health)
writing (39-45)
positive attitude (38)
movement (65-68, 72)
control (46-50)
support (36-37)
(values)
scale (60-64)

As for liegen, the four central image-schemata related to the more general "superschema" of HORIZONTALITY can be determined as follows: 1) HUM HOR ("horizontal in the sense of a human body in the prone position"); 2) GEOM HOR ("horizontal in the sense of a left-right/front-back geometrical axis in perceptual space"); 3) SALIENCY HOR ("horizontal in the sense of something which does not emerge and remains flat in perceptual space"); 4) DISTRIB HOR ("horizontal in the sense of something equally distributed on the perceptual scene"). Like the vertical schemata, these schemata also overlap to a certain degree; especially SALIENCY HOR and DISTRIB HOR seem to have very much in common. The tentative arrangement of the liegen examples along this line is summarized in Table 3.

Categorization by schema does not always coincide with categorization by prototypes (Langacker 1987). In order to detect which of the image-schemata are considered more prototypical by a native speaker I designed the following two tests. In the first one a set of 40 different names for objects, chosen on the basis of their shape and orientation, was given to 15 native speakers, who were asked to match them with the corresponding verb. In the second test 15 different native speakers were asked to write down the best instances of objects to be associated with stehen or with liegen. Since the tests were performed in a very informal way, I am only going to summarize the results; although basically reliable, they should be viewed with caution. The answers in the first test were substantially consistent with examples (1-15) in the previous sections, with the interesting exception of Würfel 'dice', more often associated with liegen (80%) but also sometimes with stehen (20%). I have characterized dice as substantially "symmetrical"; it must be recognized, though, that dice have more than a faint resemblance to objects exhibiting a real functional "base" and requiring stehen (such as boxes, cans etc.), so that it is fully understandable why 20% of the native speakers matched it with stehen.
Again, if different images can apply to the same context, language turns out to be sensitive to all of them at least to a certain degree (see also examples (85)-(92)). "Symmetrical" objects not only sometimes undergo a double categorization, as we have just seen, but they are also fairly marginal in the consciousness of native speakers.


Table 3. LIEGEN: HORIZONTALLY LOCATED

GEOM HOR | HUM HOR | SALIENCY HOR | DISTRIB HOR

object in hor position (4a, 5, 7, 8, 9a)
path (6)
human in lying position (2)
(not salient)
(object with no salient dimension)
(weak)
focus on ground (26)
symmetrical object (13-15)
ill (31)
focus on comment (83-85)
non-decidable position (83-85)
suffering (33)
abandoned/static (34-35, 70-71)
dead (32)
not working (69)
not perceptual object (23)
geotopographical location (21-23)
geometric (18-20, 42)
(one-dimensional axes)
time (51-53)
measure (54-55)
numerical quantity (56-57)


Analyzing the data resulting from the second test (best instances of objects for both verbs), I noticed that stehen is associated in the first instance with human beings Mensch, Mann, Frau 'human being, man, woman', which I will call type A. Among non-humans the most frequent matching was with objects maximally extended along the vertical axis and possessing a base (Steh)lampe, Flasche, Vase '(standard) lamp, bottle, vase' (type B), then objects with a base but with no particularly salient vertical extension: Glas, Dose, Tisch, Stuhl 'glass, can, table, chair' (type C). Finally, stehen is also associated with objects usually seen as maximally extended along the vertical axis but which potentially can also be laid horizontally: Schirm, Buch 'umbrella, book' (type D). As for liegen, we have again in first place human beings Kinder, Junge, Baby, Mensch 'children, boy, baby, human being' (type E); among objects, first we have objects maximally extended along the horizontal axis: Zeitung, Kleid, Heft, Blatt 'newspaper, dress, notebook, sheet of paper' (type F), and then geographical or topographical places: Dorf, Stadt, Kino, Laden 'village, town, cinema, shop' (type G). Out of about 240 items there was not a single one denoting a "symmetrical" object. These results underline the centrality to semantics of experiences related to the human body (types A and E). But they also show a correlation between the central schemata and the responses of informants: since stehen is connected with the images of base and verticality, the subjects first selected more central objects that exhibit both features (type B); then they selected objects with decreasing saliency of the vertical dimension (not very high objects [type C]); finally not typically 'vertical' objects (objects which often can be seen in a horizontal position as well [type D]). 
Thus, the more the object departs from the central features of the image, the more it appears to be marginal in the consciousness of the native speakers. Speaking again of liegen, "horizontal" objects (type F) seem to be more central, then geotopographical "objects" (type G). This latter class of objects doesn't seem to be consistent with the view expressed above (the closer to the characteristic features, the more central): although they belong to very common life experiences, geotopographical objects are not particularly "horizontal-like", no more so than many other types of objects. Clearly, both the frequency and the social relevance of geotopographical entities play an important role in this case. Again, we can see here two types of conflicting parameters at work: on the one hand, perceptual experience, which is always relevant in human categorization; on the other hand, the social saliency of certain phenomena, which can sometimes be very complex and are therefore symbolized by more complex image-schemata. It is interesting to notice that no "symmetrical" object was mentioned by the informants, although such objects should be fairly common: the lack of dimensional saliency in these objects might account for this phenomenon.

7. Conclusion

The analysis of liegen and stehen has shown that horizontality and verticality, as represented in the semantics of these verbs, cannot be equated with simple features. Complex, almost Gestalt-like schemata, linked with (but not necessarily derived from) basic perceptual and psychological experiences, are at work here. Figurative and metaphorical extensions are somewhat related to mental images, but these too are more or less directly connected with the basic schemata of the core meanings. Thus, a semantic continuum can be detected in each verb, starting from meaning nodes related to perceptual experiences and moving along towards meanings which refer to abstract concepts or figurative images. The mapping of this continuum would be best represented by a semantic network with a radial category structure, which is still to be worked out.

References

Barron, Roger 1982 "Das Phänomen klassifikatorischer Verben", in: Hansjakob Seiler & Christian Lehmann (eds.), Apprehension. Das sprachliche Erfassen von Gegenständen. Tübingen: Narr, 133-46.
Bierwisch, Manfred 1967 "Some semantic universals of German adjectivals", Foundations of Language 3: 1-36.
Braine, Lila G. 1978 "A new slant on orientation perception", American Psychologist 33: 10-22.
Brugman, Claudia 1981 The story of over. Berkeley: University of California master's thesis.
Brugman, Claudia 1988 The story of over: Polysemy, semantics, and the structure of the lexicon. Garland Outstanding Dissertations in Linguistics Series. New York: Garland.
Casad, Eugene & Ronald Langacker 1985 "'Inside' and 'outside' in Cora grammar", International Journal of American Linguistics 51,3: 247-282.
Clark, Herbert 1973 "Space, time, semantics, and the child", in: Timothy Moore (ed.), Cognitive development and the acquisition of language. New York: Academic Press, 27-63.
Clark, Herbert & Eve Clark 1978 "Universals, relativity, and language processing", in: Joseph Greenberg (ed.), Universals of human language, 1. Stanford: Stanford University Press, 225-277.
Eichinger, Ludwig 1989 Raum und Zeit im Verbwortschatz des Deutschen. Eine valenzgrammatische Studie. Tübingen: Niemeyer.
Fagan, Sarah to appear The semantics of the primary positional predicates in German.
Fisher, Cecilia 1990 "Children's discrimination of orientation: Development and determinants", Annals of Child Development 7: 43-72.
Gerling, Martin & Norbert Orthen 1979 Deutsche Zustands- und Bewegungsverben. Eine Untersuchung zu ihrer semantischen Struktur und Valenz. Tübingen: Narr.
Goldstein, Bruce 1984 Sensation and perception. Belmont: Wadsworth.
Hoijer, Harry 1945 "Classificatory verb stems in the Apachean languages", International Journal of American Linguistics 11: 13-23.
Johnson, Mark 1987 The body in the mind. The bodily basis of meaning, imagination, and reason. Chicago: University of Chicago Press.
Lakoff, George 1987 Women, fire, and dangerous things: What categories reveal about the mind. Chicago: University of Chicago Press.
Lakoff, George & Mark Johnson 1980 Metaphors we live by. Chicago: University of Chicago Press.
Lang, Ewald 1987 "Semantik der Dimensionsauszeichnungen räumlicher Objekte", in: Manfred Bierwisch & Ewald Lang (eds.), Grammatische und konzeptuelle Aspekte der Dimensionsadjektive. Berlin: Akademie, 287-458.
Lang, Ewald 1990 "Primary perceptual space and inherent proportion schema: Two interacting categorisation grids underlying the conceptualisation of spatial objects", Journal of Semantics 7: 121-141.
Langacker, Ronald 1987 Foundations of cognitive grammar. Vol. 1: Theoretical prerequisites. Stanford: Stanford University Press.
Langacker, Ronald 1991 Foundations of cognitive grammar. Vol. 2: Descriptive application. Stanford: Stanford University Press.
Nagy, William 1974 Figurative pattern and redundancy in the lexicon. Ph.D. diss., University of California at San Diego.
Oosten, Jeanne van 1986 "Sitting, standing and lying in Dutch: A cognitive approach to the distribution of the verbs zitten, staan and liggen", in: Jeanne van Oosten & Johan Snapper (eds.), Dutch linguistics at Berkeley. Berkeley: UCB, 137-160.
Serra Borneto, Carlo 1983 Konfrontative Untersuchung zu deutschen und italienischen Kausalzeichen. Ph.D. dissertation, Humboldt Universität Berlin.
Serra Borneto, Carlo 1989 "Per una analisi cognitivista dei verbi stehen e liegen in tedesco", Studi italiani di linguistica teorica e applicata XVIII: 343-382.
Talmy, Leonard 1978 "Figure and ground in complex sentences", in: Joseph Greenberg (ed.), Universals of human language, 4. Stanford: Stanford University Press, 625-49.
Talmy, Leonard 1983 "How language structures space", in: Herbert Pick & Linda Acredolo (eds.), Spatial orientation, theory, research, and application. New York and London: Plenum, 225-283.
Traugott, Elizabeth C. 1985 "On regularity in semantic change", Journal of Literary Semantics XIV/3: 155-173.
Vandeloise, Claude 1988 "Length, width, and potential passing", in: Brygida Rudzka-Ostyn (ed.), Topics in cognitive linguistics. Amsterdam: Benjamins, 403-27.
Watkins, Laurel J. 1976 "Position in grammar: Sit, Stand and Lie", Kansas Working Papers in Linguistics 1: 16-41.
Wunderlich, Dieter 1985 "Raum, Zeit und das Lexikon", in: Herbert Schweizer (ed.), Sprache und Raum. Stuttgart: Metzler, 67-89.

The semantics of the Chinese verb "come" Ya-Ming Shen

0. Introduction

The Chinese morpheme lai, meaning "come", has a number of extended uses both as a verb and as various kinds of particles. All of these uses can be related to its basic sense of "come". In this paper, my interest is limited to the meanings of lai when it behaves as a full verb. I will give a preliminary account of the distribution of the verb lai in different sentence patterns, and the different relationships between the verb lai and its subject as well as its postverbal complement. My goal is to present an integrated analysis for both syntactic and semantic structure: how the semantic structures of lai differ from each other in different sentence types, and how these meanings are interrelated by means of a semantic network.

The theoretical framework adopted here is Cognitive Grammar as elaborated in Langacker (1984, 1986, 1987a, 1987b, 1987c, 1988, 1991). This framework claims that grammatical structures do not constitute an autonomous formal system. Instead, it is assumed that all grammatical structures are inherently symbolic, and every grammatical unit is bipolar, consisting of a semantic unit at one pole which is overtly realized by a phonological unit at the other pole. Semantic structures are characterized relative to cognitive domains, and any type of conceptualization is capable of serving as a domain for this purpose. The inherent properties of a situation, an entity, or an interaction can be construed in various ways, so in order to capture speakers' crucial ability of construal, this theory also provides a set of concepts and representations.

Because of space limitations, an introduction to the theory will not be provided and some knowledge of the basic concepts and representations of the approach is presupposed. However, I would like to mention the following notions of Cognitive Grammar which are particularly important to the analysis presented in this paper. The first is the "profile/base" organization. The base of an expression is simply its domain, and its profile is the entity that the expression "designates". The second is the "degree of prominence" scale, which reflects an element's relative salience depending on various contributing factors. This is particularly useful in my discussion of two variants of lai in Section 1.1.2 and thereafter. The third is the "setting/participant" asymmetry, which sheds light on my proposals of the "destination subject" in Section 1.2.1 and of the "source subject" in Section 1.2.2. A participant is an entity thought of as participating in a relationship, as opposed to merely providing the "setting" (the global base) for its occurrence, and participants interact with each other but occupy portions of a setting (or location). The fourth is "domain shifting", the speaker's metaphorical ability to conceptualize similar content in different cognitive domains, such as the spatial, temporal, or mental. The fifth is the "subjectivity/objectivity" distinction, defined as the extent to which an entity functions asymmetrically as the subject versus the object of conception. These two concepts make possible the linking between a spatial verb and a mental verb, as I show in Section 2. The last is "active zone", that is, the facet of an entity that most directly interacts with a given domain, or participates in a given relationship. This notion explains the derivation of the schematic verb lai discussed in Section 2.2. Besides these conceptual notions, I will also provide brief notes about the technical terms and symbols used in my figures as well as in my discussion, when they are introduced.

1. Lai as a spatial motion verb and its subject types

The uses of the verb lai to be discussed in this section all involve spatial motion. First, I will introduce the prototype of lai and its two variants when there is a preverbal theme.¹ Then, I will investigate four types of subjects which can occur with lai when the theme itself is postverbal, that is, when the entity in motion does not serve as the subject of lai. I will also discuss how the different subject types affect the semantic structure of lai.


1.1. Theme subject: lai with a preverbal theme

1.1.1. The prototype (lai-1)

The most basic meaning of the Chinese verb lai is almost the same as "come" in English. In this sense, the subject, which is the theme, moves toward the speaker's position along a spatial path.

(1) Ta yijing lai le.
    he already come PERF²
    'He has come already.'

(2) Ta mingtian hui lai.
    he tomorrow will come
    'He will come tomorrow.'

Figure 1 illustrates the semantic structure of the prototype of lai (lai-1).

Figure 1. The prototype lai (lai-1) 'to come'

Here, the relevant domains are space (the outside frame, the base) and time (the arrow marked t at the bottom of the base).³ With the passage of time, the profiled trajector⁴ (the small circle marked tr) is the theme moving through space. The motion is towards a landmark, a spatial location (the square marked lm1). This destination of the theme is the primary landmark and thus is profiled. The trajectory of the theme, the path, is also profiled (the arrow linking the trajector and the destination landmark, the primary landmark). There is another landmark which is also a spatial location (the square marked lm2). This secondary landmark is the theme's departure point, the source, and is not profiled. The speaker views the event from the vantage point (the small circle marked VP) within the primary landmark, the theme's destination.

1.1.2. Two variants (lai-2 and lai-3)

Although the profiled trajectory of the Chinese verb lai basically includes the final stage, that is, the theme's landing point or destination, as shown in Figure 1,5 it is also possible for only some part of the trajectory to be profiled, and the final stage to be nonsalient. Thus, the uses of lai in the following linguistic contexts are perfectly acceptable in Chinese, although the literal translations (in the parentheses under the free translations) for these sentences are ungrammatical in English.

(3)  Ta yijing lai le, xianzai zheng zai lu-shang ne.
     he already come PERF now PROG at way-on (LOC) PRT
     'He has left for here already, and he is on the way right now.'
     (Lit. '*He has come already, and he is on the way right now.')

(4)  Ta yijing cong xuexiao lai le, keshi banlu you hui-qu le.
     he already from school come PERF but halfway again return-go PERF
     'He had already started coming here from school, but he went back halfway.'
     (Lit. '*He had come from school already, but he went back halfway.')


Figure 2 illustrates this variant of lai (lai-2 'start-to-come' hereafter). There, both the trajector (tr) and the destination landmark (lm1) are profiled. However, only part of the trajectory is profiled, excluding the final stage.


Figure 2. lai-2 'start-to-come'

In contrast with lai-2 'start-to-come', another use of lai allows the speaker to focus on the final stage, especially when the theme subject is an inanimate noun or noun phrase. In this case, the meaning of lai approximates the sense of 'arrive'. I call this variant lai-3 'come-arrive'. In this usage, the final stage is most in focus; thus, the contexts acceptable for lai-2 'start-to-come' are impossible for lai-3 'come-arrive'.

(5)  a. Na feng xin yijing lai le.
        that CL letter already come PERF
        'That letter came/arrived here already.'
     b. *Na feng xin yijing lai le, xianzai zheng zai lu-shang ne.
        that CL letter already come PERF now PROG at way-LOC PRT
        '*The letter has come/arrived already, and it is on the way right now.'
     c. *Na feng xin yijing lai le, keshi banlu you hui-qu le.
        that CL letter already come PERF but halfway again return-go PERF
        '*The letter has come/arrived already, but it went back halfway.'

Figure 3. lai-3 'come-arrive'

Figure 3 illustrates the semantic structure of this variant, lai-3 'come-arrive'. In this figure, the endpoint of the trajectory is profiled, while the rest of it is relatively nonsalient, especially the point of departure, which is not profiled. Certainly, the distinction between this variant, lai-3 'come-arrive', and the prototype, lai-1, is not clear-cut, since the meaning of the latter, the prototype, already includes the final stage. However, let us compare the following pair of sentences, in which (a) involves an animate subject and (b) an inanimate subject.

(6)  a. Na ge ren cong Shanghai lai le.
        that CL man from Shanghai come PERF
        'That man has come from Shanghai.'
     b. ?Na feng xin cong Shanghai lai le.
        that CL letter from Shanghai come PERF
        '?/*That letter has come/arrived from Shanghai.'


The verb lai in sentence (6a) instantiates the prototype lai-1, in which the trajectory of the theme is profiled from the very beginning to the end, so it is perfectly acceptable if an overt source phrase is inserted in the same clause. However, the verb lai in sentence (6b) has the meaning of lai-3 'come-arrive', in which the initial stage of the motion is not in focus. Therefore, an overt source phrase is not preferred in the same sentence. On the other hand, lai-3 'come-arrive' is not equivalent to the sense of 'arrive' either. As the above example shows, an overt source phrase is only dispreferred with lai-3 'come-arrive', but it is totally disallowed from cooccurring with the Chinese verb dao, which is the corresponding form for the English verb 'arrive', as seen in (7).

(7)  a. Na feng xin yijing dao le.
        that CL letter already arrive PERF
        'That letter has arrived already.'
     b. *Na feng xin yijing cong Shanghai dao le.
        that CL letter already from Shanghai arrive PERF
        '*That letter has arrived from Shanghai already.'

Figure 4. dao 'arrive'


Figure 4 illustrates the Chinese verb dao 'arrive'.6 There, the theme's departure point is not only out of focus (compare Figure 3 for lai-3 'come-arrive'), but is not even a part of the picture, the base. This explains why inserting a source phrase is totally ungrammatical in a dao 'arrive' sentence, while it is grammatically marginal, but still acceptable, in a lai-3 'come-arrive' sentence.

1.1.3. The overt realization of the landmarks

I would like to point out that although the verb come in English can only be followed by a prepositional phrase or by an adverb, the overtly realized destination landmark of the Chinese verb lai can be a simple noun or a noun phrase.

(8)  Ta mingtian hui lai wo jia.
     he tomorrow will come I home
     'He will come to my place tomorrow.'

(9)  Ta yijing lai guo liang ci Meiguo.
     he already come PAST two time America
     'He has been to the US twice already.'

Interestingly, when the destination landmark is overtly realized, the verb lai is not easily construed as lai-2 'start-to-come'.

(10) ??Ta yijing lai wo jia le, banlu you hui-qu le.
     he already come I home PERF halfway again back PERF
     '*He had come to my place already, but he went back halfway.'

One possible explanation for this is as follows: When the destination landmark of lai is overtly realized, it is relatively more salient, so that the final stage of the trajectory is emphasized and must be profiled. This is in conflict with the sense of lai-2 'start-to-come', where the final stage of the trajectory is not profiled.


However, an overtly realized destination landmark alone is not enough to make the verb lai have the meaning of lai-3 'come-arrive'. As the following example shows, both an overt source phrase and an overt destination phrase can cooccur in the same sentence.

(11) Ta yijing cong xuexiao lai wo jia le.
     he already from school come I home PERF
     'He has already come to my house from the school.'

Figure 5 illustrates the semantic structure of a lai sentence with an overtly realized source landmark, whether it occurs with an overt destination (as in (11) above) or without an overt destination (as in (12) below).

Figure 5. lai with an overtly realized source landmark

(12) Ta yijing cong xuexiao lai le.
     he already from school come PERF
     'He has already come from the school.'

The only difference between Figure 1, which is meant for the prototype lai (lai-1), and Figure 5 is as follows: in the former, the secondary landmark, the source, is not profiled, while in the latter it is profiled and overtly realized.


1.2. Nontheme subject: lai with a postverbal theme

In this subsection, I will discuss the four different types of subjects which can occur with lai when the theme is not in preverbal position to serve as a subject 7, but rather is a postverbal complement. Also, I will investigate how the meaning of lai is affected by these subject types.

1.2.1. Destination subject

In the following sentence, the theme is in a postverbal position, and the subject of lai is not an agent, but rather a destination, which is the theme's final landing point.

(13) Wo jia lai le liang ge keren.
     I home come PERF two CL guest
     'There were two guests coming to my house.'
     (Lit. 'My house has come two guests.')

Following Langacker's idea of "setting subject" (Langacker 1987c), the "destination subject" I describe here can be considered as a subtype of "locative subject". Figure 6 represents the semantic structure for lai in this sense.8

Figure 6. lai with a destination subject


In Figure 6, the theme's destination is construed as the subject, so that it is now the profiled trajector (indicated with the boldface square marked tr) instead of a landmark. Promoting the destination to this most prominent status then expands the profile to include the relationship between the locative/goal subject and the activity (indicated by a dashed boldface line linking the trajector and the trajectory). Accordingly, the theme is demoted to the status of a primary landmark, and it too is profiled. The trajectory is also profiled. The source, which is the secondary landmark, is not profiled. It is worth noting that, with a destination as its subject, the verb lai cannot be construed as lai-2 'start-to-come', and thus the following sentence is ungrammatical in Chinese.

(14) *Wo jia yijing lai le liang ge keren, keshi banlu you hui-qu le.
     I home already come PERF two CL guest but halfway again return PERF
     '*Two guests were already coming to my house, but they went back halfway.'

It seems that the destination subject, which has the highest degree of prominence in the speaker's mind, ensures that the final stage of "coming", the arrival, will be the most salient part of the motion. Therefore, the endpoint of the trajectory must be profiled. This is incompatible with the sense of lai-2 'start-to-come', where the final stage of the trajectory must not be profiled. Thus, (14) shows that the sense of lai-2 'start-to-come' is not allowed with a destination subject. The difference between (14) and (15) also lends support to this claim.

(15) ?Tamen yijing lai wo jia le, keshi banlu you hui-qu le.
     they already come I home PERF but halfway again return-go PERF
     'They had left for my house already, but they went back halfway.'
     (Lit. '*They had come to my house already, but they went back halfway.')


The destination wo jia 'my house' in (14) is a subject, which is the most salient part of the sentence, whereas the destination wo jia 'my house' in (15) is the primary landmark, which only has secondary salience. Therefore, the different degrees of salience of wo jia 'my house' in (14) and (15) explain why the context for lai-2 'start-to-come' is unacceptable in (14), which takes a destination subject, but only marginal in (15), where the overt destination serves only as the primary landmark.

1.2.2. Source subject

In the following lai sentences, like the examples in section 1.2.1, the theme is in a postverbal position, and the subject is a spatial location as well. However, this location is not the destination, but rather the source of the theme.

(16) Shanghai lai le liang ge ren.
     Shanghai come PERF two CL man
     'Two men have come from Shanghai.' 9
     (Lit. '*Shanghai has come two men.')

Like the destination subject I discussed above, a source subject is another subtype of "locative subject".

Figure 7. lai with a source subject (spatial location)


Figure 7 shows the semantic structure of the variant of lai which takes a source subject. In this figure, the source is now not only profiled, but also elevated to serve as the trajector, the most prominent part within the base. The prominence of the theme as well as of the destination are accordingly reduced. So, the theme here is the primary landmark, and the destination the secondary landmark. However, they both are still profiled, and the trajectory is profiled also. Since the source now serves as the subject and therefore has the highest degree of salience, it selects the initial stage as the most prominent part of the trajectory. It is then expected that the meaning of lai-2 'start-to-come' should be possible for the sentences with a source subject. The following examples support this prediction.

(17) Shanghai yijing lai le liang ge ren, xianzai zhengzai lu-shang ne.
     Shanghai already come PERF two CL man now PROG way-on (LOC) PRT
     'There have been two men coming from Shanghai already, and they are on the way right now.'
     (Lit. '*Shanghai has come two men already, and they are on the way right now.')

(18) Shanghai yijing lai le liang ge ren, keshi banlu you hui-qu le.
     Shanghai already come PERF two CL man but halfway again return PERF
     'There had been two men coming from Shanghai already, but they went back halfway.'
     (Lit. '*Shanghai had come two men already, but they went back halfway.')
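The pattern observed so far for the lai-2 'start-to-come' contexts can be restated as a small lookup, offered only as a reading aid. The sketch below is in Python; the class names, the stage labels, and the function allows_lai_2 are hypothetical conveniences introduced here, not part of the analysis itself. It assumes that lai-2 is excluded exactly when the subject type forces the final stage of the trajectory to be profiled, and it ignores graded judgments such as the marginal status of (15).

```python
from enum import Enum


class Subject(Enum):
    """Subject types discussed up to this point (labels are illustrative)."""
    ANIMATE_THEME = "animate theme subject"      # examples (3)-(4)
    INANIMATE_THEME = "inanimate theme subject"  # example (5), lai-3
    DESTINATION = "destination subject"          # example (14)
    SOURCE = "source subject"                    # examples (17)-(18)


class Stage(Enum):
    """Portion of the trajectory that a subject type makes most salient."""
    INITIAL = "initial stage"
    FINAL = "final stage"
    NONE_FORCED = "no stage forced"


# Assumed encoding of the prose: which stage each subject type selects.
MOST_SALIENT_STAGE = {
    Subject.ANIMATE_THEME: Stage.NONE_FORCED,
    Subject.INANIMATE_THEME: Stage.FINAL,
    Subject.DESTINATION: Stage.FINAL,
    Subject.SOURCE: Stage.INITIAL,
}


def allows_lai_2(subject: Subject) -> bool:
    """lai-2 leaves the final stage unprofiled, so it is blocked
    whenever the subject type forces the final stage to be profiled."""
    return MOST_SALIENT_STAGE[subject] is not Stage.FINAL


if __name__ == "__main__":
    for subject in Subject:
        verdict = "possible" if allows_lai_2(subject) else "excluded"
        print(f"{subject.value}: lai-2 {verdict}")
```

The binary outcome is a simplification: the judgments reported in the examples are graded (?, ??, *), and a fuller model would need a scale rather than a boolean.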

Although the source subject is usually a spatial location, it does not have to be, because the relationship between the theme and the source can also be a part-whole relation. Compare the following two pairs of sentences:


(19) a. Tamen lai le liang ge ren_j.
        they come PERF two CL man
        'Two of them have come.'
     b. *Ta lai le liang/yi ge ren.
        he come PERF two/one CL man
        '*Two/one of him has come.'

(20) a. Tamen-nar lai le liang ge ren.
        they-there come PERF two CL man
        'Two people_j have come from their_j place/organization.'
     b. Ta-nar lai le liang ge ren.
        he-there come PERF two CL man
        'Two people have come from his place/organization.'

In (19) the relationship between the theme and the source is a part-whole relation, so the plural subject in (19a) is grammatically appropriate but the singular form in (19b) is inappropriate (even when the theme is also singular), since the whole must be large enough to properly contain the part. Figure 8 shows the semantic structure of this type of sentence.

Figure 8. lai with a source subject (part-whole relation)


In (20), the source subjects have a locative marker (-nar '-there', in these two examples) to show they are the location (in a broad sense) where the theme originates. So, it does not matter whether the subject stem is singular (b-case) or plural (a-case) even if the theme is plural.

1.2.3. Agent subject

In the following lai sentences, the theme is in a postverbal position also. However, the preverbal nominal, the subject, is not a location, as was the case in sections 1.2.1 (destination) and 1.2.2 (source), but rather a participant, an agent.

(21) Ta lai le yi feng xin.
     he come PERF one CL letter
     'He sent me/us a letter.'

(22) Ta mingtian hui lai dianhua.
     he tomorrow will come telephone
     'He'll call me/us tomorrow.'

In these cases, the theme is usually inanimate, and it cannot move by itself. Figure 9 illustrates the semantic structure of lai in this usage.

Figure 9. lai with an agent subject


Here, an agent serves as the trajector, a theme as the primary landmark, and both are profiled. The energy transfer between the trajector and the theme is indicated by the profiled double arrow. The trajectory (single arrow) and the theme's destination (rightmost square), the secondary landmark in this sense, are also profiled. The leftmost square marked /mj is the source landmark, where the agent stays and the theme comes from, and it is not profiled.10 In section 1.1.2, I pointed out that an inanimate theme subject of lai is unacceptable in the contexts for lai-2 'start-to-come'. However, in the sentences where lai takes an agent subject, although the theme is usually inanimate, the contexts for lai-2 'start-to-come' are acceptable, as seen in the following examples.

(23) Ta yijing lai xin le, xianzai dagai zheng zai banlu-shang.
     he already come letter PERF now perhaps PROG at halfway-on (LOC)
     'He has sent me/us a letter already, and it is perhaps halfway now.'

(24) Ta qunian gei wo lai guo san feng xin, keshi wo meiyou shou-dao, dagai banlu diu le.
     he last year to I come PAST three CL letter but I haven't receive perhaps halfway lost PERF
     'He sent me/us three letters last year, but I didn't receive them, so they might have been lost halfway.'

Since in these sentences, the agent's position is also the departure point of the theme, it is then associated with the most salient part of the trajectory, and the speaker is thus able to construe the verb lai as lai-2 'start-to-come'. The question arises as to how the agent subject can be distinguished from a source subject without a locative marker. I suggest the following two tests.


For the first test, adjust the number of the subject nominal. If the subject is a source (whole) of the theme (part), it must be able to be construed as a plural noun, 11 a group noun, or a mass noun, which is large enough to properly include the theme, whereas an agent is not restricted in this way. So, the subject tamen 'they' in (25a) is a source, since its number must be plural, regardless of whether the theme is plural (as in (25b)) or singular (as in (25c)). On the other hand, the plural subject tamen 'they' in (26a) is an agent, since a singular form ta 'he' can occur in the same context, as in (26b).

(25) a. Tamen lai le liang ge ren.
        they come PERF two CL man
        'Two of them have come.' (Lit. *'Two from them have come.')
     b. *Ta lai le liang ge ren.
        he come PERF two CL man
        '*Two of him has come.' (Lit. *'Two men from him came.')
     c. *Ta lai le yi ge ren.
        he come PERF one CL man
        '*One of him has come.' (Lit. *'One man from him came.')

(26) a. Tamen lai le liang feng xin.
        they come PERF two CL letter
        'They sent me/us two letters.'
     b. Ta lai le liang feng xin.
        he come PERF two CL letter
        'He sent me/us two letters.'

For the second test, insert an overt source phrase. This test is applied in the following pair of examples: The subject tamen 'they' in (27a) is a source itself; an additional locative source thus is extremely marginal. On the other hand, the subject tamen 'they' in (27b) is an agent, so an overt source phrase is acceptable.


(27) a. */??Tamen cong Shanghai lai le liang ge ren.
        they from Shanghai come PERF two CL man
        'Two of them came from Shanghai.' (While the rest of them remained in Shanghai.)
     b. Tamen cong Shanghai lai le yi feng xin.
        they from Shanghai come PERF one CL letter
        'They sent me/us a letter from Shanghai.'

This distinction can be further exemplified as in (28) below. In (28a), the subject is the source, which is easy to recognize because of the locative marker (-nar '-there'), and cannot cooccur with another source phrase in the same clause. In (28b), by contrast, the subject is the agent, which is also easy to recognize because it is singular, and an overt source phrase is perfectly acceptable.

(28) a. *Ta-nar cong Shanghai gei wo lai le yi feng xin.
        he-there from Shanghai to I come PERF one CL letter
        '*A letter to me came from him from Shanghai.'
     b. Ta cong Shanghai gei wo lai le yi feng xin.
        he from Shanghai to I come PERF one CL letter
        'He sent me a letter from Shanghai.'
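The two tests can also be written out as a decision procedure over acceptability judgments. The following Python sketch is only an illustration under stated assumptions: the Judgments record, its field names, and the function classify_subject are hypothetical, the judgments are reduced to yes/no values, and the "unclear" outcome for conflicting test results is a fallback added here rather than a category proposed in the text.

```python
from dataclasses import dataclass


@dataclass
class Judgments:
    """Native-speaker judgments about a postverbal-theme lai sentence."""
    singular_subject_ok: bool      # test 1: acceptable with a singular subject?
    overt_source_ok: bool          # test 2: acceptable with an added 'cong X' phrase?
    has_locative_marker: bool = False  # e.g. -nar '-there' on the subject


def classify_subject(j: Judgments) -> str:
    """Return 'source', 'agent', or 'unclear' for the preverbal nominal."""
    # A locative marker identifies a source directly, as in (20) and (28a).
    if j.has_locative_marker:
        return "source"
    # An agent may be singular and tolerates an overt source phrase.
    if j.singular_subject_ok and j.overt_source_ok:
        return "agent"
    # A part-whole source must be plural and rejects a second source.
    if not j.singular_subject_ok and not j.overt_source_ok:
        return "source"
    # Conflicting results: left undecided (an assumption of this sketch).
    return "unclear"


if __name__ == "__main__":
    # (25)/(27a): tamen lai le liang ge ren
    print(classify_subject(Judgments(False, False)))
    # (26)/(27b): tamen lai le liang feng xin
    print(classify_subject(Judgments(True, True)))
    # (28a): ta-nar ... lai le yi feng xin
    print(classify_subject(Judgments(True, False, has_locative_marker=True)))
```

Treating the locative marker as decisive before the two tests are consulted mirrors the order in which the text introduces the diagnostics, since the tests are proposed only for subjects without such a marker.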

Sometimes, however, these two types, the source subject and the agent subject, may overlap, especially when the subject can be construed both as a location and as an organization like Beijing Daxue 'Beijing University' in (29).

(29) Beijing Daxue lai le xuduo xuesheng.
     Beijing university come PERF many student
     'There are a lot of students coming from Beijing University.'
     (or 'Beijing University sent a lot of students here.')

1.2.4. Null subject

When the theme is in postverbal position, the subject position can be filled by a noun or a noun phrase serving as a destination, a source, or an agent, as discussed above. Moreover, it can also be phonologically zero, as in the following examples.

(30) Zuotian lai le yi ge keren.
     yesterday come PERF one CL guest
     'There was a guest coming yesterday.'

(31) Zuotian lai le san feng xin.
     yesterday come PERF three CL letter
     'There were three letters coming/arriving yesterday.'

These sentences have been argued to be "subjectless sentences" (there is no subject at all in these sentences) or "subject-verb inverted sentences" (the postverbal theme is actually the subject there) by some Chinese linguists.12 I, however, believe that there is a null subject which is the understood destination of the postverbal theme in these cases. The following observations support my conjecture. First, if the above sentences are subjectless or subject-verb inverted, a destination, wo jia 'my house', for example, should be able to be overtly realized in a nonsubject position. However, the facts contradict this prediction, as shown in the following data.

(32) a. *Zuotian lai le yi ge keren (dao) wo jia.
        yesterday come PERF one CL guest (to) I home
     b. *Zuotian lai le (dao) wo jia yi ge keren.
        yesterday come PERF (to) I home one CL guest
     c. *Zuotian lai (dao) wo jia le yi ge keren.
        yesterday come (to) I home PERF one CL guest
     d. *Dao wo jia zuotian lai le yi ge keren.13
        to I home yesterday come PERF one CL guest
        'There was a guest coming to my house yesterday.'


My explanation is as follows: There is a null subject in these sentences, and this null subject is an understood destination. Therefore, an additional destination phrase, whether it is a prepositional phrase (with the preposition/coverb dao 'to') or a noun phrase (without dao 'to'), is not allowed. Furthermore, in contrast to the unacceptability of a destination phrase in this context, a specified source phrase in a nonsubject position is acceptable.

(33) Zuotian cong Shanghai lai le yi ge keren.
     yesterday from Shanghai come PERF one CL guest
     'There was a guest coming from Shanghai yesterday.'

Finally, the context for lai-2 'start-to-come' is impossible for this type of lai sentence:

(34) *Zuotian lai le yi ge keren, keshi banlu you huiqu le.
     yesterday come PERF one CL guest but halfway again return PERF
     '*There was a guest coming yesterday, but he went back halfway.'

In previous discussions, I argued that only a construction with a destination subject (Section 1.2.1) or an inanimate subject (Section 1.1.2, the discussion of lai-3 'come-arrive') disallows the sense of lai-2 'start-to-come'. All other kinds of subjects, source (Section 1.2.2), agent (Section 1.2.3), or animate theme (Section 1.1.2)14, can appear in the contexts for lai-2 'start-to-come'. In (34), although the postverbal theme is animate, the context of lai-2 'start-to-come' is not acceptable. Again, if it is assumed that there is a null destination subject, this phenomenon is predictable. However, the above discussion regarding the semantic function of the null subject for the verb lai does not include imperative sentences, where the overt realization of the second person subject is optional in Chinese, as seen in (35).


(35) a. Qing ni chang lai xin.
        please you often come letter
     b. Qing chang lai xin.
        please often come letter
        'Please write to (me/us) often.'

Since omitting the subject in an imperative sentence is quite common crosslinguistically, and since it has no special significance in the present work, I do not discuss it here.

2. Lai as an abstract motion verb and its complement types

In section 1, I discussed the meanings of the verb lai profiling spatial movement. In what follows, I will investigate the semantic structures of lai when it does not focus on any spatial movement, but rather indicates an abstract motion. Depending on the type of complement, the senses of the abstract verb lai can be divided into two subtypes: One takes a verbal complement, and expresses some kind of mental intention with immediate futurity; the other takes a nominal complement, and functions as a schematic verb.

2.1. Lai with a verbal complement

When the verb lai is followed by another verb phrase, it is ambiguous both in syntactic structure and in semantic interpretation. For example, the Chinese sentence Ta 'he' lai 'come' mai 'buy' liwu 'gift' can mean either "He came to buy a gift" (in answering a question such as "Why did he come here?"), or "He's gonna buy a/the gift" (in a context such as "You take care of something else"). Accordingly, this sentence can be analyzed syntactically as in structure (36a) or (36b): In the (a) example, the verb lai is followed conjunctively by another verb phrase, mai liwu 'buy a gift';15 in the (b) example, on the other hand, the verb lai takes another verb phrase, mai liwu 'buy a gift', as its internal complement.


(36) a. [Ta [[lai] [mai liwu]]].
        he come buy gift
        'He came to buy a gift.'
     b. [Ta [lai [mai liwu]]].
        he come buy gift
        'He's gonna buy a/the gift.'

The lai in the (a) structure refers to a spatial movement, while the lai in the (b) structure rather signifies a kind of mental intention. The semantic differences between lai as a spatial motion verb and as a mental intention verb are observed in the following ways.

First, the domain for the spatial motion verb lai is physical space, whereas the one for the mental intention verb lai is shifted to a mental space. In the former, there is a spatial movement of the theme; that is, the theme physically moves through a spatial path towards a concrete/definite location. In the latter, there may not exist a spatial motion nor any concrete/definite locations at all, but rather some kind of conceptualized "motion" and "locations", which I will discuss below, are involved.

Second, the reading for the former is objective, while the one for the latter is subjective. In the former sense, there is a physical motion by the subject, and the speaker tracks the whole motion objectively. On the other hand, in the latter sense, there is no objective motion by the subject. Instead, it is the speaker who conceptually takes the subject's intended activity as a "destination", and subjectively views the process as the subject progressing along a path traced mentally by the speaker.

Third, since the destination for the spatial motion verb lai is a definite/concrete location which is the setting of the next action, it can be overtly realized, as in (37).

(37) Ta lai zher mai liwu.
     he come here buy gift
     'He came here to buy a gift.' (not 'He's gonna buy a gift here.')


In contrast, the landmark in the mental intention sense is only an indefinite/abstract setting of the intended action, and this setting cannot be overtly realized. If a location phrase is inserted, the lai is no longer a mental motion verb, but rather returns to its original sense, a spatial motion verb.

Fourth, lai indicating a mental intention also implies immediate futurity. Thus, a future modal is marginally acceptable in this context since it is redundant (see (38a)); a past tense marker is totally unacceptable (see (38b)), because it is in conflict with the mental motion verb lai which implies immediate futurity. On the other hand, lai referring to a spatial motion allows a modal/tense marker signifying either future or past tense, as shown in (39a) and (39b) respectively.

(38) a. ??Wo hui lai xi wan, ni zuo beide ba.
        I will come wash dish you do other PRT
        'I'll wash the dishes, and you take care of other things.'
        (≠ Lit. 'I'll come to wash the dishes, and you take care of other things.')
     b. *Wo lai guo xi wan, ni zuo beide ba.
        I come PAST wash dish you do other PRT
        'I would wash the dishes, you take care of other things.'

(39) a. Wo hui lai zher xi wan.
        I will come here wash dish
        'I'll come here to wash dishes.'
     b. Wo lai guo zher xi wan.
        I come PAST here wash dish
        'I came here to wash dishes.'

Finally, neither a tense marker nor a modal is allowed for the embedded clause of the lai indicating mental intention as in (40a) and (40b), because the verbal complement is atemporal in this case and must be specified as a nonfinite clause. Yet, either a future modal or a past tense marker can occur in the verbal clause following the spatial motion verb lai, as in (41a) and (41b).


(40) a. *Wo lai hui xi wan, ni zuo beide ba.
        I come will wash dish you do other PRT
        '*I'll being going to wash dishes, and you take care other things.'
     b. *Wo lai xi guo wan, ni zuo beide ba.
        I come wash PAST dish you do other PRT
        '*I'll washed dishes, you take care of other things.'

(41) a. Wo lai zher hui xi wan.
        I come here will wash dish
        'I'll wash dishes when I come here.'
     b. Wo lai zher xi guo wan.
        I come here wash PAST dish
        'I've come here and washed dishes.'

The following two figures, Figure 10 and Figure 11, illustrate the semantic structures of these two types of lai: one is a spatial motion verb and the other signifies a mental intention. Figure 10 depicts the spatial motion verb lai which is followed conjunctively by another process. Here, the relevant domains are still space (the base) and time (the arrow marked with t). The trajector, which is the theme, physically moves along a spatial path toward its destination, a concrete location, to start another activity ('to buy a gift' as in (36a)). The primary landmark, the destination, is thus the spatial base of the second activity.

Figure 10. lai as a spatial motion verb followed by another process


Compare this figure with Figure 1 which is for the prototype lai (lai-1). The only difference between these two figures is that the primary landmark in Figure 1 is simply a location, whereas the one in Figure 10 is the base of another process (the arrow with a wavy line stands for some kind of change). 16

Figure 11. lai as a mental intention verb with a verbal complement

Figure 11 depicts the mental intention verb lai which takes another verbal phrase as its complement. Here, the base is shifted from the spatial domain to a mental one. Thus, the profiled trajectory is illustrated as a dashed arrow which signifies a mental process. The profiled trajector (tr1) for the main clause is still the theme, even though it may not be a mover at all. The primary landmark is not a concrete location, but rather another process, where the inner trajector (tr2) is identical with the theme (this relation is indicated with a dashed line). The secondary landmark is not necessarily a spatial location either, but rather an abstract one. It refers to the theme's current status (it has not started the intended process yet), and is conceptualized as a "location" related to a sequence of events. It is also a "location" where the speaker sets his reference point (the moment of speaking). However, there also is a vantage point shift involved, that is, the speaker's vantage point is mentally shifted from the present time to the future, from the secondary landmark to the primary landmark, which the speaker conceptualizes as the theme's "destination". Then, the speaker subjectively views the theme progressing toward the future process, from his vantage point within the primary landmark.

There is one more point worth noting: the Chinese verb lai, which has the basic meaning of "come", implies the sense of imminent future when it takes a verb phrase as its internal complement, as discussed above. Interestingly, verbs meaning "go" in some languages, such as English, French, as well as languages of the Uto-Aztecan family 17, can also behave as markers of futurity. Langacker (1991) points out that the English sentence "He is going to open the door" is ambiguous: One objective reading is that "the subject is following a spatial path at the end of which he will initiate the process of opening the door", while the other reading is subjective and means that "the subject will open the door in the imminent future". Moreover, for most native speakers, the latter reading can also imply a kind of mental intention of the subject. However, in the English case, the domain shift is from space to time as Langacker (1991) points out. Therefore, the primary function of the nonspatial verb go is to mark an imminent future, while the implication of a mental intention is just a secondary one. In the Chinese lai case, by contrast, the domain shift is from spatial to mental. Thus, the expression of a mental intention is the primary reading of the nonspatial lai, while the immediate futurity is rather secondary. Accordingly, adding a true future modal is ungrammatical when the English verb go serves as an imminent future marker; yet it is acceptable to insert a future modal in a construction with the verb lai referring to a mental intention, although this future modal is somewhat redundant, and native speakers usually avoid it.

(42) a. *I'll be gonna open the door.
     b. ?Wo hui lai xi wan, ni zuo beide ba.
        I will come wash dish you do other PRT
        'I'll wash the dishes, and you take care of something else.'
        (≠ Lit. 'I'll come to wash the dishes, and you take care of something else.')

Moreover, one of my Chinese speaker consultants18 even accepts the following sentence, though he notes this is only marginal:

(43) ??Wo lai qu mai piao.
     I come go buy ticket
     '*I'm gonna to go to buy the tickets.' ('I'll go to buy the tickets.')

In (43), lai "come" only expresses a mental intention and futurity, while qu 'go' refers to a true spatial motion. This sentence is easier to accept within certain contexts, as in (44).

(44) A: Women lai kan chang dianying ba.
        we come see CL movie PRT
        'Let's see a movie.'
     B: ?Hao, wo lai qu mai piao.
        OK, I come go buy ticket
        'OK, let me go to buy the tickets.' (or, 'OK, I'll go to buy the tickets.')

The significance of these examples is twofold: On the one hand, this provisional acceptability suggests that the mental motion verb lai is so far from its original meaning "come" that it can even take a counter-directional verb phrase (qu 'go') as its complement; on the other hand, the cooccurrence of lai and qu 'go' is only marginally acceptable, which indicates that the mental intention verb lai is still somehow related to its original meaning "come" in the speaker's mind.

2.2. Lai with a nominal object

When lai is followed directly by a nominal (instead of a prepositional phrase), the nominal may be a phonologically realized destination of the theme, as section 1.1.3 indicated, or the theme itself if the subject is something else, as discussed in section 1.2. These two cases, both involving spatial motion, are not within the scope of this section. Here, I discuss only the postverbal nominal that serves as a direct object of a schematic verb lai. Like some aspectual verbs in English, such as "begin", "start", and "finish" (Langacker 1984), as well as intentional verbs like "want" (Langacker 1991), the abstract verb lai in Chinese can take either a verbal complement (the a-cases in the following examples) or simply a nominal as its direct object (the b-cases in the following examples), if the omitted verb is predictable.19

(45) a. Women lai wan (yi) ge youxi ba.
        we come play one CL game PRT
     b. Women lai (yi) ge youxi ba.
        we come one CL game PRT
     'Let's play a game.'

(46) a. Wo lai chang (yi) zhi ge, ni lai tan (yi) duan gangqin,
        I come sing one CL song you come play one CL piano
        ta lai jiang (yi) ge gushi, dajia dou lai biaoyan yi ge jiemu.
        he come tell (one) CL story everyone all come perform one CL performance
     b. Wo lai (yi) zhi ge, ni lai (yi) duan gangqin,
        I come one CL song you come one CL piano
        ta lai (yi) ge gushi, dajia dou lai (yi) ge jiemu.
        he come one CL story everyone all come one CL performance
     'I'll sing a song. You'll play a piano piece. He'll tell a story. Everyone will give some performance.'

I have discussed the a-cases in Section 2.1, where lai behaves as a mental intention verb which takes another process as its "destination". Figure 12 below illustrates the semantic structure of this sense of lai, with a detailed diagram for the process within the primary landmark, which I ignored in Figure 11. In Figure 11, which is a subschema of Figure 12, the primary landmark stands for any kind of process with at least one participant, the trajector. In Figure 12, the primary landmark refers only to a type of process where two participants are involved, one as the trajector, the other as the landmark. As in Figure 11, the inner trajector in Figure 12 is also identical with the outer trajector.

Figure 12. lai as a mental intention verb with a two-participant verbal complement

What is interesting is the semantic structure of the b-cases in (45) and (46). There, not only does the verb lai not designate any spatial motion, but it also behaves as a schematic verb, for it can be construed as compatible with various actions according to its object, such as "playing" for games in (45b), "singing" for songs, "playing" for musical instruments, and "telling" for stories in (46b). It seems that the schematic verb lai "absorbs" the meaning of the embedded verb which is not overtly realized. Figure 13 below illustrates the semantic structure of this extended usage of lai.

Figure 13. lai as a schematic verb


Compare Figure 12 and Figure 13: In the former, the verb lai designates a process in which the primary landmark is another process. In the latter, on the other hand, the primary landmark is not the intended process as a whole, but rather the landmark of that process. The conceptualized "destination", the intended process, is therefore demoted to the status of secondary landmark. However, this intended process remains as a pivotal facet of the whole process, and is called the "active zone" of the primary landmark (the landmark within it), in the terminology of Cognitive Grammar.

3. Summary

To sum up, the diverse uses of the verb lai are grouped into two major categories: One signifies spatial motion, the other designates abstract motion. My discussion in this paper focuses mainly on the interface of the syntactic conditions and the semantic structures in both senses.

In the spatial motion cases, it is observed that the theme can either precede or follow the verb. When the theme serves as the subject of lai, the destination landmark may or may not be overtly realized. On the other hand, when the theme is in a postverbal position, the subject nominal can signify a destination, a source, or an agent. However, a null subject can only be interpreted as an understood destination, unless the imperative mood is involved. It is also observed that under different syntactic conditions, the most salient part of the theme's trajectory can be associated with the final stage (lai-3, 'come-arrive'), or the initial stage (lai-2, 'start-to-come').

In the cases of abstract motion, it is shown that lai takes either a verbal complement or a nominal complement. When lai takes a verbal complement, it indicates a mental intention plus immediate futurity, and its primary landmark is an intended process. When lai takes a nominal complement, on the other hand, it behaves as a schematic verb, and its primary landmark is not the intended process but rather the landmark of this process.

Finally, these two major senses, spatial motion and abstract motion, are semantically related to each other by means of a metaphorical shift from physical space to mental space, as well as from an objective to a subjective perspective. In the former the speaker sets up his viewpoint within the theme's destination, which is a concrete location, and tracks the whole motion objectively through a spatial path. For the latter, the speaker conceptually takes the trajector's "destination", which is not a concrete location, but rather another process, as a reference point, and views the whole thing as progressing through a mental path. The syntactic structures as well as restrictions for both senses, such as the allowance/disallowance of a location phrase or a tense marker/modal, are accordingly different from each other.

Acknowledgments

My special thanks go to Ronald W. Langacker, for his valuable advice during the different stages of this research. I am indebted to the two anonymous readers as well as the editor for their helpful comments and suggestions concerning this paper. I am also grateful for help from Karen van Hoek, Steve Poteet, Linda Manney, and Richard Epstein. However, none of them should be held responsible for the analysis presented in this paper.

Notes

1. The notion "theme", in Langacker's terminology, refers to the participant in a thematic relationship. The notion is schematic with respect to a number of role archetypes, including patient, mover, non-initiative experiencer, and zero. In this paper, "theme" stands for mover (comer).
2. The following abbreviations will be used in examples: ASP stands for aspect, CL for classifier, LOC for locative marker, PAST for past tense marker, PERF for perfective marker, PROG for progressive marker, PRT for particle, and Lit. means literal translation.
3. Boldface indicates profiling in this and the following diagrams.
4. In Langacker's terminology, the most prominent participant of an activity/process is referred to as "trajector", and other salient participants/locations as "landmarks". These landmarks are called "primary landmark", "secondary landmark", etc., according to their relative degrees of prominence. The activity is said to have a "temporal profile". In a prototypical action verb, the trajector/landmark (primary landmark) asymmetry underlies the subject/object distinction. However, lai is an intransitive motion verb, and the primary landmark is not a participant but a location here.
5. Without any special context, the first interpretation of (1) is 'He's come already (he's here now).'
6. For detailed discussion regarding dao 'arrive', see Poteet (1987).
7. Chinese is an SVO language, so the subject always precedes the verb.
8. For more discussion of the notion of "setting subject", see Langacker (1987c).
9. "From Shanghai" modifies the verb "come" here.
10. The source landmark is profiled only when the source is overtly realized, as in (27b) and (28b) below.
11. In Chinese, there are no morphological markers for the number of a noun, except for pronouns. So, I use the phrase "to be able to be construed as a plural noun" here.
12. Chinese lacks so-called "dummy subjects", such as 'there' and 'it' in English. The following examples are often cited as the typical cases for this phenomenon:
    (i) Xia yu le.
        fall rain PERF
        'It's raining.'
    (ii) Zai zhuozi-shang you yi ben shu.
         at table-LOC(up) have one CL book
         'There is a book on the table.'
    Since Modern Chinese is an SVO language, some traditional schools of Chinese linguistics claim that there is no subject at all for these structures as well as for sentences like (30) and (31), namely, "subjectless sentences". Other Chinese linguists analyze the structural patterns in these sentences as a matter of "subject-verb inversion"; there the postverbal nominal is the subject. (See discussions in Zhongguo Yuwen 1956.)
13. A noun phrase (without the preposition/coverb dao 'to') is analyzed as a subject in this position, so there are no parentheses for dao 'to' here.
14. As discussed in section 1.1, lai for an inanimate theme takes the meaning of lai-3 'come-arrive'. The senses of lai-3 and lai-2 are incompatible with each other. However, that is not the case for the example sentence here.
15. This structure is known as liandong 'sequential actions' or 'sequential verb phrases' in Chinese linguistic terminology.
16. For the sake of simplicity, I leave the discussion regarding the semantic structure of the process within the inner square to the next subsection.
17. The information about the languages of the Uto-Aztecan family is provided by the editor.
18. I am a native Chinese speaker myself. However, I still consult other native Chinese speakers to verify my judgment for some sentences.
19. In what follows, lai ge youxi (45b) is understood as 'to play a game', not 'to watch a game'. In the context of (46b), lai zhi ge means 'to sing a song', not 'to listen to a song'; lai duan gangqin means 'to play a piano piece', not 'to listen to a piano piece'; lai ge gushi means 'to tell a story', not 'to listen to a story', etc. The predictability seems to be related to so-called "stereotypic actions". However, further study is needed to characterize the predictability more precisely, and I will not discuss it in this paper. Here, I simply assume that the predictability is stored in the encyclopedic knowledge of both speakers and listeners.

References

Emanatian, Michele
1991 "Point of view and prospective aspect", Proceedings of the Seventeenth Annual Meeting of the Berkeley Linguistics Society: 355-367.
Huang, Shuanfan
1982 "Space, time and the semantics of lai and qu", in: Shuanfan Huang, Papers in Chinese syntax. Taipei, Taiwan: Wenhe Press, 113-122.
Langacker, Ronald W.
1984 "Active zones", Proceedings of the Tenth Annual Meeting of the Berkeley Linguistics Society: 172-188.
1986 "Abstract motion", Proceedings of the Twelfth Annual Meeting of the Berkeley Linguistics Society: 455-471.
1987a Foundations of cognitive grammar, Vol. 1: Theoretical prerequisites. Stanford: Stanford University Press.
1987b "Nouns and verbs", Language 63: 53-94.
1987c "Grammatical ramifications of the setting/participant distinction", Proceedings of the Thirteenth Annual Meeting of the Berkeley Linguistics Society: 383-394.
1988 "A usage-based model", in: Brygida Rudzka-Ostyn (ed.), Topics in cognitive linguistics. Amsterdam and Philadelphia: John Benjamins, 49-90.
1991 Foundations of cognitive grammar, Vol. 2: Descriptive application. Stanford: Stanford University Press.
Poteet, Stephen
1987 "Paths through different domains: a cognitive grammar analysis of Mandarin dao", Proceedings of the Thirteenth Annual Meeting of the Berkeley Linguistics Society: 408-421.
Zhongguo Yuwen [Chinese Language] (ed.)
1956 Hanyu de Zhuyu Binyu Wenti [The issue of subject and object in Chinese]. Beijing: Zhonghua Shuju (Zhonghua Press).

Touching: A minimal transmission of energy

Claude Vandeloise

0. Introduction

Commenting on transitive verbs, Langacker (1990) notes that a typical agent exercises a "physical activity resulting in contact with some external object and the transmission of energy to that object" (p. 216). In this article, I will deal with a borderline case, the transitive usages of the verb toucher (for the intransitive usages of this verb, see Vandeloise 1993). Indeed, while the agent of this verb actually establishes contact with an external object, transmission of energy is forbidden in many of its usages. Passivization of such sentences cannot occur in the present tense. The connection between toucher and the concept of "minimal physical action" accounts for this important property. In these cases, the subject of toucher stands midway between an agent and an experiencer. The ban on passivization may be waived only if the external object makes contact particularly difficult.1

At a midpoint between verbs describing an action and verbs describing a state, toucher is also a bridge between displacement verbs and static verbs. Indeed, even if Picoche (1986) is right when she claims that the most representative usages of this verb are kinetic (involving movement), it also describes essentially static situations. These usages are especially interesting since it turns out that toucher is unacceptable any time contact is a consequence of a complex spatial relationship involving the asymmetrical transmission of energy, such as the relationship bearer/burden (B/b), the relationship container/contained (C/c) (Vandeloise 1991) and the relationship of suspension (Sp) (Vandeloise 1989). This demonstrates that contact, an essential feature in topological descriptions of space, is only indirectly coded in language, through "complex primitives" (Vandeloise 1987). The expressions associated with the relationships B/b, C/c and Sp, être sur 'be on', être dans 'be in' and pendre à 'hang from', are chosen when there is transmission of energy between the participants in the relation. If this is not the case, then the verb toucher, associated with the concept of minimal physical action, is utilized.

The first part of this article will be devoted to kinetic usages of the verb toucher. In the second part, I deal with the static usages of this verb, while the third part endeavors to provide a schema applicable to both the dynamic and the static usages.

1. Kinetic usages of toucher

I will analyze separately the objective movement determining the "scope" of the verb toucher and its "profile", which is the portion of the scope singled out for maximal salience (Langacker 1991). Concerning the scope, I will examine in turn the role of movement (section 1.1.), of force (section 1.2.), of the surface of contact (section 1.3.), and of the instrument mediating between the agent and the patient (section 1.4.). The last section (1.5.) will be devoted to the part of the objective movement profiled by the verb toucher.

1.1. Role of movement

I will successively consider the case where only S (the entity designated by the subject) is moving and the case where both S and O (the entity designated by the direct object) are moving. I reserve the case of static S and O for the discussion in the second part of the article.

1.1.1. S mobile and O immobile

This is the most typical situation, illustrated by sentence (1):

(1) L'étudiant touche le livre.
    'The student touches the book.'

As is illustrated by sentences (2) and (3), S cannot be motionless if O is moving:

(2) *Le mur touche l'étudiant.
    'The wall touches the student.'

1.1.2. S and O mobile

Sentence (3) describes two moving cars:

(3) La voiture de l'étudiant touche la limousine du professeur.
    'The student's car touches the professor's limousine.'

Sentence (3) is more appropriate to Figure 1, where both cars are moving in the same direction, than to Figure 2, where the cars are moving in opposite directions.

Sentence (3) is less appropriate for describing Figure 2 because of the force involved in the encounter of the two cars (see section 1.2.). S's intention is not necessarily involved in the contact, as illustrated by sentence (4):

(4) Le professeur touche involontairement le pupitre.
    'The professor involuntarily touches the desk.'

Situations in which O is moving in order to avoid the contact will be considered in section 1.2 since they involve factors other than movement.


1.2. Role of force

As illustrated by sentences (5) and (6), contacts described by the verb toucher cannot be violent:

(5) *Le doyen touche le verre de toutes ses forces.
    'The dean touches the glass as hard as he can.'

(6) *Le professeur touche violemment le pupitre.
    'The professor violently touches the desk.'

The inappropriateness of sentence (3) for the situation depicted by Figure 2 is also justified by the violence of the shock. There are, however, circumstances where energy can be transmitted from S to O:

(7) L'obus touche le collège avec force.
    'The shell violently hits the college.'

(8) Le chancelier touche violemment le gouverneur au menton.
    'The chancellor violently strikes the governor on the chin.'

Sentences (5) and (6), though, confirm that when force is involved, toucher cannot be freely utilized to describe contact. English is even more strict in this respect since to touch cannot be used in the translation of sentences (7) and (8). Note, furthermore, that the situations which permit violence don't necessarily impose it, as evidenced by sentences (9) and (10):

(9) L'obus touche légèrement le collège.
    'The shell lightly touches the college.'

(10) Le chancelier touche légèrement le gouverneur au menton.
     'The chancellor lightly touches the governor on the chin.'

While most usages of the verb toucher proscribe transmission of energy, one may conclude from sentences (7)-(10) that some examples are neutral relative to this factor. This neutrality is systematically motivated by the mobility of O and/or by the difficulty in reaching it because of the distance separating S and O. I will call 'asymmetric transmission of energy' an exchange in which one participant is salient because it initiates the exchange. Unacceptable sentences (5) and (6) involve asymmetrical transmission of energy. In contrast, both participants play the same role in the 'symmetric transmissions of energy' described by acceptable sentences (7) and (8).2 Sentence (11) is more appropriate than sentence (3) to describe the situation depicted in Figure 2:

(11) La voiture de l'étudiant et la limousine du professeur se touchent.
     'The student's car and the professor's limousine hit each other.'

Indeed, even though transmission of energy is involved in this situation, its symmetry, reinforced by the reflexive form of the verb, facilitates the usage of toucher. A last example, for which sentence (12) better fits Figure 3 than Figure 4, will serve as a turning point toward the next section, concerned with the surface of contact between S and O:

Figure 3. A curve with a tangent.

Figure 4. A curve with a secant.

(12) La droite touche la courbe.
     'The line touches the curve.'

Two factors explain why sentence (12) describes the tangent to the curve better than the secant. The first factor is force: indeed, since a line is often conceptualized as the trajectory of a mobile punctual object, one will admit that the contact made between one object and another is lighter if the trajectory of the former is tangent to the latter than if it crosses over it. The second factor explaining the preference for Figure 3 will be dealt with in section 1.3. That section, as well as section 1.4, concerns the static usages of toucher as well as the kinetic usages of this verb.

1.3. Role of the surface of contact

The part of S which makes contact with O is often smaller than this object and limited to one point. The preference for the verb toucher when describing a punctual contact can be explained by the nature of the most usual instrument used for touching, i.e., the finger tip. The reason why Figure 5 may be described by sentence (13) while Figure 6 must be described by sentence (14) is probably the punctuality of the contact in the former situation:

Figure 5. Road and river touching

Figure 6. Road going along river

(13) Le chemin touche la rivière.
     'The road touches the river.'

(14) Le chemin longe la rivière.
     'The road goes along the river.'


Preference for a punctual contact is also involved in the normal interpretation of sentence (15):

(15) Le doyen touche 3 verres.
     'The dean touches 3 glasses.'

Even though a dean is certainly clever enough to touch three glasses at the same time, for example with three different fingers, one is more likely to imagine that he touches each glass successively, establishing a punctual contact with each of them. Representative as it may be, punctual contact is not necessary in order to use the verb toucher. Even though the whole back of the wardrobe makes contact with the wall, sentence (16) is acceptable. I doubt, however, that sentence (17) would be a first choice to describe a painting on a wall:

(16) La garde-robe touche le mur.
     'The wardrobe touches the wall.'

(17) ?La peinture touche le mur.
     'The painting touches the wall.'

1.4. Role of an intermediary element

S does not always directly make contact with O. In sentence (18), notably, it is probably a bullet or an arrow which makes contact with the rabbit rather than the chancellor himself:

(18) Le chancelier touche le lièvre.
     'The chancellor hits the rabbit.'

In order for an intermediary to be allowed to interact between S and O, though, it must remain under the control of S. Therefore, sentence (19) but not sentence (20) is appropriate to describe Figure 7:


Figure 7. An intermediary in the process of "touching"

(19) Le doyen touche le mur.
     'The dean touches the wall.'

(20) *L'étudiant touche le mur.
     'The student touches the wall.'

The intermediary is introduced by the preposition avec 'with', as evidenced by sentence (21):

(21) Le doyen touche le tableau avec la règle.
     'The dean touches the blackboard with the ruler.'

(22) Le doyen touche le tableau du doigt.
     'The dean touches the blackboard with the finger.'

In sentence (22), the noun introduced by the preposition de is not, strictly speaking, an intermediary but rather the part of S which makes contact with O. Apparently, this preposition de introduces only parts of the human body. Therefore, sentence (23) is unacceptable:

(23) *La voiture touche le mur du phare.
     'The car touches the wall with the headlight.'

(24) Le doyen touche le tableau de la pointe de son bâton.
     'The dean touches the blackboard with the tip of his stick.'

As evidenced by sentence (24), however, there is an exception when the tip of the intermediary is explicitly mentioned. This exception might be explained by an analogy with the tip of the finger, the archetypal touching instrument.3


The principal factors involved in the kinetic usages of the verb toucher have now been reviewed. At this juncture, I will propose a provisional rule explaining the data presented up to now:

T'k: S touche O if S is the cause of a movement whose consequences are a contact without asymmetric transmission of energy with a part of O.

Either S itself is moving or an intermediary I propelled by S is establishing contact with a part of O. Only asymmetric transmission of energy is precluded by rule T'k. Indeed, as illustrated by sentences (7) and (8), symmetrical transmission of energy between S and O may be permitted. Requiring punctual contact of S with O would be too strong a constraint on the surface of touching. Rule T'k, though, is formulated in such a way that total contact of S with O is excluded.

1.5. Profile of the verb toucher

The provisional usage rule T'k describes the objective movement which delineates the 'scope' of the kinetic usages of the verb toucher. A verb, however, does not attribute the same importance to all the facets of its scope. While profiled parts of the objective scene are highlighted, other parts remain in the background. Which parts of the objective movement are profiled by the verb toucher? The movement of S preceding contact? The entering into contact? The continuing contact? Or perhaps several of these phases? Before proposing an answer to this question, I will try to locate toucher in the classification of locative verbs proposed by Boons (1987).

First of all, can the verb toucher be considered as a displacement verb? According to Boons (1987: 5), such verbs "require a mandatory change of place of an entity which, in the process, admits no further change, neither in its shape nor in its substance". If the last requirement is not met by sentence (25), it is satisfied by sentence (26):

(25) Le chancelier touche le campanile du doigt.
     'The chancellor touches the campanile with his finger.'

(26) Le ballon touche le campanile.
     'The ball touches the campanile.'


In contrast to the scene described by sentence (25), in which the chancellor can modify his position without displacement, neither the shape nor the substance of the ball is modified in sentence (26). However, the verb toucher does not behave like a displacement verb in this sentence because a change of place is not mandatory. Indeed, a motionless ball in contact with the campanile might be described by sentence (26). Should any further doubt be lingering on this matter, let us apply two criteria proposed by Boons in order to test whether or not a verb is a displacement verb. First, these verbs accept a medial complement and second, they permit more than one aspectual value (these values being the beginning, the middle and the end of the trajectory). While sentence (27) establishes that toucher does not admit a medial complement, sentence (28) demonstrates that it cannot accept more than one aspectual value:

(27) Le chancelier touche le poisson (*à travers l'eau).
     'The chancellor touches the fish (*through the water).'

(28) Le ballon touche le mur (*des mains du chancelier).
     'The ball touches the wall (*from the chancellor's hands).'

Indeed, as evidenced by sentence (28), a complement specifying the origin of the trajectory of the ball (the chancellor's hands) cannot be added to the complement describing the end of the trajectory (the wall). Hence, the verb toucher cannot be considered as a displacement verb.

In the classification of locative verbs proposed by Boons, the transitive usages of the verb toucher considered in this article correspond to the formula Nx' V Nx, where Nx' is the localized term and Nx the localizing one. Since the only aspectual value permitted by the verb describes the end of the trajectory, toucher is, according to Boons (1987), a "final unipolar verb". As illustrated by the ambiguity of sentence (26), it is impossible to know whether or not Nx' had already reached Nx at the time of enunciation. The static uses of toucher, where Nx is already reached, will be dealt with in the second part of this article. There it will appear that some uses of toucher are not ambiguous between a moving or a motionless S. Indeed, some usages of this verb are unequivocally static. On this point, the verb toucher contrasts with the other final unipolar verbs.

Which facets of the objective movement are highlighted in the kinetic usages of toucher? The oddness of sentences (29a) and (29b) demonstrates that neither the movement preceding the contact nor the contact's continuation is profiled by the verb:

(29) a. ?L'étudiante touche sa couronne en 3 secondes.
        'The student touches her crown in 3 seconds.'
     b. ?L'étudiante touche sa couronne pendant 5 minutes.
        'The student touches her crown for 5 minutes.'

These sentences establish that the only highlighted facet of the objective movement is the instant when the student enters into contact with her crown. Another example, proposed by Ruwet (1972), confirms this point:

(30) *La cible est restée touchée tout l'après-midi.
     'The target remained touched all afternoon.'

Only when contact is difficult to maintain can pendant be used after the verb toucher. A challenge or a bet will make a duration complement acceptable in sentences (31) and (32):

(31) L'étudiant défie le professeur de toucher la lampe pendant 3 minutes.4
     'The student challenges the professor to touch the bulb for 3 minutes.'

(32) L'étudiant parie que le professeur n'osera pas toucher le nez du doyen pendant 1 minute.
     'The student bets that the professor will not dare to touch the dean's nose for 1 minute.'

One may as well insist on the period of continuous contact in the situation illustrated by Figure 8, in which S is touching the back of O running before him.


Figure 8. Continuous contact with touch

Note that the duration of the touching as well as the distance covered while contact is maintained can describe this situation:

(33) Le chancelier touche le doyen pendant 3 minutes.
     'The chancellor touches the dean for 3 minutes.'

(34) Le chancelier touche le doyen pendant 100 mètres.
     'The chancellor touches the dean for 100 meters.'

Thus, the verb toucher can only highlight the period of continuous contact if maintaining touching is made difficult by O. A duration complement is possible with reflexive uses of the verb toucher as well, provided that S and O participate actively in the contact:

(35) L'index et le pouce du candidat se touchent pendant 1 minute.
     'The forefinger and the thumb of the candidate touch each other for 1 minute.'

(36) La tasse et la cafetière se touchent (*pendant 1 minute).
     'The cup and the coffee pot touch each other (*for 1 minute).'

As to the period preceding contact, it enters the scope of the verb toucher in one very specific type of situation, which is described by sentence (37):

(37) L'obus touche la cible en 1 minute.
     'The shell hits the target in 1 minute.'

(38) *L'obus touche la cible pendant 1 minute.
     'The shell hits the target for 1 minute.'


The usage illustrated by sentence (37) differs from the other usages of toucher in two respects: (1) It permits transmission of energy from 5 to O and (2) the shell is an intermediary separated from the agent propelling it in sentence (39): (39)

L'artilleur touche la cible (*en 3 minutes). 'The artilleryman touches the target in 3 minutes.'

(40)

Le ballon touche la vitre (?en 3 secondes). 'The ball touches the pane in 3 seconds.'

The unacceptability of a duration complement in sentence (39) demonstrates that the duration complement in sentence (37) applies to the trajectory of the intermediary and not to the action of the agent6. As is suggested by the unacceptability of a duration complement in sentence (40), the whole ballistic framework might be needed to justify the use of such a complement in sentence (39). Putting aside the usage of pendant in sentences (31)-(32) and the usage of en in sentence (37), one may conclude that this verb normally foregrounds "the instant of entering into contact". Therefore, the rule T'k may be reformulated as in Tk:

Tk: S touche O at the instant when a movement caused by S establishes a contact without asymmetric transmission of energy with a part of O

The only aspect of the scope put to the fore is the entering into contact, while the preceding movement and the continuing contact remain in the background, unless additional circumstances, depending on O, force them on stage7.

2. Static usages of the verb toucher

There are numerous constraints on the choice of the terms related by the verb toucher when this verb applies to static situations. These constraints will be considered in section 2.1. In such circumstances, should O be considered as a patient or as a place? I will try to answer this question in section 2.2. Finally, section 2.3 will endeavour to construct a bridge between the dynamic and static usages of the verb toucher, thanks to the concept of minimal physical action.

2.1. Constraints on the static usages of the verb toucher

One might be tempted to consider the static usages of this verb as a meaning extension, according to which a verb describing an action may as well be utilized to describe the result of this action. However, the profile imposed by this verb on its scope does not support this conclusion. Indeed, toucher profiles the instant of contact to the exclusion of its continuation. Furthermore, this hypothesis would not explain the numerous constraints bearing on the static usages of this verb. For example, sentences (41) and (42) may be acceptable to describe a movement of the stone or the cup toward the table or the ground, but they are not the expected expressions to describe the result of this movement, when the cup or the stone is put down on the table or the ground:

(41)

*La pierre touche le sol. 'The stone touches the ground.'

(42)

*La tasse touche la table. 'The cup touches the table.'

Thus, kinetic interpretations of toucher are acceptable in these sentences but static usages are not. Furthermore, the hypothesis of a preceding movement is obviously impossible in the case of the acceptable sentences (43) and (44): (43)

La Louisiane touche l'Arkansas. 'Louisiana adjoins Arkansas.'

(44)

Le champ du doyen touche la prairie du chancelier. 'The dean's field adjoins the chancellor's meadow.'

How may the following judgments of acceptability be explained in sentences (45)-(47), describing figure (9)?

Figure 9. Contrast between toucher and "be in/on"

(45)

La pomme touche le panier. 'The apple touches the basket.'

(46)

*La poire touche le panier. 'The pear touches the basket.'

(47)

*Le panier touche la table. 'The basket touches the table.'

A comparison with sentences (48)-(50) describing the same situation may be useful: (48)

*La pomme est sur/dans le panier. 'The apple is on/in the basket.'

(49)

La poire est dans le panier. 'The pear is in the basket.'

(50)

Le panier est sur la table. 'The basket is on the table.'

These data demonstrate that, for static situations, the verb toucher can be used when être sur/être dans are unacceptable and vice versa. As evidenced by the contrast between sentences (51) and (52) describing a similar situation, the verb toucher is no more successful in describing a static relation of suspension:


(51)

La lampe pend au fil. 'The bulb hangs from the wire.'

(52)

*La lampe touche le fil. 'The bulb touches the wire.'

While toucher cannot apply to situations described by être sur/dans, the prepositions sur and dans cannot introduce O after this verb:

(53)

*Le chancelier touche sur le doyen. 'The chancellor touches on the dean.'

(54)

*Le chancelier touche dans le doyen. 'The chancellor touches in the dean.'

(55)

Le chancelier touche le doyen sur la joue. 'The chancellor touches the dean on the cheek.'

(56)

Le chancelier touche le doyen dans le dos. 'The chancellor touches the dean in the back.'

Sentences (55) and (56) demonstrate that the touched place allows for the use of these prepositions. The complementarity of the relationships C/c, B/b and Sp with the verb toucher enables one to formulate a differential rule governing the static uses of this verb:

T's: S touche O if S makes contact with O and if the contact is not a consequence of the relationships C/c, B/b or Sp.

Besides this differential rule, can a positive usage rule be formulated? Such a rule might be suggested by the fact that the relationship C/c as well as the relationships B/b and Sp involve asymmetric transmission of energy between S and O. Therefore, I propose a positive usage rule T"s, formulated in such a way as to be as parallel as possible with Tk:

T"s: S touche O if S makes contact with O without asymmetric transmission of energy between S and a part of O


However, one might object that the acceptable sentence (45) is synonymous with sentence (57): (57)

La pomme est contre le panier. 'The apple is against the basket.'

Now, the preposition contre notably involves transmission of energy. What is different between the relationships described by this preposition and the relationships C/c, B/b and Sp? In the latter, the action of gravity on S is the initial force against which O then reacts. Therefore, the transmission of energy is asymmetrical. Such an asymmetry does not exist with être contre 'be against'. True, there is an action and a reaction between the apple and the basket of sentence (57), but it would be pointless to ask which object initiates the transmission of energy. Toucher and être contre may only be exchanged for these symmetric situations. In order to account for this fact, rule T"s must be modified in the following way:

Ts: S touche O if S makes contact with O without an asymmetric transmission of energy from S to a part of O

A positive rule governing the static usages of the verb toucher has thus been proposed.

2.2. The verb toucher, the relationship agent/patient and localization

In the kinetic usages of the verb toucher, even though the activity of the agent is as restricted as possible, S may be considered as an agent and O as a patient. Notably, S may be moving toward O, but the reverse does not hold true. However, the relationship agent/patient must be qualified for static usages of toucher. Indeed, the similarity of the verb toucher with unipolar final verbs has already been noted. Now, when these verbs appear in sentences of the form N0 V N1, Boons (1987) considers N1 as referring to a place rather than to a patient. Furthermore, if there is no other way of locating one object relative to another, static uses of the verb toucher can establish a localization relationship in which O behaves like a landmark localizing a target S. Two different usages of toucher are illustrated by sentence (58), describing figure (10):

Figure 10. The polysemy of toucher

(58)

La flèche touche1 la chaise qui touche2 l'arbre8. 'The arrow hits the chair touching the tree.'

While toucher1 describes an agent/patient relationship between the arrow and the chair, toucher2 helps to determine which chair is reached by the arrow. Further evidence for the role of localization played by toucher2 in sentence (58) is provided by the contrast in acceptability between sentence (59) and sentence (60):

(59)

La flèche touche1 une chaise qui touche2 l'arbre. 'The arrow hits a chair touching the tree.'

(60)

*La flèche touche1 la chaise qui touche2 un arbre. 'The arrow hits the chair touching a tree.'

While the indefinite article is acceptable in the agentive relationship established by toucher1 in sentence (59), its role is more questionable in sentence (60) after toucher2, because it constrains the ability of the landmark to locate the chair. Because its agent is minimally agentive, the verb toucher occupies an intermediate position between verbs like regarder 'look at', which don't imply physical action (and make the passive sentence conceptually odd), and verbs like casser 'break', which imply transmission of energy (and accept the passive voice):

(61)

*Le mur est regardé par le chancelier. 'The wall is looked at by the chancellor.'


(62)

?Le mur est touché par le chancelier. 'The wall is touched by the chancellor.'

(63)

Le gouverneur est touché au menton par le chancelier. 'The governor is struck on the chin by the chancellor.'

(64)

Le mur est cassé par le chancelier. 'The wall is broken by the chancellor.'


Like regarder, the verb toucher does not accept the passive voice in sentence (62). In sentence (63), though, this verb allows passivization like casser9. The verb toucher is also a borderline case between motion verbs and static verbs. Remember that, because it is a unipolar final verb, it is impossible to decide whether or not the position of S is the result of a preliminary movement. Furthermore, toucher applies to necessarily static uses, as evidenced by sentences (43) and (44), recalled here:

(43)

La Louisiane touche l'Arkansas.

(44)

Le champ du doyen touche la prairie du chancelier.

Excluding cases such as sentence (28), for which S is a path, (28)

La route va au collège. 'The road goes to the college.'

I don't know of any motion verb with necessarily static uses. Thus, the intermediary character of the verb toucher appears once again. Since the verb toucher wavers between a description of contact without asymmetric transmission of energy for its kinetic uses and the localization of S by O for its static uses, one understands why O, in the formula S touche O (parallel to the formula N0 V N1 of Boons) is intermediary between a patient and a place. When toucher takes a determinative or a locating function, it enters into competition with the preposition à, locating a target relative to a landmark. Since toucher imposes contact between the two terms, it locates the target relative to the landmark more precisely than à. Indeed, this preposition imposes no constraint on the location of the target relative to the landmark, provided that the former is close enough to the latter to be spotted. However, the preposition is generally preferred to the verb to localize a target:

(65)

a. L'étudiant est à Coimbra. 'The student is at Coimbra.'
b. *L'étudiant touche Coimbra. 'The student touches Coimbra.'

(66)

a. Le professeur est au collège. 'The professor is at the college.'
b. *Le professeur touche le collège. 'The professor touches the college.'

Sentence (66b) can be acceptable only if the professor makes contact with the walls of the college. The only chance left for the verb toucher to fulfill its localization function is with objects located in the visual field, for which the preposition à is unacceptable: (67)

a. *La fourchette est à l'assiette10. 'The fork is at the plate.'
b. La fourchette touche l'assiette. 'The fork touches the plate.'

(68)

a. *Donne-moi la fourchette à l'assiette. 'Give me the fork at the plate.'
b. Donne-moi la fourchette qui touche l'assiette. 'Give me the fork touching the plate.'

The verb toucher, then, fills in the blanks left by the prepositions dans, sur and à in the description of space. To round off the study of the static uses of the verb toucher, I repeat the two usage rules proposed above:

T's: S touche O if S makes contact with O and if the contact is not a consequence of the relationships C/c, B/b or Sp

Ts: S touche O if S makes contact with O without an asymmetric transmission of energy from S to a part of O

A general rule explaining all the usages of the verb toucher has to be schematic enough to cover at the same time the kinetic usages of toucher, ruled by the kinetic usage rule Tk, recalled below:

Tk: S touche O at the instant when a movement caused by S makes contact without an asymmetric transmission of energy to a part of O

The connection I will look for between the static and the kinetic usages of toucher is not the relationship between a movement and its consequence but rather the concept of "minimal physical action".

2.3. The verb toucher and the concept of minimal physical action

If one compares rules Tk and Ts, the concept of contact makes it easy to find the commonality between the kinetic and the static usages of the verb toucher. It may be stated in two steps:

(i) A usage rule T' formulated as follows:

T': S touche O at the instant when contact between S and O is noticed by the speaker

If the speaker sees the movement preceding the contact, the entering into contact is likely to be the instant put in profile by the verb, as it is stated in rule Tk. For static uses, any instant where the speaker notices the contact will do. However, there is a problem with rule T'. Indeed, in order to be descriptively adequate, this rule must be accompanied by a restriction R stating that:

R: There is no asymmetrical transmission of energy from S to O.

This restriction is needed to explain the contrast in the paradigm below:

(69)

a. Le professeur prend la craie pour écrire au tableau. 'The professor takes the chalk to write on the blackboard.'
b. *Le professeur touche la craie pour écrire au tableau. 'The professor touches the chalk to write on the blackboard.'

(70)

a. Le chancelier en colère casse la chaire. 'The irate chancellor breaks the desk.'


b. *Le chancelier en colère touche la chaire. 'The irate chancellor touches the desk.'

This contrast shows that contact is not a sufficient condition for the usage of this verb. If it were the case, one would expect this verb to be generic for verbs implying contact such as prendre 'take' and casser 'break', an expectation obviously contradicted by the unacceptability of sentences (69b) and (70b). Therefore, descriptively adequate as it may be, the above solution is not satisfying. Indeed, verbs such as prendre and casser are very numerous and therefore, a majority of the potential usages of toucher generated by rule T' would have to be ruled out by restriction R. This leads to a very undesirable situation where most of the work done by the rule is undone by the restriction on the rule. Instead, I will look for another solution which will directly explain most of the usages of the verb toucher with the concept of minimal physical action.

There is physical action as soon as an entity S makes contact with another entity O. Therefore, the verb toucher is a physical action verb. Even in the lightest touching, contact is accompanied by transmission of energy from S to O. In the circumstances described by prendre or casser, of course, transmission of energy is much stronger. Besides contact and transmission of energy, verbs describing physical actions such as scier 'saw' or coudre 'sew up' involve special tools and skills. These additional factors don't concern us here. All the verbs of physical action involve contact and meet rule T'. Because these situations cannot be completely described by the verb toucher, they must be excluded by the restriction R. This undesirable process may be avoided through the concept of "minimal physical action". Indeed, among the verbs describing physical action, toucher allows for the smallest transmission of energy. Thus, rule T alone accomplishes most of the work done by rule T' and the restriction R:

(ii)

T: S touche O at the instant when a minimal asymmetrical physical action from S to a part of O is noted by the speaker

Contact between S and O is made necessary in rule T since it is an obligatory condition for physical action between these two entities. In contrast to rule T', which was overextensive, rule T rules out correct usages of toucher, such as (a) the cases for which S gives a blow to O (sentence 8) and (b) the cases for which a bullet hits a target (sentence 7). Therefore, an extension of rule T is needed:

E: Transmission of energy from S to O is possible if the interaction is symmetrical11

What is the advantage of rule T accompanied by extension E over rule T' and restriction R? First, most uses of toucher are directly explained by rule T. Thus, the circumstances for which E is needed are far rarer than the situations in which R is called for, in order to filter the overextensions allowed by rule T'. Second, one may consider that the extension E concerns marginal usages of toucher. Indeed, in English, if a hunter hits a rabbit or if a chancellor strikes a governor, the verb to touch could not be used as is the case in French. Some of the extensions of toucher, then, do not hold for all languages. There is a commonality between rule T and its extension E: the agent's effort cannot exceed O's effort. As a consequence, since O is passive in rule T, the transmission of energy must be minimal. In E however, because O is active in avoiding blows, the agentive role of S is reinforced and transmission of energy is allowed.

Because the affirmative usages of the verb toucher designate a minimal physical interaction, the negative usages of this verb gain a particular strength. Indeed, in prohibiting even the gentlest contact between S and O, the negation of toucher all the more forbids striking it, taking it or damaging it. This is the rationale for an expression well-known to children and museum visitors: Ne pas toucher 'Don't touch'. While in the language system, it is impossible to substitute the affirmative form of the verb toucher for the verbs prendre and casser (see sentences (69)-(70)), the negative forms of these verbs are interchangeable, with sentence (72) having a stronger prohibitive force than sentence (71):

(71)

Ne prends pas le pot de peinture. 'Don't take the pot of paint.'

(72)

Ne touche pas le pot de peinture12. 'Don't touch the pot of paint.'

3. Conclusion

The main lesson one can draw from the study of the verb toucher concerns the role of contact in language. This concept is often presented as an important semantic feature in the componential analysis of the lexicon. Indeed, contact provides a convenient way of dividing all the spatial relationships into two classes, the first allowing contact, the second excluding it. However, this very generality, so appealing to componential analysis, looks unattractive to language. Instead, linguistic categorization relies on complex bundles of attributes, globally conceptualized, such as the relationship C/c, the relationship B/b or the relationship Sp. Contact is no more than one attribute of these relationships. While it is sometimes a necessary condition for their fulfilment, contact is never a sufficient condition. The same observation holds true for the usage of the verb toucher, which is better described by the concept of minimal physical action than by the topological concept of contact. Involving contact without transmission of energy, i.e. minimal physical action, explains why A touche B cannot be used kinetically instead of A prend B or A casse B, or statically instead of A est sur B or A est dans B, even if A and B make contact. Indeed, the energy involved in the action of taking or of breaking, as well as the role of gravity in the relationships C/c and B/b, violates the minimal physical action required for the usage of the verb toucher. For spatial relationships which don't involve transmission of energy, however, this verb may fill the blanks left by the preposition à. This holds especially true for objects in the visual field. Thus, information that cannot be given by *la cuiller est à l'assiette will be correctly expressed by la cuiller touche l'assiette.

Notes

* This paper has been presented at the universities of Toulouse-Le Mirail, Paris VIII, Lille-Charles de Gaulle and Duisburg. It benefitted from the comments of my listeners. I especially wish to thank Michel Aurnague, Andrée Borillo, Anne-Marie Berthonneau, Anne Condamines, René Dirven, Jacqueline Gueron, Dany Laur, Jean-Pierre Morel, Nicolas Ruwet, Laure Vieu. The final version benefitted as well from the comments of two anonymous readers.

1. I deal with the psychological uses of this verb in Vandeloise (forthcoming).

2. The activity of the second element can either reinforce the transmission of energy (as in the situation depicted by Figure 2) or try to avoid it, as in the situation described by sentence (8). The latter case allows for the use of toucher more easily than the former. If the symmetry of the exchange of energy is clear enough in the case of the governor avoiding the blows of the chancellor, it is more farfetched in the case of the shell hitting the college in sentence (7).

3. However, as noted by an anonymous reader, one can also say: Le doyen touche le tableau de sa manche. 'The dean touches the blackboard with his sleeve.'

4. Pendant 3 minutes in sentence (31) modifies the verb toucher.

5. As in sentence (31), pendant 1 minute modifies the verb toucher.

6. As a matter of fact, en trois minutes would be acceptable if it described the efforts of the artilleryman to hit the target.

7. This organization of the objective movement is compatible with an etymological explanation often proposed for the verb toucher, which claims that this word originates from an onomatopoeia toc, imitating the sound produced by two entities entering into contact. Indeed, this sound coincides with the moment of impact.

8. Touche in this sentence may be kinetic as well as static.

9. Note that in the past tense, sentence (62) would be acceptable: Le mur a été touché par le chancelier. 'The wall has been touched by the chancellor.'

10. In contrast, La fourchette est à côté de l'assiette is acceptable.

11. One may wonder if condition E should not be reformulated as E'. E': Transmission of energy from S to O is allowed if O makes contact difficult. Indeed, as has already been noted, symmetry is rather far-fetched in the case of sentence (7). In contrast, the distance between the artilleryman and the college makes contact difficult, in keeping with extension E'. On the other hand, extension E explains better than E' the acceptability of sentence (11). In this case, the symmetric exchange of energy is certainly undesirable, but it is not especially difficult. Whether symmetry, difficulty of contact, or a combination of both should be used in the extension of rule T might be little more than a matter of exposition.

12. Ne touche pas au pot de peinture would be a stronger interdiction (cf. Vandeloise 1993).

References

Boons, Jean Paul
1987 "La notion sémantique du déplacement dans une classification syntaxique des verbes locatifs", Langue Française 76: 5-40.
Langacker, Ronald W.
1990 "Settings, participants and grammatical relations", in: Savas L. Tsohatzidis (ed.), Meaning and prototypes: Studies in Linguistic Categorization. London and New York: Routledge, 213-239.
Picoche, Jacqueline
1986 Structures sémantiques du lexique français. Paris: Nathan.
Ruwet, Nicolas
1972 Théorie syntaxique et syntaxe du français. Paris: Les Editions du Seuil.
Vandeloise, Claude
1987 "Complex primitives in language acquisition", Belgian Journal of Linguistics 2: 11-36.
1989 "L'expression linguistique de la relation de suspension", Cahiers de Lexicologie 55: 101-133.
1991 Spatial prepositions. Chicago and London: The University of Chicago Press.
1993 "La préposition à pâlit-elle derrière toucher?", Langages 110: 107-127.
forthcoming Les usages psychologiques de 'toucher'.

Section III: Some of the architecture

Complement construal in French: A cognitive perspective

Michel Achard

0. Introduction1

It is a well documented fact that in a number of languages, different types of main verbs are followed by different formal types of sentential complements. For instance, in French, certain verbs are followed by indicative complements, whereas others are followed by subjunctive complements. The examples in (1) illustrate this situation: (1)

a. Je sais qu'il est parti. (IND) 'I know that he left.'
b. Je veux qu'il parte. (SUBJ) 'I want him to leave.'

The type of distribution presented in (1) can be analyzed in two ways. The first, or "syntactic" account, treats the specific form of the complement as a mere reflex of the main verb, triggered by some property of that verb. In that view, the complement structure is not given any particular meaning. The verbs which take subjunctive complements are somehow marked as such in the lexicon. This position is rather difficult to maintain, because of the arbitrary nature of that marking. On the other hand, a "semantic" analysis (Terrell and Hooper 1974) treats the complement structures themselves (the indicative or subjunctive clause) as meaningful units. The presence of a certain type of complement following a specific verb is thus not due solely to a property of the latter, but is imputable to the necessary semantic compatibility between the matrix verb and the complement structure.

This paper provides a semantic account of mood distribution in sentential complements in French,2 using the concepts made available by the theory of Cognitive Grammar (henceforth CG, Langacker 1987, 1991). Particular emphasis is placed on the meaning of an indicative clause. Once the semantic import of the indicative inflection has been identified, its compatibility with different types of main verbs will be evaluated. It will be shown that the verbs which are fully compatible with the meaning of an indicative clause are followed by indicative complements. The verbs which are incompatible with the meaning of an indicative clause are followed by subjunctive complements. There are also verbs which are only potentially compatible with the meaning of the indicative (to be explained later). Those verbs are also followed by subjunctive complements.

The paper is organized in the following fashion: part 1 introduces the problem, part 2 briefly considers the issue of complementation and conceptualization. Part 3 presents an investigation of the semantic import of an indicative clause. Part 4 evaluates the semantic compatibility between an indicative clause and various matrix verbs. Part 5 recapitulates the results and concludes this paper.

1. The problem

The distribution of indicative and subjunctive complements in French is illustrated in (2) - (11). The mood of the subordinate verb is indicated in parentheses. IND stands for indicative, SUBJ for subjunctive. Examples (2) and (3) present verbs of perception:

(2)

Paul voit que Marie est malade. (IND) 'Paul sees that Mary is sick.'

(3)

Jean a remarqué que Marie avait grossi. (IND) 'John noticed that Mary had put on weight.'

Examples (4) - (5) present verbs of declaration: (4)

Le docteur dit que nous ne dormons pas assez. (IND) 'The doctor says that we don't sleep enough.'

(5)

Jean prétend que Marie est malade. (IND) 'John claims that Mary is sick.'


Examples (6) - (7) are concerned with verbs of propositional attitude. The category of propositional attitude is quite broad. It involves the description of any level of certainty (or uncertainty) exhibited by the main clause subject towards the content of the complement structure (in the eyes of the speaker). (6)

Je crois que la porte est fermée. (IND) 'I believe that the door is locked.'

(7)

Marie sait que Jean est parti. (IND) 'Mary knows that John has left.'

Examples (8) - (9) present the situation with verbs of volition:

(8)

Le directeur veut que tu partes tout de suite. (SUBJ) 'The director wants you to leave right away.'

(9)

Papa demande que tu le conduises au bureau. (SUBJ) 'Daddy asks that you drive him to work.'

Finally, examples (10) - (11) illustrate the situation with verbs of emotional reaction:

(10)

J'ai peur qu'il ait compris. (SUBJ) 'I'm afraid that he understood.'

(11)

Paul est content que vous soyez revenu. (SUBJ) 'Paul is glad that you came back.'

The distribution of indicative and subjunctive complements in positive sentences in French can briefly be summarized as follows: when the main verb is a verb of perception, declaration or propositional attitude, the subordinate verb is in the indicative mood. With the verbs of volition or emotional reaction, the subordinate verb is in the subjunctive.3

Before proceeding with the analysis, a point should be made clear about the methodology adopted, as well as the scope of this paper with respect to a full-fledged account of the indicative/subjunctive contrast in French. Sentential complements do not constitute the only environment where the indicative and subjunctive moods are in opposition. A similar contrast can also be observed in relative and adverbial clauses, for instance. The solution proposed for sentential complements is not supposed to directly accommodate those cases. Consistent with the semantic theory defended by Cognitive Linguistics, I consider the indicative and subjunctive moods as radial categories, with different related senses in different constructions. It is only after a careful examination of each particular construction that the overall indicative/subjunctive contrast can be fully understood. This paper is strictly concerned with the situation in sentential complements (with special emphasis on the presence of the indicative mood). The methodology proposed can however be extended to the study of other constructions.4

The type of solution adopted here is not new. It rests on the idea that indicative and subjunctive complements have a specific meaning, and that their meaning must be compatible with that of the main verb they occur with. My analysis has benefited from the results of a long and fruitful line of research in the area of modal semantics in Romance languages in general, and French in particular. For instance, Guillaume (1929, 1990, 1992) very thoroughly analyzed the internal organization of the French verbal system, and established the difference between the indicative and the subjunctive in terms of the level of completion of the verbal chronogenesis (chronogénèse). When it is in the indicative, the chronogenesis of the verb is complete. When the verb is in the subjunctive, its chronogenesis is interrupted before its completion (also see Moignet 1981). The contrast between indicative and subjunctive complements has also received a lot of attention.
To provide just a few examples, Rivero (1971) argued that in the case of possible alternation in Spanish, the presence of an indicative complement is correlated with a positive presupposition about the truth of its complement, whereas the subjunctive is neutral. Terrell and Hooper (1974: 448) noted that the propositional complements can be associated with the semantic notions of "assertion" and "presupposition". Their claim is that the choice of indicative mood correlates with the notion of assertion, whereas the choice of the subjunctive correlates with the notion of presupposition. In her cross-linguistic analysis, Wierzbicka (1988) isolated two senses of the subjunctive inflection, namely "anti-assertion" and "anti-cognition". The verbs of volition illustrate the anti-assertive sense. The presence of the subjunctive inflection means that the subject "explicitly refrains from committing himself to an assertion" (Wierzbicka 1988: 143). This sense of the subjunctive inflection could be paraphrased as "I don't say: I say this". The anti-cognitive component of the subjunctive is illustrated by its use with the "emotive factive" verbs (the verbs of emotional reaction). In that case, the subjunctive means: "I don't want to say: I know this".

The goal of this very brief overview of some of the literature is merely to show that the solution developed in this paper is deeply rooted in the semantic tradition of modal investigation. All the analyses presented above provide very useful insights. My purpose is not to criticize them, but to provide an alternative (and often complementary) view on the topic.5 In particular, I will argue in favor of a specific strategy for the semantic characterization of mood inflection (specifically illustrated by the indicative). More precisely, it will be shown that the meaning of the indicative inflection is best understood in terms of the type of construal (Langacker 1987) its presence imposes on the content of the complement clause. This type of semantic characterization in turn permits a more precise definition of the notion of semantic compatibility between main verb and complement which, even though it figures prominently in most semantic analyses, is too often left undefined.

2. Complementation and conceptualization in Cognitive Grammar

Since the analysis makes use of the concepts developed by the theory of Cognitive Grammar, I will start by very briefly introducing the aspects of the framework most relevant to the issues presented here. In the CG model, meaning is equated with conceptualization, to be explicated in terms of cognitive processing. It includes not only fixed concepts but also novel sensations and experiences as they occur. Conceptualization includes abstract and intellectual conceptions as well as sensory, emotive and kinesthetic sensations. It also considers a person's cultural and social awareness of the speech event. It therefore follows that the meaning of an expression cannot be derived from the sole observation of characteristics intrinsic to the entity described. CG embraces a "subjectivist" view of meaning in that the semantic value of an expression involves the way the conceptualizer chooses to think about it and represent it, as well as the properties inherent to the scene conceptualized. The particular way in which the conceptualizer chooses to express his conceptualization is referred to as the "construal relation". Langacker (1988a: 7) writes: "In choosing a particular expression or construction, a speaker construes the conceived situation in a certain way, i.e. he selects a particular image (from a range of alternatives) to structure its conceptual content for expressive purposes." Alternate constructions impose contrasting images on the conceived situation.

A linguistic expression is characterized semantically relative to one or more cognitive domains. Some domains are irreducible: they pertain to our experience of space, time, the different senses, emotions. Most linguistic expressions are however characterized with reference to complex domains. Any knowledge system or conceptualization can function as a domain for the characterization of a linguistic expression, regardless of its possible complexity or abstractness. Particularly relevant to this presentation, different folk models which pertain to our conventional conception of certain concepts can be used as cognitive domains relative to which the meaning of linguistic expressions is characterized.

The semantic value of an expression derives from the imposition of a "profile" on a "base". The base consists of those facets of active cognitive domains that are directly relevant to the expression, hence necessarily accessed when the expression is used. The profile is a sub-region within the base. It is that sub-region that the expression designates and thus makes prominent within the base. We are now ready to consider the problem of mood distribution in sentential complements.


2.1. Complementation in conceptual terms

In accordance with the principle of CG that meaning is equated with conceptualization, the formal description of the problem of verbal complementation has conceptual correlates. It is thus useful to present these clearly and to develop a vocabulary which will be used throughout this paper. The speaker is the conceptualizer of a sentence. The sentence represents his conceptualization. In a complex sentence, typically, the main clause subject is also a conceptualizer, relative to the process evoked in the complement. The complement structure thus represents the conceptualization of the main clause subject. Importantly, the latter is not necessarily coreferential to the conceptualizer of the sentence. Figure 1 illustrates the different levels of conceptualization found in sentences involving a verbal complement, regardless of the form of the verb in the complement structure:

[Figure 1. Complementation in conceptual terms. Legend: S = speaker; Si = subject; Ci = conceptualizer; Vi = verb; CLi = clause]


The outer rectangle CL1 represents the sentence. The inner rectangle CL2 represents the complement structure. The dashed arrows indicate the direction of the conceptualization. The dual role of the speaker, as a speaker and a conceptualizer, is illustrated by S and C0 respectively. The arrow going from C0 to CL1 illustrates the fact that the whole sentence represents the conceptualization of C0. The dual role of the main clause subject, as a subject and a conceptualizer, is represented by S1 and C1 respectively. The arrow from C1/S1 to CL2 illustrates the fact that the complement structure represents the conceptualization of C1. V1 represents the main verb, V2 the subordinate verb. In this presentation, the sentence CL1 will also be referred to as a sentential complement construction (henceforth SCC).

Figure 1 presents different meaningful levels of grammatical organization. Of particular interest to our current concerns are the main verb V1 and the complement structure CL2. It will be shown that different complement structures (indicative or subjunctive clauses) have different meanings. The meaning of each particular complement has to be compatible with that of the main verb.

2.2. Semantic compatibility and complement construal

It has already been pointed out that the notion of semantic compatibility between V1 and CL2 is a crucial tenet of the semantic tradition. However, within a CG analysis, that notion can be characterized more precisely. V1 profiles the relation existing between two participants: the subject C1/S1 and the complement CL2. Langacker (1991) claims that the relation which holds between the subject and complement is analogous to the one which exists between the ground and the grounded structure. He writes (1991: 442):

"The analogy is strongest when the subject (or another main-clause participant) functions as conceptualizer with respect to the contents of the subordinate clause, e.g. with verbs like say, believe, imagine, want, enjoy and realize. The subject's conceptualizing role vis-à-vis the subordinate structure is then comparable to that of the speech-act participants in conceptualizing an expression's meaning (the construal relationship)."

The choice of a particular complement reflects the subject's specific construal of the complement scene. Since the subject and the complement are both participants in a specific relationship expressed by the main verb, the possibilities of construal by the subject (the range of possible ways to structure the conceptual content of the complement) are limited by the meaning of the main verb. A more precise way to evaluate the semantic compatibility between V1 and CL2 is therefore to consider the possible type of construal relations existing between C1 and CL2. In the next section, the semantic import of an indicative clause is defined in terms of the type of construal its presence imposes on the complement content. Different verbs will be shown to be compatible with different types of complements because their respective subjects construe the content of their complement in different ways.6

3. Towards the definition of an indicative clause

A successful analysis of the meaning of an indicative clause requires the consideration of two Idealized Cognitive Models (henceforth ICM, Lakoff 1987) which function as some of the cognitive domains in terms of which its meaning gets characterized. Both models pertain to our folk understanding of reality. The first one is concerned with the way in which we conceive the elements or circumstances of the world. The second one involves the organization of our folk conception of reality. However, those two models are so broad and far-reaching that I will merely focus on some of their most relevant aspects.

3.1. Conception of an abstract element: dominion

The organization and structure of the most abstract domains of our experience are more easily understood metaphorically, i.e. in terms of the organization and structure of other concepts which are more familiar to us (Lakoff and Johnson 1980). Our understanding of reality is
based on such metaphors. I will now briefly consider one of them, namely "knowing is owning". A given conceptualizer handles the elements of the world in a way similar to the objects he owns. He manipulates them in a quasi-physical fashion. He can for instance extract them (this man is a mine of information), accept them (thanks for giving me the straight facts), misplace them (I forgot what color the French flag is), share them (do you want me to give you a clue?). The relationship which holds between a conceptualizer and the circumstances of the world he conceives of can be viewed as one of abstract possession.

For the characterization of different structures of concrete and abstract possession, Langacker (1991) developed the Reference Point Model (henceforth RPM). The basic idea of the RPM is that the world is full of objects, and that some of these objects are more cognitively salient than others. Salient objects are reference points, in relation to which other objects are more easily located. Each reference point anchors a region which is called its dominion. Consistent with the metaphor "knowing is owning", each conceptualizer can be thought of as a reference point, and the abstract region composed of the circumstances he conceives of can be considered his dominion (a more precise definition of the notion of dominion as it is used in this paper will be given in section 3.3). The elements of a conceptualizer's dominion will be called propositions. A conceptualizer's dominion is fully describable by a set of propositions. Part of what it means for a conceptualizer to have a particular proposition in his dominion is that he has active control over it (he can manipulate it). We will see below (in section 3.3) how, in order for a conceptualizer to control and manipulate a proposition, that proposition must be grounded (located with respect to the speech situation). For now, we will merely state that a proposition is a grounded circumstance of the world.

3.2. A folk model of reality

There is an idealized conception that reality is an objective entity, independent from all conceptualizers, of which each person has a very
limited and fragmentary knowledge (as indicated by the irony contained in the phrase a know-it-all). We know certain things and do not know others, we know that our knowledge of some facts is incomplete or simply false. We also know that other people may have a very different set of elements they consider true, that they are likely to have a different experience of reality, depending on their experience of the world. Even though, at a very abstract level, reality is thought of as being objective and independent from its conceptualization, each individual is only aware of a sub-part of it. Each conceptualizer's experience of the world represents his own conception of reality. Of course, since a very important part of our knowledge of the world is common human or cultural experience, a substantial degree of overlap is expected between people's conceptions of reality. Some human facts, which arise from common sensory perception, are universally considered true (fire burns, snow is cold, stones are not good to eat). Similarly, the basic facts which pertain to the organization of a given community are likely to be shared by most members of that community. For instance, most Americans know that the United States has a president, and know the colors of the US flag. Obviously, the people who share the greatest part of their experience of the world will have the most overlap in their conceptions of reality.

3.2.1. Dynamic character

Even though reality is usually considered stable, it is not a static configuration of events. It is conceived as evolving through time. For the characterization of the meaning of modality, Langacker (1991) developed the Basic Epistemic Model. In this model, reality is perceived as an "ever-evolving entity whose evolution continuously augments the complexity of the structure already defined by its previous history; the cylinder depicting it should thus be imagined as 'growing' along the axis indicated by the arrow" (Langacker 1991: 242; the arrow depicts the axis along which reality evolves). An illustration of the basic epistemic model is given in Figure 2 (from Langacker 1991: 242):


Figure 2. Basic reality

In Figure 2, the conception depicted by the cylinder could be called "basic reality", i.e. the sum of "known reality" and "immediate reality".

3.2.2. Basic reality

Very basically, reality represents the history of what actually happened. This includes the events (or states) which are currently in progress or have actually occurred, the events which we see (or think we see) happening, and those which have (or we think have) happened. However, the "growth" of reality along the time axis (cf. Figure 2) can account for yet another conception of reality. Talking about reality and its continuous flow, Langacker (1991: 276) writes: "There is an essential force dynamic aspect to our conception of its structure which we see as constraining and influencing the elements which unfold within it." It follows that its "evolutionary momentum" (a more precise characterization of its force dynamic nature) allows the future course of events to be predictable to some extent. Some elements are seen as possible, while others are definitely excluded from the possible turn of events. Langacker (1991: 276) speaks of the "organic continuity" of the evolution of reality: "successive instantiations of reality cannot represent totally distinct and unrelated conceptions. Instead, one instantiation bequeathes most of its organization to its successor, which diverges from it in only limited ways and only as permitted by the world's structure." Part of our conception of reality includes the understanding that the way it has already evolved leaves the potential
for further evolution in constrained directions. The possibility of predicting with reasonable confidence which among those directions will actually be realized is by itself a reflection on the way reality has evolved so far. In this sense, it is part of reality. This particular aspect of reality, however, represents an extension of basic reality. We will call it "elaborated reality".

3.2.3. Elaborated reality

Part of what it means to be aware of the dynamic nature of reality involves some knowledge of the possible paths the course of events will take, given the state of the world and its evolutionary momentum. A more abstract and inclusive conception of reality therefore needs to include, along with the events of basic reality, the knowledge of what is not, could or could not possibly happen. An illustration of the model of elaborated reality is given in Figure 3 (see Langacker 1991: 277):

[Figure 3. Elaborated reality. Surviving label from the original figure: known reality]

Crucially, elaborated reality represents the level at which the occurrence (or as the case might be the possible occurrence or non-occurrence) of events is described or reported. The elements of elaborated reality must therefore be considered relative to the speech situation, or in other words, grounded. Like a conceptualizer's dominion, his conception of elaborated reality is composed of grounded circumstances of the world, namely propositions. A conceptualizer's conception of elaborated reality is composed of the set of propositions accepted as true by that conceptualizer. In the next section, I will explore the relation between a conceptualizer's dominion and his conception of elaborated reality.

3.3. Meaning of an indicative clause

The possibility for a conceptualizer to manipulate propositions has already been considered. We are now in a position to specify the purpose of that manipulation, and to give a more precise definition of dominion: a conceptualizer establishes dominion over a circumstance to the extent that he actively controls and manipulates it in order to assess its status with respect to some conception of elaborated reality (his own or some other conceptualizer's). For example, when a sentence such as "John is here" is uttered, the speaker manipulates the proposition it contains by presenting it to his interlocutor. The purpose of that manipulation is to convince the hearer to insert the proposition in his conception of elaborated reality.

This definition allows the notions of elaborated reality and dominion to be integrated, yet separate. A dominion is composed of the set of propositions which a conceptualizer can manipulate (i.e. which he has control over), and elaborated reality is composed of the set of propositions he holds true (notice the manipulative term here). Consequently, a conceptualizer's dominion obligatorily includes his conception of elaborated reality but the converse is not true. A given conceptualizer's conception of elaborated reality cannot be larger than his dominion.

The definition of dominion presented above explains why a proposition must be grounded, which was only assumed in section 3.1. Grounding predications (such elements as tense, modality, and negation) are a necessary part of a proposition because they provide the location (the address) of that proposition in elaborated reality (immediate reality in the case of "John is here"). Since it obligatorily incorporates grounding predications, the appropriate verbal structure used for the expression of a proposition is a fully articulated indicative clause (Langacker 1991: 439).
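
The inclusion relation between dominion and elaborated reality described in this section can be stated compactly. The notation is not part of the original analysis: as an illustrative assumption, write D(C) for the set of propositions in a conceptualizer C's dominion and ER(C) for the set of propositions C holds true.

```latex
\[
  ER(C) \subseteq D(C),
  \qquad \text{whereas in general} \qquad
  D(C) \not\subseteq ER(C).
\]
```

On this reading, a proposition may belong to D(C) without belonging to ER(C): C controls and manipulates it without holding it true.
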
We are now in a position to offer a definition of the indicative mood. It has been pointed out that a given conceptualizer establishes dominion over a circumstance to the extent that he actively controls and manipulates it in order to assess its status with respect to elaborated reality. It has also been noted that a clause must be finite (grounded) for this purpose, since it is the grounding predications which provide the location of the proposition in elaborated reality (they locate the former with respect to the speech situation or immediate reality). Furthermore, finite verb inflection can be identified with indicative as opposed to subjunctive mood because subjunctive complements cannot be located in elaborated reality. For example, in Je veux que Jean revienne "I want John to come back", the subjunctive complement Jean revienne has no possible address in elaborated reality. It is merely considered with respect to the subject's wishes. The following definition of the meaning of the indicative mood can be proposed: the indicative mood signals a conceptualizer's dominion over the circumstance described in the clause, implying both grounding and active control.

This definition accounts for the presence of the indicative mood in simple sentences (as well as in V1 in a SCC) in a natural way. The production of any sentence has to be understood against the background of a model of communication. Communication can be thought of as the sharing of information. Part of what we do when we share information is address the status of what is being said with respect to elaborated reality. The utterance of any sentence can be considered an act of manipulation of the proposition expressed in that sentence by the speaker. Manipulation implies control, so the proposition is necessarily in the speaker's dominion. Consider the following independent clause examples:

(12) Jean a acheté une voiture.
     'John bought a car.'
(13) Marie déménage demain.
     'Mary moves out tomorrow.'

The purpose of the speaker in (12) - (13) is to offer to the hearer a piece of information he possesses.
He manipulates the proposition expressed in the sentence, with the intention of convincing the hearer to include it in his conception of elaborated reality. However, there is no
obvious way of knowing if the speaker considers the proposition true (part of his conception of elaborated reality), or if he simply has control over it (it is part of his dominion). The first scenario corresponds to a piece of information offered to the hearer in good faith (even if the proposition turns out to be inaccurate), the second one represents a conscious lie. The speaker tries to mislead the hearer by consciously conveying wrong information. In both cases, the propositions expressed in the independent clauses are in his dominion, hence justifying the indicative inflection. There is an ICM of communication (as well as a Gricean maxim) which allows us to interpret independent clauses such as the ones in (12) - (13) as true by default (we do not prototypically lie), unless some other element forces us to do otherwise. Similarly, the presence of the indicative mood in V1 in a SCC is an indication that CL1 is in the speaker's dominion.

I will now show that the mood variation in V2 is imputable to whether or not the complement of V1 represents a proposition, or, more precisely, whether the content of the complement can be viewed as part of some conceptualizer's dominion. The approach is as follows. The meanings of certain V1 verbs are fully compatible with the notion of proposition. They are thus compatible with the meaning of the indicative inflection and will appear with an indicative complement. The term "fully compatible" means that the subject of these verbs can construe the scene described in the complement as a proposition, and that particular construal is directly relevant to the verb's meaning. Some verbs are not semantically compatible with the notion of proposition. They are therefore not compatible with the meaning of the indicative and will be followed by subjunctive complements. Other verbs are only potentially compatible with the meaning of the indicative mood, because even though their subjects have the ability to construe the complement scene as a proposition, that particular construal is not directly relevant to their meaning. A verb's partial compatibility with the indicative inflection is reflected syntactically by a great deal of synchronic, diachronic, and cross-linguistic variation in the choice of its complements (indicative or subjunctive).
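
The three-way distinction just drawn (full, absent, and partial compatibility with the notion of proposition) can be sketched as a small lookup. This is only an illustrative sketch and not part of the analysis itself: the names `Compatibility`, `expected_moods`, and `ILLUSTRATIVE_VERBS` are hypothetical, and the assignment of individual verbs to classes is an assumption based on the examples in the text (voir and dire take indicative complements; vouloir takes a subjunctive one).

```python
from enum import Enum


class Compatibility(Enum):
    """Hypothetical labels for a main verb's compatibility with a proposition."""
    FULL = "full"        # propositional construal is directly relevant to the verb
    NONE = "none"        # verb is not compatible with the notion of proposition
    PARTIAL = "partial"  # construal is possible but not directly relevant


def expected_moods(compat):
    """Return the set of complement moods the account leads us to expect."""
    if compat is Compatibility.FULL:
        return {"indicative"}
    if compat is Compatibility.NONE:
        return {"subjunctive"}
    # Partial compatibility: variation between the two moods is expected.
    return {"indicative", "subjunctive"}


# Assumed, illustrative classification; not an exhaustive or authoritative list.
ILLUSTRATIVE_VERBS = {
    "voir": Compatibility.FULL,     # perception verb
    "dire": Compatibility.FULL,     # communication verb
    "vouloir": Compatibility.NONE,  # volition verb
}


def moods_for(verb):
    """Look up a verb in the illustrative table and return its expected moods."""
    return expected_moods(ILLUSTRATIVE_VERBS[verb])


if __name__ == "__main__":
    for verb in ILLUSTRATIVE_VERBS:
        print(verb, sorted(moods_for(verb)))
```

The partial class deliberately returns both moods, mirroring the claim that such verbs show synchronic, diachronic, and cross-linguistic variation rather than a fixed choice.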

4. Semantic compatibility between main verbs and complements

This section is concerned with the semantic characterization of the different types of V1 verbs considered earlier. These verbs are often introduced as verb classes. Although this is a convenient heuristic, such a description is not totally satisfactory. A more accurate description of the semantic grouping of a given set of verbs would indicate that all the members of that particular set share important elements of their base. The evocation of those common elements provides their semantic unity, therefore justifying the label of class. The semantic characterization of each individual verb needs to consider the conceptual base it evokes as well as the exact nature of the profile it imposes on that base. The different categories which constitute the bases on which the V1 verbs impose their specific profiles correspond to different categories which compose our Western folk model of the mind (D'Andrade 1987).7 Each category's complex internal structure is organized as an ICM. I will now consider the ICMs used as a base by all the members of the different verb classes presented earlier, as well as the particular profile imposed by each particular verb, in order to evaluate their semantic compatibility with an indicative clause.

4.1. Perception verbs and an ICM of perception

Objects are considered part of the outside world, and we establish contact with some of them. The fact that a person sees an object, for instance, implies visual contact between that person and an outside entity, namely that object. Nouns following verbs of perception illustrate the purest perceptual relation between the subject and the complement of that verb (D'Andrade 1987). This is illustrated in (14) and (15):

(14) Marie sent le parfum des roses.
     'Mary smells the fragrance of the roses.'
(15) Paul a entendu la cloche.
     'Paul heard the bell.'

Example (14) presents olfactory contact between the conceptualizer Marie and the object (the roses). In (15), the contact between Paul and the complement is established through hearing. Perception is represented as the presence of quasi-physical contact between a conceptualizer and an entity of the world, so that the conceptualizer and that entity are in a certain way connected. When a perception verb is a main verb in a SCC, its purely perceptual original meaning is somewhat extended. For instance, consider the difference between the infinitival and indicative complements illustrated in (16) - (17):

(16) J'ai entendu la cloche sonner. (INF)
     'I heard the bell ring.'
(17) J'ai entendu que la cloche sonnait. (IND)
     'I heard that the bell was ringing.'

If the object of perception is an event, the complement of the perception verb can be an infinitive, as shown in (16), or an indicative clause, as in (17). The contrast between (16) and (17) illustrates Bolinger's (1974) notions of "percept" and "concept". In both cases, the object of perception provides direct sensory input. However, the two examples illustrate different construals of that input. In (16) the content of the complement is viewed as an event (or in Bolinger's terms a percept). In (17), it is viewed as a facet of the main clause subject's conception of elaborated reality (Bolinger's concept). Bolinger notes that the presence of a sentential complement after a verb of perception necessarily involves greater conceptual distance between the subject and the object of perception than when the verb is followed by an infinitive. Besides reflecting additional conceptual distance between the participants in the perceptual relation, the use of perception verbs in a SCC often involves the passage from perception to cognition or, in
other words, from concrete to abstract (Sweetser 1987). Consider the examples in (18) - (19):

(18) Je vois qu'il m'a menti.
     'I see that he lied to me.'
(19) Jean sent que Marie ne l'aime plus.
     'John feels that Mary does not love him any more.'

Examples (18) and (19) illustrate the fact that voir 'see' and sentir 'smell' take on the more general meaning of 'realize' and 'feel' respectively.8 Examples (16) - (19) clearly show that the position of perception verbs as V1 in a SCC implies some departure from their original meaning. It involves either additional conceptual distance between the participants in the perceptual relation they profile, or the semantic extension to a more abstract type of cognitive relation. However, the basic idea that they profile a certain type of connection between a conceptualizer (their subject) and their complement remains unchanged. In a SCC, the presence of a perception verb in V1 position profiles the mental contact existing between C1 and CL2. Every specific verb renders more precise the exact nature of that contact (vision or understanding for voir, smell or feeling for sentir, etc.). Crucially, the presence of mental contact between subject and complement means that the latter's content is part of the main clause subject's conception of elaborated reality (in most cases), or at least, in his dominion. The verbs of perception are thus fully compatible with the meaning of an indicative clause. Their use as V1 in a SCC indicates that the speaker situates the proposition expressed in the complement in some conceptualizer's conception of elaborated reality (usually C1's, or possibly his own), and hence in that conceptualizer's dominion.

The traditional analysis of perception verbs insists on their factive character (Karttunen 1971, Kiparsky and Kiparsky 1970). As factive verbs, they presuppose the truth of their complements. It is a valid observation that the complements of perception verbs are prototypically true. However, this is not always the case. Consider the example in
(20), where a person is telling another person that a mutual acquaintance of theirs is totally persuaded that his wife is betraying him:

(20) Il voit qu'elle le trompe dans tous ses gestes!
     'He sees that she betrays him in everything she does!'

In (20), the complement is not necessarily true. The same sentence could be used to describe the difficulty of persuading the husband that he is not being betrayed. Examples such as (20) can be used to illustrate the difference between an analysis concerned with the truth conditions of the sentence, and the account presented here, where the particular conceptualization of a given conceptualizer is being considered. An analysis of the presence of the indicative mood in terms of the truth value of the complement would have difficulties explaining examples such as (20). A conceptual account has no problems with this type of data. The indicative is motivated by the fact that in the present and the affirmative form, the speaker presents the proposition expressed in the complement as part of C1's conception of elaborated reality, and hence in his dominion. The fact that the same proposition is not in the speaker's conception of elaborated reality (or anyone else's for that matter) has no bearing on the presence of the indicative inflection. The cases where the complement can be considered "true" are simply cases where the speaker and C1 (and possibly other conceptualizers) share that particular facet of elaborated reality. In terms of the analysis proposed here, factivity is reducible to shared conceptualization.

4.2. An ICM of communication

The exchange of ideas and information certainly represents one of the core functions of our linguistic experience. The ICM which structures our folk conception of communication is highly complex. I will merely focus here on some of its relevant elements. Communication involves an exchange between interlocutors, and thus an act of manipulation by the speaker of the proposition expressed in the sentence he utters. In other words, the speaker makes a
part of his dominion available to the person he is speaking to. We exchange ideas in order to enlarge our knowledge of the world. Other people provide us with the information we may lack, and in exchange, we give other people the benefit of our experience by transmitting our knowledge to them. In a prototypical kind of communication, each speaker speaks to be believed. He tries to convince his interlocutor that he is right, and that the latter should locate the proposition he offers in his conception of elaborated reality. As Vi in a s e e , the verbs of communication profile the transmission by the main clause subject of the proposition expressed in the complement to another person. Examples involving verbs of communication are given in (21) - (22): (21) Paul dit que Marie est belle. 'Paul says that Mary is beautiful.' (22) Le directeur a déclaré que Paul devait partir. 'The director declared that Paul had to go.' Dire 'say" constitutes the most neutral way to describe the subject's communicative manipulation. Other verbs give further information about the precise manner in which that manipulation is performed. Chuchoter 'whisper', balbutier 'mumble', crier 'shout', déclarer 'declare', s'exclamer 'exclaim', hurler 'scream' represent some of those verbs. Crucially, the communication verbs given above do not profile any type of confrontation between the interlocutors as to the location of the proposition with respect to elaborated reality. The verbs simply profile the passage of a piece of information from the speaker to the hearer. Of course, the latter has the option of rejecting the proposition offered to him if it does not correspond to his interpretation of the world, but that possibility is not evoked by the verb. However, our ICM of communication cannot be reduced to mere transmission of information. It has already been pointed out that part of what we do when we speak is try to convince other people that they should adopt our views of the world. 
This points to a potential conflict if our interlocutor is resistant to accepting the information we have to offer.

Michel Achard

Several verbs of communication also profile a certain amount of confrontation between the interlocutors regarding the location of the proposition with respect to elaborated reality. Consider the examples in (23) and (24):

(23) Jean prétend que Marie a gagné.
     'John claims that Mary has won.'
(24) Il t'a juré qu'il ne l'avait pas vue.
     'He swore to you that he hadn't seen her.'

Like the verbs in (21)-(22), the verbs in (23)-(24) profile the main clause subject's transmission of the proposition expressed in the complement to the hearer. However, unlike those in (21)-(22), they presuppose the existence of a background of communication prior to the production of the utterance. They imply that C1's claim that the proposition he offers deserves to be included in the hearer's conception of elaborated reality is controversial and needs to be defended. Along with the act of manipulation of the proposition by C1, the latter's extra effort to convince his interlocutor is also profiled.

Another set of verbs presents the other end of the communicative act. They profile their subject's final decision to accept or reject the proposition previously offered to him by his interlocutor. Such verbs include convenir 'agree', reconnaître 'recognize', admettre 'admit', avouer 'admit'. They are illustrated in (25) and (26):

(25) Jean reconnaît que Paul s'est amélioré.
     'John recognizes that Paul has improved.'
(26) Ma mère a convenu qu'elle avait tort.
     'My mother agreed that she was wrong.'

Prior to the production of (25) and (26), C1/S1's interlocutor has offered a proposition (Paul s'est amélioré 'Paul improved'; sa mère a tort 'his mother is wrong'), with the purpose of convincing the main clause subject to locate it in his conception of elaborated reality. The sentences in (25)-(26) represent the latter's response. He expresses


complete agreement with his interlocutor and unconditionally accepts the proposition into his conception of elaborated reality, hence justifying the use of the indicative.9

To briefly recapitulate, it has been pointed out that the verbs of communication profile some aspect of a complex conceptual base, represented by our ICM of communication. Regardless of their semantic differences, they all involve the manipulation by the main clause conceptualizer of the proposition expressed in their complement, with the intention of convincing their interlocutor to insert it in his conception of elaborated reality. Conception being a prerequisite to manipulation, the proposition is necessarily in C1's dominion. The communication verbs are therefore straightforwardly compatible with the meaning of an indicative clause.

4.3. Propositional attitudes: certainty, thoughts, and beliefs

The world is a difficult place to know completely. The only way to reach some level of understanding of it is through the interpretation of the elements which unfold throughout the course of events. Our beliefs and certainties about the facts of the world are provided to us by clues coming from various sources. For instance, the sound of a car near Mary's house might lead me to believe that she is now getting home. However, those hints do not provide us with undeniable certainty about the presence of the proposition considered in elaborated reality. They simply permit us to assess that presence with varying degrees of confidence. Furthermore, since our opinions about the world are mostly a matter of interpretation, different conceptualizers are likely to interpret the world differently, and reach different conclusions. The verbs of propositional attitude profile the whole range of possible levels of commitment by the main clause subject to the proposition expressed in the complement (as interpreted by the speaker, of course). Examples of such verbs in a SCC are given in (27)-(30):

(27) Jean sait que Marie est malade.
     'John knows that Mary is sick.'


(28) Jean est convaincu que Marie a tort.
     'John is convinced that Mary is wrong.'
(29) Je pense que j'ai compris le problème.
     'I think I understood the problem.'
(30) Jean soupçonne que sa femme le trompe.
     'John suspects that his wife is unfaithful.'

The verbs presented in (27)-(30) profile different levels of commitment by the main clause subject to the presence of the proposition expressed in the complement in his conception of elaborated reality. With the verbs in (27) and (28), the proposition is presented in C1's conception of elaborated reality. With such verbs as penser (illustrated in (29)), C1 considers the proposition a good candidate for insertion into his conception of elaborated reality. With verbs like soupçonner in (30), the main clause subject evaluates the proposition as a possible candidate for insertion into his conception of elaborated reality.

4.3.1. Certainty and knowledge

The verbs of knowledge presented in (27) and the verbs of certainty presented in (28) exhibit strong similarities (both present the circumstance expressed in their complement in their subject's conception of elaborated reality); however, philosophers of language have repeatedly pointed out the differences between them. The concept of knowledge necessarily involves reference to some notion of "truth" (Nissenbaum 1985). Knowledge has traditionally been considered a true belief, which has been verified by logical or empirical truth conditions. However, since we operate within a framework where meaning is equated with conceptualization, the very notion of truth needs to be further defined. In the analysis proposed here, truth does not always refer to objective concepts, universally and unquestionably observable. It very often merely rests on a conventionally accepted opinion, validated (or dictated) by someone (or a group of people) invested with the authority to determine its value. The knowledge of something is a matter of agreement between the subject of the knowledge verb and the authority who assumes responsibility for the validity of the proposition expressed in the complement. For instance, the speaker in (27) validates Jean's position by assuming the responsibility to declare the proposition true. Consequently, with verbs of knowledge, the speaker and the main clause conceptualizer both have the proposition expressed in the complement in their respective conceptions of elaborated reality.

The verbs of certainty are solely concerned with the main clause subject's position towards the proposition expressed in the complement. They profile its presence in C1's conception of elaborated reality, but are totally neutral as to anyone else's belief. In that respect, they are very similar to the belief verbs to be considered later. Consequently, with verbs of certainty, the speaker and the main clause subject do not necessarily share the proposition expressed in the complement in their respective conceptions of elaborated reality. Since the verbs of certainty profile the presence of the proposition in C1's own conception of elaborated reality, their use with common knowledge propositions (which are by definition shared by a whole section of the population) is quite strange. Compare (31) with (32):

(31) ?Jean est certain que la terre tourne autour du soleil.
     'John is certain that the earth revolves around the sun.'
(32) Jean est certain que l'eau bout à 50°.
     'John is certain that water boils at 50°.'

Example (31) is only possible as a sarcastic statement making fun of John's poor knowledge of physics. The description of the proposition as only present in one person's conception of elaborated reality represents a violation of the Gricean maxim "say as much as you can". With the verbs of certainty, the speaker (and possibly other conceptualizers) do not usually share C1's position.
On the other hand, the example in (32) is perfectly felicitous (even though it can also be sarcastic). It describes a belief solely attributed to John which, crucially, the speaker does not share. The fact that that belief is not shared by most people is inconsequential here.


Even though the verbs of knowledge and certainty exhibit strong semantic differences, both are compatible with the meaning of the indicative, since in both cases the proposition expressed in the complement is part of the main clause conceptualizer's conception of elaborated reality. The main semantic difference between the two classes of verbs, namely the sharing of the proposition in elaborated reality between the speaker and C1, is not relevant to the selection of the indicative inflection.

4.3.2. Thoughts and beliefs: penser and croire

Verbs like penser 'think' or croire 'believe' present weaker beliefs than the verbs of certainty. The contrast between the verbs of belief and certainty is illustrated in (33) and (34):

(33) Jean est sûr que Paul est arrivé.
     'John is sure that Paul has arrived.'
(34) Jean pense que Paul est arrivé.
     'John thinks that Paul has arrived.'

In (33), the proposition Paul est arrivé is considered as part of the main clause subject's conception of elaborated reality. In (34), it is only considered a good candidate for insertion into C1's conception of elaborated reality. Note that penser exhibits the same neutrality as être certain as to the speaker's position concerning the location of the proposition expressed in the complement. It simply profiles the situation of the proposition as a good candidate for insertion into C1's conception of elaborated reality.

Croire 'believe' is ambiguous. It can pattern with penser, in which case the proposition expressed in the complement is considered a good candidate for insertion into C1's conception of elaborated reality. This case is illustrated in (35). It can also mean "hold the belief that", as illustrated in (36). In that case, the proposition is definitely part of C1's conception of elaborated reality.


(35) Jean croit que Marie est partie, mais il n'est pas sûr.
     'John believes that Mary has left, but he is not sure.'
(36) Paul croit que tous les Américains ont un cheval *mais il n'est pas sûr.
     'Paul believes that all Americans have a horse *but he is not sure.'

When croire patterns with penser, it is neutral as to the speaker's position towards the proposition expressed in the complement. When it means "hold the belief that", it is very similar to the certainty verbs considered earlier. The main difference between that sense of croire and the verbs of certainty is that the latter verbs imply the previous consideration of the proposition, and a decision made by C1 to consider it true. Croire does not evoke the consideration of possible alternatives to the proposition expressed in the complement. It simply describes C1's belief, which might have never been questioned.

4.3.3. Suspicion

The verbs of suspicion such as soupçonner 'suspect' present a much weaker commitment by the main clause subject to the proposition described in the complement. The circumstances in CL2 are presented as possible candidates for insertion into C1's conception of elaborated reality. Note that the speaker is neutral as to the situation of the proposition. The latter is exclusively considered with respect to C1's conception of elaborated reality.

The verbs of propositional attitude are fully compatible with the meaning of an indicative clause because their meaning precisely consists in evaluating the complement clause with respect to its position in elaborated reality. The hierarchy considered between the être sûr, penser and soupçonner verbs involves the level of commitment by the main clause subject to the presence of the complement in his conception of elaborated reality. Total commitment is provided by the être sûr verbs. The penser verbs present the proposition as a very good candidate for insertion into the subject's conception of elaborated


reality, and the soupçonner verbs present the proposition as a possible candidate. Even though, with the suspicion verbs, the main clause subject is not yet sure of the location of the proposition with respect to reality, he nonetheless clearly conceives of that proposition. His conception can be demonstrated by certain types of actions or behaviors justifiable on the basis of the suspicion. Savoir is analyzed in the same way as the certainty verbs, with the added particularity that the speaker and C1 both have the proposition expressed in the complement in their respective conceptions of elaborated reality.

4.4. Verbs followed by subjunctive complements

It is now time to turn to the verbs which take subjunctive complements, namely the verbs of volition and emotional reaction. It will be shown that the meaning of the volition verbs is incompatible with the meaning of an indicative clause, because they do not allow the circumstance expressed in the complement to be considered as part of any conceptualizer's dominion. In other words, they do not present a proposition. The presence of the subjunctive inflection in the subordinate verb reflects that semantic incompatibility. The verbs of emotional reaction are not incompatible with the meaning of an indicative clause, but the construal of the complement scene as a proposition (although possible) is not crucial to their meaning. The presence of the subjunctive inflection reflects the irrelevance of the construal of the complement content as a proposition for the subject.

An important part of the analysis rests on the difference between events and propositions. The distinction between these two notions is very basic, and can be traced back to our folk model of reality. Recall that there is a conception of reality concerned with the mere occurrence of past and present events (which we will simply define here as the accomplishment of a specific process by some participant). This conception of reality was called basic reality (cf. section 3.2.2). The elements of basic reality are conceptualized as mere events, and hence do not necessarily include grounding predications.10 When these events are being described or reported (when someone says that an event occurred), they are described or reported in relation to the


speech situation, and thus grounded. In other words, they are described or reported in terms of propositions. The description or report of an event is made at the level of elaborated reality, where the verbal structure contains such grounding predications as modality or negation. The distinction between events and propositions correlates with the distinction between basic and elaborated reality. Basic reality pertains to events; elaborated reality pertains to propositions.

The distinction between events and propositions on the one hand, and basic and elaborated reality on the other hand, can be illustrated in the following example. Consider the independent clause: "John came yesterday". The sentence presents an event, namely John's coming. That event is a potentially observable element of basic reality. The description of the occurrence of that event by some conceptualizer as "John came yesterday" yields a proposition. A proposition is an element of that conceptualizer's dominion. The grounding predications it contains provide it with a putative address in elaborated reality (here in known reality).

The distinction between events and propositions is directly relevant to the distribution of indicative and subjunctive complements in French. It has already been shown that the verbs which take indicative complements are compatible with the meaning of a proposition. I will claim that the verbs which take subjunctive complements are those which are solely concerned with the events expressed in their complements, and not with the construal of those events as propositions. This accounts for the volition verbs, which are incompatible with the meaning of an indicative clause, as well as the verbs of emotional reaction, for which the construal of the complement scene as a proposition is not directly relevant. I will now consider these two cases in turn.

4.4.1. Volition and desire

Volition is parasitic on the absence of possession. In Old English, want and lack are synonyms (Newman 1981). Wanting something necessarily implies that the object of desire is not in the wanter's possession. With a simple sentence such as "John wants an orange", the


orange is not in John's possession. The orange is also the object of John's desire. Consider the examples in (37), where the volition verbs are in V1 position in a SCC:

(37) a. Jean veut que Marie parte.
        'John wants Mary to leave.'
     b. Le directeur désire que Jean vienne tout de suite.
        'The director desires that John come right away.'

In the examples in (37), the objects of S1's desire (Mary's departure, John's immediate arrival) are events. Their accomplishment would change the course of events in the direction most suitable for the main clause subject. The volition verbs express a conflict between the current state of affairs and S1's comfort. The accomplishment of the event described in the complement will resolve that conflict. For instance, in (37a), Mary's departure will resolve John's displeasure. Volition verbs are thus at no time concerned with the description or report of the event expressed in the complement. They are strictly concerned with the accomplishment of the events necessary for the world to fit the desires of the main clause subject. Importantly, the realization of these events does not necessarily figure along the plausible paths reality is likely to take. Volition can thus be considered as a way to try to change the world. This view of volition is compatible with Bolinger's (1968) notion of "influencing the outcome". It is also compatible with Searle and Vanderveken's (1985) "world to word direction of fit" of commissive and directive predicates.

On psychological grounds, it seems reasonable to propose that one desires an event, not a proposition. This psychological observation has a strong correlate in the analysis proposed here. The ungrammaticality of volition verbs with an indicative complement neatly falls out of the semantic incompatibility between the meaning of those verbs and that of a proposition. A fundamental difference between events and propositions is that since propositions are grounded, they are tied to the time of speech (immediate reality). Events are not. Consider the ungrammatical sentences in (38):

(38) a. *Je veux que vous mangez. (IND)
        'I want that you eat.'
     b. *Elle demande que vous partez. (IND)
        'She demands that you leave.'

The complements in (38) are in the present tense. They are thus candidates for insertion in immediate reality. Recall however that this is incompatible with the notion of wanting, since the possession of an element is incompatible with the wanting of the same element. Notice importantly that the ungrammaticality of (38) is truly caused by the semantic incompatibility between vouloir and the indicative, and not by the presence of the present tense. A future in the complement clause fares no better:

(39) a. *Je veux que vous mangerez. (IND.FUT)
        'I want that you will eat.'
     b. *Elle demande que vous partirez. (IND.FUT)
        'She demands that you will leave.'

The events named by manger 'eat' and partir 'leave' can conceivably occur in the future. However, the propositions that vous mangerez 'you will eat' and vous partirez 'you will leave' name are candidates for insertion into elaborated reality (more precisely projected reality; see Figure 3, p. 581). This means that reality can be predicted to evolve in such a manner as to incorporate those events. Recall that prediction into projected reality is rendered possible by the evaluation of the evolutionary momentum of current reality. With vouloir, there is no evolutionary basis to the projection, since only a wish is expressed. A wish (or a command) does not primarily concern itself with the way events naturally occur; it simply expresses the most beneficial outcome for the conceptualizer. The volition verbs in French are thus incompatible with the meaning of an indicative clause because they do not allow C1 to construe the event presented in the complement as a proposition.
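The distribution argued for up to this point can be restated as a small lookup table. The Python sketch below is only an illustration under stated assumptions, not part of the analysis itself: the names `MOOD_BY_CLASS`, `VERB_CLASS`, and `complement_mood` are hypothetical, and the table covers only verbs cited as examples in the text.

```python
# Illustrative sketch only. All names here are hypothetical labels
# introduced for this example; they are not the author's notation.

# Mood of the sentential complement selected by each verb class,
# as argued in the text up to this point.
MOOD_BY_CLASS = {
    "communication": "indicative",
    "propositional_attitude": "indicative",
    "volition": "subjunctive",
}

# Verbs cited as examples in the text, with the class assigned to them.
VERB_CLASS = {
    "dire": "communication",
    "déclarer": "communication",
    "prétendre": "communication",
    "savoir": "propositional_attitude",
    "penser": "propositional_attitude",
    "soupçonner": "propositional_attitude",
    "vouloir": "volition",
    "désirer": "volition",
    "demander": "volition",
}


def complement_mood(verb: str) -> str:
    """Return the complement mood for a verb listed in VERB_CLASS."""
    return MOOD_BY_CLASS[VERB_CLASS[verb]]
```

For instance, `complement_mood("vouloir")` yields the subjunctive, in line with the ungrammaticality of the indicative in (38)-(39). A flat table of this kind deliberately ignores the graded, meaning-based compatibility that the analysis attributes to individual verbs; it records the outcome, not the explanation.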

4.4.2. Emotional reaction and the use of the subjunctive

It has been shown that the volition verbs are incompatible with the meaning of a proposition, and hence with that of the indicative inflection. Even though they are also followed by subjunctive complements, a similar statement cannot be made for the verbs of emotional reaction, since they used to be followed by indicative complements in Old French. Furthermore, in other Romance languages (Romanian, Spanish under certain conditions), the indicative mood can be present in the complement. The claim here is that although its construal as a proposition is possible, it is the consideration of the event presented in the complement scene which is directly relevant to the meaning of the verbs of emotional reaction. Consider the examples in (40)-(41):

(40) Jean préfère que tu reviennes tout de suite. (SUBJ)
     'John prefers that you come back right away.'
(41) Je suis content que vous soyez venu. (SUBJ)
     'I'm glad you came.'

The verbs in (40)-(41) profile the reaction of their subject towards the scene presented in the complement. It is the particular construal of that scene which determines the syntactic behavior of the verb. First of all, unlike the volition verbs, the verbs of emotional reaction are to some degree compatible with the meaning of an indicative clause. In order for the subject to exhibit a particular reaction towards the event expressed in the complement, he must be able at some level to describe that event (and therefore describe a possible facet of elaborated reality). It is thus possible for the subjects of the verbs of emotional reaction to construe the content of their complement as a proposition. However, the construal of the complement content as a proposition is not directly responsible for the subject's reaction. More directly relevant to the reaction profiled by V1 is the particular event profiled in CL2. I claim that the subjunctive inflection in (40)-(41) indicates that the verbs of emotional reaction are specifically concerned with the reaction of their subject to the event expressed in the complement. The


relevance of events or propositions to a particular main verb accounts for the contrast in (42)-(43):

(42) Jean sait que vous avez échoué. (IND)
     'John knows that you failed.'
(43) Jean est désolé que vous ayez échoué. (SUBJ)
     'John is sorry that you failed.'

The conceptual content of the complement structure is similar in (42) and (43). However, that content is construed differently in the two sentences. In (42), the semantics of V1 indicate that the speaker is mostly concerned with the description of a facet of C1's conception of elaborated reality. The construal of the complement content as a proposition is thus directly relevant. In (43), the speaker is more concerned with the description of John's reaction to a particular event expressed in the complement. It is undeniable that C1 is capable of construing the complement content as a proposition, but that construal is not directly relevant to the meaning of the main verb (and hence to the communicative purpose of the sentence).11 The solution presented here is consistent with Wierzbicka's analysis of the subjunctive following emotional reaction verbs as anti-cognitive.

4.4.3. Events versus propositions: the case of espérer

The analysis presented here allows us to capture subtle semantic differences between verbs which are otherwise very close in meaning. These semantic differences are in turn reflected in their different syntactic behaviors (i.e. the kind of complements they take). As an example, consider the notoriously difficult case of espérer 'hope'. Conceptually, a hope is very close to a wish. In both cases, the scene described in the complement is uncertain, and its realization represents the desired outcome for the main clause conceptualizer. However, syntactically, espérer patterns with the belief verbs, taking indicative complements and rejecting the subjunctive. The contrast between espérer and the volition verbs is illustrated in (44) and (45):

(44) a. Je veux qu'il gagne au Loto. (SUBJ)
     b. *Je veux qu'il gagnera au Loto. (IND)
        'I want him to win the Lotto.'

(45) a. J'espère qu'il gagnera au Loto. (IND)
     b. *J'espère qu'il gagne au Loto. (SUBJ)
        'I hope that he will win the Lotto.'

In accordance with the analysis presented above, we have to claim that the complement presents an event in (44a) and a proposition in (45a). The ungrammaticality of (44b) has already been accounted for by the semantic incompatibility between the volition verbs and an indicative clause. More precisely, with vouloir, there is no current basis to indicate that reality can be expected to evolve in such a way as to contain the event described in the complement. This is not the case with espérer. A hope is different from a wish in that there must be some basis (indication) which leads the conceptualizer to believe that the momentum of reality might lead it along a path which includes the preferred outcome. This difference between vouloir and espérer is illustrated in (46)-(47):

(46) Je veux gagner au Loto, mais je ne joue jamais.
     'I want to win the Lotto, but I never play.'
(47) ??J'espère gagner au Loto, mais je ne joue jamais.
     'I hope to win the Lotto, but I never play.'

Whereas (46) is perfectly felicitous, (47) is very strange. One cannot hope for a particular outcome unless one takes some action (or has some independent evidence) which indicates that this outcome is indeed possible. Similarly, (45a) is infelicitous if C1 knows that S2 never plays.

This semantic difference between vouloir and espérer accounts for their syntactic difference. Unlike that of the volition verbs, the subject of espérer does not try to influence the outcome of things. He is aware that the course of events might take different paths, and merely points out one of those paths as the most beneficial for himself. Crucially, what allows him to even entertain the idea that reality might evolve in such a way as to incorporate the event most beneficial for him is the presence of certain elements of current reality which validate his interpretation. For instance, the fact that S2 played (or usually plays, or even that C1 thinks that S2 usually plays) in (45a) constitutes an indication which justifies C1's belief that reality might evolve in such a way as to incorporate S2's winning. Espérer therefore involves the description of a possible facet of elaborated reality, appropriately described by a proposition. C1 can conceive of that proposition and claim his anticipation to establish it in his conception of elaborated reality. It is therefore part of C1's dominion, and espérer is compatible with the meaning of an indicative clause.12

The different syntactic behaviors of vouloir and espérer illustrated in (44) and (45) are thus based on the respective semantics of the main verbs. In spite of their conceptual proximity, the two verbs are compatible with different types of complements because their respective subjects construe the content of their complements differently.

5. Recapitulation and conclusion

The necessity of semantic compatibility between the main verb and the complement structure was shown to account for the distribution of the indicative/subjunctive sentential complements in French. The meaning of the indicative was described as the sign of a conceptualizer's dominion over the circumstance described in the complement clause. The verbs of perception, communication, and propositional attitude have been shown to be fully compatible with the meaning of an indicative clause because they allow the speaker to present the circumstance described in the complement as part of Ci's dominion (to construe the complement content as a proposition). The presence of indicative complements following these verbs illustrates that compatibility. On the other hand, the verbs of volition have been shown to be incompatible with the meaning of an indicative clause since they are solely concerned with the event described in the complement. They do not allow the conceptualizer Ci to construe the content of the com-

604

Michel Achard

plement as a proposition. The verbs of emotional reaction have been shown to be potentially compatible with the meaning of an indicative clause. Even though Ci is capable of construing the content of the complement as a proposition, it is the very occurrence of the event (âuu »iül the possible description of that event) which is the most relevant to the main verb's meaning. The subjunctive inflection illustrates the irrelevance of the construal of the complement's content as a proposition.13 An analysis based on different degrees of compatibility for the presence of indicative complements precisely isolates the verbs (iespérer and the verbs of emotional reaction) which are the most likely to exhibit synchronic, diachronic and cross-linguistic variation in the syntactic realization of their complements. The characterization of the meaning of the complement structure in terms of the type of construal it imposes on the complement content has been shown to successfully account for the problem of mood distribution in sentential complements. I believe that the approach presented here sketches a promising methodology for the semantic consideration of other constructions where the indicative and subjunctive inflections contrast, and that it can be extended to the analysis of infinitival complements. Once a precise semantic characterization of the indicative, subjunctive and infinitival mood is achieved (in terms of the type of construal their presence imposes on the complement content), the numerous issues pertaining to their distribution can confidently be addressed. Notes 1.

2.

3.

I would like to express my gratitude to Richard Epstein, Gilles Fauconnier, Suzanne Kemmer, Ronald W. Langacker, Ricardo Maldonado, Errapel Mejias-Vicandi, Maura Velazquez, Sanford Schane, as well as three anonymous reviewers for their very helpful comments on earlier versions of this paper. All remaining errors and shortcomings are of course my own. Even though infinitives can also be considered sentential complements, this paper is solely concerned with the indicative/subjunctive alternation. The connection between negation and the subjunctive mood seems to run very deep. It is well-known that the verbs of propositional attitude such as croire 'believe', or penser 'think' are followed by

Complement construal in French

4.

5. 6.

7.

8. 9.

10. 11.

subjunctive complements in the negative form whereas they are followed by indicative complements in the positive form. In fact, in the negative form, even the verbs of perception can be followed by subjunctive complements, as in the following example: Je n'avais pas remarqué qu'il fasse (SUBJ) aussi froid 'I had not noticed it was so cold'. The felicity of the sentence is greatly helped by the presence of the past tense (plus-que-parfait). The connection between negation and the subjunctive mood is therefore not restricted to a certain type of verb, but seems to involve more general elements. It will not be considered here. I started my investigation of the indicative/subjunctive contrast with sentential complements because I believe these senses are very close to the prototype of the indicative and subjunctive inflections. This is, however, only a working hypothesis, which has no bearing on the analysis of the sentential complement construction itself. In other work (Achard 1993) I consider more specifically how my solution fits with respect to the accounts mentioned here. This type of analysis is not to be confused with the "syntactic account" referred to earlier, where an arbitrary feature of the main verb triggers the use of a specific complement. On the contrary, in the account proposed here, it is the semantic structure of a morpheme which determines how readily it lends itself to certain kinds of valence relations (Langacker 1988b). The valence of the main verb, i.e., what type of complement it occurs with, is therefore partly predictable from its semantics. D'Andrade recognizes 6 categories which compose our Western model of the mind: perception, belief/knowledge, feeling/emotions, intentions, desires/wishes, resolution/will/self-control. Of particular interest to this paper are the categories of perception, belief, feelings and desires. Intentions and resolutions prototypically involve situations expressed linguistically by an infinitive complement.
They will not be considered in this paper. Sweetser (1987) points out that diachronically, most terms of understanding derive from visual terms. Verbs such as reconnaître, admettre can also be followed by subjunctive complements, as in the following: J'admets qu'il soit riche, mais ça ne l'excuse pas. 'I admit that he is rich, but that does not excuse him'. Such examples are related to the presence of the subjunctive inflection in negative contexts. They will not be considered here. Since the term "event" is taken in its broadest sense of the accomplishment of some process by some participant, it also includes states or emotions. As indicated earlier, a hope is conceptually very close to a wish. The "hope" verbs are less compatible with the meaning


of an indicative clause than the verbs of perception, declaration and propositional attitude. This accounts for the cross-linguistic variation observed in the Romance languages. In Italian, the equivalent of espérer is followed by subjunctive complements. I am grateful to an anonymous reviewer for pointing out to me that in Spanish, esperar can be followed by either indicative or subjunctive complements, depending on the subject's confidence in the realization of the complement scene (Real Academia Española: Esbozo de una nueva gramática de la lengua Española: par. 3.13.4). This variation is fully consistent with the analysis proposed here. In French, fear is viewed as a specific kind of emotional reaction. The fear verbs such as craindre 'fear' are thus followed by subjunctive complements. For the sake of brevity, these will not be considered in detail here. Note that the presentation of an event is not sufficient to characterize the meaning of the subjunctive inflection. Infinitival complements can also be described as presenting events. The difference between an event presented by an infinitive and one presented by a subjunctive involves the precise evaluation of the semantic import of both types of complements. This issue will be considered in later work.

References

Achard, Michel
1993 Complementation in French: A cognitive perspective. Ph.D. dissertation, University of California at San Diego.

Bolinger, Dwight
1968 "Postposed main phrases, an English rule for the Romance subjunctive", Canadian Journal of Linguistics 14: 3-30.
1974 "Concept and percept, two infinitive constructions and their vicissitudes", in: Working papers in phonetics: Festschrift for Dr Onishi's Kiju. Tokyo: Phonetic Society of Japan, 65-91.

D'Andrade, Roy
1987 "A folk model of the mind", in: Dorothy Holland & Naomi Quinn (eds.), Cultural models in language and thought. Cambridge: Cambridge University Press, 112-148.

Guillaume, Gustave
1929 Temps et verbe. Paris: Champion.
1990 Leçons de linguistique de Gustave Guillaume 1943-1944. Laval, Québec: Presses de l'Université.
1992 Leçons de linguistique de Gustave Guillaume 1944-1945. Lille: Presses Universitaires.

Karttunen, Lauri
1971 "Some observations on factivity", Papers in Linguistics 4: 55-70.

Kiparsky, Paul & Carol Kiparsky
1970 "Fact", in: Manfred Bierwisch & Karl E. Heidolph (eds.), Progress in linguistics. The Hague and Paris: Mouton, 143-173.

Lakoff, George
1987 Women, fire, and dangerous things: What categories reveal about the mind. Chicago and London: University of Chicago Press.

Lakoff, George & Mark Johnson
1980 Metaphors we live by. Chicago and London: University of Chicago Press.

Langacker, Ronald W.
1987 Foundations of cognitive grammar. Vol. I: Theoretical prerequisites. Stanford: Stanford University Press.
1988a "An overview of cognitive grammar", in: Brygida Rudzka-Ostyn (ed.), Topics in cognitive linguistics. Amsterdam and Philadelphia: Benjamins, 3-48.
1988b "The nature of grammatical valence", in: Brygida Rudzka-Ostyn (ed.), Topics in cognitive linguistics. Amsterdam and Philadelphia: Benjamins, 91-125.
1991 Foundations of cognitive grammar. Vol. II: Descriptive application. Stanford: Stanford University Press.

Moignet, Gérard
1981 Systématique de la langue Française. Paris: Klincksieck.

Newman, John
1981 "The semantics of raising constructions". Unpublished Ph.D. dissertation, University of California at San Diego.

Nissenbaum, Helen Fay
1985 Emotion and focus. Stanford: Center for the Study of Language and Information (CSLI) Publications.

Real Academia Española
1973 Esbozo de una nueva gramática de la lengua Española. Madrid: Espasa-Calpe.

Rivero, Maria-Luisa
1971 "Mood and presupposition in Spanish", Foundations of Language 7: 305-336.

Searle, John & Daniel Vanderveken
1985 Foundations of illocutionary logic. Cambridge: Cambridge University Press.

Sweetser, Eve E.
1987 "Metaphorical models of thought and speech: A comparison of historical mappings in the two domains", Berkeley Linguistic Society 13: 446-459.

Terrell, Tracy & Joan Hooper
1974 "A semantically based analysis of mood in Spanish", Hispania 57: 484-494.

Wierzbicka, Anna
1988 The semantics of grammar. Amsterdam and Philadelphia: Benjamins.

Typology of if-clauses

Angeliki Athanasiadou and René Dirven

0. Introduction

Research on conditionality has thus far exhibited a multifaceted picture. Conditionality has been intensively studied by logicians (Hilpinen 1981; Harper 1981) and linguists (Schachter 1971; Lauerbach 1979; Davies 1979; Traugott 1986). Conditionals have long been related to causality (Geis and Zwicky 1971) or to temporal reference (Comrie 1986), but more interesting approaches develop when emphasis is shifted to the pragmatics of conditionals (Haiman 1978; Haegeman 1984; König and Van der Auwera 1988). Despite the great number of articles, the appearance of a special issue of the Journal of Pragmatics 1983, and the excellent collective volume of Traugott et al. 1986, there is as yet no extensive description nor analysis of a possible typology of conditionals. Akatsuka (1986), for instance, argues against a truth-conditional perspective in favour of a linguistic, specifically a pragmatic approach. Using Japanese, English and some German data reported in newspapers, she shows that we must consider discourse context as well as the speaker's attitude and prior knowledge to account for the semantics of conditionals. The discourse function of conditionals is a major concern in almost every paper in the Traugott et al. (1986) volume. Ford and Thompson's contribution is, however, the only one which analyses actual data, setting out to test Haiman's (1978) hypothesis that conditionals are topics. Pragmatic conditionals are treated by Haegeman (1984) and also by Van der Auwera (1986), who discusses the interpretation of conditionals as threats and promises or as concessives. The class of conditional speech acts is investigated by Wunderlich (1977) as warnings and threats, pieces of advice, extortions and negotiations, offers and proposals, and the role of speech acts that use counterfactuals.


The main object of Sweetser's analysis is to elucidate the functioning of conditionality in the content domain ("real world"), in the epistemic domain (knowledge of the truth of the hypothetical premise expressed in the protasis would be a sufficient condition for concluding the truth of the proposition expressed in the apodosis) and in the speech-act domain (the performance of the speech act represented in the apodosis is conditional on the fulfillment of the state described in the protasis). All of the above analyses, however insightful and detailed they are, do not consider the full spectrum of if-clauses in English. In fact, they mostly concentrate on one or two types of if-clauses, namely what we can already call hypothetical and logical if-clauses. In contrast to this, the aim of the present paper is to provide a detailed description of all the types of English if-clauses. We also want to study the relationships and connections between these types and subtypes in order to make an initial contribution to the discussion of more fundamental questions such as the following: which are the central uses of if-clauses, how do these cover major cognitive and communicative needs, and how have more marginal uses developed as a result of other, more specific, cognitive needs. In this pilot study of conditionals, about 300 instances were examined, taken from the Collins/Birmingham University International Language Database (COBUILD). This corpus represents a large number of English registers and therefore guarantees reliable source materials. The total number of if-clauses in the COBUILD corpus is approximately 20,000 instances, but although this study is ultimately based on only 400 of them, it turned out to be representative of the way if-clauses behave: when one hundred more examples, taken from different parts of the corpus, were compared to the original 300 examples, the overall distribution remained the same.
The selected examples were numbered and the number is given after each example quoted. In our attempt to set up a typology of if-clauses we will start from the assumption that the classes of conditional clauses and if-clauses are not necessarily one and the same thing. Therefore, if we take them as potentially different things we must also come up with a classification that reflects this potential difference. Thus, we want to propose that the first two types of if-clauses below are always conditionals, but


that the next two types are marginal conditionals, which means that some of them are conditionals and others are not. Following this line we can say that conditional clauses and/or if-clauses can be classified into four types of clauses due to semantic oppositions; those types can be labelled and illustrated as follows:

Event-based conditionals:

(1) Course of events conditionals (CEC)
If there is a drought at this time, as happens so often in central Australia, the fertilised egg in the uterus still remains dormant. (43)

(2) Hypothetical conditionals (HC)
If there is no water in your radiator, your engine will overheat immediately. (17)

Marginal conditionals:

(3) Logical if-clauses (LIC)
If there's an elite in China, she wrote, it's the masses; and the masses are the workers, peasants and soldiers. (14)

(4) Conversational if-clauses (CIC)
He had learnt far more from him than from his tutors, if the truth were known. (307)

Types (1) and (2) are both event-based conditionals, i.e., one has a relationship between a first event and a second event in (1), or between a possible event (or more generally a situation) and its consequence in (2). The first case is what we call a course of events conditional (CEC), the second case is a hypothetical conditional (HC). The main distinctive characteristics between the two types are the following: (a) CECs refer to generally or occasionally recurring events, whereas HCs usually refer to hypothetical events, which mostly denote a single occurrence of an event - although it is, of course, possible that a hypothetical event is seen as recurring, too; (b) CECs can refer to simultaneous


or consecutive events, whereas HCs can only refer to consecutive events, and (c) CECs refer to real time, i.e., events situated in present and/or past time, whereas HCs refer to hypothetical time, that is events in the future, in a time combination of future and present, or in the imagined past or present. Our corpus analysis showed fairly quickly that conditionals are not only or not even predominantly represented by HCs, but that CECs are the largest group of conditionals (44.7% CECs and 36.9% HCs; for exact figures see appendix; p. 650). In fact, the difference between the two types has been vaguely felt by some authors, for instance by Sinclair et al. (1990, 8: 25-42). What may be entirely new about the present typology of conditionals is the fact that (i) conditionals do not just consist of HCs but also of CECs, and (ii) that CECs are even the dominant type of conditional clauses in one of the most diversified and reliable corpora of present-day English. In traditional approaches to conditionals it is most of all HCs that are discussed, and CECs are either treated as a kind of marginal affair or not at all. One of the questions this study should answer is why this overestimation of HCs could occur at all and why the dominant type, CECs, was hardly taken notice of. Could it be that we only see those things for which we have set up categories and labels to handle them? Can one suppose that it is because HCs are more amenable to a formal analysis made in terms of the constructs and notations of mathematical logic? Or is it because we have not yet taken corpus linguistics seriously and have traditionally based our research on invented examples? However that may be, we will in this paper pay special attention to the most dominant type of conditional clauses, which are the CECs, without however neglecting HCs. In fact, a clearer understanding of CECs may lead to a better insight into the precise function of HCs and into the question why we can't do without HCs.
To put it in cognitive terms, we can rephrase these questions as follows:

(a) Why do we want to speak in the form of an if-clause about the relationship between a first event and a second event if these really occur?

(b) What is the deeper cognitive need to speak in the form of a conditional about hypothetical events?

(c) Do CECs and HCs resemble each other more than either of them would resemble the two other types of if-clauses in (3) and (4)?

The if-clauses in (3) and (4) are not called "conditionals", because they do not denote a direct relationship between two events (event 1 and event 2). In logical if-clauses we rather have a metalinguistic operation in which we are not even referring to two events but rather to one event and the logical identification of one of its participants based on truth conditions. In conversational if-clauses, the if-clause is totally independent of any event referred to, but the if-clause is linked to and made possible by its relationship to the speech event, i.e., the conversation that is taking place. For the sake of clear exposition we will now analyse each of the four types in somewhat more detail, and thereby repeat the same examples:

(1) If there is a drought at this time, as happens so often in central Australia, the fertilised egg in the uterus still remains dormant. (43)

In this type of conditional, event 1 is the drought and event 2 is the fertilised egg remaining dormant in the uterus of some animal. The relationship between the two events could perhaps be interpreted as a chain of cause and effect, but that is not the way it is conceptualized by means of the course of events conditional (CEC). On the contrary, the CEC construction avoids such a strong commitment and only construes the link between the two events "as a natural course of two events". The one is not necessarily dependent upon the other, but there is the suggestion that both are real and that whenever the first event occurs the second event occurs as well. As we will analyse in still greater detail later on, there is a strong suggestion of empirical evidence gathered by someone with expert knowledge about a real state of affairs. One detail suggesting this factual character is the parenthetical clause as happens so often in central Australia.


In the second type of conditional (HC), there is a clear experiential basis for the statement to be made, but there is no claim as to the actual occurrence of the state of affairs:

(2) If there is no water in your radiator, your engine will overheat immediately. (17)

Here, the two events are the absence of water in the radiator and the overheating of the engine. The relationship between event 1 and event 2 is construed as the prediction of a consequence, whereby event 2 follows from event 1. The nature of the prediction strongly depends on the use of the modal auxiliary, which is prototypically will or would. The two states of affairs are hypothetical in as far as no commitment is made as to their actual occurrence. The hypothetical character relates to the occurrence of event 1 and event 2, not to the relationship between the two. In fact, one of the major questions this paper tries to answer is the relationship between the if-clause and the main clause in both CECs and HCs. Is the relationship in HCs fundamentally the same as in CECs or, given the real versus hypothetical state of affairs, is it basically different? Because the two following types are marginal conditionals, we will not refer to them as conditionals all the time, but simply as if-clauses:

(3) If there's an elite in China, she wrote, it's the masses; and the masses are the workers, peasants and soldiers. (14)

(4) He had learnt far more from him than from his tutor, if the truth were known. (307)

As stated already, in these two types there is no relationship between two extralinguistic events but rather between an extralinguistic event and a logical or a metalinguistic operation (3), or else between an extralinguistic event and a factor in the conversation (4). It is because of this absence of a prototypical relationship between two extralinguistic or external events that we cannot always call the types (3) and (4) conditionals. In other words, there is usually not a "conditional" relationship between two events: in (3) there is only one event, i.e., the existence of an elite in China, and the main clause does not predict or state any event related to it, but it only identifies where this elite is to be found. This logical process of identification is very much different from the prediction of the relationship in HCs. That's also the reason why logical if-clauses as a rule never have the modal auxiliary will. Type (3) may just like CECs be related to real situations; it is however very much different from type (1), because there is no reference to two, more or less simultaneous events as in (1), but to only one event. In example (4) as in (3), there is only one event. In fact, example (4) can be paraphrased as follows: My statement of event 1, that he had learnt far more from him than from his tutor, would be valid for everyone if the truth were known. That means that in a conversational if-clause, the speaker refers to an "essential condition" in his making a statement, i.e., that he has the evidence for the truth of his statement. This type of if-clause is in some way comparable to type (1) in that it also refers to a real event, but it completely differs from type (1) in that it refers to only one event and the if-clause merely brings up a factor in the conversational situation, namely the fulfilment of one of the essential conditions. After this general characterization of each of the four types, in the following sections we analyse each type in turn and especially the internal ties between the subcategories of each of the four major categories and also try to contrast some of these subcategories within one major category or across major categories.1

1. Course of events conditionals (CECs)

1.0. General characteristics of CECs

CECs usually denote two recurring events; the relationship between the two events is construed not as a causal one but as a more or less simultaneously occurring one and the two events occur in real present or past time. In spite of these common characteristics of CECs, it is possible and necessary to distinguish between three subcategories, i.e., descriptive, inferencing and instructive CECs. The following examples illustrate the three different types:

(5) But if there has been rain and there is good pasture, then the egg now restarts its development. (44)

(6) He looked at his watch; if the soldier was coming, it was nearly time. (243)

(7) It is wise to call the doctor in all cases of sore throat, especially if there is a fever of 101°. (99)

Example (5) belongs to the descriptive type of CECs because it describes two events observed in reality. Example (6) is of the inferencing type of CECs because the second event of the conditional clause is not necessarily an observed event, but it is an event that has been inferred from the occurrence of another real event: the arrival of the soldier in (6) is, on the grounds of its repeated occurrence, a sign that it is nearly time to engage in some other business, and that is why the soldier checks the time on his watch. So both events are real but the reality of the second event follows from inference rather than from observation although this is checked on his watch. In example (7) there is, just like in sentence (5) and especially (6), the implication of recurring events. It's because they recur so often that they are situated in real time. This general character of recurrence is only implied in (5) but clearly present in the previously discussed example (1), with which it is textually connected and is expressed there in the parenthetical clause as happens so often in central Australia. In sentence (6), this generally recurring character is rather implied than expressed. But in sentence (7) it is explicitly expressed again in the phrase in all cases of sore throat, and in addition by the use of especially. Example (7) It is wise to call the doctor belongs to the instructive type of CECs because it refers to an instruction of what is to be done in these cases. We can now point out some more general characteristics of CECs. The most common denominator of CECs, especially in contrast to HCs, is the time reference. CECs refer to general time in the present or the past or a combination of present and past. But as Langacker (1991: 245) proposes, present and past tense forms do not in the first place denote time but the more abstract distinction "immediate" vs. "non-immediate". Present time and past time is but one possible instantiation of this more abstract schema. Modal auxiliaries are said to denote irreality, either immediate or non-immediate irreality. Since the most typical characteristic of CECs is the absence of modals, we now realise even more that CECs are needed to talk about a world of reality, experienced and described usually by someone with expert knowledge. The general time aspect in each of the three examples is then highlighted especially in terms of adverbials (adverbs, prepositional phrases or clauses). This general time aspect also appears from the fact that in CECs we can always substitute if by means of the temporal conjunction whenever.2 This whenever-construction does not have the same meaning as the if-construction because it highlights the recurring event. But if this recurring character is already made explicit in the sentence, as for instance by means of the expression in all cases of sore throat in sentence (7), then the substitution of if by whenever only changes the frequency of the scene, not the character of reality, as can be seen in sentence (7'):

(7') It is wise to call the doctor in all cases of sore throat, especially whenever there is a fever of 101°.

Whereas the use of if marks a bounded construal of the scene evoked in (7) and implies a single or repeated occurrence of it, the use of whenever in (7') also marks a bounded construal, but one that is indefinitely iterative. A very typical means of expressing a more general type is the use of the present perfect tense, which in English is the form par excellence to combine past time and present time. This time combination is typically found in sentence (5) If there has been rain and there is good pasture, then the egg now restarts its development. Here the first if-clause expresses both a real fact in the past and the idea of immediate reality, and the second if-clause presents the idea of immediate reality as an actual result in the present. The present time is further emphasised in the main clause by the adverb now and by the simple present tense restarts. It is typical that in most CECs the simple present tense is used and not the present progressive. This is a natural consequence of the fact that it is generally occurring events and not hic et nunc occurring events that are referred to in CECs. In sentence (6) He looked at his watch; if the soldier was coming, it was nearly time the use of the past tense is triggered by the narrative text type from which the sentence has obviously been taken.3 It refers to a general experience made in the past related to one particular instance. The particular instance character is suggested by the past progressive was coming. But the general character follows from the implicit knowledge that the arrival of the soldier means the beginning of some other activity. In sentence (7) It is wise to call the doctor in all cases of sore throat, especially if there is a fever of 101°, the use of present time reference occurs, but again here the use of the general time adverbial in all cases and the more specific reference to the cases of fever of 101° evokes the general time character. Even if this sentence could be interpreted as referring to the future, this only follows from its strong generalising character. The difference between a CEC and an HC is perhaps felt even better by substituting the present tense of the main clause by means of will, as can be seen in the following example:

(8) a. If there are no passengers, he comes back here to the garage and gets on with some repair work. (147)
    b. If there are no passengers, he will come back here to the garage and get on with some repair work.

To speak again in Langacker's terms, (8a) construes the event as being real even when it can be interpreted in an iterative way. Sentence (8b), however, denotes an event which is seen as not real. The speaker has now built into the scene he describes the idea of "epistemic distance". According to Langacker (1991: 246), "the modals can be described as contrasting with one another because they situate the process at varying distances from the speaker's position at immediate known reality." In the case of (8b), the speaker predicts the likelihood of future actions and in contrast with (8a) the main clause is ambiguous between a possible generalising interpretation and a single occurrence of an event in the future.4 We will now further examine each of the three subtypes of CECs.

1.1. Descriptive conditionals

Descriptive conditionals differ from the other two types of CECs, viz., inferencing conditionals and instructive conditionals, in that they contain a reference to or a description of two observed events, whereas with the other two CECs the second event need not be an observed one. It is on the basis of such empirical observation that CECs very often imply a generalised situation, and as a result of that or on the basis of other facts, also a sense of reality. We will now look into various ways in which these associations of generality and reality can be evoked. One way of generalisation is - as has already been pointed out in the discussion of sentence (1) - the use of a parenthetical clause pointing out that there are many cases:

(9) In other words, if the thumb sucking is given up by six years of age - as it is in a great majority of cases - there is very little chance of its hurting the permanent teeth. (293)

The sense of reality is expressed here by the phrases as it is and there is very little chance. Other ways of establishing such an effect of generality and reality are the use of adverbs such as normally, always and even sometimes:

(10) If the tonsils are removed, the adenoids are sometimes cut out, too. (305)

Also the function of too is interesting in this respect: it underlines both the more or less simultaneous character of the two events, and the real character of their regular occurrence. Not only adverbs, but also determiners play an important role in expressing the general character of a situation. Just like the adverb sometimes, also the determiner some can have the effect of a generalization:

(11) Some parents prefer a baby's fabric both on high legs if there is room for it. (59)


But some also has the function of suggesting that the entity or entities referred to are real and in this respect it contrasts with the use of any, which is typically far more often used in HCs (see later section 2):

(12) If there was no beast - and almost certainly there was no beast - in that case, well and good; but if there was something waiting on top of the mountain - what was the use of three of them, handicapped by the darkness and carrying only sticks? (132)

The example in (12) is extremely interesting in that it shows to what extreme point the CEC can be stretched. Both sentences in (12) very clearly make a claim as to the actual occurrence of a state of affairs; yet, the two sentences contain entirely contradictory descriptions. Although the first sentence seems to rule out the possibility of there being a beast on top of the mountain, the second sentence very strongly evokes the possibility of there being something by the use of the emphatic stress on was and the obligatory use of some(thing) as a consequence of that. The notion of "epistemic distance" must not be interpreted in a sense that none of the modal auxiliaries can occur in CECs. Since modal auxiliaries situate a process at varying distances from the speaker's known reality, some of them like must and should come somewhat closer to known reality. Further also the epistemic modal auxiliaries going to, can and may are used to evoke the sense of coming a bit closer to known reality and hence, one can say, close to the real objective occurrence of events. Especially going to, in this respect, strongly contrasts with the form will used in HCs. The form going to does not denote a prediction like will, but on the contrary an event that is already "under way", that is, an event which can be observed to be likely to happen because all signs for it to happen are present:

(13) If there are distance problems, when engaged in conversation, then there are clearly going to be even bigger difficulties where people must work privately in a shared space. (121)


Also the use of the modality adverb clearly reinforces the "deductive" and objective nature of coming close to the known reality of the second event. Also the modal auxiliaries can and even may function as markers of descriptive conditionals. Can denotes, by nature, a more "objective, theoretical possibility" (Leech 1969: 221) and may a more subjective, but actually existent possibility. Still, both may contribute to a sense of near reality:

(14) If the share of the poorest in the total national income declines faster than the total is growing, then the poor can get poorer at the same time as the country as a whole seems to be getting richer. (222)

This is not merely a hypothetical case, as found in HCs, but several elements "conspire" to make it into a repeatedly occurring and therefore real case, amongst others, the use of progressive forms (is growing, seems to be getting, which describe the background in such recurring processes), the simultaneity marker (at the same time as) and also the use of can, which here marks the repeated occurrence. It is not surprising that can is used more often than may/might in descriptive conditionals, namely 10 can to 7 may.5 Indeed, the difference between the two modal auxiliaries is that can foregrounds the generally recurring element in the descriptive conditional, whereas may highlights the hic et nunc aspect of the potentially recurring event. So in sentence (14) can cannot be replaced by may. Hence, also may is an important marker of instant possibility and of a heightened sense of a potentially actual occurrence:

(15) However, a doctor may be slower to advise giving unboiled pasteurized milk if a baby is particularly susceptible to diarrhoea, if the weather is hot, or if there is not a good refrigerator in which to keep the quart container. (58)

The opposition with a former case, implied by however, suggests that a doctor still advises this, but that he is slower in doing so; the three if-clauses also suggest repeated occurrence of the event in the main clause.


Angeliki Athanasiadou and René Dirven

A final marker of the sense of generality and reality is the use of the focus adverbs especially or particularly (the latter used in example (15) above). The combination especially if focuses on one of many possible cases, as in (16):

(16) Temporary constipation is common during illness, especially if there is fever. (73)

Whereas common implies repeated and more general occurrence, especially if introduces an aggravating circumstance for even more occurrences of the state of affairs denoted in the main clause. As the examples in (15) and (16) - and also that in (11) - show, the postposed if-clause is not uncommon in descriptive conditionals. These three instances out of eight examples given so far in this section may even give the impression that postposed if-clauses represent a quarter of all cases. This is confirmed by the corpus, where the figure for postposed if-clauses is 16 out of a total of 61 descriptive conditionals, or about one quarter. How can this relatively high number of postposed if-clauses be accounted for? First of all, in the case of especially if or particularly if, which are prototypical for CECs and do not very often occur in HCs, the postposed position is obligatory from a semantic or cognitive point of view, since here the if-clause specifies a special circumstance under which a frequently recurring event obtains. The postposed if-clause often contains a reference to elements in the main clause, as in (17), and thus must follow the main clause, because it elaborates one aspect of the scene that the main clause describes:

(17) A swelling that puffs out quickly on a child's skull after a fall doesn't mean anything serious if there are no other symptoms. (116)

Although it is theoretically possible to exchange the two clauses here, the version with a preposed if-clause would be far more difficult to process, probably because it contains so much elaborated information. The reverse may also be true: if the if-clause is a very long one or if there are several if-clauses, these tend to be in postposed position. This was already the case in (15) and is again shown in (18), which contains an instance of a very short main clause:

(18) This may work well if the child is fairly independent and sociable, if the class is small, and if the teacher is so warm and understanding that she makes the children feel secure. (281)

Another difference between (18) and (15) is that the three postposed if-clauses in (18) constitute a cumulative set of conditions, whereas in (15) there is a series of alternative conditions. But more insight into the reasons for the postposing or preposing of if-clauses can be gained from the analysis of inferencing conditionals in the next section.

1.2. Inferencing conditionals

As stated before, inferencing conditionals consist of a descriptive if-clause and a main clause which is based on inference rather than on observation, but which also "implies" an actual situation. A typical example is the following:

(19) If a child has a fever with a skin infection, or if there are red streaks running up his arm or leg, or if he has tender lymph glands in his armpit or groin, the infection is spreading seriously and should be considered a real emergency. (114)

The general "course of events" character of this sentence is especially guaranteed by the generic use of the indefinite article (a child, a fever, a skin infection) at the beginning. But the use of the progressive in the main clause suggests an inference from any of the previously described symptoms and describes this inferred situation of gradual increase; it is precisely the use of the progressive that stresses the actual processual character of this situation. This example also shows that a series of if-clauses can very well - and in fact here even must - precede the main clause: first the alternative symptoms are to be described and then the illness or infection can be concluded from them. In fact, the only example of a postposed if-clause in the inferencing type of conditional is a question:

(20) The problem remains: How can ideas such as the foregoing be the subject of conversation between an adolescent and a grown-up if the two are separated by silence, by awkwardness, by mistrust or exasperation, by the teenager's dogmatic refusal to talk to Mum or Dad ("Why should I?"). (321)

But if this sentence was stripped of all the contextual constraints that make the present preposed position of the main clause necessary, the order of the two clauses could be changed. This is also confirmed by a case such as (21):

(21) If this was the invasion, was it aimed at Normandy? (384)

The inferencing character is quite clear in an affirmative paraphrase, e.g., If this was the invasion, it was aimed at Normandy. The meaning of the question in (21) can be paraphrased as follows: "If this was the invasion, can we infer from this that it was aimed at Normandy?" As already noted with example (19), the use of the progressive form may be a special device to underline the inferencing character of the conditional. In (19) this is due to the fact that the progressive is the form par excellence to describe a gradual change of state. Another function of the progressive may be to underline the epistemic (here inferencing) nature of a sentence and to rule out a deontic interpretation. This we find in (22):

(22) If these reports are as good as they look, your chaps should be getting a little more rest before long. (186)

Without the progressive infinitive be getting the modal form should might be interpreted as (deontic) desirability, but with the progressive it can only be interpreted as probability, and hence it underlines the inferencing act. Of course, this can also be done by using the adjective probable or the adverb probably:

(23) If these feelings do not exist, it is probable the parent has been blocked out. (201)

One of these lexical items and the progressive can even be combined so as to reinforce each other and the inferencing character of the main clause:

(24) If these are performed by a girl whose upper-body activities are either totally non-erotic or even anti-erotic, then she is probably feeling more sexually responsive than she cares to admit. (211)

In spite of the epistemic character of the main clause, expressed by probably, the use of the progressive is feeling forces the interpretation of the actual character of the situation. This latter aspect can also be expressed by the use of the past tense, here again illustrated in (25), but also found in the previously given examples (6) and (21):

(25) If this took place in South America, as some evidence suggests, they spread across into the Australian-Antarctic bloc. (380)

In fact, such past tense instances are pure cases of inference: here we really have two actual cases of some (past) state of affairs: if one event took place in South America, then the conclusion is that from there some element or other spread across the ocean into other continents (the italicised parts have been added but are part of the implied assumptions made in (25)). We thus see that the "course of events" character of the main clause in an inferencing conditional can be stronger or weaker: with past tense combinations it is very strong; with progressive forms it is still strong, but in sentences with probable or should (of probability) it is weaker of course, and in questions as in (21) it is weakest. But it is never totally absent.


1.3. Instructive conditionals

In an instructive conditional the main clause denotes an instruction of what is to be done in case the situation denoted in the subclause arises. This is not presented as something hypothetical, however, but rather as something that normally happens, and the further course of events is to act as suggested in the instruction. The instruction is expressed by an imperative form, by paraphrases such as it's wise to as in (7), or by the deontic use of the modals should, can or may. Even with the imperative it is quite possible to suggest a course of events context:

(26) If there is more than one contributor, either sort out separate responsibilities or pool the family income. (21)

This is not an instruction on what to do in a single hypothetical case, but an instruction to a tax office worker about how he has to handle cases with more than one contributor to the family income. Consequently, he must follow one of the two possibilities in the main clause. All this contributes to the course of events character of the sentence as a whole. The same "course of events" nature may follow from the many different symptoms that are enumerated, as in the next example (27). Notice that here the imperative precedes the if-clause, although this is rather exceptional:

(27) In a general way, suspect a fracture, if pain in a limb continues, or if there is swelling, or if a black-and-blue mark appears. (115)

Here the generalising phrase in a general way and the indefinite article in a fracture contribute to the general nature of the case, and simultaneously these elements cause the main clause to require pre-position. It is interesting to contrast this instance in (27) with the pre-position of the if-clauses in the inferencing conditional in (19): there the inference in the main clause contains the definite article (the infection) and this functions as theme (resuming the various symptoms) for the inferenced rheme (is spreading seriously).

Instructive conditionals with imperatives do not constitute the majority of cases; the most frequent form is a paraphrasing expression such as it's wise to in (7). Other possible expressions are: it is sensible to, it is necessary to, it is vitally important to, it is helpful to, it is Utopian to, it would be better if, it's more convenient if, the most important step is to, it usually works better (not) to, you have to be careful that, it is kinder to, this rule is particularly important if, you want to, etc.

(28) When a rash is bad, and especially if there are a lot of pustules (whiteheads), it usually works better not to use an ointment but expose the whole nappie area to the air for several hours a day. (76)

This example is also striking in that it exhibits so many general time references, such as the conjunction when, the focus adverb especially (if), the adverb usually, and the distributive phrase (for several hours) a day. Between the above paraphrasing expressions, which constitute rather indirect "instructions", and the direct force of the imperative, the modal auxiliaries should, can and may occupy an intermediate position. Should, as the expression of advice and desirability, is somewhat closer to the imperative:

(29) You should call a doctor to diagnose and treat your child if there is a rash, certainly if there is a fever or the child feels sick. (102)

Can, on the other hand, is closer to the indirect, paraphrasing expressions, and may is especially used in combination with such expressions, e.g., it may be wise to, so that we are even one degree higher on the scale of indirectness:

(30) If there is one aggressive child who regularly bullies your child and your child is becoming more intimidated rather than less as the weeks go by, it may be wise for a couple of months to take her somewhere else to play, where she will have more chance of finding her courage. (87)


2. Hypothetical conditionals (HCs)

2.0. General characteristics of HCs

In order to understand the nature of hypothetical conditionals (HCs) better, it seems useful to contrast them again with course of events conditionals (CECs). If we better understand the cognitive basis or the cognitive need to construe HCs, we may also see better why they have the form and the characteristics they do. Many examples of the CEC type in the corpus stem from the world of science and counselling on, e.g., baby care, family planning, etc. They refer to a world of expert knowledge amassed in these domains and most of the examples state what happens and can be done in problem cases. With HCs we are in a set of much more varied domains, ranging from the above domains of science or counselling to all sorts of very practical domains such as the behaviour of overheating automobile engines as in (2), here repeated as (31):

(31) If there is no water in your radiator, your engine will overheat immediately.

The speaker of such an HC does not just - in a kind of detached way - describe an actual course of events, but he does three things in one: he speaks on the basis of his everyday experience; he selects a possible or hypothetical future case, which means that he creates an epistemic distance from any actual, present reality; and he predicts what will happen if this case pertains. So in comparison with a CEC, an HC does not describe actual reality, but predicts possible distant reality, which however, since it is based on experts' former observation and experience, can also be implicitly descriptive. Another difference from CECs is that an HC is not a value-free utterance, but tends - as we will see in section 2.1 - to build in all sorts of communicative intentions (or "speech act forces"); here in (31), for instance, a possible interpretation of the speaker's communicative intention is that of advice or warning to be careful about the radiator.


Not only predictions but also advice or warnings always relate to the future. Thus the HC with its future time reference is a construction in which almost everything "conspires" to imply all sorts of potential results, especially the force of indirect speech acts. But the prediction may be based on far less objective information and therefore all sorts of distance-markers may be built in. It may be useful to illustrate first the effect of distance marking outside the context of conditionals. The use of the modal auxiliary can creates a much greater distance from the imposing force of an imperative, as in (32a) below. The use of the so-called past tense of the modal form for can, viz. could, creates an even greater distance from imposing your own will, as in (32b):

(32) a. Can you hold this for me, please?
     b. Could you hold this for me, please?

Similarly, the use of the past tense in HCs creates a much greater distancing effect; compare (31) and (33).

(33) If there was no water in your radiator, your engine would overheat immediately.

Such past tense forms in HCs can have - as we will see in greater detail later - various different functions: one of them is the meaning that it is not very likely that this would occur in your case. This "distancing" use of the past tense is not possible with CECs: there a past tense form can only be interpreted as past time, as examples (6) and (12) clearly show. So in HCs the speaker can take on a whole series of attitudes towards the degree of likelihood of the elements in his own supposition. This "likelihood scale" is not only expressed by the past tense, but also by modal auxiliaries, adverbs, periphrastic constructions etc. In other words, there is a whole continuum of possibilities on the scale of likelihood. If the lowest degree of attitude towards the likelihood of one's own supposition is in view, the speaker uses the auxiliary will, and for the sake of easy comparison we can call this type of HC a "neutral" HC, as in (31); (33), then, is an instance of an HC placed on the next higher level of the scale of likelihood, which we will designate as the "more or less likely" level. English has a very special form for expressing a next higher degree on the scale of likelihood, namely the subjunctive form were instead of was, as in (34):

(34) If there were no water in your radiator, the engine would overheat immediately.

We can call this the "unlikely" level on the scale of likelihood. Although in English the difference between (33) and (34) can exceptionally be expressed in the case of be, other verbs do not offer this possibility, so that most examples of the type of (33) and (34) are ambiguous between "more or less likely" and "unlikely". The fourth degree on the scale of likelihood is the "unreal" level, which is found in (35):

(35) If there had been no water in your radiator, the engine would have overheated immediately.

The use of the past perfect tense in this construction is a further application of the distancing principle: since the past tense is here used to make an epistemic "distancing move" for a potentially present situation, a past time situation, which would already require the use of the past tense, can only be further epistemically "distanced" by the use of a form which is one more step away from the past tense, namely the past perfect. Again we are very far away here from the world of CECs, which nearly always refer to the actual occurrence of events and which do not allow epistemic distancing. But in fact, besides these four types of HCs, viz., "neutral" HCs as in (31), "more or less likely" HCs as in (33), "unlikely" HCs as in (34) and "unreal" HCs as in (35), there are many more combinations, which will be discussed below in a special section on "peripheral" types of HCs.

Summarising this general characterisation of HCs, we can say that whereas CECs refer to a real world of "objective" experiences and aim at the description of it, HCs rather construe suppositions about possible worlds which are presented with some degree of likelihood, and then make predictions on the effect of these suppositions or on the effects if the suppositions had been true. The scale of likelihood can be presented in the following summarising Figure 1. We shall now discuss each of the four categories in turn.


2.1. Neutral HCs

Neutral HCs constitute more than half of the total number of HCs (65 out of 114), and the immediate question that arises is why this should be so. A possible answer may be the great variety of speech act forces that can be associated with the neutral HC, which has the canonical form "if + present tense, modal (will)". Sentences with past tense forms or perfect forms may be less frequent precisely because they are far more limited in expressing different pragmatic forces.

Figure 1. Scale of likelihood

Finally, the general characterisation given in 2.0 should not lead to the wrong conclusion that HCs do not occur in the domain of expert knowledge and counselling; that they do can easily be attested by numerous examples:

(36) Later on, if there is any question about continuing to breastfeed, you'll find several friends who'll urge you to stop. (55)

The difference between CECs and HCs like (36) is then that here in neutral HCs the speaker, who is basing himself on experience, may still be acting as a describing expert but mainly becomes an advising one who wants to warn and also to create a cautious attitude on the part of breast-feeding mothers. Of course, such advice is not far away from the facts and the more detached attitude in CECs; only the situation is now clearly located in future time by the use of temporal elements like later on, continuing to breast-feed, and the use of will (found here twice), and it thus gets the character of predicting a possible future course of events, against which the expert wants to warn. But another difference between CECs and HCs is that HCs are used by all possible speakers in all possible registers, e.g., as a threat associated with a typical utterance used by blackmailers:

(37) On my return home, if there has been any attempt to contact the police or lay a trap, you will die. (135)

The two examples in (36) and (37) both contain the determiner any and the existential there is-construction. In contrast to CECs, which because of their "reality" character require the use of the affirmative determiner some (see (12)), HCs require the non-affirmative form any, since this agrees with the hypothetical nature of the state of affairs in the if-clause. The existential there is-construction also occurs very frequently in hypothetical if-clauses. In fact, non-affirmative any and existential there is reinforce each other in a hypothetical context.

The cases discussed so far were more epistemic in nature. We now turn to those deontic cases where the if-clause denotes a more unexpected or an imaginary world; the main clause may now denote a volitional act, e.g., a momentary decision rather than a prediction, as illustrated in (38):

(38) If there's a beast, we'll hunt it down. (130)

Similarly, imperatives can be used to give instructions on what to do and thus are close to volitional will:

(39) If there's anything urgent in the mail, just give it to me and I'll deal with it. (28)

The difference between such an HC if-clause plus imperative and a CEC if-clause plus imperative (termed "Instructive Conditional" in section 1.3) is striking: here in (39) we have the non-affirmative form any, but in (26), If there is more than one contributor, either sort out separate responsibilities or pool the family income, we have the specific number one; in (39) the imperative is followed by a sentence with will and is future in orientation, whereas in (26) one of the two alternative imperatives applies anyway in the present time.

In all the previous cases of HCs, the subclause with if does not contain a will-form, although semantically the if-clause refers to the future and not to the present. This is due to the fact that the future reference form will of the main clause is strong enough to force a future interpretation on the if-clause. But if this interpretation does not follow automatically, then the distancing future form will has to be used exceptionally in the if-clause (2 cases in the corpus):

(40) If the room will be cold enough so that an adult would require a good wool or acrylic blanket for covering, the baby will need an acrylic bag. (173)

This sentence refers to a time still further distanced in the future, and the use of the present tense form is in the if-clause would evoke an ambiguous present or near future time; it would moreover evoke a positive association whereas what is referred to is a negative situation. So we can propose a tentative rule of the following type: "If the future time of the if-clause is not clearly given, will must be used in it in order to avoid temporal ambiguity." Other modal auxiliaries that are used in the main clause instead of predictive or volitional will are either epistemic may or deontic can (with the pragmatic force of a casual instruction):

(41) If the scab on the unhealed navel gets pulled by the clothing, there may be a drop or two of blood. (180)

The use of will instead of may would be too strong here, because it suggests that there would normally always be drops of blood if the scab gets pulled; the use of epistemic may reduces this to a mere possibility, which again does not make it less descriptive than examples (2) and (31), but which shows that the speaker does not refer to an actually occurring event, but to events observed before and to the accompanying phenomena which normally (use of will) or possibly (use of may) follow them. The use of can is what Leech (1969: 222) calls a "casual instruction" and it is very close to the meaning of an imperative:

(42) If there is no irritation, you can keep on. (61)

Summarising the analysis of neutral HCs, we can conclude that they are used for a wide range of pragmatic (both deontic, i.e., volitional, and epistemic) purposes, which may also account for their much greater frequency than that of the other types of HCs.

2.2. Non-neutral HCs

The prototypical form of neutral HCs is: "If + simple present, will" or some other modal auxiliary. In non-neutral HCs, there are two possible form categories: "If + simple past, would + infinitive" and "If + simple past or past perfect, would + perfect infinitive". Although in most grammars there is a strict distinction between the two formal categories, we propose to take them together and to treat them as different possibilities on a scale of likelihood. In fact, in our opinion, a simple past tense in if-clauses can even have five different interpretations:

(43) a. If asked to define my condition, I'd say "bored". (C. C. 8:34)
     b. If the slick did get too big, it would screw us up for the next day. (238)
     c. If it faced the sun and particularly if there was any kind of breeze, the plates would become very efficient cooling radiators. (42)
     d. If there were a serious end-point to these sequences of behaviour, then the individual actions used would become subordinated to this goal. (124)
     e. If there were a beast, I'd have seen it. (128)

Our interpretation of these sentences is represented in the following more detailed representation of part of the likelihood scale:

(43a) close to neutral
(43b) more likely
(43c) less likely
(43d) unlikely
(43e) unreal

The detailed interpretation is as follows: (43a) is very much synonymous with "If anyone asks me to define my condition, I'll say 'bored'". That is, the choice between "present + will" and "past + would" only makes a slight difference, in that in (43a) the chance of "being asked" is perhaps slightly less likely. But the real situation of "less likelihood than in a neutral HC" is given in (43c): it is impossible to say whether it may occur or not, but the "particularly if-construction" gives it some real chance of happening. Now (43b) is in between (43a) and (43c): the emphatic did-form makes it somewhat more likely to happen. Likewise (43d) is in between (43c) and (43e), which is itself an unreal HC: (43d) has the subjunctive form were as against the was-form in the particularly if-construction of (43c), and therefore becomes more unlikely than (43c); (43e), finally, not only contains the subjunctive were-form, but the would have seen form in the main clause makes the if-clause unreal, too. It can be concluded therefore that, although English does not have to make the distinction between "less likely" and "unreal" HCs, it can do so if the need arises.6 This issue is so important in our opinion that at this point we would like to offer a small test and ask the reader to judge the sentences in (44) to (46) as likely, unlikely, or unreal:7

(44) I venture to say that if these youngsters had a sufficient number of these relationships, in a period of a few months, perhaps years, they could learn enough self-control to enable them to feel and act OK. (203)

(45) The President feels that if the secret were out the Press would have a field day. (214)

(46) And perhaps if these fools I complain of were French or Dutch or German, I would not mind so much, because then I could say "what else can you expect?" (189)


We conclude that the distinctions between the various possible interpretations of "if + past tense, modal (would)" are firmly established and that the rest of the sentence or the wider context offers enough clues to choose the interpretation intended by the speaker or the author. It is further interesting to note that the unreal HC type, "if + past perfect, modal (would) + perfect infinitive", illustrated in (35), can have one meaning only, viz., "unreal supposition in the past". In this respect of being unambiguous, it resembles the neutral HC, "if + present, modal (will)", which is also unambiguous on the likelihood scale. The full-fledged likelihood scale of HCs can thus be extended to encompass six levels on a continuum:

HCs
  Neutral: (31); (42)
  Non-neutral
    More or less likely
      Close to neutral: (43a)
      More likely: (43b)
      Less likely: (43c); (44)
    Unlikely
    Unreal: (43e); (46)

Figure 2. The full-fledged likelihood scale of HCs

2.3. Peripheral HCs

In spite of the six different levels of HCs discussed so far and summarised in Figure 2, there are only three canonical forms of HCs, which have the following formal characteristics:

(i) HC Type 1: If + present tense, modal (will) + infinitive
(ii) HC Type 2: If + past tense, modal (would) + infinitive
(iii) HC Type 3: If + past perfect, modal (would) + perfect infinitive
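As a minimal sketch, and not part of the authors' own analysis, the three canonical pairings listed above can be written as a simple lookup table in Python. The function name classify_hc and the string labels for the tense of the if-clause and the form of the main clause are hypothetical, chosen only for illustration; any pairing outside the three canonical ones is simply labelled "peripheral".

```python
# Hypothetical labels: (tense of the if-clause, form of the main clause).
# The three entries mirror the canonical forms (i)-(iii) listed above.
CANONICAL_FORMS = {
    ("present", "will + infinitive"): "HC Type 1",
    ("past", "would + infinitive"): "HC Type 2",
    ("past perfect", "would + perfect infinitive"): "HC Type 3",
}


def classify_hc(if_tense: str, main_form: str) -> str:
    """Return the canonical HC type for a form pairing, else 'peripheral'."""
    return CANONICAL_FORMS.get((if_tense, main_form), "peripheral")


if __name__ == "__main__":
    # Form of (31): If there is no water ..., your engine will overheat.
    print(classify_hc("present", "will + infinitive"))
    # Form of (35): If there had been no water ..., ... would have overheated.
    print(classify_hc("past perfect", "would + perfect infinitive"))
```

The lookup deliberately looks only at form: it says nothing about where a sentence sits on the likelihood scale, which, as argued above, depends on further clues such as were versus was or the wider context.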

But various other formal combinations may occur in the actual use of English. Thus we may meet with a combination of two different time units such as "present time + past time" or "past time + present time". For instance:

(iv) If + past tense, modal (would) + perfect infinitive (present time + past time)

In (47), the unreal supposition about the present time leads to a conclusion about the past time.

(47) If there were a beast, I'd have seen it. (43e)

The combination (past time + present time) is represented schematically in (v) and is illustrated in (48).

(v) If + past perfect, modal (would) + infinitive (past time + present time)

(48) And if these men had been allowed to live, perhaps they too would be spared from the cooking cauldron. (220)

In (48), the unreal supposition about the past time leads to a conclusion about what might be the case in the present time.

3. Marginal conditionals: logical and conversational if-clauses

We will treat the types of if-clauses that we will call logical if-clauses (LICs) and conversational if-clauses (CICs) together as marginal cases for three reasons: (i) their frequency is relatively low (34 LICs and 23 CICs); (ii) they cannot be interpreted as prototypical conditionals, but rather they are constructions using the pattern of an if-clause; and (iii) some of them, especially CICs, are highly idiomatic. There is, however, no conceptual link between these two marginal types of if-clauses, and one of the problems this section will try to solve is: do they relate to the two main types of conditionals, CECs and HCs; and if they do, how do they do it?

3.1. Logical if-clauses (LICs)

The difference between CECs (49), HCs (50) and LICs (51) may appear from the following set of examples:

(49) But if there is a particularly wet season, its larvae retain their gills. (40)

(50) But if there is a particularly wet season, the larvae will retain their gills.

(51) But if there is a particularly wet season, this is due to the heavy rainfalls in the winter.

In course of events conditionals such as (49) the two events described are related to each other in a descriptive way: this is how things happen in the normal course of events; as scientific observation has shown, that's how things pass. A different attitude is taken in (50): the speaker does not imply that there is a particularly wet season now or at any other time, but he just predicts - probably also on the basis of his expert knowledge, but this is not made overt - that something else will follow. In a logical if-clause such as (51) the speaker does not relate two events in their temporal succession as in the HC of (50), but in the main clause he tries to identify one unknown element of the if-clause. The line of argument could be made explicit as follows:

(52) a. A particularly wet season must have some reason.
     b. This reason can only be heavy rainfalls before.
     c. So if there is such a season, this is due to such rainfalls.

The hybrid character of logical if-clauses is that in many respects the if-clause rather resembles that of CECs and not that of HCs. This is especially felt in the use of the pronoun some in LICs instead of any in HCs. This use of some indicates that there has really been an event in which some unknown element x was involved, as is clear in the following example:

(53) If the shark does kill someone else, then the only one who gets screwed is Vaughan. (224)

In fact, this is a much more prototypical example of a LIC than the one in (51); this is due to the use of the emphatic form does, which always creates greater likelihood of occurrence, and especially to the "affirmative" form someone (instead of anyone). Moreover, this form someone is an element of the sentence nucleus (someone is direct object of kill) that is further identified in the main clause, whereas in (51) it was a circumstantial element of the sentence periphery that was spelled out in the main clause. Therefore, in the more prototypical instance of (53) there are not two events, but in fact only one event. This event is not even introduced as new information, but it is assumed to be known already and the only bit of new information is the statement about the identity of the unknown x in the main clause. The logical form of the sentence could even be made clearer by means of an existential if there is-construction, so that (53) could be paraphrased as (53'):

(53') If there is someone else {who the shark does kill / who gets screwed}, it is Vaughan.

(The phrases kill someone and get screwed are taken to denote the same act, but seen from different perspectives.) The type of (53') is the sentence one finds most often in the category of LICs; instead of reference to human entities by someone, one also finds reference to abstract categories by means of one:

(54) If there's one human species that ought to be put out to pasture, it's Presidents and Prime Ministers. (136)

A second subcategory of LICs next to this identifying type is the inferencing type as in (55), which differs in subtle ways from the "inferencing" conditionals discussed in section 1.2:

(55) If the super-organism created by a colony of termites can be compared to an antelope, then the disciplined aggressive columns of the army ants must be reckoned to be the insect equivalent of a beast of prey. (268)

Whereas inferencing conditionals of the CEC type, such as (25), always relate two events to each other, here in (55) we find a purely logical process of inference, which in its simplest form could be phrased like this:

(55') If termites are the equivalent of an antelope, army ants are the equivalent of a beast of prey.

Given this strong suggestion of a logical link between the two clauses, it is only natural to find here stronger markers of logical operations, such as the modal auxiliary expressing inference must, the phrase (must) be reckoned to be or other phrases such as it is safe to say or it occurred to me as in the next example (56), which also uses a weaker form to denote possibility (might) instead of absolute inference:

(56) It occurred to me that if there ever was such a man, and a British one at that, he might be the kind who would never get his hands dirty inside this country, see. (142)

In fact, this logical inference concerns the identity of an unknown x in the main clause, so that here the identifying and inferencing types of LIC are combined in one construction.

3.2. Conversational if-clauses (CICs)

Conversational if-clauses do not constitute one clear category, but they are rather relatable to various other subtypes of if-clauses. What these have in common is that they are strongly dependent on the conversational context, the speech act itself, the partners, or still other elements in the speech situation such as, for instance, the "essential condition" discussed with reference to sentence (4) in the introduction. A first type of conversational if-clauses are what we will call "performative conditionals".⁸ These are constructions of the type:

(57) If anyone wants me, I am downstairs.

This construction could be considered as an elliptical form of an explicit performative construction of the form in (57'):

(57') If anyone wants me, (tell them) I am downstairs.

But such an interpretation is far too specific. It is not certain that the speaker of (57) wants the hearer to tell the caller that he is downstairs. On the contrary, it may be his intention not to be found out and certainly not to refer to the downstairs rooms. Still, we propose to call such constructions "performative conditionals", since the link between the two parts of the whole sentence is a performative act. This is kept vague, because it will depend on the context of the utterance how the speech act is to be understood. In fact, the corpus does not contain such clear cases as (57), but only somewhat vaguer "performative conditionals" such as:

(58) What about the parents demonstrating, if there are no friends? (84)

This sentence can only be interpreted as a suggestion, i.e.,

(58') If there are no friends, (I suggest that you consider the idea that) the parents could demonstrate.

An explicit form of such a performative conditional is (59):

(59) If they want others to do it, I'd advise against their having children. (175)

The difference between this type of performative conditional and a hypothetical conditional can be inferred from a comparison between (59) and the HC in (60):

(60) If they want others to do it, I'll advise against their having children.

Here no act of advising is performed, but only a prediction that such an act will take place. But in (59) the speaker pronounces his or her conditional negative advice. Therefore it also seems justified to speak of "a conditional" in this first type of conversational if-clauses (CICs); here there clearly are two events: the one in the if-clause, and the (implicit or explicit) performative event in the main clause. However, the link between the if-clause and the main clause need not be a performative construction; it can also be a zero-form or else a form such as in (61):

(61) If after a certain level such as the deputy, if there is one, the names are not in alphabetical order, you can be sure that they are in order of seniority. (29)

Obviously the phrase you can be sure has less performative force than a phrase such as I'd advise against it. Nevertheless, (61) explicitly expresses the speaker's certainty about his own proposition, and therefore it would fundamentally change the meaning of (61) if this explicit reference to his attitude to the proposition were left out. In that case, we would not have a performative if-clause but a logical if-clause, as in (61'):

(61') If after a certain level the names are not in alphabetical order, they are in order of seniority.

This also shows that the boundaries between LICs and CICs are fairly firm. A second type of conversational if-clause is the elliptical one. In fact this is the only type of CIC that Quirk et al. (1972: 746) quote, but they call it "style disjunct", e.g., if you please, if you don't mind, if you follow me, if I may say so. These are not represented in the corpus, however. The different type of elliptical if-clause that is represented in the corpus is the one introduced by and if or but what if:

(62) And if these venerable, old ideas are thought not to be worth bothering about. (219)

(63) But what if the sitting vice-president dies or is unfit? (234)

Both are read as questions (though (62) has no question mark) and can only occur as the second part in an adjacency pair. The part that has been ellipted is something like and (what happens) if or but (what happens) if. Although the fillers chosen here may suggest that we use these ellipted and if- or but what if-clauses for CECs, this need not be the case. They may as well refer to HCs. A third type of conversational if-clause is the parenthetical type, which is comparable to HCs, because of the existential there is and the use of any:

(64) The shooting-season opens Saturday and the birds will be scattered all over the place after that - if there is any left. (165)

The parenthetical clause here comes as an afterthought, but it could also be placed inside the sentence, viz. and the birds - if there is any left - will be scattered all over the place. Still the parenthetical clause lies outside the main clause and could not form a semantic unit with it:

(64') *The shooting-season opens Saturday and if there is any bird left, the birds will be scattered all over the place.

So both the referential problem and the semantic incompatibility of singular any and the plural idea implied in scattered show that the parenthetical construction has a status of its own and is not just a kind of reduced HC, although it has several of its features. Whereas the former parenthetical if-clauses are related to HCs, the following parenthetical if-clauses can be related to logical if-clauses:

(65) This is a time - if there ever was one - for parents to show their thoughtfulness and generosity towards each other. (117)

Even more than in (64), the parenthetical construction here is a rhetorical device to convince parents of the point made. But in its formal construction this parenthetical if-clause rather resembles a logical if-clause:

(65') If there ever was a time for parents to show their thoughtfulness and generosity towards each other, this is the time.

However, this paraphrase in (65') misses the rhetorical point, since it presupposes the contents as already known information, whereas in (65) everything except this (time) is new, even the addition of the parenthetical if there ever was one. The above analysis of CICs has shown and confirmed the correctness of our assumption that these if-clauses are marginal cases and that, with the exception of performative if-constructions, they are not prototypical conditionals. Though they are relatable to either CECs or to HCs (but also to LICs), they lead a life of their own and are not to be treated as reduced CECs, HCs or LICs.

4. Conclusions

In this study, we have been able to distinguish between four types of if-clauses. Given the different frequencies with which they are used, these four types of if-clauses have different degrees of cognitive saliency: obviously, the course of events conditionals (44,7%) seem to have a greater cognitive saliency than the hypothetical conditionals (36,9%). These two together (81,6%) cover about four-fifths of the total uses of if-clauses. The remaining one-fifth (18,4%), although given very much attention in the various papers referred to in the introduction, has a marginal status, especially when one takes into account that there are no internal links between the two types, so that logical if-clauses only constitute 11,2% and conversational if-clauses only constitute 7,2%. Of course, this does not mean that they are negligible phenomena, but clearly they are not the ideal points of departure for a global approach to if-clauses. It is only by realising their fairly marginal status that one may come to understand their proper cognitive and/or communicative functions.

In these conclusions we will try to make a first attempt at characterising the cognitive needs that seem to govern the various types of if-clauses. This characterisation cannot, of course, explain why the two types of conditionals (CECs and HCs) cover so much ground and why the LICs and the CICs are marginal cases. But what we can do is to give a characterisation of the cognitive needs that may underlie the use of each of the four types of if-clauses.

The major cognitive need for the use of the CECs is the following situation: an observer, be it a scientist, a counsellor, or any other type of expert, knows that there is a fixed pattern in a set of events. Thus he knows that whenever event A occurs, B occurs as well, and when A does not occur, B is postponed and may occur later. So there is no element of prediction here but firm knowledge of real situations which occur regularly and which allow of such generalisations. The observer then has several possibilities to perspectivise the relationships between event A and event B. By using the relator whenever he makes an absolute statement; indeed it would sound too apodictic to talk like that all the time, although in some languages the conditional relator derives from equivalent expressions of whenever (see the discussion of examples (7) and (7') and footnote 2). Alternatively, the observer can exploit the ubiquitous vagueness of language forms and select a less apodictic relator. In English, we have the somewhat extreme situation that a very sharp distinction is made between temporal when and conditional if. But in various other languages, one form can be used for both of these English relators. For instance, in German the relator wenn can be used for both temporal and conditional sentences and in the colloquial variety of German there is not even a special relator to express the concept of mere conditionality.⁹

The major cognitive need for hypothetical conditionals is quite, if not totally, different from that of the CECs. Whereas in CECs the observer starts from recurring and generalised experiences, there is a different conceptual situation in HCs and hence a different cognitive need urging the speaker to use HCs. The observer is confronted with single cases of any type of event. These cases are not situated in immediate or non-immediate reality but at a variable degree of epistemic distance from the observer's position and consequently they are by and large time-independent. What he does conceptually is to make suppositions, and link them to possible consequences. Why do speakers need to make suppositions? As Haiman (1978: 583) says, and as is confirmed by Ford and Thompson (1986: 370), the if-clause has mainly a function of introducing topics in the protasis whereas in the apodosis there is room for making comments. This, of course, can only be true in cases of preposed if-clauses. The prototypical function of the preposed hypothetical if-clause is then to introduce topics that have more or less a speculative character so that the speaker takes a possible situation, introduces it as a potential case and then states what this will lead to. In the process of making suppositions and linking them with their consequences, the speaker does not make statements but he makes predictions. The question is: why does he do so? We have no clear answer to this very important question about the cognitive need that underlies this fact. We can only assume tentatively that the speaker makes a prediction as a kind of distancing move from the present situation. So his prediction could have the negative function of avoiding making a statement. The difference between a statement and a prediction comes out very clearly in the often-quoted set of examples:

(66) a. If John says that, he is a liar.
     b. If John says that, he will be a liar.
     c. If John said that, he would be a liar.

In the first example, the speaker does not really introduce a new topic, but rather reiterates something that has already been stated by someone else before and then makes a concluding statement he is a liar. It is a logical if-clause (LIC). In the second example, however, the speaker does not refer to a former statement by someone else and he does not make any statement about John; on the contrary, he merely hints at a potential situation of John saying something and he expresses his evaluation of this possible situation. He uses an HC. So here the speaker has refused to commit himself in any way; he steps out of the present, actual reality and distances himself into imaginary non-reality (66b, c). He can either take a minimal distance and then use the future will (66b) or a greater distance and then use the past tense form (66c).

A natural corollary of this conceptual combination of supposition and prediction is the need to differentiate between various levels on a scale of likelihood. As the distribution of the two main types of HCs shows, there are 65 neutral instances and 41 non-neutral HCs. This ratio of approximately 60-40 suggests that in the majority of cases speakers only want to take a minimal distance from any actual present reality. In the other cases (40%), speakers do want to take a greater distance, and express that the situation is more or less likely, unlikely or, in the extreme case, unreal.

A second corollary, but of a different nature, is that HCs are extremely apt to carry all kinds of speech act forces. As referred to in the introduction, if-clauses have from a very early stage been approached from the point of view of their pragmatic potential (Wunderlich 1977; Haegeman 1984; Van der Auwera 1986; Akatsuka 1986). What these authors have not yet tried to discuss is the question why if-clauses should have such a rich pragmatic potential. We are now in a position to rephrase this question in a sharper way: why do especially hypothetical conditionals have this rich pragmatic potential? This may very well be due to the fact that in HCs we are concerned with suppositions and predictions. We do not commit ourselves, but we distance ourselves from any real world situation and speak in a more indirect way about things. This non-commissive, indirect way of expressing oneself is the ideal vehicle for speech act forces. Even the simplest prediction, as in the example quoted before in (38) and here requoted in (67):

(67) If there's a beast, we'll hunt it down.

can be used with the force of a promise, a threat, a warning; in a word, we can pragmatically commit ourselves because we have not committed ourselves epistemically.

After having pointed out these fundamental differences between CECs and HCs, the question must be raised again whether there is still something in common between the two types of if-clauses, as was suggested by the use of a common label for the two types of if-clauses, i.e., conditional if-clauses (as against the label marginal conditionals for types 3 and 4). The reason why types 1 and 2 were subsumed under the label "conditional" in the introduction was that they both exhibit two events, whereas the marginal conditionals do not denote any direct relationship between two events. So the label "conditional" in CECs and HCs must somehow be understood as a negative characterisation: it does not do much more than merely contrast the great majority of over 80% against the remainder of less than 20% marginal conditionals. But that's all there is to it. The conceptual difference between CECs and HCs is so great that they seem to constitute different cognitive strategies. In reality CECs and HCs may very well be as different from each other as they are from marginal conditionals.

We can be very brief about these marginal conditionals, comprising logical and conversational if-clauses. The logical if-clauses are based on the conceptual process of identification or inference. For the sake of easy discussion we will here only concentrate on the identification type of LICs. The main cognitive need for using an if-clause at all in this situation is the necessity of being cautious and hence not all too direct. Let's consider two sentences quoted before in (3) and (53).

(68) a. If there's an elite in China, it's the masses.
     b. If the shark does kill someone else, then it's Vaughan.

In the first example, the speaker cannot commit himself to the truth of the protasis and therefore makes a kind of cautious proviso (if we accept the thesis that there is an elite in China), which he then can also fill in in his own way (the elite is the masses). The second example is a bit different because the speaker does not make any proviso, but in fact stresses the real nature of the situation denoted in the protasis (the shark does kill someone). Still, he plays down his certainty and makes a less direct statement, but it remains a statement and this second example is somehow closer to CECs. In both examples, the speaker is playing a game with the language, which however does not exclude a deep cognitive need.

Conversational if-clauses are strongly related to a factor in the conversational context but again it is often not absolutely necessary to use an if-clause. We could as well use a clause expressing reason by means of since. Thus in the example quoted before in (58) What about the parents demonstrating, if there are no friends? we could use since instead of if. The use of if in such a case can be motivated by playing down the unambiguous since and using the pragmatically indirect if-construction. Finally, the parenthetical if-clauses seem to be motivated by their functioning as a kind of afterthought. A typical example quoted in (64) was The birds will be scattered all over the place after that - if there is any left. There is not even a syntactic link between the first part and the second part of the utterance. In fact, the protasis if there is any left has no apodosis. But isn't that the proper structure of an afterthought which is made plausible precisely because of the speaker's extralinguistic knowledge?

This discussion of the four types of if-clauses has led to a somewhat remarkable conclusion: the only type of if-clause where if functions in a fully transparent semantic way is the hypothetical if-clause. Indeed, in type 1 (CEC) if is used to play down the strong character of an apodictic statement that would be evoked by the use of the closest comparable relator whenever; in type 3 (LIC), we may play down the strong certainty that we have; and in type 4 (CIC) we may play down the use and the expression of too obvious a reason. This conclusion raises the further question of whether the use of if in type 2 (HC) could be called the prototypical use. An affirmative answer seems to be suggested by the fact that most researchers and grammar books have so far limited their attention to this type of conditional. But we prefer not to draw any conclusions at this moment and will take up the question again later* when we have finished a survey of all the possible uses of if.

Appendix: Frequency of the various types of if-clauses

1.    Course of Events Conditionals (CECs)                           138 (44,7%)
1.1.  Descriptive Conditionals                                        61
1.2.  Inferencing Conditionals                                        27
1.3.  Instructive Conditionals                                        50
2.    Hypothetical Conditionals (HCs)                                114 (36,9%)
2.1.  Neutral HCs                                                     65
2.2.  Non-neutral HCs                                                 41
2.3.  Peripheral                                                       8
3.    Marginal Conditionals: Logical and Conversational if-clauses    57 (18,4%)
3.1.  Logical if-clauses (LICs)                                       34
3.2.  Conversational if-clauses (CICs)                                23
      Sum                                                            309
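The percentages attached to these counts can be rechecked with a few lines of arithmetic. The following is only a minimal sketch: the counts are copied from the appendix, while the dictionary layout and the names `counts`, `share` and `type_totals` are illustrative assumptions, not part of the original study.

```python
# Counts copied from the appendix; the nesting and key names are illustrative.
counts = {
    "CEC": {"descriptive": 61, "inferencing": 27, "instructive": 50},
    "HC": {"neutral": 65, "non-neutral": 41, "peripheral": 8},
    "marginal": {"LIC": 34, "CIC": 23},
}


def share(part, whole):
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)


# Totals per main type and for the whole corpus sample.
type_totals = {name: sum(sub.values()) for name, sub in counts.items()}
total = sum(type_totals.values())

# Share of each main type in the total.
for name, n in type_totals.items():
    print(f"{name}: {n} ({share(n, total)}%)")

# Share of each marginal subtype in the total.
for name, n in counts["marginal"].items():
    print(f"{name}: {n} ({share(n, total)}%)")

# Neutral HCs as a share of neutral plus non-neutral HCs (peripheral cases left out).
hc = counts["HC"]
neutral_vs_non = share(hc["neutral"], hc["neutral"] + hc["non-neutral"])
print(f"neutral share of neutral + non-neutral HCs: {neutral_vs_non}%")
```

With the counts as printed, the three main shares come out at 44.7%, 36.9% and 18.4%, matching the appendix. The LIC and CIC shares come out at 11.0% and 7.4%, slightly different from the 11,2% and 7,2% given in the Conclusions, although their sum is the same. Neutral HCs make up about 61% of the neutral plus non-neutral cases, in line with the ratio of approximately 60-40 mentioned there.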

Notes

* This is part of a much larger research project in progress. We thank Eugene Casad as well as an anonymous referee for their insightful remarks and suggestions. The inconsistencies and flaws that remain are, of course, our own responsibility.

1. There are many more types of if-clauses than the four we have selected for discussion. For instance, if-clauses in indirect constructions such as ask if, wonder if, decide if, and complex if-constructions with even if, as if, only if, etc. Such types must eventually be included in a full overview of the use of if-clauses. But also other types of conditionals introduced by new prepositions such as provided that, in case, etc. should be taken into account. Of course these are all problem areas that are intertwined but they cannot be discussed in the framework of a single paper.

2. It is very interesting to note that Closs-Traugott (1985: 292) mentions as one type of the historical source of the equivalents of if in various languages the equivalents of whenever. In these languages if-conditionals are a further abstraction of whenever-constructions.

3. Given the nature of the corpus that was available to us and which only consists of isolated instances of if-clauses, one must try to reconstruct the possible context in which each example could have occurred or might reasonably be expected to be able to occur. For instance, sentence (6) clearly suggests a narrative context. More generally speaking, every isolated sentence can only be made sense of if we construct or invent a context for it.

4. Will can, of course, also have the meaning of habit, e.g. He will sit there for hours and say nothing; this reading is highly unlikely in (8b).

5. This reflects the general frequency of can and may, for which the LOB-corpus gives 1,831 can tokens and 1,395 may tokens (see Dirven 1991: 55). But in general, may is far more frequent than can to express possibility. The 1,831 tokens for can in the LOB-corpus also cover all the cases of ability, which may even constitute the majority of the uses of can.

6. English also uses forms of was/were to be, should and inverted conditionals to express various degrees of (un)likelihood, but since these were not found in the (parts of the) corpus under study, we'll not go into them here.

7. In our opinion (44) is likely, (45) unlikely, and (46) unreal.

8. This type of if-clause has received a good deal of attention in the literature, for instance in Haegeman (1984), Johnson-Laird, who calls them "relevance conditionals", and in König & Van der Auwera (1988), who make the interesting observation that in German and Dutch equivalents of this construction there is no inversion as in normal sentences with preposed if-clause, but on the contrary, there is the order subject-verb as if these were independent clauses. The German and Dutch equivalents of (57) are as follows:

(57') a. Wenn jemand mich sucht, ich bin unten.
      b. Als iemand naar mij vraagt, ik ben beneden.

9. German can of course express the concept of mere conditionality by means of the word falls, which is an old genitive of the noun Fall/Falles "(in) case". But the form falls is limited to more formal language and probably comes from judicial and legal language. The situation is similar though not identical in Dutch. The relator als is used both in a temporal and a conditional sense. Dutch also has a more formal item to express the notion of mere conditionality, i.e. indien, which derives from the preposition in and the determiner die(n). The difference with German, however, is that Dutch indien is somewhat less marked for formal language.


References

Akatsuka, Noriko
1986 "Conditionals are discourse-bound", in: Elizabeth Closs-Traugott et al. (eds.), On conditionals, 333-351.

Athanasiadou, Angeliki
1988 "Theoretical model of conditionals", Journal of Applied Linguistics 4: 12-27.

Closs-Traugott, Elizabeth
1985 "Conditional markers", in: J. Haiman (ed.), Iconicity in syntax. Amsterdam: Benjamins, 289-307.

Comrie, Bernard
1986 "Conditionals: a typology", in: Elizabeth Closs-Traugott et al. (eds.), On conditionals, 77-99.

Davies, E.
1979 On the semantics of syntax: Mood and condition in English. London: Croom Helm & Atlantic Highlands, New York: Humanities Press.

Declerck, Renaat & S. Seki
1990 "Premodified reduced it-clefts". Preprint, Katholieke Universiteit Leuven Campus Kortrijk: Faculteit van de Letteren en de Wijsbegeerte.

Dirven, René (ed.)
1989 A user's grammar of English. Frankfurt/Bern: Lang.

Dirven, René
1990 "Schema and subschemata in the lexical structure of the verb agree", in: Sylviane Granger (ed.), Perspectives on the English lexicon. Louvain-la-Neuve, CILL 17: 25-42.
1991 "The Functions of modal auxiliaries and modal adverbs", in: Piotr Kakietek (ed.), Problems in the modality of natural language. Opole: The Pedagogical University of Opole. (Studies and Monographs No. 153).

Dirven, René & Günter Radden
1977 Semantische Syntax des Englischen. Wiesbaden: Athenaion.

Eilfort, W. H.
1987 "A unified analysis of initial if-clauses in English", in: B. Need, E. Schiller & A. Bosch (eds.), Papers from the 23rd Regional Meeting Chicago Linguistic Society. Chicago: Chicago Linguistic Society, 56-63.

Ford, Celia E. & Sandra A. Thompson
1986 "Conditionals in discourse: a text-based study from English", in: Elizabeth Closs-Traugott et al. (eds.), On conditionals, 353-371.

Geis, Michael L. & Arnold M. Zwicky
1971 "On invited inferences", Linguistic Inquiry 2: 561-566.

Haegeman, Liliane
1984 "Pragmatic conditionals in English", Folia Linguistica 18: 485-502.

Haegeman, Liliane & Herman Wekker
1984 "The syntax and interpretation of futurate conditionals in English", Journal of Linguistics 20: 45-55.

Haiman, John
1978 "Conditionals are topics", Language 54: 564-589.
1983 "Paratactic if-clauses", Journal of Pragmatics 7: 263-281.

Harper, W. L., R. Stalnaker & G. Pearce (eds.)
1981 Ifs: conditionals, belief, decision, chance, and time. Dordrecht: Reidel.

Hilpinen, R.
1981 "Conditionals and possible worlds", in: G. Floistad & G. H. von Wright (eds.), Contemporary philosophy. Vol. 1. The Hague: Nijhoff, 299-335.

Johnson-Laird, Philip N.
1986 "Conditionals and mental models", in: Elizabeth Closs-Traugott et al. (eds.), On conditionals, 55-75.

König, Ekkehard & Johan van der Auwera
1988 "Clause integration in German and Dutch conditionals, concessive conditionals, and concessives", in: John Haiman & Sandra Thompson (eds.), Clause combining in grammar and discourse. Amsterdam: John Benjamins.

Lakoff, George
1987 Women, fire and dangerous things. What categories reveal about the mind. Chicago: University of Chicago Press.

Langacker, Ronald W.
1987 Foundations of cognitive grammar. Vol. I: Theoretical prerequisites. Stanford, CA: Stanford University Press.
1991 Foundations of cognitive grammar. Vol. II: Descriptive application. Stanford, CA: Stanford University Press.

Lauerbach, Gerda
1979 Form und Funktion englischer Konditionalsätze mit "if". Eine konversationslogische und sprechakttheoretische Analyse. Tübingen: Niemeyer.

Leech, Geoffrey N.
1969 Towards a semantic description of English. London: Longman.

Meier, G. E. H.
1988 "The if-cleft sentence", Acta Linguistica Hafniensia 21: 51-61.

Quirk, Randolph et al.
1972 A grammar of contemporary English. London: Longman.

Schachter, J.
1971 Presupposition and counterfactual conditional sentences. Ph.D. dissertation, University of California, Los Angeles.

Sinclair, John et al.
1990 Collins/Cobuild English grammar. London and Glasgow: Collins.

Smith, N. V.
1983 "On interpreting conditionals", Australian Journal of Linguistics 3: 1-23.

Sweetser, Eve
1990 From etymology to pragmatics. Cambridge: Cambridge University Press, 113-144.

Traugott, Elizabeth C., Alice ter Meulen, Judy Snitzer Reilly & Charles A. Ferguson (eds.)
1986 On conditionals. Cambridge: Cambridge University Press.

van der Auwera, Johan
1986 "Conditionals and speech acts", in: Elizabeth Closs-Traugott et al. (eds.), On conditionals, 197-213.

Wunderlich, Dieter
1977 "Assertions, conditional speech acts, and practical inferences", Journal of Pragmatics 1: 13-46.

Boundedness in temporal and spatial domains

Hana Filip

1. Introduction

The influence of verbal predicate operators on the meaning of nominal predicates in Slavic languages, such as Czech, can be best illustrated with examples that contain perfective verbs and undetermined NPs that are headed by mass and plural nouns. Consider, for example, Vypilᴾ víno - 'He drank up (all) the wine'. Here the mass Direct-Object-NP 'wine' is interpreted as "bounded" and "referentially specific", even though mass nouns on their own denote unbounded continua of stuff that cannot be anchored to any particular entity. Such examples clearly show that mass NPs derive their interpretation from the verbal predicate operator marking the perfective aspect. This phenomenon has rarely been noticed in Slavic languages, let alone systematically described. In addition to such semantic effects that are typically ascribed to articles, verbal predicate operators can have various semantic effects that are comparable to those of determiner quantifiers and various quantificational and measure expressions within NPs.

The non-compositional nature of the data to be described here poses an intriguing puzzle: we have here an encoding system that is located on the verb and that is primarily designed for expressing semantic distinctions pertaining to the domain of events, yet at the same time this system is exploited in the interpretation of nominal arguments. I will propose a description within an approach that takes as a basic unit the notion of grammatical construction and that assumes that conceptual event schemas provide the background against which sentences are interpreted. The crucial role in this interpretive process is played by the well-attested homologies between the temporal and spatial domains and by the Incremental Schema that serves to relate these two domains in a systematic way.

In Section 2 the pivotal notions of aspect and Aktionsart, as they are understood here, are characterized. Section 3 describes in detail all the relevant data. Section 4 sketches previous analyses. Section 5 outlines the framework I will be working in. Section 6 provides further evidence for the observations and analysis in Sections 3 and 5.

2. Aspect and Aktionsart

In order to understand how verbal morphology interacts with the semantics of nominal arguments, a few introductory remarks on Slavic verb morphology and its relation to the categories of aspect and Aktionsart are in order. Every state of affairs that is changeable in time has, in principle, a beginning, a certain extent, and an end. Every such state of affairs may be conceived of as having boundaries. Perfective predications can be analyzed in terms of a semantic representation that contains a temporal boundary on the denoted state of affairs. They are "bounded" in this sense. Imperfective predications lack such a boundary; they are "unbounded" (this principle of contrast goes back to the Praguian markedness analysis; cf. Jakobson 1936/71).

In general, a perfective operator selects the boundaries that are typical for the various classes of states of affairs denoted by the predication in its scope. Since telic verb expressions (accomplishments and achievements) entail an inherent definite change that necessarily terminates the denoted state of affairs, a perfective operator focuses on the final boundary, on the fact that the change was (or will be) attained. Certain stative states of affairs, such as those involving knowledge, beliefs or dispositions, are states that can be acquired or entered into, and they do not typically entail any definite end state. Therefore, it is cognitively significant to mark their beginning (or inchoative phase). This can be achieved by applying a perfective operator to an imperfective stative verb, as in the Czech sentence Zamiloval^P se do ní 'He fell in love with her'. The corresponding imperfective verb would be used in a sentence denoting the resulting state of affairs: Miloval^I ji 'He loved her'. A perfective operator applied to verbs denoting activities
also selects the beginning (inchoative phase), as in Rozplakal^P se 'He started to cry'.

The aspectual "perfective (P) - imperfective (I)" distinction in Czech is coded by lexical-derivational means: by prefixation (psát 'to write' - přepsat^P 'to write over/again'), suffixation (otrhat^P - otrhá-va-t^I 'to pick'), change of the stem extension (skákat 'to jump, i.e., to be jumping' or 'to jump repeatedly' - skočit^P 'to jump') or suppletion (brát^I - vzít^P 'to take').1 In general, most Slavic verbs can be classified as either perfective or imperfective.

Apart from their aspect coding function, many verbal predicate operators (prefixes, suffixes, etc.) have effects on the lexical semantic properties of verbs that have been described under the notion of "Aktionsart" (a German term meaning "kind of action"). The notion "Aktionsart" is independently needed for the description of the interaction between aspect and the lexical semantics of the verbal expressions on which aspect operates. In Slavic and Germanic linguistics, Aktionsart is traditionally used in the narrow sense for semantic distinctions expressed by lexical-derivational morphology. Such distinctions concern, for example, manner, quantity, measure, phase ("inchoative" or "inceptive", "continuative" and "terminative", etc.) and degree of intensity, as well as such quantificational notions as "iterativity", "semelfactivity" and "distributivity" (cf. Isačenko 1960 and 1962, for example). More recently, Aktionsart has been extended beyond its narrow, morphologically based understanding to include certain semantic distinctions not only on the level of the lexical semantics of individual verbs but also on the level of VPs and sentences. In this broad sense, it comprises Vendler's (1957/67) classes state, activity, accomplishment and achievement (cf. Hoepelman 1981; Hinrichs 1985, among others) or the corresponding "telic-atelic" distinction that was coined by Garey (1957).2

3. Verbal predicate operators and the semantics of noun phrases

The influence of verbal predicate operators on the meaning of nominal predicates in Slavic languages like Czech can be best illustrated in transparent contexts with examples that contain determinerless NPs that are headed by mass and plural nouns, as is illustrated by the pairs of sentences in (1) and (2):

(1) a. Pil^I víno.
       drank-SG-MASC wine-SG-ACC
       'He was drinking (some) wine.'
    b. Vypil^P víno.
       PREF-drank-SG-MASC wine-SG-ACC
       'He drank up (all) the wine.'

(2) a. Dával^I jim knihy.
       gave-SG-MASC them-DAT-3PL books-PL-ACC
       'He was giving them books.'
    b. Porozdal^P jim knihy.
       PREF-PREF-gave-SG-MASC them-DAT-3PL books-PL-ACC
       'He gave them (all) the books.'

The sentences in each of the above pairs differ in their main verbs: (1a) and (2a) are headed by simple imperfective verbs, while (1b) and (2b) are headed by the corresponding prefixed perfective verbs. In (1b) and (2b), the prefix modifies the verb so that it roughly has the meaning "V completely/all the way". Since the verb has an object that can be viewed either in terms of its parts or in its entirety, the interpretation of "V + DO-NP" is approximately "V all the DO-NP(s)". In other words, (1b) and (2b) entail that the denoted event ended when the Agent finished drinking all the available wine and giving out all the books. The negation of this end state yields a contradiction, as is shown in (3):

(3) *Vypil^P víno_i, ale nevypil^P je_i všechno.
    PREF-drank-SG wine-SG-ACC but not-drank-SG it-SG-ACC all-SG-ACC

In short, (1b) and (2b) have an "all-inclusive" or "holistic" entailment and they can be paraphrased with sentences containing the determiner quantifier all within their DO-NPs. In general, if some entity is understood as being completely subjected to an event, there must be some limits imposed on its spatial extent: it must constitute one bounded whole. Hence, in (1b) and (2b), the mass NP "wine" and the plural NP "books" are understood as denoting bounded entities, even though mass and plural NPs in general denote entities that do not have inherent boundaries, that is, they are inherently unbounded (cf. also Talmy 1986: 14 and Jackendoff 1990: 5ff.).

In the corresponding imperfective sentences, (1a) and (2a), the imperfective verbs may be interpreted as "V incompletely/unsuccessfully/part of the way" or "nearly V". And since the DO-NP denotes an entity that allows for a "part of" interpretation, this results in "V only some DO-NP(s)" or "V only some of the DO-NP(s)". Under this partitive interpretation, (1a) and (2a) assert that some wine and some books out of some understood larger quantity of wine and books were subjected to the denoted event. In other words, the mass NP "wine" and the plural NP "books" are unbounded.

(2b) illustrates another important piece of data. Due to the complex perfectivizing prefix po-roz-, (2b) has a quantificational, distributive reading. (2b) entails that all the books were gradually distributed, one after another, among the recipients. It can then be argued that the prefix po-roz- here expresses a type of quantification: it requires a domain restriction and a scope. While the prefix indicates what sort of quantification is involved in the proposition expressed by (2b), the nominal argument denotes the kind of individuals the quantification is restricted to range over. (2a), on the other hand, asserts that 'he' was in the process of giving away books, without providing any information about the way in which this was done.
Apart from "distribution", verbal predicate operators carry such notions as "succession", "iteration" and also "small quantity", "large quantity", "some unspecified bounded quantity", etc. These notions are used in traditional Slavic linguistics to delimit various Aktionsart types. Such notions are also relevant for the interpretation of nominal arguments. Hence, verbal predicate operators convey meanings that are typically expressed by quantifiers and various other quantifying and measure expressions within NPs.

The functions that are ascribed to articles, in particular the expression of referential specificity, are related to the aspectual "bounded/unbounded" distinction. Like most other Slavic languages, Czech does not have an overt article system. The semantic differences that are carried by articles in English, for example, are here inferred through, or expressed by, a variety of morphological, syntactic, prosodic and lexical devices: word order, stress, determiner quantifiers, and various other lexemes that modify nouns. However, what has been less frequently noticed in this connection, let alone systematically described, is the influence of verbs on the interpretation of nominal arguments. In the most natural, single-event interpretation of (1b) and (2b), the DO-NPs "wine" and "books" refer to a contextually specific or known portion of wine and to a specific set of books, respectively. In other words, the use of undetermined NPs with mass and plural noun heads here corresponds to the referential use of definite descriptions in languages that have a definite article. So in such sentences as (1b) and (2b), the DO-NPs will most likely be translated into English with the definite article the. In (1a) and (2a), on the other hand, it may be irrelevant for the purposes of communication whether the DO-NPs have specific referents or not. In many contexts, the use of undetermined NPs with mass and plural noun heads most closely corresponds to English NPs with no articles (or perhaps with the unstressed "some"). The crucial point illustrated by the pairs of sentences in (1) and (2) is that the differences in the interpretation of nominal arguments arise from verbal morphology.
Such examples show that verbs in Czech have semantic effects on nominal arguments that are comparable to those of (i) articles and (ii) determiner quantifiers and various other quantifying and measure expressions. The most striking examples are those with undetermined mass and plural DO-NPs in perfective sentences, (1b) and (2b), which clearly show that mass and plural NPs derive their bounded, quantificational and referentially specific interpretation from perfective verbs. One of the puzzles that needs to be accounted for is that a difference in verb morphology is not always correlated with a difference in the interpretation of nominal arguments. This is shown by such sentences as those in (4) and (5):

(4) a. Viděli^I různobarevné balóny.
       saw-PL-MASC multicolored-PL-ACC balloons-PL-ACC
       'They saw multicolored balloons.'
    b. Uviděli^P různobarevné balóny.
       PREF-saw-PL-MASC multicolored-PL-ACC balloons-PL-ACC
       'They saw multicolored balloons.'

(5) a. Míchala^I jsem polévku.
       stirred-SG-FEM am-AUX-1SG soup-SG-ACC
       'I was stirring (the) soup.'
    b. Zamíchala^P jsem polévku.
       PREF-stirred-SG-FEM am-AUX-1SG soup-SG-ACC
       'I stirred (the) soup.'

If there is any difference at all in the interpretation of DO-NPs in the above imperfective and the corresponding perfective sentences, it will stem from contextual factors other than just the difference in verb morphology. Such data from Slavic languages are known. However, apart from occasional marginal comments (Wierzbicka 1967; Forsyth 1970; Comrie 1976; Rassudova 1977; Talmy 1991), there has been no attempt to provide a systematic account of them.2

The interaction between verbal predicate operators and nominal arguments constitutes an important field of study. Among other reasons, two are especially important for the present study. First, these data are not compositional. Examples (1) and (2) show that the interpretation of mass and bare plural NPs depends on the verb. Second, the interaction between verbal predicate operators and nominal arguments promises to give us valuable insights into language-specific schematizations and semantic universals. One of the most important questions that arises in this connection is the following: Under which conditions does a given verbal predicate operator extend its semantic effects over a nominal argument?

4. Previous approaches

4.1. D-quantification and A-quantification

Partee suggests that the cases in which "an operator with some quantificational force (and perhaps further content as well) is applied directly to a verb or other predicate at a lexical level" (Partee 1990: 19) seem "best analyzed as an operation on the argument structure of the verb with a corresponding semantic operation on the interpretation" (Partee 1990: 19). She illustrates this point with the Czech prefix po-. Consider the following examples:

(6) a. Maloval^I hesla (na stěnu).
       painted-SG slogans-PL-ACC (on-PREP wall-SG-ACC)
       'He painted (the/some) slogans (on the wall).'
    b. Pomaloval^P stěnu (hesly).
       PREF-painted-SG wall-SG-ACC (slogans-PL-INSTR)
       'He covered the wall (with slogans).'

The prefix po- can be applied to verbs that belong to the family of writing, drawing, etc. The resulting perfective verb takes as its direct object the optional locative complement of the original imperfective verb (what one writes on, etc.) and it does not allow any overt expression of the original direct object (what is written, etc.). The meaning is "write all over X" or "cover X with writing", etc., "a meaning which is in a certain sense quantificational but is certainly to be captured at a lexical rather than a syntactic level" (Partee 1990: 19).

An interaction between verb morphology and nominal arguments similar to the one that can be observed in Slavic languages like Czech has also been noticed in other languages (Hindi, Japanese, cf. also Hale's work on Warlpiri, Evans' work on Warlpiri and Gun-djeyhmi, and Bach's work on Haisla in Partee 1990). Consider, for instance, the following Warlpiri example with the partitive preverb puta- quoted by Partee (1990: 17):

(7) Ngapa O-ju puta-nga-nja.
    water AUX-1SG PART-drink-IMP
    'Just drink some (not all) of my water!'

According to Partee (1990: 18), the preverb puta- should be regarded as modifying both the verb and its direct object. In order to capture the parallels between the different morpho-syntactic means by which quantification is expressed within one language and across languages, Partee, Bach & Kratzer (1987) suggest that we distinguish two main syntactic classes: D-quantification, typically expressed in the NP with determiner quantifiers, and A-quantification, which is expressed at the level of the sentence or VP with sentence adverbs (usually, always), "floated" quantifiers (each), auxiliaries, and verbal affixes, for example. The study of non-NP quantification by means of various A-quantifiers is important as "a counterbalance to the nearly exclusive concentration on NP quantification in most of the previous syntactic and semantic literature" (Partee 1990: 8).

According to Partee, A-quantification is a heterogeneous class and it subsumes a variety of phenomena that can be divided into the following sets: (i) "true A-quantification, with unselective quantifiers and a syntactic basis for determining, insofar as it is determinate, what is being quantified over, and (ii) lexical quantification, where an operator with some quantificational force (and perhaps further content as well) is applied directly to a verb or other predicate at a lexical level, with (potentially) morphological, syntactic, and semantic effects on the argument structure of the predicate" (Partee 1990: 19). Partee (1990) also points out that the semantic effects of verb morphology are directed at a specific argument of the verb. In this respect, lexical quantification differs from A-quantification, in particular A-quantification expressed by sentential adverbs, such as always, that can bind more than one free variable in their scope. In this connection, Partee, Bach and Kratzer (1987: 21) pose a question similar to mine: "What are the constraints for associating a quantifier with the arguments of a verb?" I suggest that a significant part of the answer to my question, as well as to the question posed by Partee, Bach and Kratzer (1987: 21), can be found in Krifka's lattice-theoretic analysis.

4.2. Lattice-theoretic analysis

Krifka's (1989; 1992) account of the interaction between verbal and nominal predicates has the following main features: he applies Link's (1983) lattice-theoretic analysis of mass and plural NPs to both NPs and verbal predicates. He argues that this interaction can be explained by establishing a homomorphism between algebraically structured NP and verbal predicate denotata. Furthermore, the laws governing the influence of nominal predicates on verbal predicates are to be stated relative to thematic relations. Since my proposal takes Krifka's lattice-theoretic analysis as its point of departure, I will describe these features in more detail.

First, within the lattice-theoretic semantics the domains of events and individuals are characterized as two non-overlapping sorts of entities, each of which has the structure of a join semi-lattice without a bottom element. Algebraic relations, which characterize a homomorphism, are then defined between the lattices representing the predicates of objects and events (cf. Krifka 1989; 1992).3 In lattice sorts, we can represent in an explicit way the well-known parallels between NPs, that is, mass nouns, count nouns, measure constructions, and plurals, and the Aktionsart of verbal predicates, activities and accomplishments (cf. Verkuyl 1972; Fiengo 1974; Gruber 1967; Taylor 1977; Mourelatos 1978; Talmy 1978 and 1986; Bach 1986a and 1986b; Langacker 1987a and 1987b; Jackendoff 1990, among others).
Mass and plural NPs and atelic verb expressions - activities such as 'running', and states such as 'knowing' - are both cumulative. Cumulative expressions pass the additivity test: "(a) If a is water and b is water, then the sum of a and b is water" and "(b) If the animals in this camp are horses, and the animals in that camp are horses, then the animals in both camps are horses" (Link 1983: 303). On the other hand, singular count NPs (an apple), quantified NPs (five apples) and measure NPs (a glass of wine) and telic expressions - accomplishments like building a house, and achievements like arriving - are quantized (cf. Krifka 1986; 1989). An expression is quantized if it does not pass the additivity test, or conversely if it is non-divisible: that is, if one cannot divide its referent up and get individual parts that can be named by the same expression.

Second, Krifka argues that the interaction between verbal and nominal predicates can be explained by establishing a homomorphism between algebraically structured NP and event denotata. In particular, the homomorphism idea motivates the influence of the reference type of NPs on the Aktionsart (telicity) of VPs or sentences. For example, in John drank wine the mass NP wine is responsible for an atelic reading of the whole complex verbal predicate, while in John drank a glass of wine the measure NP a glass of wine yields a telic verbal predicate. To illustrate how the idea of homomorphism works, we can give the following example from Dowty (1991: 567): If somebody mows the lawn, this event can be measured by the state of the lawn, because the lawn acquires a new property in distinguishable, separate stages; it changes 'incrementally' in lockstep with the progression of the mowing event. We can map the state of parts of the lawn and their part-whole relationships into the parts of the event of mowing that lawn and its part-whole relationships. Therefore, since the lawn has a definite extent, the event of mowing that lawn does as well.
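The additivity test, the notion of quantization and the lawn-mowing homomorphism described above can be made concrete with a small computational sketch. It is only an illustration under simplifying assumptions that are not part of the analyses cited here: the domain is finite, sums are modelled as set unions of hypothetical minimal portions (w1, w2, ...), "a glass of wine" is arbitrarily taken to be any sum of exactly two portions, and the labels s1-s3 and e1-e3 for lawn strips and mowing sub-events are invented for the example.

```python
from itertools import combinations


def nonempty_subsets(atoms):
    """Return every non-empty subset of `atoms`; each subset models one sum (join)."""
    atoms = sorted(atoms)
    return [frozenset(c)
            for size in range(1, len(atoms) + 1)
            for c in combinations(atoms, size)]


def is_cumulative(pred, domain):
    """Additivity test: whenever pred holds of x and of y, it holds of their sum."""
    members = [x for x in domain if pred(x)]
    return all(pred(x | y) for x in members for y in members)


def is_quantized(pred, domain):
    """No proper part of something that satisfies pred satisfies pred itself."""
    members = [x for x in domain if pred(x)]
    return not any(y < x for x in members for y in members)


# Toy domain: four hypothetical minimal portions of wine and all of their sums.
domain = nonempty_subsets({"w1", "w2", "w3", "w4"})


def wine(x):
    """Mass predicate: every portion, and every sum of portions, is wine."""
    return True


def glass_of_wine(x):
    """Measure predicate (assumption: one glass equals exactly two portions)."""
    return len(x) == 2


print(is_cumulative(wine, domain), is_quantized(wine, domain))  # True False
print(is_cumulative(glass_of_wine, domain), is_quantized(glass_of_wine, domain))  # False True

# Lawn example: each hypothetical strip of lawn is mown in one sub-event.
strip_to_subevent = {"s1": "e1", "s2": "e2", "s3": "e3"}


def h(lawn_part):
    """Map a part of the lawn to the part of the mowing event that affects it."""
    return frozenset(strip_to_subevent[s] for s in lawn_part)


lawn_parts = nonempty_subsets(strip_to_subevent)
# A homomorphism preserves sums: h(x + y) is the sum of h(x) and h(y).
preserves_sums = all(h(x | y) == h(x) | h(y) for x in lawn_parts for y in lawn_parts)
print(preserves_sums)  # True
```

In this toy model the mass predicate comes out as cumulative and not quantized, the measure predicate as quantized and not cumulative, and the mapping from lawn parts to event parts preserves the part-whole structure, which is what allows the definite extent of the lawn to delimit the event.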
And similarly, in to eat a sandwich (consumed object), to build a house (effected object) and to destroy a city (destroyed object), for example, the measurable property is a decreasing or increasing quantity of the object that delimits the event over time. With verbs that take 'event' objects, such as to play a sonata, it is the temporal linear dimension inherent in the object of performance, as it is realized through performance over time. On the other hand, to stir the soup does not denote an event that evolves in lockstep with the changes that the soup undergoes (at least under the most usual reading). That is, it is not possible to correlate a given subportion of the soup with the part of the time interval during which this subportion of the soup was stirred in the same way in which a part of a house, for example, can be correlated with the time interval during which the building of that part of the house took place.

Third, Krifka assumes that the laws governing the influence of the reference type of the NP on the complex verbal predicate "can be stated most easily relative to thematic relations" (Krifka 1987: 12). This is motivated, according to him, by the observation that this influence often depends on the thematic relation the NP bears to the verbal predicate. Dowty (1988; 1991) coins the term "Incremental Theme" for this thematic role. This new thematic role serves as one of the contributing properties of the Patient Proto-Role in Dowty's Proto-Role system. Such verbs as to drink and to mow are then said to entail a "Theme-to-event homomorphism" (cf. Dowty 1991: 567). Given the above assumptions, Krifka arrives at the following generalization: A quantized Incremental Theme NP yields a quantized (or telic) verbal expression, while a cumulative one yields a cumulative (or atelic) verbal expression.

Krifka (1989: 186-189; 1992: 49-51) extends the same apparatus to the description of Slavic languages like Czech. Here, it is the verbal predicate operator that influences the interpretation of a nominal predicate (cf. (1) and (2), for example). Given that the homomorphic mapping works in both directions, namely from objects into events and from events into objects, such an extension is well-motivated and provides further support for Krifka's lattice-theoretic analysis. Given that his overall approach is strictly compositional, while the Czech data are not, Krifka has to make two further assumptions: First, there is a syntactic rule "NP → N" that allows two different semantic interpretations, cumulative and quantized.
According to him, this is motivated by the fact that in Czech the NP víno, for example, can mean either "wine" or "the wine". In the definite reading, víno is quantized, while in the indefinite reading, it is cumulative. In other words, NPs in Czech are ambiguous. Second, perfective operators can be applied only to quantized verbal predicates, and imperfective operators only to cumulative ones.
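How Krifka's generalization and his two additional assumptions for Czech fit together can be sketched in a few lines of code. This is only a schematic rendering of the account as summarized above, not Krifka's own formalization: all function names are hypothetical, English glosses stand in for the Czech NP, and reference types are reduced to two labels.

```python
QUANTIZED, CUMULATIVE = "quantized", "cumulative"


def np_readings(noun):
    """Assumption 1: a bare NP is ambiguous between a definite (quantized)
    and an indefinite (cumulative) reading."""
    return [(f"the {noun}", QUANTIZED), (noun, CUMULATIVE)]


def verbal_predicate_type(theme_type):
    """Generalization: the Incremental Theme passes its reference type
    on to the complex verbal predicate."""
    return theme_type


def compatible(aspect, predicate_type):
    """Assumption 2: perfective requires a quantized verbal predicate,
    imperfective a cumulative one."""
    required = QUANTIZED if aspect == "perfective" else CUMULATIVE
    return predicate_type == required


def interpret(aspect, noun):
    """Keep only the NP readings whose resulting predicate type fits the aspect."""
    return [gloss
            for gloss, ref_type in np_readings(noun)
            if compatible(aspect, verbal_predicate_type(ref_type))]


print(interpret("perfective", "wine"))    # ['the wine']
print(interpret("imperfective", "wine"))  # ['wine']
```

On this rendering the aspect marker acts as a filter on an independently ambiguous NP, which is the property of the account that makes it strictly compositional.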

(8) a. Jan pil^I víno.
       John drank-SG-MASC wine-SG-ACC
       'John was drinking wine.'
    b. Jan vypil^P víno.
       John PREF-drank-SG-MASC wine-SG-ACC
       'John drank up (all) the wine.'

On this approach, the examples in (8) can be described in the following way: perfective/imperfective aspect has scope over the complex verbal predicate and it forces its quantized/cumulative interpretation. The quantized/cumulative complex verbal predicate, in turn, forces a quantized/cumulative interpretation of the Incremental Theme NP. "If we assume the normal transfer of properties for the object role of verbs like eat and drink [i.e., the Theme-to-event homomorphism, HF], then we see that only with a quantized object the complex verbal predicate will be quantized as well" (Krifka 1992: 50). Krifka also assumes that "the perfective aspect is only compatible with the definite interpretation of the object". Hence, in such sentences as (8b) this leads to "the definite interpretation of the object, as this is the only quantized interpretation ..." (Krifka 1992: 50).

Krifka's introduction of the lattice-theoretic structure and homomorphism is an important theoretical innovation and an improvement on the previous accounts (in particular, Verkuyl 1972 and Dowty 1972; 1979), which tried to explain the influence of NPs on the Aktionsart of VPs or sentences. A similar proposal based on the lattice-theoretic approach can also be found in Hinrichs (1985). Krifka's approach differs from Hinrichs' (1985) in so far as it involves the postulation of a new thematic role, and therefore it has important consequences for the theory of thematic roles in general. Krifka's account has a number of appealing features. It is based on a semantics for NPs that is non-quantificational. The notions of "cumulativity" and "quantization", on which his proposal is grounded, are cross-categorial notions that cut across the syntactic distinction between nouns and verbs as well as across the ontological distinction between objects and events.
The modelling of objects and events in terms of their "part structure" in a semi-lattice is based on the dimension of quantity inherent in the conceptualization of objects and events, as described in many cognitive approaches (cf. Talmy 1986; Langacker 1987a and 1987b; Jackendoff 1990, among others). In particular, it pertains to its subcategories number, state of boundedness or state of dividedness (or quantity's internal segmentation as "discrete" or "continuous"; cf. Talmy 1986: 15). The cognitive approaches emphasize the psychological motivation underlying such parallels. The lattice-theoretic analysis, on the other hand, can be seen as complementary to the cognitive approaches in so far as it provides an explicit representation of the parallels between objects and events. The lattice-theoretic structure and homomorphism allow Krifka to provide an explicit and precise account of the dependencies between nominal and verbal predicates. And finally, the idea that the Incremental Theme role plays a crucial role in the interaction between verb morphology and nominal arguments in Slavic languages like Czech is well-founded. It is well-known that aspect interacts in a systematic way with the Aktionsart distinction "telic/atelic", a distinction which is partly motivated by the properties of the Incremental Theme role.

However, two main objections can be raised against Krifka's account: the first objection is directed against the attempt to attribute the interaction between verbal and nominal predicates solely to one particular thematic role, namely to the Incremental Theme role. The second objection regards his strictly compositional description of the Czech data.

5. My proposal

5.1. Incremental Theme and Incremental Schema

The first objection regards Krifka's (and Dowty's) proposal that the laws governing the influence of the reference type of the NP on the complex verbal predicate are to be formulated relative to thematic roles. It seems that this approach would limit us to a rather narrow range of data, because many individual verbs often cannot be classified once and for all as denoting a homomorphism. Therefore, we cannot decide on the level of the lexicon which verbs will take an Incremental Theme role. We seem to be faced here with a problem similar to that of Vendler's attempts to classify surface verbs as activities and accomplishments (see Dowty's (1979: 60ff.) criticism of Vendler). If the influence of the reference type of the NP on the complex verbal predicate were to be attributed to an Incremental Theme role, as Krifka and Dowty suggest, how could we account for the fact that the decision whether a denoted event is understood as evolving in an incremental way, and whether it may also be regarded as telic, often depends on other factors? Various adjuncts (The truck rumbled vs. The truck rumbled into the garage) and additional arguments (The critics laughed vs. The critics laughed the show out of town), for example, may play an important role in this decision.

Dowty is well aware of such examples and he states that "THE MEANING OF A TELIC PREDICATE IS A HOMOMORPHISM FROM ITS (STRUCTURED) THEME ARGUMENT DENOTATIONS INTO A (STRUCTURED) DOMAIN OF EVENTS, modulo its other arguments" (Dowty 1991: 567). However, it is not clear how the influence of other arguments and of adjuncts should be handled. Such examples as The truck rumbled into the garage pose yet another problem: "the 'argument' with respect to which these telic predicates are homomorphisms on this hypothesis, namely the Path argument, is (...) not a syntactically realized argument at all; ..." (Dowty 1991: 569). The prepositional phrase into the garage refers to the end-point of the Path. And similarly in John was becoming an architect but was interrupted before he could finish his degree, "the 'Path', if we want to call it that, is even more removed from syntactic expression - the stages that one goes through to reach the status of architect were partly but not exhaustively achieved,..." (Dowty 1991: 569).

One way in which we could account for the above examples would be to postulate two senses for each predicate, or two different verbs, connected by lexical rules, whereby only one of them would denote a homomorphism.
However, such an account would force us to postulate quite implausible senses of verbs. For example, we would have to postulate a special sense of rumble in The truck rumbled into the garage, "to move into Y by rumbling".

The account of the influence of NPs on the Aktionsart interpretation of sentences is also complicated by the fact that the decision whether a given sentence denotes an event that can be viewed as proceeding in an incremental way may also depend on the cognitive schemas associated with particular form-meaning linguistic pairings. This can be illustrated with such examples as John saw twenty-five elephants and The doctor examined the patient. Such sentences can be construed as describing events that involve some established procedure that delimits them. Only under such an "incremental" construal are the above sentences telic; otherwise, they are atelic. (These examples were brought to my attention by Charles J. Fillmore.) Given the above observations, it is obvious that it is not a grammatical fact that we can eat a sandwich only once, and "typically" do it in an incremental way.

I suggest that we maintain the notion "Incremental Theme" for those cases in which a simple NP is associated with the participant that "measures out" the event. That is, as in Krifka's and Dowty's theory, the Incremental Theme role will be linked to the DO-NPs in such expressions as to build a house. However, at the same time, we need to acknowledge that the homomorphism cannot simply be attributed to the presence of a particular thematic role; it cannot simply be viewed as a projection of the lexical semantic properties of individual verbs, but rather often has other sources. Assuming that MEANINGS ARE RELATIVIZED TO SCENES (Fillmore 1977a: 59, capitals his), I propose that the homomorphism between objects and events characterizes a fragment of conceptual structure, an "Incremental Schema". An Incremental Schema is one of the interpretive schemas or frames (in the sense of Fillmore) associated with sentences. And it is against this schema that certain Aktionsart and aspect properties of sentences are interpreted.
Thus, the status of the Incremental Schema in the conceptual representation of sentences is comparable to that of a "scalar" model with respect to which, for example, a let alone sentence is interpreted (cf. Fillmore, Kay, O'Connor 1988). My second objection has to do with the compositionality account and the directionality that is implicit in such notions as "Theme-toevent" homomorphism (Dowty 1991: 567) or "transfer of reference mode" (Krifka 1986; 1989). The "Theme-to-event" homomorphism is motivated by the influence of NPs on the telic properties of complex verbal predicates in English. Assuming this "transfer of reference mode", Krifka can provide a straightforward compositional account


for the constitution of Aktionsart in English sentences. However, his attempt to give a similar compositional account for the influence of verbal predicate operators on the nominal predicates in Czech raises a number of serious problems. I will briefly discuss two. First, in order to uphold the compositional account also for the Czech data, Krifka assumes that NPs with mass and bare plural noun heads in Czech are ambiguous between a definite/quantized and an indefinite/cumulative reading. "[T]he unwelcome reading [of víno as in (8b), for example, HF] is excluded by general principles, just as in rob the bank the unwelcome readings of bank are excluded by the lexical meaning of rob" (Krifka 1992: 50). In short, the quantized/cumulative reading of an NP is selected or enforced by the quantized/cumulative verbal predicate. At the same time, the Incremental Theme NP "transfers" its quantized/cumulative referential properties into the complex verbal predicate, and this "transfer", in turn, motivates the quantized/cumulative interpretation of the verbal predicate. This seems to involve a redundancy and is not quite what actually happens. Second, Krifka's compositional account does not distinguish between Aktionsart and aspect. He assumes that perfective operators can be applied only to quantized verbal predicates, and imperfective operators only to cumulative ones. However, a closer look at the Slavic data shows that this is not the case. For example, there is a class of perfective verbs derived from cumulative (atelic) stative verbs with the prefixes pro- and po-, as in Czech postát and Russian postojat' 'to stand for a while' or prostát, prostojat' 'to stand through (some period)', that are best classified as atelic (cf. also Kučera 1983: 174). The existence of such verbs shows that a perfective operator can be applied to cumulative (atelic) verbal predicates and that we need to integrate a class of perfective atelic verbs into our verbal system.
This requires that we distinguish between the 'bounded' temporal profile associated with the semantics of perfective aspect, on the one hand, and the entailment of a definite change of state inherent in the lexical semantics of 'telic' verbal expressions, on the other hand. If imperfective operators required cumulative (atelic) verbal predicates in their scope, how would we account for such examples as (9)?

(9)  Pilᴵ sklenici vína.
     drank-SG-MASC glass-SG-ACC wine-SG-GEN
     'John was drinking a glass of wine.'

(9) combines a quantized Incremental Theme NP a glass of wine and an imperfective verb. Due to the partitive reading involved in imperfectivity, (9) asserts that the Agent was in the process of drinking some subpart of wine that is contained in the portion denoted by a glass of wine. Krifka's approach predicts that the imperfective aspect here forces a cumulative interpretation of the Incremental Theme NP a glass of wine. That is, on such an account both (9) and (8a) Jan pilᴵ víno are cumulative (atelic). However, it seems desirable to distinguish between the two types of imperfective sentence exemplified by (9) and (8a). Although they are alike on the level of aspect, because they are both imperfective and denote unbounded events, they do differ on the level of Aktionsart. (9) is telic, as the quantized NP a glass of wine delimits the denoted event by virtue of being a quantized Incremental Theme NP. (8a), on the other hand, is atelic, as the cumulative NP wine does not delimit the denoted event. In other words, we need to represent this difference adequately. In order to do so, we need to distinguish between the entailment of a definite change of state inherent in the lexical semantics of "telic" verbal expressions, on the one hand, and the "unbounded" temporal profile associated with the semantics of imperfective aspect, on the other hand. In short, we need to distinguish between Aktionsart and aspect.

5.2. Construction Grammar approach

The translations of Czech examples into English may often make it appear as though Czech mass NPs like víno, for example, were ambiguous between "wine" and "the wine". However, the contextually determined interpretations of NPs in languages that have no overt article system, or do not use any systematic indication of (in)definiteness, are not a matter of ambiguity. It is more appropriate to consider NPs in such languages as being underspecified with respect to


(in)definiteness. I will suggest that the (in)definiteness effect on NPs is a pragmatically determined by-product of a bounded and an unbounded interpretation assigned to certain NPs in perfective and imperfective sentences, respectively. I will argue that verbal predicate operators that code aspect, among other things, take scope over whole propositions and determine the bounded/unbounded interpretation of certain NPs in those sentences that evoke an Incremental Schema. To be more precise, verbal predicate operators direct their semantic effects at the NP that is linked to the Incremental Theme role, that is, that corresponds to the participants in the Incremental Schema whose extent is intrinsically tied to the individuation and temporal structure of the event itself. The above observations presuppose that the semantic property of NPs that is determined by the verbal aspect be characterized in terms of the "bounded/unbounded" distinction that characterizes aspect, rather than in terms of the "cumulative/quantized" distinction, as Krifka (1989; 1992) suggests. I suggest that the "cumulative/quantized" distinction be reserved for the inherent semantic properties of NPs as well as for the properties of verbs, VPs and of sentences that are relevant for Aktionsart (telicity). The distinctions "quantized/cumulative" and "bounded/unbounded" belong to a finite set of primitives that characterizes parts of conceptual structure. The "cumulative/quantized" distinction is here understood in the same sense as it is defined in Link (1983) and Krifka (1989; 1992). The "bounded/unbounded" distinction is orthogonal to the distinction between individuals and events, just like the "quantized/cumulative" distinction. Saying that a given expression is "bounded", in addition to saying that it is "quantized", means that we view the entity denoted by it not only as an individuated entity, but also that we see it in its entirety, that is, in this sense, we focus on its boundaries. 
Many objects have a shape with a canonical orientation and a starting and an end point. Similarly, telic expressions denote events with a canonical orientation: they are conceived of as progressing towards a set terminal point, or some other limit. Hence, if we view such telic events as bounded, we typically focus on their end boundary, which coincides with their inherent limit. In this sense, we regard them as completed. On


the other hand, saying that a given expression is "unbounded" simply means that we abstract away from the boundaries of the entity denoted by it and instead consider some of its subpart(s). In the domain of objects it holds that if an object is bounded, it is quantized as well. However, in the domain of events, this does not apply. Quantized or telic events involve directionality, an inherent orientation toward a definite end state. If a perfective operator imposes bounding over an atelic event, it does not add telic structure. The notion of simple bounding with no directionality in the verbal domain parallels bounding of mass nouns, for example, in the nominal domain: Just as a bathtub can be full of water, a jar full of beans, so can a sustained stretch of an activity, such as blushing, be thought of as being simply associated with a certain bounded interval of time. We may assume that obligatory arguments of the main verb are accessible to the semantic effects of verbal predicate operators (cf. Bach in Partee, Bach and Kratzer 1987: 22). In Czech, and other Slavic languages, these are typically DO-NPs, and also subject-NPs, as is shown in (10):

(10) Jablka se porozkutálelaᴾ po podlaze.
     apples-PL-NOM REFL PREF-PREF-rolled-PL-NEU on floor
     '(All) the apples rolled apart all over the floor.'

The derived perfective verb po-roz-kutáletᴾ se requires a plural subject-NP and its meaning has a distributive component: it can be characterized as "to move by rolling, one by one, into different directions (and as a result be at different locations)". Given that the verb and its arguments are in the relation of predication and given that predication is necessarily a local relationship, we can expect that a verbal predicate operator will have strictly local semantic effects on NPs. Moreover, since such semantic effects are contingent on the presence of the Incremental Schema, which in turn is evoked by whole clauses, I suggest that they be captured at a level of grammatical constructions that underlie single clauses.


The notion of 'grammatical construction' is one of the central notions of Construction Grammar, an emergent framework that is being developed in Berkeley (cf. Fillmore 1988; Kay and O'Connor 1988; Fillmore and Kay 1992; Lambrecht 1986, 1987, 1988, 1990). "Constructions (...) are much like the nuclear family (mother plus daughters) subtrees admitted by phrase structure rules, ..." (Fillmore, Kay and O'Connor 1988: 501) except that "constructions may specify, not only syntactic, but also lexical, semantic, and pragmatic information; ..." (Fillmore, Kay and O'Connor 1988: 501). Construction Grammar is a monostratal, non-transformational and unification-based framework (cf. Shieber 1986). As in Head-Driven Phrase Structure Grammar (Sag and Pollard 1993), the central explanatory mechanism is 'structure sharing', i.e., "token identity between substructures of a given structure in accordance with lexical specifications or grammatical principles (or complex interactions between the two)" (Sag and Pollard 1993: viii). Linguistic generalizations are formulated in terms of constraints that establish the appropriate identities between partial structures. Constraints imposed by the language require that two different sign tokens introduce instances of the same referential parameter, whether those constraints arise from lexical entries, grammar rules, pragmatic rules, or principles of universal grammar. Unification-based theories are purely declarative in the sense that they characterize 'what' constraints are operative in language use independently of 'what order' the constraints could be applied in. Instead of assuming that NPs in Czech are ambiguous and "transfer" their reference properties into the complex verbal predicate, that is, instead of an implicitly directional and procedural approach, my description is declarative. Both the verbal predicate operators and head nouns of Incremental Theme NPs impose constraints on the properties of the resulting NP.
Verbal predicate operators and Incremental Theme NPs introduce instances of the same parameters: "boundedness" and "cumulativity". These parameters encode information coming from three sources: Aktionsart, characterized in terms of the "quantized/cumulative" distinction, aspect, characterized in terms of the "bounded/unbounded" distinction, and Incremental Theme NP which is characterized in terms of both these distinctions, as it interacts, at the same time, with both Aktionsart and aspect. Constraints


imposed by language require that information coming from these three sources be compatible. In general, the syntactic and semantic constraints of the constructions are matched against the requirements of their constituting lexical items. Particular grammatical constructions are associated with the boundedness aspectual properties of verbal predicates that function as their heads. Whenever these disagree with inherent properties of the nouns, the meaning associated with the construction wins, and the lexical meaning of the noun yields (cf. also Fillmore 1989: 48). If a perfective verb co-occurs in the same construction with an undetermined Incremental Theme NP that is headed by a mass or plural noun, this may be implemented in the following way: It is assumed that NPs have different feature specifications for the head noun and the whole phrase. The head noun will be specified with the feature attribute "cumulative", which characterizes its inherent lexical semantic properties, while the whole NP in terms of the feature attributes "cumulative" and "bounded". A mass or plural noun head will then be specified with the feature specification "[cumulative +]". If the whole NP functions as an Incremental Theme NP of a perfective verb, it "acquires" a "[bounded +]" status from it via unification. An object that is viewed in its entirety, with respect to its boundaries, must be quantized (that is, "[cumulative -]"), as well. This is captured by the following feature co-occurrence restriction: "[bounded +] > [cumulative -]". In imperfective constructions, the Incremental Theme NP construction "acquires" via unification the "[bounded -]" status from the imperfective verb. This apparatus yields the right results, namely that perfective sentences with cumulative Incremental Theme NPs are bounded (perfective) and quantized (telic), while imperfective sentences with cumulative Incremental Theme NPs are unbounded (imperfective) and cumulative (atelic). 
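The feature specifications just described can be made concrete in a minimal sketch. It is an illustration only, not an implementation belonging to Construction Grammar or to any existing unification system: the names `unify`, `apply_fcr` and `incremental_theme_np` are hypothetical, feature values are simplified to booleans, and the step in which the whole NP takes over the head noun's "cumulative" value when the construction imposes no value of its own is an assumption made here for the sake of the example.

```python
from typing import Dict, Optional

Features = Dict[str, bool]


def unify(a: Features, b: Features) -> Optional[Features]:
    """Merge two feature sets; return None if they assign clashing values."""
    merged = dict(a)
    for attribute, value in b.items():
        if merged.get(attribute, value) != value:
            return None
        merged[attribute] = value
    return merged


def apply_fcr(phrase: Features) -> Optional[Features]:
    """Feature co-occurrence restriction: [bounded +] requires [cumulative -]."""
    if phrase.get("bounded"):
        return unify(phrase, {"cumulative": False})
    return phrase


def incremental_theme_np(head_cumulative: bool, perfective: bool) -> Dict[str, Features]:
    """Build head and phrase features of an Incremental Theme NP (hypothetical helper)."""
    # The head noun keeps its inherent lexical value throughout.
    head = {"cumulative": head_cumulative}
    # The whole NP acquires its "bounded" value from the aspect of the verb.
    phrase = apply_fcr({"bounded": perfective})
    assert phrase is not None
    # Assumption: with no value imposed by the construction, the phrase
    # takes over the head noun's "cumulative" value.
    phrase.setdefault("cumulative", head_cumulative)
    return {"head": head, "phrase": phrase}


# A mass noun such as "wine" as Incremental Theme of a perfective verb.
print(incremental_theme_np(head_cumulative=True, perfective=True)["phrase"])
# {'bounded': True, 'cumulative': False}
```

Under these assumptions the head noun remains "[cumulative +]" while the whole NP comes out as "[bounded +]" and "[cumulative -]", which mirrors the idea that the meaning associated with the construction prevails over the lexical meaning of the noun.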
Imperfective sentences with quantized Incremental Theme NPs are quantized (telic) and unbounded (imperfective). Hence, this correctly predicts that only DO-NPs in such pairs of Czech sentences as (1) and (2) will have different interpretations with respect to the "bounded/unbounded" distinction, while DO-NPs in such sentences as (4) and (5) will not. The reason, of course, is that sentences headed by such verbs as to see and to stir denote events that do not


evoke an Incremental Schema, at least not under the most usual interpretation. Such a unification-based account has the following advantages: it allows us (i) to distinguish between the interaction of nominal and verbal predicates on the level of aspect and on the level of Aktionsart as well as to define the relation between the two; (ii) to provide an intuitively more plausible account of the data from such Slavic languages as Czech; (iii) to compare the different morphological and syntactic strategies for encoding aspect in typologically distinct languages in terms of a difference in the grammaticalization of the "bounded/unbounded" distinction (see Filip 1992 for the comparison of the interaction between the nominal and verbal predicates in Czech and Finnish). The choice between expressing the "bounded/unbounded" distinction within NPs (by D-quantifiers, for example) and on the level of verbs, VPs and sentences (possibly by various A-quantifiers) is not imposed on languages by the facts in the real world, but rather it is a matter of language-specific schematizations, and of cognitive choices inherent in such schematizations. Further investigation of the structure and interpretation of such linguistic means can give us indirect access to the semantic differences underlying the "verb-noun" distinction and its relation to the ontology of individuals and events. The account of the influence of a verbal predicate operator on the meaning of a nominal predicate is complicated by the fact that the meaning of a derived verb is often different from that of its stem and, in many cases, does not arise compositionally from the meaning of a verbal predicate operator and the stem. Often, it is partly or fully lexicalized.
Such data from Slavic languages, among others, prompted Spencer (1991) to observe that aspect coding morphology "provides an example of what appears at first sight to be inflectional morphology behaving like derivational morphology" (Spencer 1991: 197). Take, for example, the process of prefixation in Czech. While it holds without exceptions that adding a prefix to an imperfective verb yields a perfective verb, apart from this regular change in aspect, other meaning changes that are induced by prefixation are difficult to predict and have so far escaped any truly systematic and revealing description. It is difficult to predict for a


given prefix what meaning(s) it will assume with different verbs or classes of verbs. For instance, the prefix u-, applied to the imperfective verb pítᴵ, yields the perfective verb upítᴾ, as in (11):

(11) Upilᴾ kávu (ze šálku).
     PREF-drank-SG-MASC coffee-SG-ACC (from cup)
     'He took a sip of coffee (from the cup).'

The prefix u- can be interpreted as modifying only the verb with the meaning "to take a sip". At the same time, it can be viewed as indicating that the DO-NP denotes a small portion of the substance denoted by its head noun. However, u- does not have this meaning in upléstᴾ 'to knit', 'to finish knitting'. Here, the prefix u- simply contributes the completive, all-inclusive interpretation. And as has been observed, the prefix u- in uvidětᴾ (see example (4b)) concerns the aspect of the verb, but not the semantics of nominal arguments. Given that there are about twenty prefixes that serve to derive perfective verbs from simple imperfective verbs in Czech⁵, apart from other verbal predicate operators, the task of describing the impact of verbal predicate operators on the interpretation of nominal predicates may seem at first sight quite daunting. Although I do not attempt to solve this problem of Slavic morpho-syntax and do not deal with the details of verbal morphology, my constructional approach represents an important step towards a better understanding of the circumstances under which verbal predicate operators influence the interpretation of NPs.

6. Implications for referential specificity and determination

The influence of verbal predicate operators on NPs touches on a number of issues that have to do with referential specificity, topicalization and explicit quantificational operators of various kinds. They provide arguments in support of my claim that it is the Incremental Theme NP that is the target of the semantic effects of verbal predicate operators.


6.1. Boundedness and referential specificity

As has been observed above, the aspectual "bounded/unbounded" distinction is directly related to the functions that are ascribed to articles within NPs. I propose that the difference in referential specificity that the Incremental Theme DO-NPs manifest in such sentences as (1a) - (1b) and (2a) - (2b) follows as a pragmatically determined by-product of a bounded and an unbounded interpretation assigned to them in perfective and imperfective sentences, respectively. By default, the bounded interpretation of Incremental Theme NPs in perfective sentences takes on a holistic interpretation. (This default interpretation can be overridden by the lexical semantics of a particular perfectivizing verbal predicate operator; see examples with the prefix na- further below.) For example, (1b) describes an event that ended when the Agent finished drinking all the available wine. In general, an entailment that a given object or a set of objects was completely subjected to an event presupposes that it is bounded. If an NP is cumulative, the only way in which the boundaries of its referent can be fixed is to anchor it to an entity or a set of entities easily identifiable in the discourse context. This explains why the speaker who utters such a perfective sentence as (1b), for example, presupposes that the hearer can uniquely identify the entity that is spoken of: a 'specific' or 'known' portion of wine. A supporting bit of evidence for this claim can be provided by the data from Bulgarian, which combines the Slavic aspectual system with a partially realized article system. Here, the use of an enclitic definite article -to in the Incremental Theme NP is obligatory in perfective sentences. This is illustrated by (12):

(12) Toj izpiᴾ kafe / kafeto.
     he-NOM PREF-drank-SG-MASC coffee-ACC / coffee-DF-ACC
     'He drank up (all) the coffee.'

It must be emphasized that the semantic effect of verbal predicate operators on nominal predicates in Slavic languages like Czech cannot be simply equated with the functions of definite and indefinite articles.


As Jackendoff points out, the definite article itself does not contribute the bounded reading of an NP: it "contributes only 'contextually identifiable' and the choice of boundedness depends on other constraints; ..." (1990: 15). This claim can be supported by examples like The water kept spurting out of the broken hose, in which the water is unbounded and is "a contextually identifiable medium" (1990: 15). In Czech, the bounded reading of an inherently unbounded NP depends on the Incremental Themehood of an NP and on aspect. If an Incremental Theme NP in perfective sentences is assigned a bounded and a holistic reading, its referentially specific reading can be motivated by general pragmatic principles outlined above. The existence of a subpart of an entity does not presuppose the existence of a whole bounded entity; rather, it merely allows for the possible existence of a (contextually) relevant additional quantity or continuation. From this it follows that cumulative Incremental Theme NPs in imperfective sentences tend to have not only an "unbounded" interpretation, but also a referentially 'unspecified' interpretation. For example, the Incremental Theme NP wine in (1a) is assigned an "unbounded" interpretation, and we need not identify the boundaries of the entity of which it is a subportion, that is, we need not anchor it to any particular whole portion of wine in the domain of discourse. Moreover, inferences to specific bounded subportions in such sentences are in general not valid, because they would provide more information than is linguistically specified. The correlation between a bounded interpretation and referential specificity, on the one hand, and an unbounded interpretation and an unspecified object interpretation, on the other hand, is restricted to those cases in which the Incremental Theme NP is undetermined and headed by a mass or plural noun and in which it functions as a DO.
It does not apply if the Incremental Theme NP contains a determiner quantifier or some other quantifying or measure expression, if it functions as a subject, or if a clause contains A-quantifiers, including those incorporated in verbal predicate operators. First, subjects often function as topics. And topicalized constituents that occur in a sentence-initial position are often highly individuated and definite, regardless of the verb aspect. Consider the following example:

(13) a. Vlaky projíždělyᴵ hranici.
        trains-PL-NOM PREF-passed-IPF-PL border-SG-INSTR
        'The trains were crossing the border.'
        ('There were (some) trains crossing the border.')
     b. Vlaky projelyᴾ hranici.
        trains-PL-NOM PREF-passed-PL border-SG-INSTR
        '(All) the trains crossed the border.'

Example (13a) is most likely to mean "The trains were crossing the border". For "There were (some) trains crossing the border", a different word order would be preferable, namely Hranici projížděly vlaky. If we change the word order in (13b) to Hranici projely vlaky, the sentence will mean "Some/The trains crossed the border". If we put the NP vlaky 'trains' in the final position, it is likely to express new information, in which case the meaning 'some trains' is available. Second, if Incremental Theme NPs are quantized, they may have a referentially specific or unspecified interpretation, independently of the verb aspect. The reason is that the bounded and unbounded interpretation is assigned with respect to the prototypical extent of their denotata or with respect to the quantity indicated by the quantifying or measure expression within the Incremental Theme NP. In particular, in perfective sentences the assignment of a bounded reading to a quantized Incremental Theme NP is not contingent on its contextual anchoring to a specific entity in the domain of discourse. Therefore a quantized Incremental Theme NP need not have a referentially specific interpretation. In general, NPs that contain determiner quantifiers or measure expressions have a different discourse function than referring NPs. While a proposition with a referring NP picks out a specific object in the domain of discourse, a proposition that contains a quantified or a measure NP describes an object or an individual. Quantified and measure NPs are typically low in individuation. For example, we do not usually talk about a specific yard, a pint of beer, a cup of coffee ("the yard", "the pint of beer", "the cup of coffee"); we count such entities, but we do not take an interest in them individually as discrete particular participants in an event.


Third, some verbal predicate operators incorporate notions that are related to quantification and measure expressions. If they modify a verb that takes an Incremental Theme argument, they neutralize the differences in referential specificity of the Incremental Theme NP in perfective and imperfective sentences in much the same way as determiner quantifiers and measure expressions do. For example, the prefix na- contributes the notion of gradual amassing or accumulation to the meaning of the verb it modifies. Its impact on the Incremental Theme NP is roughly comparable to the unstressed "some" in English. This can be shown by the fact that Incremental Theme NPs of na-verbs can only be modified with measure expressions and determiner quantifiers that do not require that the noun in their scope refer to a quantity consisting of a number of discrete and countable entities. This is illustrated by (14):

(14) Nakoupilᴾ hodně / koš / *?pět jablek.
     PREF-bought-SG a-lot-of / basket-SG-ACC / *?five apples-PL-GEN
     'He bought a lot of / a basket of / five apples.'

Such examples show that Incremental Theme NPs that function as arguments of na-verbs are treated as constituting an undifferentiated whole, and not as composed of separate individuals. The notion of amassing, accumulation, or a vague measure expressed by na-verbs is clearly related to the fact that Incremental Theme NPs that function as their arguments are 'referentially unspecified' and are low on an individuation scale. For example, if a question such as 'Where did you buy these postcards?' introduces 'postcards' into the domain of discourse, we cannot answer with the verb nakoupitᴾ 'to buy', because it requires a referentially unspecified object. Instead, the appropriate answer should contain the perfective verb koupitᴾ 'to buy', which can be combined with a referentially specific DO-NP:

(15) a. *Nakoupilᴾ jsem je v kiosku.
        *PREF-bought-SG-MASC am-AUX-1SG them-PL-ACC in kiosk
     b. Koupilᴾ jsem je v kiosku.
        bought-SG-MASC am-AUX-1SG them-PL-ACC in kiosk
        'I bought them in the kiosk.'⁶

Other perfective na-verbs are, for instance: natrhatᴾ 'to pick', nabratᴾ vodu 'to draw (in) some water', nachytatᴾ ryby 'to catch some fish', nasbíratᴾ jahody 'to pick some strawberries', naspořitᴾ peníze 'to save some money'.

6.2. Boundedness and determination

A compelling argument in support of my claim that it is the Incremental Theme NP that is accessible to the semantic effects of verbal predicate operators can be provided by the restrictions on the occurrence of determiner quantifiers and quantifying and measure expressions in Incremental Theme NPs that occur in imperfective sentences. To illustrate this point, consider first the following examples:

(16) a. Pilᴵ (*)všechnu kávu.
        drank-SG-MASC (*)all-SG-ACC coffee-SG-ACC
     b. Pilᴵ (*)dvě kávy / (*)hodně kávy.
        drank-SG-MASC (*)two coffee-SG-GEN / (*)a-lot-of coffee-SG-GEN

In imperfective sentences denoting simple events, such as (16a) and (16b), Incremental Theme NPs cannot be quantified with the universal quantifiers "all" and "whole". They cannot be enumerated by count cardinal numerals and they usually do not occur with most other quantifiers and with various measure expressions. "(*)" in (16a) - (16b) indicates that a clash between an imperfective aspectual verbal predicate operator and a quantified Incremental Theme NP in its scope can be resolved if an iterative or a habitual interpretation can be assigned to the whole predication. (16b), for example, would be acceptable in the context of a frequency adverbial phrase: "Every day, he drank two coffees, a lot of coffee". In this case, the iterative operator takes scope over both the aspectual verbal predicate operators and the quantified Incremental Theme NP. In imperfective sentences that contain a quantified Incremental Theme NP we may enforce a "simultaneous-events" reading by using the temporal adverbial najednou 'at the same time', as in (17):

(17) Pletlaᴵ deset svetrů najednou.
     knitted-SG-FEM ten sweaters-PL-GEN at-the-same-time
     'She was knitting ten sweaters at the same time.'

(17) entails that each individual sweater in the set denoted by the numerically-specified NP ten sweaters was partially subjected to the knitting event, and thus was gradually coming into existence. We may conclude that imperfective sentences with Incremental Theme NPs that contain a determiner quantifier tend to lose the ability to denote simple single events. They denote (i) iterative, habitual events, or (ii) an event that is directed at a number of individuals at the same time. If the context excludes these two interpretations, the use of a quantified or a numerically-specified Incremental Theme NP is often ungrammatical or, at least, odd. So (16b), for example, would be odd in the following context: "Last night, he drank two coffees". In such a context, quantified Incremental Themes strongly favor the environment of perfective aspect: Včera večer vypilᴾ dvě kávy 'Last night, he drank (up) / he had two coffees'. On the other hand, in imperfective sentences there are no restrictions on the modification of the Incremental Theme:

(18) a. Děti vidělyᴵ všechny chřestýše.
        children-PL-NOM saw-PL-NEU all-PL-ACC rattle-snakes-PL-ACC
        'The children saw all the rattle-snakes.'
     b. Děti vidělyᴵ hodně / deset chřestýšů.
        children-PL-NOM saw-PL-NEU a-lot-of / ten rattle-snakes-PL-GEN
        'The children saw a lot of / ten rattle-snakes.'

In this case, the imperfective sentence can denote a single seeing event that is directed at a group of individuals. The fact that the quantified DO-NPs in (18) are compatible with imperfective aspect, while in (16) they are not, can be explained if we assume that the DO-NPs in (18) are not linked to the Incremental Theme role. The seemingly complicated way in which quantified NPs interact with aspect has puzzled linguists working on Slavic languages (cf. Wierzbicka 1967; Rassudova 1977; Merrill 1985; among others). Slavic linguistics has so far failed to provide an adequate description for this interaction. In this section, I have suggested that we can easily describe it if we recognize that the Incremental Theme provides the missing semantic link in this puzzle. The restrictions on the occurrence of determiner quantifiers and quantifying and measure expressions that modify Incremental Themes can be explained if we assume that verbal predicate operators have semantic effects on Incremental Theme arguments that must be compatible with the quantifying and measure expressions that modify them.

7. Conclusion

This paper is a contribution to the study of non-NP means that are operative in constraining the interpretation of NPs in languages that have no systematic means to mark (in)definiteness. The proposed analysis has the advantage that all the parameters on which it is based are independently motivated and needed elsewhere in the grammar. In this paper, I only examine Czech. However, the phenomena described are not restricted to this language. They can be clearly observed in other Slavic languages, and also in such typologically distinct languages as Hindi and Japanese, for example, that do not have articles and that express the aspectual distinction "perfective vs. imperfective" in a systematic way by means of verbal expressions. The interaction between verbal predicate operators and nominal arguments represents an important field of study, as it promises to give us valuable insights into language-specific schematizations and semantic universals.

Notes

1. For a discussion on the fuzziness of the inflection-derivation distinction and the Slavic aspectual distinction, see Spencer (1991: 195ff.).

2. It is surprising how little attention has been paid to understanding the impact of verb morphology on the interpretation of nominal arguments in Slavic linguistics. Standard grammar handbooks that describe particular Slavic languages characterize many lexical-derivational operators that are applied to verbs in terms that are related to quantity, measure and quantification. This is in particular true for the description of prefixal semantics, as in Petr (1986: 387ff.) and in Isačenko's work (1960 and 1962: 385-418). Various studies on Slavic linguistics contain occasional hints at the interaction between verbal and nominal predicates and there are a few studies that deal with some of its aspects, for instance, in Polish (Wierzbicka 1967) and in Russian (Forsyth 1970; Merrill 1985; and Russell 1985). However, a systematic analysis is so far missing. The main reason for this gap in the coverage of data can be seen in the concentration of Slavic linguistics on the form-meaning correspondences on the level of verb morphology or on the description of aspect and Aktionsart in the discourse. Tommola (1990: 361) observes that "in the traditional aspect realm - in Slavic linguistics - an approach has gained ground that takes such notions as definiteness and specificness into consideration (Leinonen 1982; Kabakčiev 1984)".

3. Krifka (1992: 39) defines the homomorphism with the following notions:

Summativity:
$\forall R\,[\mathrm{SUM}(R) \leftrightarrow \forall e, e', x, x'\,[R(e, x) \wedge R(e', x') \rightarrow R(e \cup e', x \cup x')]]$

Uniqueness of objects:
$\forall R\,[\mathrm{UNI\text{-}O}(R) \leftrightarrow \forall e, x, x'\,[R(e, x) \wedge R(e, x') \rightarrow x = x']]$

Uniqueness of events:
$\forall R\,[\mathrm{UNI\text{-}E}(R) \leftrightarrow \forall e, e', x\,[R(e, x) \wedge R(e', x) \rightarrow e = e']]$

Mapping to objects:
$\forall R\,[\mathrm{MAP\text{-}O}(R) \leftrightarrow \forall e, e', x\,[R(e, x) \wedge e' \subseteq e \rightarrow \exists x'\,[x' \subseteq x \wedge R(e', x')]]]$

Mapping to events:
$\forall R\,[\mathrm{MAP\text{-}E}(R) \leftrightarrow \forall e, x, x'\,[R(e, x) \wedge x' \subseteq x \rightarrow \exists e'\,[e' \subseteq e \wedge R(e', x')]]]$

"$\cup$": the operation of join; "$\subseteq$": the relation of part.

4. In (11), we could also use the mass noun in the genitive case: kávy (lit.: coffee-GEN). The case difference does not have any effect on the overall meaning of the sentence.

5. Šmilauer (1968/71: 165), for example, gives the following list: 1. do-, 2. na-, 3. nad-, 4. o-, 5. ob-, 6. od-, 7. po-, 8. pod-, 9. pro-, 10. pře-, 11. před-, 12. při-, 13. roz-, 14. s- (sou-), 15. u-, 16. v-, 17. vy-, 18. vz-, 19. z-, 20. za-.

6. The following example also confirms my claim that na- verbs only take referentially non-specific arguments:

Děti natrhaly^P jablka (?ze stromu).
children-PL-NOM PREF-picked-PL apples-ACC (?from tree)
'The children picked some apples (from the tree).'

The only acceptable interpretation assigned to this sentence would require that the PP ze stromu 'from the tree' refer to a specific tree. This, however, would also require that 'apples' be referentially specific. This explains why the use of the PP in this sentence is odd.


References

Bach, Emmon
  1986a "Natural language metaphysics", in: Ruth Barcan Marcus, Georg J. W. Dorn & Paul Weingartner (eds.), Logic, methodology and philosophy of science VII. Amsterdam: North-Holland, 573-579.
  1986b "The algebra of events", Linguistics and Philosophy 9: 5-16.
Brodie, Brody & David R. Dowty
  1984 "The semantics of 'floated' quantifiers in a transformationless grammar", in: Mark Cobler, Susannah Mackaye & Michael T. Wescoat (eds.), Proceedings of the West Coast Conference on Formal Linguistics. Volume Three. Stanford: The Stanford Linguistics Association, 75-90.
Comrie, Bernard
  1976 Aspect. An introduction to the study of verbal aspect and related problems. Cambridge: Cambridge University Press.
Dahl, Östen
  1985 Tense and aspect systems. London and New York: Basil Blackwell.
Dalrymple, Mary, Sam Mchombo & Stanley Peters
  1992 "Semantic similarities and syntactic contrasts between Chichewa and English reciprocals". Ms.
Dowty, David R.
  1972 "Studies in the logic of verb aspect and time reference in English". Ph. D. dissertation, University of Texas at Austin.
  1977 "Toward a semantic analysis of verb aspect and the English 'Imperfective' progressive", Linguistics and Philosophy 1: 45-79.
  1979 Word meaning and Montague grammar. The semantics of verbs and times in generative semantics and in Montague's PTQ. Dordrecht: Reidel.
  1988 "Thematic proto-roles, subject selection, and lexical semantic defaults", Ms. (Paper presented at the Twenty-Second Annual Meeting of the Linguistic Society of America, San Francisco; preliminary draft of January 1988).
  1991 "Thematic proto-roles and argument selection", Language 67, 3: 547-619.
Fiengo, Robert
  1974 "Semantic conditions on surface structure". Ph. D. dissertation, Cambridge, Mass.: MIT Press.
Filip, Hana
  1992 "Aspect and the semantics of quantity of verbal and nominal expressions", Proceedings of the East Coast Conference in Linguistics. Ithaca: DMLL Publications, Cornell University, 80-91.
Fillmore, Charles J.
  1985 "Frames and the semantics of understanding", Quaderni di Semantica VI, 2: 222-254.

  1988 "The mechanisms of 'Construction Grammar'", in: Shelley Axmaker, Annie Jaisser & Helen Singmaster (eds.), Proceedings of the Fourteenth Annual Meeting of the Berkeley Linguistics Society. Berkeley: 35-55.
  1989 "On grammatical constructions". Department of Linguistics, The University of California at Berkeley, Ms.
Fillmore, Charles J., Paul Kay & Mary Catherine O'Connor
  1988 "Regularity and idiomaticity in grammatical constructions: The case of let alone", Language 64, 3: 501-538.
Fillmore, Charles J. & Paul Kay
  1992 "On grammatical constructions". Department of Linguistics, The University of California at Berkeley, Ms.
Flier, Michael S. & Richard D. Brecht (eds.)
  1985 Issues in Russian morphosyntax. UCLA Slavic Studies, Vol. 10. Columbus, Ohio: Slavica, 58-72.
Flier, Michael S. & Alan Timberlake (eds.)
  1985 The scope of Slavic aspect. Columbus, Ohio: Slavica.
Forsyth, James
  1970 A grammar of aspect. Usage and meaning in the Russian verb. Cambridge: Cambridge University Press.
Garey, Howard B.
  1957 "Verbal aspects in French", Language 33: 91-110.
Gruber, Jeffrey S.
  1965 "Studies in lexical relations". Ph. D. dissertation, Massachusetts Institute of Technology.
Heim, Irene R.
  1982 "The semantics of definite and indefinite noun phrases". Ph. D. dissertation, University of Massachusetts at Amherst.
Hinrichs, Erhard
  1985 "A compositional semantics for Aktionsarten and NP reference in English". Ph. D. dissertation, Ohio State University.
Hoepelman, Jakob Ph.
  1981 Verb classification and the Russian verbal aspect: A formal analysis. Tübingen: Gunter Narr.
Isačenko, Alexander V.
  1960 Grammatičeskij stroj russkogo jazyka v sopostavlenii s slovackim. Morfologija, Part 2. Bratislava: Izd-vo Slovackoj Akademii nauk.
  1962 Die russische Sprache der Gegenwart, Part I, Formenlehre. Halle (Saale): Niemeyer.
Jackendoff, Ray S.
  1990 "Parts and Boundaries", Ms.
Jakobson, Roman O.
  1936/1984 "Beitrag zur allgemeinen Kasuslehre: Gesamtbedeutungen der russischen Kasus", Travaux du Cercle linguistique de Prague VI: 240-288.

  [1971] [reprinted in: (Selected Writings, Vol. 2, second edition). The Hague: Mouton, 23-71.]
Kabakčiev, Krasimir
  1984 "The article and the aorist/imperfect distinction in Bulgarian: An analysis based on cross-language 'aspect' parallelisms", Linguistics 22, 5: 643-672.
Krifka, Manfred
  1986 "Nominalreferenz und Zeitkonstitution. Zur Semantik von Massentermen, Individualtermen, Aspektklassen". Ph. D. dissertation, The University of Munich, Germany.
  1987 Nominal reference and temporal constitution: Towards a semantics of quantity. (FNS-Bericht 17) Tübingen: Forschungsstelle für natürliche Systeme, Universität Tübingen.
  1989 Nominalreferenz und Zeitkonstitution. Zur Semantik von Massentermen, Individualtermen, Aspektklassen. München: Wilhelm Fink.
  1992 "Thematic relations as links between nominal reference and temporal constitution", in: Ivan A. Sag & A. Szabolcsi (eds.), Lexical matters. Stanford: Center for the Study of Language and Information, 29-53.
Kopečný, František
  1962 Slovesný vid v češtině. Praha: Nakladatelství Československé akademie věd, Series title: Československá akademie věd.
Kučera, Henry
  1983 "A semantic model of verbal aspect", in: Michael S. Flier (ed.), American Contributions to the Ninth International Congress of Slavists. Kiev, September 1983. Volume I: Linguistics. Columbus, Ohio, 171-184.
Lambrecht, Knud
  1986 "Pragmatically motivated syntax: presentational cleft constructions in spoken French", in: Anne M. Farley, Peter T. Farley & Karl-Erik McCullough (eds.), Papers from the 22nd meeting of the Chicago Linguistic Society, part 2, Papers from the parasession on pragmatics and grammatical theory. Chicago: Chicago Linguistics Society, 115-126.
  1987 "Aboutness as a cognitive category: The thetic-categorical distinction revisited", in: Jon Aske et al. (eds.), Proceedings of the Thirteenth Annual Meeting of the Berkeley Linguistics Society. Berkeley: Berkeley Linguistics Society, 366-382.
  1988 "There was a farmer had a dog: Syntactic amalgams revisited", Proceedings of the Fourteenth Annual Meeting of the Berkeley Linguistics Society. Berkeley: Berkeley Linguistics Society, 319-339.
  1990 "What, me worry? - 'Mad magazine sentences' revisited", in: Proceedings of the Sixteenth Annual Meeting of the Berkeley Linguistics Society. Berkeley: Berkeley Linguistics Society, 215-228.

Langacker, Ronald W.
  1987a Foundations of cognitive grammar I: Theoretical prerequisites. Stanford: Stanford University Press.
  1987b "Nouns and Verbs", Language 63: 53-94.
Leinonen, Marja
  1982 Russian aspect, "temporal'naja lokalizacija", and definiteness/indefiniteness (= Neuvostoliitto-instituutin vuosikirja, 27). Helsinki.
Lewis, David
  1975 "Adverbs of quantification", in: Edward L. Keenan (ed.), Formal semantics of natural language. Cambridge: Cambridge University Press, 3-15.
Link, Godehard
  1983 "The logical analysis of plurals and mass terms", in: Rainer Bäuerle, Christoph Schwarze & Armin von Stechow (eds.), Meaning, use, and interpretation of language. New York: de Gruyter, 302-323.
Merrill, Peter
  1985 "Universal quantification and aspect", in: Michael S. Flier & Richard D. Brecht (eds.), Issues in Russian morphosyntax. UCLA Slavic Studies, Vol. 10. Columbus, Ohio: Slavica, 58-72.
Mourelatos, Alexander P. D.
  1978 "Events, processes and states", Linguistics and Philosophy 2: 415-434.
Partee, Barbara H.
  1990 "Domains of quantification and semantic typology", in: Frances Ingeman (ed.), Proceedings of the 1990 Mid-America Linguistics Conference. Lawrence: University of Kansas, 84-106.
  1991 "Adverbial quantification and event structures", Proceedings of the Seventeenth Annual Meeting of the Berkeley Linguistic Society. Berkeley: Berkeley Linguistic Society, 439-456.
Partee, Barbara, Emmon Bach & Angelika Kratzer
  1987 "Quantification: A cross-linguistic investigation". NSF proposal, University of Massachusetts at Amherst, Ms.
Petr, Jan
  1986 Mluvnice češtiny. Part I: Fonetika, Fonologie, Morfonologie a morfemika, Tvoření slov. Praha: Academia.
Pollard, Carl & Ivan A. Sag
  1993 Head-driven phrase structure grammar. Chicago: Chicago University Press and Stanford: Center for the Study of Language and Information (CSLI) Publications.
Rassudova, Olga P.
  1977 "Aspectual meaning and aspectual context in the teaching of Russian verbal aspect", in: R. D. Brecht & D. E. Davidson (eds.), Soviet-American Russian language contributions. Urbana, Ill.: G & G Press, 139-144.

Russell, Pamela
  1985 "Aspectual properties of the Russian verbal prefix na-", in: Michael S. Flier & A. Timberlake (eds.), The scope of Slavic aspect. Columbus, Ohio: Slavica, 59-75.
Shieber, Stuart M.
  1986 An introduction to unification-based approaches to grammar. Stanford: Center for the Study of Language and Information.
Spencer, Andrew
  1991 Morphological theory. An introduction to word structure in generative grammar. Oxford and Cambridge, Mass.: Basil Blackwell.
Šmilauer, Vladimír
  1968/71 Novočeské tvoření slov. (The Word Formation in Modern Czech). Praha: Státní pedagogické nakladatelství.
Talmy, Leonard
  1978 "The relation of grammar to cognition: A synopsis", in: David Waltz (ed.), Theoretical issues in natural language processing. Proceedings of TINLAP-2, University of Illinois. New York: Association for Computing Machinery.
  1986 "The relation of grammar to cognition", Berkeley Cognitive Science Report No. 45. Berkeley: University of California at Berkeley, Institute of Cognitive Studies.
  1988 "The relation of grammar to cognition", in: Brygida Rudzka-Ostyn (ed.), Topics in cognitive linguistics. Amsterdam and Philadelphia: Benjamins.
Taylor, Barry
  1977 "Tense and continuity", Linguistics and Philosophy 1: 199-220.
Tommola, Hannu
  1990 "On Finnish 'aspect' in discourse", in: Nils B. Thelin (ed.), Verbal aspect in discourse. Contributions to the semantics of time and temporal perspective in Slavic and Non-Slavic languages. Amsterdam and Philadelphia: John Benjamins, 349-366.
Vendler, Zeno
  1957 "Verbs and times", Philosophical Review 66: 143-160.
Verkuyl, Henk J.
  1972 On the compositional nature of the aspects. Foundations of Language, Supplementary Series, Vol. 15. Dordrecht: Reidel.
Wierzbicka, Anna
  1967 "On the semantics of verbal aspect in Polish", in: To Honor Roman Jakobson. Volume 3. Paris and The Hague: Mouton, 2231-2249.

Case markers and clause linkage: Toward a semantic typology*

Toshio Ohori

1. Introduction

Across a variety of languages, we find parallels between case markers and clause linkage markers, such as dative-purpose, ablative-reason, locative-time, etc. (cf. the series of articles in Stanford Working Papers on Language Universals during the 1970s, and more recently Genetti 1986, 1991; also cf. Haspelmath from a different angle). In this paper, I will consider a broad range of data, paying special attention to language-specific peculiarities, and show that the parallels between case markers and linkage markers are motivated by the fundamental distinction between "figure" and "ground" (Talmy 1978) that operates in the process of semantic extension. In pursuing these goals, I hope to enhance our understanding of the cognitive aspects of complex constructions. To begin, let us look at an example of the dative-purpose parallel, which is found very widely:

Japanese
(1) Neko-ni gohan-o age-ta.
    cat-DAT food-ACC give-PAST
    '(I) gave food to the cat.'

(2) Hon-o kari-ni tosyokan-e it-ta.
    book-ACC borrow-DAT/PURPOSE library-to go-PAST
    '(I) went to the library to borrow the book.'

In (1), ni is attached to a NP and marks neko 'cat' as recipient, hence the gloss DAT. In (2), on the other hand, the same morpheme is attached to a S or its analog, hon-o kari 'borrow the book' (citation form kariru), to code purpose. Here kari is partially deverbalized: it is tenseless but assigns accusative case to its direct object hon 'book'.1 The same parallel can be seen elsewhere in the world as well. Note the following examples from Turkish:

Turkish
(3) Ahmet Istanbul-a git-ti(-Ø).
    Ahmet Istanbul-DAT go-PAST(-3sg)
    'Ahmet went to Istanbul.'

(4) Ahmet bir kitap al-mağ-a git-ti(-Ø).
    Ahmet a book buy-INF-DAT/PURPOSE go-PAST(-3sg)
    'Ahmet went to buy a book.'

Here too the same morpheme appears next to a NP and an infinitive S, as in the above examples from Japanese. As far as I know, the first mention of the parallels between case markers and linkage markers (especially for subordination) in a cross-linguistic perspective is found in Moravcsik (1972: 151), who discusses the overlapping uses of case markers across languages and remarks on "the use of some particles both as a preposition or postposition, especially for dative marking, and as a complementizer."2 The present paper is, in some respects, an attempt to join the game 20 years later, with a hope of adding some new insights. Here I take the term 'case markers' in a broad sense, including both adpositions and noun declensions as long as they mark the grammatical relations of NPs to Vs. This approach can be justified if we recognize that the problem is not simply a matter of how to define cases, but that of describing the polyfunctionality of certain morphemes. In addition, I use the term 'linkage markers' instead of 'complementizers', because the following discussion includes clauses which may be simply adjoined to, rather than embedded in, the main clause. In the following discussion, I will first provide a brief survey of case marker-linkage marker parallels, drawing upon languages from different stocks. Then, I will point out some problems that have not been discussed in detail up to the present. After that I will examine general tendencies in, and motivations for, the polyfunctionality of some of these morphemes.

2. Presentation

The use of case markers for linkage markers was first systematically investigated by Genetti (1986), based on sampling from the Bodic branch of Tibeto-Burman languages. The result is summarized as follows (commonest ones are in italics, added by Ohori):

(5) LOCATIVE > if/although, when/while/after
    ABLATIVE > when/while/after, because, non-final
    ALLATIVE > purpose
    DATIVE > purpose
    ERGATIVE/INSTRUMENTAL > because, when/while/after

Similar sorts of extensions from case markers to linkage markers can be seen in various parts of the world, and one may feel strongly that there is some universal basis for them. Examples of polyfunctionality other than the dative-purpose parallel are given below:

Martuthunira (Pama-Nyungan, Dench 1988): locative
(6) yakarrangu-la
    sun-LOC
    'in the sun'

(7) Karlarra-npa-lha-la paju-rru, puwara-npa-lha-la paju-rru, ngarri-ngka kampa-rninyji-rru ngurnaa.
    hot-INCH-PAST-LOC/TIME very-NOW coal-INCH-PAST-LOC/TIME very-NOW ashes-LOC cook-FUT-NOW that-ACC
    'When it (the fire) has become very hot and burned down to coals, cook that one in the ashes.'


Lahu (Lolo-Burmese, Matisoff 1982): ablative
(8) ha-ni hó o te k mô ö
    rock-red below LOC from down LOC
    'from below Red Rock downwards'

(9) Su gò tu phi te-le st? pù ε gì B? È?.
    3rd intensifier burn set BEN ABL/REASON three basketful TOP get.to.do enough INTJ
    'Since they went and burned up [all my fields] on me, I'll only harvest three basketfuls!'

Diegueño (Yuman, Gorbet 1973, 1974): inessive
(10) ?əwa:+pu+Ly
     house+DEF+INESS
     'in(to) the house'

(11) […] ?a:+Ly lar.
     I+SUBJ I-go+INESS/COMP I-want
     'I want to go.'

Tauya (Papuan, Macdonald 1988, 1990): ergative-instrumental
(12) ?asu-ni fai-e-?a
     knife-ERG cut-1/2-IND
     'I cut (it) with a knife.'

(13) Fanu nipi ?umu-a-na-ni wamasi mene-a-?a
     man her die-3sg-SUB-ERG/CAUSE widow stay-3sg-IND
     'Her husband died so [= and consequently] she's a widow.'

While these examples represent general tendencies in the extension from case markers to linkage markers, there are notable language-specific wrinkles, which must be given due consideration. Let us look at some of these cases.


First, the extent to which the case marker-linkage marker parallel is established differs from language to language. For example, the "ablative" marker te le in Lahu has in fact a relatively restricted use in the language, so it would be an injustice if I did not mention "the incommensurability of cross-language semantic-grammatical categories" (Matisoff 1982: 167). The first element te is indeed verbal and "fromness" is one of its derived meanings, so "the idea seems to be that 'having finished our consideration of this place, we will proceed from it to another point' " (Matisoff 1982: 168). Compare this marker with Japanese kara in the following examples:

Japanese
(14) Heya-kara de-ta.
     room-ABL go.out-PAST
     '(I) went out of the room.'

(15) Netu-ga at-ta-kara yasun-da.
     fever-NOM be/have-PAST-ABL/REASON absent-PAST
     'Since (I) had a fever, (I) was absent [= didn't come].'

In Japanese, kara is the commonest marker of "fromness", so it can be said that the parallel between ablative and reason is, unlike in Lahu, firmly established. In other languages, for example Turkish, there does not seem to be an ablative-reason parallel. Turkish uses the ablative case dan in the linkage construction which codes reason, but the ablative in that construction is part of a larger connective expression and one cannot say that the parallel is really established.3 On the other hand, some languages, notably the Pama-Nyungan languages of Australia, grammaticize dative-time and ablative-reason parallels, but lack the dative-purpose parallel, which is perhaps the most widely observed case of polyfunctionality. This is because Pama-Nyungan languages have a suffix that is specialized for marking purpose (Dixon 1977, 1980). This is, for example, the case in Yidiny, in which the suffix gu is used to mark purpose. As a result, although these languages may use the dative case for marking a whole clause, its typical meaning is not purpose, but "something happening concurrently with the main clause" (Dixon 1980: 459).

Second, there is often syncretism or conflation of case markers. Datives, for example, tend to absorb many different functions. In Japanese, goal and location are both marked by the form ni, as in many other languages. Further, in a variety of languages, both direct and indirect objects are marked by the same form. For example, Lahu uses the marker tha? for both direct and indirect objects, alongside many other NPs depending on context. Consequently, it lacks a morpheme that can be securely identified as either "dative" or "accusative", and tha? is not used to link clauses. Besides, Lahu is a typical monosyllabic language where grammatical functions are realized analytically, so notions like beneficiary, location, etc. are often realized by a complex interplay of particles. This seems to account for the fact that Lahu does not grammaticize the dative-purpose parallel, though it has the ablative-reason parallel. A similar sort of peculiarity is found in Diegueño as well. It has such parallels as comitative-circumstantial, inessive-purpose, locative-circumstantial and ablative-conditional (Gorbet 1973, 1974), but nevertheless it happens not to have the dative-purpose parallel. This is because, as in Lahu, direct and indirect objects are marked by the same morpheme in Diegueño and there is no special case marker for indirect objects. As such, the semantic content of the case affix may be too broad to be specialized for purpose.

Third, the fuzziness of category boundaries can be problematic. In many languages, dependent clauses bearing case markers are deverbalized, but the extent of this deverbalization varies from language to language. Even within a single language, say Japanese, the verb form is in the infinitive when the dative ni is attached, while it bears tense when the ablative kara is attached (compare (2) and (15)).4 Further, different strategies may coexist for realizing the same semantic relation in a given language, as in the following Basque examples (Brettschneider 1980: 194; English translations are added by Ohori):


Basque
(16) Gaixo izan-a-z ez n-aiz etorri.
     krank sein(VAdj)-df/sg-INST nicht 1s-AUX gekommen
     'Durch das Krank-Sein bin ich nicht gekommen (= Because of being sick, I didn't come/go).'

(17) Ez n-in-tza-n etorri gaixo nintza(n)-la-ko-z.
     nicht AUX(=1S-TNS-AUX-TNS) kommen krank AUX-COMP-DELIM-INST
     'Ich war nicht gekommen, weil ich krank war (= I didn't come, because I was sick).'

Although the instrumental case marker appears in both examples, (16) gaixo izan-a-z 'because of being sick' is completely nominalized, but this is not so in (17) gaixo nintza(n)-la-ko-z 'because I was sick'. In the latter, the embedded clause has an auxiliary, which cannot be found in a nominalized clause. This last point raises an important issue, which must be clarified in order to correctly understand the nature of the polyfunctionality under consideration. One may claim that the case marker-linkage marker parallel is a pseudo-problem because case markers are attached to nominalized clauses, and thus remain case markers all the time without becoming linkage markers. I would say this is a pseudo-criticism, simply because we have to understand why languages ever bother to nominalize a clause and treat a predication like an individual. The former has an internal event structure with participants and processes, but the latter is typified by concrete objects. To simply construe the above examples as nominalizations does not justify ignoring the data. Furthermore, in many cases the case-marked clause bears tense or person marking and its nominal quality is very weak, i.e. many such clauses are to some extent both noun-like and verb-like. What needs to be examined are the semantic motivations for drawing upon the grammatical resources of nominalization. The problem is not only the parallel between NPs and Ss on morpho-syntactic grounds but that between individuals and events on semantic grounds.

3. Tendencies and motivations

In the following discussion, I will proceed by examining the following question and explore the semantic basis for case marker-linkage marker parallels.

(18) What are the motivations for the extension of case markers to clause linkage markers and what constraints are there on that process?

In the paper mentioned earlier, Genetti (1986) gives an account based on the localist hypothesis, noting that markers that code spatio-temporal relations can be extended to more abstract logical relations, as in the table below (from Genetti 1986: 394):

(19)            LOCATION      SOURCE      GOAL
     SPATIAL    locative      ablative    allative
     TEMPORAL   when/while    since       until
     LOGICAL    if            because     purpose
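The table in (19) can also be read as a lookup from a spatial case marker to the temporal and logical readings it is paired with. The following is only a minimal sketch of that reading, not part of Genetti's account; the names `LOCALIST_TABLE` and `predicted_extensions` are hypothetical and chosen here for illustration.

```python
# Illustrative encoding of table (19); all identifiers are hypothetical.
LOCALIST_TABLE = {
    "location": {"spatial": "locative", "temporal": "when/while", "logical": "if"},
    "source": {"spatial": "ablative", "temporal": "since", "logical": "because"},
    "goal": {"spatial": "allative", "temporal": "until", "logical": "purpose"},
}


def predicted_extensions(case_marker):
    """Return the temporal and logical readings that (19) pairs with a spatial case marker."""
    for notion, row in LOCALIST_TABLE.items():
        if row["spatial"] == case_marker:
            return {
                "notion": notion,
                "temporal": row["temporal"],
                "logical": row["logical"],
            }
    return None  # (19) makes no statement about this marker


print(predicted_extensions("ablative"))
print(predicted_extensions("nominative"))
```

The lookup returns nothing for markers such as the nominative, which mirrors the fact that (19) is silent about markers for core grammatical relations.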

When the notion of location is extended to the temporal domain, we obtain the temporal location, i.e. time (paraphrasable with "when" and "while"). The logical "location" is the protasis of the conditional. Likewise, the source and goal become the starting point (= "since") and the endpoint (= "until") of events in the temporal domain, and the cause (= "because") and intended result (= "purpose") in the logical domain. This explanation is fairly straightforward, and seems plausible on intuitive grounds. However, there are several problems to be investigated further, some of which are related to the points raised above. Most crucially, (19) does not explain why some case markers are more likely to be grammaticized as linkage markers than others. For example, case markers for core grammatical relations, such as subject and object, are less likely to be grammaticized as markers for dependent clauses.5 Also, comitatives tend not to be used to link clauses, as long as they have separate forms from other markers such as possessives. This tendency is pointed out by Genetti (1986: 392), but it also holds outside Tibeto-Burman. These points are by no means trivial, because there is no a priori reason to prohibit the use of the nominative for, say, cause, or the accusative for purpose. The localist hypothesis as given in (19) seems generally correct, but it fails to be explicit about how to construe markers for core grammatical relations (e.g. nominatives and accusatives) when they are used, if ever, with full clauses. Thus there remains a need for articulating the constraints on the extension of case markers to linkage markers, specifying which case markers acquire polyfunctionality more easily than others. At the same time, it is also important to examine the basis of the localist hypothesis, by asking what it is that guarantees the use of the same morpheme for marking both NPs and Ss, which are structurally, and conceptually as well, distinct units. Explanations for these problems, I believe, can be found by appealing to the distinction between 'figure' and 'ground' (cf. Talmy 1978; also the reformulation by Langacker 1987), which is rooted in our cognitive apparatus. The figure, as usually understood, refers to an individuated and salient entity which can be the focal point of attention. It is "a moving or conceptually movable point". The ground, in contrast, is "a reference-point, having a stationary setting within a reference-frame", and the figure's path or site is characterized with respect to it (Talmy 1978: 627). Since the figure-ground distinction is independent of sensory modalities, it is naturally expected to show up in the organization of linguistic systems as well. Thus in the following example, the pen and the table refer to the entities that function as figure and ground respectively:

(20) The pen fell off the table.

Here we understand this sentence by constructing an image in which the pen (= figure) is positioned in relation to the table (= ground). Importantly, many grammatical patterns embody this distinction, disallowing an unnatural figure-ground assignment. Talmy (1978: 629) remarks as follows: "Even where a speaker does not want to assert anything about relative referencing, language inescapably imposes that semantic addition upon a basic proposition." This is true in both simple and complex sentences, which I will examine below following Talmy's arguments. In simple sentences, the figure-ground distinction is reflected in the way NPs are coded. Of the following examples, the first one sounds natural but the second one does not (Talmy 1978: 628):

(21) a. The bike is near the house.
     b. ?The house is near the bike.

Given the two entities, a bike and a house, we would normally expect that the former stands as the figure, because it is a smaller, conceptually movable entity. In the first example, the grammatical coding conforms to this expectation, because the bike is coded by the core NP (namely the subject) of the sentence, to which the figure is typically assigned. In the second example, however, the grammatical coding forces us to construe the house as the figure whose location in relation to the bike requires attention, but this is contradictory to what is expected under normal circumstances. To put this in more general terms, figure and ground typically correspond to core and peripheral NPs in simple sentences. This is natural in view of the fact that entities that form necessary components of motion are likewise necessary arguments of predications. Next, in complex sentences, the figure-ground distinction corresponds to the distinction between main and subordinate clauses. According to Talmy (1978: 638), the normal way of coding asymmetric relations between events (e.g. causality and temporality) is to treat "the earlier event as a reference point, or ground, and the later event as requiring referencing, i.e. as the figure". This point is illustrated by the following example:

(22) Since nobody was around, I shouted for help.
     ('Resulting from the fact that nobody was around, I shouted for help.')


Here, the main and subordinate clauses correspond to figure and ground respectively. Note that the occurrence of the event in the main clause is determined in relation to that in the subordinate clause. In (22), therefore, the coding of the events in the linked clauses is congruent with their conceptualization. The primacy of the figure-ground distinction in coding events in complex sentences is illuminated by the fact that it is strongly counterintuitive to put an event which has figure status in the subordinate clause. See the following example (since⁻¹ expresses the converse function of since):

(23) ?Since⁻¹ I shouted for help, nobody was around.
     ('Resulting in the fact that I shouted for help, nobody was around.')

English lacks any marker with which result is coded within the subordinate clause with the main clause expressing the background event. This is probably the same in many other languages. The difficulty with example (23) is precisely that the main clause is reserved for figure and the subordinate clause is reserved for ground. This tendency puts an important constraint on coding events in complex sentences.⁶

From these considerations, I argue that the case marker-linkage marker parallels can be understood in terms of the 'gestalt-preserving' nature of semantic extension. The difficulty in extending certain case markers (e.g. nominatives and accusatives) to linkage markers comes from the fact that, in simple sentences, they are primarily used to code the figure, and as such they are not suited to code the ground in complex sentences, namely subordinate clauses. These correspondences can be schematized as follows: (24)

        FIGURE    GROUND
NP:     Core      Periphery
S:      Main      Subordinate

This table shows that the core NP and the main clause both have the status of figure in simple and complex sentences respectively, and that the peripheral NP in simple sentences corresponds to the subordinate clause in complex sentences, both having the status of ground. Consequently, such peripheral case markers as datives, instrumentals, ablatives, and locatives can be easily extended for marking purpose, cause, time, and conditional, all of which serve as ground in complex sentences. It is perhaps right to say that the relative ease of using case markers for clause linkage correlates with their typicality as ground. This idea can be captured by the following figure: (25)

[Figure (25): labels NPs, Ss]

At this point, we may turn to the conceptual basis of the localist hypothesis. In (19), it was shown that notions like "reason" and "purpose" are obtained by mapping "source" and "goal" from the spatial domain onto the logical domain. However, if this analysis is correct, then we need to posit some sort of abstract motion from source to goal in the logical domain. In the above discussion, it was argued that the figure-ground distinction is necessary in accounting for the constraints on the extension from case markers to linkage markers. But here the same conceptual distinction is required to solve another problem, namely the conditions for the localist hypothesis: what guarantees the extension from a concrete domain to an abstract domain, despite the fact that physical motion in the former is non-existent in the latter? The answer again lies in the gestalt-preserving nature of semantic extension. As the figure-ground distinction is primary in coding information in language, it manifests itself in different conceptual domains - spatial, temporal and logical. Although there is no concrete "motion" in complex sentences, we may posit a chain of causal actions, on which the figure, i.e. the event coded in the main clause, can be located (cf. Croft 1991 for a highly insightful discussion). To this extent, there is imaginary motion in each conceptual domain. In the previous studies based on localism, much attention has been paid to what changes to what, but not to what remains in the process. The above discussion illuminates this neglected aspect.

Having thus clarified the nature of case marker-linkage marker parallels, I will discuss a few problematic cases and provide ways to make sense of them. First, comitatives and genitives do not mark core NPs in simple sentences, but they are not usually grammaticized as linkage markers. This is because these markers do not code the function of NP with reference to motion, but they code some sort of associative relations between NPs. Conceptual asymmetries may be present with these case markers, especially in genitives, but they do not seem to involve any chain of causality that evokes motion. Since temporal sequence and causality are both conceptualized according to this chain, it is less common that comitatives and genitives are used to link full clauses.

Second, despite the foregoing arguments, there are languages in which markers for core grammatical relations are used as linkage markers. For example, Yaqui, a Uto-Aztecan language of Mexico and Arizona, grammaticizes the accusative case for "(a) direct and indirect objects, (b) possessor nouns in possessed object noun phrases, (c) objects of the postposition, (d) nominalizations, (e) nominalized clauses and (f) temporal clauses" (Eugene Casad (p.c.)). Here it is probably a legitimate guess that markers for direct objects are better suited to link clauses than those for subjects. This difference can be understood by noticing that NPs typically coded as subjects (e.g. agentive and topical participants) are primary figures, whereas those typically coded as objects (e.g.
participants that undergo some change) may only have secondary significance (cf. Langacker's distinction between trajector and landmark, which he treats as a basic figure-ground relationship, with the subject and object distinction being a special case, e.g. Langacker 1987).

Another example of the use of core case markers for linking clauses is found in the diachronic data of Japanese. In Old Japanese (OJ, 8-11C), both nominative and accusative markers could occur next to a full clause. One of their uses was head-internal relativization (cf. Akiba 1977 for an overview). See the following example, taken from Taketori Monogatari, a narrative text from around 9C: (26)

Kono menowaraha-ha tahete miyadukahe-tukamaturu-beku-mo
this girl-TOP      at.all court.service-perform-would-PRT
arazu-haberu-wo, mote.wadurahi-haberi. (p. 42)
be-NEG-POL-ACC   cannot.handle-POL
'(We) cannot handle this girl, who would not serve at the court at all.'

The translation uses a relative clause, but a paraphrase using coordination is also possible (for example, "This girl would not serve at the court at all, and (we) cannot handle (her)"). Indeed, there are cases that do not allow a relativization reading. The following is such an example, also taken from Taketori Monogatari: (27)

Kanarazu     mi-tatematuri-te-maire-to ohosegoto ari-turu
by.all.means see-POL-LINK-come-COMP    command   be-PERF
mono-wo,  mi-tatematura-de-ha ika.de-ka
thing-ACC see-POL-NEG-TOP     how-PRT
kaheri-maira-mu. (p. 41)
return-come-would
'There was a command that (I) must see (Kaguyahime) by all means, and how would (I) return if (I) do not see (her)?'

It is important to note that the main part of the sentence, mi-tatematura-de-ha ika.de-ka kaheri-maira-mu, 'how would (I) return if (I) do not see (her)?', is equivalent to an independent sentence with respect to its argument structure. The antecedent clause contains no NP that must be linked to the following clause. This type of non-relative linkage with the accusative wo is already observed in early OJ texts. The same sort of polyfunctionality is observed with the nominative ga, but, crucially, the non-relative reading became established only in late OJ, much later than with other case markers. While head-internal relativization with ga is present in Taketori Monogatari (9C), its non-relative use is found only as late as Konzyaku Monogatari (11C), which is a collection of Buddhism-inspired stories (cf. example (28), adapted from Ishigaki 1955): (28)

Ko    hutari-ha ihe-wo    kakomi-wake-te      wi-tari-keru-ga,
child two-TOP   house-ACC enclose-divide-LINK stay-PERF-EVID-NOM
kono ko-domo-no   yama-yori     kaeri-ki-taru-ni,
this child-PL-PRT mountain-from return-come-PERF-PRT
'The two children had partitioned the house and lived there, and these two children now returned from the mountain,...'

The implication of this delay in the development of polyfunctionality is this: the nominative marker came to be used for non-relative linkage later than other case markers, precisely because it was firmly associated with the notion of figure. Furthermore, even when used for non-relative linkage, the semantic relation is rather open, and is best understood as expressing some pause of thought or weak antithetical relation. These readings, especially the latter, are shared by the accusative-marked clauses as well. Note that these semantic relations are different from the relations considered so far. The former lack the asymmetry that is so prominent in the latter. In this sense, the syntactic relation between the clauses linked by the nominative ga or the accusative wo is closer to coordination than to subordination. From these facts, the following generalization is obtained: even when markers for core grammatical relations are used to link clauses, the linkage does not look like typical subordination with the figure-ground asymmetry either structurally or semantically.

Finally, I consider grammatical and semantic prerequisites for languages to develop case marker-linkage marker parallels. Morpho-syntactically, one prerequisite is word-order consistency. That is, languages in which the relative order of NP and case marker parallels that of subordinate clause and linkage marker seem to be possible candidates. The majority of the languages reviewed above, such as Japanese, Papuan, and Tibeto-Burman, are of this type with fairly rigid order of operator-operand, while Australian languages, known for their freedom of word order, may be an exception. On semantic grounds, the crucial step for the grammaticization of the case marker-linkage marker parallel is when we understand events in terms of objects. It may be speculated that languages may differ with respect to the explicitness with which they mark the distinction between events and objects. English, for example, is very rigid in maintaining the distinction, so it is (barely) OK to say With his cold becoming very bad, John will stay home, but not *With his cold became very bad, ... retaining tense marking. Compare this with the Tauya example in (13), where person marking is present on the instrumental-marked clause, or the Japanese ablative-marked clause in (15), which retains tense marking. Significantly, this point is also connected to the figure-ground distinction, in the sense that the distinction between events and objects is less rigid when they are in the background, and being so, the semantic extension becomes more likely.

4. Conclusion

The major claims of this paper can be summarized in the following way. The extension of case markers to linkage markers is motivated by the interplay of localism and the gestalt-preserving nature of semantic extension. Case markers for peripheral relations are more likely to be extended for clause linkage than those for core grammatical relations, because peripheral NPs such as datives, ablatives, instrumentals, locatives, etc. and subordinate clauses such as time, reason, condition, etc. all share the feature of ground. On the other hand, nominatives and accusatives do not fit the semantic structure of the parallels because they are either utilized for the figure proper or are indeterminate with respect to the figure/ground distinction. In conclusion, the investigation of the semantic motivations for the parallels between case markers and linkage markers - and polyfunctionality of grammatical forms in general - sheds light on the ways natural language is structured. Cross-linguistic generalizations, combined with a close look at language-specific facts, promise to be one fruitful direction for explaining the partitioning of functional domains and their interconnections.

Notes

* Information on Turkish and Lahu was provided by Karl Zimmer and James Matisoff, respectively. Gary Holland shared with me very fruitful discussions, including information on Indo-European absolutes. Also, I am indebted to the editor and two anonymous reviewers for their comments. I hereby express my gratitude to all of them. Any misunderstanding or misrepresentation is my own. Interlinear glosses of morphemes from languages other than Japanese have been reproduced from respective sources with only minor regularization. Abbreviations for function words are: ABL(ative), ACC(usative), AUX(iliary), BEN(efactive), COMP(lementizer), DAT(ive), DEF(inite), DELIM(itative), ERG(ative), EVID(ential), FUT(ure), INCH(oative), IND(icative), INESS(ive), INF(initive), INST(rumental), INTJ (interjection), LINK(age), LOC(ative), NEG(ation), NOM(inative), NZ (=nominalizer), PERF(ective), PL (=plural), POL(ite), POSS(essive), PRT (=particle), SUB(ordinator), SUBJ(ect), TNS (=tense), TOP(ic). When a single word in one language corresponds to a string of words in the other, dots are used instead of spaces to show word boundaries. Elements that are unexpressed in the original sentence, e.g. subject NPs, are put in parentheses in the English translation.

1. Notice that not all deverbal forms can behave exactly like ordinary lexical nouns in Japanese. For example, while nomi-ni iku drink-DAT go, 'go to drink' is fine (nomi is deverbalized from nomu), just as syokuzi-ni iku meal-DAT go, 'go for a meal' is, it is odd to say *nomi-wa itu? drink-TOP when, 'when is drink(ing)?' (cf. syokuzi-wa itu?, which is fine).

2. One reviewer of this paper informed me that Kurylowicz's (e.g. 1960) distinction between "grammatical cases" and "concrete cases" is reflected in some of the ideas in the present study. I think this is correct, and it is here acknowledged with gratitude.

3. One example is:

(N-1) Yorgun ol-dug-nu-dan        dolayi     gel-me-di(-Ø).
      tired  be-NZ-3rd.POSS-ABL   because.of come-NEG-PAST(-3sg)
      'Because (s/he) was tired, (s/he) didn't come.'

Here, the causal meaning is mainly expressed by dolayi, and the ablative dan is morphologically not the main part of the connective.

4. In addition, Japanese has a third type of case-marked clauses (along with the deverbal type like (2) and the inflected type without nominalizer like (16)), perhaps reflecting the diachronic layering of constructions. The third type is inflected and is explicitly accompanied by the nominalizer no, as in modern Japanese noni (no plus dative), which can be glossed "although".

5. Here it ought to be mentioned that early Indo-European absolutes can be an exception, because the dependent clause, which is in a participial form, can be marked nominative. An archaic Latin example is given below from Holland (1986: 176):

(N-2) si ambo praesentes sol occasus suprema tempestas esto (Leges XII tabularum I 9)
      'If both are present, sun set shall be the latest time (for proceedings).'

Here, "ambo and the participle praesentes are clear nominatives; there is no resumptive element linking the two parts of this sentence [N.B. hence not a correlative construction], and the subject of the second clause is totally unconnected to that of the first [N.B. hence an adjoined construction]." (Holland 1986: 177). But the later paths that Indo-European languages took, namely the proliferation of case from nominative to dative, locative, ablative, etc., together with the drift from hypotaxis to parataxis, seem to fit the characteristics of the grammaticization of subordinate clauses given in this paper. Holland's remark that "The shift from nominative to locative was an IE grammatical possibility, but its implementation seems to have been einzelsprachlich" (p. 190) can be supported by appealing to the cross-linguistic tendency to use case markers for peripheral NP's, rather than those for core NP's, for adverbial clauses.

6. Interestingly, there are languages in which a class of dependent clauses can only be defined in terms of the opposition between figure and ground. In "adjoined" constructions in Australian languages, the various types of dependent clauses marked with what is usually glossed REL (general-purpose subordinator) share the feature of ground as opposed to figure. In Rembarrnga (Non-Pama-Nyungan, McKay 1988), the adjoined clause is used for adverbial clauses, relative clauses, and the analog of cleft constructions, but not for perception complements. This is because, according to McKay, complements do not generally serve as the ground, whereas all the other clause types do.

References

Akiba, Katsue
1977 "Switch reference in Old Japanese", Proceedings of the Annual Meeting of the Berkeley Linguistics Society 3: 610-619.
Austin, Peter (ed.)
1988 Complex sentence constructions in Australian languages. Amsterdam: John Benjamins.
Brettschneider, Gunter
1980 "Zur Typologie komplexer Sätze: Vorüberlegungen", in: Gunter Brettschneider & Christian Lehmann (eds.), 192-198.
Brettschneider, Gunter & Christian Lehmann (eds.)
1980 Wege zur Universalienforschung: Sprachwissenschaftliche Beiträge zum 60. Geburtstag von Hansjakob Seiler. Tübingen: Gunter Narr.
Croft, William
1991 Syntactic categories and grammatical relations: The cognitive organization of information. Chicago: University of Chicago Press.
Dench, Alan
1988 "Complex sentences in Martuthunira", in: Peter Austin (ed.), 97-139.
Dixon, Richard M. W.
1977 A grammar of Yidiny. Cambridge: Cambridge University Press.
1980 The languages of Australia. Cambridge: Cambridge University Press.
Genetti, Carol
1986 "The development of subordination from postposition in Bodic languages", Proceedings of the Annual Meeting of the Berkeley Linguistics Society 12: 387-400.
1991 "From postposition to subordination in Newari", in: Elizabeth C. Traugott & Bernd Heine (eds.), II: 227-255.
Gorbet, Larry
1973 "Case markers and complementizers in Diegueño", Working Papers on Language Universals 11: 219-222.
1974 A grammar of Diegueño nominals. New York: Garland.
Greenberg, Joseph H., Charles Ferguson & Edith Moravcsik (eds.)
1978 Universals of human language. 4 vols. Stanford: Stanford University Press.
Haiman, John & Sandra A. Thompson (eds.)
1988 Clause combining in grammar and discourse. Amsterdam: John Benjamins.
Haspelmath, Martin
1989 "From purposive to infinitive: A universal path of grammaticalization", University of Cologne, Ms.
Holland, Gary
1986 "Nominal sentences and the origin of absolute constructions in Indo-European", Zeitschrift für Vergleichende Sprachforschung 99: 163-193.
Ishigaki, Kenji
1955 Joshi no Rekishiteki Kenkyu. (A historical study on particles) Tokyo: Iwanami.
Kurylowicz, Jerzy
1960 Esquisses linguistiques. Wroclaw: Ossolineum.
Langacker, Ronald
1987 Foundations of cognitive grammar I: Theoretical prerequisites. Stanford: Stanford University Press.
Macdonald, Lorna
1988 "Subordination in Tauya", in: John Haiman & Sandra A. Thompson (eds.), 227-246.
1990 Tauya. Berlin: Mouton de Gruyter.
Matisoff, James A.
1973 [1982²] The grammar of Lahu. Berkeley and Los Angeles: University of California Press.
McKay, Graham R.
1988 "Figure and ground in Rembarrnga complex sentences", in: Peter Austin (ed.), 7-36.
Moravcsik, Edith
1972 "On case markers and complementizers", Working Papers on Language Universals: 151-152.
Sakakura, Atsuyoshi (ed.)
1970 Taketori Monogatari. Tokyo: Iwanami.
Talmy, Leonard
1978 "Figure and ground in complex sentences", in: Joseph H. Greenberg, Charles Ferguson & Edith Moravcsik (eds.), IV, 625-649.
Traugott, Elizabeth C. & Bernd Heine (eds.)
1991 Approaches to grammaticalization. 2 vols. Amsterdam: John Benjamins.

The thing is is that people talk that way. The question is is Why?

David Tuggy

1. Introduction: the 2-B construction

This paper presents the results of an informal investigation of a copular construction of the form "NP is is CLAUSE" which is utilized by some speakers of American English.¹ We will call this phenomenon the two-"be" construction (2-B); it is marginal in several respects, and there, in no small measure, lies its interest. The 2-B can be characterized as follows: Typically, a short, definite noun phrase, headed by the word "thing", is followed by two instances of the word "is", which in turn are followed by "that" and a finite clause. The clause may be of considerable length; it is usually significantly longer than the introductory noun phrase. Less typical versions of the construction change one or more of these specifications. One of the more common variations is to use the noun "problem" instead of "thing"; other nouns such as "point" or "fact" or "answer" also occur, though somewhat less frequently. The noun phrase is sometimes longer and more complex, or an infinitival construction may be used instead of the "that" clause; more rarely the noun phrase may be non-definite, one or both of the "is" forms may be in a different tense, or a non-clausal structure may be used instead of the "that" clause. The meaning of the construction is very similar if not identical to that of a similar copular structure with only one "is" (which we will call a one-"be" construction or 1-B). Sentences (1) and (2) are typical examples of the 2-B; (3) to (6) are less typical but still relatively good, whereas (7) to (10) are atypical.

(1) The thing is is that we haven't told John yet.
(2) The funny thing is is that they didn't say anything about it.
(3) The problem is is that she already paid for it.
(4) What really bothers me is is that it's going to mean a lot more work for somebody.
(5) My feeling is is that in framing the Constitution, they were trying to make sure that everyone could do what he wanted, within reasonable limits.
(6) The upshot of this study is is that the strong-willed child really stresses his parents out.
(7) The first thing was is that Daniel had made up his mind, and he wasn't about to change it.
(8) The one thing on the mind of that gazelle is is "I'm going to get up, and I'm going to get some food."
(9) The question is is why?
(10) The point I was trying to make was was the violence.
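The surface pattern just described lends itself to a rough mechanical search. The following is only a minimal sketch, assuming that a candidate 2-B can be approximated as two adjacent copular forms drawn from "is" and "was"; the pattern `DOUBLE_BE` and the function `find_double_be` are hypothetical names introduced for illustration, not part of the study. Such a search flags candidates only: it cannot by itself tell a 2-B from other sequences of two copulas, and it ignores intonation entirely.

```python
import re

# Hypothetical heuristic: two adjacent copular forms (is/was),
# optionally separated by a comma standing in for a pause.
DOUBLE_BE = re.compile(r"\b(is|was)\b,?\s+(is|was)\b", re.IGNORECASE)

def find_double_be(sentence):
    """Return the (first, second) copula pairs found in the sentence."""
    return [(m.group(1).lower(), m.group(2).lower())
            for m in DOUBLE_BE.finditer(sentence)]

examples = [
    # Example (1)
    "The thing is is that we haven't told John yet.",
    # Example (7)
    "The first thing was is that Daniel had made up his mind, "
    "and he wasn't about to change it.",
    # Example (10)
    "The point I was trying to make was was the violence.",
    # 1-B counterpart of example (3), which should not be flagged
    "The problem is that she already paid for it.",
]

for s in examples:
    print(find_double_be(s), "<-", s)
```

The tense pairings returned for (7) and (10) differ from the present-tense pairing in (1), which is the kind of variation a corpus search of this sort could help tabulate.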

These sentences may occur with a range of intonational patterns; in particular the first "is" in (1) and (2) is often stressed (and similarly, though somewhat less frequently, in (3)), and there may be anything from a full pause, with appropriate intonational contours fore and aft, to no hesitation at all between the two ises. The 2-B is marginal in several respects, some of which merit much more attention than I have been able to give them. Many speakers of American English do not seem to use it at all - I have only once caught myself using it, for instance. It is doubtless limited in its geographical distribution, more common in some areas, less common and perhaps altogether absent in others; there may be social differentiations as well.² But the most important area of marginality has to do with speech styles and particularly with membership in that subset of constructions speakers use which they consciously consider proper or correct. Many who use it will, when queried, take these forms as grammatically erroneous ("wrong"), as deviations from the related 1-B's. I know no speakers who always use 2-B's as opposed to 1-B's, though some clearly prefer the 2-B in the prototypical cases.³ As might be expected for a construction thought to be "wrong", the 2-B tends to be avoided in carefully planned speech, and I have not observed it in written communication, though I would be surprised not to do so soon.⁴

To summarize beforehand the conclusions of this paper, the 2-B has apparently arisen from a number of sources, most of them grammatically anomalous or even erroneous in some degree. These sources continue to be synchronically relevant, although the 2-B is strongly enough established in many people's grammatical systems to exist without its connection to them. In particular there is evidence that three different sources (parallelism with "legitimate" "is is" structures, solidification of the phrase "the thing is" with concomitant loss of function of the parts, and use of a unit complementizer "is that") are involved, as reflected in three different patterns of non-present tense formation in 2-B's.

2. Aspects of Cognitive Grammar (CG) relevant to the analysis

We will be examining this structure under the Cognitive Grammar model (CG) as developed by Langacker and others (Langacker 1987ab, 1990; Rudzka-Ostyn, ed., 1988).

2.1. Categorization by schema and by prototype; full and partial schematicity

One aspect of CG which is highly relevant to the analysis is the way it handles categorization.⁵ Categories are groups of structures held together more or less tightly by relationships of "full" or "partial schematicity". When two concepts are totally compatible, but one of them is more detailed or elaborate than the other, the less elaborate concept is a "schema", and the more detailed concept is its "elaboration". The relationship between them, which is one of "full schematicity", is represented diagrammatically by an arrow from the schema to the elaboration. For instance, the concept SUITCASE is schematic for the concept RED SAMSONITE SOFTSIDED SUITCASE: its specifications are completely compatible with those of its elaboration, and thus the relationship can be represented as SUITCASE → RED SAMSONITE SOFTSIDED SUITCASE. This means, in CG, that RED SAMSONITE SOFTSIDED SUITCASE is included unproblematically as a member of the category defined by SUITCASE. Sometimes, however, there is a conflict in the specifications of two concepts. This naturally makes the categorization more difficult (with the difficulty tending to co-vary with the degree of conflict), but it is still possible. Although the specifications of FOOTLOCKER conflict in some degree with those of SUITCASE, a FOOTLOCKER can still be viewed as a kind of (distorted) SUITCASE. To do so implies recognizing a relationship of "partial schematicity" or "extension", which is diagrammed with a broken line arrow: SUITCASE ---> FOOTLOCKER. Relationships of full schematicity are claimed to have a natural "salience" (i.e. they will, ceteris paribus, occur more energetically in the mind), but if a categorizing structure such as SUITCASE is highly salient itself (i.e. if through usage it is so entrenched in the cognitive system that it is easily elicited and occurs energetically), categorizations which it anchors also tend to be salient, even when it is only partially schematic for the structures it categorizes. Categories typically, then, are structured like the category represented in Fig. 1, with highly schematic members such as PIECE OF LUGGAGE, which by relationships of full schematicity categorize all or much of the category, and more salient subcases such as SUITCASE, called "prototypes" (their salience represented by the boldness of the box enclosing them), which categorize some substructures ("prototypical" substructures)⁶ by full schematicity and others (less prototypical substructures) by relationships of partial schematicity. Relationships of full or partial schematicity can be resolved into (i.e. they consist of) indefinitely many sub-relationships, many of identity or full schematicity (identity being a limiting case of schematicity along a parameter of "elaborative distance"), and others of non-schematicity.
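The resolution of a categorizing relationship into sub-relationships can be pictured with a toy computation. The following is only an illustrative sketch, assuming that a concept can be approximated as a flat set of attribute-value specifications; the attributes, their values, and the function names `compare` and `schematicity` are invented for the example and are not part of the CG model, in which specifications are far richer and matters of degree.

```python
# Toy feature bundles; attribute names and values are illustrative assumptions.
SUITCASE = {"container": True, "portable": True,
            "carry_handle": "top", "size": "medium"}
RED_SUITCASE = {**SUITCASE, "colour": "red",
                "brand": "Samsonite", "sides": "soft"}
FOOTLOCKER = {"container": True, "portable": True,
              "carry_handle": "top", "size": "large", "lock": "hasp"}

def compare(schema, target):
    """Resolve the global relation into per-specification sub-relations."""
    accord = [k for k, v in schema.items() if target.get(k) == v]
    conflict = [k for k, v in schema.items()
                if k in target and target[k] != v]
    return accord, conflict

def schematicity(schema, target):
    """Label the relation 'full', 'partial', or 'none' (a crude three-way cut)."""
    accord, _ = compare(schema, target)
    if len(accord) == len(schema):
        return "full"      # every specification of the schema is met
    return "partial" if accord else "none"

print(schematicity(SUITCASE, RED_SUITCASE))  # solid-arrow case
print(schematicity(SUITCASE, FOOTLOCKER))    # broken-arrow case
print(compare(SUITCASE, FOOTLOCKER))
```

In this sketch the elaboration adds specifications without contradicting any, whereas the extension agrees on some specifications and conflicts on at least one.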
Thus the general shape of a TRUNK and a FOOTLOCKER correspond (box-shaped and hollow, with the cavity enclosed between a body and a slightly hollowed, hinged lid); their canonical resting orientation, with the lid on top and the hinges at the back, is the same; their use (in contrast to that of a SUITCASE) for semi-permanent storage of clothes or other household goods as well as for transporting them, the typical presence of handles on the ends, reinforcers at the corners, heavy-duty fasteners and a hasp with a lock in the center of the front of the lid, and so forth, are either identical or slightly more specific in FOOTLOCKER than in TRUNK.

Figure 1. A typical category

It is only the disparities of size and proportion (which are matters of degree rather than absolute in any case), and related differences such as the (probable) absence of internal horizontal dividers in a FOOTLOCKER, and presence of a handle over the hasp by which the FOOTLOCKER can be carried SUITCASE-style, that make the FOOTLOCKER not a straightforward elaboration of TRUNK.⁷ These correspondences of individual features together make up the global relationship, and the degree to which the global relationship approaches full schematicity can be judged by the proportion of such correspondences which are in accord vs. those which conflict. Fig. 1 represents a lexico-semantic category, but other kinds of categories, including grammatical categories, are organized according to the same principles. The 2-B, for instance, is a category of syntactic constructions, and we will characterize a schematic representation of the construction, which will include all the instances of it which we examine, but we will also characterize a prototype, thereby describing what class of sub-structures within the whole class is most typical of the class, most central to it, and claiming that other structures' inclusion in the class depends in no small degree on their likeness to that prototype (see Fig. 2).

2.2. Full (direct) and partial sanction

A notion closely related to categorization and schematicity is "sanction". Any linguistic structure which is categorized by another, established, structure, is "sanctioned" by that structure, i.e. some of the legitimacy of that structure accrues to it, motivating its existence. The strength of sanction varies according to three parameters: (i) the salience of the sanctioning structure, (ii) the degree of concord or conflict between the specifications of the two structures involved, i.e. the degree to which the sanctioning relationship is one of full rather than partial schematicity, and (iii) the "elaborative distance" between the two structures, i.e. the extensiveness of the specifications, even though the structures be compatible, by which the sanctioned structure differs from the sanctioning structure. As one might suppose, (i) the more salient the sanctioning structure, the stronger the sanction, (ii) full schematicity gives "full" or "direct sanction", which is stronger than the "partial sanction" afforded by relationships of partial schematicity or extension, and (iii) a minimal elaborative distance means stronger sanction. Thus in Fig. 1 RED SAMSONITE SOFTSIDED SUITCASE is sanctioned most strongly by SUITCASE, since SUITCASE is at once (i) the most salient entity categorizing it, (ii) fully schematic for it, and (iii) the closest structure in terms of elaborative distance.⁸ FOOTLOCKER, in contrast, (ii) receives only partial sanction from SUITCASE, although SUITCASE remains (i) the most salient and (iii) a very close categorizer; however, the sanction received from PIECE OF LUGGAGE is also important to its inclusion in the category because (ii) it is full sanction, even though PIECE OF LUGGAGE is neither (i) as salient as SUITCASE nor (iii) as close in terms of elaborative distance. The sanction received from TRUNK is the most important of all in this case: although TRUNK (i) is not as salient as SUITCASE, it is more so than PIECE OF LUGGAGE, and (ii) although it does not sanction FOOTLOCKER fully as does PIECE OF LUGGAGE, its sanction is more direct than that of SUITCASE (there are fewer and less important conflicting specifications), and (iii) the elaborative distance is minimal. Thus a FOOTLOCKER is more saliently a kind of TRUNK than anything else. It is important to note that there is no problem at all with a structure (such as FOOTLOCKER) receiving sanction from several different sources; such "multiple sanction" is the norm rather than the exception, though it is also quite normal for one source to be so preeminent that for many purposes the others can be ignored (cf. Malkiel 1967; Hankamer 1977; Du Bois 1985). It is also important to note that a structure can sanction itself. This we will call "internal sanction" as opposed to "external sanction", but the mechanism is the same: a structure sanctions itself strongly to the degree (i) it is entrenched (and therefore salient) in the language, and that sanction is maximal both in terms of (ii) there being no conflict in specifications, so that the sanction is direct, and (iii) there being no elaborative distance at all. Any such structure is ipso facto part of the grammar of the language in question, regardless of what other sanction it may have received in the past or may still receive synchronically. It is possible for a novel structure with virtually no external sanction to become established, simply because someone, in defiance of established convention, uses that structure until it itself becomes established. When my sister Dale was a child, she had an imaginary playmate named [ î q? î q?].
This imaginary playmate's name did not have even the minimal sanction of being made up of English phonemes, and the sanction it would receive from the existence of an English convention of naming individuals would be very slight: nevertheless it became an established part of my family's linguistic system, and that of some of our friends, because Dale used it long enough. Normally, however, innovative structures rely much more heavily on sanction from already established structures. When they are directly sanctioned by a highly salient structure at a minimal distance, they may not even be perceived as innovations. The concept RED SAMSONITE SOFT-SIDED SUITCASE is not an established unit of English; however, it is unproblematically accepted as well-formed,
even as a prototypical member of the LUGGAGE category, because of the strong sanction it receives from SUITCASE. When the sanction is weaker, however, whether because the sanctioning structures are less salient or more distant, or because there is a degree of extension in the sanctioning relationship, the sanctioned structures are more likely to be recognized as innovations or even as deviations. A structure is deviant or "anomalous", then, to the degree that it lacks direct sanction. It is "(grammatically) erroneous" or is an error to the extent that it is anomalous and its anomaly is judged to be nonvolitional.⁹ Thus [ î q? î q?] is highly anomalous, but not erroneous. Typically an error can be motivated in some degree by such nonvolitional factors as emotional stress, distraction of attention, shortness of breath, computational overload, muscular fatigue, neuromuscular mix-ups ("your tongue getting tangled up"), and so forth. All of these issues are relevant to the description of the 2-B. It is already established to some degree, and thus sanctions itself, but the sanction afforded it by the 1-B and a number of other constructions surely was involved in its being established in the first place, and doubtless remains important. A number of these sanctioning structures are erroneous themselves, and the 2-B receives no strong direct sanction, and this helps explain why it is so commonly perceived as not just deviant but actually erroneous; yet it is so strongly sanctioned by the 1-B and so many other structures, as well as internally by itself, that it often passes unnoticed by speakers.

3. The 2-B prototype

As we have already mentioned in the introduction, a prototypical 2-B consists of a short, definite noun phrase, headed by the word "thing" (or, somewhat less typically, "problem", "point", or "fact"), followed by two instances of the word is, which in turn are followed by that and a finite clause.¹⁰ This is represented by the structures in Fig. 2.a-b (and secondarily in 2.k-l, with only "problem" represented). Particular structures like those in 2.c-j are all prototypical because of the full sanction they receive from 2.a-b, regardless of how firmly they are established in their own right. It will be noted that the degree of establishment of these prototypical structures thus varies from virtually novel, non-established structures like the first thing you hear is is that CL (2.c) or the sixteenth thing is is that CL (2.h), through marginally established structures like the weird thing is is that CL (2.j), to relatively well-established structures like the funny thing is is that CL (2.i) or the first thing is is that CL (2.g), to the most prototypical of all: the thing is is that CL (2.e).

Figure 2. The Double-is Construction (2-B)

Some of these structures can be grouped together into families: thus the well-established (2.g) and the non-established (2.h) and many other established and possible structures are ranged under a schema the Nth thing is is that CL (2.f). Less established structures are also motivated by the partial sanction they receive from similar, more established structures; thus 2.c is sanctioned by 2.g, and 2.j by 2.i. 2-B's with head nouns other than thing or problem, such as the issue is is that CL (2.o) or the upshot is is that CL, even though not established in their own right, may be sanctioned directly by 2.k, and partially by the much more salient 2.a, b or e.¹¹ Even less prototypical structures, such as 2.t-w, are sanctioned directly by the quite non-salient 2.s and may also receive some degree of partial sanction from more prototypical structures: thus e.g. 2.t is sanctioned by 2.e, and 2.u by 2.a. They may of course also be established to some degree in their own right, as is 2.w. One aspect of the 2-B prototypes on which we have not yet commented is the semantic schematicity of the head noun thing. Thing's meaning is about as schematic (unspecified, vague) as is possible for a noun; it is no accident that in CG the technical term for the schema defining the class of nominal elements is, in fact, Thing (Langacker 1987a: 183-213). In context (i.e. in the focus formula construction, cf. Section 4.2), it means something like "noteworthy thing"; and "noteworthy" would have to mean "in disconformity with something normal/established/backgrounded". Problem means, essentially, "Thing which is in disconformity with something (established as) desired"; point means "thing in disconformity with irrelevant background", and fact "thing in disconformity with what is (only) believed/apparent". Thus the schema uniting these cases (represented as 2.k) will specify that the head noun of the NP is highly schematic, but noteworthy. It is, in this regard, worth noting that novel or near-novel structures like the issue is is that CL (2.o), in which the head noun is another highly schematic one (this time perhaps paraphrasable as "thing in disconformity with what is of little concern"), are significantly more acceptable than structures with a less schematic noun, like the rebuttal is is that CL (2.p), or Mexico Branch policy is is that CL. This fits our analysis in that 2.p is not directly sanctioned by any prototypical structure, and the partial sanction afforded by 2.e (and the lesser sanction from 2.m, etc.) is naturally weaker because the two structures are less alike, i.e. the degree of extension between them is greater. 2.q is particularly interesting.
For some speakers it seems to be well entrenched. For others it apparently is not (e.g. they do not say it); it is the grammar of such speakers that is represented in Fig. 2. Although 2.q is not directly sanctioned by 2.k, but only by 2.r, and it is not established in its own right, it is nevertheless quite acceptable, because it is so very similar to the highly prototypical 2.e, and thus receives relatively strong sanction from it.¹²

4. 1-B's and focus formulas (FF's)

4.1. 1-B's

1-B's in general are a much wider class than 2-B's. Virtually any noun that is semantically construable to describe a clause can be head of the initial NP in a 1-B, and the NP can attain a considerable length. Thus (11) and (12) are perfectly good 1-B's; the corresponding 2-B's (reading the is in the angle brackets) would be very difficult.

(11)

(12)

The first of several very serious objections raised in my mind, both by their high-handed attitude and the general touchiness of the whole situation that will result if we buy in, is