English Pages 338 [340] Year 2001
Where Lexicon and Syntax meet
Trends in Linguistics Studies and Monographs 135
Editor
Walter Bisang Werner Winter
Mouton de Gruyter Berlin · New York
Where Lexicon and Syntax meet
by
Doris Schönefeld
2001
Mouton de Gruyter (formerly Mouton, The Hague) is a Division of Walter de Gruyter GmbH & Co. KG, Berlin.
∞ Printed on acid-free paper which falls within the guidelines of the ANSI to ensure permanence and durability.
Library of Congress Cataloging-in-Publication Data
Schönefeld, Doris, 1953-. Where lexicon and syntax meet / by Doris Schönefeld. p. cm. -- (Trends in linguistics : Studies and monographs ; 135). Includes bibliographical references and index. ISBN 3-11-017048-5 (cloth : alk. paper). 1. Lexicology. 2. Grammar, Comparative and general -- Syntax. 3. Linguistic models. 4. Psycholinguistics. I. Title. II. Series. P326 .S3 2001 413'.028-dc21 2001030401
Die Deutsche Bibliothek — CIP-Einheitsaufnahme Schönefeld, Doris: Where lexicon and syntax meet / by Doris Schönefeld. - Berlin ; New York : Mouton de Gruyter, 2001 (Trends in linguistics : Studies and monographs ; 135) Zugl.: Jena, Univ., Habil.-Schr., 1999 ISBN 3-11-017048-5
© Copyright 2001 by Walter de Gruyter GmbH & Co. KG, 10785 Berlin All rights reserved, including those of translation into foreign languages. No part of this book may be reproduced or transmitted in any form or by any means, electronic or mechanical, including photocopy, recording or any information storage and retrieval system, without permission in writing from the publisher. Printing & Binding: Hubert & Co, Göttingen. Cover design: Christopher Schneider, Berlin. Printed in Germany.
Contents

Chapter One: Introduction  1
1.1. The endeavor  1

Chapter Two: Grounding and definitions  5
2.1. At the core of the language  5
2.1.1. Lexicon  6
2.1.2. Syntax  12

Chapter Three: Theories of language processing  15
3.1. The lexicon-syntax interface in performance models  15
3.2. Models of language production  20
3.2.1. An overall survey  20
3.2.2. Selected issues: Where lexicon and syntax meet  30
3.3. Models of language comprehension  48
3.3.1. An overall survey  48
3.3.2. Syntax and lexicon revisited  55

Chapter Four: Linguistic models under scrutiny  89
4.1. The lexicon-syntax interface in competence models  89
4.1.1. Linguistic models and the concept of naturalness  90
4.1.2. General assumptions as to the interrelation between lexicon and syntax  93
4.2. Functional approaches  97
4.2.1. Dik's model  98
4.2.2. Halliday's model  104
4.2.3. The methodological turn from Halliday to Sinclair  108
4.3. Generative approaches  120
4.3.1. The Government-and-Binding model  123
4.3.2. The model of Lexical Functional Grammar  132
4.3.3. The model of Lexical-Generative Grammar  138
4.3.4. The model of Head-Driven Phrase Structure Grammar  142
4.4. Cognitive linguistic approaches  148
4.4.1. Deane's explorations in cognitive syntax  161
4.4.2. Goldberg's Construction Grammar  165
4.4.3. Langacker's Cognitive Grammar  176

Chapter Five: Performance data  187
5.1. Securing and interpreting the evidence  187
5.2. Reformulations/self-repairs  190
5.3. Overlaps  212
5.4. Lexical co-occurrences  227

Chapter Six: In the psycholinguist's laboratory  247
6.1. An experimental test  247
6.2. The experiment  256
6.2.1. Method and procedure  257
6.2.2. Results  261
6.2.3. Discussion  265

Chapter Seven: The finale  279
7.1. Launching the project  279
7.2. Bringing in the harvest  280
7.2.1. The lexicon-syntax interface as reflected in performance models  280
7.2.2. The lexicon-syntax interface as reflected in competence models  283
7.2.3. The lexicon-syntax interface as reflected by performance data  292
7.2.4. The lexicon-syntax interface as reflected by experimentally elicited data  297
7.3. Evaluating the results - a psychologically plausible linguistic model  298

References  303
Index of subjects  329
Chapter One Introduction
1.1. The endeavor
It is common for linguists (myself included) to describe their own analyses as natural, reserving the term unnatural for the analyses of other investigators. From this one deduces that naturalness is something to be desired in a linguistic description. Yet the term natural is elusive and largely unexplicated, having so little intrinsic content that in practice it easily comes to mean simply "in accordance with my own ideas". (Langacker 1987: 13)
Linguistic models of the present time claim to be more or less explanatory, i.e., they claim to be able to explain how a competent speaker of a language acquires this competence. This also implies, among many other things, statements about the way the linguistic subsystems or components, artificially separated in descriptions of the language system for methodological reasons, actually interact. The validity of such statements can be measured against the facts revealed by research into language processing, and such an evaluation is exactly what I aim at in the project presented here. It can be roughly described as the search for a "natural" linguistic model, a model which is compatible with findings about language use and which can, apart from defining any "grammatical" (in the sense of grammatically correct) linguistic product as a result of the language user's competence, also explain various performance data, in particular those which seem to be aberrations from "grammatical" or well-formed constructions. Since in a project like this it is hardly feasible to discuss the assumptions made with regard to the interrelation/interaction of all the components in linguistic models, I found it necessary to restrict myself to a representative subgroup of them. Stimulated by the growing linguistic interest in the lexicon, by the increasing importance
attributed to it, and by the centrality of the syntactic component in most current linguistic models, I have focussed on the relationship specified between the lexicon and the syntax and on how it is reflected in individual linguistic models. The models I have chosen for this analysis are the functional ones by Dik (1989) and by Halliday (1985, 1994), the generative ones by Chomsky (1988, 1993, 1995a and b) (Government & Binding and Minimalist Program), by Bresnan (1982) (Lexical Functional Grammar), by Pollard & Sag (1994) (Head-Driven Phrase Structure Grammar) and by Diehl (1981) (Lexical-Generative Grammar), and the assumptions made with respect to the functioning of language by cognitive linguists, such as Deane (1992), Fillmore & Atkins (1992), Fauconnier (1994), Fauconnier & Turner (1996), Goldberg (1995), Lakoff (1987), and Langacker (1987, 1991a and b, 1999). The hypotheses made with regard to the lexicon-syntax interface vary considerably, with some even contradicting one another. In order to assess which of them are more plausible or "natural", which of them are not only elegant and supported by theory-internal facts, but are also in line with the natural procedures involved in speech processing, I compared the claims made in the respective linguistic models with those made in psycholinguistic ones. If one considers the relationship between lexicon and syntax, linguistic models can basically be divided into two groups. There are models which give priority to syntax, and there are models which give priority to the lexicon in the total arrangement of language. This means that either the syntax or the lexicon is considered the central and dominating component of language and the other components are described as being more or less dependent on it or as being secondary to it.¹
For psycholinguistic models of language comprehension, language production, or both (e.g., those developed by Forster (1979), Garrett (1980), Levelt (1989), Kempen & Hoenkamp (1987), Dijkstra & Kempen (1993), Dijkstra & de Smedt (1996b), Bock (1982), Handke (1995) and Frazier (1987, 1989, 1990)), or reflections on the lexicon-syntax interface as they appear in the research by Shapiro, Zurif & Grimshaw (1989), Trueswell, Tanenhaus & Garnsey (1994), MacDonald (1993), MacDonald, Pearlmutter & Seidenberg (1994), Pearlmutter & MacDonald (1995) and Marslen-Wilson (1989), the situation is more uniform. For language production, there is general agreement on the fact that the lexicon plays a central role in the processing procedures in that the syntactic structure of an utterance is considered to basically, and at least partially, evolve from the syntactic information stored in the lexical entries retrieved for its construction. Models of comprehension, however, differ in the way they assume the parsing process to work, reflecting the polarity found in linguistic models: One group of models describes the lexicon as the driving force in the parsing process, the other attributes priority to syntax in that the parsing process is initiated on the basis of word-category information before any other type of information (pragmatic, semantic, thematic) has been actually accessed.
The present book is meant to trace my search for a natural linguistic model which can give a plausible answer to the question of where and how exactly lexicon and syntax are assumed to meet: I analyse the competence models mentioned above as to what they claim with regard to the lexicon-syntax interface, and I measure the plausibility of their claims against findings from psycholinguistics and a number of performance data. In particular, I ask and try to answer the following two questions:
1. In what way are the selected linguistic models compatible with psycholinguistic assumptions about the lexicon-syntax interaction in language use?
2. How can the performance data I concentrated on, namely self-repairs, overlaps and lexical patterns, be explained by the linguistic models under analysis?
¹ This is not to be equated with the debate about autonomy or modularity as it is going on in the fields of cognition in general and language in particular.
For that purpose, I first assemble what psycholinguistic models assume with regard to the interaction between syntax and the lexicon, finishing up with a summary of those claims that I strongly support and adding a few aspects that I consider important to my argumentation. Secondly, I scrutinize the linguistic models mentioned above, focussing on what assumptions about the syntax-lexicon interface they
make or allow for, and considering in what relation that stands to the psycholinguistic claims. Then I go on to describe the performance data I drew on in order to see how they relate to the claims psycholinguists make with regard to the lexicon-syntax interaction in language processing, and in order to further evaluate the "naturalness" or "plausibility" of linguistic models: I analysed reformulations or rather self-repairs to find out what the mechanisms of their production are, and whether these mechanisms are provided for by the general design of the linguistic models under discussion. The data were derived from conversations recorded in the London-Lund-Corpus (LLC). Secondly, I analysed overlaps, i.e., moments in a conversation at which both interlocutors speak simultaneously, as to what they can tell us about language comprehension and whether the procedures involved are explicable by linguistic models. These data were extracted from the British National Corpus (BNC). A third type of performance data I took from corpus-linguistic research results, in particular from the discovery of not only syntactic, but also an impressive number of lexical patterns in language use. Drawing on what corpus linguists have revealed about lexical patterning, I once again ask whether the linguistic models under discussion can sufficiently account for this phenomenon, and which psycholinguistic claims it can be taken to support. Since all three types of performance data can give evidence regarding the lexicon-syntax interaction only via the interpretation by the analyst, I thought it important and necessary to look for more "objective" experimental evidence for the claims I make and support. That is why I designed and carried out an experiment which is meant to reveal information about the cognitive status of lexical patterns, in particular of collocations. 
The final step in my argumentation is to check and evaluate the linguistic models discussed in the light of the psycholinguistic findings and generalizations. They will also be evaluated with respect to their capability of covering and explaining the phenomena found in the analyses of performance data, i.e., I will end up discussing which of the numerous models and assumptions presented assigns to lexicon and syntax the appropriate places in the total structure of language.
Chapter Two Grounding and definitions
2.1. At the core of language: lexicon and syntax
Language is commonly understood as simply consisting of a vocabulary and rules and regularities for the combination of its elements into larger units, i.e., phrases, clauses and sentences. This general understanding is evident, among other places, in the description of "language" in the Cambridge Encyclopedia of Language (Crystal 1987). Crystal's survey contains a number of definitions, two of which are presented here for illustration. (Part of) the dictionary definition of language he selected reads as follows: "the words, their pronunciation, and the methods of combining them used and understood by a considerable community and established by long usage." (Gove 1961: 1270) Chomsky, whose definition (1957: 13) sounds much more technical, though it actually provides less information than the first, is quoted as representing the views of one of the specialists dealing with the subject: "From now on I will consider a language to be a set (finite or infinite) of sentences, each finite in length and constructed out of a finite set of elements." Thus, the "common" view is not only indicative of what the layman understands a language to be, but is also reflected in some very general descriptions and definitions given by linguists. In the latter, the two constituents, the words and the combinatory rules, are usually identified by the terms "lexicon" and "grammar", though there is no general agreement on this. Since "grammar" can be understood in a narrow and a broad sense, with the first referring to what can be more specifically called "morphology" and "syntax" and the latter referring to the language system as a whole, it is just as common, and perhaps more exact, to use the term "syntax" for the combinatory component of language.
The use of "grammar" in the former sense is to be found, e.g., in The Encyclopedia of Language and Linguistics (Asher & Simpson 1994), where Humphreys gives the following definition:
The well-formed utterances or sentences of a language are specified by two components: the grammar, which is a set of general rules for combining and ordering word classes in the language, and the lexicon, which lists everything which is not in itself a general rule. The grammar is about linguistic generalities; the lexicon is about linguistic singularities. (Humphreys 1994: 2192)
With regard to "grammar" in its wider sense, one could take both components, lexicon and syntax, to be at the core of a language, with other components, such as stylistics, pragmatics, sociolinguistics etc., superimposed on them. Bearing in mind that the subdivisions just made are artificial, since language functions in its totality and is separated into subparts only to make its analysis and explanation feasible, I will now set out to analyse the interrelation between the two core components. As a prerequisite, I shall first comment on what is generally understood by each of them.
2.1.1. Lexicon
The term "lexicon" will be used here in only one of its common meanings, namely in the sense of "vocabulary" or "word-stock" of a language. In this use it is opposed to the second meaning, the one the ordinary language user commonly thinks of first: the sense of "dictionary", or "vocabulary of a language as it is arranged in a dictionary", where the arrangement may follow various criteria. These may be, for example, the alphabet (as in the "typical" dictionaries), the meanings to be expressed (as in a thesaurus or other onomasiologically oriented dictionaries), or topics (as in terminological dictionaries), to name but a few. The two different, though related, senses of "lexicon" are also reflected in the numerous definitions of the term that have been given from a linguistic point of view. Naturally, the survey given here cannot be exhaustive, and, for the sake of brevity, I will have to concentrate on
what can be found in some of the relevant linguistic encyclopedias. Moreover, our selection is also influenced by the perspective adopted here. The definitions assembled in the following are meant to list important characteristics of the sense under investigation, and I essentially do not disagree with the views represented and the claims made by them. Bußmann (1990: 456) defines "lexicon" in a most general way, thus also allowing for the second reading: "Lexikon ... Im allgemeinsten Sinn: Beschreibungsebene, die den Wortschatz einer Sprache insoweit kodifiziert, als seine Formen und Bedeutungen nicht aus allgemeinen Regularitäten des Sprachsystems ableitbar sind." [Lexicon ... In the most general sense: the level of description that codifies the vocabulary of a language insofar as its forms and meanings cannot be derived from general regularities of the language system.] Her definition implies that the lexicon contains only those items of the vocabulary that have idiosyncratic properties with respect to their forms and/or meanings. This understanding of "lexicon" denies the motivated word formations as well as the inflectional forms of a lexical item a place in the lexicon.² At the same time, it is not explicit about the particular features that can be taken as specified for each item listed, given that the information contained is supposed to enable the speaker to use the item correctly once he has acquired it. This sort of information is to be found in a separate entry in which Bußmann specifies the term as it is understood in one of the major paradigms of the last 40 years, in generative (transformational) grammar: There the lexicon is defined as part of the basic component of the grammar, and the characteristics that make up a lexical entry consist in a list of phonological features to which specific syntactic features are assigned (cf. Bußmann 1990: 456). Chomsky (1988: 5) is more specific with regard to the features that go into each entry: "The lexicon specifies the abstract morphophonological structure of each lexical item and its syntactic features, including its categorial features and its contextual features." A similar definition, being as theory-specific as the latter two, is given by Lyons:
The lexicon lists, in principle, all the lexical items of the language and associates with each the syntactic, semantic and phonological information required for the correct operation of the (phrase-structure) rules. (Lyons 1970: 125)
² This agrees with early generative assumptions with regard to the character of the lexicon, which were revised by Chomsky's lexicalist hypothesis (Chomsky 1970). With this hypothesis, Chomsky places the establishment of a relationship between a word and its derivatives into the lexicon, implicitly claiming that syntax is blind to morphology (cf. Zwicky 1992: 11): Hence, the lexicon contains also derived words, and the particular form needed for the construction of a sentence is determined by the phrase type the head of which it is meant to become (see also Sproat 1992: 335). In the Government-and-Binding Model (Chomsky 1988), lexical items are assumed to project their syntactic (and semantic) features into the syntax, thus minimizing the importance of phrase-structure rules (see section 4.3.1). The "extreme" position that all morphologically complex words are contained in the lexicon is held by Lexical Functional Grammar (Bresnan 1982) (see also section 4.3.2).
Humphreys's definition (1994: 2193), too, reflects main-stream linguistics of the last 40 years when he states for "Formal Grammar" that: "...the lexicon is the repository of basic items on which grammar rules operate (words) together with word-related constraints on the free operation of those rules (see X-bar syntax...)". Lewandowski's definition (1976: 674) is meant to be more theory-neutral and reads as follows:
Lexikon... Die Gesamtheit der Wörter bzw. der Wortschatz einer (natürlichen) Sprache im Sinne des internalisierten Wissens des Sprachteilhabers von den lexikalischen Eigenschaften der Wörter/Lexeme (phonologisch-phonetische, orthographisch-graphematische, syntaktische und semantische Informationen). [Lexicon... The totality of the words, or the vocabulary, of a (natural) language in the sense of the language user's internalized knowledge of the lexical properties of the words/lexemes (phonological-phonetic, orthographic-graphematic, syntactic and semantic information).]
The definitions just quoted attribute quite an amount of information to a lexical entry, which a speaker is supposed to know as soon as he has acquired this particular item of the lexicon. The information contained in a lexical entry covers (almost) every aspect of knowledge needed by the language user for the verbalization of his intentions and for the translation of sound into meaning. This is information about:
- the meaning (concept(s) designated by the particular item),
- its syntactic category (word class),
- its grammatical features (e.g., number, person, tense, etc.),
- its morphological classification (morpheme structure),
- its derivational morphology (i.e., assignment of the compatible affixes),
- its subcategorization (i.e., configurational information),
- its predicate-argument structure (i.e., thematic information),
- the cases (of its possible arguments), and
- register (style).
Thus, due to the fact that knowing a word also implies knowing about its use, the speaker/hearer will be heavily constrained as to the structures and forms he may choose or expect when constructing or comprehending an utterance. Certainly, in the course of language acquisition, the native speaker of a language will also have to find out how all this information of a lexical entry is "disguised" in this particular language and how it is used. That means that he will have to generalize and abstract from experienced particular instances of word usage, almost exclusively from speech input, what the concept(s) named by a word is/are and what the combinatory or the appropriateness rules of his native language are. So, at a certain age, the native speaker will naturally have semantic, structural (syntactic), stylistic, pragmatic knowledge as such, perhaps also in the form of "autonomous" rules, but he does not normally use this knowledge separately and, what is more, all this knowledge is present in his mind as soon as a lexical entry is activated from his mental lexicon. This amounts to recognizing that it is extremely difficult to draw a dividing line between lexicon and syntax, and it implies that, for determining the relationship between the two, it will not be sufficient to analyse and interpret linguistic models of the language system ("langue") or of the language user's competence ("competence", "I-language"), but that one will have to consider the assumptions and the data provided by the research into language processing and language acquisition as well. In these areas of psycholinguistics, the lexicon and its component parts have been a constant object of enquiry, be it with regard to their acquisition, storage, access, or retrieval, or their processing.
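The kinds of information listed above for a lexical entry can be pictured as a single record. The following is only a minimal sketch under stated assumptions: the class name, the field names, the helper `licenses`, and the toy entry for "give" are all hypothetical illustrations and are not taken from any of the models or dictionaries cited here.

```python
from dataclasses import dataclass, field


@dataclass
class LexicalEntry:
    """Hypothetical record: one field per kind of information in the list above."""
    form: str
    meaning: str                       # concept(s) designated by the item
    category: str                      # syntactic category (word class)
    grammatical_features: dict = field(default_factory=dict)  # e.g. number, person, tense
    morphemes: tuple = ()              # morphological classification (morpheme structure)
    affixes: tuple = ()                # derivational morphology (compatible affixes)
    subcategorization: tuple = ()      # configurational information
    argument_structure: tuple = ()     # predicate-argument structure (thematic information)
    cases: dict = field(default_factory=dict)  # cases of the possible arguments
    register: str = "neutral"          # style


def licenses(entry, complements):
    """Return True if the complement categories match the entry's subcategorization frame."""
    return tuple(complements) == entry.subcategorization


# Invented toy entry, for illustration only.
give = LexicalEntry(
    form="give",
    meaning="TRANSFER",
    category="V",
    grammatical_features={"tense": "present"},
    morphemes=("give",),
    affixes=("-er",),
    subcategorization=("NP", "NP"),
    argument_structure=("agent", "recipient", "theme"),
)

print(licenses(give, ["NP", "NP"]))
print(licenses(give, ["PP"]))
```

The point of the sketch is merely that activating one entry makes all of this information available at once, so that the entry itself already constrains which structures may follow it.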
The inclusion of these aspects in the concept of "lexicon" is made explicit by a more specific term used for the designation of the lexicon, namely the use of "mental/internal lexicon" instead. The term is also given separate entries in most linguistic dictionaries. Generally speaking, the "mental lexicon" can be considered to be the internalized knowledge of the properties of words.
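Bußmann's definition (1990: 480) reads as follows:
Teilkomponente der Grammatik, in der Informationen über einzelne Wörter/Morpheme gespeichert sind, die bei Sprachproduktion und Sprachverstehen abrufbar sind. Zu diesen Informationen zählt das Sprecher/Hörerwissen über phonetisch-phonologische Form, morphologische Struktur, semantische Repräsentation und syntaktische Regularitäten... [Subcomponent of the grammar in which information about individual words/morphemes is stored that can be retrieved in language production and language comprehension. This information includes the speaker's/hearer's knowledge of phonetic-phonological form, morphological structure, semantic representation and syntactic regularities...]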
This formulation already indicates currently open questions as to the form in which the lexicon is stored.³ Is it words, or is it morphemes, or, from a perspective of parallel distributed processing (for details see section 3.2.1, pp. 27-30), does the lexicon, at a microstructural level, exist "merely" in the form of activation patterns distributed over particular units at the levels of orthographic, phonetic, and semantic knowledge about the words? (For a discussion of the "standard" views as against the assumptions of parallel distributed processing see, for example, Neumann 1990: 174-176.) Lewandowski uses the term "internal lexicon" and defines it as a model that has been constructed about the internal representation of lexical items in the semantic memory. The latter he claims to contain the language user's subjective knowledge of the meaning(s) and the use(s) of a linguistic sign, and about the way language users gain access to lexical information in speech production and perception (cf. Lewandowski 1976: 482). In the psycholinguistic literature, the term is ubiquitous and so basic that it is not always defined explicitly. I will illustrate its reading by a few examples. Handke, who sets out to analyse and describe the lexicon as the central component of natural-language processing, defines the (mental) lexicon as follows:
A lexicon ... is the central module of a natural language processing system ... It closely interacts with the other components of the language processor and provides detailed information about the words to be produced or comprehended. (Handke 1995: 50)
The items contained in the lexicon, the lexical entries, he assumes to be specified with regard to phonological/graphological, morphological, syntactic and semantic aspects, which - depending on the mode of language use, viz., production or comprehension - are made available in different ways (cf. Handke 1995: 68). Schreuder & Flores d'Arcais (1989: 409) describe the "mental lexicon" to stand for the store of all our knowledge related to words. We will assume here the current view of the mental lexicon as the important relay station connecting certain specific sensory events or motor (output) patterns with mentally represented knowledge structures.
³ Other definitions also show this indeterminacy: For the lexicon in Formal Grammar, Sproat summarizes that: "[t]he inventory of words or morphemes of a language is the LEXICON." (Sproat 1989: 335). For a discussion of what is listed in the mental lexicon see Hankamer 1989: 392-408.
From the point of view of language production, Levelt defines the mental lexicon as a language user's store of information about the words in his language. As such it contains information about all the lexical items he knows. When a lexical item is retrieved from the mental lexicon (in the productive mode), this is done on the basis of its meaning, but in addition to the meaning, it contains syntactic, morphological, and phonological information (cf. Levelt 1989: 6). Roelofs, basically drawing on Levelt's description, elaborates that a lexical entry's lemma, which does not contain form-related information, is a representation of the meaning and the syntactic properties of a word. In addition, it also contains functional information, i.e., information on the mapping of thematic arguments on syntactic functions. Via the access of an entry's lemma, morphological and phonological information contained in the "form lexicon" also becomes available (cf. Roelofs 1996: 310). Another aspect, which is implicitly contained in the meaning information specified in the previous definitions, is made explicit in Kess's definition (1992: 80-81), namely the assumption that information about the relationships to other lexical entries is also available with any entry. Moreover, it also contains hints at possible mechanisms involved in the recognition of items of the lexicon:
The mental lexicon is your mental dictionary, that vast compendium of information about words and their relationships that you carry about in your head (...). Like the dictionary on your bookshelf, it too is organized along principles which reflect the phonological, orthographic, and semantic characteristics that words share. But in searching through the mental lexicon as we attempt to place a word, we note that the process of word recognition is sensitive to other characteristics as well, characteristics like word frequency and the effects of context.
This rather comprehensive understanding of "mental lexicon" will be the basis for further considerations in chapter 3, where the point at issue is how the information contained in the lexicon interacts with our general knowledge of syntax in language use.
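The division reported above from Levelt and Roelofs, between a lemma that carries meaning and syntactic information and a separate "form lexicon" that is reached via the lemma, can be illustrated with a toy two-step lookup. This is only a sketch under stated assumptions: the class names, the two dictionaries, the function `retrieve`, and the entry for "sleep" are hypothetical and do not reproduce either author's formalism.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Lemma:
    """Meaning and syntax only; deliberately no form-related information."""
    lemma_id: str
    concept: str
    category: str
    functions: tuple   # pairs mapping thematic arguments onto syntactic functions


@dataclass(frozen=True)
class Form:
    """Form-related information, kept in a separate store."""
    morphemes: tuple
    phonemes: str


# Hypothetical toy stores; keys and values are invented for illustration.
LEMMAS = {
    "SLEEP": Lemma("sleep_v", "SLEEP", "V", (("experiencer", "subject"),)),
}
FORM_LEXICON = {
    "sleep_v": Form(("sleep",), "/sli:p/"),
}


def retrieve(concept):
    """Step 1: select a lemma by its meaning. Step 2: reach the form via that lemma."""
    lemma = LEMMAS[concept]
    form = FORM_LEXICON[lemma.lemma_id]
    return lemma, form


lemma, form = retrieve("SLEEP")
print(lemma.category, dict(lemma.functions), form.phonemes)
```

In this sketch the syntactic category and the argument mapping are available as soon as the lemma has been selected, before any form information is consulted, which is the ordering the production-oriented descriptions above assume.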
2.1.2. Syntax
It is even more difficult to find a theory-neutral definition of syntax than one of the lexicon. Syntax, traditionally determined as the theory of sentence construction, is generally defined along the same lines even now. What has sometimes been added are explications of probable mechanisms involved and criteria effective in it. Crystal's encyclopedia contains one such very general definition of syntax:
Syntax is the way in which words are arranged to show relationships of meaning within (and sometimes between) sentences. The term comes from syntaxis, the Greek word for 'arrangement'. Most syntactic studies have focused on sentence structure, for this is where the most important grammatical relationships are expressed. (Crystal 1987: 94)
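Bußmann (1990: 766) is more explicit about the elements and procedures that play a part in the construction of sentences when she defines one sense of syntax as:
Teilbereich der Grammatik natürlicher Sprachen (auch: Satzlehre): System von Regeln, die beschreiben, wie aus einem Inventar von Grundelementen (Morphemen, Wörtern, Satzgliedern) durch spezifische syntaktische Mittel (Morphologische Markierung, Wort- und Satzgliedstellung, Intonation u.a.) alle wohlgeformten Sätze einer Sprache abgeleitet werden können... [Subfield of the grammar of natural languages (also: the study of sentences): a system of rules describing how all well-formed sentences of a language can be derived from an inventory of basic elements (morphemes, words, constituents) by specific syntactic means (morphological marking, word and constituent order, intonation, among others)...]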
Abraham (1988: 855) adds the fact that by "syntax" we do not only understand the rules for combining words into phrases and sentences, but also some principles for describing these rules. In the subsequent sections, he speaks of "autonomous syntax" and "generative syntax", which is indicative of particular linguistic assumptions and thus no longer theory-neutral. However, we know that every attempt to define a more or less theoretical term will necessarily reflect the assumptions made by the model whose beliefs the "definer" shares. That is why I will - just as it is intended for the understanding of "lexicon" - take into consideration the definitions offered by the linguistic models under discussion in chapter 4.
In the psycholinguistic literature, one encounters definitions of "syntax" only very occasionally. In descriptions of syntactic processing, terms such as "parsing", "syntactic analysis", "syntactic frame", "syntactic ambiguity resolution" etc., will be met, but they all presuppose a general understanding of what "syntactic" or "syntax" is. And this seems to be exactly the understanding that the particular linguist in question has. Handke (1995: 5) uses the term syntax to denote the study of sentence structure. From the point of view of language acquisition, Clark (1995: 318) explains what knowledge of syntactic structures implies, namely the recognition of the systematicity of word combinations, of their contributions to meaning, and of the means by which they are marked, such as the order of constituents, morphological marking, intonation, etc. I take all these definitions to agree in what is important to our understanding of the term under investigation: Syntax describes the rules by which words combine in a verbal utterance, what their contribution to the utterance meaning is, and the means by which the intended combinations are signalled or expressed. Whether the knowledge of these rules is separate from the knowledge we have about words and, what is more, whether the former can be considered autonomous is one of the questions that is still under general discussion, and I will take it up occasionally within the course of my argument. On the basis of these general readings of the terms "syntax" and "lexicon" and of the psycholinguistic readings of "mental lexicon" and "syntax" in particular, I can now set out to collect information on what psycholinguistic findings and generalizations predict with regard to the interaction of lexicon and syntax in language use.
Chapter Three Theories of language processing
3.1. The lexicon-syntax interface in performance models
Interest in the procedures involved in language processing has been lively for quite some time, with psychologists investigating - among other things - the relationship between language and cognition, and linguists constructing language models which they claim to be psychologically real. The intersection of psychological and linguistic research interests resulted in a new interdisciplinary science, that of "psycholinguistics", a field which is mainly concerned with the discovery of the procedures that are involved in language acquisition, language loss, language comprehension and production, and - as Kess (1992: 14) put it - "a field which depends in some crucial way on the theories and intellectual interchange of both psychology AND linguistics". Thus, both linguistics and psycholinguistics are centrally interested in the phenomenon of human language, but they analyse their common research object from different perspectives and with different aims in mind. Linguistic models (as they are described below) are meant to describe what human language is like, what elements it consists of, and what the principles are for combining these elements into larger units. In structuralist terms, this is what makes up the language system ("langue"); in generative terms, this aspect of language is referred to by the term "competence", which is to be understood as the native speaker's internal knowledge of his language (the "steady state", cf. also section 4.4). Psycholinguistic models, on the other hand, aim at describing how this knowledge of one's native language is put to use. The association with such concepts as language use ("parole") or "performance" respectively becomes obvious right here. Moreover, from the psycholinguist's point of view, speech/language use is considered to be
informative regarding the character of cognition in general; it is considered a window to the nature and structure of the human mind (cf. Scovel 1998: 4). Most psycholinguistic enterprises try to find out what is going on when language is used in communication, that is, when it is produced, or when it is comprehended. These two main activities involved in language use are commonly summarized under the term "language processing". Analysing language processing, psycholinguists also consider whether the processes and the representations assumed are compatible with the inventory of elements and principles suggested by the various linguistic theories; they may even start out from the latter to develop their own models, as is done by, e.g., Frazier (1995). Apart from that, they also take a keen interest in how language is acquired, i.e., how a child finally manages to master the language of the speaker community into which it is born, and what the individual stages in this process are. Last, but not least, there is also considerable interest in language loss, that is, in the phenomena of language decay in an individual due to illness, accident and/or old age. In my search for cues for the relationship between the lexicon and syntax of a language I will concentrate on what psycholinguists have found out about language use. I will neglect what the findings about their interaction(s) in language acquisition are and what might be concluded from phenomena related to language decay. As to models of language use, there are some general surveys or overall sketches available, which try to incorporate everything that is possibly involved in the translation of thought into verbal utterance and vice versa.
These are complemented by more detailed elaborations of individual facets of the whole process (in the one or the other direction), such as speech perception, especially segmentation and perception of auditory units (cf., e.g., Cutler 1989; Nygaard & Pisoni 1995), lexical access (cf., e.g., Forster 1976, 1989, 1990; Seidenberg 1990; Roelofs 1996), phonological encoding (cf., e.g., Dell & Juliano 1996), or articulation (cf., e.g., Fowler 1995), to name but a few. As the topics already indicate, these partial processes involved in language processing are analysed and described either for the comprehensive mode or for the productive one. This is due to the fact
that profound differences are assumed and have been recognized to exist between the two. Apart from that, assumptions about the functioning of language also differ with regard to the question of whether the ability to understand and speak is just one specification of, or can be derived from, general cognitive abilities, or whether language is a particular module of the mind with its own specific structure, representations, and procedures, that is, whether language is self-contained and independent of other parts of the cognitive system (cf. also footnote 4). All in all, this means that there is a rich diversity of models, and I by no means aim at a comprehensive survey of the state of the art. However, for a better understanding of how the procedures involved in language processing can be assumed to interact, I think it helpful to present two suggestions about "the language-user framework" and "the architecture of a natural-language processing system". These were made by Dijkstra & de Smedt (1996) and Handke (1995) respectively, who, for their part, also draw on ideas proposed by Bock, Levelt and Kempen:
[Figure: three-column diagram. Left column (comprehension): signal recognizer, word recognizer, sentence parser, discourse comprehension. Middle column (knowledge sources): conceptual memory, syntax, lexicon & morphology, phonology. Right column (production): discourse planning, grammatical encoder, phonological encoder, articulator/motor control. Further labels in the diagram: conceptual system, formulator.]
Fig. 1. The language-user framework (source: Dijkstra & de Smedt 1996: 16)
This framework allows for the description of the production as well as of the comprehension of an utterance, with the arrows indicating the directions. When producing an utterance, the speaker starts out from his intention, i.e., he conceptualizes what he wants to express. In order to grammatically and phonologically encode and articulate his message (cf. the right column), he then draws on his linguistic knowledge and on his knowledge of the world (cf. the middle column). The framework does, however, not spell out the details of the exact procedures involved in the encoding of a message. Thus, for the field of our special interest, the grammatical encoding of a message, it remains undecided in what particular way the speaker uses his "conceptual system", his knowledge about, say, the lexicon, syntax, or phonology. The framework merely specifies that the "encoder" does make use of it. As for the comprehension of an utterance (cf. the left column), the framework informs us about the general algorithm from signal recognition, via word recognition and sentence parsing to the extraction of the meaning, which commonly corresponds, or rather should do so, to the speaker's intended message. Once again, the details are left unspecified. The reasons for which this framework is not expressive with regard to these details may be that it would simply be less clear if all the possible connections and interactions between the stipulated elements and procedures had been indicated and, what is more, that there is no general agreement on some of those. Handke's illustration of the architecture of the natural-language processing system (shown in figure 2 below) contains some more detailed information about how the individually listed parts probably act in combination. 
It makes explicit a number of assumptions about the course of the individual processes and procedures involved in producing and comprehending language: The "hollow" arrows indicate the sequence of procedures in language processing, the others - the assumed interactions between parts (elements and procedures) of the system. Moreover, this view of language processing also projects some compartmentalization onto the overall processing mechanism, resulting
in the three segments of "conceptualizing", "linguistic processing" and "low-level processing".
[Figure: diagram divided into three segments. Conceptualizing: conceptualizer, knowledge base, encyclopedia, situation knowledge, model construction, pragmatic interpretation, message generator, monitor. Linguistic processing: production system (grammatical encoding, phonetic planning), lexicon (lemma lexicon, form lexicon), comprehension system (semantic interpretation, parsing). Low-level processing: output system (articulation, writing), overt speech (interlocutor's speech, written language), input system (acoustic analysis, visual analysis).]
Fig. 2. The architecture of a natural-language processing system (source: Handke 1995: 35)
It also commits itself to a more or less modular view of processing language in that it posits subparts which operate on one particular type of input only, thus also allowing for the flow of information in only one direction, whereas the framework proposed in the first illustration
(Figure 1) deliberately leaves this issue open (cf. Dijkstra & de Smedt 1996: 16). In the following, I will separately consider the mechanisms assumed for language production and comprehension, also touching upon the general question of the modular or non-modular architecture of the respective models.
3.2. Models of language production
3.2.1. An overall survey
Language production has been analysed less comprehensively than language perception, one reason being that it is very complicated to know or find out exactly what a speaker's intended message of an utterance is and to test or influence this experimentally, whereas speech comprehension allows for manipulation of an utterance and for experimentally testing the consequences for the understanding of the message (cf. Keller & Leuninger 1993: 208; Dell 1986: 283; Garrett 1980: 177-178; Bock 1995: 206). Nevertheless, quite a number of language production models have been developed, all of which are derived from the analysis of performance data, predominantly speech errors of all kinds. They basically fall into two groups: models which consider the production process to be a serial procedure of individual steps (e.g., Fromkin 1971; Garrett 1975, 1980; Cooper 1980; Levelt 1989; Keller & Leuninger 1993; Pechmann 1994), and those considering production to be a procedure of interactive processes (e.g., Dell & Reich 1981; Stemberger 1982, 1985; Dell 1986). A short survey of the essential features of both kinds of models follows.
Serial models commonly assume three or five levels of processing. The three-level models comprise the levels of:
1. conceptualization, i.e., the combination of thoughts/concepts and intentions into the (preverbal) message to be transferred,
2. formulation, i.e., the grammatical and phonological encoding or the transfer of the message into a lexico-syntactically and morpho-phonologically specified form, which results in a phonetic/articulatory plan,
3. articulation, i.e., the conversion of the latter form into a sound form, which is controlled by the muscles of the speech organs and results in overt speech (cf. Levelt 1989: 9-14).
For the five-level models, the following levels are specified:
1. conceptualization,
2. encoding on a functional level, i.e., the specification of a sentence frame with regard to the semantically relevant constituents (such as the theta-roles/deep cases/transitivity structure (cf. chapter 4)), their word categories, and their grammatical functions,
3. encoding on a positional level, i.e., the specification with regard to the positions of the constituents and their attendant grammatical formatives,
4. encoding on the phonological level, i.e., determination of phonetic details, and
5. articulation (cf. Garrett 1975: 176, 1988: 78; Keller & Leuninger 1993: 218).
As can easily be seen, the differences merely follow from a more detailed description of what the assumed procedures of "translating" the preverbal message into an articulatory plan are. Thus, for all the serial models, the first stage of speech production is the conceptual level, at which concepts, ideas/thoughts, intentions are arranged so that they result in a message. At the next level, the formulation/encoding, the individual models differ: does the message first initiate a syntactic frame (specifying syntactic functions, the predicate-argument-structure) or the selection of lexical items or both at the same time, in other words is the formulation process driven by phrase structure or by the lexical entries needed for the expression of the intended message or by both in cooperation?
These distinctions show in how exactly the formulation processes are spelled out, that is, they are directly depicted in the general claims of a five-level model, or they need to be further elaborated for the "formulation" in a three-level model.
Garrett and Keller & Leuninger, for example, postulate a functional level ("prädikative Ebene") at which both the selection of the lemmas of lexical items and the construction of a functional structure are localized, whereas Fromkin and Cooper locate lexical selection/insertion at stage 3, after the specification of syntactic structure at stage 2. At the third level of the former models, which is called the positional level ("positionale Ebene"), the morphophonological forms of the lemmas and the positional frame of the sentence to be produced become available. The final level in the production process is the sound level, where phonetic details are specified and commands are sent to the muscles of the vocal apparatus, which for their part initiate articulation. For the point in dispute, i.e., the places at which the lexical entries needed and the syntactic structure of the utterance come into play, Cutler (1995: 119) reports the situation to be as follows: "It has been argued that syntactic formulation precedes lexical selection (Fromkin 1971), follows it (Bierwisch & Schreuder 1992) or operates in parallel to it (Bock 1982)." This crystallization of views is also reported by Garman (1990: 414). Allowing for both serial and parallel positions, he specifies:
...given that the message level controls both lexical access and syntactic structuring, it could be that either the one is dependent upon the other, or that the two processing hierarchies interact, with lexical decisions affecting syntactic choices, and vice versa.
The assumption of parallel action of processes or that of mutual influence between several processes was not made from the very first days of research into language comprehension and production. On the contrary, most of the serial models considered speech production as a strict top-down process, where there is feed forward only and no feedback to previous levels or stages of the procedure. Models which postulate components that operate in a strictly serial way (that is, in a top-down manner in language production and in a bottom-up manner in language comprehension, where lower level representations are the only input for the construction of higher level representations) and do so independently of one another are known by the terms of "noninteractive" or "autonomy" models (cf. Garnham 1985: 186). An even more important assumption of strictly autonomous models is that specified here for comprehension models -
high-level decisions cannot be used to influence the computations that take place at low levels of representation. Thus, the incoming waveform is translated into an acoustic-phonetic representation at the first stage of processing (...) [in language comprehension D.S.]. This representation is used to access the mental lexicon. Lexical information drives processing within a syntactic module. Finally, the output of the syntactic processor is used to begin message-level computations (...) (Lively et al. 1994: 280)
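The strictly serial, feed-forward arrangement described in this quotation can be pictured as a chain of functions, each of which sees only the output of the stage before it. The following Python sketch is purely illustrative: the stage functions, the two-word toy lexicon, and the one-pattern "parser" are hypothetical stand-ins and are not taken from any of the models cited.

```python
# Toy illustration of a strictly serial (autonomous) comprehension pipeline.
# All names and data here are hypothetical stand-ins for illustration only.

TOY_LEXICON = {
    "dogs": {"cat": "N", "sense": "DOG"},
    "bark": {"cat": "V", "sense": "BARK"},
}

def acoustic_phonetic(waveform):
    """Stage 1: turn the raw signal into a segmental representation."""
    return waveform.lower().split()

def lexical_access(segments):
    """Stage 2: look each segment up in the (toy) mental lexicon."""
    return [TOY_LEXICON[s] for s in segments]

def syntactic_processor(entries):
    """Stage 3: build a structure from lexical categories alone."""
    cats = [e["cat"] for e in entries]
    if cats != ["N", "V"]:
        raise ValueError("toy parser only knows the pattern N V")
    return {"subject": entries[0], "predicate": entries[1]}

def message_level(tree):
    """Stage 4: compute a message from the syntactic output."""
    return f'{tree["predicate"]["sense"]}({tree["subject"]["sense"]})'

def comprehend(waveform):
    # Each stage receives only the previous stage's output; no later
    # stage can feed information back to an earlier one.
    representation = waveform
    for stage in (acoustic_phonetic, lexical_access,
                  syntactic_processor, message_level):
        representation = stage(representation)
    return representation

print(comprehend("Dogs bark"))
```

The point of the sketch is the shape of `comprehend`: information flows in one direction only, so a decision taken at the message level has no way of influencing lexical access or parsing.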
But the further analysis of, and the quest for, the explanation of speech errors, such as alternative and competing plan errors, have led to changes in these conceptions, especially with regard to assumptions about parallelism of processing procedures. These modifications are reflected, for example, in Butterworth's, Garrett's, and Levelt's models: Butterworth (1982: 102-103) assumes parallel operation of the processes of the selection of syntactic structures, intonation contours and lexical items, whose outputs then merge again in what he calls the "phonological assembly system". Garrett (1988: 90) allows for "some form of parallelism in the processing scheme" as well, specifying this possibility for the level of message representation. The evidence he draws on is the occurrence of competing plan errors, such as "Please turn off the flower", uttered while looking at a flower pot on top of the TV set with the intention to say "Please turn off the TV" (cf. Garrett 1988: 92). Levelt (1989: 235) assumes the processes of accessing lemmas and building syntactic structures, which he calls "Grammatical Encoding", to be lexically driven and incremental in their operation, thus basically following Kempen and Hoenkamp's (1987) model of an "Incremental Production (or Procedural) Grammar". This sort of "parallelism of action" allows for the simultaneous operation of all the processes involved in the formulation of an utterance, with the proviso that they "manipulate" different parts of it. The assumption of parallel or incremental operations in the production of speech in the above-mentioned models does, however, not deviate from the "modularity hypothesis"⁴, which claims cognitive (thus also linguistic) processing to be divided among autonomous subsystems⁵ (cf. Tanenhaus, Dell & Carlson 1987: 83). The subcomponents involved in language processing are assumed to be "blind to each other's internal states and operations and ... [to] communicate only at their input and output stages" (Tanenhaus, Dell & Carlson 1987: 84). This assumed blindness or encapsulation of information in the distinct linguistic subcomponents, such as syntax, semantics, lexicon, amounts to the claim that the processes in one component do their processing without drawing on information being processed in the other components.
⁴ The "modularity hypothesis" follows from a modular conception of the mind, where the mind is conceived of as a complex system of subcomponents or modules with specific tasks and abilities, which work on particular input. This conception is closely associated with Fodor (1983) and is broadly discussed in Garfield (1987: 1), who summarizes the essential claims of the so-called modularity hypothesis: "The mind is not a seamless, unitary whole whose functions merge continuously into one another; rather, it comprises - perhaps in addition to some relatively seamless, general-purpose structures - a number of distinct, specialized, structurally idiosyncratic modules that communicate with other cognitive structures in only very limited ways. According to this hypothesis, these modules include, roughly, input systems (including certain components of the perceptual systems and of the language-understanding system) and certain components of the output systems (including processes involved in motor control and language production). The hypothesis contrasts these modules with the presumably nonmodular structure of, for example, long-term memory, or the cognitive structures underlying general knowledge." Fanselow & Felix (1987: 173) formulate: "Die Modularitätsthese besagt, daß das menschliche Kognitionssystem modular aufgebaut, d.h. aus einer (finiten) Anzahl von eigenständigen und unabhängigen Subsystemen (=Module) besteht. Jedes dieser Module hat seine spezifische Struktur und seinen spezifischen Aufgabenbereich." [The modularity thesis states that the human cognitive system is modular in structure, i.e., consists of a (finite) number of self-contained and independent subsystems (= modules). Each of these modules has its own specific structure and its own specific domain of tasks.] Hence, language is considered as a particular module of the mind and it is claimed to be modular in itself.
⁵ It becomes obvious here that the autonomy hypothesis is an ally of the modularity hypothesis. The former implies claims as to both the overall architecture of the mind and the language module in particular. It is generally assumed that the modules involved in cognition are independent of one another, or rather autonomous. This means: "... die interne Struktur eines Moduls ist nicht auf die interne Struktur irgendeines anderen Moduls reduzierbar. Ebensowenig gibt es ein „Supermodul“, aus dessen Prinzipien wiederum die internen Strukturen der einzelnen Module ableitbar sind. Kognitive Leistungen entstehen in der Regel aus der Interaktion zwischen verschiedenen Modulen, wobei diese Interaktion jedoch nicht die interne Struktur der Module verändert oder beeinflußt. Mit anderen Worten, Interaktion vollzieht sich auf der Ebene des Input/Output der verschiedenen Module, nicht jedoch auf der Ebene der modularinternen Verarbeitung (...)." (Fanselow & Felix 1987: 173) [... the internal structure of a module cannot be reduced to the internal structure of any other module. Nor is there a "supermodule" from whose principles the internal structures of the individual modules could in turn be derived. Cognitive achievements generally arise from the interaction between different modules, although this interaction does not alter or influence the internal structure of the modules. In other words, interaction takes place at the level of the input/output of the various modules, but not at the level of module-internal processing (...).] As for the language module in particular, it is one such module. Consequently, linguistic structures are understood as "autonomous from more general conceptual structures with the language faculty being its own special mental organ or module." (Gibbs 1995: 31) It is, above all, two of its subcomponents which have been found to be autonomous, namely syntax and phonology. As postulated by Fanselow & Felix (1987: 67), these components show regularities that cannot be detected in other knowledge systems and hence cannot be attributed to general cognitive abilities.
The acceptance of parallelism and mutual influence of processes in linguistic processing can, however, be made compatible with the modular conception of the mind in that one postulates informational encapsulation for the language module as a whole and makes no claims about the processing conditions within this module (cf. also Marslen-Wilson & Tyler 1987: 41). Another way is to postulate an "editing device" outside the production system (editing theories) or internal to the system (connectionist theories), which checks the output of one component against that of (higher-level) components that did their work prior to it (cf. Levelt 1989: 498). These checks may lead to an overruling or even revision of the original output of these higher-level components (in language production), thus also providing for feedback, or bottom-up effects.
The position opposed to the modular conception of the mind is taken by the proponents of a holistic view of the mind (cf. section 4.4). The latter allows for massively parallel and interactive cognitive processes, which, with regard to language, means that the language processor can be understood "as a highly interactive system in which different sources of knowledge communicate freely" (Tanenhaus, Dell & Carlson 1987: 84). This conception has led to the development of models of language processing, i.e., both comprehension and production, which are based on interactive accounts, that is, they assume a high degree of parallelism and interaction. Models of this type are the so-called interactive, connectionist, or spreading activation models. Because of their similarity to the neural networks in the brain they are also known by the term "neural networks".
The basic elements or primitives of such models are thought to be units or nodes representing some sort of linguistic notion, such as sound features, sounds, words, etc., and links or connections between them. The nodes form levels and there are activation values associated with them, which result from the input and from the activation of other nodes they are connected to. The connections represent paths via which
activation can spread from one unit to all the other units it is linked to. They are either excitatory or inhibitory, depending on what nodes they link: the connections between incompatible nodes (which are predominantly those between the nodes of one level) are inhibitory, those between mutually consistent nodes (of different levels) excitatory. The flow of activation is assumed to spread in either a parallel or an interactive way. The former assumption results in models in which the activation of simultaneously activated units or nodes spreads unidirectionally to the linked nodes until one node (the most plausible one) exceeds the threshold level, fires, inhibiting the competing nodes, and is thus chosen for the representation under construction. The exemplification of a procedure like this can be found in Morton's "logogen model" (1969) of word recognition. The latter assumption is incorporated into models in which activation spreads multidirectionally, so that all the nodes simultaneously activated pass on their activation to both lower and higher levels, that is, forwards and backwards. Activations between different levels exert a facilitatory effect, activations within one level - an inhibitory one, and finally the most plausible node, the best match, will be the one with the strongest activation, because this one was activated strongly enough to inhibit the competing nodes and drive their activations down. Forster explains the differences between the two procedures metaphorically as "first past the post" and "survival of the fittest" (Forster, personal communication). For concise surveys of the mechanisms assumed to apply here see, e.g., McClelland & Rumelhart 1981: 378-379; Stemberger 1985: 145-147; Forster 1994: 1307; Handke 1995: 44-45; Murre & Goebel 1996: 50-51.
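The contrast between the two selection procedures can be made concrete with a small simulation. The Python sketch below is a toy under stated assumptions, not an implementation of the logogen model or of any interactive activation model cited here: the candidate nodes, their bottom-up support values, and the threshold, rate and inhibition parameters are all hypothetical. Excitation from other levels is reduced to a fixed support value per node, and the inhibition of competitors that follows firing in the threshold procedure is not modelled.

```python
# Toy contrast between two ways of selecting one node among competitors.
# Node names, support values, and parameters are hypothetical.

def first_past_the_post(support, threshold=1.0, rate=0.1, max_steps=1000):
    """Unidirectional spread: each node accumulates its input until
    one of them crosses the threshold and is selected."""
    activation = {node: 0.0 for node in support}
    for _ in range(max_steps):
        for node, inp in support.items():
            activation[node] += rate * inp
        winner = max(activation, key=activation.get)
        if activation[winner] >= threshold:
            return winner
    return None

def survival_of_the_fittest(support, inhibition=1.0, rate=0.1, steps=200):
    """Interactive competition: nodes on the same level inhibit one
    another, so the best-supported node drives its rivals down."""
    activation = {node: 0.0 for node in support}
    for _ in range(steps):
        updated = {}
        for node, inp in support.items():
            rivals = sum(a for other, a in activation.items() if other != node)
            net = inp - inhibition * rivals
            # Keep activations within the interval [0, 1].
            updated[node] = min(1.0, max(0.0, activation[node] + rate * net))
        activation = updated
    return activation

support = {"cat": 0.9, "cap": 0.6, "cot": 0.3}
print(first_past_the_post(support))
final = survival_of_the_fittest(support)
print(max(final, key=final.get), final)
```

With these toy values both procedures select the best-supported node. They differ in how the choice comes about: in the first, the winner is simply the first node to reach the threshold, while in the second the competitors are actively suppressed until only the strongest node remains active.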
Examples of models based on interactive activation are Stemberger's "interactive activation model of language production" (1985), or Dell's "spreading-activation theory of retrieval in sentence production" (1986). For the production process in general they postulate a network of linguistic rules and units in which decisions about what unit or rule to choose are based on the activation levels of the nodes representing those rules or units (...). (Dell 1986: 283).
That means that language processing, here language production, is no longer considered to operate serially, where the output of a higher level
operation becomes the input of the next lower level operation with no feedback. Rather it is a cascading system; information is passed on to higher or lower levels as soon as it becomes available. All subunits of a higher unit are partially activated at the same time, so that they coexist during production. Different levels are interactive, so that they can mutually influence each other. (Stemberger 1982: 54)
The two models mentioned are indebted to McClelland & Rumelhart's "interactive activation model of context effects in letter perception" (1981), who developed their model to account for the interaction between knowledge and perception in visual and auditory word recognition. Since this model can be considered the classical forerunner of all the interactive models, I think it helpful to summarize here some of its basic tenets, especially those which might prove influential regarding our key question, the relationship between syntax and the lexicon. These are (specified for word recognition): 1. the assumption of levels within the processing system, 2. the assumption of parallel processing, namely the parallel processing of more than one unit at a time (called "spatially parallel") and parallel processing at several levels, and 3. the assumption of perception as an interactive process (cf. McClelland & Rumelhart 1981: 377). Although these assumptions are specified for word perception, they, nevertheless, are and have been attributed to production procedures as well. The third assumption seems to us the most important one, since it represents the one which most distinctly differentiates interactive from serial conceptions of (linguistic) processing:
... we assume that "top-down" or "conceptually driven" processing works simultaneously and in conjunction with "bottom-up" or "data driven" processing to provide a sort of multiplicity of constraints that jointly determine what we perceive. (McClelland & Rumelhart 1981: 378).
A particular variant of interactive models is represented by parallel distributed processing (or PDP) models. In such a model, linguistic notions of various complexity are represented as patterns of a number of activated nodes. That means that there is no direct correspondence between a linguistic unit (e.g., letter, phoneme, or word) and a particular node, as is assumed for the localist representations in the interactive activation models mentioned above, but that the unit's
representation consists of an activation pattern or rather is distributed over several nodes, hence the name "parallel distributed processing" (PDP). Moreover, such models are capable of learning by being trained on input-output patterns in that - on the basis of the output errors - the weights of the involved connections are adjusted so that input presentations finally produce the "correct" output. The majority of PDP-models designed have been implemented not for the total process of language processing, but for partial procedures only, one reason being that simulations of this kind would need very extensive nets of nodes which can, if at all, only be modelled by extremely powerful computers. Well-known examples of simulating linguistic processing by PDP-models are Seidenberg & McClelland's "model of word recognition and naming" (1989) or Dell, Juliano & Govindjee's "model of phonological encoding" (1993). Though the first models comprehension phenomena and the latter those of production, they do not differ that much: the mappings implemented are from letters to sounds and from lemmas to sounds respectively. So for illustrating the mechanism, either one will do. We will draw on Seidenberg & McClelland's model, where orthographic representations are to be transformed into phonological ones. This is realized via the interaction among the units that are part of the distributed activation patterns. The units of the two types of representation are not directly connected, but an additional layer of hidden units is included, in order to enlarge the processing capacities of the network. The mapping process itself can be described as follows:
In processing an input, units interact until the network as a whole settles into a stable pattern of activity - termed an attractor - corresponding to its interpretation of the input. Unit interactions are governed by weighted connections between them, which collectively encode the system's knowledge about how the different types of information are related. Weights that give rise to the appropriate transformations are learned on the basis of the system's exposure to written words, spoken words and their meanings. (Plaut 1997: 767-768)
At the initial state, the model has no weighted connections yet, that is, it does not know about the relations between letters and sounds. As a consequence of being repeatedly exposed to the orthographic
representation (input) and the corresponding phonological representation (the intended output), the model can learn them: When a letter string is presented to the model, it produces an output pattern on the phonological units. This is compared to the correct output pattern and the deviation from the latter is the basis on which learning occurs: The learning procedure adjusts the strength of all the connections in the network in proportion to the extent to which this change will reduce a measure of the total error. (Seidenberg & McClelland 1989: 527)
Grainger & Dijkstra (1996: 153) describe the learning procedure in its overall effect: During the learning phase, the network discovers (or rather codes in the weights on its connections) regularities in the mapping between the various codes [orthography, phonology, semantics D.S.]. The resulting 'experienced' model can then exhibit a rule-following recognition or production performance while at the same time accounting for exceptions.
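The error-driven learning just described can be sketched in a few lines of Python. The sketch deliberately simplifies: it uses a single layer of weights trained with the delta rule, whereas the model discussed above includes a layer of hidden units. The three input-output pattern pairs, the network size, the learning rate, and the choice of all-zero starting weights are hypothetical and serve only to show how repeated exposure reduces the total error.

```python
# Toy error-driven learning with the delta rule (no hidden layer).
# Patterns, sizes, and the learning rate are hypothetical.

# Each pair maps an "orthographic" input pattern to a "phonological" target.
PATTERNS = [
    ([1.0, 0.0, 1.0], [1.0, 0.0]),
    ([0.0, 1.0, 1.0], [0.0, 1.0]),
    ([1.0, 1.0, 0.0], [1.0, 1.0]),
]

def forward(weights, inputs):
    """Compute each output unit as a weighted sum of the input units."""
    return [sum(w * x for w, x in zip(row, inputs)) for row in weights]

def total_error(weights):
    """Sum of squared deviations between produced and correct outputs."""
    return sum(
        (t - o) ** 2
        for inputs, target in PATTERNS
        for t, o in zip(target, forward(weights, inputs))
    )

def train(epochs=200, rate=0.1):
    # Assumed initial state: all weights zero, i.e. no knowledge of the mapping.
    weights = [[0.0] * 3 for _ in range(2)]
    for _ in range(epochs):
        for inputs, target in PATTERNS:
            output = forward(weights, inputs)
            for j, (t, o) in enumerate(zip(target, output)):
                for i, x in enumerate(inputs):
                    # Change each weight in proportion to its share in the error.
                    weights[j][i] += rate * (t - o) * x
    return weights

untrained = [[0.0] * 3 for _ in range(2)]
trained = train()
print(total_error(untrained), total_error(trained))
```

Nothing in the trained weights corresponds to a single stored pattern: all three mappings are superimposed on the same six connections, which is the property the discussion of "rule-like" behaviour in PDP-models turns on.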
A detailed description of PDP-models inclusive of many technicalities, for example, the learning rules, can be found in Murre & Goebel (1996), with the standard reference book being Rumelhart et al. (1986). Apart from these more general architectural and procedural aspects of PDP-models, there is another feature attributed to them which is of interest to our discussion: The models do not make an explicit distinction between content and rules or structure, a distinction which is made in other theories of language processing in that they assume a frame-and-slot mechanism for the construction of units on the successive levels of representation; thus, the level of grammatical encoding distinguishes between phrase structures (frame, rules) and words (slot-fillers, content). In the PDP-models, it is an effect of their processing mechanisms that what seems to be the application of a rule actually emerges from the interaction of activated node patterns of the "rule-related" items, which - when they occur repeatedly (as they should if they represent "rule-like" phenomena) - are stored, resulting in particularly weighted connections (cf. Rumelhart & McClelland 1986: 120; Seidenberg & McClelland 1989: 548-549; Dell, Juliano & Govindjee 1993: 155). Dell & Juliano (1996: 343) explain this procedure in some more detail: ..., structural or rule-like effects emerge from how the individual items, in this case words, are stored. Each item is represented in a distributed and composite
fashion with the result that items are superimposed on the same set of connection weights. When a particular item is processed, one is effectively processing many other items as well. This massed influence of the other items enables the model to be sensitive to the general characteristics of the set, that is, to reflect their structure.
I will come back to what an assumption like this might imply for the interrelation between syntax and the lexicon in the following subchapter.
3.2.2. Selected issues: Where syntax and the lexicon meet

In the following, I will focus on what (some of) the models under analysis specify for the assembly of words and structures in the course of utterance production, leaving aside the details of how in particular the words and the structures to be used are related to the intended message. For my purposes, it will do to assume that words are selected on the basis of the concepts making up the message, and that meaning and the communicative situation in which the utterance is made will exert an influence on its structural organization (cf. Bock 1995: 191). Starting out from that, I will concentrate here on what Bock calls the "coordination problem", by which she understands the problem of how words find their places within structures. The solution most commonly suggested to this problem is a slot and filler mechanism (...) in which a structural frame provides slots into which words ... must be inserted, and the words ... carry information about the kinds of slots they require. Coordination is then a simple matter of fitting pieces into place. (Bock 1995: 195).
This, however, seems to imply that the frames are computed independently of the words to be inserted. Supposing this is the case, this could lead to mismatches when the words selected do not fit into the slots provided by the frame, for instance, when there is a slot for a dative object and the verb donate was selected. Hence, one must assume some sort of interdependence between frame construction and lexical selection, and it is exactly specifications of this interdependence which I am looking for. A position similar to Bock's is advocated by de Smedt (1996: 282), who considers the "coordination problem" to be one of the main questions arising from the assumption of syntactic structures or plans in language production:
There are occasions when the production process may be driven by syntactic patterns which follow directly from the content and structure of the message to be transferred. ... This seems to suggest that syntactic plans can be created on the basis of the meaning to be expressed. However, if this is done without regard for the specific words to be inserted, this would still give rise to ungrammaticalities.
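The mismatch that arises when frames are computed without regard for the words can be made concrete in a minimal sketch. The frame inventory, the slot labels and the two lexical entries are invented for illustration and are not taken from any of the models discussed. The first function fixes the frame before the verb is consulted, while the second lets the verb's combinatory information constrain the choice of frame.

```python
# Hypothetical frames: each is an ordered list of slot labels.
FRAMES = {
    "double_object": ["SUBJ", "VERB", "IOBJ", "DOBJ"],
    "prepositional": ["SUBJ", "VERB", "DOBJ", "TO_OBJ"],
}

# Hypothetical lexical entries: the frames each verb can occur in.
LEXICON = {
    "give": {"double_object", "prepositional"},
    "donate": {"prepositional"},
}

def frame_first(frame, verb):
    """The frame is fixed from the message alone; the verb is inserted later."""
    if frame not in LEXICON[verb]:
        raise ValueError(f"{verb!r} does not fit frame {frame!r}")
    return FRAMES[frame]

def interdependent(preferred_frame, verb):
    """Frame choice consults the verb and falls back to a frame it licenses."""
    licensed = LEXICON[verb]
    if preferred_frame in licensed:
        return preferred_frame
    return sorted(licensed)[0]

print(interdependent("double_object", "donate"))
```

With `donate`, the frame-first procedure fails on the double-object frame, whereas the interdependent procedure settles on the prepositional frame.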
Production models allowing for interactive processing should naturally also specify the relationship between words and structures in the production of an utterance as being interactive. Stemberger (1985: 148), who looks for evidence for his model mainly by analysing speech errors, claims that structures are accessed in a similar way as words are, that both processes occur at the same time and heavily interact with each other. The interaction he spells out as follows: As already noted, the syntactic structures that are activated select words that have certain characteristics (such as being a noun). By the same token, words select particular types of syntactic structures. ... The existence of particular lexical items often affects which phrase structure is accessed. (Stemberger 1985: 152)
The influence of selected lexical items on the structure into which they are to be incorporated is especially obvious in cases of speech errors in which the erroneously uttered word leads to changes in the utterance's structure, a phenomenon which is known by the term of "syntactic accommodation". To give an illustration, I will quote from Stemberger's examples (cf. Stemberger 1985: 155): (1)
Your teeth are all red. (intended utterance: Your tongue is...)
The choice of "teeth" determines the concord with the verb to be plural instead of singular, as the original word would have specified. The influence of syntax on the lexicon shows, according to Stemberger (1985: 155), in the fact that word selection errors are almost always of the same word category. However, in his discussion of interactive versus serial procedures in language processing, Stemberger (1985: 152) is aware of the fact that the assumption of interaction is not the only way to explain his speech-error data:
...interactions between components cannot be proven absolutely at this point. While interactiveness leads to greater simplicity, there are more complex options that can handle the data in other ways.
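Syntactic accommodation, as in example (1), can be pictured in a few lines, on the assumption that verb agreement is computed from the noun that was actually selected rather than from the intended one. The toy lexicon and the function name are hypothetical.

```python
# Toy lexicon with number features (hypothetical).
NOUNS = {"tongue": "sg", "teeth": "pl"}
BE = {"sg": "is", "pl": "are"}

def utter(selected_noun):
    # The verb form follows the number feature of the noun that was
    # actually selected, not that of the noun originally intended.
    number = NOUNS[selected_noun]
    return f"Your {selected_noun} {BE[number]} all red."

print(utter("teeth"))
```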
Dell's "spreading-activation theory of retrieval in sentence production" (1986: 286-287) bases its predictions with regard to processing on three types of linguistic knowledge, namely nonproductive knowledge stored in the lexicon, productive knowledge coded in generative rules, and so-called insertion rules by which the first two types are related to each other. Though Dell focusses on phonological encoding, he also considers what his theory implies for higher level processes, which, in his view, comprise morphological and syntactic encoding6. Generally, he assumes that the relationship between the activated elements and the linguistic rules (which are responsible for the generation of frames containing the categorically defined slots for these elements) is similar at the individual representational levels (cf. Dell 1986: 314). With regard to the syntactic level, the level at which the lexicon and the syntactic rules must be assumed to interact, Dell (1986: 316) specifies that ...it appears that both input from the semantic representation and input from activated word nodes guide the building of the syntactic frame. Although it is clear that semantic and pragmatic considerations should guide the syntax, it is less clear that the activation levels of word nodes should, as well. There are, however, some experimental findings that indicate that the retrievability of a particular word affects the syntactic structure of the spoken sentence (...). To model these effects in the present theory, the activation levels of word nodes will have to be taken into account by the syntax. For example, if an intransitive verb is very highly activated, it will have to assure the creation of a frame without a direct object (...).
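The last point of the quotation, that a highly activated intransitive verb must bring about a frame without a direct object, can be sketched as follows. The word nodes, activation values and frame labels are invented, and the sketch makes no claim about how Dell's theory computes activation.

```python
def choose_frame(verb_nodes):
    """Build the frame required by the most highly activated verb node."""
    verb, info = max(verb_nodes.items(), key=lambda kv: kv[1]["activation"])
    if info["transitive"]:
        frame = ["SUBJ", "VERB", "OBJ"]
    else:
        frame = ["SUBJ", "VERB"]      # no slot for a direct object
    return verb, frame

# Hypothetical competing word nodes for one message.
nodes = {
    "vanish": {"activation": 0.9, "transitive": False},
    "hide": {"activation": 0.4, "transitive": True},
}
print(choose_frame(nodes))
```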
This sounds as if the lexicon were essentially involved in the generation of the syntactic form in which an utterance is cast in that it constrains the selection of the syntactic structure to be used. However, in the
6
Dell's "compartmentalization" of the translation of meaning into an utterance is somewhat different from the one given for production models in general (cf. section 3.2.1: 20-23): "During the translation from meaning to sound, words must be selected and ordered in adherence with the rules of the grammar of the speaker's language. This I call syntactic encoding. These words must be specified in terms of their constituent morphemes (morphological encoding), and these morphemes must be spelled out in terms of their sound (phonological encoding)." (Dell 1986: 283)
further course of the formulation process it may just as well be the case that a particular structure "called" by a lexical item requires particular lexical items in order to be "saturated". A situation like this illustrates how syntax constrains the lexicon or rather lexical selection. Consequently, Dell's model, too, provides for the interaction between the two components in both ways. At a later stage, Dell et al. (1993) develop a "theory of frame constraints in phonological speech errors". Here they start out from the claim made by PDP-models that separate rules, which seem to operate in language use with regard to the combination of lower-level units into those of higher levels, can also be understood to emerge from the interactions among the participating processing units (cf. 3.2.1, pp. 27-30), that is, they need not be stored separately. With this claim, they present an alternative to the standard assumption of production models ... that utterances are constructed by a mechanism that separates linguistic content from linguistic structure. Linguistic content is retrieved from the mental lexicon, and is then inserted into slots in linguistic structures or frames. (Dell, Juliano & Govindjee 1993: 149)
Since the commonly made content/structure distinction also reflects in the distinction between, e.g., lexicon and syntax, we think it interesting and necessary for our argument to find out about the predictions made by Dell et al.'s alternative. Would their assumption (of distributed representation) actually mean that syntax "emerges" from the use - in similar or identical ways and combinations - of words (represented by particular patterns of activated nodes) in that, and because, these combinations are represented as part of the activation patterns? In fact, Dell et al. are very cautious with regard to the generalizations which can be made on the basis of the model they developed for the production of phonological forms of single words. Their model can produce the phonological forms of words with the same sort of error effects that are usually attributed to the matching or coordination of sounds with the frames for possible sound combinations, that is, to a slot-and-filler scheme. Dell et al. (1993: 177) conclude that ... using a sequential PDP architecture suggests that the ... [error - D.S.] effects do not necessarily reflect separate rules or frames, rather they can be produced by a mechanism that does not separate linguistic structure and content in an explicit a priori fashion.
That means that their simulation results are merely another way of explaining what the assumption of a slot-and-filler mechanism can explain just as well. Thus, there are two procedures available which can both account for the same effects, with no superiority being discernible for the one or the other, since also the slot-and-filler mechanism has been successfully simulated (cf. Dell 1986). The difference between the two is that the "parallel distributed processing" (PDP) alternative explains error patterns by the operation of similarity and sequential biases, the slot-and-filler mechanism - by the coordination of sounds and phonological frames. What it means for higher levels (morphology, syntax) when no a priori distinction is made between linguistic structure and content has not been elaborated in Dell et al.'s discussion. As for the level of syntax, they think it reasonable to assume the rules to be stored separately, since they can well account for the creativity one finds in the syntax. Their final comment (Dell et al. 1993: 189), however, indicates that this is not an all or nothing decision: Nonetheless we believe that there may be many performance phenomena, including syntactic phenomena, whose explanation lies in the experienced set of items and the mechanisms for their storage. At the same time, there may be other effects best explained by structural knowledge that has been separated out from linguistic content. A worthy goal of psycholinguistic research is to determine which is which.
To sum up, if we assume a model without explicitly specified rules, the question about the relationship between lexicon (content) and syntax (frame) will become null and void, since knowledge of both is inherent in the activation patterns, i.e., in the units and the connections representing a particular item. If, on the other hand, we assume a slot-and-filler mechanism, the lexicon and the syntax will be considered to interact in the ways specified above. When we now analyse serial models with regard to the assumptions they make about the "coordination problem", we will have to have a close look at the processes subsumed under the terms "grammatical encoding" or "encoding on the functional and positional levels" (cf. section 3.2.1). These stages have to do with the selection of the lexical items and with the computation of the syntactic frame needed for the formulation
of an utterance, and we will see what some of the serial production models mentioned specify for the relationship between the two procedures. Cooper (1980), who - in his model - places lexical insertion after the formulation of an "underlying syntactic representation", explains some of the details of the possible information-flow in the formulation of an utterance. He postulates that the syntactic representation, seemingly derived from the semantic representation, contains information about the mood (structure) of the clause to be formulated and about some higher-level phrase structure nodes, thus being only a partial representation. For the further computation of the syntactic representation, selection of lexical items and the further elaboration of the syntactic structure alternate until the syntactic tree is complete, a processing mechanism that Cooper (1980: 297) calls "grammar-lexical recycling". We understand this mechanism to be an interactive one, the instigator being syntax (which for its part is set in motion by semantics). Garrett (1980) claims that syntactic factors come into play at two levels in the production process, the functional and the positional one. At the functional level, which precedes the positional one, a syntactic frame is specified on the basis of the grammatical roles of the lexical items selected for semantic reasons. Thus, contrary to Cooper's assumption, the computation of a syntactic frame seems to be initiated by the lexicon, though Butterworth (1980: 443) attributes to Garrett the view that lexical items and syntactic structures are selected interdependently. However, these two interpretations are not incompatible: also when the two components are considered to do their work interdependently, one will have to take the initiative. 
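Cooper's "grammar-lexical recycling", as summarized above, amounts to a loop in which a lexical step and a structure-building step alternate until no open node is left. The following sketch is one hypothetical reading of that loop: the lexicon, the category labels and the order in which words become available are all assumptions, and no category checking is attempted.

```python
# Hypothetical lexicon: category and the complements each item calls for.
LEXICON = {
    "dog": {"cat": "N", "needs": []},
    "bone": {"cat": "N", "needs": []},
    "chews": {"cat": "V", "needs": ["NP"]},
}

def recycle(open_nodes, choices):
    """Alternate lexical selection and structure elaboration.

    open_nodes: phrase nodes of the partial syntactic representation.
    choices:    words in the order in which they are assumed to be selected.
    """
    trace = []
    words = list(choices)
    agenda = list(open_nodes)
    while agenda:
        node = agenda.pop(0)
        word = words.pop(0)                        # lexical step
        trace.append((node, word))
        agenda = LEXICON[word]["needs"] + agenda   # syntactic step: new nodes
    return trace

print(recycle(["NP", "VP"], ["dog", "chews", "bone"]))
```

The partial plan contains only the two higher-level nodes. The object NP enters the agenda only after the verb has been selected, which is where the lexicon feeds back into the syntactic representation.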
In his 1982 working model, Garrett (1982: 67) presents the processes involved at the functional level in mere juxtaposition: Procedures applied to the Message level representation construct the first language specific level of representation. Three aspects of the process are distinguished: (a) determination of functional level structures, (b) meaning based lexical identification, and (c) assignment of lexical items to functional structures; representation is syntactic.
In a more detailed description, he then elaborates the interrelation to be such that the functional level structure, which is assumed to be generated in parallel for a verb dominated group, is constructed on the
basis of the verb and its predicate-argument structure, its phrasal arguments (cf. Garrett 1982: 68). Thus, the (lexical) choice of a verb can be considered to trigger off the determination of a functional level structure which then provides the functional categories to which the further lexical items selected for verbalising the speaker's message are assigned. Garrett (1988: 71) claims that the lexical content of a sentence is determined by conceptual, syntactic and phonological constraints, and that lexical selection processes in particular are governed by conceptual and syntactic factors. In his illustration of "successive assignment of lexical content to phrasal structures" (given below as figure 3), he, however, puts lexical selection and the determination of functional structures side by side, and there is no indication of any interdependence: Both processes are shown as being initiated by the message representation. The functional structure follows from aspects of meaning which are related to the state-of-affairs/event to be communicated (thematic roles grammatical functions) and to the linguistic and situational contexts (pragmatic aspects)7. Thus, we must assume the structural frames to be instigated by "separate syntactic knowledge" which is fed by semantics and pragmatics. In the further course of language production, these frames must be "coordinated" with the syntactic information included in the lexicon (the combinatory information stored in the lexical entries) (Garrett, personal communication). Both the selected lexical items and the functional structure are not yet specified formally. The retrieval of form is attributed to the positional level of language production, which is initiated after the semantically specified lexical items have been assigned to the semantically and functionally specified "slots" of the functional structure. Figure 3 below is meant to illustrate the processes assumed for the lexicon-syntax interaction.
7
Caplan (1987: 274) interprets the functional structure in Garrett's model to arise from and contain sentential semantic information.
[Figure 3. Diagram labels: Inferential processes; Message representation; Lexical set; Determination of functional structures]

[...] AGENT [...] PATIENT
(cf. Bresnan & Kaplan 1982: xxv; Horrocks 1987: 236), or for seem:
seem: V, 'seem (SUBJ)' PROP
(cf. Horrocks 1987: 240) The former two entries can be "translated" to mean that kiss predicts for a sentence in English that the agent role is to be assigned to the subject function, the patient role - to the object. In the passive, the assignment is different: the patient role is to be assigned to the subject, the agent role - to the by-object, or it is not materialized. The latter entry contains the information that the verb seem cooccurs with two grammatical functions, but expresses a predicate that binds only one argument. The subject function is theta-marked not by the verb seem itself, but by the predicate of its propositional argument. That is why the subject function is listed outside the argument list as indicated by the use of brackets (cf. Horrocks 1987: 240). The forms of the lexical entries are considered to be psychologically real. They can explain, among other things, why the speaker of a language understands or produces "complex" utterances, that is, those that, in Chomsky's understanding, are derived from a base form by a number of transformations, at basically the same speed as "simple" ones, thus calling into question the hypothesis of the "derivational theory of complexity" (cf. footnote 26 in section 4.1.1)43. The
43
Bresnan's comment reads as follows: "... the experimental evidence tends to support the psycholinguistic reality of grammatical structure, but ... the evidence does not consistently support the reality of grammatical transformations as analogues of mental operations in speech perception and production." (Bresnan quoted in Diehl 1981: 283)
assumption of Lexical Functional Grammar is that, despite the proposed redundancy rules (cf. below), all the potential assignments of arguments to functions are stored in the individual lexical entry so that, in language processing, the operations specified in these rules need not be carried out again: If we further assume that all lexical forms are accessed in parallel, then in this model the complexity of syntactic computations will not reflect the complexity of the lexically encoded feeding relations, but only the complexity of the analysis of the surface phrase structure tree. (Bresnan & Kaplan 1982: xxxiv).
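The idea that a lexical rule relates the active and passive forms of kiss, while both forms are nevertheless stored and merely looked up during processing, can be sketched as follows. The dictionary format is a hypothetical stand-in for lexical forms and is not the notation of Lexical Functional Grammar itself. The role and function labels follow the description of the kiss entries given above.

```python
# Active lexical form of "kiss": thematic role -> grammatical function.
ACTIVE_KISS = {"AGENT": "SUBJ", "PATIENT": "OBJ"}

def passive(form, express_agent=True):
    """Lexical rule: SUBJ becomes BY-OBJ or stays unexpressed; OBJ becomes SUBJ."""
    out = {}
    for role, function in form.items():
        if function == "SUBJ":
            out[role] = "BY-OBJ" if express_agent else None
        elif function == "OBJ":
            out[role] = "SUBJ"
        else:
            out[role] = function
    return out

# The rule is applied once, when the (hypothetical) lexicon is built;
# processing afterwards only retrieves the stored forms.
LEXICON = {
    "kiss": [ACTIVE_KISS],
    "kissed (passive)": [
        passive(ACTIVE_KISS),
        passive(ACTIVE_KISS, express_agent=False),
    ],
}
print(LEXICON["kissed (passive)"])
```

Because the passive forms are computed in advance, retrieving them costs no more than retrieving the active form, which is the point made against the derivational theory of complexity.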
For the model of Lexical Functional Grammar we can thus summarize that:
- lexical entries subcategorize the surface grammatical functions around them (cf. Andrews 1988: 80),
- they assign their θ-roles to grammatical functions, not to structural positions,
- they also contain information about alternative assignments (as shown in the passive verb entry), which is in contrast to the θ-criterion in the Government-and-Binding model, and
- regular relationships between lexical entries are captured by the proposal of redundancy rules, which are "lexical rules or principles for associating grammatical functions with predicate arguments" (Bresnan & Kaplan 1982: xxvi), e.g., passive:
(SUBJ) → (BY-OBJ)/(∅)
(OBJ) → (SUBJ)
(cf. Horrocks 1987: 236).
The latter assumption, the effect of which is attributed to transformations of a D-structure in the Government-and-Binding model, is one reason for the "loss" of transformations in Lexical Functional Grammar: "Because the various function-argument correspondences are already encoded in lexical entries, no phrase structure manipulations are needed to express the grammatical relations of sentences." (Bresnan & Kaplan 1982: xxix) The implications these kinds of lexical entries have for structural representations and rules can be described as follows: Since there are no transformations needed, the differentiation between D- and S-structure becomes null and void. Constituent (c-) structure in Lexical Functional Grammar, therefore, refers to the "surface" form of utterances and corresponds to the X-bar schema. The assignment of the
thematic structure to the c-structure is the result of correlating the grammatical functions stipulated in the lexical entry with those that are syntactically associated with the c-structure and shows in the functional (f-) structure. Lexical entries are then inserted directly into the appropriate positions of c-structure. Horrocks' illustration (1987: 301) gives a clear summary thereof:
[Figure; diagram labels: Lexicon; C-structure rules; Semantic representations; Semantic interpretations; Phonetic representations]
Fig. 8. The essentials of Lexical Functional Grammar
C-structure is roughly equivalent to S-structure in the Government-and-Binding model and, additionally, contains functional information, with the functions not being defined configurationally, but as "primitives". At the level of f-structure, the functions from c-structure, such as subject and object, are properly linked with the functionally marked thematic roles from the lexical entries to be inserted. If there is a secondary predicate, a non-finite predicate, embedded in the syntactic structure arising from a first order predicate, Lexical Functional Grammar assumes a kind of control structure. This structure regulates by means of a lexical rule of functional control the assignment
of the function of "controller" to a particular function of the first order predicate. Thus, in the verb promise the control (of the embedded predicate) is assigned to the subject (source: Caplan & Hildebrandt 1988: 25):
'promise'
Agent Theme
(↑SUBJ) = (↑XCOMP SUBJ)
As can be seen, the functional control structures contain the information that is indicated in the Government-and-Binding model by NP-traces and empty categories (cf. Caplan & Hildebrandt 1988: 23-29). The principles regulating the assignment of the thematic structure to grammatical functions within the lexical entries are elaborated by Bresnan & Kanerva (1992: 75) in the "Lexical Mapping Theory": There are four components of the lexical mapping theory: (1) hierarchically ordered semantic role structures, (2) a classification of syntactic functions along two dimensions, (3) principles of lexical mapping from semantic roles to (partially specified) functions, and (4) well-formedness conditions on lexical forms.
The first component explains the hypothesis of a universal hierarchy of theta-roles44, the second postulates a classification of syntactic functions on the basis of the features [± r] (thematically restricted) and [± o] (objective) (cf. Bresnan & Kanerva 1992: 75-76). The third component lists the principles according to which the thematic roles link with the syntactic functions classified in (2), the lexical mapping principles: These principles are of three kinds: (1) intrinsic role classifications, which partially specify syntactic functions according to the intrinsic semantic properties of thematic roles; (2) morpholexical operations, which add or suppress thematic roles; and (3) default classifications, which specify syntactic functions according to the hierarchical relations of thematic roles. A constraint on all lexical mapping principles is the preservation of syntactic information:
44
See also the discussion of Dik's "Semantic Function Hierarchy" in section 4.2.1, footnote 29.
they can only add syntactic features, and not delete or change them. (Bresnan & Kanerva 1992: 77).
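The constraint that mapping principles may only add features can be illustrated with a small sketch. The text above names only the two features. The assignment of feature values to the four functions below, and the intrinsic classifications used in the usage lines (agent as [-o], patient as [-r]), follow standard presentations of Lexical Mapping Theory and are assumptions here, as are all function and variable names.

```python
# Assumed classification of syntactic functions by [±r] and [±o].
FUNCTIONS = {
    "SUBJ": {"r": "-", "o": "-"},
    "OBJ": {"r": "-", "o": "+"},
    "OBL_theta": {"r": "+", "o": "-"},
    "OBJ_theta": {"r": "+", "o": "+"},
}

def add_feature(spec, feature, value):
    """Monotonic update: a feature may be added, never changed or deleted."""
    if feature in spec and spec[feature] != value:
        raise ValueError(
            f"cannot change [{spec[feature]}{feature}] to [{value}{feature}]"
        )
    return {**spec, feature: value}

def compatible(spec):
    """Functions whose features do not contradict a partial specification."""
    return sorted(
        name for name, feats in FUNCTIONS.items()
        if all(feats[k] == v for k, v in spec.items())
    )

agent = add_feature({}, "o", "-")      # assumed intrinsic classification
patient = add_feature({}, "r", "-")    # assumed intrinsic classification
print(compatible(agent), compatible(patient))
print(compatible(add_feature(agent, "r", "-")))
```

A partially specified role is compatible with more than one function, and every further principle can only narrow the set.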
Against this background, Bresnan (1982: 149) claims with respect to our topic, the determination of the relationship between syntax and the lexicon in Lexical Functional Grammar, that: ...the predicate argument structures of lexical items are represented independently of their syntactic contextual features as functions of a fixed number of grammatically interpretable arguments. A mapping between predicate argument structure and syntactic constituent structure is specified by means of grammatical functions. These are assigned to surface phrase structure positions by syntactic rules and to predicate argument structure positions by lexical rules... the information above the dotted line is provided by the syntactic component of a generative grammar for English, and the information below the dotted line is provided by the lexical component.
Such examples suggest that the hearer can construct the semantic representation of an utterance he is listening to before all the constitutive elements have "come in". Thus, as the placement of the overlaps indicates, potential feedback can be given before the final (few) word(s) has (have) actually been perceived. The two kinds of motivation listed so far can be classified as being turned backward and forward respectively. The remaining 27 overlaps, which is more than 10 %, can be categorized as follows: Seven overlaps of this "rest group" represent attempts on the hearer's side at changing the topic of the conversation, without any reference being made to the ongoing utterance. Four overlaps are due to the hearer's wish to interrupt the conversation (e.g., by saying hang on a minute). For the rest of the overlaps I could not find any helpful information about a potential motivation, because the parts of the utterance surrounding the overlaps are inaudible and cannot be used for a contextual guess.
All in all, one can state that the majority of overlaps found in my corpus are directed forward, that is, at indicating the hearer's understanding of the utterance before its production is even finished. Secondly, the overlaps directed backwards are placed without considering phrasal boundaries; although they are meant to indicate the comprehension of a complete syntactic unit (most likely of the size of a clause, expressing a complete proposition), they cannot be placed neatly at exactly the end of this unit, because, as mentioned above, comprehension lags behind production by one or two syllables. A closer inspection of the overlaps directed forward revealed some more interesting details: In his overlap, the interrupting interlocutor often does not take up the structure as it has already been developed by the speaker, but simply indicates his comprehension of the complete utterance by an affirmative remark: (89)
A: B:
with a piece of china or something like that it'll just
On the other hand, he may confirm his comprehension of the whole message by uttering the word which makes the utterance complete at the same time as it is articulated by the (original) speaker, thus indicating that he has understood it implicitly before its actual articulation: (90)
A: B: A:
something like a
stick or a rope...
The hearer's guess is not always absolutely correct, but it is semantically coherent. Ergo, what allows for hypothesizing about the complete form of an utterance that is still under way is probably the linguistic and situational context, that is, the information that has already been specified verbally and the information that can be taken from the situation in which the utterance is being made.
Occasionally the intruding hearer can be found to take up the exact syntactic structure; he continues using the same words/phrases/ structures as the original speaker had planned to use: (91)
A: B:
I think he Newcastle
Examples (88) to (91) clearly indicate that comprehension can and actually does overtake production. From this, I take and generalize that comprehension is a procedure in which the comprehender not merely perceives and interprets input, but in which he actively constructs the utterance for himself, practically also "producing" it silently. In fact, the hearer will decode the sound input and access words. By activating a word, all its features will become active, also those determining or predicting its combinatory behavior, so that he will have them available and exploit them in the form of expectations regarding the structure or even the words to follow. He can be considered an on-line hypothesis-builder, who uses contextual information and all the information that is available to him once he has constructed a semantic representation from the speech input, in order to make what he is listening to a meaningful message. A procedure like that is not really different from speech production, except for the facts that
- the person setting the first lexical item which instigates the whole mechanism and the person constructing the message are not identical,
- the comprehender's hypotheses vary in their strictness (from the expectation of one particular word to that of a particular structure, or a neighboring word category) according to the cooccurrence information contained in the word just retrieved, and
- the choices/decisions made by the comprehender need to be confirmed by the speaker's input.
The overlaps that result from the (original) hearer's assistance offered to the (original) speaker in a situation of speech need make the hearer's active engagement in the comprehension process even more obvious:
(92)
A: B: A:
So you can -pause-
what you are going to say...
(93)
A:
I was in the kitchen and I just heard something about a
B:
(94)
A: B:
For er, those er - pause -
The mechanisms I assume and suggest for the comprehension process would also plausibly explain why comprehension is so fast and (seemingly) effortless as it is and why it can overtake production. Moreover, if comprehension works like this, the interaction of syntax and lexicon can be compared to what I assume for it in the production process: (conceptually motivated) words are accessed and these constrain - via the combinatory information they contain - the syntactic structure into which the intended message can be cast. Comprehension is different from production in that the initial representations are the oppositional ends: whereas the speaker starts from a conceptual representation, the hearer does so from a phonological one. But as soon as a word's lemma has been accessed, no matter whether on the basis of a conceptual motivation or of a sound input, production and comprehension work in a comparable way: the lemmas constrain the syntactic structure to be used by the speaker or to be expected by the hearer. Hence, when the first lemma is accessed for producing an utterance, the speaker will also construct a structural frame, determining the form his utterance is going to have on the basis of the co-occurrence information contained in it. The other words he needs and accesses for the conveyance of the intended message and his structural knowledge will further constrain the form of the utterance. When an utterance is being comprehended, the hearer will hypothesize about the form of the utterance on the basis of the co-occurrence information contained in the lemma that he has already accessed as a consequence of perceiving a
sound segment that constitutes a word. If the co-occurrence information is strongly biased regarding one particular structure or even regarding particular words to co-occur, the hearer's hypothesis will be strong; if not, it will be weak. In any case, however, the expectations of the hearer (his tentative choices/decisions) need to be confirmed by the input (consecutively) produced by the speaker. Thus, it seems that comprehension is not only an input-driven, bottom-up procedure, but also a top-down procedure, driven by the comprehender's expectations, with the input supporting or contradicting these expectations, that is, with much of the input serving to confirm or reject the comprehender's hypotheses. Example (95) illustrates the rejection of such an expectation-driven hypothesis, with the speaker simply ignoring the hearer's (incorrect) intrusion:
(95)
A: B: A: B:
They get first choice for - pause- European Club tickets A lot of < seats > play in that though.
Comprehension is also a bottom-up process in that the first input item sets the scene and instigates the mechanisms involved in comprehension, e.g., the structure-building procedures. Moreover, comprehension is solely input-driven in situations where the hearer cannot make an intelligent and specific guess about what the speaker is going to say and/or how he will do that. The reason for the inability may be that there are no contextual clues and that the word(s) already perceived and accessed do(es) not contain any helpful constraining cooccurrence information besides that of its (their) word category (categories). Such a situation is easily conceivable, for instance, when an NP has been uttered. It may not be clear to the hearer what semantic role it is going to realize in the sentence being produced and comprehended. Besides, it may also be unclear whether it will be further specified/modified or whether it will be assigned to a particular event as such, etc. If the following sentence (from a newspaper) were presented to the reader one word at a time, his expectations regarding the structure and words to follow would be very unspecific at a number
of places, for there are several words with very general co-occurrence information only (indicated by a following question mark), which does not allow for specific guesses:
(96) A (?) growing shortage of high-tech workers (?) in Arizona and around (?) the country (?) has prompted (?) many companies to look (biased for/expected: for) overseas for recruits. (from: The Arizona Republic, 17 May 1998, p. 28)
As for the interaction of lexical and syntactic information in the comprehension process, I assume that incoming lexical entries predict (more or less strictly) - on the basis of what has already been processed and of their co-occurrence information - the structure of the following utterance, sometimes even the words to follow, in an incremental way by their subcategorization and selectional restriction features and by their meanings. Besides, there are also stages in the comprehension process where e.g., thematic knowledge (expectations with regard to theme-rheme structure), or purely (non-local) syntactic knowledge causes certain expectations in the hearer. The latter will be effective when an utterance starts with a quite complex NP (as in example [96]): The hearer will eventually expect some VP-predicate to follow. As compared to production, where the speaker is going to articulate what he has constructed, the hearer uses what he has constructed for getting at the meaning of the utterance, for constructing a conceptual representation. That is why there seems to be a different weight attributed to the necessity with which both the speaker and the hearer must construct a complete (and the "correct") syntactic representation. The syntactic representations constructed by the hearer need perhaps not be specified in every detail in order to enable him to comprehend the utterance. The speaker, on the other hand, will ideally have to specify each and every detail of the form correctly, because he is to explicitly articulate the utterance. For the hearer it might be sufficient to get those details right which are decisive for the meaning, and perhaps he can neglect further details as soon as he has constructed a conceptual representation. This "generosity" regarding the exactness or rather the explicitness of the syntactic representation may well be one
reason for the fact that hearers do not always arrive at, or end up with, the conceptual representation intended by the speaker. That the different goals one aims at when producing and comprehending an utterance may cause such effects is also discussed by Garrett (1980: 216), who, however, sceptically remarks "[j]ust how much latitude that may purchase for comprehension systems is far from clear." What I could show regarding the overall well-formedness of spontaneously produced oral utterances in the preceding subchapter adds further doubt - not to the lesser importance of a perfectly specified syntactic representation, but to a noticeable difference in this respect between production and comprehension. Apart from these assumed commonalities between the two modes of language processing, I also see parallels in the way in which the lexicon and syntax interact. The production procedure is predominantly lexicon-driven, with each retrieved lexical item setting a structural frame which, in cases of frequently used, habitual expressions, may even be lexically specified or filled. For the comprehension process I assume a similar mechanism with the proviso that there is an additional step involved, namely that of checking the comprehender's hypotheses against the perceived input. All those overlaps that can be classified as turned or directed forward speak for such an assumption. Besides the evidence presented by examples (88) to (95), I found further evidence for the comprehender being an active or silent producer, namely overlaps where the comprehender breaks in on the speaker's utterance and simply carries on, although there was no indication of the speaker being in speech need: (97)
A: B: A:
He's still a
ain 't he
Also the intraphrasal location of more than 50 % of the overlaps is indicative of what is going on in comprehension. The general assumption is certainly this: the hearer will have to wait until all the input has arrived, before he can produce a final representation of the
content that has been transferred to him. He will lag behind the speaker by - as is estimated - one or two syllables, that is, roughly by a word. This is reflected in the overlaps that result from situations in which the listening interlocutor breaks in turning backwards and is not fast enough, so that the speaker has already started his next utterance. Hence, the overlap will be located after the utterance part to which it relates has come in, without being influenced by the following input:
(98)
A: B: A: B: A:
I'm doing a market research. Oh I see. And I've got to
everybody - pause - I talk to.
Overlaps are located before the phrase being uttered is finished in situations where the hearer has built his own hypothesis of what is going to follow, thus "overtaking" the speaker. It is especially these cases which are evidential of our hypothesis. The position of the overlap between verb and its object is illustrative in this respect: (99)
A: B:
(100) A: B:
I don't blame dear Well I could lighten I know!
The object is semantically and syntactically predictable from the verb's lexical representation and from the input already processed. The hearer has seemingly understood the complete NP, before it has been articulated completely. Other locations of the overlap which can illustrate our point are those between determiner and noun (example [101]), those between auxiliary verb and main verb (example [102]), and those between preposition and noun phrase (examples [103] and [104]):
(101)
A: B:
(102)
A: B:
(103)
A: B:
Well, give her my I will.
Well she'll want it in - pause - I know
she will. (104) A: B:
So current times the voltage and then divide by the resistance, so current equals V over < R > < R >
Data like these make the assumption of a lexically-dominated and contextually-constrained parser plausible, the claims of which are the following:
- The construction of structure in the comprehension process is dependent on, or constrained by, the (probabilistic) co-occurrence information contained in the lexical entries already accessed by the hearer.
- The hearer's expectation of the structure or potentially also of the particular item(s) to follow is naturally constrained by the content and the structure of what has already been processed.
At the same time, these data speak against an assumption related to the garden-path theory of sentence comprehension, namely the assumption that the immediate structural analysis of incoming input is not determined by (nonstructural) considerations of meaning and plausibility (cf. Frazier & Clifton 1996: 8 and what is said about the "lexical-filtering view" in section 3.3.2). As the overlaps show, the hearer has not only analysed the input structurally, but he has already extracted the content representation, before the final constituent(s) making the structure complete has/have been articulated. From my point of view, this implies that meaning and plausibility in all probability determine the parsing process for the very simple reason
that there is no information other than that available to the hearer to arrive at a conceptual representation. Another potential explanation relates to what I have said about comprehension and the necessity of explicitly constructing a complete syntactic representation. If comprehenders could also get at the content of an utterance without always constructing a fully detailed syntactic representation, this would most likely be due to the highly predictive power (with regard to structure and meaning) inherent in a number of lexical items and the hearer's general capability of inferring from contextual information and from what has been processed before. Anyway, what is common to both explanations is the importance attributed to meaning and plausibility in the comprehension process from the very beginning, i.e., as soon as the first word has been accessed. Last but not least, I also had a look at the forms of the overlaps produced by the (original) hearer. They are manifold, but they show a general difference which is related to the motivation for which they are produced. In overlaps turned forward, the hearer intrudes by uttering e.g., a noun, a verb, or a noun phrase, that is, just the expression he expects to follow in the speaker's utterance. This is done for various reasons: the speaker might need help, the hearer might want to indicate that he "got the message", or he might simply want to let the speaker know that he is still with him (with the latter motive attributing a more or less phatic function to such overlaps). For all three situations, the size of the overlap to be expected does not go beyond a word or a short phrase. This seems logical and plausibly relates to the assumption that the hearer's guess is based on contextual information and local (co-occurrence) constraints. In overlaps turned backward, not surprisingly, we find short affirmative or negative forms, such as yes, right and no, or short questions.
They are either of a phatic type or they are produced to express the hearer's need for further information, that is, they are of an informative type (cf. Halliday's characterization of language functions [1994: 33-36]; Halliday & Hasan 1990: 15-28). The former type is represented by short forms that are habitually used in this way, the latter type comprises predominantly questions, in
fact, short form questions, which is probably due to the fact that the questioning hearer does not have the floor at that very moment. All in all, the only fact noticeable in the forms of overlaps that is informative regarding comprehension processes appears to be their shortness, especially that of those turned forward: Since the hearer does not know which overall content the speaker is going to transfer to him, his expectations are naturally constrained to those parts that are sufficiently cued by the linguistic and situational context. These are most likely encoded in the final word(s) of the speaker's utterance, and do not normally go beyond the information contained in one autosemantic word. To sum up what can be specified for overlaps, I claim that they first and foremost support the idea that a word's (probabilistic) cooccurrence information and the (linguistic and situational) context in which it is embedded exert a strong influence on the comprehension process, and that they do so not only after a syntactic representation has been constructed by the hearer, but as soon as the first word has been accessed. Secondly, overlaps are also indicative of the interaction between lexical and syntactic information in the comprehension process. In fact, they make obvious that lexical entries contain syntactic information which is exploited by (the speaker and) the hearer to build structure into which further incoming input has to be incorporated. They also show that the hearer's expectations can just as well be based on "separate" syntactic knowledge, for example, on his knowledge of what the general rules for potential combinations of word categories predict for a well-formed utterance. This is illustrated by the fact, that the hearer is almost always correct in his guess at the word category of the word he contributes to the utterance in the overlap. There was only one case in which his word category guess was erroneous (cf. 
example [90]), and it seems to be due to the possibility of expressing the respective information at various levels of concept specificity [wood vs. a wooden something]. Apart from that, the overlaps produced by the hearer also show the early exploitation of semantic information in the comprehension
process. In most cases, the listener is able to infer the actual word to be used in a particular slot, effectively exploiting what he has already constructed from the preceding utterance as well as from the situational context. The early use of semantic knowledge, even before the syntactic representation is completely available to the hearer is a challenge to all those models in which semantics is considered a purely interpretative component. Finally, though one can find support from my analysis of overlaps for the existence and exploitation of lexicon-inherent co-occurrence information, I have not come across many overlaps that are related to collocations, that is, where the words contributed by the hearer to the speaker's utterance are part of a collocation. If they are, as is the case in example (101), this fact alone makes the actual words to follow predictable, that is, they are predictable also in context-free usage or in a neutral context (for example [101] this is the phrase give her/him/them my love). I am fairly certain that the analysis of a larger corpus would bring to light more examples of overlaps in which the part contributed by the hearer consists of one or more word(s) habitually co-occurring with the word(s) previously uttered. From all that has been found for overlaps, the following claims have a special impact on my conceptualization of a natural linguistic model: A language's lexicon and syntax do not operate separately, but in close co-operation. The construction of a message is predominantly lexicon-driven. That is why syntax cannot be considered more basic or important than other subcomponents of the language system. On the contrary, some of my findings can even be taken to speak against its ubiquity. 
Besides, semantic/pragmatic factors have also turned out to be inseparable from syntax, that is, the use of particular syntactic structures is not only motivated by the word categories that need to be combined, but also by semantic/pragmatic considerations (e.g., the level of concept specificity, the focus of the utterance, etc.).
5.4. Lexical co-occurrences
It seems clear that our psychological lexicon contains large numbers of multiple-word units - stock phrases of various sorts. Indeed, there are probably at least as many as of single words. (Kelly & Stone 1975: 65)
The repairs I presented in section 5.2 contain some evidence for the speaker's occasional use of syntactic clusters, i.e., chunks of habitually co-occurring words. The overlaps I discussed in the preceding subchapter do, however, not contain examples which strongly support my hypothesis of the existence in the language user's mental lexicon of fragments larger than single lexical items, of collocations (cf. hypothesis 8 in section 3.2.2, and hypotheses 8-10 in section 3.3.2). In the majority of cases, the intruding listener takes up and continues the speaker's utterance on the basis of the combinatory information contained in the lexical entries already processed and the contextually (semantically/pragmatically) constraining information he has extracted from the preceding input, as well as on the basis of the situation in which they communicate. Hence, as I already mentioned in section 5.3, the words involved in the overlaps do not represent established collocations, with the exception of example (101). However, evidence for the phenomenon of collocations can be found as a result of analyses of yet another type of performance data, namely analyses of authentic written and spoken texts regarding the overall regularities they expose. Such analyses are the domain of corpuslinguistic research, and they have revealed a large amount of patterning in products of language use, which goes beyond a mere reflection of structural or syntactic regularities. Corpus linguists have discovered patterns of lexical co-occurrences, that is, they have found that there are numerous structures which are instantiated repeatedly by the same lexical material, or, as seen from the lexical end, that particular words co-occur in particular structures much more often than is predicted by chance, when the assumption is that words are freely combined according to the rules licensed by a language's syntax. 
The lexical patterns show that in the construction of an utterance, the selection of words cannot be considered a slot-filling process which is merely dependent on the intended message and the syntactic frames already specified by previous material. Rather, the words to be selected for a
particular slot seem to depend on the actual words that have been selected before. This constraint may be so strong that the choice for the word to follow is pre-empted, that is, it is no choice at all. An example will make clear what I mean: In an analysis regarding the use of introductory it as Object followed by an adjective or an NP in English (e.g., ... make it clear ...), Francis (1993) finds that this structure is lexically restricted. The structure under analysis occurs with only a few verbs: 98 % of the citations in her corpus (the COBUILD corpus) occur with the two verbs find and make; occasionally the verbs think and consider can be found. There are further restrictions noticeable as to the adjectives following the placeholding it: find predominantly co-occurs with difficult, hard, or easy, whereas make shows a preference for combining with clear, resulting in fragments such as find it difficult/hard/easy, make it clear. The whole structure is related to a particular communicative function, namely to present a situation as it is evaluated (cf. Francis 1993: 140-141). Francis (1993: 144) presents other examples of "prepackaged phrases", where the lexical choices are pre-empted, as in put on a brave face, or where there is only a limited choice, as in I haven't the faintest/slightest/foggiest/remotest/least idea/notion/conception. The examples given can be understood to be on a cline between idioms, which can actually be seen as single choices, and free phrases in which both the choice of words and the way of combining them are only restricted by the meaning to be conveyed and the rules of syntax. Phrases or fragments in which the selection of words is not free, that is, in which all or some lexico-syntactic choices are pre-empted, represent clusters of usually/habitually co-occurring words that are commonly called "collocations"76 (cf. also footnote 31 section 4.2.3).
The term goes back to Firth (1957: 11, 14), who established the concept of "collocation" to denote the syntagmatic relations between actual words (as against those between word categories). His idea that a word, or rather its meaning, is known by the company it keeps reflects his understanding of meaning as function in context77 and is the basis for his definition of "collocation": "The habitual collocations in which words under study appear are quite simply the mere word accompaniment.... Collocations are actual words in habitual company." This very general characterization has been further specified and elaborated by linguists who enquire into the phenomenon of co-occurrence preferences/restrictions. Altenberg & Eeg-Olofsson (1990: 3) differentiate between a broader and a stricter sense in which the term "collocation" is used and understood in linguistics, with the former being equivalent to "recurrent word combination", and the latter to "habitually co-occurring lexical items" or "mutually selective items". As can be easily seen, it is this latter sense that follows from the Firthian definition. For the two senses of the term Altenberg & Eeg-Olofsson (1990: 3) further specify:
Both interpretations imply a syntagmatic relationship between linguistic items, but whereas the broad sense focuses on word sequences in texts, the stricter sense goes beyond this notion of textual co-occurrence and emphasizes the relationship between lexical items in a language (...)
That means that collocations in a stricter sense seem to have become part of the inventory of a language's elements, they seem to have acquired the status of "single" (though obviously analysable) elements that are disposable for the further (rule-governed) construction of utterances. From my point of view, this status cannot be attributed to all the collocations alike, since there are differences in their entrenchment in a language user's linguistic repertoire. That means that only the very frequent and rigid collocations can be assumed to have a unit-like status, whereas the less frequently used collocations and the ones that are more flexible (i.e., those allowing for quite an amount of variation) are more likely to be represented in the mental lexicon as individual words, though equipped with the respective co-occurrence information (for details on "co-occurrence information" see section 3.3.2). Hence, I consider collocations to be a probabilistic phenomenon, that is, the co-occurring words tend to be used together more or less frequently, in a more or less rigid form78. Smadja (1994), starting out from Benson's (1990) definition79, discusses four properties of collocations, mainly from the perspective of computational linguistics, where special attention is paid in the description of linguistic data to those facts that are relevant to automated language processing. The features he attributes to collocations are those of being arbitrary, domain-dependent, recurrent and cohesive lexical clusters (cf. Smadja 1994: 146-147). The first feature is equivalent to claiming that collocations are language-specific, that is, they may well be different in different languages. This makes them unpredictable for the foreign learner of a language, as can be seen in the following examples:
(105) deliver/give a lecture        eine Vorlesung halten
(106) deliver/give/make a speech    eine Rede halten
(107) make a decision               eine Entscheidung treffen
(108) meet the requirements         den Anforderungen entsprechen
(109) a crushing defeat             eine vernichtende Niederlage
(110) a formidable challenge        eine gewaltige Herausforderung
(111) blizzards rage                Blizzards toben
(112) bees buzz                     Bienen summen
(113) a pride of lions              ein Rudel Löwen
(114) sound asleep                  fest schlafend
(115) amuse thoroughly              sich gut/köstlich amüsieren
(source of the English examples: The BBI Combinatory Dictionary of English).
76 For a list of alternatively used terms see Kjellmer 1994.
77 This understanding of meaning goes back to Malinowski, who - according to Steiner (1983: 57-58) - defines "meaning" by referring to both "function" and "context": Meaning is function in context, it is use of language and not concept. The same concept of meaning is taken up and further elaborated by Halliday. It is eventually reflected in collocational analyses (e.g., Sinclair 1991), in analyses regarding the typical syntagmatic context of words.
78 This assumption deviates from other readings of the term "collocation". Benson et al. (1986: ix), for example, define collocations as "fixed, identifiable, non-idiomatic phrases and constructions", whereas I attribute the feature of being fixed to rigid and highly established collocations only.
79 "A collocation is an arbitrary and recurrent word combination" (Benson [1990], quoted in Smadja 1994: 146).
But also the native speaker of English will not be familiar with such collocations until he has sufficiently experienced them. Only after he has been exposed to these combinations will they become predictable for him, will they become entrenched or established units for him. In this respect, they seem to be comparable to words: they must be remembered. Smadja's claim (1994: 146) that collocations are domain-dependent relates to differences in the use of collocations that result from their being restricted to particular domains or technical fields. Thus, a dry suit has a special meaning in the technical language of sailing. It does not refer to a suit that is dry, but to a suit that sailors wear to stay dry in bad weather conditions. In order to understand or to use the phrase correctly, it is not sufficient to logically infer its meaning from the meanings of its constitutive parts, but one has to have acquired it. That collocations are recurrent means that they are repetitive (in the respective context), that is, they represent word combinations typical of the language in general or of particular domains, they are recurring clusters of words. The collocational feature of being a cohesive lexical cluster is meant to express the ability of one or several constitutive words to predict the rest of the collocation. From this Smadja (1994: 146) concludes about statistical distributions of collocations:
This means that, for example, the probability that any two adjacent words in a sample will be "red herring" is considerably larger than the probability of "red" times the probability of "herring". The words cannot be considered as independent variables.
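The comparison Smadja describes can be made concrete with a small sketch. The following Python fragment is only an illustration under stated assumptions: the token sample is invented for the purpose (it is not corpus data), the function name `adjacency_ratio` is hypothetical, and probabilities are estimated as simple relative frequencies. It compares the estimated probability of the adjacent pair with the product of the two single-word probabilities; a ratio well above 1 is what the quoted passage would lead one to expect for a cohesive cluster.

```python
from collections import Counter


def adjacency_ratio(tokens, first, second):
    """Estimate P(first second) / (P(first) * P(second)) from a token list.

    Assumes both words occur at least once in the sample.
    """
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))
    n_tokens = len(tokens)
    n_bigrams = n_tokens - 1
    # Relative frequency of the adjacent pair.
    p_pair = bigrams[(first, second)] / n_bigrams
    # Probability expected if the two words were independent.
    p_independent = (unigrams[first] / n_tokens) * (unigrams[second] / n_tokens)
    return p_pair / p_independent


# Invented toy sample (an assumption for illustration, not corpus data).
toy_tokens = (
    "that story is a red herring and the other clue is a red herring too "
    "but the car is red and the herring is fresh"
).split()

ratio = adjacency_ratio(toy_tokens, "red", "herring")
print(ratio > 1)  # the pair occurs more often than independence predicts
```

On a sample this small the figure itself means little; the sketch only shows which two quantities are being compared.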
Like Francis (1993), Smadja also attributes different degrees of rigidity to collocations, the two extremes being very rigid and very flexible ones. The latter allow for quite some variability and, hence, are characterized as collocations on the basis of their recurrent, though variable, occurrence, a feature by which they should differ from free combinations80. Mackin (1978: 151-152), discussing what sort of word-strings should be included in a dictionary of idiomatic English, that is, where a borderline between collocations and free phrases (which he calls "open collocations") is to be drawn, states that
[o]ne method of determining whether to include or exclude a given collocation in such a dictionary is to regard it as having a position somewhere on a scale (... a cline) of probability. On this scale, at the lower level of probability of cooccurrence we would place expressions like 'colorless green ideas', to quote a famous concoction, at the higher level of probability ... expressions like 'eke out' and 'bode ill/well (for...)'. The more uniform the usage of any given collocation, the more predictable or fixed each constituent word may be said to be in relation to the other phrases.
80 Benson, Benson & Ilson (1986: xxiv) define "free (lexical) combinations" as "those in which the two elements do not repeatedly co-occur; the elements are not bound specifically to each other; they occur with other lexical items freely."
The feature of recurrent occurrence, which is related to frequency, is sometimes taken to be the only necessary and sufficient criterion on the basis of which to determine the collocations of a language (e.g., Sinclair 1991, Kjellmer 1994). Frequency of co-occurrence is a critical parameter in a corpus-based approach to the identification of collocations. Clear (1993: 273) states that, though, in principle, every single co-occurrence of words can be seen to constitute a collocation (since the corpus is only a sample of the language under analysis), a threshold value is usually applied to the frequency of occurrence, so that single occurrences are discarded. Hence, his definition of collocations also reflects frequency: "I have defined collocations as the mere recurrent co-occurrence in text of word-forms". He also comments on another feature, which he calls "stereotyping", and which he considers to be a consequence of recurrent cooccurrences. He records that these recurrent clusters have a tendency to slip out of their appointed range of free commutation and form particular attachments... [they have the tendency] to develop a life of their own as identifiable pieces of a native speaker's lexical hoard. (Clear 1993: 272-273)
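The thresholding step Clear describes can be sketched as follows. This is a minimal illustration, not Clear's procedure: the function name `recurrent_pairs`, the restriction to adjacent word-forms, the cut-off value of 2 (which simply discards single occurrences) and the toy token list are all assumptions made for the example.

```python
from collections import Counter


def recurrent_pairs(tokens, min_count=2):
    """Return adjacent word-form pairs that occur at least min_count times.

    Restricting the count to adjacent pairs is a simplifying assumption.
    """
    pair_counts = Counter(zip(tokens, tokens[1:]))
    # Apply the frequency threshold: pairs seen only once are discarded.
    return {pair: n for pair, n in pair_counts.items() if n >= min_count}


# Invented toy sample (an assumption for illustration, not corpus data).
toy_tokens = "make it clear that we find it hard to make it clear".split()

recurrent = recurrent_pairs(toy_tokens)
print(recurrent)
```

Raising `min_count` makes the candidate list shorter and more conservative; which value is appropriate for a real corpus is not settled by the passage above.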
It is important to my argument to note Clear's assumption regarding the status of collocations: "identifiable pieces of a native speaker's lexical hoard", that is, they can be assumed to be stored in a native speaker's mental lexicon. However, we will qualify this statement, following Kjellmer (1994: xvii), who claims that "[l]exicalization as such is clearly not a suitable criterion for collocationhood." From my point of view, the lexicalization of a collocation depends on two facts, namely on its "rigidity", and on the degree to which it is entrenched/established in the individual language user. Consequently, I (once again) have to assume a continuum with the entrenched and rigid collocations at one end - which are the more likely candidates for lexicalization - and the little established and variable ones at the other
- which are less likely candidates for being lexicalized as long as there are no substantial differences noticeable in the use of the respective variables. As soon as particular preferences become evident, the likelihood of becoming lexicalized increases. Discussing the term "collocation" as it is defined by Benson, Benson & Ilson (cf. above), Kjellmer (1994: xvi) lists still another criterion for the distinction between collocations and free phrases, namely that of the substitutability81 of constitutive words: Substitutability and frequency are the main criteria for deciding whether a string of elements belongs in one category rather than in the other. If the elements of the string are uniquely bound to one another, in the sense that none of them can be replaced without a change of meaning, the string clearly has a permanent character...
81
The same phenomenon is discussed by Altenberg (1991: 128) for amplifier collocations in spoken English. He records that many amplifiers tend to be "collocationally restricted", i.e., the words (amplifier and amplified) can not usually be substituted freely.
The less substitutable a constitutive word is, the more the lexical choice is pre-empted in Francis' sense (cf. above), the closer the collocation is to a truly "fixed expression". For his own dictionary (A dictionary of English collocations, 1994), Kjellmer also draws on the feature of grammatical well-formedness in order to decide whether or not to include a recurrent string of words into the dictionary, whether or not to treat it as a collocation. This enables him to discard groupings of words that occur repeatedly, but have no "organic interrelationship", as e.g., day but, however in the, night he etc. (cf. Kjellmer 1994: xv). In other words, the words making up a collocation will have to correspond to the (hierarchically determined) sequences licensed by the syntax of English, because they reflect conceptually related facts. This is a plausible requirement. For, as I mentioned in sections 3.2 and 3.3, I consider the recurrent character of many phenomena, situations, and events which we experience and talk about to be the eventual motivation for collocations to arise. And experiencing/recognizing reality as a particular event (state, phenomenon, etc.), i.e., the structuring of reality, implies that the elements playing a role in this particular event (etc.) are also conceptually related. Ergo, the fact that collocations represent grammatically well-formed patterns is due to their expressing structured experience. The last of Kjellmer's criteria for including a word string into his dictionary is that it must be native-like. If I understand Kjellmer correctly, this criterion is comparable to what I already said about the predictability of collocational elements. Particular habitual combinations are not logically or semantically motivated; they look as if constructed at random. Yet, the language user is not free to choose: native-like usage requires that he use make the bed, but lay the table, tremble with fear, but quiver with excitement (the examples are quoted from Kjellmer 1994: xviii). He cannot know about these constraints until he has experienced and acquired them. Yet, I have to comment on still another phenomenon that has been attributed to the use of words in collocations and relates to the meanings of the constitutive words. I said that the language user's choice of words in a collocation is pre-empted, that is, the words making it up are not chosen independently. That implies for these words that "they convey meaning only as a part of the environment in which they are used: they are not meaningful as separate units." (Partington 1993: 186) In other words, they develop a "collocative meaning", which Leech (1981: 17) defines as "the associations a word acquires on account of the meanings of words which tend to occur in its environment." Sinclair (1992) attributes these changes in the meanings of words making up a collocation as compared to the meanings they have as individual words to the fact that the meaning of a collocation is shared between the constitutive words, and describes the process as the phenomenon of "delexicalization"82. He elaborates that [t]he meaning of words chosen together is different from their independent meanings.
They are at least partly delexicalized. This is the necessary correlate of co-selection. If you know that selections are not independent, and that one selection depends on another, then there must be a result and effect on the meaning which in each individual choice is a delexicalization of one kind or another. It will not have its independent meaning in the full if it is only part of a choice involving one or more words. A good deal of ... evidence leads us to conclude that there is a strong tendency to delexicalization in the normal phraseology of modern English. (Sinclair 1992: 16-17)
82
The phenomenon of delexicalization is also topicalized by Ross (1992: 168), who speaks of a "semantic contagion". It is defined as the "adaptation of meaning (of the same words) to varying semantic contexts".
That means that a word's meaning in a collocation cannot be separated from its environment. For the meaning is represented by the collocation as a whole, and usually this particular meaning is not made up by simply adding the meanings of the words involved in it. On the contrary, in most cases, the meanings of the words constituting a collocation will influence and modify one another; the concepts they refer to may vary, with the variation depending on what the co-occurring words are. Take the word dry for example: it will be related to different concepts in phrases such as a dry suit (in the technical sense mentioned above) or a dry cow. In other words, the concepts named by the whole phrases are not the result of simply adding the concepts named by the individual words, but they are the result of an integration of the two, with the one selecting, emphasizing, or profiling particular aspects of the other. The question arising from this is when words actually express their "independent" meanings and what they consist of. From my point of view, the "independent" meanings of words are the meanings these words have when they are used in free combinations of words, or in isolation (which is a highly unusual case), as e.g., in
(116) Can I have a dry shirt, please?
(117) Dry is the antonym of wet.
However, as soon as the combination as a whole refers to an integrated conceptual entity (which can be considered to be a result of the process of blending described in section 4.4.2), the meaning of the same adjective is no longer "independent", but is influenced and modified by the noun it combines with. Thus, whereas in a dry shirt, the meanings of dry and shirt can be understood to add up to refer to a shirt that is dry (in its "independent", or rather "default" sense of "not being wet"), in a dry cow, the meaning of dry is affected in a particular way by its co-occurrence with cow: A dry cow is a cow which does not produce milk for some reason, and in
order to know this, one will have to have encountered the phrase before in a context in which this sense was self-evident, or one has to be told what this phrase means. That means the "independent" meaning of dry cannot merely be added to that of cow, but the phrasal meaning is either available to the user as a whole conceptual entity (a dry cow - i.e. a pregnant cow, or a calf, or a very old cow - vs. a milk-giving one, i.e., a "normal" cow), or, if this is not the case, it can only be understood via accessing both the concepts and inferring about the relationship between them, that is by finding a way in which the two concepts might be integrated to make sense. The list of examples given below shows in which way the meanings of the adjective dry are influenced or modified by a following noun, when dry "shares its meaning" with this noun in a collocation:
dry + N:
river/lake/well: N which has (temporarily) run out of water;
oil-well: N which is used up, no longer producing any oil;
cow: N which no longer produces milk (cf. also above);
weather/period of time/a place: N without moisture (from precipitation, with little or no rainfall);
mouth/throat: N lacking moisture (thirst, having little or no saliva);
cough: type of N (serious, no production of phlegm);
eyes: lacking moisture, often used non-literally: without mourning;
place: N where there are laws forbidding anyone to drink, sell, or buy alcohol;
humor: type of N (amusing, in a subtle and clever way);
voice: type of N (showing no emotions, cold or dull);
piece of writing/speech: N without much embellishment (dull and uninteresting);
bread: piece of N with no topping;
sherry/wine: N of a particular quality (not sweet);
sound: type of N (rough, sharp, crackling, not smooth).
The fact that the combined words refer to a conceptually integrated entity is even more obvious in compounds, which have been established and lexicalized for the very reason that there was a need to name an entity related to, but different from those entities named by its constitutive elements:
compounds dry + N: dry cleaner, dry dock, dry ginger, dry goods, dry land, dry rot
(source of the adjective + noun combinations and the compounds: Sinclair 1987b: 437).
In each of the combinations the word dry expresses a different, though related sense. And these senses are by no means covered by what is given as the "independent" meaning of the word: "Something that is dry has no water or other moisture on it or in it." (Sinclair 1987b: 437). I am inclined to say that what we consider to be independent meanings of individual words is nothing but the central/prototypical case of them being used in a context, that is, the strong default value. So, when the meaning of a word is to be described without a context, for instance, as a reply to the question of "what is the meaning of X?", we tend to describe what we mean by the word by referring to the typical situation in which it is most commonly used. For dry this may be the definition just quoted. From what I have said so far, I can summarize the following features with regard to which collocations differ from free word combinations:
- the recurrence/frequency of co-occurrence
- the degree to which elements are substitutable without markedly changing their meaning
- the degree to which the constitutive words are predictable
- the way their constitutive parts contribute to the meaning of the utterance in which they are used
- the potential of being or becoming lexicalized.
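The first of these features, the frequency of co-occurrence, can be made concrete by comparing how often two words actually occur next to each other with how often they would be expected to do so if they were chosen independently. The following is only a minimal sketch under stated assumptions: it uses pointwise mutual information (PMI), a common association measure that is not used in the studies discussed here, it considers adjacent words only, and the function name `pmi_scores` and the toy word sequence are invented for illustration.

```python
import math
from collections import Counter


def pmi_scores(tokens):
    """Return the pointwise mutual information of every adjacent word pair.

    Hypothetical helper: PMI = log2(P(w1, w2) / (P(w1) * P(w2))).
    A positive score means the pair co-occurs more often than chance
    (i.e. independent selection of the two words) would predict.
    """
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))
    n_tokens = len(tokens)
    n_bigrams = n_tokens - 1
    scores = {}
    for (w1, w2), observed in bigrams.items():
        p_pair = observed / n_bigrams
        p_w1 = unigrams[w1] / n_tokens
        p_w2 = unigrams[w2] / n_tokens
        scores[(w1, w2)] = math.log2(p_pair / (p_w1 * p_w2))
    return scores


# Toy word sequence, invented for illustration only (not corpus data).
tokens = ("make the bed and lay the table then make the bed again "
          "and lay the table again").split()
scores = pmi_scores(tokens)

# Print the pairs from the strongest to the weakest association.
for pair, score in sorted(scores.items(), key=lambda item: -item[1]):
    print(pair, round(score, 2))
```

On a sample this small the scores say nothing about English; the sketch is only meant to show the comparison of observed with expected co-occurrence that a frequency criterion presupposes.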
As the examples of collocations presented so far may suggest, the occurrence of collocations seems to be a common phenomenon in language use, that is, strings of texts will probably contain quite a number of word combinations which are not incidental, but habitual. Pawley & Syder (1983: 193) have observed that the ignorance of these habitual co-occurrences can make the resulting text unidiomatic:
It is a characteristic error of the language learner to assume that an element in an expression may be varied according to a phrase structure or transformational rule of some generality, when in fact the variation (if any) allowed in nativelike usage is much more restricted. The result, very often, is an utterance that is grammatical but unidiomatic...
Kjellmer (1992: 329), who analysed some very basic patterns in an English corpus (namely, the use of present vs. past tense, of to + infinitive, of passive), hypothesizes that: [i]f items belonging to the same rule-defined group differ significantly ... with respect to the incidence of the pattern(s) allowed by the rule, this could be taken as an indication of the existence of lexically-based restrictions that operate within the rule-defined field. Such restrictions could then be seen as instrumental in selecting nativelike sequences from among merely grammatically possible ones.
The results of his analyses confirm what he postulated. He can demonstrate that the use of a language does not simply consist in the free application of its general rules, but is partly constrained by the lexical items that are selected for communicating a particular message in that those are habitually bound to occur in particular structures. I would like to extend this idea by adding that lexical items can also be habitually bound to co-occur with other lexical items, as is shown in the manifold lexical patterns that corpus analyses have brought to light. In the following I have compiled what some of the (almost countless) studies which aim at the elicitation of lexical patterns from text corpora have revealed: Kjellmer (1984: 168) topicalizes collocations "introduced by" the verb give. His analysis is based on the Brown-corpus and extracts such collocational items as give a damn, give away, give him time, give information, give me a chance, give rise to, give way, to name but a few. Altenberg (1991: 136, 139) analyses amplifier collocations in spoken English, and produces lists of preferred co-occurrences, e.g.,
quite + sure/ clear/ right/ certain/ different/ agree, etc.
absolutely + nothing/ no/ certain/ not/ super, etc.
perfectly + well/ true/ all right/ willing, etc.
entirely + new/ agree/ different, etc.
completely + different/ wrong/ free, etc.
very much + thank you/ thanks/ depend/ like, etc.
terribly + difficult/ hard/ important, etc.
jolly + good/ well/ nice, etc.
extremely + difficult/ good/ well, etc.
awfully + nice/ early/ good, etc.
bloody + cold/ great, etc.
Kennedy (1991) presents an analysis regarding the lexical patterns (i.e., the collocations with preceding and following words) in which the two prepositions between and through take part and the semantic functions they serve. Kjellmer (1991) provides the reader with a more general discussion of the phenomenon of collocations, characterizing types of set expressions and analysing collocations in a prose sample. Renouf & Sinclair's (1991) analysis concentrates on the elicitation from a section of the COBUILD database of "collocational frameworks in English": Our 'frameworks' consist of a discontinuous sequence of two words, positioned at one word remove from each other; they are therefore not grammatically self-standing; their well-formedness is dependent on what intervenes. (Renouf & Sinclair 1991: 128)
The frameworks they analyse are made up of grammatical words, e.g., 'a + ? + of', 'be + ? + to', or 'had + ? + of'. The analysis shows that these frameworks are highly selective in their collocates:
a lot/ kind/ number/ couple/ matter/ sort/ series/ piece/ bit/ ... of
be able/ allowed/ expected/ said/ made/ prepared/ possible/ ... to
had enough/ plenty/ thought/ heard/ one/ died/ spoken/ none/ ... of
(cf. Renouf & Sinclair 1991: 130-133). The authors can demonstrate (Renouf & Sinclair 1991: 143) that two very common grammatical words, one on either side, offer a firm basis for studying collocations. We have shown that the choice of word class and collocate is specific, and governed by both elements of the framework. ... We have also offered evidence in support of a growing awareness that the normal use of language is to select more than one word at a time, and to blend such selections with each other (...).
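How the fillers of such a framework might be collected from a text can be sketched as follows. This is a hedged illustration only, not Renouf & Sinclair's actual procedure: the function name `framework_fillers`, the whitespace tokenization, and the sample sentence are assumptions made for the example.

```python
from collections import Counter


def framework_fillers(tokens, left, right):
    """Count the words filling the slot of the framework 'left + ? + right'.

    Hypothetical helper: slides a three-word window over the tokens and
    keeps the middle word whenever the outer two words match the framework.
    """
    fillers = Counter()
    for first, middle, last in zip(tokens, tokens[1:], tokens[2:]):
        if first == left and last == right:
            fillers[middle] += 1
    return fillers


# Invented sample sentence, for illustration only (not corpus data).
text = ("a lot of people had a kind of plan and a lot of time "
        "but a man of few words had enough of it")
tokens = text.split()

fillers = framework_fillers(tokens, "a", "of")
print(fillers.most_common())  # fillers of 'a + ? + of', most frequent first
```

Run over a large corpus instead of one sentence, a count of this kind would yield the sort of ranked filler lists cited above; the sketch leaves aside lemmatization and case, which a real study would have to settle.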
From the types of collocations cited here, we can furthermore conclude that the syntactic chunks in the sense of "co-selected words" which the language user may use as ready-made units do not always and necessarily correspond to phrases as syntactically complete structures.
Francis (1993) discovers lexical regularities in constructions with introductory it as object (cf. above) and appositive that-clause qualifiers. The latter co-occur with six broad sets of semantically grouped nouns, e.g., nouns expressing an illocutionary process (announcement, recommendation, suggestion...), those encoding a mental state regarding a particular issue (assumption, belief, view...), or those expressing feelings and attitudes (astonishment, expectation, surprise...) (cf. Francis 1993: 149). For the noun reason as a head noun, she elicits from her corpus (which is the COBUILD) a unique phraseological environment, namely the pattern for the simple reason that... (cf. Francis 1993: 153). Hoey's (1993: 81) analysis is also focussed on the word reason. He produces a long list of findings, from which I have selected only a few for presentation:
1. The meanings of reason ("rational faculty" and "cause") have an influence on the syntactic structures in which the word occurs.
4. Reason in sentence-initial position expresses a reason relation in patterns such as
x. The reason is simple, y
x. The reason is y.
x. The reason for this (z) is y
The reason x is y.
5. When reason is (part of) the object, it is typically followed by for x, why + clause, or to x.
7. When reason is object, the typical verb it follows is either see or have.
Also here, it becomes obvious that one word can and actually does exhibit preferences regarding its structural and lexical environments. Clear (1993: 280-281, 286-287) illustrates the description of a computer program for the study of collocations by discussing collocates around the node words83 taste and order. Some of the collocates are
clearly and quite rigidly associated with the node words: acquired taste, good/bad taste or in order to, order of magnitude, pecking order, tall order, restore/keep/maintain order, etc. Sinclair (1987c, 1991) comments on the patterns to be found around the verbs happen and set in, noting that an "innocent" verb like happen is commonly associated with unpleasant events and, on set in, that [t]he striking feature of this phrasal verb is the nature of the subjects. In general they refer to unpleasant states of affairs. Only three refer to the weather, a few are neutral, such as reaction and trend. The main vocabulary is rot (...), decay, malaise, despair, ill-will, decadence, impoverishment, infection, prejudice ... slump. Not one of these is desirable or attractive. (Sinclair 1987c: 155-156)
This collocational phenomenon, which is usually noticeable only when a larger number of such word combinations have been compiled, has been termed "semantic prosody". Sinclair uses the term "prosody" in the same sense as Firth (cf. Louw 1993: 158), thus indicating that the phenomenon named by it extends over more than one unit. Further studies focussing on such semantic profiles of collocations are Louw (1993) and Stubbs (1995), for example. Stubbs' (1996) book on corpus analyses contains many examples of how particular words are used in the same structural pattern(s), that is, how these structures are often lexically specified. One of his studies is related to the words happy and happiness. In a comparison of two short texts, he finds that both words occur in particular structural frames, such as the Subject-Verb-Complement (SVC84: sb. is X), Subject-Verb-Object-Complement (SVOC: sb. makes sb. else X), and that there are also particular collocates: happy life, be happy, die happy, live happy; step towards happiness, get happiness, giving out happiness (cf. Stubbs 1996: 85-88). Johansson & Oksefjell (1996: 61-62) present an interesting study about the syntax and semantics of the verb get, which can occur in a variety of structures, with its meaning determining which one is to be selected in a particular utterance. But they also come across lexical patterns, such as get hold of, get in touch.
83
The terms "node word" and "collocate" are used in Sinclair's (1991) sense: "A line of text [i.e., a concordance line D.S.] may contain as many as eight or nine words on either side of the central word, or node, ... it is reasonable to examine the vocabulary of the concordance. In order to do this, a list is compiled in frequency order, of all the word-forms in the concordance. These are called the collocates of the node." (Sinclair 1991: 105)
84
We use these terms and abbreviations in the same sense as Quirk et al. (1985).
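Sinclair's procedure for obtaining collocates, as quoted in the note on "node word" and "collocate", can be sketched in a few lines of Python. The sketch rests on assumptions: the function name `collocates`, the adjustable `span` parameter, and the sample sentence are hypothetical, and the text is split on whitespace only.

```python
from collections import Counter


def collocates(tokens, node, span=4):
    """List the word-forms found within `span` words of `node`, most frequent first.

    Hypothetical helper: for every occurrence of the node word, the words
    up to `span` positions to its left and right are collected (the
    vocabulary of the concordance lines) and then counted.
    """
    counts = Counter()
    for i, token in enumerate(tokens):
        if token != node:
            continue
        window = tokens[max(0, i - span):i] + tokens[i + 1:i + 1 + span]
        counts.update(window)
    return counts.most_common()


# Invented sample sentence, for illustration only (not corpus data).
tokens = ("it was in good taste and an acquired taste is a matter "
          "of taste not of order").split()

result = collocates(tokens, "taste", span=2)
print(result)
```

A narrow span is used here only because the sample is short; the quotation mentions eight or nine words on either side, which would correspond to a larger `span` value.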
Gavioli's (1997) exploration of texts by means of a concordancer shows how the concordances (i.e., collocational data of a node-word extracted from a corpus) found for a word (her example is rift) can reveal information about its meaning, at the same time demonstrating the effect the character of the text corpus (that is, the text types considered) exerts on the results: An English newspaper corpus reveals the most frequent collocates of the word criminal. They are: war/act/law. Her analysis of a corpus of academic texts, however, had law/liability/English at the top of the frequency list of collocates (cf. Gavioli 1997: 87-88). She also gives an example of how to use concordances to pin down interferences between a language learner's native language and the language he is learning: English crucial is compared with Italian cruciale, so that the learner can easily find out commonalities and differences (cf. Gavioli 1997: 94-95). Corpus analyses focussing on the elicitation of both structural and lexical patterns (in English texts) have also resulted in language descriptions of a more general kind. The COBUILD corpus, recently renamed "the Bank of English", was the information source for the compilation of an English dictionary and an English grammar book: Collins COBUILD English language dictionary and Collins COBUILD English grammar (Sinclair 1987b and 1990). That the data list presented here comprises only collocational data regarding the English language does not mean that those about other languages are not available. However, they are to be found far less frequently85 and hence do not cover such a wide range of phenomena as I have just shown for English. After having established what collocations are and having provided sufficient evidence from corpus-linguistic research for the assumption that language products (i.e., spoken and written texts) are obviously rich in them, one should also enquire into the reason why language in use exhibits/has such patterns.
85
For an example see Dodd (1997).
Sinclair's answer to this question - as already discussed in section 4.2.3 - is the assumption of an "idiom principle" (besides an "open-choice principle") for the functioning of language. I will repeat here the essential phenomenon described by the idiom-principle: The repetitive occurrence of particular patterns, both structural and lexical, suggests that the language user does not always compute his utterances by applying general construction rules to the constitutive elements of this language, but that he may just as well have representations of larger constructions which are holistically retrieved and inserted into the utterance under construction. The existence of patterns or clusters (both structural and lexical) in language use is assumed to "reflect the recurrence of similar situations in human affairs; it may illustrate a natural tendency to economy of effort; ..." (Sinclair 1991: 110). I would like to add that the occurrence of particular language patterns, the existence of the idiom-principle as a strategic device in language use, reflects the way we experience and structure or construe the world, the way we conceptualize what we experience; the patterns reflect the mental models we construct from what we know and perceive (cf. also p. 233-234). Does this imply that the open-choice principle is only of marginal importance? Do language users rarely construct their utterances word by word, following the general syntactic rules for their combination? Do they use semi-finished construction elements instead of operating like traditional bricklayers? According to most linguistic models, which make a clear distinction between lexicon and productive syntactic rules, language users should be bricklayers. Corpus linguists judge differently: According to Sinclair, the ratio between pre-fabricated chunks and freely constructed phrases in language use is reflected in the occurrence of patterns in texts. He concludes from his research in lexis and collocation that the structural and lexical patterns are repeated much more often than one expects on the basis of the idea that words are combined as licensed by syntactic rules. That is why he claims that the idiom-principle is dominant.
A similar view is held by Kjellmer, who states (1994: ix) that [t]here is no doubt that natural language has a certain block-like character. Words tend to occur in the same clusters again and again. When we speak or write it is therefore more apposite to say that we move from one cluster to the next than to say that we move from one word to the next. Words differ with regard to their 'constructional tendency' - in some cases it is high (...), and in others it is low (...) - but the clustering tendency can probably be shown to exist in any natural language, ...
Other corpus linguists, such as Stubbs (1993), Altenberg & Eeg-Olofsson (1990), have come to similar conclusions, but they are not the only ones. From the perspective of language acquisition, the existence of patterned speech is discussed by Maratsos. Starting out from considerations about the organization of children's word combinations, he notes (1979: 296) for the adult language user that [a]ttention to highly specific lexical patterns remains a constant feature even in adult syntax. Adult English requires much specific semantic-syntactic memorization of the speaker, down to a level of near idiomaticness for many usages.
Levelt (1989: 21), trying to specify the language user's activities involved in the production of an utterance, comments on the prefabricated chunks available to the speaker: ... not all processing in message encoding is under executive [i.e., central D.S.] control. An adult's experience with speaking is so extensive that whole messages will be available in long-term memory and thus will be retrievable.
Thus, there is considerable agreement on the existence of habitually recurring lexico-syntactic strings86 in language products, and the seemingly large amount thereof allows for the assumption of a "hybrid" linguistic element: an element which has features of a single word (it has an integrated meaning, it has the potential of being used invariantly, as a (quasi-) lexicalized unit) and features of a construction to be built in the process of language processing (the element is analysable, its constitutive elements may be flexibly put together, but they contain biasing co-occurrence information, with the bias being more or less strong). I consider this assumption to be especially important for our discussion of natural linguistic models. It is not only based on the observations and comments just mentioned, but it also rests on one of the features ascribed to collocations: Owing to their arbitrary nature, collocations must be acquired/remembered individually. This necessarily implies that the mental lexicon does not only comprise single words, but also larger phraseological units of either a fixed or a more variable type (for a comparable argument see Kjellmer 1991). I, therefore, require that a natural/plausible linguistic model reflect the existence of such elements. In other words, since the hypothesized existence of (quasi-)lexicalized utterance fragments implies that linguistic information may be represented more than once, in various forms (here in the form of a pre-fabricated cluster - potentially with the words fully specified - side by side with the knowledge of the rules for combining words into such a string), it is implausible that linguistic models should be free of redundancy. Redundancy-free models do not reflect the language facts appropriately, when they "merely" posit a set of productive grammar rules and a lexicon, without leaving room for an inventory of lexicalized and semi-lexicalized utterance fragments. Apart from that, the phenomenon of "collocation" is further evidence for a blurred distinction between a language's lexicon and syntax: Fragments which look as if they were produced by combining words according to the licensed syntactic rules (may) turn out to be stored as wholes and accessed and retrieved from memory in a way comparable to single words. Linguistic models could incorporate such a phenomenon by considering lexical and syntactic knowledge not as totally different in type, but as linguistic knowledge which merely differs with regard to the level of abstraction. Since the postulation that collocations are hybrid linguistic elements exhibiting both features typically considered syntactic and lexical features is no more than a logical and reasonable conclusion from the data, I thought it necessary to look for true evidence for such a strong claim. That is why I decided to additionally put the claims I make with regard to collocations to the test by running a suitable experiment. What I did and found out is described in what follows.
86
As to the form of the strings that language users have available (as an effect of the idiom principle), they do not necessarily correspond to the syntactic units (phrases, clauses) specified for that language: The length of the syntactic fragments may vary from a single word inclusive of its co-occurrence information to a complete sentence (see also section 4.2.3).
Chapter Six In the psycholinguist's laboratory
6.1. An experimental test
It is often hard to determine what predictions a theory makes about performance in a given experimental situation, and a wrong interpretation of the theory's prediction will make the experiment useless, or at least less informative than intended. It is very difficult to even conceive of experiments that would test many aspects of any given theory. (Stemberger 1985: 10)
In the preceding chapter I inspected particular kinds of performance data for the way in which they can provide evidence for the one or the other assumption about the lexicon-syntax interrelation made by the linguistic models under discussion. I did so by using psycholinguistic claims as an intermediary. That means the performance data were interpreted with regard to what they can reveal about language use, and then the claims that could be plausibly supported by the data were made the measure against which the linguistic models have to be evaluated for their naturalness. I presented performance data which speak for a close interaction in language processing of a language's lexicon and syntax. At the same time, they show how difficult it is to clearly attribute particular processing procedures to either of the two. Grammatical encoding as well as parsing seem to be based on both lexical co-occurrence information and knowledge of syntactic rules. It is especially the phenomenon of lexico-syntactic patterns which calls into question whether syntax and lexicon can be separated neatly, with the one representing the elements and the other giving the rules for their combination. On the other hand, the mere occurrence of such patterns in texts is not sufficient evidence for their existence as utterance fragments that are different from freely constructed syntactic phrases. Though the repetitive nature of the former makes them special, it must be asked whether the two are actually different in status and/or whether they differ in their processing procedures, and whether it is, hence, justified
to take collocations as indicative of a fuzzy borderline between lexicon and syntax. How can one get at that? Bearing in mind that I required that the process of verifying one's hypotheses exploit all sorts of evidence (introspective, performance-related and experimentally elicited), I decided to run an experiment to see what the elicited data can contribute to, and reveal about, the processing mechanisms I am interested in. Once again, the experimental evidence relates to the linguistic claims under analysis via the support it can give to the one or the other assumption about language processing procedures. The experiment was meant to discover differences in the processing of collocations as against the processing of free constructions. If any such differences can be found, this would imply that collocations have a special status as compared to free phrases. As was shown in section 5.4, there is every reason to assume that collocations are hybrid linguistic forms which combine features of syntactically constructed units with those of lexically stored ones. This characterization would mark them as being on the borderline between syntax and the lexicon, blurring an exact distinction between the two and thus speaking against a strictly modular organization of the language processor. The experiment I am presenting here aims at the potential differences in the processing of collocations and freely constructed syntactic phrases ("productive phrases") with regard to language comprehension. For language comprehension, the models whose claims we share predict that the words already accessed constrain the process of constructing a syntactic representation. They do so via the information contained in their lexical entries (no matter whether represented in a distributed or non-distributed way).
The constraints exerted by contextual and lexical co-occurrence information are usually assumed to be weak, so that they, in the presence of several alternatives, do not have a preselective influence on the exact incorporation of incoming material into the syntactic representation already constructed thus far, but merely help with the selection of the best suitable alternative. Moreover, they rarely predict a target word entirely, which means that they seldom predict the identity of the word itself (cf. MacDonald et al. 1994a: 686 and our discussion of constraintbased models of language comprehension in section 3.3.2).
Consequently, the language processor will incorporate incoming input on the basis of local syntactic and non-syntactic constraints, namely (lexically induced) word-category information and (probabilistic) information about the links possible between the lexical items, such semantic constraints as the thematic fit between a word/phrase and the argument positions specified by other phrasal heads, for example, and contextual constraints. In addition, non-local syntactic constraints are in effect which are not specified in the lexical entries and thus represent syntactic knowledge per se (for the distinction between local and non-local constraints in the way just applied see MacDonald 1994a: 696).

However, when the comprehender processes a collocation, I assume that the parsing procedure can be different. The degree to which it is different is hearer/speaker-specific, idiosyncratic, since it will depend on the frequency with which the comprehender has met the collocation or with which the comprehender uses it as a speaker in his productive mode, that is, on how well it is entrenched or established as a conventional unit (cf. Langacker 1991a: 45). If it is highly established, it probably is stored as a structured unit and will be retrieved in a form comparable to the retrieval of a lexical unit (which does not involve the construction of syntactic structure). If it is not established, the procedure will be the one described for the default case above, where the co-occurring words do not show any specific or habitual mutual attraction.

From our point of view, the two cases just mentioned represent the two extremes on a scale, with a large number of co-occurring words in between which are neither really free combinations nor firmly established collocations. So, between the two extremes, we would have to position words that have a weaker tendency to co-occur with other words, which, however, is stronger than chance would predict.
This is true for the following examples:

(118) search for truth, twelve months, various types of, under no circumstances (source: Kjellmer 1994: 2077, 2081, 2086, 2089).
Also flexible collocations, i.e., words which show co-occurrence preferences with several words in several structures, are of such an intermediate character. They are illustrated in (119):

(119) be true that/for/of
barely/hardly/just possible
as close/far/long/much/quickly/small/soon as possible
(source: Kjellmer 1994: 1448).

The differences which potentially may arise for the comprehension mechanism of the highly established collocations are expected to have an effect on the time needed for their comprehension; in fact, they are expected to result in a faster comprehension process. The reason why this type of collocation may be processed faster than free combinations has already been mentioned: they may be processed like single words, or - due to a highly biasing "co-occurrence factor" (cf. section 3.3.2) - the process of constructing a syntactic representation may be sped up considerably.⁸⁷ At the current state of inquiry, I cannot say whether these assumptions are close to reality, let alone which of the two is the more plausible one. No matter what may finally turn out to be conceivable or true, I will additionally have to provide for individual (i.e., user-specific) processing differences. For there might be language users who have not used and/or encountered a particular collocation frequently enough to have stored it as a cluster of related words or even as a whole (though analysable) unit. Those language users are likely to apply the same strategy as assumed for free combinations of words, where the lexical entries do not contain co-occurrence information other than that regarding co-occurring word categories and structures. The processing of less firmly established collocations may also benefit from the co-occurrence information included in the "participating" word entries, though to a lesser degree than that of the highly established ones.

⁸⁷ An assumption like this is related to what Taft & Forster found for the processing of compounds: Though compounds are accessed via their first constituent, all the constitutive words need to be accessed independently. They can, however, be put together very swiftly, since the information about how to do it syntactically and semantically is incorporated in the respective lexical entries (Forster, personal communication).

For more flexible collocations, i.e., those in which the node-word co-occurs with several other words (cf. above, examples [118] and [119]), I assume two comprehension mechanisms. They may be processed either in the way suggested for highly established collocations or in the way suggested for less highly established ones, with the one actually applied being influenced by further contextual effects. If the context (linguistic and situational) is helpful in the "disambiguation" of the collocation, the comprehension mechanism can be assumed to be more word-like than in situations in which this is not the case.

The experiment was designed to get at the mechanisms for the processing of collocations and freely combined phrases by measuring the size of the "repetition effect" in a lexical decision experiment.⁸⁸ A repetition effect occurs when units are to be processed which occur more than once in a lexical decision experiment. There is general evidence that the repeated processing of a stimulus, a particular item, can facilitate the performance on a cognitive task (cf. Bainbridge et al. 1993: 619). The kind of facilitation relates to time and/or accuracy of performance:

Repetition benefit is an increase in the speed or accuracy of task performance caused by previous experience with the same task, stimulus events, or both. When the previous experience exactly matches a current experience, the amount of benefit is high. When the two experiences differ along many dimensions, benefit is low. This differential raises the question of just how similar two experiences have to be, and along what dimensions, for substantial repetition facilitation to occur. (Carlson et al. 1991: 924)
The latter question will also concern me since I also try to vary the similarity of the stimuli to be presented in the experiment. In psycholinguistics, repetition effects have commonly been studied for words, and they are generally defined as those effects on reaction times in lexical decision tasks that result from the repeated encounter of a word in the course of a lexical decision experiment. It has been found in such experiments that a word is recognized as a word much faster and more accurately when it occurs more than once, even if the individual presentations are not successive and other material has intervened (cf. Forster & Davis 1984: 680).

⁸⁸ A lexical decision experiment is an experiment in which subjects are presented with letter sequences and have to decide whether these are words or not.

The design of my experiment enables me to record the reaction times for lexical decisions on words making up a sentence which contains either a collocation or a comparable phrase which is productively constructed following the rules of English syntax. Furthermore, I have provided for the repetition of the material making up the collocations and the comparable free phrases. This enables me to also record the reaction times for the two types of sentences when the collocations and productive phrases have already been encountered in the previous phase. They occur in the first phase either in exactly the same way as in the second phase, or their constitutive words are presented in a distributed way, i.e., in different phrases distributed over a limited number of sentences (for details see below). This design allows for recording reaction times for sentences containing non-repeated collocations/free phrases, exactly repeated ones and distributively repeated ones, so that repetition effects of various constellations, if found, can be compared with control sentences, in which no material has been presented before.

I must, however, qualify the design by saying that I did not control the sentences for the repetition of synsemantic words. This seems admissible on the assumption that they are fairly evenly represented in any type of sentence, so that their potential effects will occur alike in all the sentences and cannot be held responsible for the differences that may be found in the reaction times for the sentences to be compared.

The expectations I have with respect to the experimental results are related to the repetition effect as a variable dependent on phrase type and type of repetition. That is why I will discuss them in groups. The first group of expectations has to do with the general repetition effect for collocations and productive phrases.
Bearing in mind the assumptions I have made about the procedures employed for the comprehension of these two phrase types, I can think of two possible predictions regarding the repetition effects. First, if collocations are processed in a word-like way (= hypothesis 1), the repetition effect for exactly repeated collocations should be smaller than that for exactly repeated productive phrases. For, the words making up the free phrase will each gain by repetition, whereas in the
collocation, the gain will be there only once, namely for the collocation as a whole. This assumption is based on the idea that, after a sentence consisting of freely constructed phrases has been processed, it is not the phrases as such that are stored, but it probably is the meaning to the expression of which they contributed. For (highly established) collocations, on the other hand, we assume that both meaning and form are stored, so that there is no construction process necessary for either finding its meaning from the presentation of its constitutive words, or vice versa. Secondly, if I assume that the constitutive words of a collocation would have to be accessed independently, but can, as a result of the biasing co-occurrence information, be interpreted faster (= hypothesis 2), the repetition effect should be bigger than that of free phrases. The gain would be explicable by the assumptions that each repeated word may produce a repetition effect and that there might be an additional repetition effect of the "higher-order unit", the collocation as a whole. As regards productive phrases, the retrieval of the constitutive words will also benefit from their repetition, but I expect that there is no "higher-order unit" effect, and that additional time should be needed for the structural incorporation of the respective words, so that reaction times should be longer. The second group of expectations is specified with regard to collocations and type of repetition. I expect a bigger effect for those collocations that occur as an exact repetition versus those that follow the distributed representation of their constitutive words. This is due to the assumption (of hypothesis 2) that the processing of collocations in their exactly repeated second presentation should derive an extra benefit from the repetition as a higher-order unit. 
Consequently, the previous encounter of the words making up a collocation as constituents of various freely constructed phrases should have a smaller facilitatory effect on the subsequent processing of a sentence containing these words in the usual collocation (e.g., the collocation drop somebody a line is presented in the second phase, after its constitutive words drop and line have been encountered in two different phrases in the first phase, say in the phrases drop a toy and a strange line). The same predictions follow from hypothesis 1, though for different reasons: If collocations are word-like, they should have an entry of their
own in the mental lexicon, and this means that the encounter of a collocation after the encounter of its constitutive words cannot be considered to involve repetition at all: Each of the constitutive words and the collocation itself must be understood as individual entries.

With regard to hypothesis 2, the size of the facilitatory effect in the "distributed" repetition will probably vary with the degree to which the collocations tested are established in the language users running in the experiment. If a collocation is not established in the respective subject, it is expected to be processed like a productive phrase and the repetition effects should correspond to what is found for those. If, however, the collocation under analysis is highly established/entrenched in a subject, its constitutive words, at least its node or base word(s) - i.e., the word(s) around which the collocation is arranged, e.g., turn a deaf ear to advice, drenched to the skin - is/are assumed to contain the respective co-occurrence information which should be (momentarily) available also when the word(s) are used in a phrase different from the collocation. (This assumption is a consequence of the adoption of a constraint-based approach to language processing, see section 3.3.2.) Thus, in these cases, the previous encounter of the node word(s) can be assumed to facilitate the processing of the collocation in addition to the "mere" repetition effect that arises from the repeated encounter of the individual word(s). In the extreme case, the node word(s) of a highly established collocation alone should induce the access of the collocation, so that the processing of the collocation, after its node word has already been encountered and processed in previous sentences, should benefit to (almost) as high a degree as it benefits from the previous encounter of the collocation as a whole.
Thirdly, the expectation with regard to the "behavior" in the comprehension process of productive phrases whose constitutive words have been encountered in a distributed manner is that the reaction time should not considerably differ from that in the condition of exact repetition. For, in both cases, the words to be retrieved may benefit from having been encountered before, but will, nevertheless, have to be incorporated into the syntactic representation constructed from the previous input. This does not seem to be different for an exactly repeated phrase or for a phrase whose constituents have been
encountered distributively, since I do not assume that language users store the structural representation of a productive phrase constructed around a particular base word inclusive of its constitutive words. Rather, the language user will have stored structural representations of a very general kind, perhaps in the form of phrase-structure rules, which can be retrieved as the consequence of the retrieval of a lexical entry which typically occurs in this structural frame. Basically, this amounts to the assumption that the words retrieved for the productive phrases have to be syntactically processed in both conditions, a procedure which I doubt or even dispute for the processing of (highly established) collocations.

On the other hand, there is evidence from investigations into memory for an "associative repetition effect", which should provide for an extra benefit - in addition to that of "mere" item repetition - when the repeated items have been presented in a combination that may create an association between them. This "associative repetition effect" has been observed when responses to word pairs are faster or more accurate when these pairs are presented in the same combination as at study (e.g., study: pause - weird and slope - plate; test: pause - weird and slope - plate; the intact condition) than when they are rearranged to form new pairs (e.g., study: pause - plate and slope - weird; test: pause - weird and slope - plate; the recombined condition). (Goshen-Gottstein & Moscovitch 1995: 1249).
From this, I hypothesize that the associative repetition effect can potentially be found in the processing of exactly repeated productive phrases as well. Its existence is plausibly explained by the "perceptual contiguity hypothesis", according to which "perceptual contiguity is necessary and sufficient for binding items together and retrieving them in data-driven implicit tests of memory." (Goshen-Gottstein & Moscovitch 1995: 1251). Goshen-Gottstein & Moscovitch (1995: 1258-1259) were able to show that the association-specific repetition effect depends on the preservation of the perceptual gestalt of the units tested in the repeated presentation, which means for the effect to appear, the units must be perceived in the same spatial-temporal arrangement. This is true for exactly repeated productive phrases, so that one may expect the associative repetition effect to show up here and it plausibly
explains an extra benefit in the processing of an exactly repeated phrase versus a distributively repeated phrase. Since the preservation of the perceptual gestalt is even more true for exactly repeated collocations (which are assumed to be stored if they are established highly enough), it will also show up there, so that potential differences between exactly repeated collocations and productive phrases must be attributed to the differences in processing already discussed (or to something else I have not found out yet).

To sum up, if the experiment can reveal differences between collocations and free combinations of words regarding repetition effects and/or temporal processing in general, this will support the assumption of a potential difference in the linguistic status of the two, though it will not immediately reveal what the exact nature of the difference is. As for this, I tentatively assume that a highly entrenched collocation may be assigned to the entities which the language user stores and accesses in language processing, whereas a less highly entrenched collocation and certainly a free phrase may or will have to be assigned to those units which are constructed in the course of language processing. Less highly entrenched collocations will, however, be different from free phrases in that the constitutive words of the former will contain biasing co-occurrence information which can be understood to speed up the processing to a variable extent.

For ideas of how to get at the problems and questions just sketched out and for invaluable advice and help with the design of the experiment and the discussion of its results I am very grateful and very much obliged to Kenneth I. Forster, professor in the Psychology Department of the University of Arizona.
6.2. The experiment

Repetition effects with collocations and productive phrases.

The experiment aims at finding out differences in the processing of collocations and comparable productive phrases. It has been designed to measure the size of the repetition effect in a lexical decision task. For this purpose, the words making up collocations and comparable productive phrases were incorporated into complete sentences and the
lexical decisions had to be made not on individual words or word pairs, but on all the words occurring in the sentence at once. The subjects were instructed to respond YES if all items were words, and to respond NO if they came across a non-word. For each subject's trials, the reaction times for pressing the yes- or no-button were recorded. The yes-responses can only be made after all the words making up a sentence have been checked. The no-responses, however, can be made as soon as a misspelled word or a non-word has been encountered. Since these are evenly distributed within the sentences from the first to the last word, the reaction times will not tell anything about the processing of the sentences, which may not even be started when a non-word occurs in sentence-initial position. Consequently, we discarded all the reaction times for the (correct) no-responses. The reaction times for the (correct) yes-responses were recorded for all the subjects and sentences tested. I assume that these reaction times are related to the processing of the sentences, since the check for words (vs. non-words) will probably not be made for each word as an isolated item, but by scanning the sentences and thus also getting their meanings. Our hypothesis is that the repetition effects for sentences containing collocations should be different from those for sentences containing the comparable productive phrases in the ways specified above.
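The decision logic just described can be illustrated with a small sketch. It is not the procedure of the software actually used in the experiment; the `Trial` record, the toy lexicon and the reaction times are hypothetical and serve only to show why a NO-response can be given before a sentence has been read to the end, and why only the reaction times of correct YES-responses are retained.

```python
from dataclasses import dataclass

# Toy lexicon (hypothetical): just enough entries for the two sample sentences.
LEXICON = {"let", "me", "know", "as", "soon", "possible",
           "can", "you", "explain", "function", "of", "a", "microchip"}


@dataclass
class Trial:
    sentence: str
    response: str   # "YES" or "NO", as given by the subject
    rt_ms: float    # reaction time in milliseconds (invented values below)


def expected_response(sentence, lexicon=LEXICON):
    """Return the correct answer and the number of items that had to be checked."""
    words = [w.strip(".,?!").lower() for w in sentence.split()]
    for position, word in enumerate(words, start=1):
        if word not in lexicon:
            # A NO can be given as soon as the first non-word is met.
            return "NO", position
    # A YES requires that every item has been checked.
    return "YES", len(words)


def usable_reaction_times(trials, lexicon=LEXICON):
    """Keep only the reaction times of correct YES-responses."""
    return [t.rt_ms for t in trials
            if t.response == "YES"
            and expected_response(t.sentence, lexicon)[0] == "YES"]


trials = [
    Trial("Let me know as soon as possible.", "YES", 1600.0),
    Trial("Can you explain het function of a microchip?", "NO", 900.0),
    Trial("Let me know as soon as possible.", "NO", 1500.0),  # an error trial
]
print(usable_reaction_times(trials))
```

In this toy run only the first trial survives: the second is a correct NO-response (the non-word is the fourth item, so the rest of the sentence need not be read), and the third is an error.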
6.2.1. Method and procedure

Subjects

Sixteen graduate and post-graduate students and two faculty members of the Psychology Department at the University of Arizona participated in the experiment. Their participation was a voluntary constituent of a graduate psycholinguistics course.

Equipment

The experiment was run using the DMASTR software developed at Monash University and at the University of Arizona by K.I. Forster and J.C. Forster.
The computer program measured and recorded the reaction times the subjects needed in order to make a lexical decision on words making up sentences which contained either a collocation or a comparable freely combined/constructed phrase (in short: a "productive phrase").

Material

Two groups of target sentences were generated. The first group of 15 sentences contained 15 collocations, which were randomly chosen from the BBI Combinatory Dictionary of English (1986). The second group of 15 sentences was constructed freely by randomly picking words from the English lexicon and combining them according to the syntactic rules of English into meaningful sentences. For each sentence containing a collocation, a sentence containing a productive phrase was matched in length and in the major constitutive word categories. This total of 30 sentences made up phase two of the experiment, which was identical for all the subjects.

For phase one of the experiment, three lists of items were developed to counterbalance the materials to be tested on different subjects in the various conditions specified below. Each list consisted of 20 sentences: 10 were related to the collocations presented in phase two, and 10 to the productive phrases used there. Five sentences of group one contained, in exactly the same form, five of the collocations that also occurred in sentences of phase two of the experiment, with the rest of the sentences being changed. Another five sentences contained all the words occurring in five collocations (different from the first five), this time in a distributed way, i.e., the words occurred in an arrangement deviating from that in the corresponding collocations in phase two. Five sentences of the second group were generated around the five exactly repeated productive phrases that matched the collocations in phase two. The final five sentences contained material from another five of the productive phrases arranged in the distributed way.
Thus, the items making up phase two of the experiment contain collocations and productive phrases which have already been encountered in phase one of the experiment, either in a distributed or in exactly the same way. Additionally, they comprise sentences with collocations and productive phrases which have not been encountered before, i.e., which are non-repeated.
To provide for the character of a lexical decision task (and to distract the subjects from the true purpose of the experiment), a total of 50 sentences containing misspelled words or non-words were added, 30 to phase two, 20 to phase one. Example sentences from each group are given in table 2.

Table 2. Sample sentences from the experiment

Phase  Group                                      Example
1      collocation, exact repetition              Let me know as soon as possible. (a)
1      collocation, distributed repetition        The owl sealed her nest in the darkness. (b)
1      productive phrase, exact repetition        Make them spell as exactly as indicated.
1      productive phrase, distributed repetition  Her routine provided a plus for the group. (c)
1      misspelled/non-word                        Reading is one of the basic requirements at shool.
2      collocation                                Tom promised to write the letter as soon as possible.
2      productive phrase                          Pat wanted to apply the drug as exactly as indicated.
2      misspelled/non-word                        Can you explain het function of a microchip?

(a) The underlining did not appear in the experiment.
(b) The words originate from seal somebody's fate, stir up a hornet's nest, and plunged into darkness, occurring as collocations in phase 2; the remaining words of each collocation occur in sentences of the same group.
(c) The words originate from check without routine, provide a new answer to the question, occurring as productive phrases in phase 2; the remaining words of each productive phrase occur in sentences of the same group.
Three presentation lists of 100 items were constructed by combining the 60 sentences of phase two with a total of (3 ×) 40 sentences of phase one. The latter comprised one set of 20 sentences as specified for the variants of phase one (cf. above), and 20 sentences containing a misspelled word or a non-word each. Phase two, consisting of 15 sentences with a collocation, 15 sentences with comparable productive phrases, and the 30 sentences containing misspellings or non-words, was identical in all three lists, hence also for all the subjects. The lists differed, as already pointed out, in the 20 sentences that contained productive phrases or collocations exactly or distributively repeated in phase two. Finally, the items were scrambled (separately for the two phases), which was meant to prevent the subjects from recognizing the points crucial to us and from potentially developing a particular decision strategy or routine.

The design of the experiment permitted us to test the total of 18 subjects on 15 collocations as compared to 15 productive phrases. This total of 30 phrases was tested for six conditions:

Conditions 1 and 4: The phrases (collocations [1] - productive phrases [4]) occur only once, i.e., they are not repeated.

Conditions 2 and 5: The phrases (collocations [2] - productive phrases [5]) occur in two sentences in exactly the same form, once in phase one, once in phase two.

Conditions 3 and 6: The words constituting the phrases (collocations [3] - productive phrases [6]) occur in different sentences in phase one, thus providing for the "distributed" repetition, and once in their original form in phase two.

The collocations and the productive phrases tested occurred under different conditions in each of the three item lists, so that all the phrases I am interested in were tested in all conditions.
The following example is given for illustration: the collocation as soon as possible was presented in condition 1 in the first item list, in condition 2 in the second, and in condition 3 in the third.
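The counterbalancing scheme can be sketched as a simple rotation of conditions across the three item lists. The sketch below is only an illustration under stated assumptions: the source does not say which phrases were grouped together or in what order the conditions were rotated, so the division into three item groups of five and the rotation order are hypothetical. The function name and condition labels are likewise invented.

```python
from collections import Counter

# Condition labels for one phrase type (collocations: conditions 1-3;
# productive phrases would be handled identically as conditions 4-6).
CONDITIONS = ("non-repeated", "exact", "distributed")


def assign_conditions(n_items=15):
    """Rotate the conditions over item groups so that each item meets
    every condition exactly once across the three lists."""
    n_lists = len(CONDITIONS)
    group_size = n_items // n_lists      # 5 items per item group
    design = {}
    for list_idx in range(n_lists):
        for item in range(n_items):
            item_group = item // group_size
            design[(list_idx, item)] = CONDITIONS[(item_group + list_idx) % n_lists]
    return design


design = assign_conditions()

# How many items fall into each condition within each list.
per_list = Counter((lst, cond) for (lst, _), cond in design.items())

# Item 0 plays the role of "as soon as possible" in the example above.
print([design[(lst, 0)] for lst in range(3)])
```

Under this rotation every list contains five phrases per condition, and every phrase is tested in all three conditions, each on a different group of subjects.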
Additionally, in each item list, the items within the two phases were presented to the individual subjects in a different sequence, which was under control of the computer program.

Procedure

The three lists of sentences were presented to six subjects each, who, after an appropriate instruction and the presentation of some practice sentences, had to classify each sentence according to whether it contained only (correctly spelled) words ("Yes") or not ("No"). (In the latter case, the "No" is based on the fact that the sentence contains a misspelled word or a non-word.) Each trial consisted of the presentation of a single sentence. The sentences were presented to the subjects on a computer screen as in normal text, with each sentence fitting on a single line. The subjects were asked to decide as fast as they could without making mistakes. For answering affirmatively, they had to press a "Yes"-button, for answering negatively a "No"-button. The sentence to be decided on was visible on the screen until the subject had made his decision. After the subjects had decided on the item shown on the screen, they were given feedback as to whether their answer was correct. They then requested the next item by pressing a foot pedal, so that the speed of the presentation was self-paced.

After the subjects had run in the experiment, they were informed about what we had tried to test and about the hypotheses we sought evidence for. Each session inclusive of the debriefing lasted approximately 10-15 minutes.
6.2.2. Results

In this experiment, errors were discarded from the analysis, and reaction times more than two standard-deviation units above or below the mean for the respective subject in all conditions were trimmed to the appropriate cut-off value. Subjects who performed with an error rate higher than 20% were replaced. Mean reaction times and error rates in each condition are shown in table 3.
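The trimming step can be sketched as follows. This is a minimal illustration, not the routine actually used: the function name and the sample reaction times are invented, and the use of the sample standard deviation (`statistics.stdev`) is an assumption, since the source does not say which estimator was applied.

```python
from statistics import mean, stdev


def trim_to_cutoff(rts, n_sd=2.0):
    """Replace values beyond mean +/- n_sd standard deviations with the cut-off.

    `rts` holds one subject's reaction times (ms) for correct responses,
    pooled over all conditions. Assumption: sample standard deviation.
    """
    m = mean(rts)
    sd = stdev(rts)
    low, high = m - n_sd * sd, m + n_sd * sd
    return [min(max(rt, low), high) for rt in rts]


# Invented data for one subject: nine ordinary trials and one very slow one.
rts = [1500, 1600, 1550, 1650, 1580, 1620, 1540, 1610, 1590, 4000]
trimmed = trim_to_cutoff(rts)
print(trimmed[-1] < rts[-1])  # the slow trial is pulled down to the upper cut-off
```

Values are replaced by the cut-off rather than dropped, so the number of observations per subject and condition stays the same.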
The analysis of the results was carried out in the form of several 3 × 2 × 2 factorial analyses of variance, one for the effect of exact repetition, one for the effect of distributed repetition, for both the subject means and the item means. The factors were groups (subject groups in the analysis of subject means, item groups in the analysis of item means), item or phrase type (collocation vs. productive phrase) and type of repetition (non-repeated vs. exactly repeated and non-repeated vs. distributively repeated).

Table 3. Mean reaction times (RT) in ms and percentage error rates (in parentheses) for collocations and productive phrases that were either non-repeated, exactly repeated, or distributively repeated in phase two of the experiment.

Repetition condition    Non-repeated   Exact        Distributed
Collocation             1753 (8.9)     1597 (4.4)   1619 (3.3)
  Repetition effect                    156          134
Productive phrases      1775 (4.4)     1687 (3.3)   1776 (3.3)
  Repetition effect                    88           -1
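The repetition effects listed in table 3 are simply the differences between the mean reaction time in the non-repeated condition and the mean reaction time in the respective repeated condition. The short sketch below recomputes them from the tabulated means; the dictionary layout and the function name are illustrative choices, not part of the original analysis.

```python
# Mean subject reaction times in ms, as given in table 3.
MEAN_RT = {
    "collocation": {"non-repeated": 1753, "exact": 1597, "distributed": 1619},
    "productive phrase": {"non-repeated": 1775, "exact": 1687, "distributed": 1776},
}


def repetition_effects(mean_rt):
    """Repetition effect = non-repeated mean minus repeated mean (positive = faster)."""
    effects = {}
    for phrase_type, rts in mean_rt.items():
        baseline = rts["non-repeated"]
        effects[phrase_type] = {
            condition: baseline - rt
            for condition, rt in rts.items()
            if condition != "non-repeated"
        }
    return effects


effects = repetition_effects(MEAN_RT)
for phrase_type, by_condition in effects.items():
    print(phrase_type, by_condition)
```

A positive value means that the repeated material was responded to faster than the non-repeated material; whether a difference is reliable is a matter for the analyses of variance reported in the text.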
The analysis with respect to the exact repetition shows that collocations and productive phrases have significant repetition effects in both the mean subject and item reaction times: F1 (1, 15) = 7.12, p < 0.05; F2 (1, 24) = 10.37, p < .01. However, there was no significant interaction between phrase type and type of repetition: F1 (1, 15) = .74, p > .05; F2 (1, 24) = .96, p > .05. The analysis carried out with respect to the distributed repetition revealed a significant repetition effect only in the mean item reaction times: F2 (1, 24) = 4.59, p < .05, whereas the repetition effect just failed to reach significance in the subjects analysis: F1 (1, 15) = 3.88, p > .05. But, contrary to the exactly repeated condition, the interaction between phrase type and type of repetition turned out to be significant: F1 (1, 15) = 5.21, p < .05; F2 (1, 24) = 4.07, p = .05.
In other words, though both collocations and productive phrases gain from repetition in the exactly repeated condition, they do not differ significantly from each other in the reaction time the subjects need for making a lexical decision on them. In the distributively repeated condition, however, the subjects' reaction times for making the lexical decisions are significantly different for collocations and productive phrases, with a repetition effect only noticeable for collocations.

The results were also analysed in a 3 × 3 factorial design (the factors being subject groups and type of repetition), in order to find out about the significance of the differences found in the reaction times for processing the variously repeated material in collocations and productive phrases respectively. The way in which the material was repeated produced a significant difference in the reaction time for collocations (non-repeated: 1753 ms / exactly repeated: 1597 / distributively repeated: 1619) in both subject and item means: F1 (2, 30) = 6.11; F2 (2, 24) = 6.40, p < 0.01. It did not do so for productive phrases (non-repeated: 1775 ms / exactly repeated: 1687 / distributively repeated: 1776): F1 (2, 30) = 1.48; F2 (2, 24) = 2.57, p > .05.

In order to elicit information about the significance of the individual repetition effects separately for the two phrase types, we also analysed the data in a 3 × 2 factorial design (the factors being subject groups and two types of repetition). The reaction times for both exactly and distributively repeated collocations differed significantly from the reaction times for non-repeated ones: exact repetition F1 (1, 15) = 9.76, p < 0.01; F2 (1, 12) = 7.58, p < 0.05; distributed repetition F1 (1, 15) = 8.75, p < 0.01; F2 (1, 12) = 6.86, p < 0.05.
The differences in the mean reaction times for productive phrases failed to reach significance: exact repetition F1 (1, 15) = 1.80, p > .05; F2 (1, 12) = 3.00, p > .05; distributed repetition F1 (1, 15) = .00, p > .05; F2 (1, 12) = .01, p > .05. Finally, the reaction times were also analysed in a 3 × 2 factorial design (the factors being subject groups and phrase type), in order to check the significance of the individual repetition effects for collocations vs. those for productive phrases. The mean subject reaction time for non-repeated collocations (1753 ms) is slightly shorter (22 ms) than that for non-repeated productive phrases (1775 ms), the reaction
time is, however, slightly longer (17 ms) when item means are compared (1779 ms). Neither difference is significant: F1 (1, 15) = 0.14, p > .05; F2 (1, 24) = .03, p > .05. The mean subject reaction time for exactly repeated collocations (1597 ms) is shorter (90 ms) than that for exactly repeated productive phrases (1687 ms), the difference being significant: F1 (1, 15) = 5.23, p < 0.05. The difference between the item means (71 ms) does not reach significance: F2 (1, 24) = 1.13, p > .05. The mean subject reaction time is also shorter (157 ms) for distributively repeated collocations (1619 ms) than that for productive phrases (1776 ms), the difference again being significant: F1 (1, 15) = 13.37, p < 0.01. The difference between the item means (142 ms) again turned out to be insignificant: F2 (1, 24) = 3.65, p > .05. Figures 15 and 16 present the overall results in graphs:
[Figure 15: line graph with two series (collocations, productive phrases); x-axis: non-repeated, exact repetition, distributed repetition; y-axis: reaction time in ms, 1500 to 1800]
Fig. 15. Mean subject reaction times (in ms) for processing sentences in the non-repeated, exactly repeated and distributively repeated conditions. One type of sentence contains collocations; the second one, being matched with the first for the (major) word-categories and length, consists of freely constructed phrases only.
[Figure 16: graph with two series (collocations, productive phrases); x-axis: non-repeated, exact repetition, distributed repetition; y-axis: reaction time in ms, 1550 to 1900]
Fig. 16. Mean item reaction times (in ms) for processing sentences with collocations vs. those with productive phrases in the non-repeated, exactly repeated and distributively repeated conditions
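The repetition gains and phrase-type differences quoted in this section can be re-derived from the mean subject reaction times given in the text. The following is only a minimal sketch of that arithmetic: the dictionary layout and the function names are illustrative assumptions, and because the inputs are the rounded means as printed, a result may deviate by a millisecond or so from a figure quoted elsewhere in the chapter.

```python
# Mean subject reaction times (ms) as reported in the text.
# The data structure and names below are illustrative assumptions.
SUBJECT_MEANS = {
    "collocations": {"non-repeated": 1753, "exact": 1597, "distributed": 1619},
    "productive phrases": {"non-repeated": 1775, "exact": 1687, "distributed": 1776},
}


def repetition_gains(means):
    """Return non-repeated mean minus each repeated-condition mean (ms)."""
    baseline = means["non-repeated"]
    return {
        condition: baseline - rt
        for condition, rt in means.items()
        if condition != "non-repeated"
    }


def collocation_advantage(all_means):
    """Return productive-phrase mean minus collocation mean per condition (ms)."""
    coll = all_means["collocations"]
    prod = all_means["productive phrases"]
    return {condition: prod[condition] - coll[condition] for condition in coll}


gains = {ptype: repetition_gains(m) for ptype, m in SUBJECT_MEANS.items()}
advantage = collocation_advantage(SUBJECT_MEANS)

for ptype, by_condition in gains.items():
    print(ptype, by_condition)
print("collocation advantage", advantage)
```

A positive gain means faster responses after repetition; a negative value, as for distributively repeated productive phrases, means no benefit at all. The sketch reproduces descriptive differences only and says nothing about their significance, which depends on the F tests reported above.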
6.2.3. Discussion

For the exactly repeated condition, the results of the experiment reveal that the fact that material has been exactly repeated had a significant effect on the reaction time recorded for all the experimental items. However, there is no significant interaction between this type of repetition and the phrase type, i.e., collocations and productive phrases could not be measured to be processed in a significantly different way. Hence, at this level of generality, neither hypothesis 1 nor hypothesis 2 (cf. above) has been found to be supported by the data. For the distributively repeated condition, the results look different: The fact that material has been distributively repeated produced a significant effect on the reaction time only in the item analysis; it just fails to reach significance in the subject analysis. In addition to that and contrary to the exactly repeated condition, phrase type and type of repetition (non-repeated vs. distributively repeated) interact
significantly: there is a repetition effect for collocations, but none for productive phrases. The separate analysis of the results recorded for collocations reveals that a significant repetition effect was obtained in both the exactly repeated and the distributively repeated condition. The results recorded for productive phrases, however, do not show a significant repetition effect in any of the conditions. The ANOVAs (analyses of variance) separately conducted for the individual repetition effects on the reaction times for collocations as compared to those for productive phrases reveal that the reaction times for exactly and distributively repeated collocations were shorter (90/157 ms) than for productive phrases, resulting in a significant effect of the phrase type in the subject analysis, though not in the item analysis. That means the overall interaction of phrase type and type of repetition is only weak, though stronger in the distributively repeated condition than in the exactly repeated condition (as became evident in the aforementioned 3 × 2 × 2 analysis). Bearing in mind my assumptions regarding the repetition effects for collocations and productive phrases, one must ask why phrase type and repetition effect do not interact significantly in the item analysis. One reason can very likely be seen in the low number of items I was able to test in the experiment. It seems conceivable that testing a larger number of items will produce a significant effect of the phrase type on the reaction times in the individual repetition conditions. Secondly, one must ask why the experiment did not produce results that generally show a significant effect of the phrase type on the reaction times for exactly repeated material. One possible assumption is that the collocations used and tested in the experiment are not highly established in the subjects.
For collocations that are not highly established, I had predicted a processing mechanism comparable to that for free phrases, so that also the repetition effects should be similar. Since I consider the entrenchment of a collocation a subject-specific factor, there is almost no way to control for it in the experiment. The only hint as to the potential entrenchment of a collocation in speakers of a particular language is the frequency with which this collocation can be found in a representatively large corpus of this language in general. One can assume that most of the native speakers should be familiar
with frequent collocations, but that the same cannot be expected for infrequent collocations. Consequently, frequent and thus more familiar collocations are more likely to show any difference in sentence processing and repetition effects than rarer/less familiar ones. Unfortunately, the material used in the experiment was not controlled for frequency of occurrence and/or familiarity. But, in order to find out whether my assumption is a reasonable one, I asked 11 of the subjects - after they had run in the experiment - to rate their familiarity with the collocations they had encountered in the experiment. The results are shown in table 4.

Table 4. Familiarity ratings given by 11 subjects for the 15 collocations tested in the experiment

Rating scales: active use (often / normal / rare); passive encounter (often / normal / rare)

Collocations rated: as soon as possible; drenched to the skin; in the long run; caught in the act; visible to the naked eye; turn a deaf ear to advice; cry one's eyes out; drink sb. under the table; drop sb. a line; a deafening crescendo; seal someone's fate; stir up a hornet's nest; leave a bitter taste in one's mouth; plunged into darkness; efforts were crowned with success

[The per-collocation rating counts are not recoverable from the source layout.]

The ratings clearly show that the subjects are not very familiar with most of the collocations, both from the point of view of active use and
passive encounter: two collocations can be considered highly familiar, two familiar, and nine not familiar. This implies that these collocations, except for the four rated to be (highly) familiar, can be assumed to be processed in a way more similar to that of productive phrases. This assumption may well explain why the collocations do not produce results significantly different from the productive phrases. Thus, I can conclude that the lack of a significant effect of the phrase type on the reaction times for exactly repeated material in the analyses can in all probability be attributed to the subject-specific familiarity with the items tested. I will now have a closer look at the repetition effects found in the experiment for collocations and productive phrases. First, I will discuss whether and how the results obtained for the processing in the exactly repeated condition meet my expectations. The analysis shows a significant effect of the phrase type on the mean subject reaction time in the exactly repeated condition (collocations gain 90 ms over the productive phrases), but the effect fails to reach significance in the item analysis. That means that, though the individual subjects produce reliably different reaction times for the two types of phrases, I cannot conclude that the effect occurs across all items. Thus, I can only tentatively conclude that the average native language user is faster in processing (repeated) familiar collocations as compared to free constructions, and it needs to be tested in further experiments whether this assumption is valid for other items as well. If significance could also be reached for the item reaction times, the results would speak for an extra benefit for the repeated collocations in general.
As already mentioned earlier, this benefit might be due to the repetition effect of the higher-order unit, i.e., the collocation as a whole, and the (repeatedly) sped up construction of a syntactic representation for this phrase - because of the strongly biasing co-occurrence information available in the node word (cf. hypothesis 2). Hypothesis 1, predicting a smaller repetition effect for exactly repeated collocations (vs. that of productive phrases) would, consequently, be ruled out. However, if the item analysis does not reach significance, hypothesis 1 cannot be dismissed totally. For, it may be that a collocation which is not highly established in the comprehender is processed by accessing
the constitutive words and swiftly producing a syntactic and semantic representation in its first encounter. When it is encountered for the second time, it may, however, be accessed like a word: it has been recognized as a collocation in the first encounter and its meaning is now present/stored in the mental lexicon and can be retrieved without a construction process being involved. It seems conceivable that, once an unfamiliar collocation has been encountered, it gains - as a holistic unit - from the same repetition effect as do low-frequent words. Thus, it would sooner be a further source of the extra benefit in the reaction time for the collocations. The repeated productive phrases, however, seem to gain merely from the repetition effects produced by the participating words, which then still have to be processed syntactically in the usual way. Thus, at the current state, the experimental results obtained in the exactly repeated condition do not help to discard one of the two hypotheses I have proposed with respect to the processing of collocations. They do not allow for any conclusion to be drawn with respect to the actual status of collocations. It remains open whether they are comparable to words or whether the extra benefit gained can be attributed to sped up processing, or whether anything else comes into play, which I have not thought of. Secondly, I need to discuss the results obtained in the distributively repeated condition. Also here, the analysis shows a significant effect of the phrase type on the mean subject reaction time. However, the effect on the mean item reaction time turns out to be insignificant again. I had expected a bigger effect of the distributed repetition for collocations than for productive phrases, especially when the subjects are highly familiar with the former. 
This expectation was grounded on the assumption that, for entrenched collocations, the earlier encounter of the node word(s) might facilitate the processing of the actual collocation presented later (by having potentially activated the whole collocation via the respective co-occurrence information). The results, indeed, indicate faster processing for the collocations after their constitutive words have been encountered in a distributed way: The gains for collocations are (for subject reaction time) 157 ms and (for item reaction time) 142 ms. However, there is one factor that is incompatible with this argument, namely the "familiarity factor" which
I elicited above. The ratings given by 11 of the participating subjects indicate that they were not highly familiar with the majority of the collocations, and I thus would have to conclude that the co-occurrence information assumed to be present in the node word(s) of established collocations is not there or only weakly so in unfamiliar collocations. This implies that the collocations in the distributed condition should not behave much differently from free phrases. But the results are contrary to that. The only reason which I can think of as possibly responsible for a bigger effect in collocations in this condition (as well as in the exactly repeated condition) is related to assumptions with respect to the interaction between a word's repetition effect and its frequency. This interaction has shown in the "frequency attenuation effect", which is short for Scarborough et al.'s findings that low-frequency words benefit more from repetition than do high-frequency words (cf. Scarborough et al. 1977, quoted in Forster & Davis 1984: 681). In what way is this effect likely to play a role in the processing of collocations in the repeated conditions? It is conceivable that, on the average, the autosemantic words making up a collocation are less frequent in language use than those I used in the freely constructed phrases, and this would account for a bigger repetition effect in collocations versus productive phrases. A few examples from the material used in the experiment will show that my assumption is apparently a reasonable one. In table 5, I compare the frequencies of the constitutive words of some exemplary collocations and their comparable productive phrases as they are given in the Kučera & Francis list of word occurrences in the Brown corpus.
As can easily be seen, the trend is as expected: The constitutive words of the collocations are in all probability less frequent than those of the productive phrases, hence they should show a bigger repetition effect, whereas most of the words making up the productive phrases in my experiment are probably more frequent ones, and hence should benefit less from repetition. Still, considering that the results in the exactly repeated condition cannot exclusively be attributed to this effect, I am hesitant to hold it exclusively responsible for the results obtained in the distributively repeated condition either, and I will have to look for further explanations.
Table 5. Compared word frequencies

(1) collocation: stirred up a hornet's nest
      stir 7x, stirred 15x; hornet's 0x; nest 20x
    productive phrase: gave up an attorney's career
      give 391x, gave 285x; attorney's sg.Gen 2x, sg.Nom 65x; career 67x

(2) collocation: drenched to the skin
      drenched 1x; skin 47x
    productive phrase: sent to the gym
      sent 145x; gym 2x

(3) collocation: turned a deaf ear to advice
      turned 320x (past/part); deaf 12x; ear 29x; advice 51x
    productive phrase: wrote a good paper on aphasia
      wrote 181x; good 807x; paper 157x; aphasia 0x

(4) collocation: caught in the act
      caught 98x; act 283x (N/V)
    productive phrase: found in the desert
      found 536x (past/part); desert 21x

(source of the frequencies: Kučera & Francis 1967: alphabetical list)
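The frequency comparison in table 5 can be summarized by averaging the frequencies of the content words in each phrase. The sketch below is only an illustration under stated assumptions: it takes the count of the inflected form that actually occurs in the phrase (e.g. stirred rather than stir, and the genitive count for attorney's), the pairing of values with words follows my reading of the table, and the variable names are hypothetical.

```python
from statistics import mean

# Brown-corpus frequencies as read from table 5 (Kučera & Francis 1967).
# Assumption: for each word, the count of the inflected form occurring
# in the phrase is used; the pairing of counts with words follows the
# table as interpreted here.
PAIRS = [
    ("stirred up a hornet's nest", [15, 0, 20],
     "gave up an attorney's career", [285, 2, 67]),
    ("drenched to the skin", [1, 47],
     "sent to the gym", [145, 2]),
    ("turned a deaf ear to advice", [320, 12, 29, 51],
     "wrote a good paper on aphasia", [181, 807, 157, 0]),
    ("caught in the act", [98, 283],
     "found in the desert", [536, 21]),
]

# Mean content-word frequency per phrase, kept pairwise for comparison.
results = [
    (collocation, mean(coll_freqs), phrase, mean(prod_freqs))
    for collocation, coll_freqs, phrase, prod_freqs in PAIRS
]

for collocation, coll_mean, phrase, prod_mean in results:
    print(f"{collocation}: {coll_mean:.1f}  |  {phrase}: {prod_mean:.1f}")
```

For these four pairs the mean word frequency is lower for the collocation than for its matched productive phrase. Individual words can still run against the trend (gym and aphasia are rare), so the averages support the argument only as a tendency.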
Finally, I have to discuss the individual repetition effects separately for both collocations and productive phrases. For collocations, the overall differences in the mean reactions in the non-repeated, exactly repeated and distributively repeated conditions are significant, the repetition effect being 155 ms and 134 ms respectively. However, the manner in which the material was repeated did not yield a significant difference in the mean subject and item reaction times. This can be explained for entrenched collocations, where at least the node words are assumed to carry strong contextual information which, also when these words are used in a phrase other than the collocation, biases the subject to expect the collocation. If that is what happens, the processing of the sentences containing core words in a non-collocational phrase should be slightly inhibited (as against "normal" free phrases), and the processing of a subsequent sentence containing the collocation should be facilitated in a way comparable to exactly repeated collocations (cf. above). This might explain what I have found. But, once again, this explanation clashes with the familiarity factor elicited for the collocations under analysis, which
marked most of them as unfamiliar to the subjects. I expected this to make their processing comparable to that of productive phrases. For the latter, however, the results are quite different. The differences in the (subject and item) reaction times between the non-repeated, exactly and distributively repeated conditions did not reach significance. As compared to the non-repeated condition, exact repetition gains 88 ms, distributed repetition loses 1.5 ms (in the subject reaction times). Though these differences are not significant, I take them to be big enough to indicate that the type of repetition matters for the processing of productive phrases, with the exact repetition producing a noticeable effect and the distributed one practically none (cf. also the discussion of the 3 × 2 × 2 factorial analysis above). These results may be explained by drawing on the associative repetition effect (cf. above), which should provide for an extra benefit in the processing of an exactly repeated phrase vs. a distributively repeated one. Though this might be a plausible explanation for the productive phrases, I will have to look for an explanation for the absence of this effect in the collocations. In addition to that, I have to face another problematic fact: the total lack of a repetition effect for distributively repeated free phrases. Comparable results have been obtained in a reading experiment reported by Carlson et al. (1991) and carried out by Levy & Burns (1990, quoted in Carlson et al. [1991]). Their experiments were meant to provide information about the representational level at which repetition benefits are located. In order to get at that, they measured the transfer of repetition benefits across several degrees of context change, using as stimuli intact multiparagraph passages, paragraph-reordered, sentence-reordered and word-reordered ones.
In the condition which shows some similarity with the design of my experiment (and the results of which are hence comparable to mine) the following conclusions were drawn: When the target text for the second reading was an intact passage, considerably less benefit occurred if the first reading had been sentence reordered than if it were exactly the same intact passage, and no significant benefit occurred if the first reading had been word reordered. These results support the conclusion that all repetition effect to the reading of coherent targets was mediated through sentence and paragraph representations, with no measurable amount
arising from lexical repetition by itself in the absence of sentence or paragraph repetition. (Carlson et al. 1991:925)
Levy & Burns' results differed from those found in other experiments, which showed that lexical repetition did produce significant benefit to the second reading, but that the benefit in the condition of changed context was smaller than in the condition of exactly repeated sentences. Carlson et al. discuss these seemingly contradictory findings and conclude that they are due to the different tasks set in the respective experiments.
For comprehension and memory instructions, the representation that endures and hence is capable of facilitating subsequent processing if another stimulus is encountered that can be encoded by the representation is at text level ... In contrast ... when attention is directed to processing at a word-by-word level, repetition of the word is sufficient to produce facilitation of coherent targets, regardless of whether constituent structure and hence text-level context are also repeated. (Carlson et al. 1991: 929-930)
As for my own experiment, the instruction was to look for misspelled words and non-words, which is basically a word-by-word processing task. But as the words were used in meaningful sentences, comprehension was not excluded. On the contrary, I thought (and still think) that in sentences without misspellings/non-words, the subjects read and certainly comprehended the whole sentence, so that the reaction times measured for these sentences reflect the time needed for comprehension. I also assume that the enduring representation for these sentences is most likely of a higher order, viz., of their contents. The individual words expressing these contents, or rather the proposition will probably not produce a facilitatory effect in sentences where they occur in different phrases, since in the latter they will contribute to a different chunk of content. But they might facilitate the processing of sentences where they occur in exactly the same phrase, since this particular content chunk will be identical. In other words, I can assume that, in my experiment, the subjects may have established a content representation of the sentences encountered, which can facilitate the processing of repeated, identical parts of the content structures above the word level, but will not become effective when, later on, the individual participating words are encountered in a deviating environment, encoding a different content chunk.
An assumption like this can hence also explain the lack of difference in the reaction times recorded for the processing of non-repeated and distributively repeated free phrases. Another proposal is made by Forster & Davis (1984), who argue that long-term repetition effects (i.e., those lasting over quite an amount of intervening material) are purely episodic, thus implying that there are no long-term lexical effects of repetition (cf. Forster & Davis 1984: 694). They further conclude from the assumption that episodic retrieval is context-sensitive that then any change in the way in which the prime and the target are presented may reduce the accessibility of the trace of the prime, and hence may reduce the repetition effect. (Forster & Davis 1984: 695)
Basically, this proposal provides another explanation of the associative repetition effect, besides the one given by Goshen-Gottstein & Moscovitch's "perceptual contiguity hypothesis" (cf. above). These explanations seemingly work well for my findings regarding the processing of productive phrases. But how can the differences recorded for the processing of collocations be explained? The associative repetition effect and the lack of priming by distributively represented material I have just described should be noticeable there as well. However, distributively repeated collocations yield almost as big an effect as do exactly repeated ones, so that the gain attributed to the associative repetition effect for the exactly repeated collocations must be made up for by another factor in the distributively repeated condition. The only plausible explanation I can give at the current state of the analyses is to assume that this factor must be looked for in the status of a collocation as distinct from that of a freely constructed phrase. The strongly biasing co-occurrence information hypothesized for the lexical entry of an entrenched collocation's node-word(s) is one thing that can be held responsible for producing the benefit found in the reaction times of distributively repeated collocations (cf. my hypothesis 2 as defined above). Another, even stronger hypothesis would be to approximate an entrenched collocation to a holistic, word-like entry in the mental lexicon (cf. my hypothesis 2 ) . For both cases, the additional benefit can thus be attributed to a (very) strong association between the constitutive parts of a collocation, an association which goes (very
much) beyond that between words which combine with other words simply in accordance with what is predicted by their word categories and the rules of syntax. That entrenched/established associations between words have an influence on repetition effects is supported by experimental findings by e.g., Carroll & Kirsner (1982). Among other things, they aimed at eliciting data about the effect of context repetition on lexical decision. The context they analysed was defined as the co-occurrence of two words in pairs (vs. the design of my experiment, where context amounted to whole phrases), and they found - contrary to their expectations - that there is an advantageous effect on lexical decision when two words are repeated in intact pairs, provided that the two words are related. This result is explained by assuming that also links or associations between words can be primed (cf. Carroll & Kirsner 1982: 61). Bainbridge et al. summarized these findings to indicate that repetition of context can facilitate lexical decision over and above mere repetition of the stimulus words, provided that the items in repeated pairs were preexperimentally associated. (Bainbridge et al. 1993 : 620).
This is in line with what I have found for collocations. The second finding, that the pairs not preexperimentally related did not produce an effect different from that for rearranged pairs, can, however, not be confirmed by my experimental results: I did find extra benefit for exactly repeated productive phrases (cf. above). Apart from that, other experiments have produced evidence that entrenched language material is processed differently from material productively constructed. Potter & Faulconer (1979), examining the retrieval of noun phrases, tried to elicit information about whether the meanings of the constitutive words are retrieved independently and then combined or whether the retrieved meanings for the words are context dependent. They found from the experiments they carried out that a listener hearing a noun phrase such as burning house retrieves a unitary meaning for the whole phrase, apparently without first retrieving a context-free meaning of house and then combining it with burning. Since unitary comprehension does not occur when the adjective is separated from the noun, interactive retrieval is probably under the control of syntactic as well as semantic structure. A post hoc analysis suggests that context-dependent interpretation of noun meaning may be limited to phrases that express ideas already represented in memory. (Potter & Faulconer 1979: 518)
It does not seem inconceivable to me that, also at the stage of building a syntactic representation, the holistic character of such a unit represented in one's memory is recognized and exploited. To sum up, my experiment brought to light some interesting differences regarding the repetition effects as they reflect in the reaction times for the processing of collocations vs. that of comparable productive phrases. Table 6 presents an overview:

Table 6. Experimental results (overview)

Collocations and productive phrases
  exact repetition (vs. non-repeated): significantly shorter subject reaction times; significantly shorter item reaction times
  distributed repetition (vs. non-repeated): insignificantly shorter subject reaction times; significantly shorter item reaction times

Interaction of phrase type and repetition
  exact repetition: insignificant
  distributed repetition: significant

Collocation, exact repetition (vs. non-repeated)
  shorter subject reaction times: significant
  shorter item reaction times: significant

Productive phrases, exact repetition (vs. non-repeated)
  shorter subject reaction times: insignificant
  shorter item reaction times: insignificant

Collocation, distributed repetition (vs. non-repeated)
  shorter subject reaction times: significant
  shorter item reaction times: significant

Productive phrases, distributed repetition (vs. non-repeated)
  almost the same subject reaction times: insignificant
  shorter item reaction times: insignificant

Non-repeated vs. exact repetition vs. distributed repetition
  collocation: significant differences in subject and item reaction times
  productive phrase: insignificant differences in subject and item reaction times

These results, especially the ones revealing a significant interaction between phrase type and type of repetition in the distributedly repeated condition, can be assumed to indicate differences in the procedures employed in the processing of collocations on the one hand, and freely constructed phrases on the other. Since some of the further differences elicited do, however, fail to reach significance for the item or for the subject analyses, further experiments need to be carried out in order to find out about the reliability of the effects and the plausibility of the conclusions drawn. Future experiments will, first of all, have to aim at the replication of the findings made so far, with special attention paid to the item lists regarding such factors as frequency of occurrence and familiarity. In addition to that, they need to aim at revealing what the actual differences in the processing of the two phrase types are, since the design and the results of the experiment described here do not allow for any definite conclusion with respect to a particular status of collocations as compared to that of free phrases. The differences elicited may just as well reflect mere processing differences which arise from variously strong (or weak) contextual constraints exerted by the co-occurrence information included in the lexical entries of the respective (constitutive) words. This seems to be the minimal conclusion to be drawn, which, however, does not allow for any final decision between the two hypotheses I made regarding the processing of collocations (cf. above, pp. 260-261). Considering the support the experiment described can give to the plausibility of models of language processing, I can take the results to be indicative of constraint-based mechanisms in sentence comprehension. They provide evidence for the particular effect of the probabilistic co-occurrence information which is assumed to be part of a word's lexical entry in the mental lexicon of the language user. 
Thus, they contribute to the plausibility of the assumption of a lexically dominated parsing procedure in that they show that the processing of particular syntactic units (phrases and beyond) is dependent on information stored with individual lexical items. This claim does not generally dispute the existence and exploitation of separate/pure syntactic knowledge. The latter can be assumed to be effective in the parsing of syntactic units whose constitutive words do not contain co-occurrence information biasing towards one particular structure, i.e.,
units where the construction of structure solely depends on the word-category information included in the encountered words and the general combinatory possibilities licensed by the language's syntax. The experimental results also reflect a very close interaction between lexicon and syntax in language comprehension, an interaction which goes beyond that established by the factor of "word category" (which is simultaneously part of a lexical entry and the item over which syntactic rules operate). That is why I take them to speak for a fuzzy boundary between the two and against the strict autonomy of the one from the other. Consequently, these results can also be considered to support my concept and understanding of a natural/plausible linguistic model. For only linguistic models which allow for the interaction between syntax and the lexicon accordingly can also incorporate and explain the phenomena I have found to be related to the processing of collocations.
Chapter Seven The finale
7.1. Launching the project

This study was meant to investigate the "naturalness" of basic assumptions made in linguistic models about the interrelation between lexicon and syntax. In doing so, I aimed at evaluating these assumptions with regard to their psychological plausibility. I consider the latter to be measurable by their compatibility with what is claimed regarding the two components and their relationship in psycholinguistic models. That is why I first of all compiled the psycholinguistic assumptions made with respect to the role played by lexicon and syntax in language use, i.e., in the processing of verbal information in language production and comprehension. The claims made - mainly on the basis of experimental evidence - can be polarized according to whether syntax is seen as an autonomous component of our language faculty, or whether it is seen to be interlinked with - if not dependent on - the lexicon. The latter view is based on the assumption that a lexical entry contains all sorts of information that can be seen as factors which from early on constrain the syntactic and semantic interpretations of a verbal utterance (or the syntactic encoding of a message in the productive mode). An autonomous syntax must, however, be conceived of as working on syntactic information alone, for instance on the basis of word-category information, when a verbal utterance is understood or produced. My decision on which of these claims I give preference to and which I consider most plausible may - at that point - have looked quite unmotivated, and my discussion may have given the impression of groundlessly "biased" interpretations of the positions held by the respective models. However, both decision and discussion were motivated by my own data analyses, which were presented in chapter 5. In chapter 4 I compiled the hypotheses on the problem at issue (implicitly or explicitly) made in a number of linguistic models.
As regards the models selected for discussion and evaluation within the scope of this study, I had to restrict my analysis to a manageable number of models. The models I chose are all models which have been broadly discussed and reflected in linguistic circles, though they are by no means the only ones worth considering. My selection was mainly determined by the models' importance in terms of broad recognition, if not acceptance, of their basic tenets, as well as by my own linguistic background. Other models, perhaps equally good candidates for a plausible reflection of the lexicon-syntax interface, such as the formalisms known and widely used in computational linguistics (Tree Adjoining Grammar, Categorial Grammar), had to be discarded. In what follows, I summarize the main findings of my inquiries. I will make them the foundation on which my final evaluation of the plausibility of linguistic models rests.
7.2. Bringing in the harvest

7.2.1. The lexicon-syntax interface as reflected in performance models

Psycholinguistic research has produced supportive evidence for the following views: 1. Syntactic processing cannot work properly without taking lexical information into consideration. In language comprehension, lexical access precedes the construction of a syntactic structure, which can also be assumed for production with the proviso that material other than lexical (i.e., semantic, pragmatic) can also instigate structure building. 2. Syntactic processing in production, i.e., the construction of syntactic frames, is lexically driven. That means that the information accessed with a lexical entry also contains information about the combinatory behavior of the respective entry, thus constraining the possibilities available for the grammatical encoding of the speaker's message. The final coordination between lexical items and structural frames is based on the thematic information which is available from the predicate-argument structures of the lexical entries retrieved, most importantly from that of the verbs, and is also reflected in the frames. From all that, it follows that we assume lexically guided interaction
between lexicon and syntax, with the choices to be made also constrained by content and context. In language comprehension, the language processor starts the construction of syntactic structure immediately/shortly after the first lexical entry has been accessed and proceeds in an incremental way. In both modes, the syntactic parser does not construct structure blindly, merely following the syntactic rules of the respective language and on the basis of word category information, but the structure which is being built primarily depends on the lexical entries already accessed. Owing to the rich lexical representations assumed, the parsing process as well as the process of grammatical encoding is multiply constrained, namely by information about a word's (preferred) word category, its potential predilection for a (number of) thematic role(s), and by lexical "combinatory" or "co-occurrence" information, such as information about the (probabilistic) position/function of the respective word in an utterance, its X-bar and argument structure(s). 3. As already implied in 1., there are also semantic and pragmatic effects on the structure(s) to be chosen for the encoding of one's message. Also the parsing of an utterance does not work without being affected by the contextual information available from the preceding verbal context and from the situational context. As for the temporal arrangement of these context effects, I place them early in the comprehension process, i.e., I assume them to influence initial (syntactic) attachment decisions. For language production, context can also be assumed to exert its influence early. In particular, context effects can be traced in the mechanisms posited for the assignment of functions, which is understood to be controlled by thematic information and by discourse information (discourse and attentional roles). 
These types of information are closely linked to lexical knowledge (subcategorization features or predicate-argument structure of lemmas) on the one hand, and to such factors as perceptual prominence and discourse focus (topicalization and information structure) on the other (cf. Bock & Levelt 1994: 964-965). The latter two factors exert their influence on function assignment by making some participant - due to its mental prominence - more accessible for encoding than others, so that it can take a function that occurs early in the sentence, usually the subject, and thus also constrains the other assignments to be made.
This means that the prominence or salience of participants (arguments and modifiers) has an influence on the positional distribution of the words in the sentence to be uttered. 4. Certain aspects of frame construction are seemingly due to the application of purely syntactic knowledge, especially to that of syntactic categories and the relations between them (hierarchy and serialization). Also for language comprehension, "purely" syntactic knowledge cannot be denied. Parsing is also guided by expectations that are based on knowledge of syntactic rules alone. It seems reasonable to suppose that there is a potential for the operation of "pure" syntactic knowledge, e.g., after a subject-NP has been processed, creating the expectation that some VP will follow. 5. It can be assumed that the language user in both modes does not only draw on word-like lexical entries and syntactic knowledge of how to combine them, but that he also has at hand prefabricated syntactic fragments/clusters which may be used, that is, stored in, and retrieved from, his mental lexicon holistically. The respective fragments may exist in the form of particular structural patterns which are either lexically empty or partly or even completely lexicalized, as is the case for collocations and idioms. That is why I claim that, in the course of the production of an utterance, the syntactic processor need not always have been active when this is suggested by the product, a syntactically structured fragment. In the comprehension process, the existence of such clusters allows the hearer to develop particular expectations for the structures and/or words to follow in the string he is about to understand. Assumptions like these question the common view that words are stored and accessed in the process of language comprehension, whereas syntactic structures are constructed by the application of syntactic rules. 
For the general architecture of the language processor and for the procedures involved, all this implies that it is extremely difficult to draw a dividing line between the lexicon and the syntax (as two constitutive parts of the system) and lexical and syntactic processing (as the related procedures). I feel strongly inclined to conclude that the language processor does not appear to consist of separate, autonomous components which communicate only on an output-input basis. However, the evidence reviewed in sections 3.2.2 and 3.3.2 cannot be
considered sufficient to clearly refute a modular conception of the mind. Most of the data mentioned can also be explained on the basis of a modular account, though less convincingly so, from my point of view. To sum up, the psycholinguistic assumptions I share are basically those made by lexically driven models (for production) and lexical guidance models which additionally provide for pragmatic/contextual influences on the parsing procedure, in particular by constraint-based accounts (for comprehension).
7.2.2. The lexicon-syntax interface as reflected in competence models

In chapter 4 I evaluated a number of linguistic models with regard to what they claim the relationship between lexicon and syntax to be. From the psycholinguistic assumptions I share, it follows that all those models which have received the label of being lexicon-oriented will consequently be considered more natural and plausible candidates for linguistic models than those which attribute a central and/or autonomous status to syntax, where syntactic knowledge operates separately and independently of lexical (and semantic/pragmatic) factors. However, also within the lexicon-oriented models not all the suggestions are equally good candidates in our rating. The syntax-oriented models are less good candidates from the start. In the following, I summarize the comments I made in the discussion of the respective models. Dik's model is lexicon-oriented in its first stages, namely in the formation of core predications. Later, in the course of the clause-structure formation, the assignment of syntactic functions to the constituents of the core predication follows a general assignment strategy; that is, at these stages syntax operates independently of what is specified in the predicates and terms making up the predication. This cannot be assumed to happen for all the expressions of a language, since there are restrictions in the assignment of syntactic functions which arise from lexical information, that is, there are also lexical items which require the functions to be assigned idiosyncratically, deviating from what the general strategy predicts. A second implausible factor in Dik's model is the late placement of serialization processes,
which become effective only after the clause structure has already been specified on the basis of syntactic (function) and pragmatic information. I consider these factors to be sufficient for refusing Dik's model a top rank in the group of plausible linguistic models. Halliday's model is difficult to attribute to the one or other group, since, with its claim of a continuum of lexicon and syntax, it allows for both lexicon and syntax to be made the point of departure in the construction of an utterance. Halliday does not go beyond placing lexicon and syntax side by side and stating that both represent the same thing, namely the inner core of language, though from different (i.e., opposite) perspectives. This implies that syntax is not considered an autonomous language component. Halliday models a language as a systemic network of choices to be made in the verbalization of a message, where lexical choices, that is, choices of particular lexical entries, seem to be the final, the "most delicate" choice. This cannot be considered to reflect what actually goes on in language use, because the lexicon exerts an early influence on the verbalization process. Halliday's model represents all the options in a paradigmatic relation, showing that any choice is related to all the others in the network. However, it does not comment on the place where the speaker might enter the system network (cf. Halliday 1985: xxvii), or which choices he has to make in what sequence in order to encode the intended meaning. This is one reason for rating this model as being not actually natural. From my point of view, Halliday's view of a "lexicogrammar" is taken up and elaborated in Langacker's model, where syntax can be understood to arise from the lexicon via abstraction. However, since these relations are not made explicit in Halliday's model, I cannot but consider it insufficient in this respect either. 
Sinclair, who stands for the theoretical conclusions drawn from corpus-linguistic research in general, arrives at stating that lexicon and syntax are interdependent, and that we must assume co-selection to be effective in the process of utterance formation. Quite naturally, following from his research strategy, his ideas clearly belong to lexicon-orientation in linguistic modelling in that the combinatory behavior of lexical items is brought to light by analysing concordances
from large language corpora. The main point made by the findings of corpus linguistics is the noticeable fact that individual words do not freely combine in a way licensed by syntax, but that what is generally possible is heavily constrained by what is actually and usually selected and said. This difference is also understood to distinguish native from non-native usage of a language. In a number of corpus studies, the restrictions placed by lexicon on syntax are considered sufficient evidence for claiming that syntax is driven by the lexicon. Still, corpus-linguistic claims have a major drawback: They are made on selected linguistic issues only, and, thus, do not represent a self-contained linguistic model. This implies that a number of linguistic phenomena have not (yet) been (comprehensively) addressed in corpus linguistics, as, e.g., the locus at which and the way in which pragmatic factors exert an influence on the form of an utterance. Hence, Sinclair's views cannot be evaluated with regard to being a plausible model; I can merely indicate that, from my point of view, what he generalizes from data analyses could be or become part of such a model. Chomsky's models are known for claiming an autonomous and all-important syntax, so that they should be listed in the group of syntax-oriented models. On closer inspection, however, I found that the lexicon is (implicitly) assumed to have an ever increasing importance, so that in Chomsky's (current) final version of a generative linguistic model, the minimalist program, an autonomous syntax is merely responsible for the eventual joining of X-bar subtrees which have been projected from the lexicon. However, Chomsky himself never actually says that syntax is dependent on the lexicon; in his view, syntax is an autonomous linguistic component, it represents separate linguistic knowledge which cannot be reduced to merely being abstracted from any other sort of knowledge. 
A view like this is a necessary consequence of the modularity hypothesis, according to which syntax (and phonology) must be considered autonomous in that they show regularities that cannot be detected in other knowledge systems and hence cannot be attributed to general cognitive abilities (cf. Fanselow & Felix 1987: 67). As the psycholinguistic claims I share question the status of syntax as an independent and autonomous component, this is reason enough for a lower rating of the Chomskyan models on the scale of naturalness/plausibility of linguistic models.
In Bresnan's model of Lexical Functional Grammar, part of the information necessary for the formation of an utterance is provided twice by the two components under discussion, namely information as to the (syntactic) functions of the participating lexical elements. This does not apply to potential "free" extensions of the structure, that is, those that are not subcategorized by some lexical head (e.g., adjuncts). These do not appear to depend on lexical stipulations, but are left for determination by phrase-structure rules. Hence, leaving aside adjuncts, the model of Lexical Functional Grammar theoretically allows for either the lexicon or the syntax being the driving force in the construction of language structure; it, therefore, can be considered to be lexicon- as well as syntax-oriented, since both components are richly specified and each describes and explains a considerable part of the functioning of language. What remains doubtful to me, however, is the actual and constant application in language use of c-structures inclusive of their functional specifications per se, i.e., the construction of c-structures without being evoked by some concept and the related lexical items, and their subsequent matching with the lexically stipulated grammatical functions in f-structure. From my point of view, this procedure does not differ from the Government-and-Binding view of lexical insertion into D-structures, where the latter are assumed as being generated by the categorial component (the X-bar schema). Although the native speaker will certainly know about the potential of structures his language provides, it is hardly conceivable that he will always and in any case use this knowledge independently of what the lexical choices he has made predict. That is why I do not rate Bresnan's model highly plausible either. 
Diehl's model of Lexical-Generative Grammar can be considered strictly lexical in that syntactic structure is determined by the lexical entries involved in an utterance, that is, grammar is virtually equated with the lexicon. This view strongly deviates from the common understanding of (generative) grammar as a rule system: there are no phrase-structure rules, no transformational rules, no rules for the semantic interpretation of a representation or anything like that; instead, syntactic structure is considered to be characterized exclusively by the lexicon. This implies that also adjuncts are assumed to be stipulated by
a lexical entry, probably that of a clause's verb. Ergo, the model of Lexical-Generative Grammar can be considered to represent the most consistent attempt to incorporate a language's syntax into the lexicon. However, structural rules are smuggled into it in the disguise of redundancy rules, which the model allows for as well. Another shortcoming of the model is its lack of providing for the mechanisms by which pragmatic information might be conceivably signalled in a lexical entry. Is the correspondence between such pragmatically motivated constructions as the cleft-construction or object-fronting (for English) and the lexical entries that, accidentally, occur in them actually anything to consider? Obviously, it is not. For, the constructions under discussion are not predicted or required by the use of (a) particular word(s), rather, they are the result of matching the state of affairs to be verbalized with the extralinguistic and the linguistic contexts, or context and co-text respectively. Hence, I will have to deny this model a top rank in the group of natural/plausible linguistic models either. For, this time, too much power is attributed to the lexicon. Pollard & Sag's model of Head-Driven Phrase Structure Grammar allows for the explanation of speaking, at least in part, as a lexically driven procedure. This is also provided for in the case of such pragmatically motivated structures as just mentioned in the evaluation of Diehl's model by claiming that a verb which subcategorizes a constituent contains information about both its local and non-local realizations. However, I am doubtful about the psychological reality of these assumptions: For, there are utterances in which a constituent is made the topic/theme, although the speaker has not yet an idea of the verb by which this is to be licensed. 
In cases like these, a topicalized object cannot be understood as determined by its non-local realization, but as the verbal realization of the concept that was the first in the speaker's mind, which then must find itself a suitable structural environment, namely a verb with the appropriate subcategorization frame. Ergo, though the model can be attributed to the lexicon-oriented ones, I must, however, say that not all of the procedures discussed in Pollard & Sag's treatise seem very plausible to me as actually going on
in language processing: they are much too complex to be effectively passed through in producing and understanding speech in real time. That is why also Head-Driven Phrase Structure Grammar cannot be listed as a very promising candidate for a natural linguistic model. Deane's theory represents a non-autonomous theory of syntactic competence which heavily relies on the metaphorical transfer of spatial thought, especially the physical-object schema, to the recognition of syntactic patterns or the analysis of syntactic structure respectively. It is characterized by the attempt to motivate syntactic phenomena such as grammatical relations and constituent structure by innate general knowledge (e.g., basic image schemata) and embodied experience, to base syntactic representation on the OBJECT schema. Deane's argument leaves only little room for claims and predictions regarding the interaction between syntax and the lexicon. Provided I have understood Deane correctly, he assumes the difference between syntax and lexicon to consist in the differing degrees of entrenchment of their constitutive elements, i.e., of words and phrases/constructions. He claims that lexical items are highly entrenched, whereas phrases are less entrenched the more complex they are and are therefore often created rather than memorized (cf. Deane 1992: 139). This claim implies that the feature of entrenchment is not restricted to lexical items, but may be valid for phrases as well. From this follows the possibility for syntactic clusters to be either created or memorized. If this can be shown to be a realistic assumption, it will, first, blur the distinction between a language's syntax and the lexicon and, secondly, contradict the reductionist view that everything exhibiting features of being produced by rules is not represented as a separate unit, but - for reasons of the manageability and economy of knowledge - is constructed by applying these rules. As regards the question of lexicon- vs. 
syntax-oriented linguistic models, I cannot readily assign Deane's views to the one or the other. This is because his elaborations are concentrated on syntax, with lexicon being mentioned only occasionally. However, what he posits about the "activation frames" and the language user's capability of memorizing units larger than a word can be understood a consequence of considering lexicon and syntax as phenomena of a continuum which
are not different in principle. Secondly, since the activation due to which syntactic information becomes available is assumed to spread from the focal item to related concepts, it is conceivable that the syntactic information in question is the combinatory information contained in the lexical entries. If this is what Deane implies, a good portion of syntax can actually be understood to arise from the lexicon. However, what is left unspecified in Deane's argument is the way in which, for example, pragmatic factors are assumed to take part in or even shape the process of utterance formation. That is why Deane's model must be considered incomplete and cannot occupy a top rank in the group of plausible linguistic models. Goldberg's construction grammar argues against an entirely lexicon-based approach to grammar, an approach which attributes to individual lexical items all the information needed for the construction of linguistic structures or utterances respectively. In addition to lexical items, she postulates form-meaning correspondences to exist in the language user's mental lexicon in the form of constructions. The point she makes with regard to constructions is that their forms and/or meanings are not predictable. This puts constructions on the same level as morphemes which also show an arbitrary pairing of form and meaning. Hence, no strict division can be drawn between lexicon and constructions. Considering phrases such as put/throw something somewhere, for example, Goldberg speaks for the storing in the language user's memory of both the individual word's (in particular the verb's) syntactic environment and the (generalized) construction as an independent entity (cf. Goldberg 1995: 139). I take this to emphasize that a language's syntax and lexicon cannot be clearly separated from each other. 
This claim is made manifest in the assumption that syntactic units (constructions) are not always constructed from elementary units (words) by the application of combinatory rules. Instead, they can just as well be stored and memorized as such, with the slots provided by the construction being either (semantically and) syntactically specified or lexically filled. However, I doubt the existence of constructions as units independent of the lexical items making them up, because I think the meaning of a construction to be associated with the typical lexical entries that gave rise to their existence. From my point of view, the caused-motion construction, for example, will always be associated
with the events of throwing or putting, so that its meaning must be assumed to derive from these event verbs. Hence, Goldberg's postulation of constructions does not actually argue against an entirely lexicon-based approach to grammar in that the constructions must be considered to be motivated by lexical entries that typically occur in them. This is in line with my argument for a plausible linguistic model, though I do not deny the existence of separate syntactic knowledge, which can also be exploited without being "called" or instigated by a particular lexical item. I must however qualify her model by stating that also here it remains open how for example pragmatic factors can be understood to influence the formation of an utterance. I consider this sufficient reason to refuse Goldberg's model a top rank in the group of plausible linguistic models. Langacker's Cognitive Grammar is the last model of those that I have analysed and evaluated. Since he develops a "usage-based approach", giving special importance to the use of the linguistic system, it should be a promising candidate for a natural/plausible linguistic model. For, as has been argued, I claim the naturalness of a linguistic model to be measurable by its compatibility with what has been found to go on in language use. Langacker's model explicitly reflects the Cognitive Commitment (cf. section 4.4): the claims he considers central to his model include the one that grammar or syntax respectively is not autonomous (cf. Langacker 1987: 2). Linguistic structure is assumed to be understandable and describable only within a broader account of cognitive functioning (cf. Langacker 1987: 64). This holistic view also shows in a number of general psychological phenomena which Langacker considers essential to language, though not limited to it: entrenchment, abstraction, comparison, composition, and association. 
Moreover, Langacker ascribes to Cognitive Grammar the following features: it is maximalist in that it allows for a massive and highly redundant cognitive representation of language; it is non-reductive in that it includes both rules and instantiating expressions; it is bottom-up in that it claims that rules (or schemata, that is, generalizations over
individual language expressions) arise as schematizations of overtly occurring expressions (cf. Langacker 1999: 91-92). Starting out from these fundamental aspects, Langacker elaborates that a grammar contains an inventory of conventional units in the form of (symbolic) expressions and schemas. The expressions are specific in that they are pre-assembled verbal clusters (words or larger units) which the language user does not need to construct from their constitutive elements (morphemes or words), the schemas are abstracted away from these expressions and represent patterns/templates/models for the computation of novel expressions. When a number of expressions give rise to the abstraction of a schema, this does not imply that the (specific) expressions are no longer listed in the grammar. On the contrary, when a language user has acquired expressions as conventional, that is, fixed, units, they are part of his linguistic knowledge and must, consequently, be "listed" as elements of the inventory. That means that such a "redundancy" must also be allowed for with regard to more complex expressions (that is, those beyond the word level): Collocations can be considered as conventional for a large number of English native speakers, although they can just as well be described as being assembled according to the rules of combining simple structures into more complex ones. The provision for both, rules - or rather schemas - and lists is characteristic of cognitive grammar, it makes the cognitive approach to language description both maximalist and non-reductive. These assumptions reflect the impossibility of drawing a clear borderline between lexicon and syntax. Both merely differ along such parameters as generality (or degree of abstraction)/specificity, novelty/re-occurrence and size, and can, therefore, not be understood as being the two poles of a dichotomy. 
Rather, they form a continuum, with lexical items in the classical sense at the one end, and fully productive schemas in the classical sense of syntactic rules at the other. I consider this cognitive understanding of grammar and lexicon to be plausible and natural, since it is compatible with procedures that are assumed to go on in language processing. It allows for both the storage and retrieval of complex linguistic structures and their computation, the choice of the one or the other being automatic, not under voluntary
control, depending on whether a structure is entrenched or not (which is a consequence of frequency of use/reinforcement). This implies that separate syntactic knowledge is also provided for, namely in the assumed existence of "rule schemas". Nevertheless, the model is lexicon-oriented in that syntax, i.e., the schemas/patterns, arises from its instantiations in which particular lexical items occur in a particular structure. In other words, lexical items occurring in repetitive structural constellations give rise to the abstraction of schemas, so that both lexicon and syntax represent the same kind of knowledge, though on different levels of abstraction. From the perspective of language use (Langacker's model is indeed usage-based), the structure in which an utterance is cast must be assumed to eventually originate from the lexicon. In addition to that, the influence exerted by, for example, pragmatic factors is also provided for. Both verbal and situational contexts can be understood to (also) decide upon the construal of the state of affairs to be communicated (cf. footnote 68). From this it follows that I consider Langacker's Cognitive Grammar as the best candidate for a natural linguistic model. It seemingly explains best where lexicon and grammar meet.
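The interplay of storage and computation described for Cognitive Grammar can be illustrated with a toy sketch. It is a simplification under stated assumptions, not part of Langacker's model: the class `Inventory`, the constant `ENTRENCHMENT_THRESHOLD`, and the example phrase are hypothetical, and entrenchment is crudely approximated by a raw count of earlier uses.

```python
from collections import Counter

# Hypothetical cut-off: number of earlier uses after which a unit counts as entrenched.
ENTRENCHMENT_THRESHOLD = 3


class Inventory:
    """Toy inventory of conventional units; every use is recorded (non-reductive)."""

    def __init__(self):
        self.usage = Counter()

    def is_entrenched(self, expression):
        # Entrenchment is modelled here as a consequence of frequency of use only.
        return self.usage[expression] >= ENTRENCHMENT_THRESHOLD

    def produce(self, words):
        """Return the route taken ('retrieved' or 'computed') and the expression."""
        expression = " ".join(words)
        # The choice is automatic: it depends only on the current state of the inventory.
        route = "retrieved" if self.is_entrenched(expression) else "computed"
        # Each use reinforces the unit, whichever route produced it.
        self.usage[expression] += 1
        return route, expression


inventory = Inventory()
routes = [inventory.produce(["strong", "tea"])[0] for _ in range(4)]
print(routes)  # ['computed', 'computed', 'computed', 'retrieved']
```

In this sketch a phrase is assembled afresh until it has been used often enough, after which it is retrieved as a whole, while a phrase that has not yet been encountered is still computed.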
7.2.3. The lexicon-syntax interface as reflected by performance data
After having evaluated linguistic claims with respect to the lexicon-syntax interrelation by comparing them with psycholinguistic findings and testing them for their compatibility with the latter, I also thought it interesting to look for further clues to how a language's lexicon and syntax relate to each other. In line with what I said about the value of introspection, performance, and experimentally elicited data for the verification/falsification of the hypotheses made in linguistic models (cf. section 4.2.3), I considered it worthwhile to analyse a number of performance data for what they can tell us about the problems I am concerned with. Since psycholinguistic models, to a large extent, result from analyses of performance data and thus represent generalizations about language use, they can be considered an intermediate step in the logical reasoning
from performance to competence. Hence it follows that, when I try to infer directly from performance data what they reveal about the interrelation between lexicon and syntax, this works via one or another claim made in models of language processing. Within the scope of this study, I first analysed reformulations, or rather self-repairs, to find out what the mechanisms of their production are, and whether these mechanisms are provided for by the general design of the linguistic models under discussion. Secondly, I analysed overlaps, i.e., moments in a conversation at which both interlocutors speak simultaneously, as to what they can tell us about language comprehension and whether the procedures involved are explicable by linguistic models. A third type of performance data I took from corpus-linguistic research results, in particular from the discovery of not only syntactic but also an impressive number of lexical patterns in language use. Drawing on what corpus linguists have revealed about lexical patterning, I once again asked whether the linguistic models under discussion can sufficiently account for lexical patterning, and which psycholinguistic claims this phenomenon can be taken to support. The major results and answers I can give are listed in what follows:
Repairs show that lexicon and syntax operate in close cooperation in the process of formulating an utterance in that the words needed for the conveyance of the intended message constrain the syntactic structures the speaker will be able to or will need to construct. On the other hand, syntactic structures - once they have been called either by some word already accessed, by a completed phrase or some semantic/pragmatic factor - also influence or constrain the selection of the lexical material needed to fill the slots provided by those structures.
That means that (the structures of and the motivation for) repairs reveal that, though knowledge of different levels of abstraction (lexical and syntactic) is involved in producing an utterance, one cannot generally and always delimit the exact contribution of the one or the other to the production procedure. Many repairs indicate that the syntactic information determining the structure that is being constructed comes from the lexicon in the form of the combinatory information contained in the lexical entries that are needed for the expression of the intended message. The noticeable effect of syntactic constraints exerted by
lexical entries makes it especially difficult to demarcate syntax from the lexicon. Secondly, I think that the results obtained from my analysis are indicative of speech production being a predominantly lexically driven and incremental procedure. Many repairs show that syntactic frames are adapted to the selected lexical material, and not vice versa. Moreover, we could find that quite a substantial amount of neglect of syntactic well-formedness in oral texts does not basically impair their comprehension. Thus, it is not mere speculation to conclude that the successful conveyance of a message seems to depend more urgently on the selection of semantically appropriate lexical material than on the overall well-formedness of the utterance. This amounts to saying that utterance parts which are syntactically incoherent can often be understood correctly, whereas inappropriately or even erroneously selected words seriously affect the message in that they will cause an interpretation different from the one intended by the speaker. Thirdly, there are also data suggesting that syntactic fragments are not always and generally generated by combining the selected words according to the possibilities licensed by syntax. Speakers occasionally use syntactically complex units which do not seem to be freely constructed at the moment at which the utterance is being produced, but seem to be remembered as lexico-syntactic chunks. An assumption like this would, among other things, help to explain the immense rapidity of speaking (and understanding), admittedly at the cost of higher demands on the storage capacity of the human mind and a more extensive storing activity. 
Finally, the results obtained do not imply that speakers do not have and use knowledge of syntactic rules and regularities per se, that syntactic knowledge can be reduced to the knowledge of the contextual information contained in the lexical entries selected for the expression of a particular utterance, or that it does not play any distinguishable role in utterance production. On the contrary, speakers will use "purely" syntactic knowledge at particular stages in the formulation process, for example, after a phrase has been constructed which does not constrain the further form the utterance is going to have, or when material from the conceptualizer can be expressed at various positions and/or in
various forms, as is the case for adjuncts or some kinds of modifiers (e.g., the choice between a reduced and an unreduced relative clause modifying a noun). As for my analysis, it has shown that separate syntactic knowledge also comes into play in situations of "speech need". When a word form is not available, speakers certainly do know about the syntactic properties of the missing word or they do know how to generate an alternative structure in which the missing word is not needed or which can accommodate an alternative expression/word. But - heretically - I can ask where this knowledge comes from. In the course of language acquisition, children experience words used in particular constellations for the expression of particular ideas without having the theoretical concepts of phrase structure, word categories and the like available from the very beginning. These concepts are more likely a result of this experience and the children's growing ability to recognize and extract patterns from what they perceive. So, from my point of view, it seems justified to say that the "pure" syntactic knowledge the speaker has available and draws on is abstracted from his experience of word strings/combinations for the expression of particular meanings (that is, from instantiated syntax); it is a result of pattern recognition in speech.
Overlaps have been found to be illuminating with respect to comprehension processes. As a result of my analysis, I claim that overlaps support the idea that a word's (probabilistic) co-occurrence information and the (linguistic and situational) context in which it is embedded exert a strong influence on the comprehension process, and that they do so not only after a syntactic representation has been constructed by the hearer, but as soon as the first word has been accessed. In particular, overlaps are indicative of the interaction between lexical and syntactic information in the comprehension process.
In fact, they make it obvious that lexical entries contain syntactic information which is exploited by (the speaker and) the hearer to build structure into which further incoming input has to be incorporated. They also show that the hearer's expectations can just as well be based on "separate" syntactic knowledge, for instance, on his knowledge of what the general rules for potential combinations of word categories predict for a well-formed utterance. This is illustrated by the fact that the hearer is almost
always correct in his guess at the word category of the word he contributes to the utterance in the overlap. Apart from that, the overlaps produced by the hearer also show the early exploitation of semantic information in the comprehension process. In most cases, the listener is able to infer the actual word to be used in a particular slot, effectively exploiting the conceptual representation he has already constructed from the preceding utterance as well as from the situational context. The early use of semantic knowledge, even before the syntactic representation is completely available to the hearer, is a challenge to all those models in which semantics is considered a purely interpretative component. Finally, though I can find support from my analysis of overlaps for the existence and exploitation of lexicon-inherent co-occurrence information, I have not come across many overlaps that are related to collocations, that is, where the words contributed by the hearer to the speaker's utterance are part of a collocation. If they are, this fact alone makes the actual words to follow predictable, that is, they are predictable also in context-free usage or in a neutral context (for example, this is true for the phrase give her/him/them my ... /love/ uttered in a suitable context). I am fairly certain that the analysis of a larger corpus would bring to light more examples of overlaps in which the part contributed by the hearer consists of one or more word(s) habitually co-occurring with the word(s) previously uttered.
The separate discussion of lexical co-occurrences has revealed that language products are obviously rich in lexical patterns.
As could be shown, there is considerable agreement on the existence of habitually recurring lexico-syntactic strings in texts, and the seemingly large amount thereof allows for the assumption of a "hybrid" linguistic element: an element which has features of a single word (it has an integrated meaning, it has the potential of being used invariantly, as a (quasi-) lexicalized unit) and features of a construction to be built in the process of language processing (the element is analysable, its constitutive elements may be flexibly put together, but they contain biasing co-occurrence information, with the bias being more or less strong). I consider this assumption to be especially important for my discussion of natural linguistic models. It is not only based on the
observations and comments just mentioned, but it also rests on one of the features ascribed to collocations: Owing to their arbitrary nature, collocations must be acquired/remembered individually. This necessarily implies that the mental lexicon does not only comprise single words, but also larger phraseological units of either a fixed or a more variable type (for a comparable argument see Kjellmer 1991).
7.2.4. The lexicon-syntax interface as reflected by experimentally elicited data
Since the postulation of a "hybrid" linguistic unit, uniting features of both words and phrases, is a strong claim, especially with regard to the basic differences commonly assumed for lexicon and syntax, I thought it necessary to get closer to the actual cognitive status of these hybrids. That is why I turned to the third type of evidence available for hypothesis-testing, that is, I designed an experiment from which I hoped to obtain the respective information, or at least some indication thereof. The experiment was meant to discover differences in the processing of collocations as against that of free constructions. If any such differences can be found, this would imply either that collocations have a special status (more exactly, a word-like status) as compared to free phrases (hypothesis 1), or that the procedures involved in the processing of collocations are sped up by strongly biasing co-occurrence information (hypothesis 2). The experiment was designed to get at that by measuring the size of the repetition effect in a lexical decision task. It brought to light some interesting differences regarding these effects as they are reflected in the reaction times for the processing of collocations vs. that of comparable productive phrases. The results, especially the ones revealing a significant interaction between phrase type and type of repetition (that is, exact or "distributed" repetition of collocations and the matched free phrases), can be assumed to indicate differences in the procedures employed in the processing of collocations on the one hand, and freely constructed phrases on the other. Since some of the differences elicited do, however, fail to reach significance for either the item or the subject
analyses, further experiments need to be carried out in order to find out about the reliability of the effects and the plausibility of the conclusions drawn. Future experiments will, therefore, have to aim at the replication of the findings made so far, with special attention paid to the item lists regarding such factors as frequency of occurrence and familiarity. More importantly, future experiments need to aim at revealing what the actual differences in the processing of the two phrase types are, since the design and the results of the experiment described here do not allow for any definite conclusion with respect to the particular status of collocations as compared to that of free phrases. The differences elicited may just as well reflect mere processing differences which arise from variously strong (or weak) contextual constraints exerted by the co-occurrence information included in the lexical entries of the respective (constitutive) words. This seems to be the minimal conclusion to be drawn.
7.3. Evaluating the findings - a psychologically plausible linguistic model
As stated repeatedly, the compatibility with psycholinguistic assumptions is not the only criterion on the basis of which one can decide upon the naturalness in the sense of psychological plausibility of a linguistic model. It is also mere performance data and experimentally elicited data which are influential in awarding this feature to a linguistic model. In conclusion I compile what the consequences are for linguistic models when they are to meet the requirement of incorporating and explaining performance and experimental data. As for the question of how the facts we have elicited from our analysis of repairs can be accommodated in descriptions of the language system, I conclude that for a linguistic model to be natural it must reflect both the interaction and the fuzzy borderline between lexicon and syntax. Since especially those repairs resulting in a change of the syntactic frame suggest that syntax changes on lexical and semantic/pragmatic grounds, competence models should also allow for an appropriate place of the lexicon. That means that lexical entries
should rank as possible candidates for the setting of a syntactic frame, or, more generally speaking, that syntax should - at least to some extent - be treated as an abstraction from the lexicon. Moreover, natural linguistic models should also reflect the piecemeal use of its elements, which would probably result in discarding or at least re-considering the status of the sentence as one of the basic syntactic units to be assumed. Instead, it seems more natural to base the establishment of syntactic units of the language system on phonetic and semantic/pragmatic considerations (e.g., on the analysis of phonemic clauses and semantic entities/units). In doing so, the resulting system could also incorporate those phenomena that have hitherto been attributed to a grammar of the "spoken language", e.g., syntactically incomplete structures, or syntactically incongruent combinations of utterance parts, which markedly differ from the units and structures we find in the written medium and which are the forms which a conventional linguistic system is not expected to describe and explain. From all that has been found for overlaps, the following claims have a special impact on the conceptualization of a natural linguistic model: A language's lexicon and syntax do not operate separately, but in close co-operation. The construction of a message is predominantly lexicon-driven. That is why syntax cannot be considered more basic or important than other subcomponents of the language system. On the contrary, some of my findings can even be taken to speak against its ubiquity. Besides, also semantic/pragmatic factors have turned out to be inseparable from syntax, that is, the use of particular syntactic structures is not only motivated by the word categories that need to be combined, but also by semantic/pragmatic considerations (e.g., the level of concept specificity, the focus of the utterance, etc.). 
From what has been revealed by corpus linguistic analyses about lexical co-occurrences and the existence of lexico-syntactic clusters, it seems necessary that a natural/plausible linguistic model reflect the existence of such elements. In other words, since the hypothesized existence of (quasi-) lexicalized utterance fragments implies that linguistic information may be represented more than once, in various forms (here in the form of a pre-fabricated cluster - potentially with the words fully specified side by side with the knowledge of the rules for
combining words into such a string), it is implausible that linguistic models should be free of redundancy. Redundancy-free models do not reflect the language facts appropriately when they "merely" posit a set of productive grammar rules and a lexicon, without leaving room for an inventory of lexicalized and semi-lexicalized utterance fragments. Apart from that, the phenomenon of "collocation" is further evidence for a blurred distinction between a language's lexicon and syntax: Fragments which look as if they were produced by combining words according to the licensed syntactic rules (may) turn out to be stored as wholes and accessed and retrieved from memory in a way comparable to single words. Linguistic models could incorporate such a phenomenon by considering lexical and syntactic knowledge not as totally different in type, but as linguistic knowledge which merely differs with regard to the level of abstraction. Considering the support the experiment carried out can give to the plausibility of models of language processing, the results may be taken to be indicative of constraint-based mechanisms in sentence comprehension. They provide evidence for the particular effect of the probabilistic co-occurrence information which is assumed to be part of a word's lexical entry in the mental lexicon of the language user. Thus, they contribute to the plausibility of the assumption of a lexically dominated parsing procedure in that they show that the processing of particular syntactic units (phrases and beyond) is dependent on information stored with individual lexical items. This claim does not generally dispute the existence and exploitation of separate/pure syntactic knowledge.
The latter can be assumed to be effective in the parsing of syntactic units whose constitutive words do not contain co-occurrence information biasing towards one particular structure, that is, units where the construction of structure solely depends on the word-category information included in the encountered words and the general combinatory possibilities licensed by the syntax of a language. The experimental results also reflect a very close interaction between lexicon and syntax in language comprehension, an interaction which goes beyond that established by the factor of "word category" (which is simultaneously part of a lexical entry and the item over which syntactic rules operate). That is why I take them to speak for a fuzzy boundary between the two and against the strict autonomy of the one from the
other. Consequently, these results can also be considered to support my concept and understanding of a natural/plausible linguistic model. For, only linguistic models which allow for the interaction between syntax and the lexicon accordingly, can also incorporate and explain the phenomena found to be related to the processing of collocations. From my understanding of the linguistic models discussed within the scope of this study, there is only one model which readily meets all these requirements just summarized - Langacker's Cognitive Grammar. Hence, this model is found to be the most plausible one with respect to what it specifies and predicts about a disputed area in linguistic knowledge - where lexicon and syntax meet.
References
Aaronson, Doris and Robert W. Rieber (eds.) 1979 Psycholinguistic Research: Implications and Applications. Hillsdale, New Jersey: Erlbaum.
Aarts, Jan and Willem Meijs (eds.) 1984 Corpus Linguistics. Amsterdam: Rodopi.
    1990 Theory and Practice in Corpus Linguistics. Amsterdam etc.: Rodopi.
Abraham, Werner 1988 Terminologie der neueren Linguistik. Tübingen: Niemeyer.
Aijmer, Karin and Bengt Altenberg (eds.) 1991 English Corpus Linguistics. London etc.: Longman.
Allerton, David J. 1982 Valency and the English Verb. London: Academic Press Inc.
Altenberg, Bengt 1991 Amplifier collocations in spoken English. In: Stig Johansson and Anna-Brita Stenström (eds.), English Computer Corpora, 127-147. Berlin etc.: Mouton de Gruyter.
Altenberg, Bengt and Mats Eeg-Olofsson 1990 Phraseology in spoken English: presentation of a project. In: Jan Aarts and Willem Meijs (eds.), Theory and Practice in Corpus Linguistics, 1-26. Amsterdam etc.: Rodopi.
Altmann, Gerry T. M. 1987 Modularity and interaction in sentence processing. In: Jay L. Garfield (ed.), Modularity in Knowledge Representation and Natural-Language Understanding, 249-257. Cambridge, Mass. etc.: MIT.
Altmann, Gerry T. M. (ed.) 1990 Cognitive Models of Speech Processing. Psycholinguistic and Computational Perspectives. Cambridge, Mass. etc.: MIT.
Andrews, Avery D. 1988 Lexical Structure. In: Frederick J. Newmeyer (ed.), Linguistics: The Cambridge Survey. Vol.1, 60-88. Cambridge etc.: CUP.
Armstrong, Susan (ed.) 1994 Using Large Corpora. Cambridge, Mass. etc.: MIT.
Asher, R. E. and J. M. Y. Simpson (eds.) 1994 The Encyclopedia of Language and Linguistics. Oxford etc.: Pergamon Press.
Baayen, R. Harald 1996 Modelling the processing of morphologically complex words. In: Ton Dijkstra and Koenraad de Smedt (eds.), Computational Psycholinguistics, 166-191. London etc.: Taylor & Francis.
Bader, Markus and Ingeborg Lasser 1994 German verb-final clauses and sentence processing: evidence for immediate attachment. In: Charles Clifton, Lyn Frazier and Keith Rayner (eds.), Perspectives on Sentence Processing, 225-242. Hillsdale, New Jersey etc.: Erlbaum.
Bainbridge, J. Vivian, Stephan Lewandowsky and Kim Kirsner 1993 Context effects in repetition priming are sense effects. Journal of Memory and Cognition 21 (5): 619-626.
Baker, Mona, Gill Francis and Elena Tognini-Bonelli (eds.) 1993 Text and Technology. Amsterdam/Philadelphia: Benjamins.
Balota, David A., Giovanni B. Flores d'Arcais and Keith Rayner (eds.) 1990 Comprehension Processes in Reading. Hillsdale, New Jersey etc.: Erlbaum.
Benson, Morton, Evelyn Benson and Robert Ilson 1986 The BBI Combinatory Dictionary of English. A Guide to Word Combinations. Amsterdam/Philadelphia: Benjamins.
    1990 Student's Dictionary of Collocations. Berlin: Cornelsen.
Benson, James D., Michael J. Cummings and William S. Greaves (eds.) 1988 Linguistics in a Systemic Perspective. Amsterdam/Philadelphia: Benjamins.
Blackmer, Elizabeth R. and Janet L. Mitton 1991 Theories of monitoring and the timing of repairs in spontaneous speech. Cognition 39: 173-194.
Bock, J. Kathryn 1982 Towards a cognitive psychology of syntax. Psychological Review 89: 1-40.
    1991 A sketchbook of production problems. Journal of Psycholinguistic Research 20: 141-160.
    1995 Sentence production: from mind to mouth. In: Joanne Miller and Peter D. Eimas (eds.), Speech, Language, and Communication, 181-216. San Diego etc.: Academic Press.
Bock, J. Kathryn and Anthony S. Kroch 1989 The isolability of syntactic processing. In: Greg N. Carlson and Michael K. Tanenhaus (eds.), Linguistic Structure in Language Processing, 157-196. Dordrecht etc.: Kluwer Academic Publishers.
Bock, J. Kathryn and Willem J. M. Levelt 1994 Language production. Grammatical encoding. In: Morton Ann Gernsbacher (ed.), Handbook of Psycholinguistics, 945-984. San Diego etc.: Academic Press.
Bolinger, Dwight 1975 Aspects of Language. New York etc.: Harcourt Brace Jovanovich.
Bornstein, Diane D. 1976 Readings in the Theory of Grammar. Cambridge, Mass.: Winthrop Publ.
Boskovic, Zeljko 1994 D-structure, theta-criterion, and movement into theta-positions. Linguistic Analysis 24 (3-4): 247-286.
Bower, Gordon H. (ed.) 1975 The Psychology of Learning and Motivation (Vol.9). New York etc.: Academic Press.
Bowman, Elizabeth 1966 The minor and fragmentary sentences of a corpus of spoken English. International Journal of American Linguistics 32 (3): 1668.
Bresnan, Joan (ed.) 1982 The Mental Representation of Grammatical Relations. Cambridge, Mass. etc.: MIT.
Bresnan, Joan and Jonni-Miikka Kanerva 1992 Locative inversion in Chichewa: A case study of factorization in grammar. In: Tim Stowell and Eric Wehrli (eds.), Syntax and Semantics Vol.26, Syntax and the Lexicon, 53-101. San Diego etc.: Academic Press.
Bresnan, Joan and Ronald M. Kaplan 1982 Introduction: Grammars as mental representations of language. In: Joan Bresnan (ed.), The Mental Representation of Grammatical Relations, xvii-lii. Cambridge, Mass. etc.: MIT.
Bright, William (ed.) 1992 International Encyclopedia of Linguistics. Oxford: OUP.
Brody, Michael 1995 Lexico-Logical Form. Cambridge, Mass. etc.: MIT.
Bußmann, Hadumod 1990 Lexikon der Sprachwissenschaft. Stuttgart: Kröner.
Butterworth, Brian 1982 Speech errors: Old data in search of new theories. In: Anne Cutler (ed.), Slips of the Tongue and Language Production, 73-108. Berlin etc.: Mouton de Gruyter.
Butterworth, Brian (ed.) 1980 Language Production: Vol. I, Speech and Talk. London: Academic Press.
Campos, Hector and Paula Kempchinsky (eds.) 1995 Evolution and Revolution in Linguistic Theory. Washington D.C.: Georgetown University Press.
Caplan, David 1987 Neurolinguistics and Linguistic Aphasiology. Cambridge etc.: CUP.
Caplan, David and Nancy Hildebrandt 1988 Disorders of Syntactic Comprehension. Cambridge, Mass. etc.: MIT.
Caplan, David, Andre Roch Lecours and Alan Smith (eds.) 1984 Biological Perspectives on Language. Cambridge, Mass. etc.: MIT.
Carlson, Greg N. and Michael K. Tanenhaus (eds.) 1989 Linguistic Structure in Language Processing. Dordrecht etc.: Kluwer Academic Publishers.
Carlson, Laura A., AnnJanette R. Alejano and Thomas H. Carr 1991 The level-of-focal-attention hypothesis in oral reading: Influence of strategies on the context specificity of lexical repetition effects. Journal of Experimental Psychology: Learning, Memory, and Cognition 17 (5): 924-931.
Carroll, Marie and Kim Kirsner 1982 Context and repetition effects in lexical decision and recognition memory. Journal of Verbal Learning and Verbal Behavior 21: 55-69.
Casad, Eugene H. (ed.) 1996 Cognitive Linguistics in the Redwoods. Berlin etc.: Mouton de Gruyter.
Chomsky, Noam 1955 Logical Structure in Linguistic Theory. Cambridge, Mass.: MIT Library.
    1957 Syntactic Structures. The Hague etc.: Mouton.
    1965 Aspects of the Theory of Syntax. Cambridge, Mass.: MIT.
    1966 Cartesian Linguistics: A chapter in the history of rationalist thought. Lanham etc.: University Press of America.
    1970 Remarks on nominalization. In: Roderick A. Jacobs and Peter S. Rosenbaum (eds.), Readings in English Transformational Grammar, 184-221. Waltham, Mass. etc.: Ginn.
    1986a Knowledge of Language. Its Nature, Origin and Use. New York etc.: Praeger.
    1986b Barriers. Cambridge, Mass.: MIT.
    1988 Lectures on Government and Binding. Dordrecht: Foris Publications.
    1990 Language and innateness. In: William G. Lycan (ed.), Mind and Cognition, 627-646. Oxford etc.: Basil Blackwell.
    1991 Linguistics and adjacent fields: A personal view. In: Asa Kasher (ed.), The Chomskyan Turn, 3-25. Cambridge, Mass. etc.: Blackwell.
    1993 A minimalist program for linguistic theory. In: Kenneth Hale and Samuel Keyser (eds.), The View from Building 20. Essays in Linguistics in Honor of Sylvain Bromberger, 1-52. Cambridge, Mass. etc.: MIT.
    1995 Bare phrase structure. In: Gert Webelhuth (ed.), Government and Binding Theory and the Minimalist Program, 383-439. Oxford etc.: Blackwell.
Clahsen, Harald 1988 Normale und gestörte Kindersprache. Amsterdam/Philadelphia: Benjamins.
Clark, Eve V. 1995 Language acquisition: The lexicon and syntax. In: Joanne Miller and Peter D. Eimas (eds.), Speech, Language, and Communication, 303-337. San Diego etc.: Academic Press.
Clear, Jeremy 1993 From Firth principles. Computational tools for the study of collocations. In: Mona Baker, Gill Francis and Elena Tognini-Bonelli (eds.), Text and Technology, 271-292. Amsterdam/Philadelphia: Benjamins.
Clifton, Charles and Fernanda Ferreira 1987 Modularity in sentence comprehension. In: Jay L. Garfield (ed.), Modularity in Knowledge Representation and Natural-Language Understanding, 277-290. Cambridge, Mass. etc.: MIT.
Clifton, Charles Jr., Lyn Frazier and Keith Rayner (eds.) 1994 Perspectives on Sentence Processing. Hillsdale, New Jersey etc.: Erlbaum.
Cogen, C. et al. (eds.) 1975 Proceedings of the First Annual Meeting of the Berkeley Linguistic Society. Berkeley, California.
Collins, Peter 1991 Will and shall in Australian English. In: Stig Johansson and Anna-Brita Stenström (eds.), English Computer Corpora, 181-199. Berlin etc.: Mouton de Gruyter.
Connine, Cynthia 1990 Effects of sentence context and lexical knowledge in speech processing. In: Gerry T. M. Altmann (ed.), Cognitive Models of Speech Processing. Psycholinguistic and Computational Perspectives, 281-294. Cambridge, Mass. etc.: MIT.
Cooper, William E. 1980 Syntactic-to-phonetic coding. In: Brian Butterworth (ed.), Language Production: Vol. 1, Speech and Talk, 297-333. London: Academic Press.
Cooper, William E. and Edward C. T. Walker (eds.) 1979 Sentence Processing: Psycholinguistic Studies Presented to Merrill Garrett. Hillsdale, New Jersey: Erlbaum.
Cowie, Anthony P., Ronald Mackin and Isabel MacCaig (eds.) 1975-1983 Oxford Dictionary of Current Idiomatic English, I-II (MacCaig coeditor of part II only). London etc.: OUP.
Croft, William 1997 Some contributions of typology to cognitive linguistics. Conference paper from the International Cognitive Linguistics Conference (ICLC) 1997, Amsterdam (unpublished).
Crystal, David 1987 The Cambridge Encyclopedia of Language. Cambridge etc.: CUP.
    1992 An Encyclopedic Dictionary of Language and Languages. Oxford etc.: Blackwell.
Cutler, Anne 1989 Auditory lexical access: Where do we start? In: William Marslen-Wilson (ed.), Lexical Representation and Process, 342-356. Cambridge, Mass. etc.: MIT.
    1995 Spoken word recognition and production. In: Joanne Miller and Peter D. Eimas (eds.), Speech, Language, and Communication, 97-136. San Diego etc.: Academic Press.
Cutler, Anne (ed.) 1982 Slips of the Tongue and Language Production. Berlin etc.: Mouton de Gruyter.
Davies, Martin and Louise Ravelli (eds.) 1992 Advances in Systemic Linguistics. London etc.: Pinter Publishers.
Davis, Philip (ed.) 1995 Alternative Linguistics: Descriptive and Theoretical Modes. Amsterdam/Philadelphia: Benjamins.
Deane, Paul D. 1992 Grammar in Mind and Brain. Explorations in Cognitive Syntax. Berlin etc.: Mouton de Gruyter.
    1996 Neurological evidence for a cognitive theory of syntax: Agrammatic aphasia and the spatialization of form hypothesis. In: Eugene H. Casad (ed.), Cognitive Linguistics in the Redwoods, 55-115. Berlin etc.: Mouton de Gruyter.
Dell, Gary S. 1986 A spreading-activation theory of retrieval in sentence production. Psychological Review 93 (3): 283-321.
Dell, Gary S. and Cornell Juliano 1996 Computational models of phonological encoding. In: Ton Dijkstra and Koenraad de Smedt (eds.), Computational Psycholinguistics, 328-359. London etc.: Taylor & Francis.
Dell, Gary S., Cornell Juliano and Anita Govindjee 1993 Structure and content in language production: A theory of frame constraints in phonological speech errors. Cognitive Science 17: 149-195. Dell, Gary S. and Peter A. Reich 1981 Stages in sentence production: An analysis of speech error data. Journal of Verbal Learning and Verbal Behavior 20: 611-629. de Haan, Pieter 1991 On the exploration of corpus data by means of problem-oriented tagging: Postmodifying clauses in the English noun phrase. In: Stig Johansson and Anna-Brita Stenström (eds.), English Computer Corpora, 51-65. Berlin etc.: Mouton de Gruyter. de Smedt, Koenraad 1996 Computational models of incremental encoding. In: Ton Dijkstra and Koenraad de Smedt (eds.), Computational Psycholinguistics, 279-307. London etc.: Taylor & Francis. Diehl, Lon G. 1981 Lexical-Generative Grammar. Toward a Lexical Conception of Linguistic Structure. Ann Arbor, Michigan: University Microfilms International. Dijkstra, Ton and Gerard Kempen 1993 Taalpsychologie. Groningen: Wolters-Noordhoff. Dijkstra, Ton and Koenraad de Smedt (eds.) 1996a Computational Psycholinguistics. London etc.: Taylor & Francis. Dijkstra, Ton and Koenraad de Smedt 1996b Computational models in psycholinguistics: An introduction. In: Ton Dijkstra and Koenraad de Smedt (eds.) Computational Psycholinguistics, 3-23. London etc.: Taylor & Francis. Dik, Simon C. 1980 Studies in Functional Grammar. London etc.: Academic Press. 1987 Some principles of functional grammar. In: Rene Dirven and Vilem Fried (eds.), Functionalism in Linguistics, 81-100. Amsterdam/Philadelphia: Benjamins. 1989 The Theory of Functional Grammar. The Structure of the Clause. Dordrecht etc.: Foris Publications. 1991 Functional grammar. In: Flip G. Droste and John E. Joseph (eds.), Linguistic Theory and Grammatical Description, 247-274. Amsterdam/Philadelphia: Benjamins. Dirven, Rene and Vilem Fried (eds.) 1987 Functionalism in Linguistics. Amsterdam/Philadelphia: Benjamins.
Dodd, Bill 1997
Exploiting a corpus of written German for advanced language learning. In: Anne Wichmann, Steven Fligelstone, Tony McEnery and Gerry Knowles (eds.), Teaching and Language Corpora, 131-145. London etc.: Longman. Droste, Flip G. and John E. Joseph (eds.) 1991 Linguistic Theory and Grammatical Description. Amsterdam/Philadelphia: Benjamins. Ellis, Andrew W. (ed.) 1982 Normality and Pathology in Cognitive Functions. London etc.: Academic Press. 1985 Progress in the Psychology of Language, 1. Hillsdale, New Jersey etc.: Erlbaum. 1987 Progress in the Psychology of Language, 3. London etc.: Erlbaum. Emons, Rudolf 1976 Valenzgrammatik für das Englische. Eine Einführung. Tübingen: Niemeyer. Fanselow, Gisbert and Sascha W. Felix 1987 Sprachtheorie. 1 Grundlagen und Zielsetzungen, 2 Die Rektions- und Bindungstheorie. Tübingen: UTB. Fauconnier, Gilles 1994 Mental Spaces. Cambridge etc.: CUP. Fauconnier, Gilles and Mark Turner 1996 Blending as a central process of grammar. In: Adele Goldberg (ed.), Conceptual Structure, Discourse and Language, 113-130. Stanford: CSLI Publ. 1998 Conceptual integration networks. Cognitive Science 22 (2): 133-187. Fawcett, Robin P. 1988 The English personal pronouns: An exercise in linguistic theory. In: James D. Benson, Michael J. Cummings and William S. Greaves (eds.), Linguistics in a Systemic Perspective, 185-220. Amsterdam/Philadelphia: Benjamins. Felix, Sascha W. 1987 Cognition and Language Growth. Dordrecht etc.: Foris Publications. Felix, Sascha, Siegfried Kanngießer and Gert Rickheit (eds.) 1990 Sprache und Wissen. Opladen: Westdeutscher Verlag. Fillmore, Charles J. 1975 An alternative to checklist theories of meaning. In: Cogen et al. (eds.), Proceedings of the First Annual Meeting of the Berkeley Linguistic Society, 123-131. Berkeley, California. 1985 Frames and the semantics of understanding. Quaderni di Semantica 4 (2): 222-254.
Fillmore, Charles J. and Beryl T. Atkins 1992 Toward a frame-based lexicon: The semantics of RISK and its neighbors. In: Adrienne Lehrer and Eva F. Kittay (eds.), Frames, Fields and Concepts, 75-102. Hillsdale, New Jersey etc.: Erlbaum. Firth, John R. 1957 A synopsis of linguistic theory, 1930-1955. Studies in Linguistic Analysis, Special Volume. Philological Society, 1-32. Fligelstone, Steve, Paul Rayson and Nicholas Smith 1996 Template analysis: Bridging the gap between grammar and the lexicon. In: Jenny Thomas and Mick Short (eds.), Using Corpora for Language Research, 181-207. London etc.: Longman. Flores d'Arcais, Giovanni B. 1990 Parsing principles and language comprehension during reading. In: David A. Balota, Giovanni B. Flores d'Arcais and Keith Rayner (eds.), Comprehension Processes in Reading, 345-357. Hillsdale, New Jersey etc.: Erlbaum. Fodor, Jerry A. 1983 The Modularity of Mind. Cambridge, Mass. etc.: MIT. Fodor, Janet D. 1990 Thematic roles and modularity: Comments on the chapters by Frazier and Tanenhaus et al. In: Gerry T. M. Altmann (ed.), Cognitive Models of Speech Processing. Psycholinguistic and Computational Perspectives, 432-456. Cambridge, Mass. etc.: MIT. Forster, Kenneth I. 1976 Accessing the mental lexicon. In: Roger J. Wales and Edward Walker (eds.), New Approaches to Language Mechanisms, 257-287. Amsterdam: North Holland. 1979 Levels of processing and the structure of the language processor. In: William E. Cooper and Edward C. T. Walker (eds.), Sentence Processing: Psycholinguistic Studies Presented to Merrill Garrett, 27-85. Hillsdale, New Jersey: Erlbaum. 1989 Basic issues in lexical processing. In: William Marslen-Wilson (ed.), Lexical Representation and Process, 75-107. Cambridge, Mass. etc.: MIT. 1990 Lexical processing. In: Daniel N. Osherson and Howard Lasnik (eds.), An Invitation to Cognitive Science, 95-131. Cambridge, Mass.: MIT. 1994 Computational modeling and elementary process analysis in visual word recognition. 
Journal of Experimental Psychology: Human Perception and Performance 20 (6): 1292-1310. Forster, Kenneth I. and Chris Davies 1984 Repetition priming and frequency attenuation in lexical access. Journal of Experimental Psychology, Learning and Cognition 10 (4): 680-698.
Fox, Gwyneth 1993
A comparison of 'Policespeak' and 'Normalspeak': A preliminary study. In: John McH. Sinclair, Michael Hoey and Gwyneth Fox (eds.), Techniques of Description, 183-195. London etc.: Routledge.
Fowler, Carol A. 1995 Speech production. In: Joanne Miller and Peter D. Eimas (eds.), Speech, Language, and Communication, 29-61. San Diego etc.: Academic Press. Francis, Gill 1993 A corpus-driven approach to grammar. In: Mona Baker, Gill Francis and Elena Tognini-Bonelli (eds.), Text and Technology, 137-156. Amsterdam/Philadelphia: Benjamins. Francis, W. Nelson 1992 Language corpora B.C. In: Jan Svartvik (ed.), Trends in Linguistics. Studies and Monographs. Directions in Corpus Linguistics, 17-32. Berlin etc.: Mouton de Gruyter. Frauenfelder, Uli H. 1996 Computational models of spoken word recognition. In: Ton Dijkstra and Koenraad de Smedt (eds.), Computational Psycholinguistics, 114-138. London etc.: Taylor & Francis. Frazier, Lyn 1987 Theories of sentence processing. In: Jay L. Garfield (ed.), Modularity in Knowledge Representation and Natural-Language Understanding, 291-307. Cambridge, Mass. etc.: MIT. 1989 Against lexical generation of syntax. In: William Marslen-Wilson (ed.), Lexical Representation and Process, 505-528. Cambridge, Mass. etc.: MIT. 1990 Exploring the architecture of the language-processing system. In: Gerry T. M. Altmann (ed.), Cognitive Models of Speech Processing. Psycholinguistic and Computational Perspectives, 409-433. Cambridge, Mass. etc.: MIT. 1995 Issues of representation in psycholinguistics. In: Joanne Miller and Peter D. Eimas (eds.), Speech, Language, and Communication, 1-29. San Diego etc.: Academic Press. Frazier, Lyn and Charles Jr. Clifton 1996 Construal. Cambridge, Mass. etc.: MIT. Freidin, Robert 1992 Foundations of Generative Syntax. Cambridge, Mass.: MIT. Fromkin, Victoria A. 1971 The non-anomalous nature of anomalous utterances. Language 47: 27-52. Garfield, Jay L. (ed.) 1987 Modularity in Knowledge Representation and Natural-Language Understanding. Cambridge, Mass. etc.: MIT.
Garman, Michael 1990 Psycholinguistics. Cambridge etc.: CUP. Garnham, Alan 1985 Psycholinguistics. London etc.: Methuen & Co. Garrett, Merrill F. 1975 The analysis of sentence production. In: Gordon H. Bower (ed.), The Psychology of Learning and Motivation, (Vol.9), 133-177. New York etc.: Academic Press. 1980 Levels of processing in sentence production. In: Brian Butterworth (ed.), Language Production: Vol. 1, Speech and Talk, 177-220. London: Academic Press. 1982 Production of speech: Observations from normal and pathological language use. In: Andrew W. Ellis (ed.), Normality and Pathology in Cognitive Functions, 19-76. London etc.: Academic Press. 1984 The organization of processing structure for language production: applications to aphasic speech. In: David Caplan (ed.), Biological Perspectives on Language, 172-193. Cambridge, Mass.: MIT. 1988 Processes in language production. In: Frederick J. Newmeyer (ed.), Linguistics: The Cambridge Survey, Vol. 3, 69-96. Cambridge etc.: CUP. Garside, Roger, Geoffrey Leech and Geoffrey Sampson (eds.) 1987 The Computational Analysis of English. A Corpus-Based Approach. London etc.: Longman. Gavioli Laura 1997 Exploring texts through the concordancer: Guiding the learner. In: Anne Wichmann, Steven Fligelstone, Tony McEnery and Gerry Knowles (eds.), Teaching and Language Corpora, 83-115. London etc.: Longman. Geeraerts, Dirk 1993 Cognitive semantics and the history of philosophical epistemology. In: Richard A. Geiger and Brygida Rudzka-Ostyn (eds.), Conceptualizations and Mental Processing in Language, 53-79. Berlin etc.: Mouton de Gruyter. Geiger, Richard A. and Brygida Rudzka-Ostyn (eds.) 1993 Conceptualizations and Mental Processing in Language, Berlin etc.: Mouton de Gruyter. Gernsbacher, Morton Ann (ed.) 1994 Handbook of Psycholinguistics. San Diego etc.: Academic Press. Ghomeshi, Jila and Diane Massam 1994 Lexical/syntactic relations without projection. Linguistic Analysis 24: 175-217. Gibbs, Raymond W. 
1994 The Poetics of Mind. Cambridge etc.: CUP.
1995 What's cognitive about cognitive linguistics? In: Eugene H. Casad (ed.), Cognitive Linguistics in the Redwoods, 27-53. Berlin etc.: Mouton de Gruyter.
Goldberg, Adele 1995 Constructions. Chicago etc.: The University of Chicago Press. 1996 Jackendoff and Construction-Based Grammar. Cognitive Science 7: 3-19. Goldberg, Adele (ed.) 1996 Conceptual Structure, Discourse and Language. Stanford: CSLI Publ. Goshen-Gottstein, Jonathan and Morris Moscovitch 1995 Repetition priming effects for newly formed associations are perceptually based: Evidence from shallow encoding and format specificity. Journal of Experimental Psychology: Learning, Memory and Cognition 21 (5): 1249-1262. Gove, Philip Babcock (ed.) 1961 Webster's Third New International Dictionary of the English Language. Chicago etc.: Encyclopaedia Britannica. Grainger, Jonathan and Ton Dijkstra 1996 Visual word recognition: Models and experiments. In: Ton Dijkstra and Koenraad de Smedt (eds.), Computational Psycholinguistics, 139-165. London etc.: Taylor & Francis. Greenbaum, Sidney 1992 A new corpus of English: ICE. In: Jan Svartvik (ed.), Trends in Linguistics. Studies and Monographs. Directions in Corpus Linguistics, 171-179. Berlin etc.: Mouton de Gruyter. Greenbaum, Sidney, Gerald Nelson and Michael Weitzman 1996 Complement clauses in English. In: Jenny Thomas and Mick Short (eds.), Using Corpora for Language Research, 76-91. London etc.: Longman. Grewendorf, Günther 1992 Parametrisierung der Syntax. Zur kognitiven Revolution in der Linguistik. In: Ludger Hoffmann (ed.), Deutsche Syntax, 11-73. Berlin etc.: Mouton de Gruyter. Grimshaw, Jane B. 1992 Argument Structure. Cambridge, Mass. etc.: MIT. Gropen, Jess, Steven Pinker, Michelle Hollander, Richard Goldberg and Ronald Wilson 1989 The learnability and acquisition of the dative alternation in English. Language 65 (2): 203-257. Hale, Kenneth and Samuel J. Keyser (eds.) 1993 The View from Building 20. Essays in Linguistics in Honor of Sylvain Bromberger. Cambridge, Mass. etc.: MIT.
Halliday, Michael A. K. 1976a A brief sketch of systemic grammar. In: Gunther Kress (ed.), Halliday: System and Function in Language, 3-6. London: OUP. 1976b The form of functional grammar. In: Gunther Kress (ed.), Halliday: System and Function in Language, 7-25. London: OUP. 1976c Categories of the theory of grammar. In: Gunther Kress (ed.), Halliday: System and Function in Language, 52-72. London: OUP. 1985 A Short Introduction to Functional Grammar. London etc.: Arnold. 1991 Corpus studies and probabilistic grammar. In: Karin Aijmer and Bengt Altenberg (eds.), English Corpus Linguistics, 30-43. London etc.: Longman. 1992 Language as system and language as instance: The corpus as a theoretical construct. In: Jan Svartvik (ed.), Trends in Linguistics. Studies and Monographs. Directions in Corpus Linguistics, 61-77. Berlin etc.: Mouton de Gruyter. 1994 An Introduction to Functional Grammar. London etc.: Edward Arnold. Halliday, Michael A. K. and Robin Fawcett (eds.) 1987 New Developments in Systemic Linguistics 1: Theory and Description. London etc.: Pinter. Halliday, Michael A. K. and Ruqaiya Hasan 1990 Language, context, and text: aspects of language in a sociosemiotic perspective. Oxford etc.: OUP. Handke, Jürgen 1995 The Structure of the Lexicon. Berlin etc.: Mouton de Gruyter. Hankamer, Jorge 1989 Morphological parsing and the lexicon. In: William Marslen-Wilson (ed.), Lexical Representation and Process, 392-408. Cambridge, Mass. etc.: MIT. Hasan, Ruqaiya 1987 The grammarian's dream: Lexis as most delicate grammar. In: Michael A. K. Halliday and Robin Fawcett (eds.), New Developments in Systemic Linguistics 1: Theory and Description, 184-211. London etc.: Pinter. Haspelmath, Martin 1994 Functional categories, X-bar theory, and grammaticalization theory. Sprachtypologische Universalienforschung 37 (1): 3-15. Hawkins, Joyce M. and Robert Allen (eds.) 1991 The Oxford Encyclopedic English Dictionary. Oxford: Clarendon Press. 
Heath, David, Thomas Herbst, Ian Roe and Dieter Götz (in preparation) Collins Cobuild Dictionary of English Word Complementation. London: HarperCollins.
Helbig, Gerhard (ed.) 1971 Beiträge zur Valenztheorie. The Hague: Mouton. Henderson, Eugenie J. A. 1987 J. R. Firth in retrospect: A view from the eighties. In: Ross Steele and Terry Threadgold (eds.), Language Topics. Essays in Honour of M. Halliday, 57-68. Amsterdam/Philadelphia: Benjamins. Herbst, Thomas, Rita Stoll and Rudolf Westermayr 1991 Terminologie der Sprachbeschreibung. Ismaning: Hueber. Hillert, Dieter (ed.) 1994 Linguistics and cognitive neuroscience. Linguistische Berichte. Sonderheft 6. Hoey, Michael 1993 A common signal in discourse: How the word reason is used in texts. In: John McH. Sinclair, Michael Hoey and Gwyneth Fox (eds.), Techniques of Description, 67-82. London etc.: Routledge. Holmes, John N. 1988 Speech Synthesis and Recognition. Wokingham, Berks: Van Nostrand Reinhold. Holmes, V. M., L. Stowe and L. Cupples 1989 Lexical expectations in parsing complement-verb sentences. Journal of Memory and Language 28: 668-689. Horrocks, Geoffrey 1987 Generative Grammar. London etc.: Longman. Jackendoff, Ray S. 1983 Semantics and Cognition. Cambridge, Mass.: MIT. 1992 Babe Ruth homered his way into the hearts of America. In: Tim Stowell and Eric Wehrli (eds.), Syntax and Semantics 26. Syntax and the Lexicon, 155-178. San Diego etc.: Academic Press. Jacobs, Roderick A. and Peter S. Rosenbaum (eds.) 1970 Readings in English Transformational Grammar. Waltham, Mass. etc.: Ginn. Johansson, Stig 1992 Comments. In: Jan Svartvik (ed.), Trends in Linguistics. Studies and Monographs. Directions in Corpus Linguistics, 332-334. Berlin etc.: Mouton de Gruyter. Johansson, Stig and Signe Oksefjell 1996 Towards a unified account of the syntax and semantics of GET. In: Jenny Thomas and Mick Short (eds.), Using Corpora for Language Research, 57-75. London etc.: Longman. Johansson, Stig and Anna-Brita Stenström (eds.) 1991 English Computer Corpora. Berlin etc.: Mouton de Gruyter. Johnson, Mark 1987 The Body in the Mind. Chicago etc.: The University of Chicago Press.
1992 Philosophical implications of cognitive semantics. Cognitive Linguistics 3-4: 345-366.
Kasher, Asa (ed.) 1991 The Chomskyan Turn. Cambridge, Mass. etc.: Blackwell. Keller, Jörg and Helen Leuninger 1993 Grammatische Strukturen - kognitive Prozesse. Tübingen: Narr. Kelly, Edward F. and Philip J. Stone 1975 Computer Recognition of English Word Senses. Amsterdam etc.: North-Holland Publication Company. Kempchinsky, Paula 1995 From the lexicon to the syntax: The problem of subjunctive clauses. In: Hector Campos and Paula Kempchinsky (eds.), Evolution and Revolution in Linguistic Theory, 228-250. Washington D.C.: Georgetown University Press. Kempen, Gerard 1996 Computational models of syntactic processing in language comprehension. In: Ton Dijkstra and Koenraad de Smedt (eds.), Computational Psycholinguistics, 192-220. London etc.: Taylor & Francis. Kempen, Gerard and Edward Hoenkamp 1987 Incremental procedural grammar for sentence formulation. Cognitive Science 11: 201-258. Kempen, Gerard and Peter Huijbers 1983 The lexicalization process in sentence production and naming: Indirect election of words. Cognition 14: 185-209. Kennedy, Graeme 1991 Between and through: The company they keep and the functions they serve. In: Karin Aijmer and Bengt Altenberg (eds.), English Corpus Linguistics, 95-110. London etc.: Longman. Kess, Joseph F. 1992 Psycholinguistics. Psychology, Linguistics, and the Study of Natural Language. Amsterdam/Philadelphia: Benjamins. Kirkeby, O. F. 1994 Cognitive science. In: R. E. Asher and J. M. Y. Simpson (eds.), The Encyclopedia of Language and Linguistics, 593-600. Oxford etc.: Pergamon Press. Kittay, Eva F. and Adrienne Lehrer 1992 Introduction. In: Adrienne Lehrer and Eva F. Kittay (eds.), Frames, Fields and Concepts, 1-18. Hillsdale, New Jersey etc.: Erlbaum. Kjellmer, Göran 1984 Some thoughts on collocational distinctiveness. In: Jan Aarts and Willem Meijs (eds.), Corpus Linguistics, 163-171. Amsterdam: Rodopi.
1991 A mint of phrases. In: Karin Aijmer and Bengt Altenberg (eds.), English Corpus Linguistics, 111-127. London etc.: Longman. 1992 Grammatical or nativelike? In: Gerhard Leitner (ed.), New Directions in English Language Corpora, 329-344. Berlin etc.: Mouton de Gruyter. 1994 Dictionary of English Collocations, Based on the Brown Corpus. Oxford: Clarendon.
Klatt, Dennis H. 1989 Review of selected models of speech perception. In: William Marslen-Wilson (ed.), Lexical Representation and Process, 169-226. Cambridge, Mass. etc.: MIT. Konieczny, Lars, Barbara Hemforth, Christoph Scheepers and Gerhard Strube 1997 The role of lexical heads in parsing: Evidence from German. Language and Cognitive Processes 12 (2-2): 307-328. Kress, Gunther (ed.) 1976 Halliday: System and Function in Language. London: OUP. Kucera, Henry and W. Nelson Francis 1967 Computational Analysis of Present-Day American English. Providence: Brown University Press. Kuhn, Thomas S. 1978 Die Entstehung des Neuen. Frankfurt am Main: Suhrkamp. Lakoff, George 1987 Women, Fire, and Dangerous Things. Chicago etc.: The University of Chicago Press. 1990 The invariance hypothesis: Is abstract reason based on image-schemas? Cognitive Linguistics 1: 39-74. 1993 The contemporary theory of metaphor. In: Andrew Ortony (ed.), Metaphor and Thought, 202-251. Cambridge: CUP. Lamb, Sydney 1991 Syntax: Reality or illusion? LACUS-Forum 18: 179-185. Langacker, Ronald W. 1987 Foundations of Cognitive Grammar 1. Stanford, California: Stanford University Press. 1988a An overview of cognitive grammar. In: Brygida Rudzka-Ostyn (ed.), Topics in Cognitive Linguistics, 3-48. Amsterdam/Philadelphia: Benjamins. 1988b A view of linguistic semantics. In: Brygida Rudzka-Ostyn (ed.), Topics in Cognitive Linguistics, 49-90. Amsterdam/Philadelphia: Benjamins. 1988c A usage-based model. In: Brygida Rudzka-Ostyn (ed.), Topics in Cognitive Linguistics, 127-161. Amsterdam/Philadelphia: Benjamins. 1990 Subjectification. Cognitive Linguistics 1-1: 5-38.
1991a Foundations of Cognitive Grammar 2. Stanford, California: Stanford University Press. 1991b Cognitive grammar. In: Flip G. Droste and John E. Joseph (eds.), Linguistic Theory and Grammatical Description, 275-306. Amsterdam/Philadelphia: Benjamins. 1995 Viewing in cognition and grammar. In: Philip Davis (ed.), Alternative Linguistics: Descriptive and Theoretical Modes, 153-212. Amsterdam/Philadelphia: Benjamins. 1999 Grammar and Conceptualization. Berlin etc.: Mouton de Gruyter. Lapointe, Steven G. and Gary S. Dell 1989 A synthesis of some recent work in sentence production. In: Greg N. Carlson and Michael K. Tanenhaus (eds.), Linguistic Structure in Language Processing, 107-156. Dordrecht etc.: Kluwer Academic Publishers. Leech, Geoffrey 1981 Semantics. Harmondsworth: Penguin Books. 1992 Corpora and theories of linguistic performance. In: Jan Svartvik (ed.), Trends in Linguistics. Studies and Monographs. Directions in Corpus Linguistics, 105-122. Berlin etc.: Mouton de Gruyter. Lehrer, Adrienne and Eva F. Kittay (eds.) 1992 Frames, Fields and Concepts. Hillsdale, New Jersey etc.: Erlbaum. Leitner, Gerhard (ed.) 1992 New Directions in English Language Corpora. Berlin etc.: Mouton de Gruyter. Levelt, Willem J. M. 1983 Monitoring and self-repair in speech. Cognition 14: 41-104. 1989 Speaking. From Intention to Articulation. Cambridge, Mass. etc.: MIT. Levinson, Stephen C. 1983 Pragmatics. London etc.: CUP. Lewandowski, Theodor 1990 Linguistisches Wörterbuch 1-3. Heidelberg: Quelle & Meyer. Lively, Scott E., David B. Pisoni and Stephen D. Goldinger 1994 Spoken word recognition. In: Morton Ann Gernsbacher (ed.), Handbook of Psycholinguistics, 265-301. San Diego etc.: Academic Press. Lombardi, Linda and Mary C. Potter 1992 The regeneration of syntax in short term memory. Journal of Memory and Language 31: 713-733. Louw, Bill 1993 Irony in the text or insincerity in the writer? 
The diagnostic potential of semantic prosodies. In: Mona Baker, Gill Francis and Elena Tognini-Bonelli (eds.), Text and Technology, 157-176. Amsterdam/Philadelphia: Benjamins.
Lycan, William G. (ed.) 1990 Mind and Cognition. Oxford etc.: Blackwell. Lyne, Anthony A. 1988 Systemic syntax from a lexical point of view. In: James D. Benson, Michael J. Cummings and William S. Greaves (eds.), Linguistics in a Systemic Perspective, 53-72. Amsterdam/Philadelphia: Benjamins. Lyons, John (ed.) 1970 New Horizons in Linguistics. Baltimore: Penguin Books. 1971 Einführung in die moderne Linguistik. München: Beck. MacDonald, Maryellen C. 1993 The interaction of lexical and syntactic ambiguity. Journal of Memory and Language 32 (5): 692-715. MacDonald, Maryellen C., Neal J. Pearlmutter and Mark S. Seidenberg 1994a Lexical nature of syntactic ambiguity resolution. Psychological Review 101: 676-703. 1994b Syntactic ambiguity resolution as lexical ambiguity resolution. In: Charles Jr. Clifton, Lyn Frazier and Keith Rayner (eds.), Perspectives on Sentence Processing, 123-153. Hillsdale, New Jersey etc.: Erlbaum. Mackin, Ronald 1978 On collocations: "Words shall be known by the company they keep". In: Peter Strevens (ed.), In Honour of A. S. Hornby, 149-165. Oxford: OUP. Mair, Christian 1991 Quantitative or qualitative corpus analysis? Infinitival complement clauses in the Survey of English Usage corpus. In: Stig Johansson and Anna-Brita Stenström (eds.), English Computer Corpora, 67-80. Berlin etc.: Mouton de Gruyter. Malmkjaer, Kirsten (ed.) 1991 The Linguistic Encyclopaedia. London: Routledge. Marantz, Alec 1995 The minimalist program. In: Gert Webelhuth (ed.), Government and Binding Theory and Minimalist Program, 349-382. Oxford etc.: Blackwell. Maratsos, Michael 1979 How to get from words to sentences. In: Doris Aaronson and Robert W. Rieber (eds.), Psycholinguistic Research: Implications and Applications, 265-283. Hillsdale, New Jersey: Erlbaum. Marslen-Wilson, William (ed.) 1989 Lexical Representation and Process. Cambridge, Mass. etc.: MIT.
Marslen-Wilson, William and Lorraine K. Tyler 1987 Against modularity. In: Jay L. Garfield (ed.), Modularity in Knowledge Representation and Natural-Language Understanding, 37-62. Cambridge, Mass. etc.: MIT. Massaro, Dominic W. 1994 Psychological aspects of speech perception. In: Morton Ann Gernsbacher (ed.), Handbook of Psycholinguistics, 219-264. San Diego etc.: Academic Press. 1996 Modelling multiple influences in speech perception. In: Ton Dijkstra and Koenraad de Smedt (eds.), Computational Psycholinguistics, 85-113. London etc.: Taylor & Francis. Matthews, Robert J. 1991 Psychological reality of grammars. In: Asa Kasher (ed.), The Chomskyan Turn, 182-199. Cambridge, Mass. etc.: Blackwell. McClelland, James L. and David E. Rumelhart 1981 An interactive activation model of context effects in letter perception: Part 1. Psychological Review 88 (5): 375-407. McLaughlin, Margaret L. 1984 Conversation. Beverly Hills etc.: Sage Publication. McNeill, David 1979 Natural processing units of speech. In: Doris Aaronson and Robert W. Rieber (eds.), Psycholinguistic Research: Implications and Applications, 241-261. Hillsdale, New Jersey etc.: Erlbaum. Mey, Jacob L. 1993 Pragmatics. Oxford etc.: Blackwell. Miller, Joanne L. and Peter D. Eimas (eds.) 1995 Speech, Language, and Communication. San Diego etc.: Academic Press. Mitchell, Don C. 1994 Sentence parsing. In: Morton Ann Gernsbacher (ed.) Handbook of Psycholinguistics, 375-410. San Diego etc.: Academic Press. Morton, John 1969 The interaction of information in word recognition. Psychological Review 76: 165-178. Murre, Jacob M. J. and Rainer Goebel 1996 Connectionist modelling. In: Ton Dijkstra and Koenraad de Smedt (eds.), Computational Psycholinguistics, 49-81. London etc.: Taylor & Francis. Neumann, Odmar 1990 Lexical access: Some comments on models and metaphors. In: David A. Balota, Giovanni B. Flores d'Arcais and Keith Rayner (eds.), Comprehension Processes in Reading, 165-185. Hillsdale, New Jersey etc: Erlbaum.
Newman, John 1996 GIVE. A Cognitive Linguistic Study. Berlin etc.: Mouton de Gruyter. Newmeyer, Frederick 1988 Linguistics: The Cambridge Survey 1. Linguistic Theory: Foundations. Cambridge etc.: CUP. Nygaard, Lynne C. and David B. Pisoni 1995 Speech perception: New directions in research and theory. In: Joanne L. Miller and Peter D. Eimas (eds.), Speech, Language, and Communication, 63-96. San Diego etc.: Academic Press. Ortony, Andrew 1993 Metaphor and Thought. Cambridge: CUP. Osherson, Daniel N. and Howard Lasnik (eds.) 1990 An Invitation to Cognitive Science: Language 1. Cambridge, Mass. etc.: MIT. Partington, Alan 1993 Corpus evidence of language change. In: Mona Baker, Gill Francis and Elena Tognini-Bonelli (eds.), Text and Technology, 177-192. Amsterdam/Philadelphia: Benjamins. Pearlmutter, Neal J. and Maryellen C. MacDonald 1995 Individual differences and probabilistic constraints in syntactic ambiguity resolution. Journal of Memory and Language 34 (4): 521-542. Pechmann, Thomas 1994 Sprachproduktion: Zur Generierung komplexer Nominalphrasen. Opladen: Westdeutscher Verlag. Perfetti, Charles A. 1990 The cooperative language processors: Semantic influences in an autonomous syntax. In: David A. Balota, Giovanni B. Flores d'Arcais and Keith Rayner (eds.), Comprehension Processes in Reading, 205-230. Hillsdale, New Jersey etc.: Erlbaum. Piattelli-Palmarini, Massimo 1994 Ever since language and learning: Afterthoughts on the Piaget-Chomsky debate. Cognition 50: 315-346. Pinker, Steven 1989 Learnability and Cognition. The Acquisition of Argument Structure. Cambridge, Mass. etc.: MIT. Plaut, David C. 1997 Structure and function in the lexical system: Insights from distributed models of word reading and lexical decision. Language and Cognitive Processes 12 (5/6): 765-805. Pollard, Carl and Ivan A. Sag 1987 Information-Based Syntax and Semantics 1. Stanford: CSLI. 1994 Head-Driven Phrase Structure Grammar. Chicago etc.: The University of Chicago Press.
Potter, Mary C. and Barbara A. Faulconer 1979 Understanding noun phrases. Journal of Verbal Learning and Verbal Behavior 18: 509-521. Quirk, Randolph 1992 On corpus principles and design. In: Jan Svartvik (ed.), Trends in Linguistics. Studies and Monographs. Directions in Corpus Linguistics, 457-469. Berlin etc.: Mouton de Gruyter. Quirk, Randolph, Sidney Greenbaum, Geoffrey Leech and Jan Svartvik (eds.) 1985 A Comprehensive Grammar of the English Language. London etc.: Longman. Radford, Andrew 1988 Transformational Grammar. Cambridge etc.: CUP. Renouf, Antoinette and John McH. Sinclair 1991 Collocational frameworks in English. In: Karin Aijmer and Bengt Altenberg (eds.), English Corpus Linguistics, 128-143. London etc.: Longman. Robins, Robert H. 1991 A Short History of Linguistics. London etc.: Longman. Roelofs, Ardi 1996 Computational models of lemma retrieval. In: Ton Dijkstra and Koenraad de Smedt (eds.), Computational Psycholinguistics, 308-327. London etc.: Taylor & Francis. Rudzka-Ostyn, Brygida 1993 Introduction. In: Richard A. Geiger and Brygida Rudzka-Ostyn (eds.), Conceptualizations and Mental Processing in Language, 1-20. Berlin etc.: Mouton de Gruyter. Rudzka-Ostyn, Brygida (ed.) 1988 Topics in Cognitive Linguistics. Amsterdam/Philadelphia: Benjamins. Rumelhart, David E., James L. McClelland and the PDP Research Group (eds.) 1986 Parallel Distributed Processing: Explorations in the Microstructure of Cognition. 1-2. Cambridge, Mass. etc.: MIT. Samuel, Arthur G. 1990 Using perceptual-restoration effects to explore the architecture of perception. In: Gerry T. M. Altmann (ed.), Cognitive Models of Speech Processing. Psycholinguistic and Computational Perspectives, 295-314. Cambridge, Mass. etc.: MIT. Scarborough, Don L., Charles Cortese and Hollis Scarborough 1977 Frequency and repetition effects in lexical memory. Journal of Experimental Psychology: Human Perception and Performance 7: 3-12. Schmitz, Ulrich 1992 Computerlinguistik: Eine Einführung. 
Opladen: Westdeutscher Verlag.
Schreuder, Robert and Giovanni Flores d'Arcais 1989 Psycholinguistic issues in the lexical representation of meaning. In: William Marslen-Wilson (ed.), Lexical Representation and Process, 409-435. Cambridge, Mass. etc.: MIT. Schwarz, Monika 1992 Einführung in die kognitive Linguistik. Tübingen: Francke. Scovel, Thomas 1998 Psycholinguistics. Oxford etc.: OUP. Seidenberg, Mark S. 1990 Lexical access: Another theoretical soupstone. In: David A. Balota, Giovanni B. Flores d'Arcais and Keith Rayner (eds.), Comprehension Processes in Reading, 33-71. Hillsdale, New Jersey etc.: Erlbaum. 1995 Visual word recognition: An overview. In: Joanne Miller and Peter D. Eimas (eds.), Speech, Language, and Communication, 137-179. San Diego etc.: Academic Press. Seidenberg, Mark S. and James L. McClelland 1989 A distributed, developmental model of word recognition and naming. Psychological Review 96 (4): 523-568. Shapiro, Lewis P., H. Nicholas Nagel and Beth A. Levine 1993 Preferences for a verb's complements and their use in sentence processing. Journal of Memory and Language 32: 96-114. Shapiro, Lewis P. and Cynthia K. Thompson 1994 On lexical properties, syntax and brain damage. In: Dieter Hillert (ed.), Linguistics and cognitive neuroscience. Linguistische Berichte. Sonderheft 6: 168-201. Shapiro, Lewis P., Edgar B. Zurif and Jane Grimshaw 1989 Verb processing during sentence comprehension: Contextual impenetrability. Journal of Psycholinguistic Research 18: 223-243. Short, Mick, Elena Semino and Jonathan Culperer 1996 Using a corpus for stylistic research: Speech and thought presentation. In: Jenny Thomas and Mick Short (eds.), Using Corpora for Language Research, 110-131. London etc.: Longman. Sigurd, Bengt 1992 Comments. In: Jan Svartvik (ed.), Trends in Linguistics. Studies and Monographs. Directions in Corpus Linguistics, 123-125. Berlin etc.: Mouton de Gruyter. Sinclair, John McH. 1987a Collocation: A progress report. 
In: Ross Steele and Terry Threadgold (eds.), Language Tpoics. Essays in Honour of M. Halliday, 319-331. Amsterdam/Philadelphia: Benjamins. 1988 Sense and structure in lexis. In: James D. Benson, Michael J. Cummings and William S. Greaves (eds.), Linguistics in a Systemic Perspective, 73-97. Amsterdam/Philadelphia: Benjamins.
References 1991 1992a
325
Corpus, Concordances, Collocations. Oxford etc.: OUP. The automatic analysis of corpora. In: Jan Svartvik (ed.), Trends in Linguistics. Studies and Monographs. Directions in Corpus Linguistics, 379-397. Berlin etc.: Mouton de Gruyter. 1992b Trust the text. In: Martin Davies and Louise Ravelli (eds.), Advances in Systemic Linguistics, 5-19. London etc.: Pinter Publishers. Sinclair, John McH. (ed.) 1987b The Collins Cobuild English Language Dictionary. London etc.: Collins. 1987c Looking up. An Account of the COBUILD Project in Lexical Computing. London etc.: Collins. 1990 Collins Cobuild English Grammar. London: Collins. Sinclair, John McH., Michael Hoey and Gwyneth Fox (eds.) 1993 Techniques of Description. London etc.: Routledge. Smadja, Frank 1994 Retrieving collocations from text: Xtract. In: Susan Armstrong (ed.), Using Large Corpora, 143-177. Cambridge, Mass. etc.: MIT. Sproat, Richard 1992 Lexicon in formal grammar. In: William Bright (ed.), International Encyclopedia of Linguistics, 335-336. Oxford etc.: OUP. Steedman, Mark, J. 1989 Grammar, interpretation, and processing from the lexicon. In: William Marslen-Wilson (ed.), Lexical Representation and Process, 463-504. Cambridge, Mass. etc.: MIT. Steele, Ross and Terry Threadgold (eds.) 1987 Language Topics. Essays in Honour of M. Halliday. Amsterdam/Philadelphia: Benjamins. Steiner, Erich 1983 Die Entwicklung des Britischen Kontextualismus. Heidelberg: Julius Groos. Stemberger, Joseph P. 1982 The lexicon in a model of language production, Ph.D. dissertation, University of California, San Diego. 1985 An interactive activation model of language production. In: Andrew W. Ellis (ed.), Progress in the Psychology of Language, 1. Hillsdale New Jersey etc.: Erlbaum. Stowell, Tim 1992 The role of the lexicon in syntactic theory. In: Tim Stowell and Eric Wehrli (eds.), Syntax and Semantics 26. Syntax and the Lexicon , 952. San Diego etc.: Academic Press. Stowell, Tim and Eric Wehrli (eds.) 1992 Syntax and Semantics 26. 
Syntax and the Lexicon. San Diego etc.: Academic Press.
326
References
Strevens, Peter (ed.) 1978 In Honour of A. S. Hornby. Oxford: OUP. Strohner, Hans 1995 Kognitive Systeme. Opladen: Westdeutscher Verlag. Stubbs, Michael 1993 British traditions in text analysis. From Firth to Sinclair. In: Mona Baker, Gill Francis and Elena Tognini-Bonelli (eds.), Text and Technology, 1-33. Amsterdam/Philadelphia: Benjamins. 1995 Collocations and semantic profiles: on the cause of the trouble with quantitative methods. Functions of Language 2(1): 1-33. 1996 Text and Corpus Analysis. Oxford etc.: Blackwell. Svartvik, Jan (ed.) 1992 Trends in Linguistics. Studies and Monographs. Directions in Corpus Linguistics. Berlin etc.: Mouton de Gruyter. Tanenhaus, Michael K. and Greg N. Carlson 1989 Lexical structure and language comprehension. In: William Marslen-Wilson (ed.) Lexical Representation and Process, 529561. Cambridge, Mass. etc.: MIT. Tanenhaus, Michael K., Gary S. Dell and Greg N. Carlson 1987 Context effects in lexical processing: A connectionist approach to modularity. In: Jay L. Garfield (ed.), Modularity in Knowledge Representation and Natural-Language Understanding, 83-108. Cambridge, Mass. etc.: MIT. Tanenhaus, Michael K., Susan M. Garnsey and Julie Boland 1990 Combinatory lexical information and language comprehension. In: Gerry Τ. M. Altmann (ed.), Cognitive Models of Speech Processing. Psycholinguistic and Computational Perspectives, 383-408. Cambridge, Mass. etc.: MIT. Tanenhaus, Michael K. and John C. Trueswell 1995 Sentence comprehension. In: Joanne L. Miller and Peter D. Eimas (eds.), Speech, Language, and Communication, 217-262. San Diego etc.: Academic Press. Taraban, Roman and James L. McClelland 1988 Constituent attachment and thematic role assignment in sentence processing: Influences of content-based expectations. Journal of Memory and Language 27: 597-632 1990 Parsing and comprehension: A multiple-constraint view. In: David A. Balota, Giovanni B. 
Flores d'Arcais and Keith Rayner (eds.), Comprehension Processes in Reading, 231-263. Hillsdale, New Jersey etc.: Erlbaum. Thomas, Jenny and Mick Short (eds.) 1996 Using Corpora for Language Research. London etc.: Longman.
References
327
Thompson, Sandra A. 1992 Functional grammar. In: William Bright (ed.), International Encyclopedia of Linguistics, 37-39. New York etc.: OUP. Tognini-Bonelli, Elena, 1993 Actual and actually. In: Mona Baker, Gill Francies and Elena Tognini-Bonelli (eds.), Text and Technology, 193-212. Amsterdam/Philadelphia: Benjamins. Trask, Robert L. 1993 A Dictionary of Grammatical Terms in Linguistics. London etc.: Routledge. Trueswell, John C. and Michael K. Tanenhaus 1994 Toward a lexicalist framework for constraint-based syntactic ambiguity resolution. In: Charles Jr. Clifton, Lyn Frazier and Keith Rayner (eds.), Perspectives on Sentence Processing, 155-179. Hillsdale, New Jersey etc.: Erlbaum. Trueswell, John C., Michael K. Tanenhaus and Susan M. Garnsey 1994 Semantic influences on parsing: Use of thematic information in syntactic ambiguity resolution. Journal of Memory and Language 33:285-318. Tuggy, David 1997 On the storage vs. computation of complex linguistic structures, conference paper from the ICLC 1997, Amsterdam (unpublished). Tyler, Lorraine ΚΙ 989 The role of lexical representations in language comprehension. In: William Marslen-Wilson, (ed.), Lexical Representation and Process, 439-462. Cambridge, Mass. etc.: MIT. 1992 Spoken Language Comprehension. An Experimental Approach to Disordered and Normal Processing. Cambridge, Mass. etc.: MIT. Van Lancker, Diana 1987 Nonpropositional speech: Neurolinguistic studies. In: Andrew W. Ellis (ed.), Progress in the Psychology of Language 3, 49-114. London etc.: Erlbaum. Van Wijk, Carel and Gerard Kempen 1987 A dual system for producing self-repairs in spontaneous speech: Evidence from experimentally elicited corrections. In: Cognitive Psychology 19: 403-440. Wales, Roger J. and Edward Walker (eds.) 1976 New Approaches to Language Mechansims. Amsterdam: North Holland. Webelhuth, Gert (ed.) 1995a Government and Binding Theory and the Minimalist Program. Oxford etc.: Blackwell.
328
References
Webelhuth, Gert 1995b X-bar theory and case theory. In: Gert Webelhuth (ed.), Government and Binding Theory and the Minimalist Program, 1595. Oxford etc.: Blackwell. Welke, Klaus Μ. 1988 Einführung in die Valenz- und Kasustheorie. Leipzig: Bibliographisches Institut. Wescoat, Michael T. and Annie Zaenen 1991 Lexical functional grammar. In: Flip G. Droste and John Joseph (eds.), Linguistic Theory and Grammatical description: Nine Current Approaches, 103-136. Amsterdam/Philadelphia: Benjamins. Wichmann, Anne, Steven Fligelstone, Tony McEnery and Gerry Knowles (eds.) 1997 Teaching and Language Corpora. London etc.: Longman. Willis, Dave 1990 The Lexical Syllabus. Collins COBUILD. London and Glasgow: Bell and Bain Ltd. 1993 Grammar and lexis: some pedagogical implications. In: John McH. Sinclair, Michael Hoey and Gwyneth Fox (eds.), Techniques of Description, 83-93. London etc.: Routledge. Zwicky, Arnold M. 1992 Morphology and syntax. In: William Bright (ed.), International Encyclopedia of Linguistics, 10-12. Oxford etc.: OUP. Zwitserlood, Pienie 1994 Access to phonological-form representations in language comprehension and production. In: Charles Jr. Clifton, Lyn Frazier and Keith Rayner (eds.), Perspectives on Sentence Processing, 83106. Hillsdale, New Jersey etc.: Erlbaum.
Index of subjects
ambiguity resolution 66-79, 83,
autonomy hypothesis 24, 85, 96, 151
blending see conceptual blending
Chomskyan models see generative grammar
cognitive grammar 176-186, 290-292, 301
cognitive linguistic approaches 148-186, 288-292
cognitive linguistics 150-160
cognitive syntax 161-164
collocations 47, 96, 109, 119, 227-245, 248-278, 282, 296-298
  distributively repeated collocations 252-276
  exactly repeated collocations 252-276
  non-repeated collocations 252-276
combinatory lexical information 59
competence models see linguistic models
comprehension models 22, 48-88
  autonomous models 51-55
  interactive models 51-55
conceptual blending 169-176, 179
conceptualization (in language production) 21-22,
conceptualization (in cognitive linguistics) 153-154, 158
connectionism 51-55
constraint-based accounts see constraint-based models
constraint-based models 66-83, 87, 248, 283
constraint-satisfaction see constraint-based models
construal hypothesis 63-65
construction grammar 165-176, 289-290
constructional schema see schema
constructions 158, 164, 166-169, 173-176, 182, 289
context effects 27, 53-55, 74-77, 86, 281
co-occurrence factor 71, 74, 76, 250
coordination problem 30, 34, 41
corpus linguistics 108-119, 284-285
corpus-linguistic research 95, 108-120, 227-244, 284, 293
Deane's cognitive syntax see cognitive syntax
Deane's theory see cognitive syntax
derivative syntax see conceptual blending
Dik's (functional) model 98-104, 283-284
entrenchment 164-165, 177-182, 288, 290
entrenchment hierarchies 164-165
exactly repeated collocations 252-276
exactly repeated productive phrases 252-276
experientialism 155-156
experimental evidence 4, 51, 66, 78, 113, 133, 248, 279
explanatory adequacy 89, 148
frequency factor see frequency information
frequency information 68-79, 232, 237
frequency of co-occurrence see frequency information
function assignment (in language production) 35-42, 281
function assignment (in Lexical Functional Grammar) 133-135
functional approaches 97-108
functional structures (in language production) 35-37
functional structures (in Lexical Functional Grammar) 135
garden-path model see garden-path theory
garden-path theory 59-63, 77-78, 223
generative approaches 120-148
generative grammar 120-123, 149-151, 154, 176
generative linguistics 110, 151, 156-157
Goldberg's construction grammar see construction grammar
Government-and-Binding model 123-132
grammar 5-7, 32-34, 106-108, 122-128, 158-159, 182-186
grammatical encoding 18, 21, 39-43, 46, 280-283
Halliday's (functional) model 104-108, 284
Head-Driven Phrase Structure Grammar 142-148, 287-288
idiom principle 113-115, 242-244
idioms 109
incrementality 43-45
incremental production see incrementality
instantiation 166
internal lexicon see mental lexicon
interrelation between lexicon and syntax see lexicon-syntax interface
introspection 110-113
intuition see introspection
Langacker's Cognitive Grammar see cognitive grammar
language comprehension 15, 22, 48-88, 212-213, 247-248, 280, 282, 292-297
language models see linguistic models
language perception see language comprehension
language processing 15-88, 247-248, 276-278, 280-283, 292-298
language production 15, 20-50, 279-283
language use 15-16, 108-113, 176, 187-245, 279, 292-297
language-user framework 17
lemma 39, 43, 281
lexical access 39, 55-57, 280
lexical co-occurrences 95, 227-245, 296-297
lexical decision experiment see lexical decision task
lexical decision task 251, 256, 259, 297
lexical dominance 71-75, 87-88
lexical entry 7-9, 39-41, 55-57, 73-76, 101-103, 122-124, 139, 146-148, 277-281
Lexical Functional Grammar 132-138, 286
lexical hypothesis 39
lexical patterns see lexical co-occurrences
lexical redundancy rules 134-135, 139, 143
lexical representation 56-61
lexical selection 22, 30, 33, 36
Lexical-Generative Grammar 138-141, 286
lexically driven parser 65
lexically patterned speech see lexical co-occurrences
lexicon 1-13, 30-48, 55-87, 93-96, 99, 104, 108, 122-123, 126-130, 132, 137, 140, 143, 158, 164, 166, 173, 182, 185-188, 279-301
lexicon-oriented descriptions see lexicon-oriented models 93, 95, 108, 283
lexicon-syntax interaction see lexicon-syntax interface
lexicon-syntax interface 1-3, 16, 27, 46-48, 85-88, 93-97, 187-189, 209-210, 212, 225-226, 247, 277-301
lexicon-syntax interrelation see lexicon-syntax interface
lexico-syntactic patterns 96, 247 see also collocations
linguistic models 1-4, 89, 94, 96, 187-188, 247, 278-280, 283-292, 293, 296, 298-301
mental lexicon 9-12, 47, 227, 229, 232, 277
Minimalist Program 127-132, 285
models of language comprehension see comprehension models
models of language processing see performance models
models of language production see production models
models of language use see performance models
modularity 23-24, 51-55, 63, 84, 151,
modularity hypothesis 23-24, 151, 285
morphology 5, 8, 17, 122, 158, 182-184
natural linguistic model 3, 226, 244, 292, 298-301
natural-language processing system 17-19
naturalness 1, 4, 90-93, 279, 298
objectivist paradigm 154
open-choice principle 114-115, 242-243
overlaps 3-4, 189, 212-226, 295-296,
parallel distributed processing 10, 27-29, 34
parallel processing 27, 54, 61
parsing 17-19, 56-61, 65-67, 82-87, 281-283
PDP-models 27-29, 33
perceptual contiguity hypothesis 255, 274
performance data 3-4, 110-113, 121, 160, 187-189, 212, 227, 247, 292-297
performance models 2-3, 15-20, 25, 51, 91-93, 188, 292-293, 300
phrase-structure rules 123-124, 200, 286
plausibility 3-4, 87, 90-93, 95, 1 Π Ι 18, 186, 188, 277, 279-280, 283-292, 298-301
pragmatic aspects see pragmatic factors
pragmatic factors 32, 36, 283-293, 299
predicate-argument structure 8, 47, 122-125
prefabricated segments 114
production models 20-48
  interactive models 25, 27
  serial models 20-22, 34, 38
production-comprehension relationship 49
productive phrases 248, 252-276, 297
  distributively repeated productive phrases 252-276
  exactly repeated productive phrases 252-276
  non-repeated productive phrases 252-276
projection principle 125, 126
psycholinguistic models see performance models
psychological reality (of a linguistic model) 90-93, 113, 146, 183, 287
reformulations 3-4, 190-211, 293
repetition effect 251-276, 297
schema 166, 175-186, 290-292
schematization 172, 176-178, 185, 291
selectional restrictions 55, 124
self-repairs see reformulations
slot-and-filler mechanism 34
Spatialization of Form Hypothesis 161-162
subcategorization frame 55, 68, 123-124, 146, 287
subcategorization information see subcategorization frame
syntactic ambiguity resolution 13, 53, 68-74, 83, 197
syntactic building procedures 39, 42
syntactic fragments 47, 86, 96, 115, 244, 282, 294
syntactic frame 21, 32-35, 46, 114, 194-196, 202-203, 298, 299
(syntactic) frame construction 30, 46, 282
syntactic knowledge 36, 46, 86-87, 186, 225, 245, 277, 282-283, 290, 295, 300
syntactic representation 73-74, 131, 220-221, 224-226, 295-296
syntax 1-4, 12-13, 30-48, 55-87, 93-96, 99, 101-102, 109, 122-127, 137, 140, 143-144, 158, 161, 165, 174, 176, 182, 185-188, 279-301
syntax-driven parsing 65
syntax-lexicon interface see lexicon-syntax interface
syntax-oriented models 93, 138, 283, 285-286
systemic grammar 104-105
systemics see systemic grammar
thematic information 8, 36, 41-42, 47, 52, 62, 67, 73, 124, 135-136, 280-281
thematic roles see thematic information
theta roles see thematic information
θ-roles see thematic information
type of repetition 252-253, 262-266, 272, 277, 297
Universal Grammar 120, 121
valency theory 119
X-bar schema 125-126, 130, 134, 286