English Pages 310 [312] Year 2014
Thomas Herbst, Hans-Jörg Schmid, Susen Faulhaber (Eds.) Constructions Collocations Patterns
Trends in Linguistics Studies and Monographs 282
Editor
Volker Gast

Editorial Board
Walter Bisang, Jan Terje Faarlund, Hans Henrich Hock, Natalia Levshina, Heiko Narrog, Matthias Schlesewsky, Amir Zeldes, Niina Ning Zhang

Editor responsible for this volume
Natalia Levshina
De Gruyter Mouton
ISBN 978-3-11-035610-6
e-ISBN (PDF) 978-3-11-035685-4
e-ISBN (EPUB) 978-3-11-039442-9
ISSN 1861-4302

Library of Congress Cataloging-in-Publication Data
A CIP catalog record for this book has been applied for at the Library of Congress.

Bibliographic information published by the Deutsche Nationalbibliothek
The Deutsche Nationalbibliothek lists this publication in the Deutsche Nationalbibliografie; detailed bibliographic data are available on the Internet at http://dnb.dnb.de.

© 2014 Walter de Gruyter GmbH, Berlin/Boston
Printing and binding: CPI books GmbH, Leck
Printed on acid-free paper
Printed in Germany
www.degruyter.com
Table of contents

From collocations and patterns to constructions – an introduction
Thomas Herbst, Hans-Jörg Schmid and Susen Faulhaber

First language learning from a usage-based approach
Elena Lieven

Item-based patterns in early syntactic development
Brian MacWhinney

Construction learning as category learning: A cognitive analysis
Nick C. Ellis and Matthew Brook O’Donnell

Pattern grammar in context
Susan Hunston

Frames, constructions, and FrameNet
Charles J. Fillmore

The valency approach to argument structure constructions
Thomas Herbst

Collostructional analysis: A case study of the English into-causative
Anatol Stefanowitsch

Lexico-grammatical patterns, pragmatic associations and discourse frequency
Hans-Jörg Schmid

Index
From collocations and patterns to constructions – an introduction

Linguistic usage patterns are not just coincidental phenomena on the textual surface but constitute a fundamental constructional principle of language. At the same time, however, linguistic patterns are highly idiosyncratic in the sense that they tend to be item-specific and unpredictable, thus defying all attempts at capturing them by general abstract rules. A range of linguistic approaches inspired by surprisingly different background assumptions and aims have acknowledged these insights and tried to come up with ways of emphasizing the importance of linguistic repetitiveness and regularity while doing justice to unpredictability and item-specificity. Their efforts are epitomized in the terms enshrined in the title of the present volume, whose aim is to provide a multifaceted view of Constructions, Collocations and Patterns. What all of these approaches share, in addition to their interest in recurrent patterns, is a strong commitment to the value of usage, be it in the wider sense of usage as an empirical basis for sound linguistic analysis and description or in the narrower sense of usage as constituting the basis for the emergence and consolidation of linguistic knowledge.

The first and presumably oldest (though to some perhaps not the most obvious) tradition takes the perspective of foreign language linguistics. Any reflection upon what is important in the learning – and, consequently, also in the teaching – of a foreign language will have to take into account the crucial role of conventionalized but unpredictable collocations.
Any attempt by a learner to achieve some kind of near-nativeness will have to include facts of language such as the fact that it is lay or set the table in English, but Tisch decken in German, and mettre la table in French.[1] It is thus not at all surprising that foreign language linguistics has resulted in extensive research on collocations and how they can best be taught and learnt. In fact, the very origin of the term collocation can be traced back to the Second Interim Report on English Collocations by Harold E. Palmer published in 1933 (Cowie 2009: 391–393, Stubbs 2009).

[1] For a short history of the term collocation in linguistics and lexicography see Hausmann (2007: 218, 225–228), and Stubbs (2009).

Secondly, while the phenomenon of collocation concerns the associations between lexical items, verb complementation or valency patterns present learners with the same kind of difficulty, since a similar element of unpredictability can be observed in this area in that you can say Sie erklärte mir das Problem in German but not *She explained me the problem in English. Again, the foreign language context has inspired research on complementation – a lot of work on the development of valency theory, for instance, has been done in a foreign language context. The reason why both collocations and complementation patterns are central to foreign language learning is that they concern item-specific knowledge with respect to the co-occurrence of one word with another word or one word with a particular grammatical construction. It is thus perfectly natural that research on collocation and research on valency have resulted in extensive lexicographical descriptions not only in special dictionaries such as collocation dictionaries or valency dictionaries, but also in general learner’s dictionaries: it was one of the outstanding features of the first major English learner’s dictionary, A.S. Hornby’s Advanced Learner’s Dictionary, first published in the 1940s, that it described the syntactic constructions in which particular verbs can occur in terms of a system of 50 verb patterns.[2]

Thirdly, the advent of machine-readable corpora resulted in an enormous rise in interest in collocations and patterns, as a consequence of which “the analysis of language has developed out of all recognition”, as John Sinclair (1991: 1) rightly put it. Even if the importance of collocation as an element of language description had been pushed by Harold E. Palmer (1933) and John Rupert Firth (1968), the fact that large-scale corpus analyses revealed the extent to which fixed or partially fixed multi-word units determine the character of everyday language use certainly gave new impetus to collocational research. These findings have given rise to different concepts – for instance, that of lexical bundles in the Longman Grammar of Spoken and Written English by Biber, Conrad, Leech, Johansson and Finegan or that of “extended units of meaning” in the writings of John Sinclair.
Corpus linguistic investigations of lexicogrammatical patterns and attraction phenomena also motivated Sinclair’s (1991: 110) suggestion that the idiom principle had to complement slot-and-filler type open-choice decisions in models of sentence structures. A huge body of research into collocations and lexicogrammatical associations was inspired by these insights and the new opportunities provided by computer corpora.

[2] For the treatment of patterns and collocations in early English learner’s dictionaries of H.E. Palmer and A.S. Hornby see Cowie (1999).

Usage-based cognitive-linguistic approaches represent a fourth important line of investigation. Not surprisingly, cognitive linguists have focused their attention on the cognitive underpinnings of linguistic knowledge, asking questions as to how linguistic patterns and item-specific knowledge are stored and represented, how this knowledge emerges and what the cognitive processes involved in this emergence are. It is in this tradition that the terms construction and (constructional) schema have acquired fresh prominence. If the concept of construction is to include both unpredictable form-meaning pairings and highly frequent ones (Goldberg 2006), it easily accommodates collocations and valency as well as other types of lexical and lexicogrammatical patterns. Joan Bybee (2010: 28), for example, gives Firth’s example of the collocation dark night as an example of what she calls “conventionalized instances or exemplars of constructions that are not unpredictable in meaning or form … but are known to speakers as expressions they have experienced before”. Doing justice to the tension between repetitiveness and idiosyncrasy, it is especially the idea of item-based constructions (e.g. MacWhinney 2005: 53) which can be applied to collocations and also to valency patterns. In fact, the whole concept of construction grammar arose from the idea of integrating idiosyncratic elements such as idioms as central elements of the theory (see Croft and Cruse 2004: 225).
This is apparent from a statement by Fillmore, Kay and O’Connor (1988: 534): “Those linguistic processes that are thought of as irregular cannot be accounted for by constructing lists of exceptions: the realm of idiomaticity in a language contains a great deal that is productive, highly structured, and worthy of serious grammatical investigation.”

Fifthly, usage-based approaches to language learning have collected strong evidence that repeated lexically specific sequences in parents’ and children’s speech not only play a key role in the acquisition of early chunks such as what’s that, wanna or give me, but also constitute the starting-point for the emergence of variable low-level schemas (wanna X, give me X) and even more flexible unfilled schemas such as the ditransitive or other argument-structure constructions (Tomasello 2003, Goldberg 2006).

The construction grammar approach has proved to be an attractive model not only for researchers interested in theoretical models of language, but also for those who are concerned with collocation and patterns in the context of foreign language linguistics. After all, generative transformational grammar had little to offer in terms of integrating such phenomena into a general theory of grammar, which had resulted in an unnecessary alienation of applied linguistics and theoretical linguistics. Similarly, of course, construction grammar has offered an appealing theoretical framework for accommodating many of the findings of less cognitively-minded corpus linguistics concerning the importance of recurrent patterns. In view of this situation, Michael Stubbs (2009: 27) aptly remarks “… that, when scholars set out from different starting points within different traditions, use data of different kinds and independent arguments, but nevertheless arrive at similar conclusions, then the conclusions are worth studying closely, because the convergence of views is prima facie evidence that they are well founded.”

However, despite this convergence of views, it would of course be wrong to assume that we are heading towards agreeing on a generally accepted theory of language. Firstly, although the approaches that can be summarized under the label usage-based all share certain basic assumptions concerning the nature of language, there are also considerable differences between them, for instance, concerning the degree of formalization and their commitment to providing a cognitively plausible model.[3] Secondly, the fact that cognitive linguists attribute an important place to collocation and other types of patterns does not necessarily mean that all corpus linguists and foreign language linguists would agree with the cognitive approach as a whole. And thirdly, while many usage-based researchers in cognitive linguistics have, of course, embraced the corpus method, it is still true to say that they have been more interested in arriving at generalizations than in reaching the level of descriptive granularity and specificity that is typical of more traditional corpus-based approaches, in particular those coming from a language teaching or lexicographical background.

In view of this divergence within convergence the present volume aims at providing general and readable surveys of different lines of usage-based investigations of constructions, collocations and patterns, some focusing on linguistic, some on psychological aspects, and some addressing the role attributed to linguistic patterns in different research traditions.
[3] For an outline of different strands of construction grammar – such as Berkeley Construction Grammar, Sign-Based Construction Grammar, Radical Construction Grammar, Cognitive Construction Grammar, Fluid Construction Grammar, Embodied Construction Grammar – see Fischer and Stefanowitsch (2006: 3–4) and especially Hoffmann and Trousdale (2013).

The first three chapters of this book outline why usage-based approaches seem to open up a very promising framework for accounts of language acquisition. ‘First language learning from a usage-based approach’ by Elena Lieven gives an account of how children learn constructions on the basis of the input they receive, discussing experimental evidence as well as the role of errors in language development. Lieven outlines how a network of constructions can be imagined to develop. Brian MacWhinney stresses the role of ‘Item-based patterns in early syntactic development’ and provides a detailed outline of the properties of such patterns, which involves a discussion of factors such as errors, conservatism or correlational evidence. After a short discussion of feature-based patterns, MacWhinney goes on to throw light on the role of item-based patterns in second language acquisition before providing a comparison of his model and other approaches, also touching upon questions of computational models. In ‘Construction learning as category learning’, Nick Ellis and Matthew Brook O’Donnell focus on the frequency distribution of verbs in verb-argument constructions as a determinant of second-language learning. Analysing the distribution of the verbal fillers of 23 verb-argument constructions in a large corpus, they show that the frequencies, functions and forms of the input that learners are exposed to provide ideal conditions for construction learning by means of inductive statistical learning from the input. Zipfian type-token distribution, selective verb-construction relations and coherent meanings of verb-argument constructions are identified as key variables favouring the learning of schematic constructions.

The four contributions that follow all deal with issues of syntactic patterning and meaning. Susan Hunston, in ‘Pattern grammar in context’, illustrates how syntactic patterns were identified on the basis of research using the Bank of English. She then goes on to discuss similarities and differences between her own pattern grammar approach and construction grammar. Charles Fillmore’s contribution ‘Frames, constructions, and FrameNet’ combines lexicographical and theoretical issues.
It describes the original set-up of the FrameNet project and shows how FrameNet descriptions can be adapted to suit the approach of Berkeley construction grammar. As an illustration, a large sample of text is analyzed in the framework developed before an outline of important research issues for the future is given. Similarly, Thomas Herbst’s chapter ‘The valency approach to argument structure constructions’, outlines a framework for the application of valency theory to English and shows how valency can be described in terms of a network of item-based constructions. It is argued, however, that, in the light of the enormous amount of item-specific knowledge to be accounted for, Goldberg’s (2006) theory of argument structure constructions should be complemented by a valency realization principle.
To what extent lexical material and constructions interact is also shown by Anatol Stefanowitsch in the chapter entitled ‘Collostructional analysis: A case study of the English into-causative’. Stefanowitsch outlines the basic principles of collostructional analysis, which reveals the extent to which particular lexical items are attracted by a certain construction, and shows how this in turn can be used to reveal the meaning of the construction. Stefanowitsch concludes with general methodological considerations on the use of corpus data and interpretation of the data. In the final contribution entitled ‘Lexico-grammatical patterns, pragmatic associations and discourse frequency’, Hans-Jörg Schmid begins by developing a usage-based emergentist model of language which consists of a limited number of cognitive and socio-pragmatic processes. The paper then focusses on effects of pragmatic associations on the emergence and syntagmatic chunking of different types of lexicogrammatical patterns, ranging from frozen expressions to collocations, collostructions, valency patterns and argument-structure constructions. Pragmatic associations are then put into relation to discourse frequency as partly competing and partly cooperating determinants of chunking. The different chapters of this book throw light on different types of patterning to be observed in the analysis of language. The label of construction – vague as it may sometimes still be – can thus be seen as representing a general concept under which many linguistic phenomena that hitherto have been studied under different labels in different approaches (such as valency or collocation) can be subsumed. We hope that this volume offers an attractive introduction to various approaches in the field and an illustration of the fact that different theoretical frameworks such as frame semantics or valency theory, for example, are definitely moving towards construction grammar. 
All of the chapters of this volume are based on papers given in a series of talks held at the Interdisciplinary Centre for Research on Lexicography, Valency and Collocation at the Friedrich-Alexander-Universität Erlangen-Nürnberg and the Interdisciplinary Centre for Cognitive Language Research at Ludwig Maximilians University Munich. We would like to thank Barbara Gabel-Cunningham for her invaluable help in preparing the manuscript.

Thomas Herbst
Hans-Jörg Schmid
Susen Faulhaber
References

Biber, Douglas, Susan Conrad, Geoffrey Leech, Stig Johansson, and Edward Finegan 1999 Longman Grammar of Spoken and Written English. London et al.: Longman.
Bybee, Joan 2010 Language, Usage and Cognition. Cambridge: Cambridge University Press.
Cowie, Anthony P. 1999 English Dictionaries for Foreign Learners: A History. Oxford: Oxford University Press.
Cowie, Anthony P. 2009 The earliest foreign learners’ dictionaries. In The Oxford History of English Lexicography. Volume II: Specialized Dictionaries, A.P. Cowie (ed.), 385–411. Oxford: Oxford University Press.
Croft, William, and David A. Cruse 2004 Cognitive Linguistics. Cambridge: Cambridge University Press.
Fillmore, Charles, Paul Kay, and Catherine M. O’Connor 1988 Regularity and idiomaticity in grammatical constructions: The case of let alone. Language 64: 501–538.
Firth, John Rupert 1968 Linguistic analysis as a study of meaning. In Selected Papers by J. R. Firth 1952–59, Frank R. Palmer (ed.), 12–26. London/Harlow: Longmans.
Fischer, Kerstin and Anatol Stefanowitsch 2006 Konstruktionsgrammatik: Ein Überblick. In Konstruktionsgrammatik: Von der Anwendung zur Theorie, Kerstin Fischer and Anatol Stefanowitsch (eds.), 3–17. Tübingen: Stauffenburg.
Goldberg, Adele 1995 Constructions: A Construction Grammar Approach to Argument Structure. Chicago: Chicago University Press.
Goldberg, Adele 2006 Constructions at Work. Oxford/New York: Oxford University Press.
Hausmann, Franz Josef 2007 Die Kollokationen im Rahmen der Phraseologie: Systematische und historische Darstellung. Zeitschrift für Anglistik und Amerikanistik 55 (3): 217–234.
Hoffmann, Thomas and Graeme Trousdale 2013 The Oxford Handbook of Construction Grammar. Oxford: Oxford University Press.
Hornby, A. S., E. V. Gatenby, and H. Wakefield 1942 Idiomatic and Syntactic English Dictionary. Tokyo: Kaitakusha. [published under the title of A Learner's Dictionary of Current English by Oxford University Press in 1948, retitled The Advanced Learner's Dictionary of Current English in 1952].
Hunston, Susan and Gill Francis 2000 Pattern Grammar: A Corpus-Driven Approach to the Lexical Grammar of English. Amsterdam/Philadelphia: Benjamins.
MacWhinney, Brian 2005 A unified model of language acquisition. In Handbook of Bilingualism: Psycholinguistic Approaches, Judith F. Kroll and Annette M. B. De Groot (eds.), 49–67. Oxford: Oxford University Press.
Palmer, Harold E. 1933 Second Interim Report on English Collocations. 4th impression 1966. Tokyo: Kaitakusha.
Sinclair, John McH. 1991 Corpus, Concordance, Collocation. Oxford: Oxford University Press.
Stubbs, Michael 2009 Technology and phraseology: With notes on the history of corpus linguistics. In Exploring the Lexis-Grammar Interface, Ute Römer and Rainer Schulze (eds.), 15–31. Amsterdam/Philadelphia: Benjamins.
Tomasello, Michael 2003 Constructing a Language. Cambridge, Mass./London: Harvard University Press.
First language learning from a usage-based approach
Elena Lieven

1. Introduction
First language development is based on usage: what children hear and what they want to communicate. Children learn pieces of language to achieve their communicative ends. These form-meaning mappings build into a network of constructions that allows for generalisation, categorisation and increasing abstraction. This approach contrasts with the nativist-linguistic approach of universal grammar (UG), which sees sentence construction as involving the algorithmic assembly of highly abstract symbols representing constituents. Since constituents cannot be identified from actual input, both the underlying system and ways of mapping it to meaning must be innate. This approach tends to stress the speed of language learning and that it is largely error-free. Where errors do occur and are not attributed to either noise or performance limitations, they are explained in terms of features of UG. In this chapter, I will focus on explicating the usage-based approach rather than on a comparison of the two approaches. For those interested, Ambridge and Lieven (2011) provides an exhaustive comparison. During infancy there is developing sensitivity to the patterning of different features of language: newborns are already able to make a number of discriminations based on prosodic features in their ambient language and they develop increasing sensitivity to both the prosody and phonology of their surrounding language(s) over the first year of life (Curtin and Hufnagle 2009). Around 8–9 months, there is a crucial development in social cognition: infants start to show clear evidence of intention understanding. Thus they follow people’s points with gaze checking, they start to point declaratively themselves and to imitate with reference to the inferred intentions of others – all of this indicating awareness of other minds (Tomasello et al. 2005). 
Once this development is underway, children can begin to map what they are hearing to the meaning they infer that it conveys and then attempt to reproduce this when they themselves are trying to communicate.
Children start by learning phonologically specific ‘chunks’ mapped to child-extracted meanings. These strings can be of varying length, from a single word, through word plus inflection(s), to multiword strings, and this will partly depend on the language being learned. If the strings are of high frequency, i.e. there are many tokens of the same string, the child may learn this as a fixed phrase. In English child-directed speech (CDS), What’s that? is probably an entrenched string for many children: at the syntactic pole, there is no knowledge of wh-movement or of the copula; at the semantic pole, there is pragmatic understanding that this is a turn-taking move requiring a response, but it is probably more likely that the child interprets it as a display request than as a genuine request for information. On the other hand, if there are high type frequencies in the same position in the string, children may extract a frame with a slot: an example is Where’s X gone? where X indicates the slot. This is a highly frequent question in English CDS, often with considerable variability in the slot. Thus effort after meaning is matched to emergent structure. Children’s networks build up towards more complex constructions (with more parts) and more abstract constructions (less item-specific). This is based on inferences they make about meaning from their input and from interaction, using cognitive skills of pattern-identification, memory, analogy and abstraction. Developing networks of form-meaning mappings become connected in multiple ways (through both form and function). Constituency and more complex syntax emerge through this process.

There are important methodological implications that derive from this approach. Because we do not credit the child with pre-given, abstract linguistic categories from the outset, we only posit the existence of an abstract category when there is evidence for it.
For instance, if the child produces utterances with the following form: It’s X-ing, You’re X-ing, we start by analysing at the level of the specific form and string, for example is or are rather than the auxiliary category as a whole or auxiliary BE (Lieven 2008, Rowland and Theakston 2009).

In this chapter, I first discuss the structure of the input and how this may assist learning. In section 2, I illustrate the development of constituency by reference to studies using the densely collected corpora of four English-speaking children aged between 2;0 and 3;0. Section 3 addresses the question of what it means to learn a construction by presenting experimental studies in English and other languages. In section 4, I address the learning of more complex constructions: relative clause structures and complements. Children’s errors can illuminate the processes involved in language learning and I illustrate this in section 5. Finally, in section 6, I discuss the ways in which constructions can interact in the network to either support the learning of new constructions or to compete with it.

2. The input
We know that there are significant correlations between many aspects of CDS and children’s language development. Parents who elaborate on the child’s focus of attention have children with larger vocabularies (Carpenter, Nagell, and Tomasello 1998); parents who elaborate on what the child has just said have children with longer mean lengths of utterance (MLU) at a particular age (Barnes et al. 1983); the use of more complex syntax to children by parents and teachers is correlated with more complex syntax in the children’s own language (Huttenlocher et al. 2002). There are also strong correlations at every level between the frequency of forms and constructions in CDS and the order of emergence in children’s speech:[1] for instance, inflections (Farrar 1990), verbs (Hoff-Ginsberg and Naigles 1998), verb argument structure (Theakston et al. 2001), types of matrix clauses in finite complement constructions (Brandt, Lieven, and Tomasello 2010) and types of relative clauses (Kidd et al. 2007).

[1] There is an important issue here in relation to sampling: low density sampling can give the impression that the child has not yet learned a form that is present in the input. Although this might be the case, it could also be due to the fact that there are many more adult than child utterances in the corpus. This means that various control measures should be implemented. Readers are referred to Tomasello and Stahl (2004), Rowland and Fletcher (2006) and Rowland, Fletcher, and Freudenthal (2008).

CDS is a genre with a number of well-established characteristics: utterances tend to have exaggerated prosody, to be shorter than those addressed to adults, to concentrate on the here-and-now and to have a relatively high proportion of questions and imperatives – all of which makes perfect sense from the point of view of communicating with a two-year-old. In a study of 12 English-speaking children’s development between 2;0 and 3;0, Cameron-Faulkner, Lieven and Tomasello (2003) found that only 18% of CDS utterances were in canonical SVO word order, while 32% were questions and 15% were copulas. The authors then analysed the first 1–3 words of each utterance in these categories and defined a frame whenever the same word or string of words occurred at the beginning of utterances more than 4 times. They found that 77% of copula utterances were accounted for by just 8 frames and 67% of questions by 20 frames. Of course, this is not surprising: it is explained by the fact that English has only a limited range of wh-words and auxiliaries, and that inversion applies only to the copula and auxiliaries in English (unlike German, where main verbs can be inverted). However, it means that children are hearing a very great deal of repetition at the beginnings of utterances and that this might aid them in extracting frames with slots. Indeed the study found that 45% of the mothers’ utterances began with one of 17 words and that 52 ‘core frames’ (frames used by more than half the mothers) accounted for 51% of all utterances. Examples of frames are A X, It’s a X, What do X?, Are you X?, Let’s X.

However, compared to many other languages, English has more rigid word order and very impoverished morphology. This could mean that it is easier for children to extract slot and frame patterns from what they hear in English input than it would be in other languages. In a study using methods very similar to Cameron-Faulkner et al. (2003), Stoll, Abbot-Smith and Lieven (2009) analysed English, German and Russian CDS for the presence of frames at the beginnings of utterances. They found differences deriving from the typological differences between the languages: the absence of the copula in the present tense in Russian and of determiners meant that the Russian data had a higher proportion of shorter frames than English or German; and the presence of OVS word order and main verb inversion in German reduced the overall number of frames per mother. At the same time, the level of repetitiveness at the beginnings of utterances, as measured by the proportion of utterances accounted for by frames, was high for all three languages – over 70%. Again, this almost certainly reflects a common approach to interaction with two-year-olds in these groups of mothers.
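The frame analysis described above can be illustrated with a minimal Python sketch. It is not the procedure or code used in the studies cited. It only takes over the two criteria stated in the text: the first 1–3 words of each utterance are considered, and a string counts as a frame when it begins more than 4 utterances. Everything else is an assumption made for illustration: the function names (`initial_strings`, `extract_frames`, `coverage`) are hypothetical, utterances are lower-cased and split on whitespace, and the toy utterances are invented.

```python
from collections import Counter


def initial_strings(utterance, max_len=3):
    """Return the utterance-initial word strings of length 1..max_len."""
    words = utterance.lower().split()  # assumption: whitespace tokenisation
    return [" ".join(words[:n]) for n in range(1, min(max_len, len(words)) + 1)]


def extract_frames(utterances, max_len=3, min_count=5):
    """Count utterance-initial strings and keep those that occur at least
    min_count times (min_count=5 corresponds to 'more than 4 times')."""
    counts = Counter()
    for utt in utterances:
        counts.update(initial_strings(utt, max_len))
    return {s: c for s, c in counts.items() if c >= min_count}


def coverage(utterances, frames, max_len=3):
    """Proportion of utterances that begin with at least one frame."""
    if not utterances:
        return 0.0
    covered = sum(
        1 for utt in utterances
        if any(s in frames for s in initial_strings(utt, max_len))
    )
    return covered / len(utterances)


# Invented toy input, for illustration only.
toy = [
    "where's the ball gone", "where's teddy gone", "where's daddy gone",
    "where's the dog gone", "where's it gone",
    "it's a cat", "it's a dog", "it's a big ball", "it's a car", "it's a spoon",
    "look at that", "shall we go",
]
frames = extract_frames(toy)
print(sorted(frames))
print(round(coverage(toy, frames), 2))
```

On this toy input the strings where's, it's and it's a pass the threshold, and 10 of the 12 utterances begin with one of them. A real analysis would also need the sampling controls and the per-category breakdown (questions, copulas and so on) that the studies discuss, which this sketch leaves out.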
A rather different approach to the potential of frames in CDS to assist language learning was pioneered by Mintz (2003). Mintz was interested in the slots that occurred in frequent frames and whether a particular frame tended to have words of the same grammatical category in the slot. He found that, for English, this was indeed the case, with the 45 most frequent frames categorising the words in their slots with 90–98% accuracy, and similar levels of accuracy have been found for French (Chemla et al. 2009). However, similar analyses for Dutch (Erlekens 2009) and German (Stumper, Bannard, and Lieven 2011) show much lower levels of accuracy. There are a number of reasons for this. One is that, in the German study, almost all definite articles which occurred as left-framing elements were also used as pronouns, meaning that the variable word in the frame could belong to a number of different categories. Thus das was used as a pronoun if it was followed by a verb or an adverb, but was used as a determiner if it was followed by a noun or a pronoun. A second possibility suggested by Mintz (2003) is that languages with more flexible word order than English might give better categorisation accuracy if the analysis was based on morphemes rather than words. This latter is an important point: many languages of the world have far more elaborate morphology than the Indo-European languages or East Asian languages such as Chinese and Japanese, which have been the main focus of study to date. For instance, Chintang, until recently an undocumented, polysynthetic Tibeto-Burman language of Eastern Nepal, has around 2000 verbal inflections. Children learning this language are more likely to extract morphological frames than word-based frames (Stoll et al. 2012).

3. Productivity, creativity and constituency
Are children constructing their early utterances by generating them from underlying categories or on the basis of surface strings that they have extracted from the input? The repetitiveness that has been found in English CDS is also present in children’s own early utterances. In a study of four two-year-old children’s multiword utterances, Lieven, Salomo and Tomasello (2009) found that between 20–40% of the children’s utterances were exact repeats of what they had said before and 40–50% involved only one substitution into the slot of a previously produced frame. The vast majority of these slots were filled with words for referents (people and objects). The children with higher MLUs produced fewer exact repeats and the slots in their frames had a wider range of meanings, suggesting that as the child’s network of form-meaning mappings develops, they need to rely less on rote-learned strings and on only one, noun-like category. This study ‘traced back’ utterances in the last two hours of recording of a 30-hour corpus for each child collected over 6 weeks to utterances that had been produced in the first 28 hours (see also Dąbrowska and Lieven [2005] for a similar study of wh-questions). But this constrains the analysis to only those utterances that the child produced in the final two hours of this corpus. Using two of the same corpora and two from the same children collected for 30 hours over 6 weeks after their third birthday, Bannard, Lieven and Tomasello (2009) worked in the opposite direction by building grammars out of the first 28 hours of each child’s speech at each age and then analysing how well these grammars could account for the utterances in the
last two hours of the corpus. The important point about the grammars is that they contain no pregiven categories: they are probabilistic grammars constructed ‘bottom-up’ from the child’s utterances by searching the utterances for matching strings and slots. How successful the grammars were at accounting for the test utterances was measured in two ways: first, the proportion of utterances for which the grammars could provide a parse (coverage); and, second, how many slot substitutions were required to arrive at a parse, i.e. how complex was the parse. It was found that these entirely lexically specific grammars did very well on the 2;0 corpora with 74–85% coverage and around 60% of parses requiring only one or two substitutions. However although the 3;0 grammars also gave good coverage, very many more substitution operations were required than at 2;0, suggesting that, by this age, the children’s grammars were becoming more complex and less lexically-specific. That this was indeed the case was supported by a subsequent analysis in which the authors substituted first a noun category and then a verb category across the corpus. For the 2;0 grammars, the noun category improved the coverage significantly for one child but adding a general verb category did not, while at 3;0 both categories did. This then supports the idea that abstraction develops and that for English-speaking children, the first abstraction of a linguistic category is an emergent noun category. It was also interesting that each child’s grammar did very badly at parsing the other child’s at 2;0 but the 3;0 grammars did much better at parsing the other corpus. This suggests that children’s grammars at 2;0 can be rather idiosyncratic but, by 3;0, they are converging on the grammar of the adult language. In this approach constituency develops out of the form to function mapping of the slots. For instance, as noted above, in the Lieven et al. 
(2009) study, most slots in the schemas at 2;0 are the names of people or objects produced as bare nouns. But depending on each child’s level of language, these referential slots also contained nouns preceded by the determiners a or the and a few adjectives, and, in the case of the child with the most complex language, other determiners and a wider range of adjectives. Once there is a slot in the schema with a referent function, the child can proceed to build up more complex forms of referring. At the same time, the fact that many schemas are used by the child to refer (Gimme X, That’s an X, There’s an X; I want X) will contribute to both the form and the function of the noun phrase becoming more abstract with its own internal slots for determiners and adjectives. It is this process that eventually allows the child
to treat a relative clause such as ‘the boy who is smoking’ as one referential entity mapped to one constituent. What about verbs? In his (1992) study of his daughter’s early utterances with verbs (Akhtar and Tomasello 1997), Tomasello showed that each verb seemed to be a constructional ‘island’ with initially no generalisation of arguments. For example, cut only ever appeared with things to be cut as an imperative (Cut X) while draw appeared with a number of ‘drawers’ (X draw), a number of things to be drawn (Draw X) and a number of sites of drawing (Draw on X). Tomasello (2003) suggested that children develop more abstract constructions by making functional analogies across these different verb islands to develop abstract argument slots. There have been two types of challenge to an extreme version of the verb island hypothesis. First, while children may not have a fully verb-general category from early on, they may well be developing subcategories of verbs (Clark 1996), for instance from schemas such as I’m X-ing, It X-ed (Pine, Lieven, and Rowland 1998). These subcategories could then contribute to the generalisation of the verb slot in argument structure constructions as well as building up emergent categories of arguments (e.g. subject) if some of the schemas contain the same verbs. Secondly, there is a great deal of controversy and subsequent research as to how early children do, in fact, show abstract knowledge of verb argument structure.
4. The development of the transitive and intransitive constructions
The transitive construction identifies the agent and patient of an event. Depending on the language, this can be marked by various combinations of word order, case-marking and verb agreement. In addition there are regularities of usage, for instance in active transitives, agents are usually animate and, if one of the NPs is inanimate, it is likely to be the patient (not always, of course: The wind buffeted him). If children initially learn the positioning of arguments on a verb by verb basis (i.e. in English, for the verb hit, the ‘hitter’ comes before the verb and the ‘hittee’ after) but not at a more abstract level (e.g. in an active transitive the agent/actor comes before the patient, or the subject before the verb), they should not be able to identify the agent and patient of a sentence containing a novel verb, if there are no other cues (e.g. animacy). Many experiments, using different methods, have been used to test this hypothesis. To summarise a large body of research, studies have found that children up to about 3;0 find it much more
difficult either to produce or act out active transitive sentences with two animate nouns and a novel verb than they do sentences with familiar verbs (e.g. The bunny meeked the duck versus The bunny hit the duck, Tomasello 2000). On the other hand, studies using either a preferential looking (Gertner, Fisher, and Eisengart 2006) or a forced choice pointing paradigm (Noble, Rowland, and Pine 2011) show evidence that young two-year-olds can (a) correctly match an active transitive with two animate participants to a causal scene when given a choice between this and a non-causal scene (e.g. the bunny doing something to the duck versus the duck and the bunny both doing something independently); (b) match an active transitive to a scene with the correct agent and patient when given a choice between the same referents in the opposite roles (e.g. The duck meeked the bunny versus The bunny meeked the duck). However, when presented with an intransitive sentence with a conjoined agent subject (the bunny and duck are meeking), children are unable to match this to a non-causal scene until the age of 3;4. There are two possible structure-general biases that might account for these results: first, a bias to treat the first noun in an utterance as the agent and, second, a bias to map the number of arguments to the number of roles. The first bias would lead the child to correctly match an active transitive to the scene in which the first noun is the agent. The second bias would lead the child to match a transitive sentence to a causal scene with two actors rather than to a non-causal scene. However, both biases could lead the child to make errors with the conjoined agent intransitive. The first-noun-as-agent bias might lead children to interpret the first noun in a conjoined agent intransitive as an agent and incorrectly choose the causal scene over the non-causal scene. And hearing two nouns might also lead the child to map these to the two semantic roles of agent and patient (i.e. 
again to choose the causal scene over the non-causal scene). In a recent study this is exactly what Noble, Theakston and Lieven (submitted) found. The suggestion that children are using structure-general biases to interpret transitive utterances is supported by studies showing that full comprehension of the transitive takes a considerable time to develop. Initially children seem to require a ‘gestalt’ which reflects what is most frequent in CDS in order to be able to interpret what they hear in experiments. Thus a study by Dittmar et al. (2008) found that German-speaking children, aged 2;7, presented with active transitives containing novel verbs, could only point to the picture with the matching agent and patient when they were presented with sentences that were in SVO order and had case-marking on the two
NPs (e.g. Der-NOM Hund wieft den-ACC Löwen-ACC ‘The dog is wiefing the lion’). They were at chance on sentences with case-marking but OVS word order (e.g. Den-ACC Bären-ACC wieft der-NOM Tiger ‘The bear-PATIENT is wiefing the tiger-AGENT’). They were also at chance with sentences in which case-marking was not available because both NPs were in feminine or neuter gender (which are not marked for nominative or accusative in German: e.g. Die Katze wieft die Ziege). Five-year-olds chose the first noun as agent in these SVO, non-case-marked sentences but still failed to reliably interpret the case-marked sentences with OVS word order. Only the group of 7-year-olds managed to interpret these latter sentences correctly. At first sight this is somewhat surprising because sentences with OVS word order and case-marking occur about 20% of the time in German CDS (Dittmar et al. 2008). But this figure depends on counting case-marking at the level of the abstract category and, in fact, Dittmar et al. found that 76% of the argument NPs in the OVS sentences of the German CDS corpus contained either 1st or 2nd person pronouns (e.g. Ich, mich, Du, dich). Thus young children might have learned to map these words to agent and patient roles without having a fully abstract representation of case. As well, OVS word order in German is highly marked and even German adults might show slowed reaction times in processing such sentences if they are presented with neutral intonation. A subsequent study by Grünloh, Lieven and Tomasello (2011) showed that if these sentences were presented to five-year-olds with stress on the first NP, children did better at interpreting them and this was even more the case if they were also presented in the discourse context in which they are normally used (that of contrast with a preceding claim). So German children are used to hearing OVS word order in utterances with a set of extremely frequent case-marked pronouns and contrastive stress. 
If they hear something similar in the experiment, even with a novel verb, they do much better than if they hear sentences in which these cues are presented separately. What children hear in the input is critically important to interpreting the level of abstraction in their linguistic representations and how this develops. But, as the Competition model of Bates and MacWhinney (1987) attempts to quantify, it is not only the frequency of a particular form that matters but also its salience and how reliable the mapping is of the form to the particular function (see also Kempe and MacWhinney [1998]). Chan, Lieven and Tomasello (2009) tested English, German and Cantonese-speaking children aged 2;6, 3;6 and 4;6 on three types of active transitives with novel verbs: sentences in which the subject was animate and the patient inanimate (A-I)
(the most frequent type in the CDS of all three languages); sentences in which both agent and patient were animate (A-A); and, finally, sentences in which there was an inanimate agent and an animate patient (I-A). Even the youngest group in all three languages was able to choose the first noun as agent when hearing the A-I sentences, but all were at chance in the other two conditions. Across sentence types, children made significantly fewer first-noun-as-agent choices when the animacy contrast was neutralized (AVI vs. AVA) or introduced to conflict with word order (AVA vs. IVA). Cantonese children did this at all three ages; German children at ages 2;6 and 3;6; and English children at 2;6. Chan et al. suggested that these results are directly related to the frequency of SVO active transitives and the expression of their arguments. As noted above, about 20% of active transitives in German have OVS word order. Cantonese is a language with a very high rate of argument drop (of both subjects and objects). This results in children having much less evidence for SVO word order. There is also the issue of what to count. In the Dittmar et al. (2008) study we arrived at very different frequency measures when we counted at the level of abstract case-marking on NP arguments than when we counted at the level of particular case-marked pronouns. Even for English, a language that lacks case-marking except on pronouns, Ibbotson et al. (2011) have shown that children just under 3;0 can interpret passive sentences with novel verbs, provided they contain two case-marked pronouns (e.g. She is being tammed by him). With development, item-specific cues become integrated into more abstract categories while, at the same time, these more abstract categories become independently weighted. This means that what we measure in counts of relative frequency will change as the system develops (Ambridge 2010, Lieven 2010). 
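The difference between how often a cue is heard and how reliably it maps to a function, which the Competition model tries to quantify, can be made concrete with a small sketch. In that literature, cue availability is usually taken to be the proportion of sentences in which a cue is present, cue reliability the proportion of those cases in which the cue points to the correct interpretation, and overall cue validity the product of the two. The sentences below are invented and are not counts from Dittmar et al. (2008) or Chan et al. (2009); the function name cue_statistics and the dictionary keys are hypothetical.

```python
def cue_statistics(sentences, cue):
    """Return (availability, reliability, validity) for one cue.

    Each sentence is a dict with the true agent position under 'agent'
    and, for each cue, the position that cue points to, or None when
    the cue is absent from the sentence.
    """
    present = [s for s in sentences if s[cue] is not None]
    availability = len(present) / len(sentences)
    correct = sum(1 for s in present if s[cue] == s["agent"])
    reliability = correct / len(present) if present else 0.0
    return availability, reliability, availability * reliability


# Invented sample: word order always suggests 'first noun is agent';
# case marking is only present on some sentences (hypothetical data).
toy_input = [
    {"agent": "first", "word_order": "first", "case": "first"},
    {"agent": "first", "word_order": "first", "case": None},
    {"agent": "second", "word_order": "first", "case": "second"},  # OVS
    {"agent": "first", "word_order": "first", "case": "first"},
    {"agent": "first", "word_order": "first", "case": None},
]

word_order_stats = cue_statistics(toy_input, "word_order")
case_stats = cue_statistics(toy_input, "case")
print(word_order_stats, case_stats)
```

In this invented sample, word order is always available but misleads on the one OVS sentence, whereas case marking never misleads but is present in only three of the five sentences. Which cue a learner weights more heavily would then depend on how availability and reliability are combined, and, as argued above, on the level at which the cues are counted.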
As we shall also see in Sections 5 and 6 below, high frequencies can have important effects both in protecting from error and in inducing errors.
5. Explaining errors: Why do they matter?
Errors of omission and commission are important because they can be very informative about the child’s current linguistic representations. Children are frequently reported as making relatively few errors, particularly in inflectional morphology. For some children and some languages, this seems to be the case but children may still, of course, be operating with limited productivity as has been shown by Aguado-Orea (2004) and Krajewski, Lieven and Theakston (2012). In other cases, however, the low overall error rate
disguises parts of the system in which error rates can be much higher (Pizzuto and Caselli 1992, Aguado-Orea 2004, Pine et al. 2005). Within generative approaches to child language, errors, when they occur systematically, have often been explained as the outcome of highly abstract processes, arising from the problems of mapping between UG and language-specific features. They also, of course, need explanation from the usage-based perspective. Examples of errors that have been the subject of much research are errors in questions (e.g. non-inversion errors, What she is eating? [Rowland and Pine 2000]); auxiliary-doubling errors (e.g. What does she doesn’t like? [Ambridge et al. 2006]); incorrect use of pronouns (e.g. Me do that, Her haves some tea [Rispoli 1998]); the omission of finiteness marking (e.g. That go there, John play football [Freudenthal et al. 2007]), and argument structure overgeneralisations (e.g. Water bloomed these flowers, I said her something [Bowerman 1988]). From a usage-based perspective these errors arise from the entrenchment of high frequency strings which then compete with differential rates of semantic and syntactic generalisation in related parts of the network. Thus children are less likely to make errors with question frames that are frequent in the input (Rowland and Pine 2000, Rowland 2007). Non-inversion errors occur on low-frequency strings and children show alternations between correct and incorrect inversion while the system is developing. In an experimental study in which children had to produce questions using combinations of wh-words and auxiliaries, Ambridge et al. (2006) showed that strings of particular wh-words and specific lexical auxiliary forms could account better for the pattern of errors than either the wh-word or the auxiliary type alone. On the other hand, entrenched high frequency strings can also lead to error. 
Ambridge and Rowland (2009) found that if the children had an entrenched positive question frame such as What does X?, they were significantly more likely to make auxiliary doubling errors when the related negative question was elicited (i.e. to say What does she doesn’t like? instead of What doesn’t she like?). Another area that has received considerable attention from both sides of the generativist-usage-based debate is that of optional infinitive (OI) errors – utterances that lack finiteness marking (e.g. He going, He do that). In probably the most influential generativist approach to this issue, the Agreement-Tense Omission model (ATOM), Wexler (1998) suggests that while children have correctly set the tense and agreement parameters of their language from a very early stage, they are subject to a unique checking constraint in early development which means one of these features may
be optionally underspecified. This theory makes a number of predictions that are seemingly borne out by the evidence. First, that in languages in which subjects can be dropped (‘pro-drop’ languages), the rate of OI errors will be very low because children do not have to deal with checking the agreement feature on the subject. In support of this, young Spanish-speaking children make few OI errors while German and Dutch speaking children, who are learning non-pro-drop languages, show very high rates, with Dutch rates even higher than German. The second prediction is that accusative-for-nominative errors (e.g. me do it) occur when children check tense on the verb and do not therefore also mark case on the subject but use the default form (in English, the accusative, hence me-for-I errors, but in German and Dutch, nominative, so these errors are less likely to occur). Usage-based approaches explain both these phenomena in terms of competition between different strings learned from the input. Freudenthal et al. (2007) have shown that differential rates in OI errors can be accounted for by the relative frequency of utterance-final, non-finite verbs (Dutch 87%, German 66%, Spanish 26%) which results from the verb-second rule in Dutch and German (see also Wijnen, Kempen, and Gillis 2001). Thus the suggestion is that the child’s processing mechanism may be differentially picking up forms at the ends of utterances and it is this that gives rise to the different rates of OI errors. In a more recent study, which compared the usage-based position on OI errors with the ‘Variable learning model’ of Legate and Yang (2007), Freudenthal, Pine and Gobet (2010) found that the particular verbs with which children were more likely to make OI errors were those that occurred more frequently in complex verb phrases in the input, suggesting a specific lexical effect on learning. 
Theakston, Lieven and Tomasello (2003) showed a similar effect of input in an experimental study in which children were taught novel verbs either in utterances with complex verb phrases in which the verb was unmarked (Will it tam?) or with the verb in third person singular (It tams). Children who only ever heard the verbs in the unmarked condition were significantly more likely to produce them as OI errors. As far as accusative-for-nominative errors in English are concerned, Kirjavainen, Theakston and Lieven (2009) investigated whether complex utterances in the input (Let me do it) might explain the origin of English-speaking children’s first person pronoun case errors, where accusative pronouns are used in nominative contexts. Naturalistic data from 17 two-to-four-year-olds was searched for 1st person singular (1psg), accusative-for-nominative case errors and for all 1psg preverbal pronominal contexts.
Their caregivers’ data was also searched for 1psg preverbal pronominal contexts. The children’s proportional use of me-for-I errors was correlated with their caregivers’ proportional use of me in 1psg preverbal contexts. There were also lexically-specific effects in these me-for-I errors: the verbs that children produced in me-error utterances appeared in complex sentences containing me in the input more often than verbs that did not appear in these errors in the children’s speech. Of course there is no direct mapping from me used as a subject in the input but children seem to use lexical strings from the input in which me appears before the verb. For at least one of the children, this becomes a productive pattern: she uses the me + V pattern very often and some of these utterances are with finite verbs which are very unlikely to have occurred after me in her input. This is interesting since it indicates a process of abstraction from a set of strings learned from the input. These errors will eventually become of very low, or non-existent, frequency as the frequency of competing correct forms builds up.
6. The development of more complex syntax
To summarise so far, children start by extracting from the input strings (of varying length) and mapping these to meaning. With development, generalisation and abstraction take place and constructions of varying levels of complexity are formed. This network of constructions becomes increasingly interconnected leading both to more flexible, less item-specific behaviour by the child, but also to competition between constructions which can lead to error. How does this scenario fit with the development of more complex syntax such as relative clauses and complement structures? Initially English-speaking children’s relative clauses are monopropositional: the main clause is usually a copula and functions to focus attention on the referent (e.g. Here’s a tiger that’s gonna scare him, This is the horse sleeping in a cradle [Diessel and Tomasello 2005]). Similar results were obtained for a child learning German, whose earliest utterances with relative clauses were monopropositional, topicalisation constructions with an isolated head noun and relative clause (Brandt, Diessel, and Tomasello 2008). These are very different to the relative clauses that children are typically presented with in experiments which usually contain two full noun phrases (e.g. an object relative The chicken that the pig pushed jumped over the sheep [see Corrêa 1995]). There is a large literature on these types of experiments, which has found that children find object relatives harder than subject relatives and this has been the subject of a number of explanations
based on processing (e.g. Bever 1970, Diessel and Tomasello 2001, 2005) and generativist accounts of relative clause syntax (e.g. Friedmann and Novogrodsky 2004, Goodluck, Guilfoyle, and Harrington 2006). However it turns out that if English- and German-speaking children are presented with the types of object relatives that they actually hear in the input and use (with inanimate head NPs and pronominal subjects, e.g. That’s the book that he bought last week), they do significantly better on these prototypical object relatives than on non-prototypical ones and there is no difference in their ability to interpret subject and object relatives (Kidd et al. 2007, Brandt et al. 2009). In these experiments, children were presented with the same relative clause structure that they heard most frequently in the input but not, of course, with exactly the same lexical items. So they had abstracted a relative clause construction based on overlapping semantics and structure from what they had heard but did not yet command all of relative clause syntax (see Fox and Thompson 1990 on the form and function of relative clauses). Thus children develop schematic representations out of specific exemplars that overlap in form, semantics and/or function. A similar analysis can be made of the development of sentences with finite complements (Diessel and Tomasello 2001, Brandt, Lieven, and Tomasello 2010). Again these are initially monopropositional rather than expressing two separate propositions, one in the main clause and the other semantically and syntactically subordinated (e.g. John thought that Jean would come yesterday). In children’s early utterances with finite complements, the matrix is formulaic, fixed for person, number and tense and acts as a discourse marker (for instance, as a hedge in I think X, Ich dachte X or an attention marker, Look X, Siehst Du X). 
With development, a wider range of complement-taking matrix verbs are used in a wider variety of forms, suggesting again that development proceeds from relatively low-scope, item-based patterns to a more abstract and inter-connected network. Note that, if lexical strings become very entrenched, it may be difficult for the child to break them up, leading to greater flexibility in the more abstracted parts of the network. For instance, Brandt et al. (2011) showed that 4- and 5-year-olds are better at producing less frequent matrix verbs in 3rd person complement-taking constructions than they are at changing high-frequency strings like I think X to He thinks X.
7. Competition and affordance in a network of constructions
The grammatical network builds up over development with different constructions becoming increasingly connected both through form and meaning. Highly frequent strings may be entrenched as a whole (e.g. what’s that? or I dunno [Bybee and Scheibmann 1999]) or as low-scope schemas but are likely to also develop connections with more schematic constructions in the network (e.g. of wh-questions), allowing much more flexibility in choice of constructions and, as a result, in conversational or stylistic variability (see Verhagen 2005). In principle, this can explain why children’s production and comprehension may look rather different, especially in the initial stages of language learning. In attempting to understand an utterance, the child may have already abstracted a number of strategies that help in comprehension: as we saw in Section 4 above, an example for English might be that the first noun in a sentence is likely to be an agent/actor. This will allow for the comprehension of prototypical transitives but may lead the child astray with non-prototypical transitives (Dittmar et al. 2008, Chan et al. 2009) and some types of intransitives (Hirsh-Pasek and Golinkoff 1996, Chang, Dell, and Bock 2006, Noble, Rowland, and Pine 2011). In production, however, the child will need to be able to produce a whole construction (which could, of course, be lexically-specific or of very low-scope). One way to think of abstraction is of the increasing interconnectedness of constructions along a variety of dimensions of both meaning and form. 
We have already seen that this can result not only in children’s increasing ability to deal with the more schematic instances of grammar but can also, if particular form-function mappings have become highly entrenched, impede this (for instance, children’s difficulty in converting an entrenched, first-person matrix verb like ‘I think’ into a 3rd-person ‘He thinks’, which presumably presents a problem because [a] I think is so entrenched and [b] it is entrenched as a hedge whereas He thinks is genuinely propositional). Two examples of the consequences of this developing network are provided by the version of the construction conspiracy hypothesis (Morris, Cottrell, and Elman 2000) developed by Abbot-Smith and Behrens (2006) and by Ambridge et al.’s work on the retreat from overgeneralisation (see the discussion in Ambridge and Lieven 2011: 242–265). In their study of one child’s development of the German passive, Abbot-Smith and Behrens (2006) analysed the shape of the learning curves for the development of the sein- and werden-passives. The sein-passive showed an
earlier and steeper curve indicating that it became productive before the werden-passive. The authors argued that this was because the learning of the sein-passive was supported by the already well-established copula construction whereas this was not the case for the werden-passive. In a second analysis, they argued that, by contrast, the development of the werden-future was delayed because of the existence of a competing construction that shares the same semantic-pragmatic function: the Präsens-future. Interestingly, the affordance for learning the sein-passive was more related to the form of the construction whereas the inhibition of the werden-future was related to the similar semantic-pragmatic functions of the two constructions, which supports the idea that constructions are connected through a number of different features in the network. One feature of child language development is that of overgeneralisation: when the child extends a construction to items that do not fit it in the adult system. Examples are argument structure overgeneralisations such as You cried her (= ‘You made her cry’, an overgeneralisation of an intransitive-only verb into the transitive) and morphological overgeneralisations such as I unsqueezed it (= ‘stopped squeezing it’, an overgeneralisation of reversative un-prefixation). The question of how children retreat from these errors, given that they do not necessarily receive clear negative evidence that these are errors, is the subject of an extensive literature. A recent attempt at a solution is provided by Ambridge, Pine and Rowland’s (2011) FIT theory based on the relative fit between particular items and particular construction ‘templates’. The idea is that overgeneralisations always involve a degree of coercion in terms of the semantic, pragmatic or phonological match between the construction and the item placed in it and that constructions compete for selection to convey the speaker’s message. 
Competition takes place along different dimensions: whether the construction contains a slot for each item and has the right event-semantics; whether the semantic properties of the slot fit those of the item potentially to be placed in it; relative construction frequency; and, finally the frequency with which the particular items have appeared in the construction. Since all these build up during learning, overgeneralisation errors occur when the ‘wrong’ construction out-competes the one that is correct in the adult system. The important thing about both these examples is first, that competition is central both for comprehension and production and second, that competition can occur simultaneously on different dimensions: phonological, morphological, semantic, pragmatic and at the level of the overall construction.
8. Conclusions
In presenting the usage-based position on children’s language development, I have tried to show that children are not mapping forms onto preexisting abstract categories. Rather, they are learning phonologically specific strings (of varying length, from morphemes up to multi-word strings) upon which more complex language is built. To do this, they use pattern extraction mapped to communicative intent, with generalisation to more schematic representations. The frequency of these strings in what they hear is centrally involved in learning but is not a straightforward mapping: salience, neighbourhood relations, type and token frequencies are all involved as, most importantly, is what the child wants to say. The issue is not whether the child is productive and capable of abstraction: this is already the case before a word is uttered. What matters is the scope of the productivity, how this changes with development, and the ways in which the connections between constructions in the network build up. This can explain complex language acquisition data, including patterns of error.
Of course this is only the beginning of a major research enterprise. There are major questions about the nature of the network in terms of what is stored and the nature of the links made between constructions. Much more also needs to be done on the implications of the usage-based approach for languages with very different typological characteristics to those so far studied. We should also note that almost all CDS corpora have been collected from monolingual, middle-class families in technological societies and usually consist of dyadic interactions between child and caretaker. This is far from the situation in which most children learn to talk, and we need to take seriously the question of how children growing up in multilingual contexts, and in contexts with many more adults and children around, learn the language(s) of their environment.
References
Abbot-Smith, Kirsten, and Heike Behrens 2006 How known constructions influence the acquisition of other constructions: the German passive and future constructions. Cognitive Science 30 (6): 995–1026.
Aguado-Orea, Javier 2004 The acquisition of morpho-syntax in Spanish: Implications for current theories of development. Unpublished Ph.D. Thesis. University of Nottingham.
Akhtar, Nameera, and Michael Tomasello 1997 Young children’s productivity with word order and verb morphology. Developmental Psychology 33 (6): 952–965.
Ambridge, Ben, and Elena V. M. Lieven 2011 Child Language Acquisition: Contrasting Theoretical Approaches. Cambridge: Cambridge University Press.
Ambridge, Ben, and Caroline F. Rowland 2009 Predicting children’s errors with negative questions: Testing a schema-combination account. Cognitive Linguistics 20 (2): 225–266.
Ambridge, Ben 2010 Review of I. Gülzow and N. Gagarina: “Frequency effects in language acquisition: Defining the limits of frequency as an explanatory concept”. Berlin: Mouton de Gruyter. Journal of Child Language 37 (2): 453–460.
Ambridge, Ben, Caroline F. Rowland, Anna L. Theakston, and Michael Tomasello 2006 Comparing different accounts of inversion errors in children’s non-subject wh-questions: ‘What experimental data can tell us?’. Journal of Child Language 33 (3): 519–557.
Ambridge, Ben, Julian M. Pine, and Caroline F. Rowland 2011 Children use verb semantics to retreat from overgeneralization errors: A novel verb grammaticality judgment study. Cognitive Linguistics 22 (2): 303–323.
Bannard, Colin, Elena Lieven, and Michael Tomasello 2009 Modeling children’s early grammatical knowledge. PNAS 106 (41): 17284–17289.
Barnes, Sally, Mary Gutfreund, David Satterly, and Gordon Wells 1983 Characteristics of adult speech which predict children’s language development. Journal of Child Language 10: 65–84.
Bates, Elizabeth, and Brian MacWhinney 1987 Competition, variation, and language learning. In Mechanisms of Language Acquisition, Brian MacWhinney (ed.), 157–193. Hillsdale, NJ: Lawrence Erlbaum.
Bever, Thomas G. 1970 The cognitive basis for linguistic structures. In Cognition and the Development of Language, John R. Hayes (ed.), 279–362. New York: Wiley.
Bowerman, M. 1988 The “no negative evidence” problem: How do children avoid constructing an overgeneral grammar? In Explaining Language Universals, John A. Hawkins (ed.), 73–101. Oxford: Basil Blackwell.
Brandt, Silke, Evan Kidd, Elena Lieven, and Michael Tomasello 2009 The discourse bases of relativization: An investigation of young German and English-speaking children’s comprehension of relative clauses. Cognitive Linguistics 20 (3): 539–570.
Brandt, Silke, Elena Lieven, and Michael Tomasello 2010 Development of word order in German complement-clause constructions: Effects of input frequencies, lexical items, and discourse function. Language 86 (3): 583–610.
Brandt, Silke, Arie Verhagen, Elena Lieven, and Michael Tomasello 2011 German children’s productivity with simple transitive and complement-clause constructions: Testing the effects of frequency and variability. Cognitive Linguistics 22 (2): 325–357.
Brandt, Silke, Holger Diessel, and Michael Tomasello 2008 The acquisition of German relative clauses: A case study. Journal of Child Language 35: 325–348.
Bybee, Joan L., and Joanne Scheibman 1999 The effect of usage on degrees of constituency: The reduction of don’t in English. Linguistics 37: 575–596.
Cameron-Faulkner, Thea, Elena Lieven, and Michael Tomasello 2003 A construction based analysis of child directed speech. Cognitive Science 27 (6): 843–873.
Carpenter, Malinda, Katherine Nagell, and Michael Tomasello 1998 Social Cognition, Joint Attention, and Communicative Competence from 9 to 15 Months of Age. (Monographs of the Society for Research in Child Development), 255, vol. 63 (4). Chicago: University of Chicago Press.
Chan, Angel, Elena Lieven, and Michael Tomasello 2009 Children’s understanding of the agent-patient relations in the transitive construction: Cross-linguistic comparisons between Cantonese, German and English. Cognitive Linguistics 20 (2): 267–300.
Chang, Franklin, Gary S. Dell, and Kathryn Bock 2006 Becoming syntactic. Psychological Review 113 (2): 243–272.
Chemla, Emmanuel, Toben H. Mintz, Savita Bernal, and Anne Christophe 2009 Categorizing words using ‘frequent frames’: What cross-linguistic analyses reveal about distributional acquisition strategies. Developmental Science 12: 396–406.
Clark, Eve 1996 Early verbs, event types and inflection. In Children’s Language, vol. 9, Carolyn E. Johnson and John Gilbert (eds.), 61–73. Mahwah, NJ: Lawrence Erlbaum Associates.
Corrêa, Letícia 1995 An alternative assessment of children’s comprehension of relative clauses. Journal of Psycholinguistic Research 24 (3): 183–203.
Curtin, Suzanne, and Dan Hufnagle 2009 Speech perception. In The Cambridge Handbook of Child Language, Edith Bavin (ed.), 107–123. Cambridge/New York: Cambridge University Press.
Dąbrowska, Ewa, and Elena Lieven 2005 Towards a lexically specific grammar of children’s question constructions. Cognitive Linguistics 16 (3): 437–474.
Diessel, Holger, and Michael Tomasello 2005 A new look at the acquisition of relative clauses. Language 81: 1–25.
Diessel, Holger, and Michael Tomasello 2001 The acquisition of finite complement clauses in English: A corpus-based analysis. Cognitive Linguistics 12 (2): 97–141.
Dittmar, Miriam, Kirsten Abbot-Smith, Elena Lieven, and Michael Tomasello 2008 German children’s comprehension of word order and case marking in causative sentences. Child Development 79: 1152–1167.
Erlekens, Marian A. 2009 Learning to Categorize Verbs and Nouns: Studies on Dutch. Utrecht: LOT.
Farrar, M. Jeffrey 1990 Discourse and the acquisition of grammatical morphemes. Journal of Child Language 17: 604–624.
Fox, Barbara A., and Sandra A. Thompson 1990 A discourse explanation of the grammar of relative clauses in English conversation. Language 66: 297–316.
Freudenthal, Daniel, Julian Pine, Javier Aguado-Orea, and Fernand Gobet 2007 The developmental patterning of finiteness marking in English, Dutch, German and Spanish using MOSAIC. Cognitive Science 31: 311–341.
Freudenthal, Daniel, Julian Pine, and Fernand Gobet 2010 Explaining quantitative variation in the rate of optional infinitive errors across languages: A comparison of MOSAIC and the Variational Learning Model. Journal of Child Language 37: 643–669.
Friedmann, Nama, and Rama Novogrodsky 2004 The acquisition of relative clause comprehension in Hebrew: A study of SLI and normal development. Journal of Child Language 31: 661–681.
Gertner, Yael, Cynthia Fisher, and Julie Eisengart 2006 Learning words and rules: Abstract knowledge of word order in early sentence comprehension. Psychological Science 17 (8): 684–691.
Goodluck, Helen, Eithne Guilfoyle, and Síle Harrington 2006 Merge and binding in child relative clauses: the case of Irish. Journal of Linguistics 42: 629–661.
Grünloh, Thomas, Elena Lieven, and Michael Tomasello 2011 German children use prosody to identify participant roles in transitive sentences. Cognitive Linguistics 22 (2): 393–419.
Hirsh-Pasek, Kathy, and Roberta Michnick Golinkoff 1996 The Origins of Grammar: Evidence from Early Language Comprehension. Cambridge, Mass.: MIT Press.
Hoff-Ginsberg, Erika, and Letitia Naigles 1998 Why are some verbs learned before other verbs? Effects of input frequency and structure on children’s early verb use. Journal of Child Language 25 (1): 95–120.
Huttenlocher, Janellen, Marina Vasilyeva, Elina Cymerman, and Susan Levine 2002 Language input and child syntax. Cognitive Psychology 45 (3): 337–374.
Ibbotson, Paul, Anna L. Theakston, Elena Lieven, and Michael Tomasello 2011 The role of pronoun frames in early comprehension of transitive constructions in English. Language Learning and Development 7: 24–29.
Kempe, Vera, and Brian MacWhinney 1998 The acquisition of case marking by adult learners of Russian and German. Studies in Second Language Acquisition 20: 543–587.
Kidd, Evan, Silke Brandt, Elena Lieven, and Michael Tomasello 2007 Object relatives made easy: A crosslinguistic comparison of the constraints influencing young children’s processing of relative clauses. Language and Cognitive Processes 22 (6): 860–897.
Kirjavainen, Minna, Anna Theakston, and Elena Lieven 2009 Can input explain children’s me-for-I errors? Journal of Child Language 36 (5): 1091–1114.
Krajewski, Grzegorz, Elena Lieven, and Anna Theakston 2012 Productivity of a Polish child’s inflectional noun morphology: A naturalistic study. Morphology 22: 9–34.
Legate, Julie Anne, and Charles Yang 2007 Morphosyntactic learning and the development of tense. Language Acquisition 14 (3): 315–344.
Lieven, Elena 2008 Learning the English auxiliary: A usage-based approach. In Corpora in Language Acquisition Research: Finding Structure in Data (Trends in Language Acquisition Research, vol. 6), Heike Behrens (ed.), 60–98. Amsterdam: Benjamins.
Lieven, Elena 2010 Input and first language acquisition: Evaluating the role of frequency. Lingua 120: 2546–2556.
Lieven, Elena, Dorothé Salomo, and Michael Tomasello 2009 Two-year-old children’s production of multiword utterances: A usage-based analysis. Cognitive Linguistics 20 (3): 481–508.
Mintz, Toben H. 2003 Frequent frames as a cue for grammatical categories in child directed speech. Cognition 90 (1): 91–117.
Morris, William C., Garrison W. Cottrell, and Jeffrey L. Elman 2000 A connectionist simulation of the empirical acquisition of grammatical relations. In Hybrid Neural Symbolic Integration, Stefan Wermter and Ron Sun (eds.), 175–193. Heidelberg: Springer Verlag.
Noble, Claire H., Caroline F. Rowland, and Julian M. Pine 2011 Comprehension of argument structure and semantic roles: Evidence from English-learning children and the forced-choice pointing paradigm. Cognitive Science 35: 963–982.
Noble, Claire H., Anna Theakston, and Elena Lieven subm. Comprehension of intransitive argument structure: The first noun as causal agent bias.
Pine, Julian M., Caroline F. Rowland, Elena V. M. Lieven, and Anna L. Theakston 2005 Testing the agreement/tense omission model: Why the data on children’s use of non-nominative 3psg subjects count against the ATOM. Journal of Child Language 32: 269–289.
Pine, Julian M., Elena V. M. Lieven, and Caroline Rowland 1998 Comparing different models of the English Verb Category. Linguistics 36: 807–830.
Pizzuto, Elena, and Maria C. Caselli 1992 The acquisition of Italian morphology: Implications for models of language development. Journal of Child Language 19 (3): 491–557.
Rispoli, Matthew 1998 Patterns of pronoun case error. Journal of Child Language 25 (3): 533–554.
Rowland, Caroline, and Anna L. Theakston 2009 The acquisition of auxiliary syntax: A longitudinal elicitation study. Part 2: The modals and DO. Journal of Speech Language and Hearing Research 52 (6): 1471–1492.
Rowland, Caroline F. 2007 Explaining errors in children’s questions. Cognition 104 (1): 106–134.
Rowland, Caroline F., and Sarah L. Fletcher 2006 The effect of sampling on estimates of lexical specificity and error rates. Journal of Child Language 33 (4): 859–877.
Rowland, Caroline F., and Julian M. Pine 2000 Subject-auxiliary inversion errors and wh-question acquisition: ‘What children do know!’. Journal of Child Language 27 (1): 157–181.
Rowland, Caroline, Sarah Fletcher, and Daniel Freudenthal 2008 How big is enough? Assessing the reliability of data from naturalistic samples. In Corpora in Language Acquisition Research: Finding Structure in Data (Trends in Language Acquisition Research, vol. 6), Heike Behrens (ed.), 1–24. Amsterdam: Benjamins.
Stoll, Sabine, Balthasar Bickel, Elena Lieven, Netra P. Paudyal, Goma Banjade, Toya N. Bhatta, Martin Gaenszle, Judith Pettigrew, Ichchha Purna Rai, Manoj Rai, and Novel Kishore Rai 2012 Nouns and verbs in Chintang: Children’s usage and surrounding adult speech. Journal of Child Language 39 (2): 284–321.
Stoll, Sabine, Kirsten Abbot-Smith, and Elena Lieven 2009 Lexically restricted utterances in Russian, German and English child-directed speech. Cognitive Science 33: 75–103.
Stumper, Barbara, Colin Bannard, and Elena Lieven 2011 ‘Frequent frames’ in German child-directed speech: A limited cue to grammatical categories. Cognitive Science 35: 1190–1205.
Theakston, Anna L., Elena V. M. Lieven, and Michael Tomasello 2003 The role of the input in the acquisition of third person singular verbs in English. Journal of Speech Language and Hearing Research 46 (4): 863–877.
Theakston, Anna L., Elena V. M. Lieven, Julian M. Pine, and Caroline F. Rowland 2001 The role of performance limitations in the acquisition of verb-argument structure: An alternative account. Journal of Child Language 28 (1): 127–152.
Tomasello, Michael 2000 Do young children have adult syntactic competence? Cognition 74 (3): 209–253.
Tomasello, Michael 2003 Constructing a Language: A Usage-Based Theory of Language Acquisition. Cambridge, Mass.: Harvard University Press.
Tomasello, Michael, and Daniel Stahl 2004 Sampling children’s spontaneous speech: How much is enough? Journal of Child Language 31 (1): 101–121.
Tomasello, Michael, Malinda Carpenter, Josep Call, Tanya Behne, and Henrike Moll 2005 Understanding and sharing intentions: The origins of cultural cognition. Behavioral and Brain Sciences 28: 675–735.
Verhagen, Arie 2005 Constructions of Intersubjectivity. New York: Oxford University Press.
Wexler, Ken 1998 Very early parameter setting and the unique checking constraint: A new explanation of the optional infinitive stage. Lingua 106: 23–79.
Wijnen, Frank, Masja Kempen, and Steven Gillis 2001 Bare infinitives in Dutch early child language: An effect of input? Journal of Child Language 28: 629–660.
Item-based patterns in early syntactic development Brian MacWhinney
1. From words to combinations
Children begin language learning by producing one word at a time (Bloom 1973). It may seem obvious that children build up language by putting together small pieces into larger, more complex structures (Simon 1969). However, some researchers have argued that children cannot pick up single words from parental input without relying on additional processes such as statistical learning (Thiessen and Saffran 2007), syntactic bootstrapping (Gleitman 1990), or semantic bootstrapping (Pinker 1995, Siskind 2000). Although these processes are involved in various ways during language learning, it is not clear that they are crucially involved in word learning. Instead, as St. Augustine argued in his Confessions (1952) back in the 4th century, children pick up words because of the ways in which parents present them, often by pointing at objects directly and naming them. To explore this issue, I examined the maternal input to 16 children in the Brent Corpus in the CHILDES database (http://childes.psy.cmu.edu, MacWhinney 2000). The children in this corpus were studied between 9 and 15 months of age and the total size of the database is 496,000 words. A search of this corpus shows that 23.8% of the maternal utterances are single-word utterances.¹ These results indicate that Augustine’s model is not that far off the mark, and that it is safe to assume that children can pick up many words without relying on additional segmentation strategies (Aslin, Saffran, and Newport 1999) and bootstrapping. Recent models of early word learning (Blanchard, Heinz, and Golinkoff 2010, Monaghan and Christiansen 2010, Rytting, Brew, and Fosler-Lussier 2010) place increasing emphasis on the role of the lexicon in guiding segmentation. Although statistical learning may help guide segmentation and thereby lexical learning, the pathway from lexical learning to segmentation is even more central to language learning. Although this learning path may not be available for bound morphemes and function words, it is clearly available for many content words. As we will see later, learning of bound morphemes follows a similar, but slightly different path.
It is clear that child language learning is not based on the full batch recording of long sentences completely mapped to complex semantic structures, as suggested in models such as SCISSOR (Ge and Mooney 2005). Such models may seem attractive from a computational point of view, but they make implausible assumptions regarding children’s memory for sentences and their contexts. They make the excessively strong assumption that children store away everything they hear, along with complete episodic encodings of the situations in which utterances occur. Despite their full control of the language, even adults do not demonstrate this level of total recall (Keenan, MacWhinney, and Mayhew 1977), and it seems still less likely that children could have this level of recall for sentences that they do not yet even comprehend. Instead, we can think of the language learning process as one in which small components are first isolated and then assembled into larger combinations, step by step. Many early forms are multimorphemic combinations that function initially as single lexical items, or what MacWhinney (1975b, 1978) called “amalgams”. For example, a child who knows only the word dishes and not dish may think that dishes is a mass noun with no plural suffix, much like clothes. In addition to these morphemic amalgams, children will pick up longer phrasal combinations as if they were single lexical items. For example, they may treat where’s the as a single lexical item, not analyzed into three morphemes. To trace this process, we need to look closely at the actual utterances produced by children.
¹ The CLAN commands are: freq +s"*MOT:" +u +re +d4 +y +.cha (locates 155906 maternal utterances); wdlen +t*MOT +re +u *.cha (locates 37110 maternal one-word utterances).
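The 23.8% figure for single-word maternal utterances can be re-derived from the two counts reported with the CLAN commands in footnote 1. The short Python check below uses only those two published counts; the variable names are chosen here for illustration.

```python
# Counts as reported in footnote 1 for the Brent Corpus searches
total_maternal_utterances = 155906
one_word_maternal_utterances = 37110

# Proportion of maternal utterances that consist of a single word
share = one_word_maternal_utterances / total_maternal_utterances
print(f"{share:.1%}")  # prints 23.8%
```

The result agrees with the percentage given in the text, which confirms that the figure is a proportion of utterances rather than of word tokens.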
In practice, the study of these early two- and three-word combinations has involved decades of data collection and theory construction. The Child Language Data Exchange System (CHILDES) has collected longitudinal data from hundreds of children learning dozens of languages, with a particularly heavy representation of data from children producing early word combinations. The construction of this important resource was made possible by the generous contributions of hundreds of child language researchers who have made their hard-won data publicly available. Using this resource, we are now able to produce increasingly refined accounts of the process of early syntactic development.
The transition from children’s first words to their first sentences is nearly imperceptible. After learning the first words, children begin to produce more and more single-word utterances. As their vocabulary grows, children begin saying words in close succession, separated only by short pauses (Branigan 1979). For example, they may say wanna, followed by a short pause and then cookie. If the intonational contour of wanna is not closely integrated with that of cookie, adults tend to perceive this as two successive single-word utterances. However, the child may already have in mind a clear syntactic relation between the two words. As the clarity of the relations between single words strengthens, the temporal gap between the words decreases. Eventually, we judge the production of want cookie to be a single multi-word utterance. Across a period of several months, two- and three-word combinations such as more cookie, my baby, hi Daddy, and look my birdie become increasingly frequent. In experiments attempting to teach signed language to chimpanzees, this transition from successive single-word utterances to single multi-word utterances seems to occur less frequently or not at all. This has led researchers (Hauser, Chomsky, and Fitch 2002, MacWhinney 2008a, Terrace et al. 1980) to suggest that the ability to communicate using integrated combinations is uniquely supported by the human language mechanism.
2. Positional patterns
In parallel with the ongoing process of data collection, child language researchers have examined a variety of theoretical accounts of early word combinations. The goal of this work is to formulate a set of mechanisms (MacWhinney 1987) that can explain how children use the linguistic input they receive to construct productive grammatical patterns. The first attempt to provide an account of these early grammars was offered by Braine (1963, 1976). Braine suggested that early word combinations were structured in terms of “positional patterns” that involved the linear combination of two classes of words: pivot class words and open class words. Words in the pivot class could only occur in combination with open class words, whereas open class words could either appear alone, in combination with pivot class words, or in combination with other open class words. Braine referred to this system as a Pivot Grammar. His analysis of this system was backed up by experiments (Braine 1963, 1966) that showed how adults could extract word classes of this type in miniature linguistic systems (MLS).
In a classic analysis, Bloom (1971) challenged the generative adequacy of the Pivot Grammar framework by emphasizing two problems. The first was the tendency for Pivot Grammar to overgenerate. For example, it would allow forms like want take or my want in which words were combined in conceptually unlikely ways. The second problem involved the analysis of open-open combinations such as Mommy chair. In such combinations, it is difficult to determine if the child intends “Mommy’s chair”, “Mommy, there is the chair”, “Mommy is in the chair”, or some other possible interpretation. Bloom’s criticism reflected the emphasis during the 1970s (Leonard 1976, Schlesinger 1974) on the idea that children’s early word combinations were based on the use of some small set of universal conceptual relations such as modifier + modified, locative + locations or subject + verb. In an attempt to align his earlier theory with this Zeitgeist, Braine (1976) suggested that early combinations could best be viewed as “groping patterns” in which the conceptual relations were initially vague, but became solidified over time. Braine viewed patterns of this type as expressing high-level semantic relational features such as recurrence (another doll), possession (my doll), agency (doll runs), or object (want doll).
3. Item-based patterns
My own analysis (MacWhinney 1975a) took a somewhat different approach to positional patterns. Rather than arguing that children were selecting combinations from two large classes or expressing a small set of universal conceptual relations, I looked at early combinations as based on an array of what I called “item-based patterns” (IBPs) with each pattern linked tightly to some individual lexical item. This emphasis on generation of syntax from lexical items was in tune with ongoing work at the time on Lexical Functional Grammar (Bresnan 1978, Pinker 1982). Over time, the emphasis on lexical determination of patterns of word combination has increasingly become the default in linguistics, whether it be through the Merge operation (Chomsky 2010) or the feature cancellation of Combinatory Categorial Grammar (Steedman 2000).
Because IBPs emphasize individual lexical items as the building blocks of combinations, they avoid the imposition of adult conceptual relations on early child utterances. Instead, the relevant conceptual relations are, at least initially, the relations inherent in individual predicates such as more, want, or my. Rather than viewing the combination more milk as expressing a pattern such as recurrence + object, this framework interprets the combination as evidence for the pattern more + X, where the italicization of the word more indicates that it is a particular lexical item and not a general concept. This analysis stresses the extent to which the IBP first emerges as a highly limited construction based on the single lexical item more.
These item-based combinations can be viewed as predicate-argument relations. In the IBP for more milk, the predicate is more and the argument or slot filler is milk. In the case of the IBP for want there are two terms that can complete its argument structure. First, there must be a term that serves as a direct object, as in want cookie. Often this term is a nominal, but children also produce combinations in which the second term is a verb, as in want kiss. Second, there must be a nominal that serves as the subject, as in I want cookie. Because want expects these two additional complements, we can call it a two-argument predicate. Other predicates, such as under or my, take only one argument, and a few such as give take three (John gave Bill a dollar). The only lexical categories that typically take no additional arguments are nouns, such as dog or justice, and interjections, such as gosh or wow. Unlike verbs, adjectives, prepositions and other words that require additional arguments, nouns and interjections can express a full meaning without additional arguments. On the other hand, nouns that are derived from verbs, such as lack, destruction or decline, can take prepositional phrases as additional complements (as in a lack of resources, the army’s destruction of the city or a decline in the dollar), but basic nouns such as chair and goat do not even have these expectations for additional complements.
3.1 How children learn IBPs
Children learn item-based patterns by listening to phrases, short sentences, or fragments of longer sentences. For example, if the child’s older sister says this is my dollie, the child may only store the last two words as my dollie. Within this sequence, the child will then recognize the word dollie from previous experience and associate that word with the actual doll. This extraction of the “known” segment then leaves the segment my as “unknown” or uninterpreted (MacWhinney 1978). At this point, the child can compare the phrase my dollie with the single word dollie, noticing the differences. The first difference is the presence of my before dollie. The second difference involves the meaning of possession by the speaker. Because this meaning makes no sense without an attendant argument, it is acquired as a predicate that takes on a meaning when combined with its argument. At this point, the child can establish a new lexical entry for my and associate it with the meaning of being possessed by the speaker (the older sister). While acquiring this new form, the child also extracts the item-based pattern my + X. This means that, right from the beginning, the construction of this new lexical predicate involves a parallel construction of an IBP. In this case, the older sister may be asserting her control over the doll and wresting it from the younger sister’s possession. Thus, the younger child picks up not only the meaning of my and its position with respect to its argument, but also the notion of a relation of possession and control between the two words. The important point here is that IBPs are formed directly when new predicates are learned.
It is more accurate to speak of this item-based pattern as combining my + object possessed, rather than just my + X. The specific form of the semantic relation here is shaped by the exact activity involved in the child’s possessing this particular doll. Embodied relations of this type can be represented within the general theory of Cognitive Grammar (Langacker 1989) and its more specific implementations in the theory of Embodied Cognition (Feldman 2006). From this perspective, we can see the relations between predicates and their arguments in terms of enacted actions, emotions, perceptions and space/time configurations. For example, when a child says my dollie, there is a specific reference to the embodied action of holding the doll. Often we can see this even as the child is talking. When the child says byebye Daddy, there is a concomitant waving of the hand and the physical experience of seeing the father leave. When the child sees a toy dog fall from the table and says puppy fall, there is a linkage to other experiences with falling either by the child herself or by other objects. In all these relations, children are expressing activities and relations for which they have had direct embodied physical contact and experience.
Initially, the pattern of my + object possessed is restricted to the words my and dollie and the relation of possession that occurs between them.
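The extraction step described above (recognizing the known word, isolating the unknown remainder, and storing that remainder as a predicate with an argument slot) can be sketched in a few lines of Python. This is an illustrative toy under simplifying assumptions, not an implementation taken from the literature: the class `ItemBasedPattern`, the function `extract_ibp`, the restriction to two-word fragments, and the string labels for position and relation are all hypothetical choices made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class ItemBasedPattern:
    predicate: str   # the specific lexical item, e.g. "my"
    position: str    # where the predicate stands relative to its argument
    relation: str    # relation holding between predicate and argument
    fillers: set = field(default_factory=set)  # slot fillers attested so far


def extract_ibp(fragment, known_words, relation):
    """Compare a stored two-word fragment with the known vocabulary.

    If exactly one word is already known, the other word is treated as a
    new predicate and an item-based pattern is set up around it.
    """
    words = fragment.split()
    if len(words) != 2:
        return None  # the sketch handles two-word fragments only
    first, second = words
    if second in known_words and first not in known_words:
        return ItemBasedPattern(first, "before argument", relation, {second})
    if first in known_words and second not in known_words:
        return ItemBasedPattern(second, "after argument", relation, {first})
    return None  # both or neither word known: nothing to isolate


known_words = {"dollie"}
pattern = extract_ibp("my dollie", known_words, relation="possessed by speaker")
print(pattern.predicate, "+ X,", pattern.position, "| fillers:", pattern.fillers)
```

At this stage the slot has a single attested filler, which mirrors the point that the pattern is initially restricted to the two words actually heard together.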
However, if the older sister then says and this is my horsie, the child can begin to realize that the open slot for the item-based pattern linked to my refers potentially to any manner of toy. Subsequent input will teach the child that any object can fill the slot opened up by the operator my. Each IBP goes through this type of generalization, which I have called “feature superimposition” (MacWhinney 1975b) and which Worden (2002) calls “feature pruning”. By comparing or superimposing forms such as more milk, more toys and more cookies, the child can generalize the semantic features of the argument slot. This comparison prunes out features such as [+ solid] or [+ edible] and leaves features such as [+ object] or [+ force].
Parents can promote the child’s learning of IBPs by providing appropriate input structures. As soon as the child begins to understand the frame where’s your + X, parents can ask where’s your nose, where’s your tummy, and so on. Then, they can build on this structure by saying show me your nose, show me your tummy, and so on. From teaching sequences such as these, the child can pick up the IBP your + X. Sokolov (1993) observed that parents’ use of these frames increases at the time when children begin to show understanding of the relevant structures. Second language researchers refer to these repetition structures as “build ups”, whereas first language researchers refer to them as “variation sets”, because they emphasize the variation that arises in the argument slot of a given IBP. Küntay and Slobin (1996) report that roughly 20% of the input to children involves such variation sets, and Waterfall et al. (2010) have shown that the presence of these sets in the input can improve the learning of computational models of language acquisition.
3.2 The structure of IBPs
This view of the learning of IBPs motivates several assumptions regarding how IBPs are structured and function. Specifically, each IBP specifies:
1. the lexical identity of its predicate, which can be either a free or bound morpheme,
2. the possible lexical features of one or more arguments,
3. the position of the predicate vis-à-vis its arguments, and
4. the conceptual/grammatical relation that holds between the predicate and each argument.
These four components of the IBP are shaped directly during the initial process of learning of predicates. In this regard, we can also analyse the learning of affixes in terms of IBP learning. For example, the learning of the English plural suffix -s can be described through the same learning scenario we used to describe the learning of the IBP for the quantifier more. Consider a child who knows the word dog and is now trying to understand the word dogs. Following MacWhinney (1978), we can assume that the comparison of the known word dog with the new form dogs leads to the masking of the shared phonological segments and the isolation of the -s as the “unknown” segment. Similarly, comparison of the referent of dog with the current referent of dogs leads to the abstraction of plurality as the “unidentified” concept. The linking of the unknown form to the unidentified concept produces a new lexical form for the plural. This new predicate then links the nominal argument dog to the pre-predicate slot and establishes a relation of quantification between the suffix and the noun. Because affix-based patterns are so frequent and consistent, children find them very easy to learn. We know that in English (Braine 1963), Garo (Burling 1959), Hungarian (MacWhinney 1976), Japanese (Clancy 1985) and Turkish (Aksu-Koç and Slobin 1985) the ordering of affixes in words is almost always correct, even at the youngest ages. In general, the learning of affixes is parallel to the learning of other lexical predicates. Of course, there are also differences between the two scenarios in terms of the triggering of phonological processes (MacWhinney 1978), but that is a story for another time.

3.3 Clustering
There are three other aspects of IBPs that arise from different processing sources. The first is the property of clustering, which produces the capacity for recursion. Clustering allows a combination of words to occupy an argument slot. For example, in the sentence I want more milk, the combination more milk functions as a cluster that can fill the object argument slot for the verb want. Clustering allows the child to gradually build up longer sentences and a more complex grammar. Examples of potentially infinite cluster types include structures such as (John’s (friend’s (sister’s car))) or I know (that John said (that Mary hoped (that Jill would go))). Chomsky has argued (Chomsky 2010, Hauser et al. 2002) that this type of recursive structuring represents a unique adaptation in human evolution determined by a single recent mutation. MacWhinney (2009), on the other hand, argues that recursion is grounded on a wide set of mnemonic and conceptual abilities in higher mammals that achieved more dynamic functioning once humans had developed systematic methods for encoding lexical items (Donald 1991). For our present purposes, what is important is the way in which the child can appreciate the fact that the combination more milk functions in a way that is equivalent to the single item milk. Whether this is a recent, unique development or an older basic cognitive function is irrelevant for our current purposes.

3.4 Non-locality
A second important property of IBPs is that they can sometimes specify non-local slot fillers. For example, in the sentence what did you eat? the argument slot of eat is filled by a non-local element in accord with the Active Filler strategy (Frazier and Flores d’Arcais 1989). The fact that children take years learning to control these patterns (Brown 1968, Kuczaj and Brannick 1979) shows that local attachment is the default assumption. However, the system is capable of eventually picking up all of these non-local positional specifications. Apart from active fillers, IBPs can also encode interrupted attachments, such as the sequence of can + NP + V operative in phrases such as can you sing? Learning of these discontinuous elements begins in contexts which have only one word or short phrase intervening between the elements, as in can he go, je ne vais pas, or er hat kein Frühstück genommen. Once this basic pattern is established, more complex forms can be created through clustering and adjunct attachments. As Chomsky (2007) notes, because non-local patterns go against the basic principle of economy of local attachment, it is likely that they serve other important pragmatic functions, such as the stressing of the new information in a wh-question.

3.5 Agreement
A third aspect of IBPs involves the possibility of additional structural context. This additional structural context is triggered primarily through agreement and complementation. In these structures, IBPs require feature agreement not just between the IBP predicate and its arguments, but also between the IBP predicate and the features of other predicates attached to the arguments. One common form of agreement is between the verb and its arguments; another is between the noun and its modifiers. In a phrase such as he goes, the verb affix marking the third person singular agrees in person and number not with its head, but with an argument of its head, the word he. Often agreement involves two grammatical morphemes that agree across a relation between their respective bases. In a Spanish phrase such as mis caballos lindos (‘my pretty horses’), the plural suffix on mis (‘my’) and lindos (‘pretty’) agrees not with the base caballo (‘horse’), but with the suffix -s on the head noun caballos to which these word bases attach as modifiers. The German phrase die Mädchen kochen (‘the girls are cooking’) shows a similar structure in which the plurality of the definite article die agrees with the plurality of the suffix -en on the verb kochen (‘cook’). These configurations must be marked as entries in the IBPs for each of these grammatical morphemes. IBPs must also occasionally agree with the features of arguments in subordinate clauses. For example, in the Spanish sentence supongo que venga (‘I imagine he will come’), the word venga in the complement clause is
placed into the subjunctive because of the selectional restriction of the main verb suponer for the irrealis mood. Another classic case of agreement in the child language learning literature involves children’s learning of complement structures in sentence pairs such as John is eager to please and John is easy to please. In the former, the IBP for eager specifies that the perspective/subject (MacWhinney 2008c) of the complement clause is also the subject of the main clause. In the latter, the IBP for easy specifies that the perspective/subject of the complement clause is some generic participant that pleases John. Children find it difficult to learn these patterns (Chomsky 1969), not only because of the more complicated IBP structures, but also because of the additional perspectival adjustments they require. This ability of IBPs to trace information across relational arcs into subordinate clauses conforms with the notion of degree-zero learnability proposed by Lightfoot (1989). Lightfoot argued that grammatical relations could be learned primarily from main clauses with only a little bit of “peeking” into the verbs of subordinate clauses. As Lightfoot noted, these restrictions on the effective environment for grammatical relations overcome the various complexities imagined in earlier work on learnability of transformational grammar and the so-called “logical problem of language acquisition” (Wexler and Culicover 1980).

4. Processing IBPs
To understand how children build up these complex syntactic structures in both production and comprehension, we need to consider how a syntactic processor can combine words using item-based patterns (along with the feature-based patterns to be discussed later), operating in real time. Most current accounts of real-time syntactic processors use the basic logic found in the Competition Model of MacWhinney (1987). That model specifies a series of steps for the competition between constructions during comprehension:
1. Sounds are processed as they are heard in speech.
2. Closely related words compete for selection based on the support they receive from input sounds.
3. Each selected word activates its own item-based patterns along with related feature-based patterns.
4. Item-based patterns initiate searches for specified slot fillers.
5. Slots may be filled either by single words or by whole phrases. In the latter case, the attachment is made to the head of the phrase.
6. To fill a slot, a word or phrase must receive support from cues for word order, prosody, affixes, or lexical class.
7. If several words compete for a slot, the one with the most cue support wins.
The details of the operation of this parser are controlled by the competitions between specific lexical items and the cues that support alternative assignments. Consider the case of prepositional phrase attachment. Prepositions such as on take two arguments: The first argument is the object of the preposition, the second argument is the head of the prepositional phrase (i.e., the word or phrase to which the prepositional phrase attaches). We can refer to argument #1 (arg1) as the local head or endohead and argument #2 (arg2) as the external head or exohead. Consider the sentence the man positioned the coat on the rack. Here, the endohead of on is rack and its exohead (the head of the whole prepositional phrase) could be either positioned or the coat. These two alternative attachment sites for the prepositional phrase are in competition with each other. Competition also governs the interpretation of verbs as either transitive or intransitive. Verbs like jog that have both transitive and intransitive readings can be represented by two competing lexical entries. When we hear the phrase, since John always jogs a mile, we activate the transitive reading. However, if the full sentence then continues as since John always jogs, a mile seems like a short distance, then the intransitive reading takes over from the transitive one. For detailed examples of the step-by-step operations of this type of processor consult MacWhinney (1987) or O’Grady (2005).
Sentence production involves the inversion of many of the operations that occur during comprehension. The basic steps are:
1. The speaker formulates an embodied mental model of an activity (McNeill 1979), focusing on the core verbal predicate (MacWhinney 2008c) and its associated nominal starting perspective.
2. Associated with the core predicate can be interactional markers that are often preposed or postposed to the core predicate.
3. Each predicate activates slots for its arguments in accord with IBPs.
4. Arguments may also trigger the activation of further modifiers and clauses and verbs may trigger the activation of adjuncts in accord with IBP structures.
5. As slots become activated, lexical items are activated to fill them.
6. Production begins with preposed interactional forms and topics, sometimes followed by pauses.
7. When the items linked into a phrasal group have all received lexical activation, the articulator can begin producing that phrase, while other processes continue on later phrases.
8. If some items are not ready in time, there can be pauses, retracings, or other disfluencies.
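The slot-filling competition in comprehension steps 6 and 7 can be made concrete with a small sketch. It uses the prepositional phrase attachment example from this section (the exohead of on in the man positioned the coat on the rack). The cue weights, the cue strengths and all names in the code are assumptions invented for illustration; the text names the cue types but gives no numerical values, and this is not the parser of MacWhinney (1987).

```python
from dataclasses import dataclass, field

# ASSUMPTION: illustrative weights for the four cue types named in step 6.
CUE_WEIGHTS = {"word_order": 0.4, "prosody": 0.1, "affix": 0.2, "lexical_class": 0.3}


@dataclass
class Candidate:
    """A word or phrase competing to fill an argument slot."""
    form: str
    cues: dict = field(default_factory=dict)  # cue name -> strength in [0, 1]

    def support(self) -> float:
        # Total cue support is the weighted sum of the cues that favour this candidate.
        return sum(CUE_WEIGHTS[name] * strength for name, strength in self.cues.items())


def fill_slot(candidates):
    """Step 7: the candidate with the most cue support wins the slot."""
    return max(candidates, key=lambda c: c.support())


# Two attachment sites compete for the exohead slot of "on" (strengths are invented).
exohead_candidates = [
    Candidate("positioned", {"word_order": 0.5, "lexical_class": 0.9}),
    Candidate("the coat", {"word_order": 0.8, "lexical_class": 0.4}),
]
winner = fill_slot(exohead_candidates)
print(winner.form, round(winner.support(), 2))
```

With these invented strengths the verb attachment wins narrowly; changing a single cue strength can reverse the outcome, which is the sense in which the alternative attachment sites are in competition.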
5. Generative power of IBPs
A central goal in child language research is the formulation of a model of grammatical learning that can simulate or “generate” the utterances produced by the child without also generating forms that are clearly improbable or divergent. Of course, the sentences that are actually recorded in a given set of transcripts may be an incomplete representation of all the forms that the child can produce. However, if the sampling is dense enough (Tomasello and Stahl 2004), they can be viewed as a reasonable approximation to what the child is actually capable of producing. Applying the concept of IBPs (MacWhinney 1975a), I examined the word order of 11,077 utterances produced by two Hungarian children, Zoli and Moni, between the ages of 17 and 29 months. I found that, across the various samples from the two children, between 85% and 96% of the utterances in each sample could be generated by a set of 40 item-based patterns. In the terms of computational linguistics, this is to say that the use of IBPs achieved a “recall” level of between .85 and .96. This analysis did not consider the “precision” or possible overgeneration of IBPs, because the semantic features on the argument slots were configured to make implausible overgeneration impossible. As we will discuss later, this conservative nature of IBPs is a major strength. Some examples of these patterns in English translation are: X + too, no + X, where + X, dirty + X and see + X. The IBP model was able to achieve a remarkably close match to the child’s output, because it postulates an extremely concrete set of abilities that are directly evidenced in the child’s output. Because of this, it does not suffer from the overgeneration problems faced by Pivot Grammar or the problem of finding a set of universal relational forms that can be applied to early combinations in all languages. The details regarding the ages and lengths of the recordings are as follows:
Table 1. Survey of data from two Hungarian children

Period     Age            Hours  Utterances  Mean Length
Zoli I     1;5, 2-5       4      51          1.10
Zoli II    1;6, 29-30     6      228         1.58
Zoli III   1;8, 6-8       8      2675        1.60
Zoli IV    1;10, 0-6      7      1911        1.87
Zoli V     2;0, 0-5       6      835         2.58
Zoli VI    2;2, 0-3       7      1826        2.50
Moni I     1;11, 18-27    8      1478        1.53
Moni II    2;2, 0-7       8      576         1.28
Moni III   2;4, 16-17     5      797         1.15
Moni IV    2;5, 20-23     8      700         1.03
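The coverage or “recall” figure reported in this section, the share of a sample's utterances that a set of item-based patterns can generate, can be sketched as follows. The pattern inventory reuses the English glosses quoted in the text (X + too, no + X, where + X, dirty + X, see + X). The sample utterances, the restriction to two-word utterances and the function names are assumptions made for the example; they are not the Hungarian data or the original procedure.

```python
# Each item-based pattern pairs a fixed operator with the position of its open slot.
IBPS = {
    ("too", "slot_before"),    # X + too
    ("no", "slot_after"),      # no + X
    ("where", "slot_after"),   # where + X
    ("dirty", "slot_after"),   # dirty + X
    ("see", "slot_after"),     # see + X
}


def generated_by_ibp(utterance: str) -> bool:
    """True if a two-word utterance matches some operator plus open slot."""
    words = utterance.lower().split()
    if len(words) != 2:
        return False  # ASSUMPTION: the sketch only handles two-word combinations
    first, second = words
    return (first, "slot_after") in IBPS or (second, "slot_before") in IBPS


def recall(utterances) -> float:
    """Proportion of observed utterances that the pattern set can generate."""
    if not utterances:
        return 0.0
    covered = sum(generated_by_ibp(u) for u in utterances)
    return covered / len(utterances)


# Invented sample; the last utterance has no matching operator.
sample = ["no bed", "see doggie", "milk too", "where ball", "dirty hand", "ball red"]
sample_recall = recall(sample)
print(f"recall = {sample_recall:.2f}")
```

The sketch leaves out the semantic features on the argument slot, which in the account given here are what keeps the patterns from overgenerating.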
In order to establish evidence for non-chance use of an IBP, we can use exact probabilities from the table of binomial probability distribution. For example, five identical occurrences of the same order of two equally possible outcomes (either XY or YX) reflects existence of a non-chance pattern at the p < .032 level of significance. Similarly, seven orders of one type out of nine trials occurs at the p < .02 level. Given a criterion level of p [...]

[...] 0.97 and slope (γ) around 1. Inspection of the construction verb types, from most frequent down, also demonstrates that the lead member is prototypical of the construction and generic in its action semantics. You may notice that some of these items, such as ‘come across N’, are phrasal verbs and have metaphorically extended their meaning (e.g. “I came across an interesting book”) while others like walk and run are more literal. We allowed our searches to capture this mixed bag because our goal was to analyse just what meanings were associated with the verb types that appeared in VACs that were simply operationalised at this first stage on the basis of their linguistic form.
Figure 1. Verb type distribution for V across N

Figure 2. Verb type distribution for VNN
Since Zipf’s law applies across language, the Zipfian nature of these distributions is potentially trivial. But they are more interesting if the company of verb forms occupying a construction is selective, i.e. if the frequencies of the particular VAC verb members cannot be predicted from their frequencies in language as a whole. We measure the degree to which VACs are selective in this way using a chi-square goodness-of-fit test and the statistic ‘1-τ’, where Kendall’s tau measures the correlation between the rank verb frequencies in the construction and in language as a whole. Higher scores on both of these metrics indicate greater VAC selectivity. Another useful measure is Shannon entropy for the distribution. The lower the entropy, the more coherent the VAC verb family. Scores on all these metrics are given for all VACs later in Table 3.
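The three selectivity measures named here can be sketched in a few lines. The example below is a minimal illustration under stated assumptions: it uses only five verbs with their V across N and BNC frequencies as listed in Table 2, computes tau without tie correction, and uses function names of our own choosing. It is not the authors' implementation and, being based on five verbs only, it will not reproduce the values in Table 3.

```python
import math

# verb -> (frequency in V across N, frequency in the corpus), from Table 2
FREQS = {
    "come": (474, 122107),
    "walk": (203, 17820),
    "cut": (197, 16200),
    "run": (175, 36163),
    "spread": (146, 5503),
}


def chi_square_gof(observed, reference):
    """Goodness of fit against the counts expected if verbs entered the
    construction in proportion to their frequency in the reference corpus."""
    total_obs, total_ref = sum(observed), sum(reference)
    expected = [total_obs * r / total_ref for r in reference]
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


def kendall_tau(xs, ys):
    """Kendall's tau-a from concordant and discordant pairs (no tie correction)."""
    concordant = discordant = 0
    n = len(xs)
    for i in range(n):
        for j in range(i + 1, n):
            sign = (xs[i] - xs[j]) * (ys[i] - ys[j])
            if sign > 0:
                concordant += 1
            elif sign < 0:
                discordant += 1
    return (concordant - discordant) / (n * (n - 1) / 2)


def shannon_entropy(counts):
    """Entropy in bits of the distribution defined by the counts."""
    total = sum(counts)
    return -sum((c / total) * math.log2(c / total) for c in counts if c > 0)


in_construction = [f[0] for f in FREQS.values()]
in_corpus = [f[1] for f in FREQS.values()]

chi2 = chi_square_gof(in_construction, in_corpus)
one_minus_tau = 1 - kendall_tau(in_construction, in_corpus)
entropy = shannon_entropy(in_construction)
print(round(chi2, 1), round(one_minus_tau, 2), round(entropy, 2))
```

A large chi-square and a large 1-τ both indicate that the construction's verb frequencies depart from what corpus frequency alone would predict, which is the sense of selectivity used in the text.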
Step 5 Determining the contingency between verbs and VACs

Some verbs are closely tied to a particular construction (for example, give is highly indicative of the ditransitive construction, whereas leave, although it can form a ditransitive, is more often associated with other constructions such as the simple transitive or intransitive). The more reliable the contingency between a cue and an outcome, the more readily an association between them can be learned (Shanks 1995), so constructions with more faithful verb members should be more readily acquired. The measures of contingency adopted here are (1) faithfulness (also known as “reliance”, Schmid and Küchenhoff 2013) – the proportion of tokens of total verb usage that appear in this particular construction (e.g. the faithfulness of give to the ditransitive is approximately 0.40; that of leave is 0.01), and (2) directional mutual information (MI Word → Construction: give 16.26, leave 11.73 and MI Construction → Word: give 12.61, leave 9.11), an information science statistic that has been shown to predict language processing fluency (e.g. Ellis, Simpson-Vlach, and Maynard 2008, Jurafsky 2003). Table 2 lists these contingency measures for the verbs occupying the V across N VAC pattern.

Table 2. Top 20 verbs found in the V across N construction pattern in the BNC

Verb      Constr. Freq.  Corpus Freq.  Faith.   MI Word → Construction  MI Construction → Word
come      474            122107        0.0039   15.369                  10.726
walk      203            17820         0.0114   16.922                  15.056
cut       197            16200         0.0122   17.016                  15.288
run       175            36163         0.0048   15.687                  12.800
spread    146            5503          0.0265   18.142                  17.971
move      114            34774         0.0033   15.125                  12.295
look      102            93727         0.0011   13.534                  9.273
go        93             175298        0.0005   12.498                  7.333
lie       80             18468         0.0043   15.527                  13.610
lean      75             4320          0.0174   17.530                  17.708
stretch   62             4307          0.0144   17.260                  17.442
fall      57             24656         0.0023   14.621                  12.287
get       52             146096        0.0004   11.922                  7.020
pass      42             18592         0.0023   14.588                  12.661
reach     40             21645         0.0018   14.298                  12.152
travel    39             8176          0.0048   15.666                  14.924
fly       38             8250          0.0046   15.616                  14.861
stride    38             1022          0.0372   18.629                  20.887
scatter   35             1499          0.0233   17.957                  19.663
sweep     34             2883          0.0118   16.972                  17.734
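As a check on how the Faith. column relates to the two frequency columns, the sketch below recomputes faithfulness for three rows of Table 2 and adds a one-way contingency score of the ΔP type. The ΔP function follows the standard definition, P(outcome | cue) minus P(outcome | no cue), and is offered as an illustration rather than as the authors' exact computation. The construction total of 4889 tokens is the figure the chapter gives for V across N; the corpus total is a placeholder assumption, so the ΔP value printed is not comparable to the published means.

```python
# verb -> (frequency in V across N, frequency in the corpus), from Table 2
TABLE2 = {"come": (474, 122107), "walk": (203, 17820), "stride": (38, 1022)}

CONSTRUCTION_TOKENS = 4889        # tokens of V across N, as stated in Step 7
CORPUS_VERB_TOKENS = 10_000_000   # ASSUMPTION: placeholder total, not a figure from the text


def faithfulness(constr_freq: int, corpus_freq: int) -> float:
    """Proportion of a verb's total usage that falls in this construction."""
    return constr_freq / corpus_freq


def delta_p_construction_to_word(constr_freq, corpus_freq, constr_total, corpus_total):
    """P(word | construction) - P(word | other contexts), from a 2x2 table."""
    a = constr_freq                      # word in construction
    b = constr_total - constr_freq       # other words in construction
    c = corpus_freq - constr_freq        # word outside construction
    d = corpus_total - constr_total - c  # other words outside construction
    return a / (a + b) - c / (c + d)


faith = {verb: round(faithfulness(*freqs), 4) for verb, freqs in TABLE2.items()}
delta_p_come = delta_p_construction_to_word(
    *TABLE2["come"], CONSTRUCTION_TOKENS, CORPUS_VERB_TOKENS
)
print(faith)
print(round(delta_p_come, 3))
```

The recomputed faithfulness values agree with the Faith. column to four decimal places. They also show why a low-frequency verb such as stride can be far more faithful to the construction than the much more frequent come.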
Step 6 Identifying the meaning of verb types occupying the constructions

Our semantic analyses use WordNet, a distribution-free semantic database based upon psycholinguistic theory which has been in development since 1985 (Miller 2009). WordNet places words into a hierarchical network. At the top level, the hierarchy of verbs is organized into 559 distinct root synonym sets (‘synsets’ such as move1 expressing translational movement, move2 movement without displacement, etc.) which then split into over 13,700 verb synsets. Verbs are linked in the hierarchy according to relations such as hypernym [verb Y is a hypernym of the verb X if the activity X is a (kind of) Y (to perceive is a hypernym of to listen)] and hyponym [verb Y is a hyponym of the verb X if the activity Y is doing X in some manner (to lisp is a hyponym of to talk)]. Various algorithms to determine the semantic similarity between WordNet synsets have been developed which consider the distance between the conceptual categories of words, as well as considering the hierarchical structure of the WordNet (Pedersen, Patwardhan, and Michelizzi 2004). Polysemy is a significant issue in working with lexical resources such as WordNet, particularly when analyzing verb semantics. For example, in WordNet the lemma forms move, run and give used as verbs are found in 16, 41 and 44 different synsets respectively. To address this we have applied word sense disambiguation tools specifically designed to work with WordNet (Pedersen and Kolhatkar 2009) to the sentences retrieved at Step 3.

The values on the metrics we have described so far are illustrated for the 23 VACs in Table 3. It can be seen that for all of the VACs, the type-token distribution is Zipfian (mean R² = 0.98) and that there is contingency between verbs and VACs (mean MIword-construction = 14.16) – particular verbs select particular constructions, and vice versa.
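The hypernym relation and the idea of distance-based similarity described in Step 6 can be illustrated with a toy hierarchy. Everything in the sketch is an assumption made for illustration: the seven-node hierarchy is invented and is not WordNet data, and the similarity function only follows the general form of a path-length score, -log(path / (2 × depth)). It does not call WordNet::Similarity and will not reproduce that package's values.

```python
import math

# ASSUMPTION: a tiny invented verb hierarchy, child -> hypernym (None marks a root synset).
HYPERNYM = {
    "stride": "walk",
    "walk": "travel",
    "run": "travel",
    "travel": None,
    "lisp": "talk",
    "talk": "communicate",
    "communicate": None,
}


def chain_to_root(sense):
    """The sense followed by its successive hypernyms up to its root synset."""
    chain = [sense]
    while HYPERNYM[chain[-1]] is not None:
        chain.append(HYPERNYM[chain[-1]])
    return chain


def root_of(sense):
    return chain_to_root(sense)[-1]


def path_length(a, b):
    """Nodes on the shortest path via the lowest shared hypernym, or None
    if the two senses sit under different roots."""
    chain_a, chain_b = chain_to_root(a), chain_to_root(b)
    for steps_up, node in enumerate(chain_a):
        if node in chain_b:
            return steps_up + chain_b.index(node) + 1
    return None


MAX_DEPTH = max(len(chain_to_root(sense)) for sense in HYPERNYM)


def path_similarity(a, b):
    """Shorter paths in a deeper taxonomy give higher similarity."""
    length = path_length(a, b)
    if length is None:
        return None  # simplification: no score across separate roots
    return -math.log(length / (2 * MAX_DEPTH))


print(root_of("stride"), path_length("stride", "run"))
print(path_similarity("walk", "run"), path_similarity("stride", "run"))
```

In this toy hierarchy walk and run are siblings under travel and so score as more similar than stride and run, which are one step further apart. Grouping senses by root_of is the kind of operation that underlies the per-root-synset columns of Table 3.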
Table 3. Values for our 23 Verb Argument Constructions on metrics of Zipfian distribution, verb form selectivity, and semantic coherence

VAC Pattern       R²    γ      Entropy  χ²      1-τ   Mean MIw-c  Mean ΔPc-w  Type ent.  Token ent.  Top 3  lch    res
V about N         0.98  -0.80  3.79     29919   0.74  15.55       0.011       3.17       2.42        0.45   0.162  0.271
V across N        0.99  -1.08  5.30     23324   0.77  15.49       0.003       2.75       2.08        0.25   0.194  0.353
V after N         0.99  -1.04  5.04     48065   0.69  12.87       0.002       3.33       2.12        0.31   0.103  0.184
V among pl-N      0.99  -1.43  5.36     9196    0.77  17.51       0.009       2.93       2.79        0.11   0.096  0.174
V around N        0.97  -1.17  5.51     40241   0.77  15.96       0.004       2.80       2.43        0.19   0.155  0.284
V as adj          0.96  -0.98  4.05     8993    0.76  17.88       0.020       3.20       2.48        0.34   0.078  0.141
V as N            0.99  -0.80  4.84     184085  0.87  10.36       0.003       3.55       2.56        0.25   0.079  0.146
V at N            0.97  -1.02  4.94     66633   0.79  12.51       0.003       3.23       1.72        0.36   0.099  0.185
V between pl-N    0.98  -1.08  5.17     47503   0.80  15.18       0.005       3.11       2.61        0.21   0.078  0.149
V for N           0.97  -0.79  5.58     212342  0.73  9.54        0.002       3.38       2.70        0.16   0.117  0.198
V in N            0.96  -0.96  6.22     61215   0.72  10.48       0.002       3.56       2.90        0.10   0.079  0.138
V into N          0.98  -0.82  5.22     82396   0.71  11.44       0.003       3.21       2.39        0.26   0.168  0.289
V like N          0.98  -1.08  4.80     12141   0.66  15.84       0.009       2.99       1.92        0.34   0.121  0.216
VNN               0.99  -0.84  3.79     51652   0.66  11.52       0.004       3.21       2.38        0.41   0.139  0.236
V off N           0.98  -1.29  4.89     10101   0.60  17.84       0.011       2.64       2.46        0.21   0.198  0.358
V of N            0.97  -0.76  4.26     319284  0.88  11.15       0.003       3.31       2.56        0.33   0.11   0.189
V over N          0.98  -1.08  5.95     77407   0.87  13.72       0.002       2.87       2.33        0.17   0.237  0.404
V through N       0.99  -1.11  5.37     29525   0.83  14.84       0.003       3.05       2.10        0.26   0.147  0.266
V to N            0.95  -0.92  5.02     25729   0.72  13.50       0.003       2.88       2.59        0.19   0.189  0.325
V towards N       0.98  -1.16  4.36     15127   0.78  19.59       0.017       2.68       2.35        0.31   0.149  0.274
V under N         0.97  -1.10  5.74     19244   0.70  13.13       0.002       3.07       2.54        0.16   0.14   0.248
V way prep        0.99  -0.83  3.61     29827   0.81  17.26       0.013       3.27       2.46        0.39   0.105  0.194
V with N          0.98  -0.96  5.59     192521  0.81  12.56       0.003       3.16       2.50        0.18   0.136  0.231
Mean              0.98  -1.00  4.97     69412   0.76  14.16       0.006       3.10       2.41        0.26   0.134  0.237

Type ent. = type entropy per root synset; Token ent. = token entropy per root synset; Top 3 = proportion of tokens covered by top 3 synsets.
Step 7 Generating distributionally-matched, control ersatz constructions (CECs)

Because so much of language distribution is Zipfian, for each of the 23 VACs we analyze, we generate a distributionally-yoked control (a ‘control ersatz construction’ [CEC]), which is matched for type-token distribution but otherwise randomly selected to be grammatically and semantically uninformed. We use the following method. For each type in a distribution derived from a VAC pattern (e.g. walk in V across N occurs 203 times), ascertain its corpus frequency (walk occurs 17820 times in the BNC) and randomly select a replacement type from the list of all verb types in the corpus found within the same frequency band (e.g. from learn, increase, explain, watch, stay, etc. which occur with similar frequencies to walk in the BNC). This results in a matching number of types that reflect the same general frequency profile as those from the VAC. Then, using this list of replacement types, sample the same number of tokens (along with their sentence contexts) as in the VAC distribution (e.g. 4889 for V across N) following the probability distribution of the replacement types in the whole corpus (e.g. walk, with a corpus frequency of 17820, will be sampled roughly twice as often as extend, which occurs 9290 times). The resulting distribution has an identical number of types and tokens to its matching VAC, although, if the VAC does attract particular verbs, the lead members of the
CEC distribution will have a token frequency somewhat lower than those in the VAC. We then assess, using paired-sample tests, the degree to which VACs are more coherent than expected by chance in terms of the association of their grammatical form and semantics. We show such comparisons for the VACs and their yoked CECs in Table 4.

Step 8 Evaluating semantic cohesion in the VAC distributions

The VAC type-token list shows that the list ordered by token frequency captures the most general and prototypical senses (come, walk, move etc. for V across N and give, make, tell for V N N), while the list ordered by faithfulness highlights some quite construction-specific (and low frequency) items, such as scud, flit and flicker for V across N. Using the structure of WordNet, where each synset can be traced back to a root or top-level synset, we compared the semantic cohesion of the top 20 verbs, using their disambiguated WordNet senses, from a given VAC to its matching CEC. For example, in V across N, the top level hypernym synset travel.v.01 accounts for 15% of tokens, whereas the most frequent root synset for the matching CEC, pronounce.v.1, accounts for just 4% of the tokens. The VAC has a more compact semantic distribution in that the 3 top-level synsets account for 25% of the tokens compared to just 11% for the CEC. We use various methods of evaluating the differences between the semantic sense distributions for each VAC-CEC pair. First, we measure the amount of variation in the distribution using Shannon entropy according to (1) number of sense types per root (V across N VAC: 2.75 CEC: 3.37) and (2) the token frequency per root (V across N VAC: 2.08 CEC: 3.08); the lower the entropy, the more coherent the VAC verb semantics. Second, we assess the coverage of the top three root synsets in the VAC and its corresponding CEC.
Third, we quantify the semantic coherence of the disambiguated senses of the top 20 verb forms in the VAC and CEC distributions using two measures of semantic similarity from Pedersen, Patwardhan and Michelizzi’s (2004) Perl WordNet::Similarity package: lch, based on the path length between concepts in WordNet synsets, and res, which additionally incorporates a measure called ‘information content’ related to concept specificity. For instance, using the res similarity measure, the top 20 verbs in the V across N VAC distribution have a mean similarity score of 0.35 compared to 0.17 for the matching CEC.
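The control procedure of Step 7, which supplies the CEC side of every comparison in Step 8, can be sketched as a two-stage sampling routine. This is an illustration under explicit assumptions, not the authors' code: the text does not define the width of a frequency band, so the ±25% band is an assumption, as is the decision to exclude verbs already in the VAC from the replacement pool. The frequencies for walk and stride come from Table 2; every other frequency in the toy corpus is invented.

```python
import random


def frequency_band(freq, tolerance=0.25):
    """ASSUMPTION: a band is +/-25% around the verb's corpus frequency."""
    return freq * (1 - tolerance), freq * (1 + tolerance)


def build_cec(vac_counts, corpus_freq, seed=0):
    """vac_counts: verb -> tokens in the VAC; corpus_freq: verb -> corpus tokens.
    Returns replacement verb -> number of sampled tokens."""
    rng = random.Random(seed)
    used = set(vac_counts)  # ASSUMPTION: VAC members are not eligible as replacements
    replacements = []
    # Stage 1: for each VAC type, draw a replacement type from the same frequency band.
    for verb in vac_counts:
        low, high = frequency_band(corpus_freq[verb])
        pool = sorted(v for v, f in corpus_freq.items() if low <= f <= high and v not in used)
        if not pool:
            raise ValueError(f"no replacement candidate for {verb!r}")
        choice = rng.choice(pool)
        used.add(choice)
        replacements.append(choice)
    # Stage 2: sample as many tokens as the VAC has, weighted by corpus frequency.
    n_tokens = sum(vac_counts.values())
    weights = [corpus_freq[v] for v in replacements]
    sampled = rng.choices(replacements, weights=weights, k=n_tokens)
    return {v: sampled.count(v) for v in replacements}


VAC = {"walk": 203, "stride": 38}
CORPUS = {
    "walk": 17820, "stride": 1022,                 # from Table 2
    "learn": 18000, "watch": 17000, "stay": 19000,  # invented frequencies
    "mutter": 1000, "squint": 1100,                 # invented frequencies
}
cec = build_cec(VAC, CORPUS)
print(cec)
```

Because the tokens are drawn in proportion to corpus frequency rather than construction frequency, the control keeps the number of types and tokens of the VAC but loses any attraction between particular verbs and the construction.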
5. Results
Our core research questions concern the degree to which VAC form, function, and usage promote robust learning. As we explained in the theoretical background, the psychology of learning as it relates to these psycholinguistic matters suggests, in essence, that learnability will be optimized for constructions that are (1) Zipfian in their type-token distributions in usage, (2) selective in their verb form occupancy, (3) coherent in their semantics. We show comparisons for the VACs and their yoked CECs on these aspects in Table 4.

Table 4. Comparisons of values for our 23 VACs and CECs on metrics of Zipfian distribution, verb form selectivity, and semantic coherence

Criterion dimension   Metric   Mean VACs   Mean CECs   t value for paired t-test (d.f. 22)

***=p [...]

[...] [RETAINER.oblique.PP[from] from a container ship that rammed the Bay Bridge].
(5’) In one instance [RETAINER.subject.NP a transformer] < target: LEAK.v-2a (Releasing) leaked> for almost two years before repairs were made.

(6’) [RETAINER.subject.NP The Union Carbide pesticide plant in the northern city of Bhopal] < target: LEAK.v-2b (Releasing) leaked> [FLUID.object.NP 27 tonnes of deadly methyl isocyanate gas].

(7’) [AGENT.subject.NP The secretary] < target: LEAK.v-3 (Revealing) leaked> [INFORMATION.object.NP details of the appointment] [RECIPIENT.oblique.PP[to] to the Washington Post].
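The labeled bracketing in these examples packs three pieces of information into each label: the frame element, the grammatical function and the phrase type, with the marking preposition optionally added in square brackets. A minimal sketch of how such labels could be unpacked into a record is given below, using the segments of example (7’). The class, field and function names are assumptions made for illustration and do not reflect FrameNet's actual data format or software.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Label shape seen in the examples: FE.function.PhraseType, optionally followed by [marker]
LABEL = re.compile(
    r"^(?P<fe>[A-Z_]+)\.(?P<gf>\w+)\.(?P<pt>[A-Za-z]+)(?:\[(?P<marker>\w+)\])?$"
)


@dataclass
class FEAnnotation:
    frame_element: str
    grammatical_function: str
    phrase_type: str
    marker: Optional[str]  # e.g. the preposition introducing an oblique
    text: str


def parse_segment(label: str, text: str) -> FEAnnotation:
    """Split a label such as 'RECIPIENT.oblique.PP[to]' into its parts."""
    match = LABEL.match(label)
    if match is None:
        raise ValueError(f"unrecognised label: {label!r}")
    return FEAnnotation(match["fe"], match["gf"], match["pt"], match["marker"], text)


# Segments of example (7'), target LEAK.v-3 in the Revealing frame
segments = [
    parse_segment("AGENT.subject.NP", "The secretary"),
    parse_segment("INFORMATION.object.NP", "details of the appointment"),
    parse_segment("RECIPIENT.oblique.PP[to]", "to the Washington Post"),
]
for seg in segments:
    print(seg.frame_element, seg.grammatical_function, seg.phrase_type, seg.marker)
```

Keeping the marker as a separate field while leaving the preposition inside the annotated text mirrors the point made about these examples: the segment shows both which expression realizes the FE and the syntactic means by which it is presented.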
It will be noticed that only the portions of the sentence which represent the target LU and its syntactic/semantic dependents are labeled. Since the purpose is to make obvious not only the identity of the FE-realizing expressions but the syntactic means by which such information is presented in the sentence, “markers” like prepositions (here to and from) are included inside the segments annotated. The reason for this is to show how the syntactic segmentation of the sentence expresses the semantic units. In the annotations just exemplified, only “core” FEs are shown. In addition to those FEs that seemed to be central to the meaning of the frame-bearing LUs, the project also annotated (and identified as such) two other kinds of FEs, which we referred to as peripheral and extrathematic. Peripheral FEs are phrases that give further information about the particular frame instance, but of the type often referred to as “circumstantial” – most typically expressions of Place, Time, Manner, etc., for frames in which those notions were not themselves considered core; in (5) the duration-indicating phrase for almost two years before repairs were made is of that kind. Phrases identified as “extrathematic” FEs convey information that is not strictly part of the framed event but provides information relevant to the communication act or gives information outside of the frame being highlighted; in sentence (6), the discourse-structuring phrase In one instance is
of that kind. Ideally, the FrameNet viewer will permit the user to view core FEs with or without the accompanying peripheral and extrathematic FEs.[21]

4. Grammar work-arounds
All of the valences of the verbal leak LUs, as with those of give seen earlier, can be expressed within the limitations of the BGE: in fact, for these words, and for most other content words taken one at a time, everything there is to say about their basic combinatorial properties can be said in terms of the nuclear syntactic functions plus various prepositional obliques. In general, when words are found in complex sentences, the combinatorial information needed to account for them would make use of exactly the properties that they show in simple sentences. It was hoped at the start that as new LUs and new frames are brought in, simple sentences will be found for all of them, and valence descriptions can be thoroughly illustrated using only features of the base grammar. Surely, it seemed, a really large corpus could allow that.[22] In the earliest years FrameNet workers were in fact instructed to limit the selection, analysis and annotation of examples to structurally simple sentences. For example, if FN wanted to record a transitive use of the verb read, a sentence like (8) would be welcome but one like (9) would be bypassed.

(8) My children read that book.

(9) Is this the book my children found so hard to read?
Sentence (8) straightforwardly shows read as a transitive verb, with book in direct object position as a typical collocate. In (9), by contrast, the collocational link between the verb read and its understood object is still there (though by way of relativization), but it has to be traced through a complex of grammatical processes that are not at all transparent. Sentence (9) could possibly be a good example of something, but it offers nothing relevant to our understanding of the lexical properties of the verb read.[23]

[21] “Ideally” because currently the mechanism for allowing toggling between the two views appears not to be working.
[22] The British National Corpus, approximately 100,000,000 running words (http://natcorp.ox.ac.uk).
[23] Nor, as it happens, would (9) be a particularly good example for hard or find.
One of the objectives in FN annotation is to include several sentences for each syntactic pattern, and although relative frequency would seem to be a high-level desideratum, it came to be outranked by the decision to include phrases with meanings that expressed something about the frame itself. Simple sentences with pronouns, though highly frequent, would not achieve that: sentences like I read it, it leaks, they gave me one, would not do for our purposes. Sentences illustrating read, we felt, should mention believable readers and possible reading material, like books, road signs, recipes, headlines, etc. In other words, we wanted the set of annotated sentences to include samples of collocations that contribute to showing the kinds of situations suggested by the frame.[24] If we wanted frame-relevant examples for each FE, that ought to include subjects, but in many cases lexical subjects – as opposed to pronominal subjects – tend to be rare in actual text corpora. That induced us to relax the example-selecting criteria to include the controller NPs of the subjects of VP complements. Thus, we could find a representation of children as readers in a sentence like (10), taking read as the target LU:

(10) Some of the parents didn’t want to let their children read your poem.
24. There was no chance of extending the work so that we could count on getting collocates by frequency, but we at least wanted to choose contexts in which the target word was topically in familiar surroundings. Fortunately, there are services that offer corpus-based collocation information: The Corpus of Contemporary American English (Davies 2008–), The Dante Corpus (Atkins, Kilgarriff, and Rundell 2010), and SketchEngine (http://www.sketchengine.co.uk/).
We continued the requirement that phrases that counted as representing a target LU’s FEs had to have some precise syntactic structural relation to the target itself, and that of controlling the subject role in an embedding context could be one of those. Since their children in (10) is not “the subject” of read in that sentence, we assigned to such externally realized FE-fillers the invented grammatical function name external; and in the end we used the same name for NPs that were actual subjects as well. This means, of course, that in a full analysis of sentence (10), the phrase their children would stand as the object of let in the layers dedicated to let and the “external” of read in the layers dedicated to read. As a means of displaying collocation patterns discovered in the annotations, we developed an ability to derive what we called Kernel Dependency Graphs (KDGs) from the annotations (Fillmore and Sato 2002). Each annotation-derived KDG was to show a cluster including a lexical predicator, with a set of labeled dependents, each branch labeled according to its FE and ending in the semantic lexical head of the phrase that instanced the FE. For a sentence like (11) a KDG would identify complain as the frame-bearing head, grandmother, hospital and mistreatment as the SPEAKER, RECIPIENT and TOPIC of the communication act classified as a complaint, and would additionally indicate the “marker” by which certain FEs were syntactically presented. (11)
My grandmother complained to the hospital about her mistreatment.
A representation of the KDG for (11) could be what is shown in Figure 1, simplifiable as the mock kernel clause [grandmother, complained, to-hospital, about-mistreatment]:
complain
    grandmother
    to: hospital
    about: mistreatment
Figure 1.
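The graph in Figure 1 can be pictured as a small data structure. The following is only a minimal sketch, not FrameNet's actual software: the class names `Dependent` and `KDG`, their fields, and the rule that the first dependent is placed before the predicator in the mock kernel clause are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Dependent:
    fe: str           # frame element label, e.g. "SPEAKER"
    head: str         # semantic lexical head of the phrase filling the FE
    marker: str = ""  # preposition marking the phrase; empty if unmarked


@dataclass
class KDG:
    lemma: str        # frame-bearing predicator, e.g. "complain"
    form: str         # form used in the mock kernel clause, e.g. "complained"
    dependents: List[Dependent] = field(default_factory=list)

    def kernel_clause(self) -> List[str]:
        # Marked dependents are rendered "marker-head", unmarked ones "head".
        items = [f"{d.marker}-{d.head}" if d.marker else d.head
                 for d in self.dependents]
        # Layout assumption: the first dependent precedes the predicator.
        return items[:1] + [self.form] + items[1:]


# The KDG for sentence (11), with the FE labels given in the text.
kdg_11 = KDG(
    lemma="complain",
    form="complained",
    dependents=[
        Dependent("SPEAKER", "grandmother"),
        Dependent("RECIPIENT", "hospital", marker="to"),
        Dependent("TOPIC", "mistreatment", marker="about"),
    ],
)

print(kdg_11.kernel_clause())
# ['grandmother', 'complained', 'to-hospital', 'about-mistreatment']
```

Keeping the marker as a separate field, rather than fusing it into the head, leaves the head word available for collocation counts while still recording how the FE was syntactically presented.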
Accumulating a repertory of KDGs from a corpus would enable the automatic derivation of information about lexical collocations, since we expect most reportable collocations to be found among elements of a single frame. The possibility of automatically extracting KDGs from keyword sentences in given documents would give information about the topics of passages and would possibly be useful for document routing, word sense disambiguation, information retrieval, and other natural language processing applications. 25 In the case of KDGs, which would include 25. This was imagined as a possible alternative to the relationship extraction techniques associated with the Resource Description Framework, http://www.w3.org/TR/PR-rdf-syntax/ “Resource Description Framework (RDF) Model and Syntax Specification”.
information about markers and head-words, the results did not have to be limited to simple subject-verb-object triples of the [dog, bite, man] type. This desire to have the KDGs represent hints about the content of the text led to a distinction between the syntactic head and the semantic head of a NP, and that in turn led us to the recognition of contexts where there was a discrepancy between the two: in particular, situations of support constructions and transparent nouns. The class of nouns we ended up calling transparent nouns includes those that are the head noun in a N-of-N pattern where it is not the phrasal head but the dependent noun that is collocationally related to the context external to the N-of-N phrase: examples are type, sort, kind, variety, species; pack, stick (of gum), wad (of tobacco); flock, herd, group; and many others. On encountering a sentence like (12), the KDG-recognizer would not do well to produce a KDG that connected [ majority, used, type, in-kind] but instead should offer [tobacco producers, used, arsenic, in-filters]. (12)
The majority of tobacco producers used a type of arsenic in this kind of filter.
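The treatment of transparent nouns in (12) can be sketched as a head-selection rule over an N-of-N chain. This is a hypothetical illustration rather than the KDG-recognizer itself: the function name `semantic_head` and the list-of-nouns input format are invented here, the noun set is only the subset named in the text, and majority is added on the strength of example (12).

```python
from typing import List

# Illustrative subset of transparent nouns taken from the text; "majority"
# is included because example (12) treats it as transparent.
TRANSPARENT_NOUNS = {
    "type", "sort", "kind", "variety", "species",
    "pack", "stick", "wad",
    "flock", "herd", "group",
    "majority",
}


def semantic_head(chain: List[str]) -> str:
    """Return the semantic head of an N-of-N chain, outermost noun first.

    Transparent nouns are skipped; the first non-transparent noun is taken
    as the semantic head. If every noun but the last is transparent, the
    innermost noun is returned.
    """
    for noun in chain[:-1]:
        if noun not in TRANSPARENT_NOUNS:
            return noun
    return chain[-1]


# The three N-of-N phrases in sentence (12).
print(semantic_head(["majority", "tobacco producers"]))  # tobacco producers
print(semantic_head(["type", "arsenic"]))                # arsenic
print(semantic_head(["kind", "filter"]))                 # filter
```

A noun outside the set, such as the hypothetical chain `["destruction", "city"]`, keeps its syntactic head as its semantic head.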
Support constructions are verb-noun or preposition-noun patterns in which the noun bears the frame and the verb or preposition provides a syntactic role for the combination. There are various ways of describing such structures, but for our purposes we only identify the frame with the noun and choose as the frame’s FEs those phrases that are syntactically related either to the noun or the syntactic governor. Thus in (13), the two participants of the fighting scene are Lidia and her sister, though we recognize Lidia as the subject of the support verb had and with her sister as a complement of fight; in (14), we recognize the two participants in the advising frame as participants in the semantic frame evoked by advice but as the subject and first object of the support verb give. Note that this does not entail any commitment to a semantic relationship between, e.g., advising and giving (and in fact FrameNet does not recognize the former as a subtype of the latter, even though give in one of its literal senses does evoke the Giving frame). (13)
Lidia had a fight with her sister.
(14)
Othmar gave me wise advice.
From the start we were comfortable with including among representative FE collocates the nouns of “extracted” NPs with wh-determiners, as in the which employees of (15), as well as the antecedents of relative pronouns, as in (16). The latter decision gave us something to label (the “antecedent” of
a non-existent pronoun) in bare relative clauses, as in (17). In other words, all of these (invented) sentences could be taken as supporting a KDG that connected [boss, fire, employee]. (15)
Which employees did your boss have to fire?
(16)
…the employees that your boss had to fire …
(17)
…the employees your boss had to fire …
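The idea that (15)–(17) all support the same KDG, and that a repertory of KDGs would yield collocation information, can be sketched as a simple count. The representation below (a predicator paired with a list of semantic heads) and the function name `collocation_counts` are assumptions for illustration; the data are just the invented sentences above plus sentence (11), not corpus results.

```python
from collections import Counter
from typing import List, Tuple

# Each entry is (predicator, semantic heads of its dependents).
Repertory = List[Tuple[str, List[str]]]

repertory: Repertory = [
    ("fire", ["boss", "employee"]),                             # (15)
    ("fire", ["boss", "employee"]),                             # (16)
    ("fire", ["boss", "employee"]),                             # (17)
    ("complain", ["grandmother", "hospital", "mistreatment"]),  # (11)
]


def collocation_counts(kdgs: Repertory) -> Counter:
    """Count how often each (predicator, dependent head) pair occurs."""
    counts: Counter = Counter()
    for predicator, heads in kdgs:
        for head in heads:
            counts[(predicator, head)] += 1
    return counts


counts = collocation_counts(repertory)
print(counts[("fire", "employee")])      # 3
print(counts[("complain", "hospital")])  # 1
```

Because the wh-question, the relative clause and the bare relative all reduce to the same kernel, each contributes to the same pair counts, whatever its surface syntax.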
Valence-bearing nouns, such as event nouns, already mentioned in connection with support constructions, also permitted annotation of possessive determiners as providing argument positions for the noun's frame, as in (18) [Steve, fire, secretary], including cases in which the possessive of the noun provided a controlling argument of a VP that is the complement of the noun, as in (19), yielding [CEO, fire, son]. (18)
… Steve’s firing of his secretary…
(19)
… the CEO’s decision to fire his own son …
As a result of this chain of decisions, FN annotations, originally intended to be built on top of syntactic parses,26 came to look less and less like syntactic parses. The full set of work-arounds led more and more to syntactically unrealistic descriptions. In the case of target LUs inside of relative clauses, we annotated both the relative pronoun and its antecedent, in order to identify both the syntactic role of the relativized element and the appropriate lexical collocate, as noted. For degree phrases modifying scalar adjectives we distributed the FE label Degree separately across the degree word and its non-contiguous complement ([degree so] hungry [degree that I couldn’t wait]), whereas a correct parse (I believe) would show the degree-modified adjective as a single constituent. In recognizing support constructions we noted that the support verb or preposition was the syntactic head of its phrase but the object nominal was the semantic head, while identifying phrases that were subjects (or objects) of the support as instancing FEs of the frame evoked by the noun. A proper formal grammar might have decided to “project” the frame structure onto the verb as its own valence and accept all FEs as syntactic arguments of the verb, but our software committed us to displaying FE-bearing phrases as linked to the targeted frame-bearing LU. In the case of existential sentences we simply identify there with the label Existential when annotating a noun like accident (there was an accident), but have no way of recognizing its own syntactic role in its sentence. For compounds and other kinds of multiword expressions, the existing annotation software offered no way of recognizing a constituent as a target of annotation and at the same time recognizing its components and their relation to each other or to the whole. Conjunction and the various kinds of ellipsis offered further challenges, requiring the annotators to “reuse” phrases in multiple annotation sets for the same sentence.
26. A somewhat parallel project in Germany (http://www.coli.uni-saarland.de/projects/salsa/) did its annotations by superimposing frame-semantic annotations onto parsed sentences; we did not find parsers that gave us the correct parse often enough to make that feasible for our data, and we lacked the technology to efficiently revise tree structures and labels.
5.
Toward constructions
A full accounting of the syntactic and semantic structure of a sentence requires the recognition of constructions that have meanings and functions of their own, beyond the capacity of the BGE and beyond the limitations of FN’s lexicographic software. Many familiar constructions – questions, commands, conditional structures, exclamations, etc., contribute pragmatic interpretations, but some constructions contribute meanings that parallel the kinds of meanings offered by lexical items. For example, comparative constructions evoke frames shared by verbs like exceed (‘being more than’), prefer (‘liking better than’), and out-run (‘running faster than’), while simultaneously incorporating the scalar semantics of the predications being compared. Existential structures introducing event nouns (as in there was an accident) can be seen as having a function borne elsewhere by support verbs like occur, take place or happen. Sometimes constructions and words have very similar functions. These new considerations required a reconceptualization of our work, as well as the software tools capable of supporting a newly defined annotation process. The basic annotation steps of segmentation and labeling are fortunately the same, and the existing layering structure for stand-off annotation made it possible to have both the lexical and the constructional annotations exist in a single set of layers.27 The need to extend our activities to grammatical constructions became clear when FN was challenged to provide annotations for complete texts; it 27. What we have not yet achieved is the means of integrating them into a genuinely articulated representation.
would be good to know if FN lexical annotations could contribute to text understanding in general, or could support inference-making processes. Now, of course, we could no longer even hope to limit ourselves to simple sentences and “good” examples: we had to say something about everything we found in a sentence. This meant, not only that we had to find FEs realized in unfamiliar places, but we also had to be able to describe the grammatical structures that put them there. In work in Berkeley on construction grammar,28 it was considered a discovery that some constructions can only be expressed in terms of complex tree structures (“extended families”, not just mother and daughters), but versions of construction grammar were developed that (for the most part) preserved “locality”, in the sense that constructions needed only to specify the properties of a constructed linguistic expression as built on the properties of its immediate constituents. Under these assumptions, the description of a construction characterizes the needed features of the daughter constituents and describes the semantic interpretation and combinatorial affordances of the mother. In the selection and annotation of instances of given constructions, it is useful to consider the context in which the constructed expression has a role. 
In practice this means, for example, that if a constructed phrase acquires a valence of its own, it is helpful to show how that valence is satisfied in its larger context; if a construction creates a plural NP, it would be helpful to choose a sentence that shows plural number agreement or that has the semantic features that reflect the plural number of the mother constituent.29 An important fact about the product of constructions is that the mother has properties that are not directly predictable from the constituents.30 If we assume that something like set intersection explains at least some instances of adjective modification, we have to agree that something special is needed in the case of modification with the adjective favorite.31 To understand the contribution of this adjective, let us compare relational and non-relational nouns and their modification by possessive determiners, adjectives, and following modifiers. Taking drink as a non-relational noun, we recognize that the NP the emperor’s drink shows only the simple possessive relation between the drink and the emperor: drink does not require mention of anything else. The word mother, by contrast, is a relational noun, where a possessive modifier identifies the relatum of the relation, on a par with a following of-phrase. That is, we find alongside of the emperor’s mother, the version the mother of the emperor. There is no corresponding *the drink of the emperor. A unique relational noun with a specifiable relatum32 requires completion by a definite determiner (like the or the possessive determiner) and explicit mention of the relatum; that relatum can either be communicated through the possessive or as a post-nominal of-phrase.33 Now consider the emperor’s favorite drink, and its alternate the favorite drink of the emperor. We can say that the adjective favorite introduces the idea of, say, ‘liked best by __’, and it dictates that the Experiencer of this preference is to be expressed as an argument of the resulting nominal, either as the possessive determiner or as a following of-phrase. The adjective creates, with its noun head, a relational noun. Nouns that are relational by themselves can also occur with the adjective favorite; when they do, they take on a different relatum. The word composer is a relational noun whose natural relatum is (in one of its senses) a musical composition. If the relatum is unspecific, we can speak of a composer, someone who writes music, or of the composer in my group of friends, the one who writes music.
28. See Fillmore (1985b, 1986, 1988, 2002), Fillmore, Kay and O’Connor (1988), Kay and Fillmore (1999) and Kay (1994).
29. For example, a conjunction like a gentleman and a scholar, or one like my husband and my lover (referring to a single individual), is likely to appear in predicate position, but NP conjunctions in argument positions should be compatible with contexts requiring plurality, e.g. My husband and my lover hate each other.
30. Since the same is true of the simplest phrase-structure rules, the aspects of construction grammar that preserve locality are consistent with phrase-structure rules with complex symbols as the nodes.
But when a specific relatum is understood, we find both the possessive determiner, as in the opera’s composer, or the definite article with a post-nominal of-phrase, the composer of the opera. When favorite is added to composer, however, it acquires the ‘like-best’ Experiencer argument, and we recognize pairs like my children’s favorite composer and the
31. The properties of this adjective are treated in somewhat different terms in Partee and Borschev (1999). 32. The “uniqueness” here applies to the head noun: one has just one mother. 33. The relatum argument can also be unexpressed-but-understood, as zero anaphora.
favorite composer of my children. The syntactic argument required by the adjective dominates over the relational structure that goes with the noun. A similar situation, where a constructed phrase has properties not determined by its head, is found in the case of degree modification of adjectives. There are some simple adjectives that take infinitival complements of both the “subject control” and the “non-subject control” type, and there are some that take that-complements. Examples can be seen in (20): (20)
a.
subject control: They are eager to leave.
b.
non-subject control: They are hard to read.
c.
that-clause: It is likely that we will fail.
d.
that-clause: He is sure that the job will be his.
There are constructions that take ordinary adjectives and combine them with a degree indicator yielding phrases to which those same features are assigned, and again, it is a valence requirement of the modifier that contributes new combinatorial possibilities to the phrase as a whole. The possibilities are exemplified in (21): (21)
a.
too + adjective > subject control: I am too tired to continue working.
b.
too + adjective > non-subject control: This is too hot to eat.
c.
adjective + enough > subject control: I am old enough to go to R movies by myself.
d.
adjective + enough > non-subject control: This is ripe enough to eat.
e.
so + adjective > that-clause comp: I am so tired that I can’t continue working.
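The contrast between (20) and (21) can be sketched as two lookup tables, one lexical and one constructional. This is a toy illustration only: the table names, the string labels for complement types, and the decision to add the modifier's contribution to whatever the adjective licenses lexically are all assumptions, not an analysis taken from FrameNet.

```python
from typing import Dict, Optional, Set

SUBJ = "infinitive, subject control"
NON_SUBJ = "infinitive, non-subject control"
THAT = "that-clause"

# Complement types assigned lexically to simple adjectives, as in (20).
LEXICAL_COMPLEMENTS: Dict[str, Set[str]] = {
    "eager": {SUBJ},
    "hard": {NON_SUBJ},
    "likely": {THAT},
    "sure": {THAT},
}

# Complement types contributed by the degree modifier, as in (21).
DEGREE_COMPLEMENTS: Dict[str, Set[str]] = {
    "too": {SUBJ, NON_SUBJ},
    "enough": {SUBJ, NON_SUBJ},
    "so": {THAT},
}


def phrase_complements(adjective: str, degree: Optional[str] = None) -> Set[str]:
    """Return the complement types available to the (possibly modified) phrase."""
    lexical = set(LEXICAL_COMPLEMENTS.get(adjective, set()))
    if degree is None:
        return lexical
    # Simplifying assumption: the modifier adds to the lexical options.
    return lexical | DEGREE_COMPLEMENTS[degree]


print(phrase_complements("tired"))                # set()
print(THAT in phrase_complements("tired", "so"))  # True
```

An adjective like tired has no entry in the lexical table, so the that-clause in (21e) is available only through the degree modifier.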
Why is this relevant to FN’s lexicographic past and grammatical future? Current FN associates a frame with a given LU and expects to find its FEs among phrases that are directly built around that LU. What is needed is the ability to attribute semantic and distributional properties to both words and constructed phrases while at the same time recognizing the internal structure of the phrases. In short, a combinatorial requirement that is assigned lexically to the LU mother is assigned constructionally to the phrase favorite drink; a property that is lexically assigned to the LU able is constructionally assigned to the phrase old enough. A constructed phrase, then, is a linguistic sign that is licensed by a construction, and a construct is a representation, in minimal tree form, of a mother and one or more daughter phrases. Put differently, in a construct, one phrase is labeled with a description of the “mother” constituent and the component phrases are labeled with descriptions of the “daughter” constituents, on the assumption that the properties of a construction-licensed phrase can be completely understood by understanding the independently known properties of the daughters and the rule-assigned properties of the mother.34 6.
Case study: A sample text
Before we describe the goals of the Constructicon database it might be valuable to ask whether non-canonical constructions are in fact frequent enough in text to deserve serious attention, or whether they are merely the marginalia of language that we could easily ignore. The text I have chosen for this initial exploration is a single paragraph from the article The Essential Tree, by Gordy Slack, in Bay Nature, Oct-Dec 2003. For convenience, I have numbered the text sentences separately. T1
Acorns may be California’s single greatest natural resource.
T2
An oak tree can bear more than 400 pounds of acorns a year.
T3
There are an estimated 1 billion oak trees in California.
T4
That’s hundreds of millions of pounds of nutrient that serves as the staple for more kinds of creatures than any other food source in the state.
T5
But the bulk of nutrients oaks churn out is only the beginning of their contribution.
T6
Oak trees form the organizational backbone of numerous habitats from coastal valley bottoms to highland meadows, providing food, shelter, and stability for whole communities of organisms.
34. This approach, intending wherever possible to preserve “locality” in the sense of Sag (2012), is in conflict with earlier proposals in Berkeley by which, for example, we freely recognized “extended family constructions”, by which the satisfaction of some requirement on a single construction had to be located deep down in the “tree” that corresponded to a construction’s realization. One of the lively issues in construction grammar theorizing, in fact, has to do with whether the formal devices used to explain the effects that led to extended-family explanations are worth the resulting formal “simplicity”. My own preference is to look at constructions functionally, i.e. in terms of their communicative purposes, and to not only seek explanations of the outer properties of the daughter of a construct but to say at the same time how they got that way. This can be thought of as studying the derivations themselves, rather than simply the individual rules.
T7
According to a 1997 University of California study, California’s oak woodlands harbor more biodiversity than any other major habitat type in the state:
T8
At least 4,000 kinds of insects inhabit them, along with 2,000 kinds of plants, thousands of fungi and lichens, 170 different birds, 60 amphibians and reptiles, and 100 different mammals.
The question we ask, in trying to understand how this text works, is what kinds of linguistic knowledge the writer has implicitly exhibited that go beyond the simple grammatical structures found in the Basic Grammar of English. We will go through the text, one sentence at a time, commenting on the most striking constructions noticeable in each sentence, to get a sense of the density of constructional information in an ordinary, fairly simple written text. We will be looking for the kind of grammatical constructions not found in the Basic Grammar of English that will need to be added to the FN Constructicon, and that will have to be accounted for in any full-scale grammar of English. T1
Acorns may be California’s single greatest natural resource.
Here we have a Superlative adjective modifying a nominal, in predicate position with the segments {UNIQUE-RELATIONAL [SUPERLATIVE single greatest] [HEAD natural resource]}. 35 The constructed phrase – the construct – is single greatest natural resource treated as a unique-relational nominal (each of its pieces being phrasal), and the possessive California’s is marked as satisfying the Restriction against which the NP is unique.36 In the case of the favorite+nominal pattern, the pre-nominal genitive alternated with a postnominal of-phrase, but with the superlative+nominal pattern, the world or situation in which the entity in question has its uniqueness is expressed either as a preceding possessive determiner, as here, or in a number of other ways, among them (a) a following PP (the largest hotel in California), (b) a relative clause of almost any variety (the biggest bear I’ve ever seen, the 35. In this paper representations of constructions and constructs will be done informally, in words, or with abbreviations involving two layers of bracketing, { } for the resulting expression, and one or more instances of [ ] enclosing the components. Superscripts will hint at the functions of the units so identified. 36. The adjective great is vague, but we can assume it is intended to mean abundant or important or the like.
largest whale recorded on film), (c) an infinitival relative clause (the oldest man to land on the moon). Several other structures can create uniquerelational nominals with the same properties, among them ordinals (first, second, etc., plus last), and the word only. Consider: the tallest/first/only man in the room; the oldest/first/only woman to walk on the moon; the youngest/first/only player I defeated. It is tempting to speak of the subject and the restriction in such predications as the two FE’s of the superlative construct’s frame, but the FN tradition has no way to accommodate this view.37 Sentence T1 uses the plural noun acorns as generic; the modal may expresses the tentative nature of the judgment; the phrase natural resource counts as an established compound, externally a noun; the superlative lexical form greatest shows one of the three ways of representing the superlative degree on an adjective: lexical (best, worst); suffixal (tallest, greatest); phrasal (most interesting, least important). T2
An oak tree can bear more than 400 pounds of acorns a year.
This sentence offers two important constructions: first, a variety of Comparative; second, a Rate construction. In this case the comparative construction approximates (through more than) a magnitude, instead of being used to contrast two situations. The precise extent of the anchor of that comparison (the complement of the preposition than), as it happens, is technically unclear, though without entailing a semantic difference. Various groupings are possible: is it more than 400 (greater than that number), more than 400 pounds (heavier than that weight), more than 400 pounds of acorns (greater than that quantity), or more than 400 pounds of acorns a year (higher than that rate)? The constituent structure of the final NP in each case is slightly different; I will arbitrarily choose the last grouping.38
37. The example in T1 is one of the simplest kinds of superlative constructions. It is paraphrasable in the manner of the comparative sentences in T4 and T7: i.e. ‘greater than any other natural resource in California’.
38. Arguments for constituent structure of rate expressions, whether linguistic or otherwise, are troubled. With I eat three apples every day we could treat every day as an adjunct to the whole VP; with three apples a day that won’t work, since a day looks like a NP rather than an adverbial; three apples every two days suggests a two-part ratio; but abbreviations of units of pressure (psi), speed (mph), fuel usage (mpg) suggest that the numerical values themselves should be separated out.
The formal expressions of comparatives are divided in the same way as those of the superlative: lexical (better, worse), suffixal (higher, greater) and phrasal (more important, less interesting). The Rate construction forms phrases with meanings based on ratios, as here, simplified as {RATE [NUMERATOR 400 pounds] [DENOMINATOR a year]}. The most common rate expressions offer two adjacent NPs, the first identifying a multiplicity measured in terms of one kind of unit, the second introducing another kind of unit, usually expressed with a singular indefinite determiner. Twenty dollars an hour, two milligrams a day, thirty miles an hour, fifty miles a gallon, four times a year, twenty cents a pound, etc., give standard ways of expressing wages, dosages, speed, fuel efficiency, frequency, and price-per-unit-measurement, and much more. In our sentence each component is itself phrasally built up from other constructions, but each component in a rate expression could be expressed with a single lexical item (once or twice, for example, for the “numerator”), and the unit that makes up the “denominator” can also be expressed lexically (each, apiece, annually, etc.). More complex rate expressions can be built up, exploiting the word per in the second daughter, as in 32 feet per second per second (acceleration), $400 per person per night (cruise cabin accommodations), and the like. In sentence T2 the singular indefinite NP (an oak tree) expresses the generic meaning, and the modal can expresses a potential. T3
There are an estimated 1 billion oak trees in California.
This sentence too offers two important constructions. The basic organization of sentence T3 is that of an Existential sentence of the there+be type. One kind of existential sentence proclaims (or denies) simple existence: there is a God, there are no unicorns; but the most common case introduces a Restriction (there’s a unicorn in the garden, there are lots of oak trees in California, there is nothing for me to do, there is someone waiting for you, there is something that troubles me). Existential sentences present an anomaly with respect to the concept of grammatical subject in English. A common test for subject-hood is number agreement, apparent when the finite verb reflects the number (singular or plural) of the subject (There is a unicorn in the garden, there are unicorns in the garden). Other tests of subject-hood include the ability to change places with a verbal auxiliary in a question (Is there a unicorn in the garden? Are there unicorns in the garden?), or to occupy the relevant position in “raising” or “catenative” structures (There seems to be a unicorn in the garden. By the time I get home I expect there to be no unicorns in the garden). By the first of these tests the NP is the subject; by the other tests the subject has to be the word there. Another kind of number agreement anomaly is found in the NP in this sentence (an estimated 1 billion oak trees); this is an instance of what we have called the Whopping construction. This construction has the unusual feature of licensing phrases made up of an indefinite singular article (here an), an adjective (here estimated) and a cardinal number (here 1 billion), where the whole functions structurally as a cardinal number. As such, unless the number is one, it is plural, selecting a plural noun.39 In the tripartite phrase {[an] + [estimated] + [1 billion]} none of these components is omissible. The name of the construction is based on the fact that the English adjective whopping seems to show a strong preference for precisely this structure. The adjectives are only those that can be taken as qualifying, or expressing the speaker’s reaction to, the number: a whopping seven million, a mere five, an amazing two billion, a paltry two million, an additional five. It is possible to see another similar “mystery” in English as essentially another instance of this phenomenon: the combination of the singular word another with a number greater than one: today we have to read another five pages, I need another two thousand dollars. These examples can be accommodated to the analysis just proposed if we recognize that the phrase another five pages is analogous to an additional five pages, with other treated as an adjective. In a better world the word another would be written with an internal space: an other. T4
That’s hundreds of millions of pounds of a nutrient that serves as the staple for more kinds of creatures than any other food source in the state.
The first construction to notice in T4 exemplifies the language of approximations with large numbers: tens of thousands, hundreds of millions, etc. The lexicalized powers-of-ten number words (ten, hundred, thousand, million, etc.) do not get pluralized when used for expressing precise numbers (three hundred and five, not *three hundreds and five); but versions of these words found in a context like T4 name groups or approximations. These words are translatable into French as the approximative number 39. We take the construction itself to describe a modified cardinal number, and regard the entire NP as a higher construct. Like other cardinal numbers, these expressions can occur without the noun in anaphoric contexts: we need five, we need a mere five.
words, dizaine, centaine, millier, etc. The phrase meaning ‘hundreds of thousands’ in French is des centaines de milliers and not *des cents de milles. A full story of these approximatives would point out that those higher than tens can be used directly to express quantities of objects (hundreds, thousands, millions, etc., of people). It seems we are blocked from saying tens of people by a preference for the word dozens. But we can, of course, say tens of thousands, tens of millions, etc. (Tens of hundreds is blocked for a different reason: we have the word thousands.) The phrase in T4 ending with the word nutrient is modified by a that-type Relative clause. That clause itself, however, illustrates a kind of Comparative construction quite different from what we saw in sentence T2. To show its semantic structure we first construct an abstract template in which Q stands for a quantifier.
X serves as a staple for Q types of creatures
We find in the sentence that the Q is the quantifier more, with its accompanying than-clause, that needs to fit into a comparison schema involving the subject (the X) of the host sentence. A comparison is being presented between various possible food sources that can fit the “X” slot in this template and the sentence claims that Q is largest when the nutrients provided by acorns appear in slot X, in comparison to any other food source in the state. We will soon see another sentence with analogous structure. T5
But the bulk of nutrients oaks churn out is only the beginning of their contribution.
This sentence shows a non-subject Bare Relative clause, oaks churn out; it could have been introduced by which or that. Characteristic of the bare relative is that the shared noun has to be a non-subject of the clause. T6
Oak trees form the organizational backbone of numerous habitats from coastal valley bottoms to highland meadows, providing food, shelter, and stability for whole communities of organisms.
This sentence shows mostly structures that are fully accounted for by the augmented BGE, but we could point out the rhetorical structure from X to Y where the import is to suggest the impressiveness of the range of some phenomenon. In ordinary contexts in which the content has to do with, say, travel across a distance, the two phrases could be exchanged (all the way to
Florida from Maine), or either could be omitted; but the purpose of this structure is different. The range can be geographical, as in the present case, with examples like from Maine to Florida, or it can include other possibilities: from A to Z, from soup to nuts. The point is that this rhetorical pattern requires the whole template, not the individual prepositional phrases one at a time. T7
According to a 1997 University of California study, California’s oak woodlands harbor more biodiversity than any other major habitat type in the state:
The comparative clause in T7 invites the template: X harbors Q biodiversity where Q stands for the degree of biodiversity, and its expansion is again more and its than-complement.
In T4 the comparison set was different food sources; here it is habitat types, and we are told that in California, Q in the formula is highest when X stands for the oak woodlands as opposed to other habitat types. Of the many kinds of comparative structures, there is an important distinction between those in which the elements of the comparison set are construed as fitting the subject of the sentence or some other constituent. The contrast can be seen by comparing (22a–b). (22)
a.
Lou eats more yogurt than Fran. (X eats Q yogurt)
b.
Lou eats more yogurt than flan. (Lou eats Q X)
The two sentences can be expanded as (23a–b), displaying the difference:
(23)
a.
Lou eats more yogurt than Fran does/than does Fran
b.
Lou eats more yogurt than he does flan.
In both T4 and T7, one could insert does between than and any other.40 The colon at the end of T7 reveals it as a generalization for which T8 fills in the details.
40. The full story of than-clauses in comparative sentences would of course explain the “polarity” effects that account for the word any. But we have seen enough.
T8
At least 4,000 kinds of insects inhabit them, along with 2,000 kinds of plants, thousands of fungi and lichens, 170 different birds, 60 amphibians and reptiles, and 100 different mammals.
There are several aspects of this sentence that might be worth noticing. The phrase at least has a semantic function very similar to that of more than seen in T1. But this phrase, taken literally, leaves open the possibility that the number at issue might be exactly 4000. The part of the sentence that begins with along is an amplification of the subject, a kind of elaborate conjunction. There are two cases in this sentence of a single quantifier having scope jointly over both members of a conjunction, suggesting that the entities conjoined deserve to be considered a single category: Thousands of fungi and lichens and 60 amphibians and reptiles. The collection of amphibians and reptiles, taken jointly, we are told, can total 60. Quantifying over a conjunction suggests that the two groups combine as a single category: I have seven brothers and sisters sounds better than I have seven brothers and daughters. This discussion of the grammatical phenomena that can show up in a single short passage that nobody would consider particularly dense or complex should serve to show something of the depth of the contribution of peripheral grammatical constructions to English text. None of the properties we noticed were part of the BGE. 7.
The FrameNet Constructicon
The FrameNet construction data base, or Constructicon,41 is intended as a publicly available resource that will include (a) a repository of construction descriptions, which characterize complex linguistic expressions in terms of the kinds of phrases or other entities that make up their components and show how the meanings of the components fit into the meaning of the whole; (b) a body of selected “good examples” of construction instances, annotated with respect to the properties of the mother phrase and the properties of the daughter phrases. This section offers a variety of construction types, beyond those noticed in the Oak text, ranging from simple to moderately complex.
41. Fillmore, Lee-Goldman, and Rhomieux (2012).
7.1
Pumping constructions
The simplest constructs are those with one daughter, in which the form of the mother is a copy of the form of the daughter: these are the various processes of zero derivation, abundantly present in English for verbs and nouns. These have been called unary or non-branching constructions, but following Ivan Sag, we refer to them as “pumping” constructions: the inner constituent is “pumped” to a constituent whose semantic interpretation includes reference to the meaning of the daughter constituent. As a simple example we begin with names taken as a class of nouns, and we describe the two unary constructions to account for the derivation of proper names and a specific type of common nouns from a name.42 The noun Harry as a name is found in a sentence like (24a); a use of this word to mean ‘the context-unique individual bearing the name Harry’ counts as a proper name (24b), and its use as a common noun meaning ‘a person having the name Harry’ is seen in (24c). (24)
a.
They named him Harry.
b.
Harry is late again.
c.
How many Harrys do you know?
d.
Harry was named after his paternal grandfather.
The proper name version is a singular referring expression, a definite NP on its own that does not welcome specification by a determiner; the common noun version has number and definiteness values determined by the syntactic or morphological constructs in which it appears. Evidence that a proper name has an internal analysis can be seen in (24d): an utterance of that sentence can be used to convey that Harry’s father’s father was also a Harry. The name Harry is an LU that exists in the language’s onomasticon. In this case it is a single morpheme whose status as a name exists from the start; other products of the name-creating system (presumably a finite-state machine more or less independent of the rest of the grammar) can include Lisa Wong, T. Harvey Metcalf III, and unlimitedly many others. The properties assigned to the mother of a proper name construct include maximality (it is a complete NP), definiteness (it functions as a definite NP), and a semantic description, something like ‘the context-unique entity bearing the name ___’. The speaker assumes the hearer knows the identity of the referent from the name. The bracketing formula means that if you insert into the square brackets something satisfying the condition of being a name, its participation in this construction allows it to function as a proper name, with the cluster of features associated therewith. Construction-internal rules guarantee that sex-specific names will pass the male vs. female specification to the mother.
42. The analysis proposed here differs from the treatment found in Sag (2012).
{PROPER_NAME [NAME __ ] }
Another unary construction based on a name “pumps” it to a common noun, non-maximal (requiring specification), with number and definiteness unspecified, having the meaning ‘person bearing the name ___’.
{CN [NAME __ ] }
There is yet another name-related unary construction, this time based on a proper name in its relation to a well-known referent (another Albert Einstein).43
{CN [PROPER_NAME __ ] }
As we have seen, there is a class of constructions that have exactly one daughter, where the function of the construction is to take the “original” sign and embed its meaning into that assigned to the mother LU.44 The experience of noticing that a word with one meaning has to be taken with a different – possibly surprising – meaning in an unfamiliar syntactic context makes it easy to believe that the context itself “coerces” the derived meaning, making a polysemy account unnecessary. The view taken for the FN Constructicon, however, is to assign both the derived meaning and a derived combinatoric to the mother LU. It may be, of course, that to the interpreter it is the surprising mismatch between the expected context and the new context that invites the interpretation, but the facts are just as well explained by a process that assigns derived semantic and combinatorial features to the mother LU. The FN Constructicon includes the numerous pumping constructions in the literature, including those that convert count nouns to mass nouns (animal-name to food-name) and vice versa (substances to types or portions), as well as the “applicative” constructions that create causative verbs out of non-causative verbs. In one such case the inside verb indicates the activity that resulted in the caused change (to run one’s shoes ragged, to sneeze the napkin off the table). In another, a ditransitive verb is based on a monotransitive, where the meaning of the original verb names a subordinate event (to slip someone a ten-dollar bill), and so on.45
43. Since places and institutions can also have names, there are analogous structures for non-human named entities. The proper_name to common_noun construction can produce not only another Einstein but also expressions like the Harvard of South India.
44. The asymmetry of the two senses, distinguishing the original and the derived senses is, I think, always clear. We know the mass noun chicken (meat) as the flesh of the animal, rather than knowing the count noun chicken (animal) as the source of the food.
7.2
Measurement adjectives
In the discussion of the names constructions, the constructions themselves were only hinted at, on the assumption that there was not much to point out. The existing Constructicon, viewable presently only through Hiroaki Sato’s FrameSQL,46 is the product of a brief exploratory grant from NSF,47 and is largely the work of Russell Lee-Goldman and Russell Rhodes. An example of a construction description, taken from the site, is that of Measurement_plus_adjective:
45. This accounting of verbal argument structure constructions follows the treatment given by Paul Kay (2005), rather than that of Adele Goldberg (1995).
46. http://framenet2.icsi.berkeley.edu/frameSQL/cxn/CxNeng/cxn00/21colorTag/index.html
47. NSF #0739426 SGER: ‘Beyond the Core: A Pilot Project on Cataloging Grammatical Constructions and Multiword Expressions in English,’ 2007–2008.
Measurement_plus_adjective
This construction creates an adjective phrase with two constituents. The first is a measurement phrase (e.g., three feet four inches, nine years), and the second is an Adjective. In English, the only possible adjectives indicate linear extent (tall, high, deep, wide, thick, long) and age (old). Long may also be used in measurements of temporal extent (three minutes long). High normally indicates not linear extent, but linear distance from some reference point. Deep may express either extent or distance (a ship buried four miles deep, a lake one mile deep). In the unmarked case, only neutral adjectives are used; in exceptional contexts, others like short, low, and young may be possible.
• The measurement phrase tends to have singular morphology when the construction is used attributively, even when the number is greater than one (six-foot-long pole), and have plural morphology when the construction is used predicatively, except when the number is one (that pole is six feet long).
• Note that the measurement phrase may contain a series of units (2 days 7 hours 15 minutes long), in which case they are always ordered from largest to smallest. For further information, see the Measurement Phrase construction.
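The number-marking tendency stated in the first bullet can be sketched as a small function. This is only an illustration of the regular pattern, not part of the Constructicon: the function name `measurement_plus_adjective` and the tiny `PLURALS` table are assumptions introduced here, and the corpus exceptions discussed in the text are deliberately not modelled.

```python
# Illustrative toy model of the regular number-marking tendency.
# PLURALS and the function name are hypothetical, not Constructicon data.
PLURALS = {"foot": "feet", "inch": "inches", "year": "years", "metre": "metres"}


def measurement_plus_adjective(number, unit, adjective, attributive):
    """Return a measurement + adjective string for one unit of measurement."""
    if attributive:
        # Attributive use: the unit stays singular even for numbers above one.
        return f"{number}-{unit}-{adjective}"
    # Predicative use: the unit is plural unless the number is one.
    unit_form = unit if number == "one" else PLURALS[unit]
    return f"{number} {unit_form} {adjective}"


print(measurement_plus_adjective("six", "foot", "long", attributive=True))
print(measurement_plus_adjective("six", "foot", "long", attributive=False))
print(measurement_plus_adjective("one", "metre", "thick", attributive=False))
```

The hyphenation of the attributive form is a spelling choice made for the sketch; the attested examples show both hyphenated and unhyphenated variants.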
The two contexts for the pattern, predicative and attributive, are seen as accommodating variants of the same construction, with the added information that assignment of the number value (singular or plural) is regular in the predicate position (six years old), but typically limited to the singular in attributive position (six-year-old child), following the typical manner of noun+noun compounds. However, since the corpus offers exceptions of both kinds (singular with plural nouns in predicate position, plural in attributive), the labels on the measurement phrase indicate whether they do or do not match the number of the relevant Entity. Externally to the construction itself, there needs to be an Entity, either as subject of the predication or head of the attribution; its presence is also indicated in the annotations, outside of the main bracketing. In the following examples, the bracket labels are M+A for measurement_plus_adjective, MMP for matching_measurement_phrase (i.e. plural if the number of units is plural), MMMP for mismatching_measurement_phrase, ADJ for adjective. Note that the construct itself is enclosed within curly brackets: external to that is an FE of the adjective’s frame.
(25)
This is a cross section of the landfill at Oakley – with a {M+A [MMP one metre] [ADJ thick]} [ENTITY clay lining] at the bottom, covered by a layer of gravel.
(26)
And you did this with a {M+A [MMMP three foot] [ADJ wide]} [ENTITY bucket]?
(27)
Llyn y Bi is a small {M+A [MMMP 3 metre] [ADJ deep]} [ENTITY lake] which lies at 445 metres in the Rhinogs of Snowdonia National Park, the most acidified mountain range in Wales.
(28)
The trough was now a {M+A [MMMP fifty yard] [ADJ wide]} [ENTITY torrent] of deafening crazy water, made even more awesome by the brilliantly revealing moonlight.
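The bracket notation of examples (25)–(28) is regular enough to be read mechanically. The following is a minimal sketch, assuming the notation is exactly a labelled construct in curly brackets, `{CXN [LABEL text] ...}`, with further `[LABEL text]` elements outside it; the function name `parse_construct` and the dictionary layout are illustrative assumptions, not FrameNet's actual data format.

```python
import re

# One labelled daughter or external element: [LABEL some text]
LABELED = re.compile(r"\[(\S+)\s+([^\[\]]+)\]")


def parse_construct(annotated):
    """Split an annotated string into the construct and its external elements."""
    match = re.search(r"\{(\S+)\s+(.*?)\}", annotated)
    if match is None:
        raise ValueError("no construct in curly brackets found")
    name, inside = match.group(1), match.group(2)
    # Everything outside the curly brackets may hold construction-external FEs.
    outside = annotated[:match.start()] + annotated[match.end():]
    return {
        "construction": name,
        "daughters": [(label, text.strip()) for label, text in LABELED.findall(inside)],
        "external": [(label, text.strip()) for label, text in LABELED.findall(outside)],
    }


example = "with a {M+A [MMP one metre] [ADJ thick]} [ENTITY clay lining] at the bottom"
result = parse_construct(example)
print(result["construction"])
print(result["daughters"])
print(result["external"])
```

For example (25) this separates the two daughters of the M+A construct from the ENTITY element, which the text describes as lying outside the main bracketing.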
Competing with constructions using measurement adjectives are constructions that use measurement nouns. The patterns include three feet in depth, half a mile in width, six years of age, and so on. 7.3
Adjectives as nominal
We have seen cases in which a phrase acquires properties that are otherwise shared by a single word (favorite shoe, etc.), but there are also cases in which the grammatical category of the result does not match that of the lexical head. There is a family of constructions in which the juxtaposition of the definite article with an adjective, in the absence of a noun, results in a maximal NP. One of the most important of these is seen in phrases like the poor, the rich, the young, the old. It is wrong to think of these as instances in which an adjective is “used as a noun”, since the adjective in question permits modification of the kind allowed for adjectives: the very poor, the morbidly obese, and the like. The general form of these constructs is {NP:GHP [the ] [AdjP ]}. The constraint on the second component is that the adjective phrase must be the kind that reasonably categorizes classes of humans, and the interpretation of the whole includes the features Generic, Human, Plural (GHP). The phrase the very rich cannot refer to a specific single individual.
Adjective_as_nominal.people
A noun phrase denoting the generic set of people with a particular property is formed from the Definite_determiner the and an Adjective_phrase that identifies that property. The noun phrase so created is grammatically plural.
(29)
The next morning orders came through that {NP:GHP [DEF the] [ADJ able-bodied]} were to begin the trek to India ... .
(30)
The poverty statistics are cited as further support for the view that {NP:GHP [DEF the] [ADJP elderly]} are becoming better off in relation to the rest of society.
7.4
Reciprocal “best friends”
The plural form of a non-maximal nominal that designates a potentially symmetrical personal relation (like friend, cousin, co-worker, neighbor, but not daughter or boss) can be used to create a symmetrical predicate whose arguments appear either as a conjoined or plural NP covering both (or all) terms of the relationship, or distributed over two constituents, one the subject and one an oblique phrase with a preposition, here with (compare Sue and I collaborated and Sue collaborated with me). The nouns available for this construction are characterized by the defined relation (next-door neighbors, second cousins, best friends), and when one side of the relation appears as a singular subject, the structure gives the appearance of a number mismatch, as in the (b) version of (31). The preposition with is assigned by the construction itself, not by the noun: in other contexts the complement of friend is marked with of or to. (31)
a.
Julie and I were best friends in high school.
b.
I was best friends with Julie in high school.
The FrameNet Constructicon names this the Be_recip construction, defined as:
Be_recip
• A plural N’ whose head (the Head_noun) is a term of personal relation (e.g. friend, coworker) is used as a reciprocal predicate.* The head noun in the N’ can only be modified by Modifiers that characterize the relation denoted by that noun (e.g. good friends, close friends, but not rich friends).
• There are two valences of the construction: (1) an asymmetrical one in which one side of the relation (Individual_1) is the external argument and the other (Individual_2) is expressed as a PP-with; (2) a symmetrical one in which all the parties (the Individuals) are expressed in the external argument. In valence (1) there may be an apparent number mismatch between the external argument and the morphologically plural head noun, because the construction produces a predicate that does not have any number specification.
• In some cases frame elements belonging exclusively to the head noun may be expressed (e.g., friends from Uni), but we do not indicate these, as their presence is predictable from the regular valence of the Head_noun. Although the presence of an Individual_2 is also predictable, its expression as a PP-with is not (be friends with but is a friend of/to).
• The Modifier, when present, is also a frame element of the Head_noun, but the construction semantically constrains it to characterize the relation (see above). …
*NB: This construction evokes the Reciprocality frame. The type of relation which the Individuals are in is indicated by the head noun of the N’.
Annotation of the construction indicates the relationship-noun and any modifiers, along with the construction-external constituents that represent the people in the relationship. (32)
[INDIVIDUAL_1 Sally] used to be {[MODIFIER good] [HEAD_NOUN mates]} [INDIVIDUAL_2 with Zaria], didn’t she?
(33)
[INDIVIDUALS Russell and Michael] are {[MODIFIER close] [HEAD_NOUN friends]}.
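The two valences of Be_recip can be made concrete with a small generator that emits annotation skeletons in the bracket style of (32) and (33). This is a sketch under stated assumptions: the function name `be_recip` is hypothetical, the copula is left as uninflected `be` because agreement lies outside the construction as described, and nothing here reflects how the Constructicon itself stores annotations.

```python
def be_recip(head_noun, individuals, modifier=None, asymmetrical=False):
    """Return an annotation skeleton for one valence of Be_recip."""
    parts = []
    if modifier is not None:
        parts.append(f"[MODIFIER {modifier}]")
    parts.append(f"[HEAD_NOUN {head_noun}]")
    predicate = "{" + " ".join(parts) + "}"

    if asymmetrical:
        # Valence (1): one party is the external argument, the other a PP-with.
        if len(individuals) != 2:
            raise ValueError("the asymmetrical valence relates exactly two parties")
        first, second = individuals
        return f"[INDIVIDUAL_1 {first}] be {predicate} [INDIVIDUAL_2 with {second}]"

    # Valence (2): all parties are expressed together in the external argument.
    return f"[INDIVIDUALS {' and '.join(individuals)}] be {predicate}"


print(be_recip("mates", ["Sally", "Zaria"], modifier="good", asymmetrical=True))
print(be_recip("friends", ["Russell", "Michael"], modifier="close"))
```

Note that the head noun stays plural in both outputs, which is the apparent number mismatch the description mentions for valence (1).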
7.5
Portions and Multiples
There are expressions by which a portion of some set or substance is indicated with a quantifier followed by an of-phrase. Examples are some of the water, much of the beer, most of the voters, many of your friends, half of the children, two thirds of the population. In the last case the word population is taken as referring to the people who inhabit a geographic unit. There is another pattern, without of, in which both fractions (a half, two thirds, etc.) and multiples (twice, three times, etc.) are used, that takes the magnitude represented by the second noun and represents a fraction or multiple of that (you’re half my height, it’s almost two thirds the weight of my piano, I’m four times your age, that’s twice the population of India). In the last case, the word population refers to a magnitude, not people. These distinct patterns allow the following contrasts: if I am your assistant, I can complain that in this job I get half your salary, or if I am your agent I can express satisfaction that I get half of your salary. The pattern
with of, of course, is not available for the multipliers: I get twice your salary, but not *twice of your salary. 7.6
Conjunctions and Conjunction Reduction
A particularly troublesome class of constructions is found in those that license omitted material from within a phrase or sentence built through any of the various conjunction processes. Two important types are Gapping and Shared-Completion, exemplified in the following invented pairs: Gapping:
Kim loves Pat, and Pat, Lou. ([Kim loves Pat] and [Pat, __ Lou].)
Shared Completion: Kim loves, and Pat hates, Lou. ([Kim loves, __ ] and [Pat hates, Lou].)
A natural pronunciation of the first would destress loves and pause slightly after the second Pat, stressing every noun. A natural pronunciation of the second would (in this context of exclusively monosyllabic words) accent each word in each conjunct, and a pause would be introduced after hates.
Gapping
Inherits Coordination.
• The construction contains two or more conjuncts, which can be linked by a Conjunction (e.g. and). When there is a Conjunction, the normal conjunction rules apply.
• Each conjunct contains a Before and an After. All of the Befores correspond to each other semantically and all of the Afters correspond to each other semantically.
• All conjuncts other than the final one can contain a Gapped_portion, which directly follows the Before and directly precedes the After. The first conjunct must contain the Gapped_portion. If multiple conjuncts contain the Gapped_portion, it is the same in all of them.
• The Gapped_portion can be omitted from all conjuncts other than the first one. It must be omitted from the final conjunct. If the Gapped_portion is omitted then the Before directly precedes the After and the conjunct is interpreted as though the Gapped_portion were present between the Before and the After.
• The conjuncts may be separated by Punctuation.
• It is one of those places in which [Before perspectives] [Gapped_portion are] [After restored] [Punctuation ,] [Conjunction and] [Before anxieties] [After erased].
Shared_Completion
Called Delayed Right Constituent Coordination in the Cambridge Grammar of the English Language. Construction includes examples of what is often called Right Node Raising.
• The construction consists of two, or more, Sharers linked by one, or more, Connector(s). The final conjunct is followed by the Completion. Connectors are typically conjunctions (e.g. and, or). When there are multiple Connectors or there is a single Connector and it is a conjunction, there can be more than two Sharers. If there is a single Connector and it is a conjunction the normal conjunction rules apply. When the Connector is not a conjunction, there can only be two Sharers. The Completion is interpreted as completing each of the Sharers individually.
ex.: We can [sha remind ourselves of] [pun ,] [con and] [sha help our children to realise] [pun ,] [com the need at all times for compassion].
A general description of Gapping structures is that the second conjunct has something missing in the middle, typically (or by definition) a portion of a VP, and the interpretation can be given by a paraphrase that copies the gapped portion into the appropriate place in the second conjunct; a general description of the Shared Completion structures is that each conjunct has something missing on its right edge, and the interpreting paraphrase copies the completion onto the end of the first conjunct. The annotational challenge for each of these is to represent all of the pieces needed for the reassembly while accurately displaying the constituent structure of the expression as a whole. For these constructions the FN annotations seem to be busier than they have to be, and in order to show the reader all of the components that need to be reassembled to derive an interpretation, the annotations in the Constructicon label all of the pieces by shameless brute force. Since they are variable in the possible structures to which the labels are attached, the labels are relevant only to these constructions. For Gapping, they are the following:
Before + Gap-filler + After + connector + Before + After
Here the interpretation is that a relevantly adjusted repetition of the gap-filler is introduced between the final Before and After: the final Before + After is to be expanded, semantically, as Before + Gap-filler + After, yielding and Pat loves Lou.
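The reassembly described for Gapping can be sketched in a few lines. This is a hypothetical illustration, not the Constructicon's procedure: the function name `expand_gapping` and the dictionary keys are assumptions, and the "relevant adjustment" of the repeated gap-filler (for instance, agreement) is not modelled, so the gap-filler is copied unchanged.

```python
def expand_gapping(conjuncts):
    """Copy the gap-filler of the first conjunct into conjuncts that omit it.

    Each conjunct is a dict with 'before' and 'after', plus 'gap' where the
    gapped portion is overtly present.
    """
    if "gap" not in conjuncts[0]:
        raise ValueError("the first conjunct must contain the gapped portion")
    gap = conjuncts[0]["gap"]
    # Before + Gap-filler + After for every conjunct, overt or reconstructed.
    return [f"{c['before']} {c.get('gap', gap)} {c['after']}" for c in conjuncts]


clauses = expand_gapping([
    {"before": "Kim", "gap": "loves", "after": "Pat"},
    {"before": "Pat", "after": "Lou"},
])
print(clauses)  # ['Kim loves Pat', 'Pat loves Lou']
```

Returning a list of clauses rather than a rejoined sentence keeps the sketch independent of the ordinary conjunction rules, which the construction inherits rather than defines.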
For Shared Completion, recognizing the possibility of paired connectives (neither...nor, etc.), the components are these:
(connector) + Sharer + connector + Sharer + Completion
The interpretation then inserts the completion after the first Sharer. For ease of reading, the following examples have the components bracketed but not labeled, enclosing the connectors with angle brackets < >. Punctuation is ignored.
Gapping Examples
(34)
A popular participatory democracy is a system in which {[decisions] [are] [taken], [policies] ___ [made]}, as a result of the widest possible free and open discussion.
(35)
{[he] [adores] [Mama], [she] ___ [him]}.
(36)
{[productivity and satisfaction] [can be] [increased], [absenteeism] ___ [decreased]}, simply by reinforcing group attractiveness and cohesion.
(37)
{[A couple of bedrooms] [overlook] [Loch Ness], [others] ___ [the village and the Caledonian Canal]}.
Shared Completion Examples (38)
And it allows companies { [to contribute to] ___ [benefit from] [election campaigns]}. (conjoined infinitive VP)
(39)
The remit which I set myself is to examine in outline the ways in which the political parties have { [responded to] ___ [helped to shape] [the pertinent social collectivities in Britain]} over the post-war years. (conjoined perfect participial VP)
(40)
It is a matter of their input systems being tuned to the contours of {[this physical] ___ [this social] [reality]}. (conjoined NP)
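The interpretation of Shared Completion, in which the Completion completes each Sharer individually, can be sketched in the same spirit. The function name `expand_shared_completion` is an assumption made for illustration, and connectors and punctuation are left out, as in the bracketed examples.

```python
def expand_shared_completion(sharers, completion):
    """Attach the shared Completion to the right edge of every Sharer."""
    if len(sharers) < 2:
        raise ValueError("Shared Completion needs at least two Sharers")
    return [f"{sharer} {completion}" for sharer in sharers]


print(expand_shared_completion(["Kim loves", "Pat hates"], "Lou"))
# ['Kim loves Lou', 'Pat hates Lou']
print(expand_shared_completion(["this physical", "this social"], "reality"))
# ['this physical reality', 'this social reality']
```

The second call corresponds to the conjoined NP in (40), where the Sharers are not clauses but partial noun phrases.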
The two Ellipsis constructions combined. Many years ago I found myself puzzled by a sentence found in a zoology textbook, (41)
Bears have become largely, and pandas entirely, noncarnivorous.
This sentence is the product of both the Gapping and the Shared Completion construction, but it is not clear that either of them could be discovered automatically. The final adjective, noncarnivorous, is shared by both largely and entirely as its preceding context, but at the same time have become is missing in the second conjunct. I was happy to assume that it was an extremely rare possibility, but in the work of annotating examples with respect to these constructions, several new examples were found. In example (42), (a) shows the Shared Completion labels, and (b) shows Gapping labels.
(42)
Friends might be appalled, and employees embarrassed, at how little Laura resisted Bernard’s will.
a.
{[Friends might be appalled] ___, [employees embarrassed] [at how little Laura resisted Bernard’s will]}.
b.
{[Friends] [might be] [appalled], [employees] ___ [embarrassed]} at how little Laura resisted Bernard’s will.
Here separate annotations for the two constructions show the constituents needed for the reconstruction.
8.
Summary
The original motives behind the work of building a Constructicon were complex. One reason was to demonstrate the impossibility of counting on the core grammar of a language to provide enough structure for building the interpretations of sentences out of the sentences’ lexical structure, however detailed that might be. In many cases, e.g. that of noun modification with the adjective favorite, it is likely that expanded views of lexical information can be developed that explain all of the relevant phenomena. But I think we have seen that there are many other cases where complex grammatical structures bring problems of generation and analysis that cannot be anchored in individual lexical items. Another reason was to explore the density of “special” grammatical constructions in ordinary text, possibly even to inquire into the possibility of developing grammatical profiles of individual texts, the output of specific writers, the nature of varieties of academic discourse, the difference between spoken and written language, and so on. And a third reason was to contribute to the writing of a grammar of English. It is the last of these motives which strikes me as the most despairing. I now think that the work of annotating sentences for their grammatical structure (one construction at a time) can be useful, as long as it claims no more than to have found the constituents that need to be attended to when writing a grammatical account, but annotators doing this work cannot bear the burden of being responsible for a complete understanding of the underlying grammatical reality. That is, it is one thing to find the critical elements of a comparative sentence, but it is something altogether different to try to understand how the grammar of comparatives actually works. Similarly, annotation of lexical units and their relevant contexts is not the same as lexicography, though in both cases I believe that a body of annotated data could be useful for lexicographers and grammarians, if only as a way of assembling a wide enough collection of relevant examples to keep in mind.
9.
Conclusion
To conclude, one could say that a view of language in terms of constructions offers a number of challenges to areas such as compositional semantics, parsing, computational lexicography, grammar checking, grammar writing, text analysis, language learning, and the cataloguing of constructions.
(1) The challenge to compositional semantics: It is misleading to characterize grammatical constructions, idioms and fixed expressions as “noncompositional”. Once we know what a fixed expression means, or what effect a construction has on the meaning of the phrases it licenses, it can figure perfectly well in the compositional process. Cases of “meaning” that go beyond compositionality belong to pragmatics, where the reasoning involves more than conventions attending linguistic form. For our purposes, the challenge to compositionality is in the need to discover and describe the actual meaning contributions of the specific constructions. This often means recognizing that the “core” processes don’t work and that special features of linguistic form need to be recognized, as in the poor, the young, the grossly obese – it’s no bigger than a golf ball – time was when men were men.
(2) The challenge to parsing: Most parsers are trained on sentences that exhibit fairly general grammatical structures. The constructions that cause problems for parsers are those that allow the omission of important “anchoring” sentence constituents – like, for example, their main verbs! One problem is that parsers are built on the written language, and except for punctuation, the written language offers few clues to the structure: intonation and pausing in the spoken language often make the structures quite clear.
(3) The challenge to computational lexicography involves two issues: first, when to describe something as a lexical unit with specific contextual requirements, and when to describe the phrase or structure that contains the word; second, when to take corpus evidence as lexicographically relevant. Patrick Hanks distinguishes “norms” and “exploitations of norms”.48 Corpus lexicographers sometimes insist that the lexicographer’s job is to describe words as they’re used in real text, and not to editorialize. But some words-in-context are best seen as “guests” in alien syntactic environments and we should not have to describe those uses separately. We can think of certain words as “at home” in a particular syntactic environment, serving as models for words that can appear as “guests” in those environments. Thus verbs such as get or move can be seen as “hosts” in the caused motion construction, whereas sneeze or shake can be seen as “guests”:

I got/moved the Kleenex off the table.
I got the sand out of the sleeping bag.
I sneezed the Kleenex off the table.
I shook the sand out of the sleeping bag.

We don’t really want to list ‘cause to move by sneezing on’ as a sense of sneeze, or to define this use of shake as ‘cause to move by shaking something’.

(4) The challenge to grammar checking: Suppose we had a really serious grammar checker, something that would let you know if what you have written is really grammatically acceptable in the language: Would it be able to recognize certain patterns as grammatical that might look to be ungrammatical? Would it be able to recognize something that looks ordinary when we know it is special?

(5) The challenge to grammar writing: The FrameNet Constructicon is not to be seen as a grammar; first, because at present we do not include those structures that familiar parsers handle perfectly well, and second, because we are presenting examples of phenomena that any construction grammar needs to account for without taking a stand on which of these does the best job. In general our goal is to make the analyses maximally compatible with “Sign-Based Construction Grammar” (SBCG) associated with the work of Ivan Sag, Paul Kay, Laura Michaelis, and Charles J. Fillmore and laid out in Boas and Sag (2012).49

the boundaries of a construction
a general treatment of polarity
lexical versus phrasal treatments
apparent “extended family constructions” (the problem of locality)

(6) The challenge to text analysis: It is impossible to go through a text and simply underline or highlight all the constructions, collocations and fixed expressions. They overlap and contain each other. For instance, it is not true that every instance of “no + comparative” belongs to the same construction. Sometimes “no” means ‘not even’ and sometimes it means ‘not one’. Corpus sampling produces two kinds of contexts:

no bigger than a postage stamp/thumb/pin, …
there’s no bigger job/obstacle to peace/challenge

(7) The challenge to language learning: Some things are hard for nonnative speakers to learn: manner verbs (e.g. verbs of walking style: waddle, stagger, swagger, slink, plod, trudge, …), collocations (“lexical functions” in the Mel’čukian sense), richly contexted situation formulas (it takes one to know one), image-directed phraseology (sechs Minuten vor dreiviertel sechs [six minutes before a quarter to six]), etc. For foreign-language pedagogy: what’s necessary for Global English? (Kernerman).50

48. See, for example, Hanks (2013).
49. See Michaelis (2012), Kay and Sag (2012), Fillmore, Lee-Goldman and Rhomieux (2012).
50. Probably a reference to http://dictionary.reference.com/help/kdict.html.
(8) The challenge of cataloguing constructions. The biggest problem for a database such as FrameNet may be to classify constructions in such a way that users can find them. There are thousands of special constructions, and they can’t all be given transparent names.

References

Atkins, Beryl T. S., Adam Kilgarriff, and Michael Rundell 2010 Database of ANalysed Texts of English (DANTE): The NEID database project. Proceedings of the Fourteenth EURALEX International Congress, EURALEX 2010.
Baker, Collin, Michael Ellsworth, and Katrin Erk 2007 SemEval ’07 task 19: Frame semantic structure extraction. In SemEval ’07: Proceedings of the 4th International Workshop on Semantic Evaluations, 99–104. http://dl.acm.org/citation.cfm?id=1621492.
Boas, Hans C. and Ivan A. Sag (eds.) 2012 Sign-Based Construction Grammar. Stanford: CSLI Publications.
Cruse, Alan D. 1986 Lexical Semantics. Cambridge: Cambridge University Press.
Davies, Mark 2008– The Corpus of Contemporary American English: 450 Million Words, 1990–Present. Available online at http://corpus.byu.edu/coca/.
Fillmore, Charles J. 1982 Frame semantics. In Linguistics in the Morning Calm: Selected Papers from SICOL-1981, The Linguistic Society of Korea (ed.), 111–137. Seoul: Hanshin Publishing Company.
Fillmore, Charles J. 1985a Frames and the semantics of understanding. Quaderni di Semantica 6: 222–254.
Fillmore, Charles J. 1985b Syntactic intrusions and the notion of grammatical construction. Proceedings of the Eleventh Annual Meeting of the Berkeley Linguistics Society, 73–86.
Fillmore, Charles J. 1986 Varieties of conditional sentences. Eastern States Conference on Linguistics 3: 163–182.
Fillmore, Charles J. 1988 The mechanisms of “construction grammar”. In General Session and Parasession on Grammaticalization, Shelley Axmaker, Annie Jassier, and Helen Singmaster (eds.), 35–55. Berkeley: Berkeley Linguistics Society.
Fillmore, Charles J. 2002 Mini-grammars of some time-when expressions in English. In Complex Sentences in Grammar and Discourse: Essays in Honor of Sandra A. Thompson, Joan Bybee and Michael Noonan (eds.), 31–60. Amsterdam/Philadelphia: John Benjamins.
Fillmore, Charles J. 2009 Review Article. A Valency Dictionary of English. International Journal of Lexicography 22.1: 55–85.
Fillmore, Charles J. 2013 Berkeley Construction Grammar. In The Oxford Handbook of Construction Grammar, Thomas Hoffmann and Graeme Trousdale (eds.), 111–132. Oxford: Oxford University Press.
Fillmore, Charles J., and Beryl T. S. Atkins 1992 Towards a frame-based lexicon: The semantics of RISK and its neighbors. In Frames, Fields and Contrasts: New Essays in Semantics and Lexical Organization, Adrianne Lehrer and Eva Feder Kittay (eds.), 75–102. Hillsdale: Lawrence Erlbaum Associates.
Fillmore, Charles J., Paul Kay, and Catherine M. O’Connor 1988 Regularity and idiomaticity in grammatical constructions: The case of let alone. Language 64: 501–538.
Fillmore, Charles J., Russell Lee-Goldman, and Russell Rhomieux 2012 The FrameNet Constructicon. In Sign-Based Construction Grammar, Ivan A. Sag and Hans C. Boas (eds.), 309–372. Stanford: CSLI.
Fillmore, Charles J., and Hiroaki Sato 2002 Transparency and building lexical dependency graphs. In Proceedings of the 28th Annual Meeting of the Berkeley Linguistics Society, J. Larson and M. Paster (eds.), 87–99.
Goldberg, Adele E. 1995 Constructions: A Construction Grammar Approach to Argument Structure. Chicago: University of Chicago Press.
Hanks, Patrick 2013 Lexical Analysis: Norms and Exploitations. Cambridge, MA: MIT Press.
Herbst, Thomas 1987 A proposal for a Valency Dictionary of English? In A Spectrum of Lexicography, Robert F. Ilson (ed.), 29–47. Amsterdam/Philadelphia: John Benjamins.
Herbst, Thomas, David Heath, Ian Roe, and Dieter Götz 2004 A Valency Dictionary of English. Berlin/New York: Mouton de Gruyter.
Kay, Paul 1994 Anaphoric binding in construction grammar. In Proceedings of the Twentieth Annual Meeting of the Berkeley Linguistics Society: General Session Dedicated to the Contributions of Charles J. Fillmore, 283–299.
Kay, Paul 2005 Argument structure constructions and the argument-adjunct distinction. In Grammatical Constructions: Back to the Roots, Mirjam Fried and Hans C. Boas (eds.), 71–98. Amsterdam: John Benjamins.
Kay, Paul, and Charles J. Fillmore 1999 Grammatical constructions and linguistic generalizations: The What’s X doing Y? construction. Language 75: 1–33.
Kay, Paul and Ivan A. Sag 2012 Cleaning up the Big Mess: Discontinuous Dependencies and Complex Determiners. In Sign-Based Construction Grammar, Ivan A. Sag and Hans C. Boas (eds.), 229–256. Stanford: CSLI.
Michaelis, Laura A. 2012 Making the Case for Construction Grammar. In Sign-Based Construction Grammar, Ivan A. Sag and Hans C. Boas (eds.), 31–67. Stanford: CSLI.
Partee, Barbara, and Vladimir Borschev 1999 Possessives, favorite and coercion. In Proceedings of ESCOL99, Anastasia Riehl and Rebecca Daly (eds.), 173–190. Ithaca, NY: CLC Publications.
Sag, Ivan A. 2012 Sign-Based Construction Grammar: An informal synopsis. In Sign-Based Construction Grammar, Hans C. Boas and Ivan A. Sag (eds.), 69–202. Stanford: CSLI Publications.
The valency approach to argument structure constructions1
Thomas Herbst

1. Valency and the interaction of lexis and grammar
The phenomenon of valency must be of central interest to anybody who wants to investigate whether or not the levels of lexis and grammar should be kept strictly apart from one another,2 whether knowing how to use a word entails remembering a lot of idiosyncratic detail or to what extent the meaning of a word can serve as a predictor of its syntactic properties, whether or not it is worth exploring “the possibility that all of grammar can be viewed in terms of constructions” (Bybee 2010: 77), i.e. whether “it’s constructions all the way down” (Goldberg 2006a: 18). After all, any linguistic theory will have to account for the fact that

(1) I have given you the facts.
(2) … you can provide them with some facts and figures.

and

(1) a. You can give that impression without actually saying so.
(2) a. The book is structured to provide a chronological introduction to St Ives art …

represent established uses, whereas this is not the case with
1. I am very grateful to Adele Goldberg, Hans-Jörg Schmid, Susen Faulhaber, Michael Klotz, Kevin Pike and Peter Uhrig for their comments on earlier versions of this paper.
2. For the view of a cline between lexis and grammar or of a lexicogrammatical continuum see e.g. Croft (2004: 275), Fillmore (2008: 49), Fillmore, Kay, and O’Connor (1988), Langacker (2008: 18), Halliday (1994: 15), Sinclair (2004: 164–165) or also Goldberg and Jackendoff (2004: 532) or Boas (2011: 38).
(1) b. *I have given you with the facts.
(2) b. *You can provide them some facts and figures.
The concept of valency sheds light on one of the ways in which properties of individual words (or classes of words) exert influence on syntactic structure or in which the occurrence of particular structures or patterns is restricted with respect to the words which can occur in them.3 This kind of interrelationship between syntactic structure and lexical elements has long been the subject of linguistic discussion and has been dealt with under different names and from different perspectives in many theories of syntax and grammars. We will take valency theory as the starting point of the present discussion, in which the verb takes a central position in the clause. Karl Bühler’s (1934: 173) often quoted statement that “words of certain word classes open up one or several slots which have to be filled by words of other word classes” can be seen as an early formulation of the principles underlying valency theory.4 Similarly, it is a central element of the design of Tesnière’s dependency grammar that verbs should take the highest position in the hierarchical representation of a clause, which is reflected in stemmata of the following type.

donne
Alfred    le livre    à Charles

Figure 1. Dependency stemma (Tesnière 1959: 102)
3. For a discussion see Jacobs (2009).
4. “… dass die Wörter einer bestimmten Wortklasse eine oder mehrere Leerstellen um sich eröffnen, die durch Wörter anderer Wortklassen ausgefüllt werden müssen.” See also Bühler (1934: 251 and 303). For a discussion of such approaches before Tesnière see, for instance, Ágel (2000: 15–32).

If one follows the philosophy of these approaches, then the structure of a clause is largely determined by particular properties of lexical items. These ideas were followed up and developed to a considerable extent in German linguistics, where valency theory can be regarded as a standard model of syntactic description, which is reflected in the standard reference grammars of the language (Zifonun, Hoffmann, and Strecker et al. 1997, Eisenberg 1999) and shows in the publication of a large number of special valency dictionaries (Helbig and Schenkel 1969, VALBU 2004) and theoretical publications (e.g. Helbig 1992, Ágel 2000 etc.). There have been several attempts to apply the valency approach to the description of English (Emons 1974, Allerton 1982, Herbst 1983), and the concept of valency has also been used in a number of other frameworks (Matthews 1980, Fillmore 1988, Müller 2008).5 The outline given here will be based on the approach taken in the Valency Dictionary of English (VDE 2004) and the framework developed by Herbst and Schüller (2008). It is the aim of this article to outline the basic concepts used in these approaches and to show how this particular approach to verb valency differs from other versions of valency theory and other accounts of the same phenomena. In particular, the following questions will be addressed:

why subject may be a useful or even necessary category to be employed in a valency description of English, but perhaps not object,
why it is methodologically advantageous to consider the levels of form and meaning apart from one another,
whether valency properties are best described in terms of an inventory of complements or in terms of patterns,
how the valency approach can be combined with a construction grammar approach,
whether and to what extent generalizations about valency structures or argument structure constructions are possible.

2. Sketch of a valency approach

2.1 Basic categories
The basic concept of our valency approach can be described as follows (Herbst and Schüller 2008: 108):
A lexical unit has the property of valency if it opens up one or more valency slots which can or must be realised by a complement.6

At the formal level, valency slots will be described in terms of the complements which can fill them. A complement is any formal realisation of a valency slot – i.e. a phrase or a clause.

At the semantic level, valency slots can be characterized in terms of participants or participant roles, which characterize the semantic function of the complement in the clause.

With respect to optionality, valency slots will be characterized as to whether a slot must or can be realised by a complement.

Adjuncts are not part of the valency description.

5. See also e.g. Borsley (1996).

2.2 Complements and adjuncts
The distinction between complements (Ergänzungen, or, in Tesnière’s [1959: 102–107] terminology, actants) and adjuncts (Angaben, circonstants) is central to the valency approach, although, of course, it is made in a similar form in other accounts of complementation.7 A unit (a phrase or a clause) will be classified as a complement if it meets at least one of the following two criteria, namely if it is determined by the valency carrier in its morphological form or its position in the clause, or if it has to be expressed whenever the valency carrier is used, i.e. if it realises an obligatory valency slot. On the other hand, an element will be classified as an adjunct if it is not determined in its form by the valency carrier and if it does not realise an obligatory valency slot.
6. Compare Schumacher, Kubczak, Schmidt, and de Ruiter (VALBU 2004: 25): “Unter syntaktischer Valenz verstehen wir die Eigenschaft von Verben, die Zahl und Art bestimmter sprachlicher Elemente ihrer Umgebung im Satz zu determinieren.” (‘Syntactic valency can be defined as the property of verbs to determine the number and type of certain linguistic elements in their environment in a clause.’ TH). See also Helbig and Schenkel (²1973: 49).
7. Cf. e.g. the discussion in the Cambridge Grammar of the English Language (2002: 219–228).
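The two criteria for complement status given above can be read as a simple decision rule. The following Python sketch is only an illustration of that rule: the `Unit` record, its field names and the feature values assigned to the two sample phrases are assumptions made here for the sake of the example, not part of the valency framework itself, and deciding whether a criterion actually holds remains a matter of linguistic analysis.

```python
from dataclasses import dataclass


@dataclass
class Unit:
    """A phrase or clause judged relative to one valency carrier (hypothetical record)."""
    text: str
    form_determined_by_carrier: bool  # criterion 1: form or position fixed by the carrier
    realises_obligatory_slot: bool    # criterion 2: must be expressed whenever the carrier is used


def classify(unit: Unit) -> str:
    """Return 'complement' if at least one criterion holds, otherwise 'adjunct'."""
    if unit.form_determined_by_carrier or unit.realises_obligatory_slot:
        return "complement"
    return "adjunct"


# Illustrative judgements only: the boolean values are assumptions.
units = [
    Unit("with some facts and figures", True, False),  # 'with' is selected by provide
    Unit("at intervals", False, False),                # freely added to flashed
]
for u in units:
    print(u.text, "->", classify(u))
```

Because the two criteria are joined by "or", a unit counts as an adjunct only when both fail, which mirrors the wording of the definition.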
On the basis of these definitions, all of the elements underlined can be classified as complements:

(1) I have given you the facts.
(2) … you can provide them with some facts and figures.

However, adjunct status would be given to to his left and at intervals in

(3) To his left, St Anthony lighthouse flashed at intervals…
Many criteria have been discussed to support the distinction between complements and adjuncts. For instance, adjuncts can be deleted without making the sentence ungrammatical (although, of course, they may be necessary from a communicative point of view in a particular context):

(3) a. St Anthony lighthouse flashed …

Furthermore, adjuncts tend to be less fixed than complements with respect to the position in which they occur in the sentence:

(3) b. At intervals, St Anthony lighthouse flashed …
    c. At intervals, St Anthony lighthouse flashed to his left …
    d. To his left, St Anthony lighthouse flashed.
Although such criteria are instrumental in eliciting certain differences between complements and adjuncts, they are not entirely unproblematic. Positional mobility or fixedness seems to be a matter of degree, and deletability also applies to certain types of complement (so-called optional complements) so that it might be preferable to make use of a criterion of free addibility. The fact that no single test or set of tests can be established on which the distinction between complements and adjuncts could be based operationally indicates that the distinction is one of gradience and that the categories are to be taken as prototypical in nature.8
8. See e.g. Herbst and Schüller (2008: 113–116) and Somers (1987: esp. 11–18). For the criterion of free addibility see Ágel (2000: 185) and Heringer (1996: 159). For problems of different tests in German see Zifonun, Hoffmann, and Strecker et al. (1997: esp. 1027–1064) or Breindl (2006: esp. 945–948). See also Matthews (1980: 143–144). Jacobs (1994) argues that different types of test reflect different types of dependency relationships; see also Ágel (2000: 197–210).
2.3 Description of complements

Since complements are defined as the formal realisations of a valency slot, they are specified in terms of formal categories, i.e. phrases and clauses such as, for example, the following (see Herbst and Schüller 2008 or VDE for a comprehensive list):

[NP]       The Tate Gallery St Ives will present twentieth-century art in the context of Cornwall.
[AdjP]     The Tate Gallery St Ives project is unique.
[to_INF]   West Cornwall continued to attract artists who wanted to work quietly, or study.
[for_NP]   The exciting public galleries in Penzance and Newlyn provide for the display of Newlyn School painting …
[that_CL]  … they said on the radio that it would be fine again tomorrow
etc.
It is obvious that for a complete characterization of valency complements, positional criteria will have to be taken into account. Following the approach taken in the Valency Dictionary of English (2004) and by Herbst and Schüller (2008), two criteria can be used to distinguish the three noun phrase complements of give in (1), for example. Firstly, a distinction can be made with respect to the ability of a complement to occur as the subject of finite active or passive clauses (which is indicated by subscripts such as [NPact-subj] and [NPpass-subj]) or not. Secondly, if there are two noun phrases in the predicate, their canonical order will be indicated by subscripts indicating position – [NPpass-subj1] for you and [NPpass-subj2] for the facts in (1).9

2.4 Semantic roles
At the level of semantics, valency slots can be characterized in terms of semantic roles (Helbig 1992). Admittedly, it has to be said that the discussion of fundamental questions of how many such roles are required for the description of a particular language and how they can be distinguished from one another, which has been going on ever since such roles became established in linguistics (Fillmore 1968, Halliday 1967/1968), has not resulted in a generally accepted inventory of such roles. It thus seems appropriate to assume relatively verb-specific roles in a valency description and to allow for more general semantic roles wherever possible, which is similar to – or at least compatible with – the approach taken by Goldberg (2006a). In the case of example (1), for instance, it seems possible to characterize the different slots as AGENT, BENREC (as a cover term for BENEFICIARY and RECIPIENT) and ÆFFECTED (combining AFFECTED and EFFECTED), for example.10

9. Note that apart from subject no functional categories are part of this kind of valency description (see section 3.4).

2.5 Optionality
Apart from the specifications at the levels of form (complements) and meaning (participants) valency slots can be characterized as to whether they have to be filled at all. In this respect, it seems possible to make a distinction between optional, contextually optional and obligatory valency slots.11 A valency slot is labelled optional if it does not need to be expressed by a complement. In the case of a contextually optional valency slot this is only possible under certain contextual conditions, usually entailing that the referent of the participant represented by the valency slot can be identified from the context. Thus a sentence such as

(4) Then perhaps you should try again.

in which the valency slot of try expressing the role of ÆFFECTED is not expressed formally only seems acceptable in a context in which the corresponding referent can be identified, which makes this valency slot a contextually optional one. However, in the case of read in

(5) Back in the living-room Helen settled down to read while he dabbled with a crossword.

it is still obvious that an ÆFFECTED participant is involved, but since this participant need not be specified and related to a referent for the sentence to be meaningful in a given context the valency slot is classified as optional. Finally, we define a valency slot as obligatory if it has to be expressed formally by a complement whenever the valency carrier (in the sense of a particular lexical unit) is used. This is the case, for instance, with the [NP]-complement of call and the [as_NP]-complement of regard in

(6) There is a difference between being called a customer and being regarded as one, which seems entirely to have escaped BR.

10. For a more detailed discussion of semantic roles see Herbst and Schüller (2008: 126–135). Compare also the more specific frame elements employed in the FrameNet project (Fillmore 2007). The fact that the valency approach and the approach taken in FrameNet (accessed 4th March 2012) are similar but not identical shows in the fact that for a verb such as read FrameNet assigns both NP- and about_NP-complements to the frame element “text”, whereas in the VDE or in the framework employed by Faulhaber (2011) they would be distinguished as ‘ÆFFECTED’ and ‘TOPIC’.
11. For a discussion of different types of optionality in FrameNet see Fillmore (2007).

2.6 Complement inventory

The resulting valency description of the lexical units of call in (6) or present in

(7) The Tate Gallery St Ives will present twentieth-century art in the context of Cornwall.
can then take the following form:12

call ‘name’
I     opt   AGENT         NPact-subj / by NP
II    obl   ÆFFECTED      NPpass-subj
III   obl   PREDICATIVE   NP, AdjP

present ‘give’
I     opt   AGENT         NPact-subj / by NP
II    obl   ÆFFECTED      NPpass-subj, with_NP
III   opt   BENREC        NPpass-subj, to_NP
12. See VDE for a slightly more detailed version of this, which also covers patterns of the kind They called it foolish to turn away instructional help when the district has to lay off teachers.
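A complement inventory of this kind lends itself to a machine-readable representation. The sketch below is a hedged illustration in Python, not the format used in the VDE: the class names, the ASCII labels such as `NP_pass-subj` and `by_NP`, the abbreviation `c-opt` for contextually optional slots, and the two checking functions are all assumptions introduced here. The slot values follow the inventory for call and present given above.

```python
from dataclasses import dataclass
from enum import Enum


class Optionality(Enum):
    OPTIONAL = "opt"
    CONTEXTUALLY_OPTIONAL = "c-opt"  # abbreviation assumed for this sketch
    OBLIGATORY = "obl"


@dataclass(frozen=True)
class Slot:
    number: str               # "I", "II", "III"
    optionality: Optionality
    role: str                 # participant role, e.g. "AGENT"
    complements: tuple        # formal realisations licensed for this slot


@dataclass(frozen=True)
class LexicalUnit:
    lemma: str
    sense: str
    slots: tuple


CALL = LexicalUnit("call", "name", (
    Slot("I", Optionality.OPTIONAL, "AGENT", ("NP_act-subj", "by_NP")),
    Slot("II", Optionality.OBLIGATORY, "ÆFFECTED", ("NP_pass-subj",)),
    Slot("III", Optionality.OBLIGATORY, "PREDICATIVE", ("NP", "AdjP")),
))

PRESENT = LexicalUnit("present", "give", (
    Slot("I", Optionality.OPTIONAL, "AGENT", ("NP_act-subj", "by_NP")),
    Slot("II", Optionality.OBLIGATORY, "ÆFFECTED", ("NP_pass-subj", "with_NP")),
    Slot("III", Optionality.OPTIONAL, "BENREC", ("NP_pass-subj", "to_NP")),
))


def missing_obligatory(unit, realised):
    """Slot numbers that are obligatory but absent from `realised`.

    `realised` maps slot numbers to the complement form used in a clause.
    """
    return [s.number for s in unit.slots
            if s.optionality is Optionality.OBLIGATORY and s.number not in realised]


def unlicensed(unit, realised):
    """Slot numbers whose realised form is not listed for that slot."""
    by_number = {s.number: s for s in unit.slots}
    return [n for n, form in realised.items()
            if n not in by_number or form not in by_number[n].complements]


# "being called a customer": slots II and III realised, AGENT left out
print(missing_obligatory(CALL, {"II": "NP_pass-subj", "III": "NP"}))
print(unlicensed(CALL, {"II": "NP_pass-subj", "III": "NP"}))
```

Keeping optionality, role and form as separate fields of one slot reflects the chapter's point that the levels of form and meaning are described apart from one another but tied to the same valency slot.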
3. Linking lexical valency with constructions

3.1 Valency as lexical potential
A valency description that follows the principles outlined above sees valency as the syntactic potential of lexical units, i.e. it takes a strongly lexical perspective.13 At the same time, it presupposes the existence of syntactic structures to which the complements of the verb etc. can be related. The precise nature of these syntactic descriptions will be discussed below, but it is obvious that identifying complement types such as [NP]act-subj and [NP]pass-subj entails a reference to syntactic structures in which the category subject exists as well as to active and passive clauses, neither of which, however, is taken as being primary. It should be mentioned that this is an important deviation from most, if not all, versions of valency theory used in the analysis of German. Here, the standard procedure is to base valency descriptions on active declarative clauses. Thus VALBU (2004) by Schumacher, Kubczak, Schmidt and de Ruiter would describe the valency of essen in

(8) a. Kinder essen gerne Pommes frites. (Children like chips. TH)

in terms of one obligatory nominative complement – NomE – and one optional accusative complement – AkkE. In a passive clause, however, the obligatory NomE is no longer obligatory and, if at all, realised by a von-phrase, whereas the “optional” AkkE cannot be deleted from a passive and takes the form of a nominative as in

(8) b. Die Pommes frites waren schon nach wenigen Minuten gegessen … (All the chips had been finished after a few minutes … TH)
It is made clear in VALBU (2004: 55) that there are fundamental differences between the formal realisations of the semantic roles compared with the corresponding active clause.14 Thus if the standard active clause is taken as the basis for determining the valency of a verb, then rules affecting the valency description will have to be introduced – such as relating the nominative noun phrase of the passive to the accusative complement (AkkE) of the active as in (8) or reducing the valency of a verb in imperative clauses.15 This can be avoided by providing a valency description in a form that lists all possible formal complements for one valency slot and relates them to the types of clause in which they can occur. Since the latter can be seen as types of constructions it is very tempting to combine elements of the valency approach with elements of construction grammar, as will be outlined below.

13. See also Welke (1995: 170).
14. Compare VALBU (2004: 55): “Im Passivsatz ist die syntaktische Ausprägung der semantischen Rollen gegenüber dem entsprechenden Satz im Aktiv erheblich verändert.” (‘In passive clauses the syntactic realisation of the semantic roles is quite different from that in the corresponding active clause.’ TH).

3.2 The special status of the subject
Classifying a complement as a potential subject – Xact-subj or Xpass-subj – entails at least three different aspects: firstly, it refers to a function in clause structure; secondly, in the case of personal pronouns it relates to morphological information (nominative case); and, thirdly, it also affects its optionality status: in imperative-‘command’-constructions, such slots can be described as contextually-optional; in certain types of non-finite clause they need not be realised if the participant can be given a definite or a general interpretation.

(9) Use your imagination; somebody must have seen him.
(10) Without being aware of getting there he found himself outside the printer’s shop.
(11) Learning thoroughly the piece which you are going to do is important

3.3 SCUs and PCUs
Since subject seems to be a useful concept to use in a valency description of English – perhaps more so than in languages such as German in which complements lend themselves more easily to a morphological characterization – the distinction between subject and predicate can be usefully employed in this model. Herbst and Schüller (2008) suggest a model taking a clause (or verb phrase) as consisting of subject and predicate (and possibly adjuncts). The predicate consists of a predicate head unit and all valency complements realised in the clause (with the exception of the one that functions as subject).

structure   subject                    predicate                                   adjunct
valency     complement                 valency carrier    complement
            The Tate Gallery St Ives   will present       twentieth-century art    in the context of Cornwall
            SCU                        PHU                PCU                      AU

Figure 2. Combination of valency information and syntactic construction (Herbst and Schüller 2008)

15. For a discussion of approaches to the passive in German valency theory see Ágel (2000: 119). See also Heringer (1996: 67 and 87) and Duden-Grammatik (⁸2009: 544).
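The analysis in Figure 2 can also be written down as a small data structure. The Python sketch below is purely illustrative: the `ClauseUnit` class and the two helper functions are hypothetical names chosen here, and the unit labels are simply copied from the figure rather than derived by any parser.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ClauseUnit:
    text: str
    unit_type: str  # one of "SCU", "PHU", "PCU", "AU"


# The clause analysed in Figure 2, with the labels taken from the figure.
CLAUSE = [
    ClauseUnit("The Tate Gallery St Ives", "SCU"),
    ClauseUnit("will present", "PHU"),
    ClauseUnit("twentieth-century art", "PCU"),
    ClauseUnit("in the context of Cornwall", "AU"),
]


def split_clause(units):
    """Group units into subject, predicate (PHU plus PCUs) and adjuncts."""
    return {
        "subject": [u.text for u in units if u.unit_type == "SCU"],
        "predicate": [u.text for u in units if u.unit_type in ("PHU", "PCU")],
        "adjuncts": [u.text for u in units if u.unit_type == "AU"],
    }


def realised_complements(units):
    """Valency complements realised in the clause: the SCU and every PCU."""
    return [u.text for u in units if u.unit_type in ("SCU", "PCU")]


print(split_clause(CLAUSE))
print(realised_complements(CLAUSE))
```

The two functions cut the same clause in two ways: `split_clause` follows the structural division into subject, predicate and adjunct, while `realised_complements` follows the valency division, in which the subject counts as a complement of the verb alongside the predicate complements.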
Combining both terminologies, the following types of units can be established:16

SCU (subject complement unit): a phrase that functions as the subject of the clause and is realised by a complement of the verb;
PHU (predicate head unit): a verbal head complex that contains the valency carrier and possible pre-heads (auxiliaries);
PCU (predicate complement unit): a phrase that is part of the predicate and is realised by a complement of the verb;
AU (adjunct unit).

The approach taken here is based on the following line of thinking: certain structural properties of the clause (such as active/passive, declarative/imperative/etc., finite/non-finite) determine how and whether (and with which interpretation) certain valency slots are realised – something in which the notion of subject plays a crucial role. They do not determine, however, whether a clause has no, one, two, or three predicate complement units, which is solely a matter of the valency properties of the valency carrier. This is why it seems appropriate to single out the subject complement unit but not to distinguish between different types of predicate complement units. This as such is of course not particularly innovative. Giving a special status to the subject and seeing elements of the “predicate” or, in some terminologies the “verb phrase”, as licensed by a particular head, is common practice in many grammars or models of syntax (Aarts and Aarts 1988, Aarts 2011).17 What distinguishes the valency approach from some of these models is that the unit functioning as subject in the clause is still seen as filling a valency slot of the governing verb, as is also the case in the model proposed by Huddleston in the Cambridge Grammar of the English Language (CamG) (2002: 216).

16. See Herbst and Schüller (2008: 164–167). The term phrase subsumes complements that are traditionally referred to as clauses such as [to_INF] or [that_CL]. This is very much in line with Fillmore’s (1988: 43) treatment of “a clause or a sentence as a maximal verb-headed phrase”. Compare also Fillmore’s (1988: 43) distinction between P-subject (valency complement) and S-subject (of a finite sentence).

3.4 Subjects – but no “objects”
A further difference between our approach and many related ones is that although for the reasons given we use the term “subject”, PCUs are not further classified by using categories such as “object” or “subject complement” (CGEL 1985), “predicative complement” (CamG 2002: 251; Aarts 2011: 97), where the difference between

(12) The weathermen had got it right: blustery showers out of a turbulent grey sky …

and

(13) Can I get you a drink?

is captured in the following terms:

(12) a.
CGEL         subject   verb         direct object     object complement
CamG/Aarts   subject   predicator   direct object     predicative complement

(13) a.
CGEL         subject   verb         indirect object   direct object
CamG/Aarts   subject   predicator   indirect object   direct object
Our main reason for not using such categories is that – apart from the fact that at least direct and indirect object are established categories of gram17. Compare also the distinction between internal and external arguments in Government-Binding Theory (e.g. Sells 1985: 35–36). CamG (2002: 216) treats the subject as an “external complement”. For subjects and objects in HPSG see Müller (2008: 46–47).
The valency approach to argument structure constructions 179
matical description – nothing much seems to be gained by making use of them. In fact, the difficulty with such “easily recognizable terms” (Goldberg 2006a: 221) is that they have received many different interpretations in different frameworks: thus there seem to be major differences as to which formal elements qualify as objects, in particular whether the term should be restricted to noun phrases or not.18 Furthermore, it seems that the main criterion for the definition of objects is a semantic one: “a phrase that refers to a person or entity that undergoes the action specified by the verb” (Aarts 2011: 91). This equals a description of a semantic role.19 In a model that provides a specification of semantic roles anyway, introducing objects as a category is thus redundant (unless the term would be used instead of semantic roles). This semantic criterion is often combined with a number of formal criteria, which however can easily be shown to be limited in their applicability: (a) Aarts (2011: 91–92) argues that “Direct Objects form a close bond with a lexical verb” and points out that they cannot be omitted with verbs such as delete, postpone or like and in the case of (14)
I’m reading.
it “is implicit” because “the addressee will understand that I must be reading something”. While this is undoubtedly true, it also holds for other types
18. Cf. Müller (2010: 280). In some frameworks the term object is used only for noun phrases (Bresnan 2001: 96) or almost only for noun phrases (cf. CamG 2002: 1017–1022, 1207–1209), whereas in CGEL or Aarts (2011: 92) the function of object is also assigned to clauses. See Newmeyer (2003: 151–165) for a detailed discussion of different approaches. For Langacker’s use of the term object see Langacker (2008: 210): “A subject is characterized as a nominal whose profile corresponds to the trajectory of a profiled relationship, and an object as one whose profile corresponds to a landmark.” Cf., however, Langacker (2008: 429): “A complement clause is one that specifies a salient participant in the main-clause relationship. In this respect, complements are analogous to subject and object nominals, which elaborate the schematic trajector and landmark of a profiled process. Indeed, some complement clauses function grammatically as subject or object of the main-clause predicate.” 19. For a discussion of the relationship of semantic roles and the trajectory/ landmark distinction see Langacker (2008: esp. 364–365). See also Radden and Dirven (2007: 47).
of optional complement such as the prepositional complement of invest, which is realised in (15b) and (15c) but not in (15a): (15)
a.
You can invest up to nine thousand pounds over five years, er rates vary with the society unless you’re lucky enough to have a fixed rate.
b.
The party would also invest in education; that was he said, the most fundamental requirement of Britain’s future success.
c.
Twelve years ago he invested twenty three thousand pounds in a small local company.
(b) Similarly, the criterion of case in pronouns is only of limited usefulness because not only objects occur in the accusative or objective case:20 (16)
She wouldn’t either, if she were him.
(c) The most important formal criterion for direct objects is passivization. However, this criterion does not always apply. Lexical units such as have, mean (‘have as a consequence’), cost or resemble do not occur in passives in which the elements that tend to be classified as direct objects of active clauses would function as subjects:21 (17)
Yes, but I have a key.
(18)
After the war Nicholson set about re-aligning his practice to the international modern movement, which for him meant a re-evaluation of such pre-war giants as Braque …
(19)
The chisel-like marks depicting the eyes also resemble those produced by cutting into lino.
(20)
For a start, it would cost a fortune.
Since in the valency description proposed the ability of complements to occur as subjects is indicated specifically, there is no need for a subclassification of PCUs on those grounds.22 In contrast to approaches making use of categories such as object, object complement, or oblique (as in FrameNet, for example, see Fillmore this volume) we think that little is gained by making such distinctions with respect to syntactic function since the various formal realisations of the corresponding valency slots (such as NP, to_INF, that_CL) are specified at a separate level anyway.23
20. See Aarts (2011: 92).
21. See also CGEL (1985: 10.13–14).
22. A further argument against such an analysis is presented by cases such as (a) She taught them English, (b) He taught English and (c) He taught them. While CGEL (1985: 10.6) treats them in (c) as an indirect object, CamG (2002: 251) and Aarts (2011) describe it as a direct object with a role “typically associated with an IO” (Aarts 2011: 95). See CamG (2002: 250) for the criterion of prenuclear position. Treating semantic roles strictly separately from formal criteria means that cases such as (b) and (c) can be accounted for in terms of an optional complement them (BENREC) and an optional ÆFFECTED-complement. From a construction grammar point of view, this might be justified if it were accompanied by a modification or reinterpretation of the semantic role of the complement.
23. Compare also Halliday (1994: 26).
24. Engel (1977: 179–183) and Schumacher et al. (2004: 46–48) distinguish between Satzmuster and Satzbaupläne with the latter taking into account the optionality of complements. For a definition of Satzbauplan see also Duden-
4. Presenting valency in terms of valency constructions
4.1 Complement inventories and patterns
Irrespective of whether – as in the classical approach – we take the active declarative sentence as the basis for describing the valency properties of a verbal valency carrier or whether – as advocated in the approach outlined here – we describe valency slots in terms of a list of all their potential realisations, it seems appropriate to assume that, at least for English, statements about the valency of verbs entail making reference to other syntactic constructions (such as sentence type or active/passive). This may be a reason to consider alternative ways of presenting valency information. Describing the valency properties of a valency carrier in terms of a complement inventory as indicated in 2.6 is only one way of abstracting from data of observable language use. It gives recognition to the fact that it seems (more or less) possible to assign particular participant roles to a valency slot. Nevertheless, a description of valency slots in terms of participant roles and an inventory of complements alone cannot be considered adequate since the combinatorial properties of complements will also have to be taken into account (Herbst 2007). This is one of the advantages of describing valency relations in the form of patterns, known as Satzbaupläne in German linguistics (Engel 1977, Schumacher et al. 2004).24 Such a pattern representation shows, for instance, that although with a verb such as present the valency slots whose roles could be described as “PRESENTEE” and “ITEM PRESENTED” can be realised by noun phrase complements, they cannot be combined in the same sentence, as pointed out above: (7)
a.
… he presented awards [“ITEM PRESENTED”] to pupils from 33 primary and secondary schools [“PRESENTEE”] …
b.
The lifeboat crew [“PRESENTEE”] were presented with bravery awards [“ITEM PRESENTED”] for launching in violent seas …
c.
*He presented them [“PRESENTEE”] awards [“ITEM PRESENTED”].
Since such combinatorial factors will have to be made part of a valency description, the Valency Dictionary of English indicates the patterns in which particular complements can occur in the complement inventories given for verbs.
4.2 Valency constructions
Within the general framework of construction grammar, the fact that a valency description needs to account for both the identification of participant roles and the kind of patterning described above can be captured by describing the valency potential of a valency carrier in terms of (micro) valency constructions. In the sense of constructions as “form-meaning pairings”, the term valency construction will be used to refer to a type of patterning that comprises both the formal and the semantic aspect (including the meaning of the lexical unit).25 Thus the following uses of the verb give can be seen as representing the valency constructions specified:
Grammatik (2009: 916). Cf. also Eisenberg (1999: 70). For a lexicographic treatment of German verb valency in terms of Satzmustern see VALBU (Schumacher et al. 2004: 945–993). 25. Cf. Herbst (2009: 62–63). Compare also Stefanowitsch’s (2011: 383) use of the term “lexically-bound argument structure construction”. It seems that valency constructions in this sense are very similar to Boas’s (2013: 237–238) “lexical-constructional view” and the concept of “individual verb senses” as “miniconstructions with their own frame-semantic, pragmatic and syntactic specifications”.
[SCU: NP “GIVER”]__giveact__[PCU1: NP “GIVEE”]__[PCU2: NP “ITEM GIVEN”] ║Sem
(1)
I have given you the facts.
[SCU: NP “GIVER”]__giveact__[PCU: NP “ITEM GIVEN”] ║Sem
(1)
a.
You can give that impression without actually saying so.
[SCU: NP “GIVER”]__giveact__[PCU1: NP “ITEM GIVEN”]__[PCU2: to_NP “GIVEE”] ║Sem
(1)
c.
You can give that to him.
[SCU: NP “GIVEE”]__givepass__[PCU1: NP “ITEM GIVEN”] (__[PCU2: by_NP “GIVER”])║Sem
(1)
d.
Wycliffe drove the sixty miles home to another estuary while his team was given lodgings in the town.
[SCU: NP “GIVEE”]__givepass__[PCU1: to_INF “ITEM GIVEN”] (__[PCU2: by_NP “GIVER”]) ║Sem
(1)
e.
I’ve been given to understand that I should get a partnership one day.’
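The bracket notation used for these valency constructions can also be read as a small data structure. The following Python sketch is only one possible encoding and is not part of the valency model itself: the names `Slot`, `ValencyConstruction` and `render` are assumptions introduced for illustration, the subscripts act and pass are written as `_act` and `_pass`, and the ║Sem component and the optionality of by-phrases are left out.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Slot:
    """One complement slot (hypothetical encoding)."""
    unit: str  # "SCU", "PCU", "PCU1", "PCU2"
    form: str  # formal realisation, e.g. "NP", "to_NP", "to_INF"
    role: str  # verb-specific participant role, e.g. "GIVER"

    def render(self) -> str:
        return f'[{self.unit}: {self.form} "{self.role}"]'


@dataclass(frozen=True)
class ValencyConstruction:
    """An item-specific valency construction: verb, voice and slots."""
    verb: str
    voice: str                 # "act" or "pass"
    slots: Tuple[Slot, ...]    # first slot is the SCU

    def render(self) -> str:
        subject, *complements = self.slots
        parts = [subject.render(), f"{self.verb}_{self.voice}"]
        parts += [slot.render() for slot in complements]
        return "__".join(parts)


# The construction illustrated by "I have given you the facts."
GIVE_NP_NP = ValencyConstruction(
    verb="give",
    voice="act",
    slots=(
        Slot("SCU", "NP", "GIVER"),
        Slot("PCU1", "NP", "GIVEE"),
        Slot("PCU2", "NP", "ITEM GIVEN"),
    ),
)

# The construction illustrated by "You can give that to him."
GIVE_NP_TO_NP = ValencyConstruction(
    verb="give",
    voice="act",
    slots=(
        Slot("SCU", "NP", "GIVER"),
        Slot("PCU1", "NP", "ITEM GIVEN"),
        Slot("PCU2", "to_NP", "GIVEE"),
    ),
)

print(GIVE_NP_NP.render())
print(GIVE_NP_TO_NP.render())
```

Keeping the lexical item, the formal realisation and the participant role in one record mirrors the idea that a valency construction pairs form with meaning; any more abstract level would be derived from such records rather than stored in their place.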
The idea of positing such valency constructions is very intriguing because they represent a rather typical case of a “syntactic configuration … with one or more substantive items” (Croft and Cruse 2004: 247), i.e. item-based constructions (MacWhinney this volume).26 Nevertheless, it is by no means obvious how this type of construction should be defined and how many such valency constructions should be identified because like all – or most – concepts used in linguistic analysis valency constructions are abstractions from observable language use. Up to a point the decisions involved will be arbitrary – at least as long as we do not have sufficient psycholinguistic evidence with respect to the mental representations of valency properties.
26. For item-based constructions in language acquisition see Tomasello (2003). Furthermore, valency constructions are also good examples of the fact that “there is in fact a continuum from substantive to schematic”, as pointed out by Croft and Cruse (2004: 248) (referring to Fillmore, Kay, and O’Connor 1988: 503), if one thinks of examples such as wear thin, throw NP clear/open, work NP free/loose (marked in the Erlangen Valency Patternbank as lexically specified complements); given to understand also being a case in point.
Such decisions concern the relation of valency constructions to other constructions, linear order and optionality.
[1] Actives and passives: One such decision to be taken with respect to the design of valency constructions is whether actives and passives should be represented as separate valency constructions of the same lexical unit or not. Despite the obvious correspondences between active and passive clauses, which are also reflected in the classification of complements as Xpass-subj, from a purely pattern-oriented point of view, there is a case to be made out for listing active and passive valency constructions separately: one reason for this is that all aspects concerning morphological form and optionality can then be represented in a slightly simpler way. Furthermore, this makes it easier to account for cases in which passive constructions exist which cannot be directly related to formally identical active constructions as in:27 (21)
And some time after you went round to the front of the house a woman was seen to go in. [to_INF]
(21)
a.
I saw him go in there. [INF]
[2] Optionality of valency slots: The question of how many valency constructions should be posited also arises with respect to optional complement slots in cases such as the following: (1)
f.
It was unusual for a chief super to have to give evidence of the discovery of a body.
g.
We feel we’ve given you enough evidence sir that you can come to a reasoned decision.
h.
The museum and garden give a powerful and moving impression of the way Barbara Hepworth worked while living in St Ives.
i.
But I think I would be giving you the wrong impression if I suggested that most of the evaluation studies with which I’ve been concerned have involved this kind of conclusion, or this kind of result.
27. The subscripts act and pass provide information on verb morphology. For the treatment of the active/passive relation in different models see Sells (1985: esp. 160–162).
These could either be described as (i) representing two valency constructions
[SCU: NP “GIVER”]__giveact__[PCU: NP “ITEM GIVEN”] ║Sem
[SCU: NP “GIVER”]__giveact__[PCU1: NP “GIVEE”]__[PCU2: NP “ITEM GIVEN”] ║Sem
or (ii) as one valency construction in which the “GIVEE”-slot is indicated as being optional: [SCU: NP “GIVER”]__giveact__([PCU1: NP “GIVEE”])__[PCU2: NP “ITEM GIVEN”] ║Sem
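The relation between the two descriptions can be made explicit in a short sketch: a construction with optional slots, as in (ii), abbreviates the set of fully specified constructions listed under (i). The Python code below is an illustration under stated assumptions; the tuple encoding `(form, role, optional)` and the function name `expand_optional` are hypothetical, and the SCU and the verb are omitted for brevity.

```python
from itertools import product
from typing import List, Optional, Tuple

# (form, participant role, optional?) ; a hypothetical encoding of one PCU slot
OptionalSlot = Tuple[str, str, bool]
FixedSlot = Tuple[str, str]


def expand_optional(slots: List[OptionalSlot]) -> List[List[FixedSlot]]:
    """Expand one construction with optional slots (description ii)
    into the fully specified constructions it stands for (description i)."""
    choices: List[List[Optional[FixedSlot]]] = []
    for form, role, optional in slots:
        present: FixedSlot = (form, role)
        # An optional slot is either realised or left out; an obligatory
        # slot is always realised.
        choices.append([present, None] if optional else [present])
    return [
        [slot for slot in combination if slot is not None]
        for combination in product(*choices)
    ]


# Description (ii) for give: optional "GIVEE", obligatory "ITEM GIVEN".
give_ii: List[OptionalSlot] = [
    ("NP", "GIVEE", True),
    ("NP", "ITEM GIVEN", False),
]

for construction in expand_optional(give_ii):
    print(construction)
```

The expansion yields the trivalent and the divalent construction. The sketch shows only that the two descriptions are inter-translatable; it does not bear on which of them is preferable.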
Two arguments can be brought forward in favour of option (i): firstly, the optionality of this valency slot would also have to be indicated in all other valency constructions in which it can be realised such as [SCU: NP “GIVER”]__giveact__[PCU1: NP “ITEM GIVEN”]__([PCU2: to_NP “GIVEE”]) ║Sem
in (1)
j.
He was giving evidence to the Welsh Affairs Committee of MPs in the Commons on the role of the authority.
Secondly, and more importantly perhaps, keeping such constructions apart from one another might lead to more conclusive results with respect to collostructional phenomena (Stefanowitsch and Gries 2003; Stefanowitsch this volume). Thus, for give, nouns such as evidence, lecture, impression seem to be preferred items in the divalent construction, while chance, opportunity or smile belong to the most preferred items for the trivalent construction.28 Apart from passives, where it would seem unnecessarily uneconomic not to opt for solution (ii) for by-phrases which are optional in almost all passive constructions, it may thus be more appropriate to consider such cases as separate valency constructions.29 [3] Subject complement units: The status of SCUs in valency constructions is actually related to the problem of optionality since, as pointed out above, certain types of non-finite clause and certain sentence types do or need not have subjects: (1)
k.
Give him that, please.
l.
He left without giving the little man a chance to reply.
m.
Giving explanations is vital for children’s training …
28. A further reason in favour of analysis (i) concerns the possibility of generalizations in the direction of argument structure constructions with a meaning in their own right (see below). 29. If the by-phrase is obligatory or cannot occur at all with a particular valency carrier, this must of course be indicated in the corresponding valency construction.
Since, however, the fact that particular clause or sentence constructions contain a subject and others do not (and that the corresponding valency slots receive a general or a contextually definite interpretation) is a property of these constructions, there is no need to account for this in terms of different valency constructions. However, it is part of the properties of SCUs that they are subject to the conditions of use associated with particular clause types or syntactic constructions. This means that uses of give as in (1d) I’ve been given to understand that I should … or (1f) It was unusual for a chief super to have to give evidence … , in which the corresponding valency slot is not realised by a complement, can be subsumed under the same valency construction as (1).30 [4] Order of elements: The fact that valency constructions are combined with other constructions in real language expressions31 does not only affect the status of the SCU in the valency construction but is particularly problematic with respect to the linear order of the elements: although the units occurring in a particular valency construction are specified in a certain order, this order will not be reflected in every realisation of the valency construction:32 (22)
Young she might be, stupid she was not.
(23)
… he could not take seriously this case in which he had so readily involved himself …
(24)
Mark Rothko came to see Lanyon, with whom he had made contact in New York, and Heron, whom he knew as a writer as well as a painter.
(1)
n.
This is the first work by Barbara Hepworth to which she gave a title referring directly to landscape.
30. This also applies to the form of complements (nominative in finite, accusative in non-finite clauses): Every time it would end up with him giving me some money. For cases such as (1f) compare also Fillmore’s (this volume) category “external”.
31. For this use of the term expression and the question of the combination of different constructions in one expression see Goldberg (2006a: 21).
32. For German Satzmuster in VALBU see Schumacher et al. (2004: 46).
It would be misguided to postulate separate valency constructions for such cases since their word order can quite obviously be related to factors such as the thematic organisation of the message or the weight of constituents or to properties of the clause type (Herbst and Schüller 2008: 102–106), all of which are independent of valency. For the majority of cases, there seem to be default rules such as that noun phrase PCUs tend to occur before PCUs realised by prepositional phrases or clauses or that in the case of two noun phrase or two prepositional phrase PCUs their order will be determined by semantic factors.33 In other cases, this is far less obvious: (25)
He pushed the door open …
(25)
a.
Wycliffe stepped down into the little hall and pushed open the door of the shop itself.
b.
She pushed open the door of a room overlooking the bay, a room flooded with sunshine but musty and stale.
It thus seems appropriate to make a distinction between cases where the order of different PCUs is relatively free (which as in VDE could be indicated by using a double-headed arrow ↔) and those where it is relatively fixed in the sense that only special textual factors would lead to a deviation from it. Since it can be assumed that the mental representation of the degree of fixedness of a valency construction will be influenced by frequency of exposure, statistics of occurrence in corpora may be able to provide some indication for determining in which valency patterns the order of PCUs needs to be specified as being fixed or flexible, but it is obvious that the distinction is of a probabilistic and not of a clear-cut nature.
5. Towards generalizations
5.1 Levels of generalization
Valency constructions as defined above are item-specific constructions. Such item-specific knowledge needs to be accounted for when we want to describe how speakers know how verbs and other valency carriers are used or how children learn to use them. It is the job of dictionaries, in this case valency dictionaries, to provide a description of such item-specific properties of lexical units. At the same time, linguistic analysis must not stop at listing item-specific properties but also try to discover generalizations. There are numerous levels at which we could posit such general structures, some being cognitively more plausible than others, for example the following:
at the level of form: valency patterns
at the level of meaning: participant patterns
at the level of form-meaning-pairings: general valency constructions and/or argument structure constructions.
33. Thus a ‘PREDICATIVE’-NP will tend to follow an ‘ÆFFECTED’-NP, whereas a ‘BENEFICIARY’-NP will precede it.
5.2 Valency patterns
Valency patterns represent clusters of complements in terms of phrases (subsuming clauses under phrases as in Herbst and Schüller 2008). If one distinguishes between active and passive patterns, then (1)
I have given you the facts.
(26)
‘The outsider’, Wycliffe called him.
can be subsumed under the following valency pattern (thus disregarding the fact that (26) has no passive equivalent with the valency slot realising PCU2 occurring as the subject):34 [SCU: NP]__VHCact__[PCU1: NP]__[PCU2: NP]. Similarly, valency pattern [SCU: NP]__VHCact__[PCU: NP] covers (7)
The Tate Gallery St Ives will present twentieth-century art in the context of Cornwall.
(1)
h.
The museum and garden give a powerful and moving impression of the way Barbara Hepworth worked while living in St Ives.
(27)
This painting is a turning point in Lanyon’s work.
This is the format employed in the Erlangen Valency Patternbank, which is intended as a research tool for linguists in that it lists verbs, adjectives and nouns (as well as complex valency carriers) occurring in certain patterns, or, in a similar form, the grammar patterns provided by Francis, Hunston and Manning (1996). Furthermore, it is this level of description that most learners’ dictionaries of English draw upon when indicating the complementation possibilities of verbs, adjectives and nouns.35
34. One justification for this very surface-oriented kind of patterning is the fact that it is patterns of formal units of this kind that speakers experience in actual language use.
35. Such patterns can even be more specific: both Francis, Hunston and Manning (1996) and the Erlangen Valency Patternbank identify separate patterns for
5.3 Participant patterns
What is probably more relevant from a cognitive point of view is patterning at the semantic level. It is plausible to assume that at some stage in the L1-learning process children discover that verb-specific participant roles such as “GIVER”, “SENDER”, “READER”, “PRESENTER” or “GIVEE”, “SENDEE”, “READEE”, “PRESENTEE” (given in double inverted commas) share certain meaning elements and arrive at more general semantic roles such as ‘AGENT’ or ‘BENEFICIARY/RECIPIENT’ (given in single inverted commas). One could assume that children will not only be able to generalize these roles but also that they develop some understanding of the patterning in which they occur, for instance, that the occurrence of an element with the role ‘PREDICATIVE’ presupposes an element to which it can refer. At this level, (1) and (26) would be distinguished as: (1)
I have given you the facts.
(26)
‘The outsider’, Wycliffe called him.
‘AGENT’ – V – ‘BENEFICIARY/RECIPIENT’ – ‘ÆFFECTED’
‘AGENT’ – V – ‘ÆFFECTED’ – ‘PREDICATIVE’
(1c) and (2), on the other hand, can be subsumed under the same participant pattern as (1), although they represent different valency patterns:
(1)
c.
You can give that to him.
(2)
… you can provide them with some facts and figures.
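The two levels of generalization discussed so far can be sketched as two projections of the same item-specific description: the valency pattern keeps only the formal realisations, the participant pattern keeps only the generalized roles. The Python code below is a minimal illustration, not a description of any existing tool. The role labels for provide (“PROVIDER”, “PROVIDEE”, “ITEM PROVIDED”), the form label with_NP and the mapping table `GENERAL_ROLE` are assumptions made for the example, and the participant pattern is modelled as an unordered set because (1) and (1c) are taken to share it despite their different linear order.

```python
from typing import Dict, FrozenSet, List, Tuple

# (formal realisation, verb-specific participant role)
Slot = Tuple[str, str]

# Assumed mapping from verb-specific participant roles to general roles.
GENERAL_ROLE: Dict[str, str] = {
    "GIVER": "AGENT",
    "GIVEE": "BENEFICIARY/RECIPIENT",
    "ITEM GIVEN": "ÆFFECTED",
    "PROVIDER": "AGENT",                  # hypothetical label
    "PROVIDEE": "BENEFICIARY/RECIPIENT",  # hypothetical label
    "ITEM PROVIDED": "ÆFFECTED",          # hypothetical label
}


def valency_pattern(slots: List[Slot]) -> Tuple[str, ...]:
    """Level of form: the sequence of formal realisations only."""
    return tuple(form for form, _ in slots)


def participant_pattern(slots: List[Slot]) -> FrozenSet[str]:
    """Level of meaning: the set of general roles, ignoring form and order."""
    return frozenset(GENERAL_ROLE[role] for _, role in slots)


# (1) I have given you the facts.
ex_1 = [("NP", "GIVER"), ("NP", "GIVEE"), ("NP", "ITEM GIVEN")]
# (1c) You can give that to him.
ex_1c = [("NP", "GIVER"), ("NP", "ITEM GIVEN"), ("to_NP", "GIVEE")]
# (2) ... you can provide them with some facts and figures.
ex_2 = [("NP", "PROVIDER"), ("NP", "PROVIDEE"), ("with_NP", "ITEM PROVIDED")]

for name, example in [("(1)", ex_1), ("(1c)", ex_1c), ("(2)", ex_2)]:
    print(name, valency_pattern(example), sorted(participant_pattern(example)))
```

Under these assumptions the three examples receive three different valency patterns but one and the same participant pattern.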
Although by their very nature the roles identified at this level are less specific than the verb-specific roles of the “SENDER”-type and to be seen more on a par with Fillmore’s (1968) case roles, this does not mean that all verb-specific roles can be subsumed under such general roles.
NPs which can only be realised by, for example, reflexive pronouns. The approach taken by Francis, Hunston and Manning (1996: vii) differs from valency constructions in that the subject is not always regarded as belonging to the pattern.
5.4 General valency constructions and/or argument structure constructions
5.4.1 Generalizing from item-specific valency constructions: combining participant roles and argument roles
What is most interesting from a construction grammar point of view is the kind of patterning that comprises both form and meaning. It is certainly one of the most intriguing research questions in this area to find out to what extent item-specific valency constructions can be or are generalized into more general constructions – and to what extent it is appropriate to refer to such more general constructions in linguistic description.36 Thus one might assume levels of generalization in terms of verb-class-specific constructions (Croft 2003) or general valency constructions based on valency patterns and participant patterns, which could then take the following form:
[SCU: NP ‘AGENT’]__verbact__[PCU1: NP ‘ÆFFECTED’]__[PCU2: to_NP ‘BENREC’] ║Sem
[SCU: NP ‘AGENT’]__verbact__[PCU1: NP ‘BENREC’]__[PCU2: NP ‘ÆFFECTED’] ║Sem
Such general patterns have been studied most widely in the framework of argument structure constructions developed by Goldberg (1995, 2006a). Argument structure constructions, which are “generalizations over multiple verbs” (Goldberg 2010: 52), differ from item-specific valency constructions in a number of respects: firstly, they contain argument roles (which are general semantic roles and not verb-specific participant roles); secondly, they are not lexically specified in that the verb slot is empty; thirdly, as a consequence, the meaning they are supposed to carry is expressed in more general terms.
36. Croft (2003: 56–62) actually distinguishes between verb-class-specific and verb-specific constructions, depending on whether a particular sense of the construction occurs with a particular class of verbs or with individual items. Croft’s verb-specific constructions parallel item-specific valency constructions only that Croft (2003: 59) uses functional category labels of the type [SBJ VERB OBJ1 OBJ2]. Compare also Langacker’s (2008: 240) concept of structural frames: “These frames can be of different sizes (so that some incorporate others) and characterized at different levels of specificity (so that some instantiate other). They amount to constructional schemas that contain the lexical item as one component element.”
In principle, it seems that the assumption of a level of argument structure constructions is perfectly compatible with the kind of valency-oriented approach outlined above. In fact, it can be argued that the theory of argument structure constructions and a valency approach need to be combined in order to arrive at a comprehensive picture of verb complementation. One of the great attractions of Goldberg’s theory lies in the fact that she distinguishes between two types of roles – the participant roles of the verb (which are part of a verb’s valency) and the argument roles of the construction. Making such a distinction provides an attractive framework for accounting for the combination of verb-specific (or valency construction specific) semantic properties with semantic properties of a phrasal construction. Goldberg’s (2006a: 40) Semantic Coherence Principle – “the more specific participant role of the verb must be construable as an instance of the more general argument role” – avoids some of the classificatory problems associated with Fillmore’s (1968) original framework. This shows, for instance, in Goldberg’s (2006a: 41) analysis of the much discussed loading-example, in which the participant roles of “loaded-theme” and “container” are fused with different argument roles in the two constructions in (28): (28)
a.
Pat loaded the hay onto the truck.
b.
Pat loaded the truck with hay.
This merging of participant roles and argument roles is particularly attractive in cases where one participant role can be realised syntactically in different ways. In the case of a verb such as fly, for example, the fact that different participants such as “PILOT”, “AIRLINE” or “PASSENGER” can appear as subjects and receive an ‘AGENTIVE’ interpretation can be explained in terms of assigning semantic roles to syntactic structures, as Götz-Votteler (2007: 44) argues:37 (29)
a.
You have to fly at least 12 hours before your initial flight test …
b.
From Belfast, British Airways Cargo flies to London Heathrow, Manchester and Glasgow …
c.
He flew to Prague to convince the embassy refugees it was not a trap.
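The fusion of participant roles with argument roles can be pictured as a simple compatibility check in the spirit of the Semantic Coherence Principle. The following Python sketch is a deliberately reduced illustration: the table `CONSTRUABLE_AS` and the function `can_fuse` are hypothetical, the entries for “PILOT”, “AIRLINE” and “PASSENGER” follow the discussion of fly above, and the entry for “DESTINATION” with the argument role ‘GOAL’ is an added assumption that serves only as a contrast.

```python
from typing import Dict, Set

# Which general argument roles a verb-specific participant role of fly
# can be construed as. Only the 'AGENT' entries follow the discussion
# above; "DESTINATION"/"GOAL" is an assumed contrast case.
CONSTRUABLE_AS: Dict[str, Set[str]] = {
    "PILOT": {"AGENT"},
    "AIRLINE": {"AGENT"},
    "PASSENGER": {"AGENT"},
    "DESTINATION": {"GOAL"},
}


def can_fuse(participant_role: str, argument_role: str) -> bool:
    """True if the participant role of the verb is construable as an
    instance of the argument role supplied by the construction."""
    return argument_role in CONSTRUABLE_AS.get(participant_role, set())


for role in ("PILOT", "AIRLINE", "PASSENGER", "DESTINATION"):
    print(role, "as AGENT:", can_fuse(role, "AGENT"))
```

On this view the verb contributes the specific roles and the clausal construction contributes the general one, so that no separate lexical unit of fly needs to be posited for each type of subject.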
37. For a discussion of the alternative analysis of postulating different lexical units for the various uses of fly see Götz-Votteler (2007: 44). Compare, however, the treatment of fly in FrameNet.
There are thus good reasons for assigning semantic roles to units in clausal constructions (see also Herbst and Schüller 2008) and combining them with the participant roles of the verb.
5.4.2 Specifying form in argument structure constructions
One important respect in which a valency-based account of argument structure constructions differs from Goldberg’s is the type of specification given at the syntactic level: Valency patterns – apart from distinguishing between subject complement units (SCU) and predicate complement units (PCU) – focus on strictly formal categories at the level of the phrase and/or clause such as NP or to_INF. Argument structure constructions, however, are often described in terms of functional categories such as subject, object or oblique. The ditransitive construction, for example, is represented by Goldberg (2006a: 20) as follows:
Sem:   intend-CAUSE-RECEIVE   (agt     rec(secondary topic)     theme)
       verb                   (                                      )
Syn:                           Subj    Obj1                     Obj2
Figure 3. The ditransitive construction (Goldberg 2006a: 20)
As pointed out above (cf. 3.4), the use of categories like subject and object is not entirely unproblematic.38 Goldberg (2006a: 21) justifies the use of
38. This may even affect the construction status. If the ditransitive is to be seen as “a construction in the Construction Grammar sense of the term, a pairing of both form and meaning” (Goldberg [1992] 2006b: 411), the question of whether object is a purely formal or a semantic category is crucial. If the categories of subject and object are to be understood in the spirit of Langacker (2008: 210) in terms of trajector and landmark (see note 18), then this is a distinction made at the level of meaning and not at the level of form. For the semantic nature of such categories in Langacker’s Cognitive Grammar see Croft and Cruse (2004: 279–280). See, however, Goldberg’s (2006a: 221–
“grammatical relations instead of grammatical categories” on the grounds that in this way the ditransitive construction can be distinguished from a construction Subj V Obj PRED (in which PRED “can also be realised as AP”) such as (30)
a.
She considered him a fool.
b.
She considered him crazy.
But would one really want to say that if we describe the differences between the noun phrases a nice cup of tea and a multi-millionaire in (31) in terms of Obj and PRED, we are making a statement about different forms? (31)
a.
… I’ll make you a nice cup of tea.
b.
… I’ll make you a multi-millionaire.
Clearly, there is a difference between the respective NPs in terms of semantic role; but can the fact that – as a consequence – one of them commutes with an adjective phrase and the other one does not be seen as a formal criterion?39 No doubt, Obj and PRED can be considered to “capture a relevant level of description” (Goldberg 2006a: 221) because they can be used to distinguish between the constructions exemplified in (30) and (31). Nevertheless, even if Obj can be seen as a statement about form because in Goldberg’s (personal communication) use it is only used for noun phrases, labels such as Obj and PRED have the disadvantage of obscuring the fact that constructions that can be distinguished in terms of their overall meaning and their argument roles, can have homonymous expressions in that it is not possible to assign a structure of the type NP – VP – NP – NP to either the ditransitive or the trivalent predicative construction alone. The fact that this is so may well be relevant with respect to language learning or language processing. There are thus strong arguments for presenting the formal side of argument structure constructions explicitly in categorial terms such as noun phrase, adjective phrase, etc. rather than in functional terms.
222) explicit statement that she takes these labels to refer to the “form of particular constructions”. 39. Compare the characterisation of complement category E5 in Emons’s (1974: 137) model.
5.4.3 A hierarchy of constructional levels: argument structure constructemes and allostructions
There is thus a certain discrepancy between the specification of the level of form in terms of the valency patterns or valency constructions as suggested above and Goldberg’s argument structure constructions. However, this difference should not be overrated either, since, as is shown by the example of the trivalent predicative construction, a Goldbergian argument structure construction can in principle have a number of formal realisations. Another example of this is the resultative construction, whose formal properties Goldberg (1995: 189) characterizes as follows:
V   SUBJ   OBJ   OBL_AP/PP
(32)
The gardener watered the flowers flat.
(33)
Bill broke the bathtub into pieces.
However, if “resultative phrase” is used as a cover term for adjective phrases and prepositional phrases in the resultative construction (Goldberg and Jackendoff 2004: 536), then why should “object” not be used as a cover term for all complements potentially realising the corresponding valency slot? (34)
a.
You never told me that!
b.
Now let me tell you all the news!
c.
Tell them I don’t want to be disturbed.
d.
When did you tell him that you were pregnant?
e.
He’s telling us how he spent last night here, on the car-park, in his van.
Following the terminological framework of the Comprehensive Grammar of the English Language (1985: 1208–1216) by Quirk, Greenbaum, Leech and Svartvik, Obj2 would include the following categories: NP, CL, that_CL, wh_CL, wh_to_INF, to_INF.40 Since categorial information is given explicitly in the description of some of Goldberg’s argument structure constructions anyway, providing information on the possible formal realisations of the PCU2-slot of an argument structure construction would be perfectly in keeping with the overall design of the model.41 A representation of the ditransitive construction could then take the following form:42
Sem:   intend-CAUSE-RECEIVE   (agt     rec(secondary topic)     theme)
       verb                   (                                      )
Syn:                           Subj    PCU1: NP                 PCU2: NP/that_CL/CL/to_INF/wh_CL/wh_to_INF
Figure 4. Ditransitive construction with valency specifications
40. In the case of (34h) Give my regards to your father and tell him not to worry. tell can be analysed as representing a different sense of tell; compare VDE. FrameNet follows a similar approach in that it lists a number of realizations for the core element MESSAGE (accessed 14/7/2012). For V-ing-clauses as objects see CGEL (1985: 1189).
41. Cf. for instance, the descriptions provided by Fillmore (1988: 42) or Fillmore and Atkins (1992). See also the formal categories used in FrameNet, where labels such as VPto (To-Marked Infinitive Verb Phrase) or Sfin (Finite Clause [with or without that]) etc. are used (see Ruppenhofer, Ellsworth, Petruck, Johnson and Scheffczyk 2010: 47–59 or Fillmore, Lee-Goldman and Rhomieux 2012: 318), which are very similar to the ones used in valency theory (Herbst and Schüller 2008).
42. This kind of representation is very similar to the description of the causal-transfer construction proposed by Newmeyer (2003: 174), which also includes a formal and a functional level:
[Diagram, only partly recoverable: Sem: CAUSE-“RECEIVE”; R: instance, means; PRED; Syn: V, NP/AP, OBJ2/OBL]
However, the valency categories used in our description are more specific formally than the ones employed by Newmeyer. Furthermore, it must be said that without the valency realization principle (see 5.4.4), such a description of argument structure constructions would not account for the occurrences of individual verbs in these constructions.

Borrowing from structuralist terminology, the different realisations of an argument structure construction could be seen as different subconstructions or allostructions of one argument structure construction.43 Admittedly, one might object that the constructions with different realisations of PCU2 (objects) are not sufficiently similar semantically to subsume them under one construction. Basically, this is an issue of where (on the basis of which criteria) one is prepared to draw the line in terms of generalization: obviously generalizing often means considering certain differences as irrelevant. So, at the level of abstraction relevant for the description of argument structure constructions, the semantic differences between (34fg) with respect to overall pattern meaning and the argument roles may well be considered insignificant enough for them to be seen as constructions representing allostructions of one higher-level construction:44

(34) f. I heard her tell Mum the news that she was getting married.
     g. … we thought it was time to tell people that we were getting married and start seeing each other openly.
In the same way one could argue that

(35) a. After arriving in St Ives in 1946, Frost began a series of works which rapidly evolved into pure abstract explorations of colour and line …
     b. She began to make more complex forms combining varied sculptural elements.

can be seen as instances of the transitive construction because the verb-specific roles “BEGINNER” and “THING/ACTION BEGUN” can be fused with the argument roles ‘AGENT’ and ‘ÆFFECTED’ (or ‘THEME’). The decision as to whether different valency constructions should be subsumed under one argument structure construction or not may not always be straightforward. A case in point is what is often referred to as “dative alternation”:
(36) a. The view was almost identical with Gifford Tate’s picture which the doctor had shown him …
     b. He wanted to see it and I showed it to him.
(37) a. They had sent him a WPC who looked like a schoolgirl dressed in police uniform for a school play.
     b. I … typed the statement which I intended to send to Wilder’s father …
     c. I’m not, but I’m sending Curnow to Carbis Bay to talk to the old man.

43. Compare the different allostructions given for a constructeme (see below) expressing judgment for verbs such as consider, judge, call, count, regard, think, look, see, view (Herbst and Uhrig 2009). Goldberg and Jackendoff (2004: 535) speak of a “family” of “subconstructions” with respect to resultatives. Goldberg (2013: 455) uses the term alloconstructions with respect to actives and passives etc. For an account of phrasal verbs in terms of allostructions see Cappelle (2006).
44. For an analysis of shell nouns such as news see Schmid (2000).
(36a) and (37a) are instances of the ditransitive construction, but for the b- and c-cases different analyses can be imagined: we can treat all of the b- and c-cases as instances of “caused-motion”, as Goldberg (2006a: 33) would, or we can treat the b-cases as allostructions of the ditransitive construction but (37c) as an instance of “caused-motion” because to Carbis Bay is an unspecified particle phrase (prepositional phrase) which can also be replaced by a single particle such as there. This is not possible in the case of (36b) or (37b). Semantically this could be justified on the grounds that both the NP in the a-examples and the to_NP in the b-examples can be seen as ‘RECIPIENTs’.45 A strong argument in favour of the latter analysis is presented by Rappaport Hovav and Levin’s (2008: 136) observation that “recipients and spatial path phrases” can be combined:

(37) d. Anne is curious as to why her father sent her a telegram to America to return home at once.
45. In the latter analysis (36b) and (37b) present a case of constructional homonymy, although in terms of valency patterns (37c) represents the category of unspecified particle phrase (Herbst and Schüller 2008, referred to as ADV in the VDE). See Rappaport Hovav and Levin (2008: 137–142) for a discussion of the different syntactic behaviour of give-, throw- and send-type verbs and their conclusion that “give-type verbs do not have a path argument”. See also Kay’s (2005: 76–77) distinction between an Intended Recipient Construction and a Direct Recipient Construction.

On the other hand, making a distinction between participant roles and argument roles, it could also be argued that verbs such as give have a ‘RECIPIENT’ participant, which is construed as a ‘PATH’-argument in a case such as (37b), as suggested by Goldberg (2013: 445).46 This view certainly captures the fact that sentences such as (37b) seem to have both a ‘RECIPIENT’ and a ‘PATH’ element in their semantics. Since choice of construction in the case of the dative alternation – and this is also true of many other choices of this kind which have not been discussed so widely in the literature – may not only be dependent on some prototypical meaning of the construction but on factors such as information structure and heaviness of constituents (Goldberg 2006a: 137–143, Rappaport Hovav and Levin 2008: 156),47 it may in some cases be very difficult to determine whether two valency patterns should be subsumed under one argument structure construction or not. It may thus make sense to extend the term allostruction to all valency patterns which can be seen as realisations of one participant pattern, which seems to be perfectly compatible with Goldberg’s (2013: 446) view that “the same verb meaning can appear in more than one argument structure construction, even though each argument structure construction has its own distinct semantics”. The constellation of a participant pattern (such as ‘CAUSER’ – ‘ÆFFECTED’ – ‘PREDICATIVE’) and all the valency constructions that can be seen as realisations of this participant pattern shall be called a constructeme (Herbst and Uhrig 2009). A constructeme can thus comprise one or more argument structure constructions. The concept of the constructeme seems useful because it illustrates the choices of linguistic expression that speakers have when they want to refer to a particular scene, and because it shows which constructions are similar enough in meaning to possibly give rise to pre-emption.
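For readers who think in terms of data structures, the notion of the constructeme can be illustrated with a minimal Python sketch. Everything in it is an assumption made for the sake of illustration: the class and function names, the role labels, the pattern notation (`Subj V NP NP`, `Subj V NP to_NP`) and the six toy entries, which are loosely modelled on the show, give, explain and say examples discussed in this section and are not taken from the VDE or the Valency Patternbank. In particular, treating say as sharing the participant pattern of the other verbs is a simplification.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class ValencyConstruction:
    """One item-specific pairing of a verb, a participant pattern and a formal pattern."""
    verb: str
    participants: tuple  # participant pattern, e.g. ("AGENT", "RECIPIENT", "THEME")
    pattern: str         # formal valency pattern (notation invented for this sketch)


# Hypothetical participant pattern shared by the toy entries below.
TRANSFER = ("AGENT", "RECIPIENT", "THEME")

# Hypothetical toy lexicon; illustrative only.
LEXICON = [
    ValencyConstruction("show", TRANSFER, "Subj V NP NP"),
    ValencyConstruction("show", TRANSFER, "Subj V NP to_NP"),
    ValencyConstruction("give", TRANSFER, "Subj V NP NP"),
    ValencyConstruction("give", TRANSFER, "Subj V NP to_NP"),
    ValencyConstruction("explain", TRANSFER, "Subj V NP to_NP"),
    ValencyConstruction("say", TRANSFER, "Subj V NP to_NP"),
]


def build_constructemes(lexicon):
    """Group valency constructions by participant pattern.

    Each participant pattern maps to its formal realisations (allostructions),
    and each allostruction maps to the set of verbs attested in it.
    """
    grouped = defaultdict(lambda: defaultdict(set))
    for vc in lexicon:
        grouped[vc.participants][vc.pattern].add(vc.verb)
    return grouped


def attested_patterns(constructemes, participants, verb):
    """Return the allostructions of one constructeme in which a verb is attested."""
    return sorted(
        pattern
        for pattern, verbs in constructemes[participants].items()
        if verb in verbs
    )


constructemes = build_constructemes(LEXICON)
for pattern, verbs in sorted(constructemes[TRANSFER].items()):
    print(pattern, sorted(verbs))
```

In this toy lexicon, `attested_patterns(constructemes, TRANSFER, "explain")` returns only the to_NP pattern, whereas show and give are listed for both patterns. The point of the grouping is that it places the competing realisations of one participant pattern side by side, which is the precondition for discussing pre-emption.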
Although we are able to identify differences in meaning between the constructions underlying the uses of verbs such as show or give in (36ab), the semantic overlap between them and in particular the fact that they serve to realise the same participant roles may be instrumental in explaining the fact that some verbs only seem to occur in one of them:

(38) a. He said he wanted to explain something to Beryl …
     b. *He wanted to explain Beryl something.
(39) a. Did your father say anything to you or to Swayne while you were there?
     b. *Did your father say you anything?

46. Goldberg’s (2013: 445) statement (made in a similar form by Rappaport Hovav and Levin 2008: 142) that give requires an “animate recipient” may be in need of modification if we want to include give rise/priority/credibility/weight/coverage etc. to inanimate X.
47. See also Bresnan and Ford (2010) and Gries and Stefanowitsch (2004: 104–107).

5.4.4
The valency realisation principle
Cases such as (38) and (39) are by no means exceptional. In fact, since it is very easy to show that we cannot predict a valency carrier’s syntactic behaviour (Herbst 1983, 2009, 2010; Faulhaber 2011) on the basis of its semantic properties, the item-specificity of the phenomenon of valency will have to be taken into account. The fact that often, as Goldberg and Jackendoff (2004: 542) say, “there is no explanation to be sought at this level of detail; there is only description” explains why a considerable amount of research done within the framework of the valency approach has resulted in valency dictionaries such as the VDE or VALBU. However, the idiosyncratic nature of valency phenomena has also been pointed out by researchers in the cognitive framework such as Goldberg (e.g. 2006b: 230–231, 2011: 319) or Croft.48 Nevertheless, the two principles Goldberg (2006a: 39) identifies to “constrain the ways in which the participant roles of a verb and the argument roles of a construction can be put into correspondence or ‘fused’” – the Semantic Coherence Principle and the Correspondence Principle – do not fully account for the idiosyncratic nature of syntactic realisations because the model does not provide for a formal specification of a verb’s valency properties.49 Since in the light of the evidence there can be no doubt that a level of formal specification is absolutely necessary (a point also made very strongly by Jacobs 2008),50 it will be suggested here to introduce a further principle – the Valency Realisation Principle (Herbst 2011):
Valency Realisation Principle: if a valency construction of a verb is fused with an argument structure construction and all of its participant roles are construed as argument roles, then the formal realisation of the argument structure construction (SYN) must coincide with the valency pattern of the valency construction.

The Valency Realisation Principle thus introduces a formal component into Goldberg’s model of argument structure constructions, while at the same time maintaining the explanatory value of the generalizations entailed by it. The introduction of the valency realisation principle prevents the model of argument structure constructions from making generalizations which are too powerful in that they cannot account for the unacceptability of the explain- and say-examples in (38b) and (39b), for instance. At the same time, the combination of the valency approach and Goldberg’s theory of argument structure constructions goes beyond the scope of traditional valency descriptions in that it can account for creative uses of language (which will be addressed in the next section). The Valency Realisation Principle does not prevent uses of verbs where arguments are contributed by the construction, as is the case with Goldberg’s (2006a: 73) famous sneezing-example:

(40) Pat sneezed the foam off the cappuccino.

48. Cf. Croft (2003: 61): “… closer examination of the linguistic facts always reveals idiosyncrasies that show that more specific representations are required than is usually thought”.
49. For a discussion of the correspondence principle see Herbst (2010: 235–239).
50. For a discussion see Jacobs (2008: 34), who says: “Die Annahme kategorialer Valenzen ist unumgänglich”. (‘There is no avoiding categorial valencies’; my translation.) The discussion on linking sometimes suggests the opposite.

5.5
The creative potential of argument structure constructions
From a cognitive, and, to a lesser extent perhaps, also from a descriptive point of view, the question arises which role knowledge about verb-specific valency properties and knowledge about general constructional properties play when speakers make use of language. There is a lot of evidence to suggest that item-specific knowledge is central in at least the early stages of L1-learning and that “[c]hildren’s networks build up towards more complex constructions (with more parts) and more abstract constructions (less item-specific)” (Lieven this volume: 2; see also MacWhinney this volume).51 The question is to what extent speakers lose item-specific knowledge and “replace” it by more general knowledge and to what extent both co-exist.

51. See also Goldberg, Casenhiser and Sethuraman (2004: 291), Goldberg (2006a: 63–64), Ibbotson and Tomasello (2009: 60), and Ambridge and Lieven (2011: 133–136). See also Bybee (2010: 45 and 103). For argument-structure overgeneralizations see Lieven and Noble (2011: 418–420).
Following Bybee’s (1995: 450) suggestion that regular past tense forms are still stored with high-frequency verbs, it might be reasonable to assume that knowing how particular verbs are used may be more important than reference to the ditransitive construction when producing or processing sentences such as

(1) I have given you the facts.
(34) a. You never told me that!
This is very much in line with Bybee’s (2010: 15) statement that “the speaker does not necessarily have to throw away the examples upon which the generalization is based”. In any case, there is no conflict between the valency constructions of the verb and the argument structure construction. One could even account for such sentences without claiming that “the construction … exists independently of the individual verbs that may occur with it” (Goldberg 2006b: 411). What makes such a claim rather plausible, however, are cases where the occurrence of a word in a particular construction would not necessarily have to be related to its valency properties, as in the case of:

(40) Pat sneezed the foam off the cappuccino.
(41) Jeannie blinks Tony down from an experimental aircraft.
(42) Schreib dich nach Honolulu. (Write yourself to Honolulu. TH)
There is a certain attraction in saying that uses such as (40) should neither be accounted for in terms of valency properties associated with the verb (because and as long as they are not established) nor in terms of special verb meanings (such as ‘cause something to move somewhere by sneezing’) – a point made by Goldberg (1995: 156).52 The caused-motion construction may thus be a good example of a construction being the determining factor. A similar case is presented by adjective constructions of the type

(43) Wasn’t it unwise of him to keep open house like that, with direct access to his office from a back lane?

which Hunston (this volume), working within the pattern grammar approach taken by Hunston and Francis (2000), discusses in the context of cases in which “the pattern might be argued to take precedence over the core word”, pointing out that it “is used with good, kind and nice to indicate a positive appraisal of someone’s actions” and that it “has a similar meaning even when apparently unrelated adjectives, such as big and subtle, are used in it” (Hunston, this volume). From a construction grammar point of view, the fact that a wide range of adjectives seems to be acceptable, including words such as authoritarian, big, brave, disloyal, helpful, mature (Herbst 1983: 124), which will all be interpreted as referring to behaviour when used in this construction (cf. Hunston and Francis 2000: 105), makes (43) and variants such as

(44) a. Nice of you to drop in.
     b. How nice of you to call.

very good candidates for evaluative adjective constructions which need not necessarily be seen as part of a valency description of all the adjectives that could occur in it.53 This does not mean, however, that all argument structure constructions have this creative potential in the same way. In fact, from the point of view of descriptive valency theory, the question arises whether all uses of verbs should necessarily be related to argument structure constructions (if we take argument structure constructions as generalizations over item-specific valency constructions). The Valency Patternbank contains a great number of hapax patterns (Stefanowitsch 2009); for instance, the pattern PCU1: to_INF PCU2: than_INF or PCU2: than_to_INF seems to occur only with the verb prefer:54

(45) a. You know, I prefer to drink wine than talk about it.
     b. There have been the predictable complaints that people increasingly prefer to watch than to play, that we are producing a generation of armchair athletes.

52. This does not necessarily mean that such constructions exist independently in the mind since such uses might also be explained in terms of analogy. For a critical view see Kay (2005). See also Goldberg (2011). For an account in terms of “‘pumping’ constructions” see Fillmore (this volume). See also Stefanowitsch (2011: 377–380). For an analysis of creativity in the resultative construction see Boas (2011).
53. See Hunston and Francis (2000: 105) and Herbst (2009: 59) for this.
54. Note that prefer + to_INF occurs more frequently in the BNC with rather than than with than; rather than also occurs in this pattern with verbs such as like. Compare also would rather.
Otherwise, than_to_INF occurs as a complement of more (irrespective of whether it functions as head or modifier) as in (46) or of adjectival comparatives as in (47):

(46) a. ‘I can’t think of anything that would please her more than to be investigated by you.’
     b. Wycliffe felt sure that in normal circumstances she was a cheerful woman, more ready to laugh than cry.
     c. I only hope he’s had more sense than to leave her money!
(47) You should know better than to ask such a question after all I’ve tried to teach you.
Since prefer also involves comparison, the occurrence of constructions (45a) and (45b) seems perfectly motivated; in fact, one could argue that the valency carriers for such comparative constructions include not only lexical items such as prefer and more but also other comparatives. The theory of argument structure constructions thus provides a very attractive way of accounting for creativity in language and for investigating generalizable patterns of semantic interaction between verb meaning and construction meaning. It thus clearly goes beyond the scope of a purely item-based description in the sense of classical valency theory, which per se is based on observed or established language use. At the same time, I would argue that a principle such as the Valency Realisation Principle is needed to provide a means for accounting for certain restrictions concerning this creativity – otherwise the theory would be too powerful.55

55. Cf. Welke’s (2011: 185) view that there is no one-way street from constructions to verbs or from verbs to constructions.

6.
The valency approach and the theory of argument structure constructions
There are thus very good arguments for combining the theory of argument structure constructions with the approach of corpus-based empirical valency research (see also Stefanowitsch 2011). Both focus on a phenomenon which takes a central position in the lexicogrammatical continuum, and since they do so from different perspectives they can be seen as complementary rather than contradictory.
It has been shown that a description of verb valency can easily be combined with the overall framework of construction grammar: thus valency descriptions for English such as the ones given in the Valency Dictionary of English explicitly draw upon syntactic constructions in that subjecthood is a crucial element in the definition of complements, and Herbst and Schüller (2008) relate valency to sentence types such as the declarative-‘statement’-construction etc. This does mean, of course, that in such an approach valency is seen as one, but not the sole determining factor with respect to clause structure, which is a move away from the principles of dependency grammar as outlined by Tesnière (1959: 102–103), for instance.56 Furthermore, as illustrated above, the phenomenon of valency can be envisaged in terms of item-based constructions.57 The distinctive characteristic of what one could call an empirical valency approach towards argument structure thus does not primarily concern the overall architecture of the grammar but the perspective taken, which is lexical (or lexico-grammatical), and the methodology employed. Both have to do with the historical background to the model: a lot of descriptive work within the valency framework has been done in a foreign language context, where the unpredictability of valency patterns for the foreign learner has resulted in lexicographic description. This entails a detailed description of item-specific properties of individual lexical units based, in many instances, on extensive corpus research. Construction grammarians, or grammarians generally, on the other hand, focus on the properties of the construction and investigate the types of generalizations that can be made. This perspective addresses a certain basic human need to search for regularity and explanation.

56. Compare, however, Hudson’s (2008) plea for dependency relations in constructional frameworks. See also Welke’s (2011) dependency-based account.
57. Note the striking similarity between many definitions of valency including Bühler’s (1934) famous statement about Leerstellen (quoted in note 4) and MacWhinney’s (2005: 53) definition of item-based constructions: “Item-based constructions open up slots for arguments that may occur in specific positions or that must receive specific morphological markings”, which underscores the compatibility of the two approaches. Nevertheless, seeing valency in terms of constructions is not the only way of combining the two approaches; see Jacobs (2008: 41 and 2009), who argues in favour of a projectionist view of valency.

If one takes the construction as one’s starting point, it is tempting to think that all verbs occurring in this construction should share properties beyond that of occurring in this particular construction.
Depending on the respective points of view and the purpose of an analysis, different levels of abstraction may seem appropriate. Formal valency patterns seem to present a level that is perfectly suitable for the purposes of an encoding dictionary that intends to provide the foreign learner with information about the constructions in which a lexical unit can occur, whereas valency dictionaries, which also serve descriptive purposes, or the FrameNet project aim at more comprehensive descriptions that give an indication of semantic roles or valency constructions. With respect to cognitive issues and in particular models of language acquisition, it would seem that the level of constructions – both item-specific valency constructions and generalizable argument structure constructions – is the most relevant.58 Which, in a way, takes us back to square one because the importance attributed to item-specificity and generalizability may again be a matter of perspective, although the view that “both item-specific knowledge and generalizations coexist” (Goldberg 2006a: 63) should be fairly uncontroversial and the fact that they aim to accommodate both can be seen as one of the great strengths of usage-based approaches. Surprisingly, even Ambridge and Lieven (2011: 262–263) seem to presuppose that there is an explanation for which verbs can occur in which constructions when they write:59

    Thus the cause of overgeneralization errors is the child’s failure to have acquired an adultlike understanding of the (semantic/phonological/pragmatic) properties of a particular slot or a particular item and/or an alternative construction which contains a slot that is a better fit for that item.

58. This is not to say that generalizations that concern only the formal or the semantic level should be ruled out. See Hudson (2008).
59. Ambridge and Lieven (2011: 260) describe the FIT account as follows: “Each slot in a construction is associated with particular semantic properties (and, in many cases, pragmatic, phonological and other properties …) that relate to the event-semantics of the construction. … The [ACTION] slot in the [AGENT] [ACTION] [GOAL] [PATIENT] double-object dative construction is an example of a slot with particular phonological (as well as semantic) properties: the item (verb) that fills this slot must be monosyllabic, or have stress on the first syllable (e.g. *The man suggested the woman the trip.).” This is quite obviously an overgeneralization since allow and guarantee, which clearly violate these criteria, belong to the “collexemes most strongly attracted to the ditransitive construction” (Stefanowitsch and Gries 2003: 229). See also Herbst (2011).
From the point of view of empirical valency research the point cannot be made too strongly that valency is a Norm-phenomenon (Herbst 1983), i.e. in parts a historical accident.60 In particular, it seems worth emphasizing that restrictions on the use of verbs in certain patterns need not necessarily be semantic in nature, or (synchronically) explicable at all: Faulhaber’s (2011: 302) analysis of 22 different verb groups revealed considerable differences in the valency patterns to be observed.61 It thus seems important to emphasize that there is no guarantee that a particular lexical item with certain semantic characteristics will be able to occur in a particular valency pattern simply because other lexical items with the same characteristics do. Whether one feels that in a particular instance one should “lump” or “split”, to take up the approaches also referred to by Goldberg (2006a: 45), depends on how closely one is prepared to look or how closely one feels one should look for a particular purpose. Nevertheless, there can be no doubt that item-specific properties exist on a large scale and must be accounted for not only at the level of argument structure constructions as specified by Goldberg but also with respect to the level of valency patterns. The proposals made here – specifying the possible formal realisations of argument structure constructions in terms of allostructions and introducing the valency realisation principle – are an attempt to provide for the item-specific properties of verbs within the theory of argument structure constructions.
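How the two proposals interact can be made concrete in a small Python sketch. It is an illustration under stated assumptions, not a formalisation endorsed by the sources cited: the class names, the function `licensed`, the pattern notation and all entries are hypothetical. Two simplifications should be noted. First, the participant roles of the verbs are written directly in terms of the argument roles they would be construed as, so that fusion reduces to a comparison of sets. Second, the principle is taken to apply only when the verb's participant roles match the argument roles of the construction one-to-one; where the construction contributes further roles, as with sneeze in the caused-motion construction, the sketch simply allows the use.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ValencyConstruction:
    roles: frozenset  # participant roles (already expressed as argument roles)
    pattern: str      # formal valency pattern (notation invented for this sketch)


@dataclass(frozen=True)
class ArgumentStructureConstruction:
    name: str
    roles: frozenset           # argument roles
    allostructions: frozenset  # possible formal realisations (SYN)


# Hypothetical inventory; illustrative only.
DITRANSITIVE = ArgumentStructureConstruction(
    "ditransitive",
    frozenset({"AGENT", "RECIPIENT", "THEME"}),
    frozenset({"Subj V NP NP", "Subj V NP that_CL"}),
)
CAUSED_MOTION = ArgumentStructureConstruction(
    "caused-motion",
    frozenset({"CAUSE", "THEME", "PATH"}),
    frozenset({"Subj V NP PP"}),
)

TRANSFER_ROLES = frozenset({"AGENT", "RECIPIENT", "THEME"})

# Hypothetical verb entries standing in for a valency dictionary.
VERBS = {
    "tell": [
        ValencyConstruction(TRANSFER_ROLES, "Subj V NP NP"),
        ValencyConstruction(TRANSFER_ROLES, "Subj V NP that_CL"),
    ],
    "explain": [ValencyConstruction(TRANSFER_ROLES, "Subj V NP to_NP")],
    "sneeze": [ValencyConstruction(frozenset({"CAUSE"}), "Subj V")],
}


def licensed(verb, construction, pattern):
    """Check a verb-construction pairing against the sketch of the principle."""
    if pattern not in construction.allostructions:
        return False  # not a formal realisation of this construction at all
    # Valency constructions whose roles all correspond to the argument roles.
    full_matches = [vc for vc in VERBS[verb] if vc.roles == construction.roles]
    if full_matches:
        # The principle applies: SYN must coincide with a stored valency pattern.
        return any(vc.pattern == pattern for vc in full_matches)
    # Otherwise the construction contributes roles (simplified creative case).
    return any(vc.roles < construction.roles for vc in VERBS[verb])


print(licensed("tell", DITRANSITIVE, "Subj V NP NP"))
print(licensed("explain", DITRANSITIVE, "Subj V NP NP"))
print(licensed("sneeze", CAUSED_MOTION, "Subj V NP PP"))
```

With these toy entries the check accepts tell in the double-object pattern, rejects explain in it (compare (38b)) and accepts sneeze in the caused-motion pattern (compare (40)). The final branch deliberately overgenerates: it treats every construction as equally open to creative use, which is precisely the assumption the chapter questions.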
In order to determine the relationship between lexical valency information and generalized argument structure constructions that exist autonomously in the minds of speakers it would be crucial to know whether all allostructions of an argument structure construction are subject to productive or creative use in the same way and when speakers draw upon which kind of knowledge in language use (Herbst 2011).62 Being able to recognize a generalization and attributing a meaning to a pattern does not necessarily mean that this knowledge is drawn upon when using a verb in an established valency construction. If we take into account the amount of item-specific knowledge in the area of collocation and the phenomena described by Sinclair (1991: 110) in terms of the idiom principle it would seem feasible if not likely that speakers tend to go for the constructions they associate with a particular lexical item and make creative use of generalized argument structure constructions whenever the mental lexicon does not provide “prefabricated valency constructions” for that particular item. This leads to the question of whether all valency patterns to be observed in the language can plausibly be related to generalized argument structure constructions of that kind or whether there are only relatively few such general constructions which enable speakers to go beyond what is on offer in terms of prefabricated stored valency constructions.63 In any case, combining elements of the valency approach with construction grammar seems to provide a suitable framework for investigating the character of generalizations and the relationship between generalizations and item-specific knowledge.

60. For aspects of historical development in verbs taking the ditransitive construction see Croft (2003: 59). For the distinction between System and Norm see Coseriu (1973: 41–48). See also Schmid (this volume).
61. Faulhaber (2011: 302), in a very cautious calculation, estimates that more than 50% of the restrictions observed in her material “cannot be accounted for in a systematic semantic way”.
62. See also Langacker’s (2005: 162–164) discussion of sneeze and kick in the caused-motion construction. Compare Boas (2003: 260–277) on sneeze and blow.
63. See also Goldberg (2006a: esp. 98–100).

References
Aarts, Bas 2011 Oxford Modern English Grammar. Oxford: Oxford University Press.
Aarts, Flor, and Jan Aarts 1988 English Syntactic Structures. 2d ed. New York/Leyden: Prentice Hall/Martinus Nijhoff.
Ágel, Vilmos 2000 Valenztheorie. Tübingen: Narr.
Allerton, David J. 1982 Valency and the English Verb. London: Academic Press.
Ambridge, Ben, and Elena Lieven 2011 Child Language Acquisition: Contrasting Theoretical Approaches. Cambridge: Cambridge University Press.
Boas, Hans C. 2003 A Constructional Approach to Resultatives. Stanford: CSLI Publications.
Boas, Hans C. 2011 Zum Abstraktionsgrad von Resultativkonstruktionen. In Sprachliches Wissen zwischen Lexikon und Grammatik, Stefan Engelberg, Anke Holler, and Kristel Proost (eds.), 37–69. Berlin/New York: Mouton de Gruyter.
Boas, Hans C. 2013 Cognitive construction grammar. In The Oxford Handbook of Construction Grammar, Thomas Hoffmann and Graeme Trousdale (eds.), 233–252. Oxford: Oxford University Press.
Borsley, Robert D. 1996 Modern Phrase Structure Grammar. Oxford/Cambridge, Mass.: Blackwell.
Breindl, Eva 2006 Präpositionalphrasen. In Dependenz und Valenz/Dependency and Valency: Ein internationales Handbuch der zeitgenössischen Forschung/An International Handbook of Contemporary Research. 2. Halbband/Volume 2, Vilmos Ágel, Ludwig M. Eichinger, Hans-Werner Eroms, Peter Hellwig, Hans Jürgen Heringer, and Henning Lobin (eds.), 936–951. Berlin/New York: Walter de Gruyter.
Bresnan, Joan 2001 Lexical-Functional Syntax. Malden/Oxford: Blackwell.
Bresnan, Joan, and Marilyn Ford 2010 Predicting syntax: Processing dative constructions in American and Australian varieties of English. Language 86: 168–213.
Bühler, Karl 1934 Sprachtheorie: Die Darstellungsfunktion der Sprache. Jena: Verlag von Gustav Fischer.
Bybee, Joan 1995 Regular morphology and the lexicon. Language and Cognitive Processes 10 (5): 425–455.
Bybee, Joan 2010 Language, Usage and Cognition. Cambridge: Cambridge University Press.
Cappelle, Bert 2006 Particle placement and the case for “allostructions”. In Constructions all over: Case Studies and Theoretical Implications, Doris Schönefeld (ed.) Special Issue of Constructions, urn:nbn:de:0009-4-6839, (accessed at http://www.academia.edu/1432971/Particle_placement_and_the_case_for_allostructions_, 22 August 2014).
Coseriu, Eugenio 1973 Probleme der strukturellen Semantik. Tübingen: Narr.
Croft, William 2003 Lexical rules vs. constructions: A false dichotomy. In Motivation in Language: Studies in Honor of Günter Radden, Hubert Cuyckens, Thomas Berg, René Dirven, and Klaus-Uwe Panther (eds.), 49–68. Amsterdam/Philadelphia: Benjamins.
The valency approach to argument structure constructions 209 Croft, William 2004 Logical and typological arguments for Radical Construction Grammar. In Construction Grammars: Cognitive Grounding and Theoretical Extensions, Jan-Ola Östman, and Mirjam Fried (eds.), 273–314. Amsterdam/Philadelphia: Benjamins. Croft, William, and David A. Cruse 2004 Cognitive Linguistics. Cambridge: Cambridge University Press. Duden. Die Grammatik 2009 8th ed. Mannheim/Wien/Zürich: Dudenverlag. Eisenberg, Peter 1999 Grundriß der deutschen Grammatik, Band 2: Der Satz. Stuttgart/ Weimar: Metzler. Emons, Rudolf 1974 Valenzen englischer Prädikatsverben. Tübingen: Niemeyer. Engel, Ulrich 1977 Syntax der deutschen Gegenwartssprache. Berlin: Schmidt. Faulhaber, Susen 2011 Verb Valency Patterns: A Challenge for Semantics-Based Accounts. Berlin/New York: de Gruyter Mouton. Fillmore, Charles 1968 The case for case. In Universals in Linguistic Theory, Emmon Bach and Robert T. Harms (eds.), 1–88. New York: Holt, Rinehart and Winston. Fillmore, Charles 1988 The mechanisms of “construction grammar”. In General Session and Parasession on Grammaticalization, Shelley Axmaker, Annie Jassier, and Helen Singmaster (eds.), 35–55. Berkeley: Berkeley Linguistics Society. Fillmore, Charles 2007 Valency issues in FrameNet. In Valency: Theoretical, Descriptive and Cognitive Issues, Thomas Herbst and Katrin Götz-Votteler (eds.), 129–160. Berlin/New York: Mouton de Gruyter. Fillmore, Charles 2008 Border Conflicts: FrameNet Meets Construction Grammar. In Proceedings of the XIII Euralex International Congress, Elisenda Bernal and Janet DeCesarius (eds.), 49–68. Barcelona: Universitat Pempeu Fabra. Fillmore, Charles this vol. Frames, Constructions and FrameNet. In Constructions, Collocations, Patterns, Thomas Herbst, Hans-Jörg Schmid, and Susen Faulhaber (eds.), 113–158. Berlin/Boston: de Gruyter Mouton.
210 Thomas Herbst Fillmore, Charles, and Beryl T. Atkins 1992 Toward a frame-based lexicon: The semantics of RISK and its neighbors. In Frames, Fields, and Contrasts: New Essays in Semantic and Lexical Organization, Adrienne Lehrer and Eva Feder Kittay (eds.), 75–188. Hillsdale/Hove/London: Lawrence Erlbaum Associates. Fillmore, Charles, Paul Kay, and Catherine M. O’Connor 1988 Regularity and idiomaticity in grammatical constructions: The case of let alone. Language 64: 501–538. Fillmore, Charles, Russell R. Lee-Goldman, and Russell Rhomieux 2012 The FrameNet Constructicon, In: Sign-Based Construction Grammar, Hans C. Boas and Ivan A. Sag (eds.), 309–372. Stanford: CSLI Publications. FrameNet https://framenet.icsi.berkeley.edu/fndrupal/ Francis, Gill, Susen Hunston, and Elizabeth Manning 1996 Collins Cobuild Grammar Patterns. 1: Verbs. London: HarperCollins. Götz-Votteler, Katrin 2007 Describing semantic valency. In Valency. Theoretical, Descriptive and Cognitive Issues, Thomas Herbst and Katrin Götz-Votteler (eds.), 37–49. Berlin/New York: Mouton de Gruyter. Goldberg, Adele 1995 Constructions: A Construction Grammar Approach to Argument Structure. Chicago: Chicago University Press. Goldberg, Adele 2006a Constructions at Work. Oxford/New York: Oxford University Press. Goldberg, Adele 2006b The inherent semantics of argument structure: The case of the English ditransitive construction. In Cognitive Linguistics: Basic Readings, Dirk Geeraerts (ed.), 401–437. Berlin/New York: Mouton de Gruyter. [Originally published 1992, Cognitive Linguistics 3 (1): 37– 74.] Goldberg, Adele 2010 Verbs, constructions and semantic frames. In: Lexical Semantics, Syntax and Event Structure, Malka Rappaport Hovav, Edit Doron, and Ivy Sichel (eds.), 39–58. Oxford: Oxford University Press. Goldberg, Adele 2011 Meaning arises from words, texts and phrasal constructions. In Argument Structure: Valency and/or Constructions? 
Thomas Herbst and Anatol Stefanowitsch (eds.), Special issue of ZAA 59 (4): 317–329.
The valency approach to argument structure constructions 211 Goldberg, Adele 2013 Argument Structure Constructions vs. Lexical Rules or Derivational Verb Templates. Mind and Language 28 (4):435–465. Goldberg, Adele, Devin M. Casenhiser, and Nitya Sethuraman 2004 Learning argument structure generalisations. Cognitive Linguistics 15 (3): 289–316. Goldberg, Adele, and Ray Jackendoff 2004 The English resultative as a family of constructions. Language 80 (3): 532–568. Gries, Stefan, and Anatol Stefanowitsch 2004 Extending collostructional analysis: A corpus-based perspective on ‘alternations’. International Journal of Corpus Linguistics 9 (1): 97– 129. Halliday, Michael A. K. 1967/68 Notes on transitivity and theme in English. Journal of Linguistics, Part 1: 37–81, Part 2: 199–244, Part 3: 179–215. Halliday, Michael A. K. 1994 An Introduction to Functional Grammar. 2d ed. London/Melbourne/Auckland: Arnold. Helbig, Gerhard 1992 Probleme der Valenz- und Kasustheorie. Tübingen: Niemeyer. Helbig, Gerhard, and Wolfgang Schenkel 1973 Wörterbuch zur Valenz und Distribution deutscher Verben. 2d ed. Leipzig: Verlag Enzyklopädie. Herbst, Thomas 1983 Untersuchungen zur Valenz englischer Adjektive und ihrer Nominalisierungen. Tübingen: Narr. Herbst, Thomas 2007 Valency complements or valency patterns? In Valency: Theoretical, Descriptive and Cognitive Issues, Thomas Herbst, and Katrin GötzVotteler (eds.), 15–35. Berlin/New York: Mouton de Gruyter. Herbst, Thomas 2009 Valency: Item-specificity and idiom principle. In Exploring the Lexis-Grammar Interface, Ute Römer and Rainer Schulze (eds.), 49–68. Amsterdam/Philadelphia: John Benjamins. Herbst, Thomas 2010 Valency constructions and clause constructions or how, if at all, valency grammarians might sneeze the foam off the cappuccino. In Cognitive Foundations of Linguistic Usage Patterns: Empirical Studies, Hans-Jörg Schmid and Susanne Handl (eds.), 225–255. Berlin/New York: de Gruyter Mouton.
212 Thomas Herbst Herbst, Thomas 2011 The status of generalisations: Valency and argument structure constructions. In Argument Structure: Valency and/or Constructions? Thomas Herbst and Anatol Stefanowitsch (eds.), Special issue of ZAA 59 (4): 347–367. Herbst, Thomas, David Heath, Ian Roe, and Dieter Götz. 2004 A Valency Dictionary of English. Berlin/New York: Mouton de Gruyter. [VDE] Herbst, Thomas, and Susen Schüller [now Faulhaber] 2008 Introduction to Syntactic Analysis: A Valency Approach. Tübingen: Narr. Herbst, Thomas, and Peter Uhrig 2009 The Erlangen Valency Patternbank. http://www.patternbank.unierlangen.de Heringer, Hans Jürgen 1996 Deutsche Syntax dependentiell. Tübingen: Stauffenburg. Huddleston, Rodney, and Geoffrey K. Pullum (eds.) 2002 The Cambridge Grammar of the English Language. Cambridge: Cambridge University Press. [CamG] Hudson, Richard 2008 Word grammar and construction grammar. In: Constructional Approaches to English Grammar, Graeme Trousdale and Nikolas Gisborne (eds.), 257–302. Berlin/New York: Mouton de Gruyter. Hunston, Susan this vol. Pattern grammar in context. In Constructions, Collocations, Patterns, Thomas Herbst, Hans-Jörg Schmid and Susen Faulhaber (eds.), 91–111. Berlin/Boston: de Gruyter Mouton. Hunston, Susan, and Gill Francis 2000 Pattern Grammar: A Corpus-Driven Approach to the Lexical Grammar of English. Amsterdam: John Benjamins. Ibbotson, Paul, and Michael Tomasello 2009 Prototype constructions in early language acquisition. Language and Cognition 1 (1): 59–85. Jacobs, Joachim 1994 Kontra Valenz. Trier: Wissenschaftlicher Verlag Trier. Jacobs, Joachim 2008 Wozu Konstruktionen? Linguistische Berichte 213: 3–44. Jacobs, Joachim 2009 Valenzbindung oder Konstruktionsbindung? Eine Grundfrage der Grammatiktheorie. ZGL: 490–513
Kay, Paul 2005
Argument structure constructions and the argument-adjunct distinction. In Grammatical Constructions: Back to the Roots. Mirjam Fried and Hans C. Boas (eds.), 71–98. Amsterdam/Philadelphia: Benjamins. Langacker, Ronald W. 2005 Integration, grammaticization, and constructional meaning. In Grammatical Constructions: Back to the Roots. Mirjam Fried and Hans C. Boas (eds.), 157–189. Amsterdam/Philadelphia: Benjamins. Langacker, Ronald W. 2008 Cognitive Grammar: A Basic Introduction. Oxford: Oxford University Press. Lieven, Elena this vol. First language learning from a usage-based approach. In Constructions, Collocations, Patterns, Thomas Herbst, Hans-Jörg Schmid, and Susen Faulhaber (eds.), 1–24. Berlin/Boston: de Gruyter Mouton. Lieven, Elena, and Claire Noble 2011 The acquisition of argument structure. In Argument Structure: Valency and/or Constructions? Thomas Herbst and Anatol Stefanowitsch (eds.), Special issue of ZAA 59 (4): 411–424. MacWhinney, Brian 2005 A unified model of language acquisition. In Handbook of Bilingualism: Psycholinguistic Approaches, Judith F. Kroll and Annette M. B. De Groot (eds.), 49–67. Oxford: Oxford University Press. MacWhinney, Brian this vol. Item-based patterns in early syntactic development. In Constructions, Collocations, Patterns, Thomas Herbst, Hans-Jörg Schmid, and Susen Faulhaber (eds.), 25–61. Berlin/Boston: de Gruyter Mouton. Matthews, Peter 1980 Syntax. Cambridge: Cambridge University Press. Müller, Stefan 2008 Head-Driven Phrase Structure Grammar: Eine Einführung. 2d ed. Tübingen: Stauffenburg. Müller, Stefan 2010 Grammatiktheorie. Tübingen: Stauffenburg. Newmeyer, Frederick J. 2003 Theoretical implications of grammatical category-grammatical relation mismatches. In Mismatch: Form-Function Incongruity and the Architecture of Grammar, Elaine J. Francis and Laura A. Michaelis (eds.), 149–178. Stanford, CA: CSLI Publications.
Quirk, Randolph, Sidney Greenbaum, Geoffrey Leech, and Jan Svartvik 1985 The Comprehensive Grammar of the English Language. London/New York: Longman. [CGEL] Radden, Günter, and René Dirven 2007 Cognitive English Grammar. Amsterdam/Philadelphia: Benjamins. Rappaport Hovav, Malka, and Beth Levin 2008 The English dative alternation: The case for verb sensitivity. Journal of Linguistics 44: 129–167. Ruppenhofer, Josef, Michael Ellsworth, Miriam R. L. Petruck, Christopher R. Johnson, and Jan Scheffczyk 2010 FrameNet II: Extended Theory and Practice, http://framenet2.icsi.berkeley.edu/docs/r1.5/book.pdf (accessed 26 November 2013). Schmid, Hans-Jörg 2000 English Abstract Nouns as Conceptual Shells: From Corpus to Cognition. Berlin/New York: Mouton de Gruyter. Schmid, Hans-Jörg this vol. Lexico-grammatical patterns, pragmatic associations and discourse frequency. In Constructions, Collocations, Patterns, Thomas Herbst, Hans-Jörg Schmid, and Susen Faulhaber (eds.), 231–285. Berlin/Boston: de Gruyter Mouton. Schumacher, Helmut, Jacqueline Kubczak, Renate Schmidt, and Vera de Ruiter 2004 VALBU – Valenzwörterbuch deutscher Verben. Tübingen: Narr. Sells, Peter 1985 Lectures on Contemporary Syntactic Theories. Stanford: Center for the Study of Language and Information, Stanford University. Sinclair, John 1991 Corpus, Concordance, Collocation. Oxford: Oxford University Press. Sinclair, John 2004 Trust the Text. London/New York: Routledge. Somers, Harald L. 1987 Valency and Case in Computational Linguistics. Edinburgh: Edinburgh University Press. Stefanowitsch, Anatol 2009 Die Patternbank und ihre Bedeutung für die moderne Grammatikforschung. www.patternbank.uni-erlangen.de/pattern-bank_festvortrag.pdf (accessed 26 November 2013). Stefanowitsch, Anatol 2011 Argument Structure: Item-based or distributed? In Argument Structure: Valency and/or Constructions? Thomas Herbst and Anatol Stefanowitsch (eds.), Special issue of ZAA 59 (4): 369–386.
Stefanowitsch, Anatol this vol. Collostructional analysis: A case study of the English into-causative. In Constructions, Collocations, Patterns, Thomas Herbst, Hans-Jörg Schmid, and Susen Faulhaber (eds.), 209–230. Berlin/Boston: de Gruyter Mouton. Stefanowitsch, Anatol, and Stefan Th. Gries 2003 Collostructions: Investigating the interaction between words and constructions. International Journal of Corpus Linguistics 8 (2): 209–243. Tesnière, Lucien 1959 Eléments de syntaxe structurale. Paris: Librairie C. Klincksieck. Tomasello, Michael 2003 Constructing a Language. Cambridge, Mass./London: Harvard University Press. Welke, Klaus 1995 Dependenz, Valenz und Konstituenz. In Dependenz und Valenz, Ludwig M. Eichinger, and Hans-Werner Eroms (eds.), 163–175. Hamburg: Buske. Welke, Klaus 2011 Valenzgrammatik des Deutschen: Eine Einführung. Berlin/New York: de Gruyter. Zifonun, Gisela, Ludger Hoffmann, Bruno Strecker et al. 1997 Grammatik der deutschen Sprache. 3 Bände. Berlin/New York: Walter de Gruyter.
BN    Ben Nicholson. By Virginia Button (2007). London: Tate Enterprises.
BNC   British National Corpus. Distributed by Oxford University Computing Services on behalf of the BNC Consortium. http://www.natcorp.ox.ac.uk/.
QE    Quoted example.
TI    Tate St Ives. Tate Gallery St Ives. Barbara Hepworth Museum and Sculpture Garden. An Illustrated Companion. By Michael Tooby (1993). London: Tate Gallery Publications.
VDE   A Valency Dictionary of English (see bibliography).
WCD   Wycliffe and the Cycle of Death. By W.J. Burley ([1990] 1991). London: Corgi Books.
WDF   Wycliffe and the Dead Flautist. By W.J. Burley ([1991] 1992). London: Corgi Books.
WDM   Wycliffe and the Dunes Mystery. By W.J. Burley ([1993] 1994). London: Corgi Books.
WFJ   Wycliffe and the Four Jacks. By W.J. Burley ([1985] 1988). London: Corgi Books.
Collostructional analysis: A case study of the English into-causative 1
Anatol Stefanowitsch
1. Introduction
Corpora are, by now, well-established tools for linguists of all theoretical persuasions, but for a long time they were (and often still are) mostly seen as a source for concordances to be browsed impressionistically or, even worse, as sources for hand-picked examples supporting some piece of linguistic argumentation arrived at a priori and without any recourse to empirical data. Unlike computational linguists, even corpus linguists in a narrower sense have been slow to recognize the true potential of linguistic corpora, which lies in their systematic, exhaustive and rigorously quantified analysis. If statistical tests were systematically applied at all, it was mostly in the domain of word associations (i.e., collocations, cf. for example, Church and Hanks 1991).

This was the state of affairs when Stefan Gries and I started thinking about how to approach grammatical constructions from a quantitative perspective. Both of us were, for slightly different reasons, interested in the question of how particular grammatical structures and “rules” (in a wide sense of the word) interacted with individual lexical items; thus, it seemed natural to extend existing research on collocations to the lexicon-grammar interface (see Stefanowitsch and Gries 2009 for a discussion that places collostructional analysis in the context of this research). And while it has since developed into an independent strand of research, this is, in essence, what collostructional analysis was originally intended to be and what it still is.

1. Having already written three handbook articles or introductory chapters on collostructional analysis, two as sole author (Stefanowitsch 2011, 2012) and one jointly with Stefan Gries (Stefanowitsch and Gries 2009), it is no easy task to write yet another introductory chapter on the method without repeating myself. I have tried to put these papers out of my mind and start from a blank page, but I cannot guarantee that I have been successful throughout. I thank Stefan Gries for the many years over which we developed and wrote about our method together, which were among the intellectually most stimulating of my career. I also thank the many other researchers who have used our method, especially Stefanie Wulff and Martin Hilpert, as well as those of our critics who have made an honest attempt to grasp what they were criticizing, especially Hans-Jörg Schmid.

In this chapter, I will provide a general, practically oriented introduction to the basic methods of collostructional analysis, aimed at students of linguistics and related fields with little or no background in quantitative methods. I will present these methods in increasing order of conceptual sophistication (which, perhaps unsurprisingly, is the order in which they were developed). Throughout the chapter, I will use as an example a construction that we have called the into-causative, instantiated in examples such as They forced us into accepting a compromise. This construction has been used as an example in a number of the studies in which collostructional analysis and its various extensions were first introduced, thus providing a bridge from this introductory chapter to the research literature.

In Section 2, I will present a minimum of theoretical and descriptive background, introducing both the construction in question and, in very general terms, the construction-grammar approach to argument structure that normally underlies case studies in collostructional analysis.2 In Section 3, I will introduce the original version of collostructional analysis, which I have retrospectively renamed simple collexeme analysis in some of my own writings, as collostructional analysis is now used as a cover term for an entire family of methods. Simple collexeme analysis is a method for investigating associations between one construction and the words occurring in one particular slot of that construction. In Section 4, I will introduce a fairly simple, but, it has turned out, very useful extension called distinctive collexeme analysis, which is a method for investigating the association of words to two constructions that are somehow functionally related. In Section 5, I will introduce a variant we have called covarying collexeme analysis, which is geared towards investigating associations of words in different slots of the same construction. In Section 6, I will briefly introduce a variant of simple collexeme analysis that one might call negative collexeme analysis; this is simply an extension of the method to words that do not occur in a given construction. In Section 7, I will provide a brief outlook on the future of collostructional analysis.

2. This is not to suggest that collostructional analysis is, or should be, restricted to construction-grammar analyses; it can be applied within any theoretical framework that accords any place whatsoever to language use (“performance”). Its close association to construction grammar is due in part to the fact that the conceptual separation of lexical items from argument structures typically assumed in construction grammar resonates very well with the analytic separation of lexical items from argument structures necessary for a collostructional analysis. However, the fact that Gries and I, at the time of presenting our findings, happened to consider ourselves construction grammarians (as I still do) and that our immediate research community therefore consisted mainly in construction grammarians and cognitive linguists more generally, has clearly also played a role.

2. Theoretical and descriptive background
As mentioned above, I will use the into-causative as an example, by which I mean a pattern that is frequently used for expressing causation in English and that is instantiated in examples like the following:

(1) a. [T]he British and Americans tried to force de Gaulle into accepting what they presented as a reasonable compromise on these issues. (BNC HXU)
    b. [T]he defendant was able to trick computer security systems into believing he was a bona fide systems analyst. (BNC K5D)
    c. You think I talked Peter into giving me those earrings, don't you? (BNC JXU)

In active declarative main clauses, this construction has the general form in (2) (it is not limited to such clauses but is also found in the passive voice, in subordinate clauses of all kinds, in interrogatives and imperatives, etc.):

(2) [SUBJ Vfinite OBJ into Vgerund]
Semantically, the subject refers to some agent, inanimate force or event which acts, in the manner specified by the finite verb, on some intermediary agent, encoded by the direct object, who in turn acts in the way specified by the gerund. Following established typological terminology, I will refer to the referent of the subject as the causer, the referent of the object as the causee, the event encoded by the finite verb as the causing event and the event encoded by the gerund as the caused or resulting event.

In the canonical analytic causative constructions in English the (finite) matrix verb encoding the causing event is semantically relatively neutral, encoding nothing beyond the fact that causation takes place (make as in I made my kids help me, have as in I had my kids cut the lawn and cause as in The hot summer caused the lawn to thin out, see Stefanowitsch 2001). In the into-causative, in contrast, the matrix verb refers not to the act of causation itself, but to a specific action taken by the causer and directed at the causee, which is understood to be the causing event. This property in particular makes the into-causative a natural candidate for an argument-structure construction in the sense of Goldberg (1995), as the semantic component CAUSE can only plausibly be attributed to the pattern as a whole. This general description yields the following argument structure construction (in the sense of Goldberg 1995):3

(3) The into-causative (Version 1)

    Sem   CAUSE   ‹ causer   causee   ‹ result › ›
    Syn   V         SUBJ     OBJ      [into Vgerund]
The representation in (3) is deliberately schematic; crucially, it suggests that any verb can be used in the CAUSE slot. This is not the case, but this aspect of the construction will be specified inductively on the basis of the data discussed below.

3. Simple collexeme analysis
In its first (and simplest) variant, collostructional analysis aims at uncovering statistical associations that hold between a particular slot in a construction and the words occurring in that slot (cf. Stefanowitsch and Gries 2003). The statistical association between a particular lexical item li belonging to a word class L (for example, the item force from the class Verb) and a particular construction c belonging to a class of constructions C (for example, the into-causative from the class of Argument-Structure Constructions) can be determined on the basis of the occurrence frequencies shown in Table 1: (i) the frequency of the lexical item li in the construction c; (ii) the frequency of the lexical item li in other constructions (i.e. ¬c) of class C; (iii) the frequency of other lexical items (i.e. ¬li) of word class L in the construction c; and (iv) the frequency of other members of word class L in other constructions of class C.

3. As is customary, the construction is represented here in its canonical (active, declarative, main-clause) form; this is not meant to suggest that the semantic roles causer, causee and result are directly linked to surface grammatical relations or that the construction can only occur in this canonical form; of course, it can occur in passives (e.g. De Gaulle was forced into accepting a compromise), subordinate clauses (as in 1a), etc.

Table 1. Simple collexeme analysis (schematic)

                                 Word li of Class L            Other Words of Class L         Total
Construction c of Class C        Frequency of L(li) in C(c)    Frequency of L(¬li) in C(c)    Total frequency of C(c)
Other Constructions of Class C   Frequency of L(li) in C(¬c)   Frequency of L(¬li) in C(¬c)   Total frequency of C(¬c)
Total                            Total frequency of L(li)      Total frequency of L(¬li)      Total frequency of C
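As a minimal illustration, the four cells of Table 1 can be derived by subtraction from four counts: the frequency of the word in the construction, the total frequency of the word, the total frequency of the construction, and the total frequency of all constructions of the class. The following Python sketch assumes exactly this setup. The function and parameter names are hypothetical and not part of any existing tool, and the example counts are the ones given for trick in Table 2.

```python
def contingency_table(word_in_cx, word_total, cx_total, class_total):
    """Return the 2x2 table [[a, b], [c, d]] laid out as in Table 1.

    Rows: construction c, other constructions of class C.
    Columns: word li, other words of class L.
    (Hypothetical helper, introduced only for illustration.)
    """
    a = word_in_cx                    # li in c
    b = cx_total - word_in_cx         # other words in c
    c = word_total - word_in_cx       # li in other constructions
    d = class_total - cx_total - c    # other words in other constructions
    if min(a, b, c, d) < 0:
        raise ValueError("inconsistent counts")
    return [[a, b], [c, d]]


# Counts for trick and the into-causative as reported in Table 2.
table = contingency_table(word_in_cx=92, word_total=340,
                          cx_total=1586, class_total=10206300)
print(table)
```

Deriving the cells from totals in this way mirrors how the frequency of other argument structure constructions is obtained in the case study, namely by subtracting the into-causative from the total.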
On the basis of this table, the direction of association (i.e., whether li is more frequent or less frequent in c than expected) can be determined on the basis of calculating the expected frequencies using standard procedures, and the statistical significance of this association can be determined using any statistical test for contingency tables, such as the Fisher-Yates exact test.4 Table 2 exemplifies this for the verb trick in the into-causative (with expected frequencies given in parentheses, rounded to whole numbers).

4. Note that collostructional analysis, as a quantitative corpus-linguistic method, requires some statistical background knowledge concerning the calculation of expected frequencies on the basis of a contingency table such as that in Table 3 and inferential statistical tests suitable for determining the statistical significance of the observed distribution. As there are, by now, plenty of excellent statistical textbooks available, many of them aimed specifically at linguists, I will not provide any discussion of these issues here.

Table 2. Simple collexeme analysis: trick in the into-causative

                  trick        other verbs            Total
into-causative    92 (0)       1494 (1586)            1586
Other ASCs5       248 (340)    10204466 (10204374)    10204714
Total             340          10205960               10206300
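The calculation behind Table 2 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, the test is implemented as a one-tailed Fisher-Yates exact test for attraction (the probability of observing at least as many co-occurrences as were actually found), and the standard library is used instead of a statistics package. Whether published p-values were computed in exactly this way is not stated in the text.

```python
from math import exp, lgamma


def log_choose(n, k):
    """Natural logarithm of the binomial coefficient C(n, k)."""
    return lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)


def expected_frequencies(table):
    """Expected cell frequencies: row total * column total / grand total."""
    (a, b), (c, d) = table
    total = a + b + c + d
    rows = (a + b, c + d)
    cols = (a + c, b + d)
    return [[row * col / total for col in cols] for row in rows]


def fisher_attraction_p(table):
    """One-tailed Fisher-Yates p-value: P(X >= a) with all margins fixed.

    Assumption: only attraction (more co-occurrences than expected) is tested.
    """
    (a, b), (c, d) = table
    total = a + b + c + d
    cx_total = a + b      # frequency of the construction
    word_total = a + c    # frequency of the word
    log_denominator = log_choose(total, word_total)
    p = 0.0
    # Sum hypergeometric probabilities for k = a, a + 1, ..., maximum possible.
    for k in range(a, min(cx_total, word_total) + 1):
        p += exp(log_choose(cx_total, k)
                 + log_choose(total - cx_total, word_total - k)
                 - log_denominator)
    return p


# Observed frequencies for trick, laid out as in Table 2.
TRICK = [[92, 1494], [248, 10204466]]

if __name__ == "__main__":
    print(expected_frequencies(TRICK))
    print(fisher_attraction_p(TRICK))
```

Working with logarithms of binomial coefficients keeps the intermediate values manageable even though the corpus counts run into the millions.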
Once this procedure has been repeated for all lexical items (l1...n) that occur in c in a given corpus, the collexemes can be ranked by the strength of their association to c and serve as a basis for interpretation in the context of the research question under investigation. Lexical items that are (significantly) more frequent than expected are called (significantly) attracted collexemes, those that are significantly less frequent are called (significantly) repelled collexemes. I will focus on the significantly attracted collexemes for now, and return to the issue of significantly repelled collexemes in Section 5 below.

Table 3. Top 20 significantly attracted collexemes in the V slot of the into-causative (cf. Stefanowitsch and Gries 2003: 225)

      Collexeme    Freq. Corp.   Freq. CX (Obs.)   Freq. CX (Exp.)   P-value6
 1.   trick              340            92             0.0528        2.11E-267
 2.   fool               769            77             0.1195        1.68E-187
 3.   coerce             168            53             0.0261        1.15E-158
 4.   force            11940           101             1.8554        6.31E-136
 5.   mislead           1805            57             0.2805        9.57E-110
 6.   bully              467            45             0.0726        2.53E-109
 7.   deceive            706            48             0.1097        5.94E-109
 8.   con                109            34             0.0169        4.41E-102
 9.   pressurise         288            39             0.0448        4.80E-101
10.   provoke           1976            48             0.3071        4.05E-87
11.   pressure           135            30             0.0210        3.88E-85
12.   cajole              85            28             0.0132        4.08E-85
13.   blackmail          203            25             0.0315        3.31E-64
14.   dupe               114            19             0.0177        7.77E-52
15.   coax               317            22             0.0493        6.00E-51
16.   delude             161            19             0.0250        8.83E-49
17.   talk             28699            62             4.4597        2.38E-48
18.   goad               147            18             0.0228        1.35E-46
19.   shame              232            19             0.0361        1.28E-45
20.   brainwash           62            13             0.0096        2.42E-37

5. The frequency of other ASCs was determined by subtracting cases of the into-causative from the total number of ASCs in the corpus; the total number of ASCs in the corpus is assumed to be equal to the total number of verbs (excluding modal verbs).

6. The p-value is shown in scientific notation, i.e. 2.11E-267 is to be read as 2.11 × 10^-267.
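To rank several collexemes, the same test is repeated for each verb and the results are sorted. The sketch below does this for four verbs from Table 3. As an assumption that goes beyond the text, it sorts by the negative base-10 logarithm of the one-tailed p-value, which yields the same order as sorting by p itself but avoids numerical underflow for very small values. All names are hypothetical, and only attraction (not repulsion) is tested.

```python
from math import exp, lgamma, log

CORPUS_TOTAL = 10206300   # total number of ASCs assumed in the case study
CX_TOTAL = 1586           # tokens of the into-causative

# (verb, corpus frequency, observed frequency in the construction), from Table 3
VERBS = [("talk", 28699, 62), ("brainwash", 62, 13),
         ("trick", 340, 92), ("force", 11940, 101)]


def log_choose(n, k):
    """Natural logarithm of the binomial coefficient C(n, k)."""
    return lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)


def neg_log10_p(observed, word_total, cx_total, corpus_total):
    """-log10 of the one-tailed Fisher-Yates p-value P(X >= observed)."""
    log_denominator = log_choose(corpus_total, word_total)
    log_terms = [
        log_choose(cx_total, k)
        + log_choose(corpus_total - cx_total, word_total - k)
        - log_denominator
        for k in range(observed, min(cx_total, word_total) + 1)
    ]
    # Log-sum-exp: factor out the largest term before exponentiating.
    peak = max(log_terms)
    log_p = peak + log(sum(exp(term - peak) for term in log_terms))
    return -log_p / log(10)


def rank_collexemes(verbs):
    """Return (verb, observed, expected, strength) sorted by strength."""
    rows = []
    for verb, word_total, observed in verbs:
        expected = word_total * CX_TOTAL / CORPUS_TOTAL
        strength = neg_log10_p(observed, word_total, CX_TOTAL, CORPUS_TOTAL)
        rows.append((verb, observed, expected, strength))
    return sorted(rows, key=lambda row: row[3], reverse=True)


if __name__ == "__main__":
    for verb, observed, expected, strength in rank_collexemes(VERBS):
        print(f"{verb:10s} obs={observed:4d} exp={expected:8.4f} "
              f"-log10(p)={strength:6.1f}")
```

Note that talk has the second-highest observed frequency of the four verbs but ranks below force and trick, because its high corpus frequency also gives it a much higher expected frequency.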
A list of verbs such as this can have a number of uses for both practical and theoretical purposes. Practical purposes include lexicography and foreign language teaching, where this list may guide the lexicographer in deciding which lexical entries need to mention the complementation pattern referred to as into-causative here, and the language teacher in deciding which of the combinations to treat as chunks in teaching vocabulary.

From a theoretical perspective, a frequent use that has been made of such lists is to base semantic characterizations of the construction in question on them. The list essentially tells us which verbs are typical for a given construction (or complementation pattern), and under the plausible assumption that the verbs most typically occurring with a given construction reflect its meaning (cf. Goldberg 1995: 39–43), this tells us something about the construction itself.7

The top twenty collexemes of the into-causative fall broadly into four classes, namely (in decreasing order of typicality): verbs of trickery (trick, fool, mislead, deceive, con, dupe, delude), verbs of force or the threat of force (coerce, force, bully, pressurise, pressure, blackmail), verbs of persuasion/communication (cajole, talk, coax), and, perhaps related to these, verbs of provocation (provoke, goad). In addition, there are two verbs that do not fall into any of these classes, a verb of negative emotion (shame) and the verb brainwash.

7. Note that, while this type of analysis seems most obviously useful within construction grammar (or other theories that recognize the existence of complex linguistic elements that are wholly or partially independent of specific lexical items, such as Pattern Grammar [see Hunston this volume]), it is not limited to such theories. Even in a framework that assumes, for example, that argument structure is part of the specification of individual lexical items, it can be useful to determine the relative importance that particular specification has for particular lexical items. Of course, in this case, a kind of reverse simple collexeme analysis would be more useful, where instead of looking at associations between one construction/pattern and the set of words it occurs with, one looks at associations between one verb and the set of valence patterns it occurs with.

The first thing this list tells us is that we are indeed dealing with a very productive syntactic construction, rather than a marginally productive idiom of the kind that Kay (2012: 33–35) calls “patterns of coining”: the heterogeneity of even just the top twenty verbs is too great to assume that these are due to analogical extensions from some canonical form. The second thing the list tells us is that the semantic characterization of the verb slot must be quite general, as the verbs seem to have very little in common beyond the extent of the semantic classes I grouped them into and as the classes themselves also do not easily fall under any general label other than the very abstract CAUSE.

However, the third thing the list tells us is that CAUSE does not seem to be the right semantic characterization of the verb slot: If it were, we would expect the most typical verb (or at least one of the most typical verbs) in the construction to instantiate this semantic component directly (cf. Goldberg 1995: 39–43). However, the two verbs in English that do instantiate this component (relatively) directly, cause and make, do not occur among the top collexemes – in fact, as we will see in Section 5, they do not occur in the construction at all. Instead, all of the typical verbs encode a means of causation. While this is quite a frequently found relation between the meaning of a construction and the meanings of the verbs occurring in it, the into-causative would be the only construction I am aware of that does not permit verbs to occur in it that instantiate its meaning directly.

What the list does not tell us straightforwardly is what the correct characterization of the construction’s semantics might be. Unlike for many of the constructions discussed by Goldberg (1995), none of the most typical verbs can be understood as being synonymous with the construction here. Regardless of which of the top collexemes we choose, we would not be able to account for the fact that verbs from the other semantic classes occur in the construction very frequently.8

8. Note that in all of our earlier work on the construction, Gries and I actually assume that CAUSE is the right characterization; one might assume that we did not trust our method’s inductive potential enough to follow it through to its ultimate conclusion, but as far as I can reconstruct, this is not the case. It seems that we were simply not paying enough attention to this particular oddity of the into-causative.
On the other hand, the relatively clearly definable and distinguishable semantic classes tell us that, in addition to whatever general meaning we ascribe to the construction, we would be justified in positing subsenses of the construction corresponding to these classes in order to account for the fact that, while the construction seems to be capable of accommodating a wide range of verbs, the ones that occur in it routinely form semantic clusters (cf. Goldberg 1995: 31–39). As just mentioned, collostructional analysis cannot in this case provide a straightforward answer to the question of what meaning to ascribe to the construction. This should not be seen as a major drawback, however, since this kind of information is not expected to just fall out from a quantitative analysis. Still, the crucial hint as to where we must look for the meaning of the into-causative is contained in the list. Note that most of the top 20 collexemes are verbs that seem to stand in a MANNER relationship to the meaning CAUSE: to trick someone into doing something means ‘to cause someone to do something by means of tricking them’, to coerce someone into doing something means ‘to cause someone to do something by means of coercing them’, etc. This is not the case with the verb talk, however: to talk someone into doing something does not have the general meaning ‘to cause someone to do something by means of talking to them’ – which would cover situations where someone is ordered to do something, persuaded to do it, threatened into doing it, tricked into doing it by being told something false, etc. Instead, it is only applicable in situations where someone is persuaded to do something by means of talking. Since the meaning ‘persuade’ is not contained in the verb talk, it must come from the construction. In other words, the construction’s meaning is not simply CAUSE, but something like ‘persuade’ or ‘convince’. 
I suggest the paraphrase CAUSE-DECIDE, since this seems to me to be what distinguishes the meaning of persuade/convince from that of cause – the former involve a conscious decision by the causee to perform the resulting event (although this decision does not have to be voluntary). This characterization, shown in (4), has the advantage that the instance relation is now permitted in the construction (cf. 5a–b):
(4) The into-causative (Version 2)

    Sem                  CAUSE-DECIDE   ‹ causer   causee   result         ›
    R: instance, means   R              ‹                                  ›
    Syn                  V                SUBJ     OBJ      [into Vgerund]

(5) a. She allowed a man to persuade her into taking the drug… [BNC CCW]
    b. … his extremely clever speech then persuaded the crowd into thinking his way ... [BNC KA1]
4. Distinctive collexeme analysis
Distinctive collexeme analysis is a variant of the original method aimed at uncovering differences in the statistical associations that hold between a particular slot in two related constructions and the words occurring in that slot. This method has frequently been applied to so-called syntactic ‘alternations’ such as the ditransitive-dative alternation, but it can be applied usefully in other contexts too (cf. Gries and Stefanowitsch 2004a). The statistical association between a particular lexical item li belonging to a word class L (for example, the item give from the class Verb) and a member of a pair of constructions c and d belonging to a class of constructions C (for example, the ditransitive and the dative from the class of Argument-Structure Constructions) can be determined on the basis of the occurrence frequencies shown in Table 4: (i) the frequency of the lexical item li in the construction c; (ii) the frequency of the lexical item li in the construction d; (iii) the frequency of other lexical items (i.e. ¬li) of word class L in the construction c; and (iv) the frequency of other members of word class L in the construction d.

Table 4. Distinctive collexeme analysis (schematic)

                         Construction c of Class C     Construction d of Class C     Total
Word li of Class L       Frequency of L(li) in C(c)    Frequency of L(li) in C(d)    Combined frequency of L(li) in C(c,d)
Other Words of Class L   Frequency of L(¬li) in C(c)   Frequency of L(¬li) in C(d)   Combined frequency of L(¬li) in C(c,d)
Total                    Total frequency of C(c)       Total frequency of C(d)       Combined frequency of C(c,d)
On the basis of such a table, the direction of association (i.e., whether li is more frequent than expected in c or d) can be determined on the basis of calculating the expected frequencies using standard procedures, and the statistical significance of this association can be determined using any statistical test for contingency tables, such as the Fisher-Yates exact test. Once this procedure has been repeated for all lexical items (l1...n) that occur in c or d in a given corpus, the collexemes can be ranked by the strength of their association to c or d and serve as a basis for interpretation in the context of the research question under investigation. Lexical items that are (significantly) more frequent than expected in one of the constructions (which will automatically make them (significantly) less frequent than expected in the other) are called (significantly) distinctive collexemes of that construction.

As mentioned above, the method has typically been used in the context of conventionalized grammatical alternatives. The into-causative has no such alternative: although English has a range of causative constructions, it does not seem to have a special relationship with any of them. Distinctive collexeme analysis can still be useful in the analysis of the into-causative, however: As I demonstrate in Stefanowitsch (2006a, 2009), it may be used to determine the specific semantic properties of a construction as compared to a general paraphrase of its meaning. For example, one might wonder whether the English have-to construction expresses a specific type of obligation as opposed to the verb must (it turns out that it does not), or whether the superficially analogous German modal infinitive with haben expresses a specific type of obligation as opposed to the verb müssen (it turns out that it does). 
In the context of the present analysis of the into-causative, it might be enlightening to use it to identify the distinctive collexemes of the into-causative as compared to the pattern persuade-to-V; since persuade seems to be among the closest lexical equivalents to the meaning I attributed to the into-causative in the preceding section, the distinctive collexemes should allow us to identify more specific semantic properties of the construction. Table 5 exemplifies this for the verb think in the two patterns (with expected frequencies given in parentheses).
Table 5. Distinctive collexeme analysis: think in the into-causative and the pattern persuade-to-V (BNC, Sample)

              into-causative   persuade-to-V   Total
think         111 (34)         11 (88)         122
other verbs   1179 (1256)      3366 (3288)     4545
Total         1290             3377            4667
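The calculation behind Table 5 can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions, not the procedure actually used for the chapter: the function names are hypothetical, only the standard library is used, and the p-value is assumed to be one-tailed (the probability of finding at least as many tokens of think in the into-causative as were observed), which the text does not state explicitly.

```python
from math import exp, lgamma


def log_comb(n: int, k: int) -> float:
    """Natural log of the binomial coefficient C(n, k)."""
    return lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)


def expected(a: int, b: int, c: int, d: int) -> list[list[float]]:
    """Expected cell frequencies: row total * column total / grand total."""
    n = a + b + c + d
    rows, cols = (a + b, c + d), (a + c, b + d)
    return [[r * col / n for col in cols] for r in rows]


def fisher_upper_tail(a: int, b: int, c: int, d: int) -> float:
    """One-tailed exact p (assumption): P(top-left cell >= a) with fixed margins."""
    n, row1, col1 = a + b + c + d, a + b, a + c
    log_denom = log_comb(n, col1)
    return sum(
        exp(log_comb(row1, k) + log_comb(n - row1, col1 - k) - log_denom)
        for k in range(a, min(row1, col1) + 1)
    )


# Observed frequencies from Table 5 (rows: think / other verbs;
# columns: into-causative / persuade-to-V)
a, b, c, d = 111, 11, 1179, 3366
exp_table = expected(a, b, c, d)
p_think = fisher_upper_tail(a, b, c, d)

print([[round(x, 2) for x in row] for row in exp_table])
print(f"one-tailed p = {p_think:.2e}")
```

Since the observed frequency of think in the into-causative (111) is far above the expected one, the direction of association is towards the into-causative; repeating the same steps for every verb and sorting by p-value would give a ranking of the kind described above.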
Table 6a shows the distinctive collexemes of the into-causative compared to persuade-to-V, while Table 6b shows the distinctive collexemes of the pattern persuade-to-V compared to the into-causative.

Table 6a. Top 20 significantly distinctive collexemes in the into-causative as compared to the pattern persuade-to-V (BNC, original data)

      Collexeme   V + into     persuade + to   V + into     persuade + to   P-value
                  (Observed)   (Observed)      (Expected)   (Expected)
 1.   think       111          11              33.72        88.28           1.58E-50
 2.   believe     103          11              31.51        82.49           3.46E-46
 3.   make        57           47              28.75        75.25           3.36E-09
 4.   say         17           9               7.19         18.81           6.36E-05
 5.   work        19           17              9.95         26.05           1.20E-03
 6.   hide        5            0               1.38         3.62            1.60E-03
 7.   lose        5            0               1.38         3.62            1.60E-03
 8.   expect      4            0               1.11         2.89            5.82E-03
 9.   express     4            0               1.11         2.89            5.82E-03
10.   formulate   4            0               1.11         2.89            5.82E-03
11.   give        51           86              37.87        99.13           8.35E-03
12.   see         14           14              7.74         20.26           9.70E-03
13.   run         11           10              5.80         15.20           1.40E-02
14.   marry       10           9               5.25         13.75           1.83E-02
15.   fight       5            2               1.93         5.07            2.00E-02
16.   learn       3            0               0.83         2.17            2.11E-02
17.   accept      49           89              38.14        99.86           2.46E-02
18.   throw       7            5               3.32         8.68            2.47E-02
19.   reveal      6            4               2.76         7.24            3.20E-02
20.   use         19           28              12.99        34.01           3.89E-02

Table 6b. Top 20 significantly distinctive collexemes in the pattern persuade-to-V as compared to the into-causative (BNC, original data)

      Collexeme   V + into     persuade + to   V + into     persuade + to   P-value
                  (Observed)   (Observed)      (Expected)   (Expected)
 1.   do          0            97              26.81        70.19           1.59E-14
 2.   come        6            87              25.71        67.29           1.98E-07
 3.   change      7            78              23.49        61.51           7.41E-06
 4.   be          0            36              9.95         26.05           8.30E-06
 5.   return      1            38              10.78        28.22           5.01E-05
 6.   have        0            26              7.19         18.81           2.16E-04
 7.   part        0            24              6.63         17.37           4.15E-04
 8.   join        10           73              22.94        60.06           4.81E-04
 9.   stop        1            29              8.29         21.71           7.38E-04
10.   stay        10           69              21.84        57.16           1.08E-03
11.   vote        1            25              7.19         18.81           2.38E-03
12.   go          19           95              31.51        82.49           3.99E-03
13.   abandon     2            28              8.29         21.71           4.53E-03
14.   visit       0            15              4.15         10.85           7.74E-03
15.   stand       1            20              5.80         15.20           9.99E-03
16.   let         12           61              20.18        52.82           1.78E-02
17.   allow       8            46              14.93        39.07           2.00E-02
18.   drop        1            16              4.70         12.30           3.04E-02
19.   withdraw    2            20              6.08         15.92           3.46E-02
20.   move        4            28              8.85         23.15           3.55E-02
Two semantic patterns jump out immediately when comparing these lists. First, among the distinctive collexemes of the into-causative there is a striking predominance of perception-, cognition- and utterance-verbs (think, believe, say, expect, express, formulate, see, learn, accept, reveal), which account for half of the top 20 distinctive collexemes; their specific relevance to the into-causative is further corroborated by the fact that not a single such verb occurs among the top 20 distinctive collexemes of persuade-to-V. In contrast, activity verbs like work, fight, throw and use on the one hand and return, visit, and move on the other occur in roughly equal numbers on both lists. Second, among the distinctive collexemes of the persuade-to-V pattern, there are a number of verbs that seem to refer to stopping some activity that is already in progress (change, stop, abandon, drop and withdraw), while no such verbs are found among the distinctive collexemes of the into-causative. Also, the verb be is distinctive for persuade-to-V, which is mostly due to its occurrence with passives. From the second pattern, we may conclude relatively straightforwardly that the into-causative is specialized towards encoding the causation of actions rather than the causation of the cessation of actions; we might represent this by specifying its meaning as CAUSE-DECIDE-ACT, rather than just CAUSE-DECIDE:

(6) The into-causative (Version 3)

    Sem                  CAUSE-DECIDE-ACT   ‹ causer   causee   result         ›
    R: instance, means   R                  ‹                                  ›
    Syn                  V                    SUBJ     OBJ      [into Vgerund]
The first pattern seems to contradict this to some extent, however; it suggests that the into-causative is specialized to a certain degree (but not exclusively) towards the causation of perception-cognition-utterance acts and what these seem to have in common is that they are predominantly mental rather than physical acts. Since one does not typically decide to perceive something or experience a mental act, it calls into question the idea that CAUSE-DECIDE is a useful characterization of the into-causative’s semantics after all. Instead, it seems that the into-causative focuses on the fact that there is some initial resistance (conscious or subconscious) to the result and that this resistance is either overcome or circumvented by the causing event. Perhaps something like CAUSE-STOP-RESIST might capture the meaning more realistically; this would also account for the fact that the into-causative is not usually used in contexts where the result is the cessation of some activity:

(7) The into-causative (Version 4)

    Sem                  CAUSE-STOP-RESIST   ‹ causer   causee   result         ›
    R: instance, means   R                   ‹                                  ›
    Syn                  V                     SUBJ     OBJ      [into Vgerund]
There is also a connection between this meaning and the fact that the construction introduces the resulting event by the preposition into, which signals the transition from one location to another across a boundary, which may metaphorically signal the resistance that must be overcome. What is clear in any case is that the into-causative has a fairly complex set of sub-meanings more specific than any general meaning we might be able to posit. Let us investigate these in more detail.

5. Co-varying collexeme analysis
One property of the into-causative that distinguishes it from other causative constructions of English (and from argument-structure constructions in general) is that it provides two slots into which verbs are seemingly freely inserted: that for the causing event and that for the resulting event. Clearly, it is of interest to a semantic characterization of the construction to determine if and to what extent dependencies between the two slots can be observed. Whenever there is a potentially interesting dependency between two slots of a single construction, a variant of collostructional analysis called co-varying collexeme analysis is useful. Instead of identifying associations between words and constructions, like the other two variants introduced above, this method identifies associations between two words within a construction; it is thus similar to traditional collocate analysis, except insofar as it takes into account the syntactic context of the collocates at a very specific level (cf. Gries and Stefanowitsch 2004b, Stefanowitsch and Gries 2005). The statistical association between two lexical items la and lb occurring in slots A and B of a construction C (for example, the items fool and think in the into-causative) can be determined on the basis of the occurrence frequencies shown in Table 7: (i) the frequency of the co-occurrence of la in slot A with lb in slot B of the construction C; (ii) the frequency of the co-occurrence of la in slot A with words other than lb in slot B; (iii) the frequency of the co-occurrence of words other than la in slot A with lb in slot B; and (iv) the frequency of the co-occurrence of words other than la in slot A with words other than lb in slot B of the construction.

Table 7. Co-varying collexeme analysis (schematic)

                                          Word la in Slot A of Construction C   Other Words in Slot A of Construction C   Total
Word lb in Slot B of Construction C       Frequency of A(la) & B(lb) in C       Frequency of A(¬la) & B(lb) in C          Total frequency of B(lb) in C
Other Words in Slot B of Construction C   Frequency of A(la) & B(¬lb) in C      Frequency of A(¬la) & B(¬lb) in C         Total frequency of B(¬lb) in C
Total                                     Total frequency of A(la) in C         Total frequency of A(¬la) in C            Total frequency of C
On the basis of such a table, the direction of association (i.e., whether la and lb co-occur more or less frequently than expected in C) can be determined on the basis of calculating the expected frequencies using standard procedures, and the statistical significance of this association can be determined using any statistical test for contingency tables, such as the Fisher-Yates exact test. Table 8 exemplifies this for the verb fool in the cause slot and think in the result-slot of the into-causative (with expected frequencies given in parentheses).
Table 8. Co-varying collexeme analysis: fool and think in the into-causative

                                 fool      Other verbs in the CAUSE slot   Total
think                            46 (7)    101 (140)                       147
Other verbs in the RESULT slot   31 (70)   1408 (1369)                     1439
Total                            77        1509                            1586
Applying this to the cause- and result-verbs in the into-causative yields the list in Table 9 (I show only the top 19 co-varying collexeme pairs here, since these are followed by another 19 that are all at the same level of significance so that I cannot meaningfully select from them, while showing all of them would make the table unwieldy).

Table 9. Top 19 significantly co-varying collexemes in the into-causative (BNC, data from Gries and Stefanowitsch 2004b).

      Collexeme A   Collexeme B   Freq. A   Freq. B   Freq. AB (Obs)   Freq. AB (Exp.)   P-value
 1.   fool          think         77        147       46               7.1368            8.71E-31
 2.   mislead       think         57        147       26               5.2831            1.76E-13
 3.   mislead       believe       57        104       18               3.7377            4.42E-09
 4.   deceive       think         48        147       16               4.4489            2.23E-06
 5.   trick         part          92        10        6                0.5801            5.65E-06
 6.   dragoon       serve         8         2         2                0.0101            2.23E-05
 7.   encourage     farm          8         2         2                0.0101            2.23E-05
 8.   aggravate     produce       2         12        2                0.0151            5.25E-05
 9.   panic         seize         15        2         2                0.0189            8.35E-05
10.   seduce        misbehave     17        2         2                0.0214            1.08E-04
11.   delude        believe       19        104       7                1.2459            1.12E-04
12.   torture       reveal        6         6         2                0.0227            1.78E-04
13.   force         hide          101       6         4                0.3821            2.11E-04
14.   shock         face          16        3         2                0.0303            2.85E-04
15.   stimulate     develop       8         6         2                0.0303            3.31E-04
16.   blackmail     marry         25        10        3                0.1576            3.86E-04
17.   drive         hide          9         6         2                0.0340            4.25E-04
18.   con           post          34        2         2                0.0429            4.46E-04
19.   intimidate    vote          8         7         2                0.0353            4.62E-04
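The expected co-occurrence frequencies in Table 9 follow directly from the two slot frequencies and the total number of into-causative tokens (1586, the grand total of Table 8). The following Python sketch illustrates this for three of the pairs; the function and variable names are hypothetical, and the code only reproduces the arithmetic of the contingency table, not the significance test.

```python
TOTAL = 1586  # into-causative tokens in the sample (grand total of Table 8)

# (cause verb, result verb, Freq. A, Freq. B, observed Freq. AB), from Table 9
ROWS = [
    ("fool", "think", 77, 147, 46),
    ("mislead", "think", 57, 147, 26),
    ("trick", "part", 92, 10, 6),
]


def cells(freq_a: int, freq_b: int, freq_ab: int, total: int = TOTAL):
    """Four cells of the 2x2 table, read row by row as in Table 8."""
    return (
        freq_ab,                            # la in slot A with lb in slot B
        freq_b - freq_ab,                   # other cause verbs with lb
        freq_a - freq_ab,                   # la with other result verbs
        total - freq_a - freq_b + freq_ab,  # neither la nor lb
    )


def expected_ab(freq_a: int, freq_b: int, total: int = TOTAL) -> float:
    """Expected co-occurrence frequency: Freq. A * Freq. B / total."""
    return freq_a * freq_b / total


results = []
for cause, result, fa, fb, fab in ROWS:
    exp_ab = expected_ab(fa, fb)
    direction = "more" if fab > exp_ab else "less"
    results.append((cause, result, cells(fa, fb, fab), exp_ab, direction))
    print(f"{cause}/{result}: observed {fab}, expected {exp_ab:.4f} "
          f"({direction} frequent than expected)")
```

For fool and think, the four cells returned by `cells` correspond to the observed frequencies in Table 8, which can then be passed to any exact test for contingency tables.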
What is immediately clear is that there is no random relation between types of causes and types of results in the into-causative. TRICK events tend to cause mental states, while physical actions tend to be caused by FORCE, SCARE or PERSUASION events. The latter three can easily be subsumed under the meaning CAUSE-DECIDE-ACT proposed in Section 4 above, the former cannot. Indeed, mental-state results proved to be especially typical of the into-causative and at the same time difficult to capture. The results shown in Table 9 now reveal that these are systematically correlated with a particular, fairly narrow class of causing-event verbs involving DECEPTION, which may allow us to solve the problem discussed in Section 4 by positing two highly entrenched subtypes of the into-causative as represented in (7) above. One is the one shown in (6) above, the other is shown in (8):

(8) The into-causative (DECEPTION subtype)

    Sem                  DECEIVE-STOP-RESIST   ‹ causer   causee   result         ›
    R: instance, means   R                     ‹                                  ›
    Syn                  V                       SUBJ     OBJ      [into Vgerund]
Note, in conclusion, that in earlier work (Gries and Stefanowitsch 2004b, Stefanowitsch and Gries 2005), we attributed the kinds of correlations between cause and result verbs observed in Table 9 to general semantic frames that contain our culturally defined knowledge about what type of causing event is likely to cause what type of resulting event. However, while this is certainly plausible, it does not necessarily remove the need to specify at least two subtypes of the into-causative, since not all causative constructions of English display such a clear tendency towards these two subtypes of causation (nor, indeed, do they all allow encoding them).

6. Negative collexeme analysis
In Section 2 above, I only discussed the significantly attracted collexemes of the into-causative, i.e. those that occur in the construction significantly more frequently than expected. As mentioned, however, collexemes may also occur significantly less frequently than expected in a construction, in which case we refer to them as repelled collexemes. The into-causative only has four such repelled collexemes, shown in Table 10.
Table 10. Significantly repelled collexemes in the V slot of the into-causative (data from Stefanowitsch and Gries 2003, but not published there)

     Collexeme   Freq. Corp.   Freq. CX (Obs)   Freq. CX (Exp.)   P-value
1.   will        250301        2                38.87             6.73E-15
2.   make        213469        1                33.15             9.80E-14
3.   get         220942        2                34.31             5.59E-13
4.   move        38139         1                5.92              1.84E-02
These items are repelled for different reasons, not all of which are equally relevant to a constructional analysis. Clearly, will is repelled not because there is something odd or even deviant about the combination will sb. into doing sth., but because the word form will also represents the very frequent modal will, which distorts the expected frequencies drastically. This could, in principle, be avoided by using a corpus with reliable POS tagging. Move and get are also very frequent verbs, but there is no immediate reason why they should be used less frequently than expected in the into-causative, especially as they fit the metaphorical logic of into. It is possible that speakers see them as too generic.

Of the repelled collexemes, make is perhaps one of the most interesting. At least under our initial characterization of the into-causative’s meaning as CAUSE, the fact that make, as one of the major causative verbs of English, is strongly repelled from the construction would be very surprising, as it would represent an instance of the constructional meaning. In fact, it is doubtful that make can occur in the into-causative at all. The one example in the BNC where it seems to do so is the following:

(9) B: and then they er invite you just, some people
    A: Yes, it's good then?
    B: makes the people into going
    C: [unclear] (BNC: KD3)
While it is possible that this is an instance of the into-causative, the fragmented nature of this stretch of discourse makes it impossible to be certain. This raises an interesting question: What if a verb does not occur in a construction at all? Can we say something insightful about its absence? Common wisdom says that if something does not occur in a corpus, this tells us nothing about whether it could occur. However, notice that an occurrence of zero is not fundamentally different from an occurrence of one. Thus, there is nothing to stop us from constructing a contingency table exactly like that used for other occurrence frequencies in a simple collexeme analysis. In other words, it is possible to compare an observed frequency of zero with the expected frequency (cf. Stefanowitsch 2006b). In the context of the present analysis, it may be interesting to apply this method to the second major causative verb in English, cause. As mentioned in Section 2 above, cause does not occur with the into-causative at all in the BNC. But since it is much less frequent overall than, for example, the verbs in Table 10, this absence may be purely accidental. Table 11 shows the frequencies needed to test this (with the expected frequencies shown in parentheses).

Table 11. Zero collexeme analysis (cause and the into-causative)

              into-causative   Other ASCs            Total
cause         0 (3)            20091 (20088)         20091
other verbs   1586 (1583)      10184623 (10184626)   10186209
Total         1586             10204714              10206300
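The test on Table 11 can be sketched as follows. With an observed frequency of zero, the lower tail of an exact test consists of a single outcome, so the p-value is simply the probability that none of the 1586 into-causative tokens contains cause, given the marginal totals. The Python sketch below assumes this one-tailed reading; the function name is hypothetical and the calculation is an illustration, not the author's actual script.

```python
from math import exp, log1p


def p_zero_occurrences(word_freq: int, cx_freq: int, total: int) -> float:
    """Hypergeometric probability that a word with corpus frequency word_freq
    occurs in none of the cx_freq tokens of a construction (corpus size total).

    Computed as the product over i of (total - word_freq - i) / (total - i),
    summed in log space for numerical stability.
    """
    log_p = sum(log1p(-word_freq / (total - i)) for i in range(cx_freq))
    return exp(log_p)


# Marginal totals from Table 11
cause_freq, into_freq, corpus_total = 20091, 1586, 10206300

expected_cause = cause_freq * into_freq / corpus_total
p_cause = p_zero_occurrences(cause_freq, into_freq, corpus_total)

print(f"expected frequency of cause: {expected_cause:.2f}")
print(f"P(observed = 0): {p_cause:.5f}")
```

The same function could be applied to any verb that is absent from a construction, as long as its overall corpus frequency and the relevant totals are known.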
As we can see, the comparatively low frequency of the verb cause would lead us to expect it to occur in the into-causative only three times by chance anyway. However, the difference between zero and three is, in this case, statistically significant (p = 0.04392), which means that cause joins the list of significantly repelled collexemes. This cannot, of course, be taken as evidence that it cannot occur in the into-causative, or that it would not do so in a larger corpus. However, it does tell us that its absence is unlikely to be due to chance and the small sample size.

7. Summary and outlook
With this case study, I have attempted to show how different collostructional methods may be combined in the analysis of an individual construction, the into-causative. I started from a rough informal characterization of the construction, which I successively refined not, as is usually done, on the basis of traditional linguistic argumentation using grammaticality judgments, but on the basis of quantitative corpus-linguistic evidence. Simple collexeme analysis revealed that, while the meaning of the construction must be stated in very general terms, it cannot simply be CAUSE, as none of the strongly attracted collexemes in the verb slot instantiate this meaning. On the contrary, both major causative verbs, make and cause, are
repelled by the construction, make occurring only once and cause occurring not at all. As the statistical evaluation of these repelled collexemes later showed, their low frequency and absence, respectively, are unlikely to be due to chance. The attracted collexemes were shown to fall into several classes, most importantly, verbs of trickery and verbs of verbal or physical coercion, which already imply, to a certain extent, a cause-effect relation. However, the verb talk, which does not imply such a relation, is also found among the most strongly attracted collexemes, which I took to suggest that the meaning of the into-causative may be something like CAUSE-DECIDE. Note, again, this characterization (like other semantic considerations) does not fall out directly from the quantitative corpus-linguistic evidence, but is the result of interpreting it within a particular model. This is not a drawback for collostructional analysis, since interpretation of the evidence will be part of any linguistic method; the advantage of collostructional analysis and other types of quantitative corpus-linguistic data is that what is being interpreted is not itself a product of interpretation but evidence derived from authentic language use by replicable and fairly objective means. Note also, that in addition to serving as a basis for explicating the initial informal characterization of the into-causative, simple collexeme analysis also showed it to be a productive construction, an insight that would be difficult to demonstrate by introspection. In order to further specify the meaning of the construction, I then compared the into-causative to the closest lexical paraphrase of my suggested characterization, the pattern [persuade to V]. 
The result suggested that, first, the into-causative is specialized towards expressing the causation of action while [persuade to V] is also used to express the causation of the cessation of action, and, second, that it is frequently used in expressing the causation of mental states. This apparent contradiction was further investigated using co-varying collexeme analysis, which suggested the existence of two strongly entrenched subtypes of the into-causative, one for expressing the causation of physical actions by persuasion, one for expressing the causation of mental states by deception. Clearly, the characterization of the into-causative could be further refined on the basis of the data presented or on the basis of additional quantitative corpus-linguistic evidence; the analysis presented here was intended, mainly, to show the potential of collostructional analysis for construction grammar and related theoretical models.
If the type of analysis presented here seems time-consuming and labor-intensive, this impression is correct: extracting and processing the data presented here took more than twenty hours (for a very experienced corpus linguist). If this expenditure of time and energy seems unwarranted in the analysis of a single, rather marginal construction of English, however, this impression is due only to the fact that linguists have assumed (wrongly) that quickly and effortlessly attained grammaticality judgments are to be regarded as some kind of norm. Clearly, twenty hours is not a lot compared to the time spent on data collection and analysis in other empirical sciences, and quantitative corpus linguists routinely spend comparable amounts of time on their data analysis. If corpus methods were to be more widely adopted by theoretical linguists, the additional time spent on data analysis might well result in fewer papers being published, but as these will be better grounded in linguistic reality, this may present a desirable tradeoff between quantity and quality.

Of course, not all collostructional methods have to be applied in the analysis of a given linguistic phenomenon, and indeed, this is the first analysis that I am aware of that attempts to do so in a single paper. As the substantial body of literature shows (see Stefanowitsch 2012 for a fairly complete survey), the methods can also be applied individually in answering a broad range of questions about linguistic constructions and the structure of language more generally.

References

Church, Kenneth, and Patrick Hanks
1991 Word association norms, mutual information and lexicography. Computational Linguistics 16 (1): 22–29.

Goldberg, Adele E.
1995 Constructions: A Construction Grammar Approach to Argument Structure. Chicago: University of Chicago Press.

Gries, Stefan Thomas, and Anatol Stefanowitsch
2004a Extending collostructional analysis: A corpus-based perspective on ‘alternations’. International Journal of Corpus Linguistics 9 (1): 97–129.

Gries, Stefan Thomas, and Anatol Stefanowitsch
2004b Co-varying collexemes in the into-causative. In Language, Culture, and Mind, Michel Achard and Suzanne Kemmer (eds.), 225–236. Stanford: CSLI.

Kay, Paul
2012 The limits of construction grammar. In The Oxford Handbook of Construction Grammar, Graeme Trousdale and Thomas Hoffmann (eds.), 32–48. Oxford/New York: Oxford University Press.

Stefanowitsch, Anatol
2001 Constructing Causation: A Construction Grammar Approach to Analytic Causatives. Dissertation, Houston, TX: Rice University.

Stefanowitsch, Anatol
2006a Konstruktionsgrammatik und Korpuslinguistik. In Konstruktionsgrammatik: Von der Anwendung zur Theorie, Kerstin Fischer and Anatol Stefanowitsch (eds.), 151–176. Tübingen: Stauffenburg.

Stefanowitsch, Anatol
2006b Negative evidence and the raw frequency fallacy. Corpus Linguistics and Linguistic Theory 2 (1): 61–77.

Stefanowitsch, Anatol
2009 Bedeutung und Gebrauch in der Konstruktionsgrammatik: Wie kompositionell sind modale Infinitive im Deutschen? Zeitschrift für Germanistische Linguistik 37 (3): 565–592.

Stefanowitsch, Anatol
2011 Cognitive linguistics meets the corpus. In Cognitive Linguistics: Convergence and Expansion, Mario Brdar, Stefan Thomas Gries, and Milena Zic Fuchs (eds.), 257–289. Amsterdam/Philadelphia: John Benjamins.

Stefanowitsch, Anatol
2012 Collostructional analysis. In The Oxford Handbook of Construction Grammar, Graeme Trousdale and Thomas Hoffmann (eds.), 290–306. Oxford/New York: Oxford University Press.

Stefanowitsch, Anatol, and Stefan Thomas Gries
2003 Collostructions: Investigating the interaction of words and constructions. International Journal of Corpus Linguistics 8 (2): 209–243.

Stefanowitsch, Anatol, and Stefan Thomas Gries
2005 Covarying collexemes. Corpus Linguistics and Linguistic Theory 1 (1): 1–46.

Stefanowitsch, Anatol, and Stefan Thomas Gries
2009 Corpora and grammar. In Corpus Linguistics: An International Handbook (Handbücher zur Sprach- und Kommunikationswissenschaft, 29), Anke Lüdeling and Merja Kytö (eds.), 933–952. Berlin/New York: Walter de Gruyter.

BNC
British National Corpus. Distributed by Oxford University Computing Services on behalf of the BNC Consortium. http://www.natcorp.ox.ac.uk/.
Lexico-grammatical patterns, pragmatic associations and discourse frequency 1
Hans-Jörg Schmid
The virtue of a thing is related to its proper function. (Aristotle, Ethics, Book 6, ii)

1. Introduction
In a paper that was published some years ago (Schmid 2003), I made the somewhat unusual claim that the sequence I love you can be considered a collocation. An anonymous reviewer of this paper rightly pointed out that I love you is a sentence consisting of a subject, a verb and an object, and concluded – to my mind mistakenly – that it could hardly be a collocation at the same time. His or her conclusion was apparently based on the premise that the constituents of sentences are connected by means of syntactic rules and relations, while the elements making up a collocation are connected by virtue of lexical attractions or associations. I responded to the reviewer’s objection by declaring that sequences of words can be sentences and collocations at the same time. Lexical associations, I argued, could supersede syntactic relations if the users of the language process sequences of words as more or less prefabricated lexical chunks. The paper was eventually published, but I do not think that the reviewer came round to sharing my view. Is it possible to prove the contention that I love you is a quasi-fixed lexico-grammatical unit in addition to being a sentence? One argument supporting this claim comes from the observation of frequencies in corpora. If one retrieves from the British National Corpus all sequences of a personal pronoun followed by any form of the verb love and again followed by a
personal pronoun,2 it turns out that I love you is by far the most frequently found manifestation of this schematic pattern, accounting for more than a quarter of all cases found. Boasting 666 hits, I love you is almost four times as frequent as the runner-up, I love it (175 hits), which in turn is followed in the frequency rank list by the wonderful sequence of she loved him (132 hits), he loved her (125) and you love me (113). Even when we take into account that I is considerably more frequent than he and she, and that you is more frequent than its competitors in the object slot, i.e. him and her, I and you are still highly significantly more frequent as subjects and objects respectively of love than the other personal pronouns or indeed any other possible syntactic realization.3
A second, presumably more compelling argument for the lexical-association hypothesis for I love you is of a different nature: much more than she loved him or he loved her, I love you immediately calls up a whole world of associations in your mind when you read or hear this sequence. These associations are related to the situations in which I love you is commonly used: that it is what lovers usually say when they want to tell their partner that they love them; that it is the conventional way of getting this communicative task done, and that other ways of expressing deep affection like I like you or I’m fond of you will fall short of achieving the intended communicative effect; that this sequence of words is often uttered in particularly romantic moments or moments where there is a special need for such an assurance, say in times of a relationship crisis; and presumably that this cliché-like phrase is often heard in more or less melodramatic movies and commercial pop songs. It is a fairly safe guess that none of these bits of knowledge will pop up in your mind when you read or hear she loved him or he loved her, except maybe that they are typically found in romantic fiction.
1. I would like to thank Karin Aijmer, Ulrich Detges, Susanne Handl, Thomas Herbst, Sylvia Jaki, Peter-Arnold Mumm, Ulrich Schweier and Alison Wray for their highly appreciated comments on earlier versions of this paper.
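The frequency argument can be illustrated with a small computation. The following is a minimal sketch, assuming a Dunning-style log-likelihood (G²) statistic over a 2×2 contingency table; it is not the procedure that produced the scores reported in this paper. The function name and the toy table are hypothetical; only the hit counts 666 and 175 are taken from the text.

```python
import math

def log_likelihood(a, b, c, d):
    """G2 association score for a 2x2 contingency table.

                      target word   other words
    in the slot            a             b
    elsewhere              c             d
    """
    n = a + b + c + d
    observed = [a, b, c, d]
    # Expected cell counts if slot and word were independent.
    expected = [
        (a + b) * (a + c) / n,
        (a + b) * (b + d) / n,
        (c + d) * (a + c) / n,
        (c + d) * (b + d) / n,
    ]
    # Cells with an observed count of zero contribute nothing to the sum.
    return 2 * sum(o * math.log(o / e) for o, e in zip(observed, expected) if o > 0)

# Hit counts reported in the text for the two most frequent sequences.
hits_i_love_you = 666
hits_i_love_it = 175
ratio = hits_i_love_you / hits_i_love_it  # "almost four times as frequent"

# Hypothetical toy table, for illustration only (not corpus data).
toy_score = log_likelihood(30, 10, 10, 30)
```

A table in which word and slot are independent yields a score of zero, and the score grows as the observed counts depart from the expected ones. This is why such a measure can confirm an attraction even after the overall frequency of the pronouns has been taken into account.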
In short, what sets I love you apart from she loved him and other lexically less strongly associated minimal sentences like I like soccer or Frida likes chicken is the fact that I love you activates a number of pragmatic associations, i.e. associations to typical users (lovers, fiction writers, figures in movies), typical situations (romantic moments), typical communicative intentions (assurance of deep affection). This observation raises the question, to be investigated in this paper, whether lexico-grammatical patterns in general are supported or even motivated by pragmatic aspects. More specifically, I want to discuss in which way pragmatic associations have an effect on the freezing and chunking of various types of lexico-grammatical patterns. In order to do so, lexico-grammatical patterns and pragmatic associations must first be salvaged from their less-than-splendid isolation outside the linguistic system proper and integrated in a model of linguistic knowledge (Section 2). Following a rough differentiation of types of lexico-grammatical patterns based on common criteria such as transparency, variability and irregularity (Section 3), the ways in which different types of lexico-grammatical patterns can be said to benefit from pragmatic associations will be investigated (Section 4). This discussion will specifically highlight the fact that collocations and lexical bundles differ with regard to the support they receive from pragmatic associations. Section 5 will relate the insights gained in Section 4 to the widespread idea that the chunking of lexico-grammatical sequences is related to their discourse frequencies and explain in which way pragmatic aspects motivate frequencies of occurrence.
2. The query used for retrieving this material from BNCweb was “_PNP {love/V} _PNP”.
3. The log-likelihood score for I in subject position preceding LOVE is 13,299, compared to 3,393 and 3,248 for she and he respectively; the score for you in the object slot immediately following I love is 4,106, with him and it trailing behind with the scores 607 and 439 respectively.
2. Theoretical background
Idioms, routine formulae, collocations and other types of multi-word expressions and lexical-association phenomena, on the one hand, and all kinds of phenomena subsumed under the label pragmatics, on the other, traditionally share the same fate: they are banned from models of grammar proper because they are irregular and unpredictable and are therefore said to defy large-scale generalizations. Nevertheless, hardly anyone will doubt that the two phenomena have an important role to play in how languages work and are used as communicative tools. An adequate theory of language should therefore strive to accommodate and integrate formulaic sequences and pragmatic aspects of language. In this section, a framework will be outlined which tries to do exactly this.4
4. The framework is still under construction and will be detailed elsewhere.
2.1 The general framework: Entrenchment and conventionalization
The framework starts out from the assumption that what has traditionally been referred to as “the Language System” is not a stable entity, as is suggested by the established use of this definite noun phrase. Instead, the “system” is considered to emerge from and be continuously refreshed by the interplay of cognitive processes taking place in individual minds, on the one hand, and sociopragmatic processes taking place in societies, on the other. This dynamic model is inspired by and compatible with a number of recent approaches labelled by terms such as “usage-based”, “emergentist”, “socio-cognitive”, “complex-adaptive” and others.5 What distinguishes the present framework from these approaches is its explicit aim to reduce the complexity of the adaptive and dynamic system that is language to a limited number of cognitive and sociopragmatic processes and their interaction. The cognitive processes postulated in the model are subsumed under the term entrenchment, and the sociopragmatic ones under the label conventionalization. The framework is therefore referred to as the entrenchment-and-conventionalization model, or EC-model for short. Following dynamic conceptions of the notion of convention (cf. Croft 2000: 98–99, Eckert 2000: 45, Sinha and Rodriguez 2008), conventionalization is understood as the continuous mutual coordination and matching of communicative knowledge and practices, subject to the exigencies of the entrenchment processes taking place in individual minds. The term entrenchment refers to the on-going re-organization of individual communicative knowledge, subject to the exigencies of the social environment (cf.
Langacker 1987: 59, 2008: 16–17, Evans and Green 2006: 114, Schmid 2007, Blumenthal-Dramé 2012).6 Neither entrenchment nor conventionalization ever comes to a halt;7 the two terms denote on-going processes rather than resultant states.
In the EC-model, the “communicative knowledge” which takes centre-stage in the definitions of both entrenchment and conventionalization is available to individual speakers in the form of only one type of cognitive process: association. All types of “linguistic elements” and “linguistic structures” ultimately rely on this general cognitive process, which is of course not specific to language but nevertheless manifested in language-specific ways (see Section 2.2). On a general level of description, the process of association can simply be defined as creating “a link between two or more cognitive representations” (Smith and Mackie 2000: 37). The EC-model adopts the general idea of so-called “spreading activation” models (cf. e.g. Collins and Loftus 1975, Dell 1986, Aitchison 2003: 84–101) that linguistic knowledge is available as a network of more or less routinized associations of various types. Within this network, activation spreads whenever an auditory or visual linguistic stimulus (i.e. the formal side of a “sign”) is presented to a hearer or reader or activated during language production by a speaker or writer. Activation spreads from associations to other associations that can be related in the network in a variety of ways (see again Section 2.2 for more details).
5. These approaches include: emergentist and usage-based models of grammar (e.g. Hopper 1987, MacWhinney 1999, Hawkins 2004), language acquisition (e.g. Tomasello 2003, Goldberg 2006, 2009, MacWhinney 1998, Behrens 2009) and language change (Bybee 1985, 2006, 2007, 2010, Bybee and Hopper 2001, Haspelmath 1999, 2002, Croft 2000, 2009, Traugott and Dasher 2004); cognitive-linguistic usage-based models, including various types of construction grammars (e.g. Langacker 1988, 2008, Barlow and Kemmer 2000, Fillmore, Kay, and O’Connor 1988, Goldberg 1995, 2006); exemplar-based approaches (e.g. Bybee 2001, Pierrehumbert 2001, 2006) and complex-adaptive approaches (e.g. The Five Graces Group 2009, Blythe and Croft 2009); socio-cognitive approaches (e.g. Geeraerts 2003, Kristiansen 2008, Croft 2009, Harder 2010, Geeraerts, Kristiansen, and Peirsman 2010).
6. Note that this definition is at the same time more general and more specific than other frequently quoted conceptions of the notion of entrenchment, among them that proposed by Langacker (1987: 59). On the one hand, it is more general because it explicitly subsumes all kinds of re-organisation processes, not only those that lead to the formation of symbolic units. And on the other hand, it is more specific in the respect that it explicitly includes the relation to the social environment which influences the internal cognitive processes.
7. In view of the so-called critical-period hypothesis, which postulates that the window for acquiring a language is only open for a limited number of years during childhood (Lenneberg 1967), the idea of a lifelong reorganization of linguistic knowledge taking place in individual speakers’ minds is presumably highly controversial. It is not unlikely, however, that the nature of the reorganization processes indeed remains the same from early language acquisition throughout speakers’ lives (cf. Tomasello 2000: 237), while the amount of reorganization taking place becomes smaller due to increased routinization and resulting fossilization of associations (MacWhinney 2012).
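The idea that activation spreads through a network of more or less routinized associations can be made concrete with a toy example. The following is a minimal sketch, assuming a simple weighted graph with a fixed decay per step; the function, the node labels and all link strengths are hypothetical illustrations and do not come from the spreading-activation literature cited above.

```python
def spread_activation(links, source, steps=2, decay=0.5):
    """Spread activation from `source` through weighted association links.

    `links` maps a node to {neighbour: strength}; each step passes on
    the current activation, scaled by link strength and by `decay`.
    """
    activation = {source: 1.0}
    frontier = {source: 1.0}
    for _ in range(steps):
        next_frontier = {}
        for node, level in frontier.items():
            for neighbour, strength in links.get(node, {}).items():
                passed = level * strength * decay
                next_frontier[neighbour] = next_frontier.get(neighbour, 0.0) + passed
        # Add the newly received activation to the running totals.
        for node, level in next_frontier.items():
            activation[node] = activation.get(node, 0.0) + level
        frontier = next_frontier
    return activation

# Hypothetical network: strengths stand in for degrees of routinization.
toy_links = {
    "I love": {"you": 0.8, "it": 0.2},
    "you": {"romantic situation": 0.9},
}
toy_activation = spread_activation(toy_links, "I love")
```

In this toy network the strongly linked continuation you receives more activation than it, and activation reaches the situational node only indirectly, via you.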
In line with other usage-based and cognitive approaches (e.g. Langacker 1987: 100, 2008: 16, Haspelmath 1999: 1058, Smith and Mackie 2000: 37, Bybee 2006), the EC-model assumes that the strength of associations is fostered by routinization, which is in turn facilitated by repeated processing events. Linguistic elements or structures that are uttered, written, heard and read more frequently than competing structures are more likely to be processed faster and with less cognitive effort and control than rare ones. Frequent linguistic stimuli are thus more likely to produce routinized associations than infrequent ones (see Section 5 for more details). Concurrent with the routinization resulting from the repeated processing of similar or identical associations or association patterns, speakers begin to abstract commonalities and form new “second-order” associations. This process is referred to as schema-formation or schematization in the EC-model. On the level of words, for example, schema-formation is required to build up “representations” of lexemes qua abstract units. Speakers do not just routinize form-meaning associations of the individual word-forms go, goes, going, went and gone, but also routinize associations to the schema GO, which abstracts from the different forms and meanings. Repeated encounters of such sequences as that’s right, that’s good, that’s great, that’s nice, that’s horrible and that’s awful will presumably not only result in a routinization of associations connecting each of these recurrent expressions to certain meanings, but also in the formation of a more general meaning-carrying schema THAT’S + EVALUATIVE ADJ. Associations to such schemas can also become more or less routinized.
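The interplay of routinization and schema-formation can be sketched in a few lines of code. This is a deliberately crude illustration under strong simplifying assumptions: routinization is reduced to counting processing events, and a schema is posited whenever a fixed first word recurs with several different second words. The function names and the threshold `min_fillers` are hypothetical.

```python
from collections import Counter, defaultdict

def routinize(utterances):
    """Count processing events per sequence as a crude proxy for strength."""
    return Counter(utterances)

def form_schemas(utterances, min_fillers=3):
    """Abstract schemas with one fixed and one variable slot.

    Two-word sequences are grouped by their first word; a first word that
    occurs with at least `min_fillers` distinct second words is treated
    as the fixed part of a schema, and the second words as slot fillers.
    """
    fillers = defaultdict(set)
    for utterance in utterances:
        words = utterance.split()
        if len(words) == 2:
            fillers[words[0]].add(words[1])
    return {fixed: sorted(slot) for fixed, slot in fillers.items()
            if len(slot) >= min_fillers}

toy_input = ["that's right", "that's good", "that's right",
             "that's nice", "by and large"]
strengths = routinize(toy_input)
schemas = form_schemas(toy_input)
```

With this input the sequence that's right is counted twice, and that's emerges as the fixed part of a schema whose slot is filled by good, nice and right. The frozen sequence by and large is counted but does not give rise to a variable schema.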
As mentioned above, in the EC-model, linguistic knowledge is assumed to be available to individual speakers in the form of constantly re-adapted associations to such schemas, but also in the form of non-schematized but routinized associations.8 In addition, schemas are used as a source for “generative”, productive and creative language use. However, this description of individual knowledge is of course not sufficient for language to work as a communicative tool. Since language is not solipsistic but “has a fundamentally social function” (The Five Graces Group 2009: 1), one has to assume that there is some sort of match of the routinized and schematized linguistic associations in the minds of different speakers of a language. Hence, as expressed in the definition of the notion of entrenchment, the entrenchment-processes taking place in individual minds are not only subject to internal cognitive processes but also, of course, to external factors, i.e. to the input given by other speakers and the amount of output produced by speakers themselves, which arguably serves as a particularly privileged form of input.9
This is where the second major element of the EC-model comes in: the sociopragmatic processes. Linguistic associations in the minds of individual speakers are continuously and mutually strengthened as a result of actual communication in social situations. Trivial as it may appear, it must be stressed that the process of communication is the prerequisite for the mutual exchange of linguistic knowledge. While producing and comprehending linguistic utterances that are of course primarily meant to convey information and fulfil other communicative functions, interlocutors invariably and inadvertently process associations linked to linguistic elements and schemas, thus inevitably strengthening their routinization. Communication can take place synchronically, within the temporal boundaries of a shared speech event characterized by the exchange of spoken utterances, and also asynchronically, when written utterances are read at a later point in time. All four modes, writing, reading, speaking and listening/comprehending, are assumed to have effects on entrenchment processes. It is in this way that communication turns out to be the basic source of the much-quoted frequency effect on routinization and entrenchment mentioned above. While the details of how frequency of occurrence in actual spoken and written discourse translates into routinization are still far from clear, a set of processes known as co-adaptation (Ellis and Larsen-Freeman 2009: 91), accommodation (cf. Trudgill 1986: 1–38, Giles, Coupland, and Coupland 1991, Auer and Hinskens 2005, Giles and Ogay 2006), or alignment (e.g. Pickering and Garrod 2004) are very likely to play key roles. These terms capture the tendency of speakers to imitate and adapt features and elements encountered in the speech of their interlocutors, usually as an act of performing solidarity and group identity.10 One possible reason for this tendency is “the social pressure to speak like others” (Wray 2008: 18), which ultimately results in the establishment of linguistic conventions (Clark 1996: 71, Croft 2000: 98–99). From a neurological point of view, it is possible that the notorious mirror neurons (Pickering and Garrod 2004: 188) play a role in this process.
Co-adaptation contributes to the diffusion of associations related to linguistic elements and features across the members of speech communities. The link between co-adaptation and diffusion is the cognitive process of routinization, which only works, however, when speakers carry over memory traces from concrete language-processing situations – no matter whether they are spoken dialogues or reception of written material – into new processing situations. Auer and Hinskens (2005: 336) use the term “individual long-term accommodation” for this effect. Pickering and Garrod (2004: 217–218, 2005: 89–100) even note that alignment in discourse plays a role in the emergence and conventionalization of routinized semi-fixed expressions: “if an expression becomes sufficiently entrenched [in a conversation, HJS], it may survive that conversation” (Pickering and Garrod 2004: 218). In the study of language change, the process of diffusion is typically related to the spread of innovations (e.g. Croft 2000: 166–183), but in fact it has much further-reaching effects.
8. This view is of course compatible with construction-grammar approaches, but it is more dynamic and much less committed to rash claims concerning the existence of constructions.
9. Note that the “input given by other speakers” is not only an external factor, as suggested in the text, but also an internal one, since what is crucial for further entrenchment is not the objectively given input, but input-as-processed. Studies of reanalysis (e.g. Detges and Waltereit 2002) indicate that a key element of this process resides in hearers’ parsing during comprehension rather than speakers’ innovative constructions. I would like to thank Ulrich Detges for drawing my attention to this important point.
As for innovation, in terms of the EC-model, new linguistic associations are replicated as a result of co-adaptation and thus diffuse and spread in the speech community, very much like a contagious virus or disease.11 Significantly, and this is where the EC-model differs from other accounts, co-adaptation and diffusion are also responsible for the stability of the linguistic system, in that those linguistic associations that are frequently repeated in actual situations of language use will resist change, both in the minds of individual speakers and in the speech community. They are constantly renewed and will thus remain part of the shared and conventionalized norm.12 This may be available in the
form of a tacit shared understanding of how communicative tasks are generally accomplished in the given speech community (institutionalization, cf. Brinton and Traugott 2005: 45–47, usualization, cf. Blank 2001: 1596) or – as is the case in all codified languages – is additionally laid down in grammars, dictionaries, or usage guides (codification; cf. Holmes 2008: 110–117). As a counterpart to innovation, extremely rare linguistic associations are in danger of losing their conventionality and becoming obsolete, since they are not reinforced in the minds of speakers and are thus subject to decay and forgetting. Examples of lexico-grammatical patterns that are currently facing this fate include “old-fashioned” expressions such as jolly good, old bean, old boy and others that have a somewhat P.G. Wodehousian ring to them.
In sum, the major processes identified in the EC-model as being constitutive of a dynamic and adaptive model of language are the cognitive entrenchment-processes of association, routinization and schema-formation and the sociopragmatic conventionalization-processes of communication and co-adaptation. Whether diffusion and normation must be modelled as sociopragmatic processes in their own right or as results of the interaction of cognitive processes and co-adaptation is still an open question in the ongoing conception of the EC-model. The EC-model is unique in integrating the cognitive and the sociopragmatic forces in a dynamic conception of both linguistic stability and language change. It is a parsimonious model as it strives to reduce the number of processes and forces required to model linguistic systematicity, variability and dynamicity to the bare minimum. And it claims to be a psychologically and sociologically plausible model of language which relies on well-documented language-specific variants of equally well documented domain-general processes. While the majority of the claims made so far have been backed up by reference to the work of others, the way they are integrated in the EC-model in order to form a coherent model of linguistic structure, variation and change is new. This will be shown in greater detail in the next section.
10. Cf. also Johanson’s (2008) notion of “code-copying” and the work by Enfield (2005, 2008).
11. Blythe and Croft (2009) outline a mathematical model of how this could work.
12. This notion of norm is not to be confused with Coseriu’s (1967: 11) understanding of the same term. While Coseriu regards the norm as a level of usuality located between actual speech and his structuralist conception of the system, here the process of normation is indeed part of the emergent and dynamic system.
2.2 The place of lexico-grammatical patterns and pragmatic associations in the EC-model
As this paper focuses on the relation between pragmatic associations and lexico-grammatical patterns, the place of both in the EC-model must next be clarified. In order to do this, a framework of four types of associations which are assumed to underlie language as a dynamic communicative tool and system will be introduced: symbolic, paradigmatic, syntagmatic and pragmatic associations. In keeping with the aims of this paper, syntagmatic associations, which form the cognitive substrate of lexico-grammatical patterns, and pragmatic associations will be described in greater detail in the following account of these four types of associations. Firstly, symbolic associations reciprocally link linguistic forms (on different levels of complexity) to meanings. They provide the cognitive foundation of linguistic signs (cf. Saussure 1916: 98) or constructions (Fillmore, Kay, and O’Connor 1988, Goldberg 1995). As already pointed out, in the EC-model, linguistic signs are considered to be highly routinized and schematized symbolic associations. Secondly, paradigmatic associations link linguistic associations to “competing” associations, i.e. to associations that could potentially enter the focus of attention under the given contextual and cotextual circumstances (cf. Aitchison 2003: 84–91). Routinized paradigmatic associations are the cognitive substrate of the well-known paradigmatic sense-relations (synonymy, antonymy, hyponymy, etc.). They are also essential for the development of variable schemas, since the generalization process involves recognizing the fact that certain elements are interchangeable within an observed pattern. For example, in generalizing the schema THAT’S + EVALUATIVE ADJ from expressions such as that’s right, that’s nice, or that’s lovely, speakers both recognize the identity of that’s and begin to associate right, nice, lovely and other adjectives as paradigmatic competitors in the variable slot of the schema.
Thirdly, syntagmatic associations emerge in the process of production and comprehension by connecting linguistic signs and constructions which follow each other in running text. They can be fleeting associations that are activated in “one-off” online processing situations to construct or make sense of a chain of linguistic stimuli, but, significantly, they can also be routinized and schematized as a result of repeated processing. This effect is particularly relevant in the context of this paper. If syntagmatic associations linking sequences of linguistic elements are routinized and schematized, the
symbolic associations (‘meanings’) are not activated in a gradual, sequential way, with the mind incrementally blending associations related to the component parts; instead there is a direct symbolic association to the meaning of the whole unit or chunk (cf., e.g., Sinclair 1991: 110, Wray 2002: 9, Sinclair and Mauranen 2006: 37–40, Terkourafi 2011: 358–359). In more traditional terminology, this gives rise to what Burger (2010: 82–83) calls “Zeichen zweiter Stufe” [‘second-order signs’, HJS], which are composed of signs that are themselves first-order signs, resulting in the existence of a “sekundäres semiotisches System” [‘secondary semiotic system’, HJS].13 Significantly, whether a sequence of words is processed as a chunked symbolic association or via associations triggered by the component parts depends on the processing history of individual speakers (Wray 2008: 11, The Five Graces Group 2009: 15). For example, if you are a hotline telephone counsellor you are more likely to process the sequence how can I help you today as one holistic chunk than other speakers of English, who are of course familiar with this sequence but hardly ever produce it. It is one of the strengths of the EC-model that such differences are predicted as an integral part of the framework. As more and more speakers begin to share the holistic type of associations, the type of processing can also change on the collective macro-level of the speech community. This means that the chunk becomes conventionalized. A good example of this is the sequence yes we can. In the BNC, which dates from the late 1980s and early 1990s, we find 26 attestations of this sequence of words. Whether this can be interpreted as evidence for a certain degree of routinization in the minds of at least some speakers of English at that time is certainly debatable, but this question can remain open here.
What seems rather clear is that after Barack Obama’s presidential campaign in 2008, the phrase yes we can has undoubtedly been turned into a chunk in the minds of most Americans (and many other native and non-native speakers of English). In the present context it is particularly noteworthy that the chunk comes complete with a rich set of pragmatic associations relating to Obama, his campaign and election victory, the major messages that he was trying to get across with
this slogan, more recently presumably also to whether or not he has been able to live up to his promise.14 Symbolic associations resulting from the schematization of syntagmatic associations can be linked to formally fixed sequences of elements, e.g. in the case of totally frozen expressions that cannot be changed in any way (BY AND LARGE, KITH AND KIN), or to sequences that include open slots that can be filled in various ways (THAT’S + EVALUATIVE ADJ). In a cognitive-linguistic framework, both types can be referred to as chunks, but for the second, variable type the terms schema or schematic construction are more commonly used. Schemas define both invariable slots of patterns and restrictions on how variable slots can be filled (cf. Tomasello 2003: 173–175, Langacker 2008: 17, Behrens 2009: 397). Schemas are available in different sizes – relating to morphologically simple and complex units – and on different levels of abstraction – from lexically specific to highly schematic. As a result, the network is “heteromorphic” (Wray 2008: 12, 20), marked by multiple association routines and a considerable degree of redundancy (Nattinger and DeCarrico 1992: 23, Bybee 2010: 24, for neurological evidence, see Capelle, Shtyrov, and Pulvermüller 2010: 198–199). For more or less any given linguistic element many different routinized and schematized associations compete for activation. The routinization of syntagmatic associations produces the well-known “priming” effect (cf. Hoey 2005: 7–14) that speakers and hearers are often able to anticipate the occurrence of a second element of a recurrent sequence as soon as they are confronted with the first one. In the EC-model, routinized syntagmatic associations are thus the cognitive source of the frequently voiced impression that collocations show a certain degree of “predictability” (cf. e.g. Greenbaum 1970, Sinclair 1991: 110, Herbst 1996: 389).
13. See Grzybek (2007: 202–204) on the roots of this distinction in Barthes (1957) and Russian phraseology and an interesting discussion of further implications on pragmatic aspects of phraseology.
Do all semi- or fully-fixed expressions emerge by means of the gradual routinization of syntagmatic associations? Probably not. According to Wray, there are “sequences that start off formulaic” (2002: 59), i.e. as “long strings with a complex meaning that have never got broken down” (2002: 61), on the one hand, and “sequences that become formulaic” (2002: 60), i.e. as “strings of smaller units that have got stuck together” (2002: 61), on the other. Regarding the question of how sequences “become formulaic in the first place”, she rightly emphasizes that “[t]his question needs to be answered slightly differently depending on whether it relates to the language as a whole or the language knowledge of an individual” (Wray 2002: 60) – a remark which is in keeping with the distinction between individual entrenchment and collective conventionalization in the EC-model. On the micro-level, individual speakers can acquire routinized and schematized syntagmatic associations either wholesale, i.e. directly as holistic symbolic associations linking meanings and communicative needs to complex sequences of words, or by gradually chunking them as a result of repeated usage. As the large majority of formulaic sequences are already more or less conventionalized in the speech community, the first type of acquisition is presumably much more frequent than the second (cf. Wray’s “needs-only analysis” 2002: 130–132). On the macro-level of the speech community, chunks can also emerge gradually by means of long-term fusion processes (as seems plausible for complex prepositions of the type in spite of, cf. Beckner and Bybee 2009), but they can also be the result of the spread of the chunk. While this may suggest that the processes taking place in individual minds and those taking place in society are essentially the same, the EC-model makes it quite clear that this is not the case: on the one hand, chunking, as an individual cognitive process, cannot affect the speech community and result in long-term change unless its effects diffuse across members and are handed over to later generations of speakers; and on the other hand, individual chunking processes are subject to the perception of the input and co-adaptation processes in actual discourse situations.
14. That writers rely on the availability of a chunk-like association linked to yes we can even in a non-native speaker environment is demonstrated by the headlines Yes, we can’t found in a magazine accompanying the German broadsheet Süddeutsche Zeitung (SzExtra, 04.–10.02.2010, p. 10) and No, you can’t used as a hook in the German weekly Die Zeit (15 February 2012) (http://www.zeit.de/2012/08/USA-Atomkraftwerke). Thanks to Sylvia Jaki for the reference to the first of these sources.
Pragmatic associations connect symbolic, paradigmatic and syntagmatic associations with perceptual input garnered from external situations.15 While pragmatic associations share with the other three types of associations the underlying cognitive process that is at work, they are special in two ways: on the one hand, pragmatic associations are associations of a second order in the respect that they operate on the other types of associations, in particular on routinized associations, and thus seem to rely on them. This relates to the common conception of pragmatics as being some kind of facultative appendix that can, but need not, be invoked in linguistic description if helpful or necessary. Yet, on the other hand, pragmatic associations are arguably also the ultimate source of the other three types of associations, at least in any viable usage-based model of language, since symbolic, paradigmatic and syntagmatic associations can only emerge from actual usage events which invariably involve pragmatic associations. The key to reconciling these two seemingly opposing roles attributed to pragmatic functions lies in the routinization of pragmatic associations, which results in the emergence and routinization of symbolic associations, upon which, subsequently, new pragmatic associations can operate. Cotextual and contextual associations then become a part of symbolic associations, or, vice versa, as Nattinger and DeCarrico put it with reference to Levinson (1983: 33), “aspects of linguistic structure sometimes directly encode features of context” (1992: 4).16
According to the EC-model, pragmatic associations link the external speech event with internal cognitive processes and hence constitute the main interface between entrenchment processes, on the one hand, and conventionalization processes, that is communication and co-adaptation, on the other. In keeping with a general understanding of pragmatics as having to do with language-use in actual contexts and “meaning-in-context” (Bublitz and Norrick 2010: 4), pragmatic associations are defined as linking other types of associations, especially symbolic ones, to perceptual stimuli relating to the situational context (including discourse participants, places, settings, objects which may serve as targets of deictic references, types of events); the linguistic co-text (especially what was said before); the communicative intentions of speakers (including illocutionary acts and implicatures).
15. Cf. Hoey’s more general definition of the notion of pragmatic association: “Pragmatic association occurs when a word or word sequence is associated with a set of features that all serve the same or similar pragmatic functions (e.g. indicating vagueness, uncertainty)” (Hoey 2005: 27, original emphasis omitted).
16. I would like to thank Peter-Arnold Mumm for sharing with me his thoughts on the ubiquitous two-sided effects of pragmatic associations on the other types of associations.
For the purposes of this paper, four effects of pragmatic associations which are predicted by the EC-model should be highlighted. Firstly, since pragmatic associations link symbolic associations to usage events, they are instrumental in creating sensitivity to characteristics of lexemes and constructions with respect to style. For instance, there can be no doubt that competent speakers of English know that competing idiomatic expressions referring to the death of a person (passed away, has left us, kicked the bucket, bit the dust, is partying with angels etc.) convey different attitudes to what is said and are appropriate in different types of situations. How does this knowledge come about? It is only possible because language users apparently do not have highly reductive, feature-like representations of the meanings of words and constructions of the type [BECOME NOT ALIVE] for die, but can indeed rely on rich memory of situations where different expressions meaning ‘die’ were used (cf. Bybee 2010: 55–56).17 By virtue of the routinization of such pragmatic associations, language users are able to develop style sensitivity. In addition, as already mentioned, pragmatic associations become parts of or are even turned into symbolic associations whose communicative impact no longer depends on the specific context. Secondly, language users derive their knowledge of register differences (cf. Wray 2008: 117) from the routinization of pragmatic associations between certain linguistic forms and occasions when they were uttered, resulting for example in the awareness that patterns such as payment in due course, obstruction of justice, take into custody, or judgment notwithstanding the verdict are typically produced by legal experts when discussing legal matters. Thirdly, pragmatic associations are also a necessary source of connotative meanings attached to lexemes and patterns in the minds of individual speakers and eventually whole speech communities. Like style sensitivity, the knowledge of semantic nuances such as ‘positive’, ‘negative’, ‘offensive’, ‘derogatory’, ‘ironic’, ‘euphemistic’ and all kinds of more specific connotations must be derived from the experience of individual usage events (cf. Feilke 1996: 156–180).
This pertains to the lexicon as a whole but also, for example, to the knowledge that the collocation fine friend is typically used in an ironical way with negative connotations, while good friend and old friend have positive connotations. Similarly, the positive connotations attributed to the expressions a rough diamond, up and coming, or know something inside out (cf. Gläser 1986: 32) and the negative ones of breed like rabbits and common as muck (Gläser 1986: 31) must be learned by extracting them from contexts via pragmatic associations. Finally, and this is to be probed more deeply in Section 4, pragmatic associations are likely to be instrumental, or even play a central role, in the acquisition of syntagmatic chunks, especially in early language acquisition but also throughout a speaker’s life. It is a very robust finding in usage-based approaches to language acquisition that infants and toddlers first learn unanalyzed chunks (Behrens 2009: 393) and only later begin to segment and generalize. Crucially, these chunks are learnt in social situations characterized by shared attention (Tomasello and Rakoczy 2003, cf. Behrens 2009), and thus it is more than likely that “these chunks would also be learnt together with their associated functions in context” (Nattinger and DeCarrico 1992: 11, cf. Tomasello 2003). As pointed out above, in the EC-model, it is assumed that this pragmatically co-determined learning or reorganization process does not stop with the end of the so-called critical period but extends throughout a speaker’s life.
17. The idea that situational properties of earlier experiences with linguistic expressions are stored and constantly added to the existing stock of knowledge about expressions is central to so-called exemplar theories, a type of usage-based model which is particularly prominent in the field of phonology (cf. e.g. Pierrehumbert 2001, Bybee 2010: 14–32).
3. The purview of the field of lexico-grammatical patterns
Because of its fluid and fuzzy boundaries and its internal heterogeneity, the field of lexico-grammatical patterns is difficult to demarcate from other phenomena and to differentiate internally (cf. e.g. Granger and Paquot 2008, Wray 2012). While it is neither necessary nor intended to contribute to solving the classificatory problem in this paper, for descriptive and terminological purposes an attempt must be made to superimpose some kind of terminological working structure on this notorious jungle.18 The umbrella term that I will use in this paper in order to avoid any theoretical commitments is lexico-grammatical patterns. These are defined in admittedly rough terms as recurrent sequences of lexical and grammatical elements which serve an identifiable function.19 Figure 1 uses the dimension of frozenness/variability to chart the terrain in such a way that three groups of types of lexico-grammatical patterns can be formed. The individual types are briefly explained in what follows, adding further well-known dimensions such as degrees of transparency/compositionality, syntactic (ir-)regularity and pragmatic constraints.
18. Superordinate terms having different semantic nuances and coming from a variety of theoretical backgrounds which are commonly found in the literature include multi-word units (e.g. Schmitt 2000: 96–100), FEIs (i.e. fixed expressions including idioms; Moon 1998), formulaic sequences (Wray 2002, Schmitt 2004), formulaic language (Wray 2008), prefabricated routines or prefabs (Erman and Warren 2000, Bybee 2010), routine formulae (Coulmas 1981), extended units of meaning (Sinclair 1996), lexical phrases (Schmitt 2000: 101–102), (lexical) chunks (Lewis 19931, Schmitt 2000: 101, Bybee 2010: 33–37), sedimented patterns (Günthner 20112: 158), (syntactic) gestalts (Aijmer 2007: 44, Auer 20073: 97, Imo 2011).
Group 1: More fixed lexico-grammatical patterns
Routine formulae: syntactically and semantically fixed phrases tied to social situations and pragmatic acts such as greetings, apologies, thanks; these are not integrated in syntactic structures but function in a syntactically autonomous way (great to see you, how are you doing, excuse me, thank you so much, long time no see).
Transparent conventional phrases: institutionalized phrases which are formally fixed – with regard to both the elements involved and their order – but semantically more or less transparent (ladies and gentlemen, mind the gap).
Proverbs and proverbial sayings: cliché-like, frozen sequences of words displaying shared cultural wisdom, which are typically not embedded in larger syntactic structures but are propositions, sentences and, arguably, even quoted texts in their own right (out of sight, out of mind; an apple a day keeps the doctor away).
19. It will be noted that this definition deliberately leaves open the perennial issue of what it exactly means for a sequence of words to be “recurrent”. One reason for this apparent surrender lies in the uncertainties regarding the assessment of frequencies discussed in Section 5. See e.g. Jones and Sinclair (1974: 19), Kjellmer (1982: 26) or Clear (1993: 277) for attempts to define thresholds concerning relative paradigmatic frequencies and Church and Hanks (1990), Clear (1993), Stubbs (1995), Manning and Schütze (2001) as well as Grzybek (2007: 196–201) on calculating significance levels of relative syntagmatic frequencies. The reference to “an identifiable function” is not meant to ensure that the sequences studied have an identifiable pragmatic role but used to exclude chance clusters such as the sequence of paper and so which can muster as many as 29 hits in the BNC.
Figure 1. Types of lexico-grammatical patterns arranged on the dimension of frozenness/variability20
20. Literally all readers of earlier versions of this paper have commented on the problems inherent in this classification, drawing my attention to inconsistencies in the application of the key criterion, to the existence of hidden criteria such as abstractness and schematicity and to unconvincing placement of individual items in the list (e.g. pertaining to idioms and proverbs). The basic principle behind the arrangement relies on the variability of the types of phenomena named, which explains why collocations and collostructions are considered more flexible than idioms and proverbs. That the figure appears the way it does in spite of these convincing reservations is not only due to my pig-headedness, but also to the sheer fact that the purpose of the classification is a descriptive and terminological one.
Partly filled periphery constructions: syntactically deviant or somehow salient uses of familiar items subject to specific syntactic restrictions and triggering special semantic and/or pragmatic effects, e.g. the let alone construction and the the X-er the Y-er construction (Fillmore, Kay, and O’Connor 1988, Cappelle 2011) or the not-that construction (Delahunty 2006, Schmid 2011, 2013). The invariable elements of these constructions are highly fixed, while the open slots are of course variable.
Multi-word prepositions and connectors: so-called complex prepositions and connectors usually regarded as the result of grammaticalization processes (in need of, by virtue of, on top of, with regard to, as a result, as a consequence, in contrast; cf. Quirk, Greenbaum, Leech, and Svartvik 1985: 669–671, Hoffmann 2005).
Discourse markers: elements found at clause peripheries, usually separated from clauses proper as autonomous units, serving a range of textual, conversational and interpersonal functions (I see, I mean, you know, mind you; cf. Schiffrin 1987).
Verb-particle constructions: phrasal verbs, prepositional verbs, phrasal-prepositional verbs with more or less opaque meanings (get up, keep up, go through, look at).
Idioms: sequences of orthographic words which are integrated as (parts of) clauses in the syntactic structures of sentences and whose composite meanings cannot or only partly be derived from the meanings of their parts (blow off steam, cry wolf, walk on a tightrope). Idioms cover a range from frozen to more variable expressions and thus straddle the boundary between Group 1 and Group 2. They range from completely opaque (kick the bucket) to idiomatic but largely analyzable expressions (spill the beans) (cf. Svensson 2008), and from expressions made up of entirely familiar elements (bite the dust) to those including otherwise unfamiliar ones (kith and kin) (cf. Fillmore, Kay, and O’Connor 1988: 506–511, Dobrovol’skij 1995).
Group 2: Medium fixed patterns
Collocations: recurrent lexical combinations, typically Adj-N (strong tea, towering figure), N-V (dog – bark, price – drop), V-N (propose – motion, sign – petition), Adj-Adv (highly selective, fully integrated). This class is here taken to include light-verb constructions, i.e. more or less fixed combinations of semantically empty or bleached verbs and nouns (have lunch, take a picture, make a proposal).
Lexical bundles: “simple sequences of word forms that commonly go together in natural discourse”, “regardless of their idiomaticity and regardless of their structural status” (Biber, Conrad, Leech, Johansson, and Finegan 1999: 990–1024; e.g. if you want to, or something like that, I don’t know why). Terms used by other authors to refer to similar phenomena are lexicalized sentence stems (Pawley and Syder 1983), lexical phrases (Nattinger and DeCarrico 1992) and conversational routines (Aijmer 1996). As indicated in Figure 1, lexical bundles (and lexicalized sentence stems) straddle the fluid boundary between Group 2 and Group 3, since many of them – e.g. I don’t know why – are lexically specific but are, of course, at the same time manifestations of lexically more variable syntactic patterns. In addition, they often include highly chunked grammatical elements such as don’t (cf. Bybee and Scheibman 1999), want to (Krug 2000: 117–166), or going to.
Group 3: More variable patterns
Valency patterns: complementation patterns associated with verbs and other valency carriers (cf. Herbst 2010: 191–192).
Collostructions: mutual attractions of lexical elements and schematic (‘grammatical’) constructions, e.g. the tendency of the ditransitive construction to attract the verbs give, tell, send (Stefanowitsch and Gries 2003) or the tendency of the N-that construction to attract the nouns fact, view, or idea (Schmid 2000). As descriptions of collostructions typically start out from schematic constructions and investigate lexemes that are attracted by them (cf. Stefanowitsch and Gries 2003: 214), collostructions can be seen as being complementary to valency patterns, whose description proceeds from lexemes qua valency carriers to patterns.
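The notion of attraction between a lexeme and a schematic construction can be made concrete with a small sketch. Stefanowitsch and Gries (2003) quantify such attraction with the Fisher exact test on a 2×2 contingency table; the following is only a minimal illustration of that general idea, not their implementation. The function name attraction_p_value and all counts are hypothetical and were invented for this example; they are not corpus data.

```python
from math import comb


def attraction_p_value(a: int, b: int, c: int, d: int) -> float:
    """One-sided Fisher exact p-value for a 2x2 contingency table.

    a: tokens of the lexeme inside the construction
    b: tokens of the lexeme outside the construction
    c: tokens of other lexemes inside the construction
    d: tokens of other lexemes outside the construction

    Returns the probability of observing `a` or more co-occurrences
    if lexeme and construction were independent (margins held fixed).
    A smaller value indicates stronger attraction.
    """
    total = a + b + c + d
    lexeme_total = a + b
    construction_total = a + c
    denominator = comb(total, construction_total)
    largest_possible = min(lexeme_total, construction_total)

    p_value = 0.0
    for k in range(a, largest_possible + 1):
        # Hypergeometric probability of exactly k co-occurrences.
        ways = comb(lexeme_total, k) * comb(
            total - lexeme_total, construction_total - k
        )
        p_value += ways / denominator
    return p_value


# Hypothetical counts, invented for illustration only (not corpus data).
attracted = attraction_p_value(8, 2, 2, 8)  # lexeme mostly inside the construction
neutral = attraction_p_value(5, 5, 5, 5)    # lexeme evenly distributed
print(attracted < neutral)
```

In this toy setting the skewed table yields the smaller p-value, which is the sense in which a lexeme would count as more strongly attracted to a construction. Real studies work with corpus-sized counts, and the choice of association measure is itself a matter of debate.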
Although there are many exceptions, two general correlational trends can be observed: Firstly, frozenness shows a relationship to transparency in such a way that the more fixed lexico-grammatical patterns also tend to be more opaque than the more variable ones. Secondly, frequencies of occurrences of actual manifestations, i.e. tokens, of members of these classes tend to increase as we go from the top of Figure 1 to the bottom. This is of course not unrelated to degrees of frozenness/variability, since lexically-filled, substantive patterns found in Group 1 are semantically much more specific and thus less widely applicable than schematic, “grammatical” patterns. Valency patterns, collostructional attraction phenomena and also collocational phenomena on the one end of the scale are clearly more frequent than routine formulae, transparent conventional phrases, proverbs and partly-filled periphery constructions on the other end, with the other categories covering the intermediate ground. Needless to say, the discourse frequencies of those general classes and especially of individual items belonging to them vary considerably depending on text-types, genres and registers. Routine formulae, discourse markers, particle verbs and lexical bundles, for example, are very frequent in spontaneous spoken interaction, while many complex prepositions and connectors are more often used in planned speech and writing (cf. Hoffmann 2005: 95–119). Interestingly, judgments concerning the likelihood that examples from the various classes are stored as schematized, prefabricated chunks follow an inverse trend, with idioms usually being judged as better candidates for holistic processing than valency patterns or collostructions. This is mainly because such judgments typically rest on the most reliable criteria of semantic opacity and syntactic irregularity. The reasonable rationale behind this is that despite their infrequent occurrence opaque idioms must be processed as prefabs since they cannot be calculated online on the basis of rules.
4. The relation between pragmatic associations and types of lexico-grammatical patterns
4.1 Brief survey of previous literature
The existing literature on pragmatic aspects of more or less fixed multiword expressions has largely focussed on two types of patterns: routine formulae (e.g. Coulmas 1981), on the one hand, and various types of recurrent conversational sequences, on the other. Three publications stand out as particularly instructive sources for the present attempt to investigate the relation between lexico-grammatical patterns and pragmatic associations: Pawley and Syder’s (1983) seminal study on lexicalized sentence stems, Nattinger and DeCarrico’s (1992) book on lexical phrases and Aijmer’s (1996) volume on conversational routines. Three further studies that explicitly target pragmatic aspects of idioms or phraseology, among them Strässler (1982) and Filatkina (2007), do not approach the issue from the perspective of the fixed expressions, but set out from classic pragmatic topics such as speech acts, deixis, implicatures and presuppositions, and discuss their relevance for the description of phraseological units. Grzybek (2007: 201–202) emphasizes that all attempts to link phraseological units to specific communicative functions are doomed to failure due to the polyfunctionality of most elements. However, the fact that most types of lexico-grammatical patterns can of course serve several functions and be used in many contexts does not rule out the possibility that their emergence and use are indeed supported, or even motivated, by one or more of their more frequent functions. Many authors who do not focus on pragmatic aspects nevertheless acknowledge the pragmatic potential of certain types of more or less fixed expressions by introducing specific categories. Cowie (1988: 132) proposes a distinction between semantically specialized idioms and pragmatically specialized idioms (cf. also Wray 2002: 58). Fillmore, Kay and O’Connor, in their pioneering paper on let alone, devote a quarter page to the distinction between “idioms with and without a pragmatic point” (1988: 506), but do not dwell on this issue any further. Aijmer (1996: 24–28) investigates items that lend themselves to performing socially and interactionally relevant illocutionary functions such as thanking, requesting and apologizing. Moon subsumes simple formulae, sayings, proverbs and similes under fixed expressions that are “problematic and anomalous on grounds of [...] pragmatics” (1998: 19) – a characterization that is unlikely to do justice to the role of pragmatic associations. Gramley and Pätzold have a category of routinized stereotypical phrases referred to as “pragmatic idioms” (2004: 59), which also includes items that are found in greetings, introductions, partings and other recurrent types of social encounters.21 Pragmatic aspects of lexico-grammatical patterns have also been mentioned as providing a methodological tool for assessing the formulaicity of potential fixed expressions.
While Pawley and Syder (1983) and Nattinger and DeCarrico (1992) laid the foundation for this idea, Read and Nation (2004: 33) explicitly mention the possibility of using a “pragmatic/functional analysis” as an “analytical criterion”, recognizing “that formulaic sequences have important roles in the performance of speech acts and are commonly associated with particular speech events”. Wray (2008: 117–118) includes the tests whether a sequence “is associated with a specific situation and/or register” and “performs a function in communication or discourse” among the possible criteria to be used as a diagnostic for the identification and assessment of formulaic sequences.
21. References to a number of other relevant sources can be found in Moon (1998: 216). Feilke (1996) and Stein (1995) are very instructive investigations of German which highlight communicative functions.
4.2 Survey of the criteria
On the basis of this literature and the definition of pragmatic associations in Section 2.2, two main criteria will be used to assess the interplay of pragmatic associations and different types of lexico-grammatical patterns:
1. Can a specific recurrent communicative intention or illocutionary force be identified as being associated with the potential pattern (cf. Wray 2008: 118)? Such an intention can include conventionalized indirect speech acts (Searle 1975) or frequent implicatures (Grice 1975).
2. Can we identify clear indicators for style and register constraints or connotative meanings extracted from pragmatic associations (cf. Wray 2008: 117)?
The following analysis will not take up the three groups formed above but proceed from clear cases – in which pragmatic associations either undoubtedly do or do not play an important role – to the more interesting, so to speak “critical” cases.
4.3 Pragmatic associations and types of lexico-grammatical patterns
4.3.1 Routine formulae, discourse markers and transparent conventional phrases
As has already been pointed out, routine formulae more or less by definition meet recurrent communicative needs and are closely associated with certain social situations and illocutionary functions. Given the right type of situation (such as when people meet for the first time), the use of certain routine formulae is both predictable and more or less obligatory; the meanings and functions of routine formulae depend on the specific situation and vary according to cultures and social groups (Coulmas 1981: 82–83). Conversely, the choices between competing routine formulae function as markers of styles and hence group identities, cf. the differing implications associated with greetings such as good afternoon, hello, what’s up, hi there, or hey dudes. All this supports the view that pragmatic associations play an important role in the use of routine formulae.22 This is also supported by the observation that routine formulae are prone to change ‘meanings’ and functions as a result of metonymic shifts of pragmatic associations from one aspect of a frame to another (Traugott and Dasher 2004). The EC-model treats these cases as re-schematizations of pragmatic associations. A good example is the formula how do you do, which emerged as a generalized chunk in the 17th century with the more or less transparent meaning and corresponding pragmatic function of inquiring about the health of the person addressed (cf. OED3, s.v. how do you do). Now since these inquiries are usually made in the early phases of social encounters, a pragmatic transfer seems to have taken place, creating an association between this expression and another conventional move common to the early stages of social encounters, namely greeting. Individual speaker/hearers apparently routinized this new pragmatic association and generalized it into a greeting – an understanding which then spread across the speech community and became conventionalized.23 As an additional pragmatic association restricting the use of this chunk even further, how do you do is today only used as a rather formal – and somewhat old-fashioned – greeting in first encounters of two interlocutors.
The boundary between routine formulae and the class of transparent conventional phrases is rather fuzzy. Interestingly, phrases of the latter type, e.g. best before (on a food item), emphasis mine or my emphasis (following a quotation in academic prose), are referred to as “pragmatemes” by Mel’čuk (1995: 176–186) and defined as “pragmatic phrasemes” which are bound to certain situations. This kind of logic can be extended to other transparent but highly conventionalized expressions such as ladies and gentlemen, laughing out loud, frequently asked questions, or what can I do for you, all of which are motivated by strong pragmatic associations along the lines suggested here.
Phrasal discourse markers such as you know, I see, I mean, mind you or I think can be treated alongside routine formulae and conventional phrases, as it seems rather clear that their routinization and re-schematization has pragmatic and discursive foundations. This is irrespective of whether the whole process is modeled as grammaticalization (Brinton and Traugott 2005: 136–140), lexicalization (Aijmer 1996: 10) or indeed pragmaticalization (Aijmer 1997; cf. Brinton 2010: 303–305, Claridge and Arnovick 2010).
In the EC-framework, then, knowledge about routine formulae, fixed conventional phrases and discourse markers – including of course knowledge related to their appropriate use and understanding, illocutionary aims as well as interpersonal and social implications – is explained as a routinization of pragmatic associations linking linguistic choices and usage events. As far as the emergence and developmental paths taken by these formulae are concerned, one can assume that in many cases the conventionalization of chunks and their potential concurrent pragmatic shifts does indeed result from repeated usage, while individual speakers learn them as fully chunked symbolic associations.24
22. With regard to their typical functions, Coulmas (1981: 94–108, 119), for example, distinguishes between discourse-controlling items (wait a minute, over to you), politeness formulae (don’t mention it, I beg your pardon), metacommunicative formulae (that’s all I have to say, are you with me), psychoostensive (you’re kidding, are you sure) and hesitation formulae (I guess, you know). Gläser (1986: 129–152) proposes categories on a more fine-grained level related to specific illocutionary acts including greeting and parting (good evening, take care), congratulating (many returns, a happy new year), apologizing (excuse me, no offence meant), regretting (what a pity, what a shame, I’m sorry), encouraging (take it easy, don’t panic), confirming (you can say that again, you said it), rejecting (don’t give me that, far from it) and warning formulae (mind the gap, watch your tongue).
23. Note that expressions like how are you doing, how are you and how is it going are currently undergoing the same shift of pragmatic associations.
4.3.2 Collostructions, valency patterns and verb-particle constructions
For different reasons, highly schematic, ‘grammatical’ collostructional and valency patterns, on the one hand, and more or less completely lexicalized verb-particle constructions, on the other, are rather unlikely to be related to pragmatic associations in any significant way. Regarding the former, none of the criteria given in section 4.2 can be applied with positive results, as neither the way in which lexemes are attracted by schematic constructions (i.e. collostructions) nor that in which 24. Overt signs of such a fusion-like process (cf. Brinton and Traugott 2005: 63– 67) can be observed in those extreme cases where chunking is accompanied by phonological reductions yielding simplex forms (i.e. univerbation) as in the greetings hi < hiya < how are you, howdy < how do you do or Bavarian pfüatdi ‘bye’ < behüte dich Gott lit. ‘may God watch over you’.
264 Hans-Jörg Schmid
constructions are demanded by lexemes (i.e. valency patterns) seems to be dependent on situational, functional, or other pragmatic aspects in the sense defined above. It is true that individual manifestations of these patterns can serve specific communicative functions – as has been argued to be the case for shell-noun constructions such as the thing is (cf. Delahunty 2011), the truth is or the fact is in Schmid (2001). Arguably, however, these lexicallyspecific patterns are on their way to becoming “sedimented” (Günthner 2011) as “lexicalized sentence stems” (Pawley and Syder 1983: 191) or “syntactic gestalts” (Aijmer 2007: 44), and have thus already made considerable progress in their development towards being conventionalized as fully chunked units.25 Verb-particle constructions, i.e. phrasal verbs, prepositional verbs and phrasal-prepositional verbs, can be treated as fully lexicalized multi-word lexemes boasting their own entries in the dictionary (Cappelle, Shtyrov, and Pulvermüller 2010).26 They do not, therefore, differ very much from other simple lexemes, except maybe in the respect that many of them are seen as belonging to a rather casual style level. While it seems clear that multi-word verbs are results of chunking-like developments – usually described as lexicalization (cf. Traugott 1999: 259) – the question whether or not pragmatic associations were instrumental in this diachronic development must remain open here.27 25. As regards typical, i.e. highly variable and lexically unfilled, clause-level constructions, collostructions and schemas, pragmatic aspects have not been explored in any detail so far. 
Goldberg (1995: 67 et passim) does mention issues such as information structuring, topic-comment arrangement and also style and register as having some relevance, e.g., for the passive construction, and exploits pragmatic considerations for the purpose of the semantic differentiation of argument-structure constructions (1995: 93–95), but does not focus on them in any detail. 26. Cf. Herbst and Schüller (2008: 119–120, 146–147) for a different approach which regards only simple verbs as listed lexemes and treats particle verbs as complementation patterns. 27. In her summary of the papers collected in the volume edited by Brinton and Akimoto (1999), Traugott (1999: 248–250) describes the general development of particle verbs and other complex predicates from Old English to Present-day English in three idealized stages passing from a) open and compositional “phrasal constructions” to b) “collocations and phrasal lexicalizations” and finally c) “idioms”. This classic lexicalization process towards a reduction of syntactic flexibility and semantic compositionality and the emergence
Lexico-grammatical patterns, pragmatic associations, discourse frequency 265
4.3.3.
Multi-word prepositions and connectors and partly filled periphery constructions
Multi-word, “complex” prepositions28 such as in view of, in spite of or with regard to and multi-word connectors such as as a result, on the contrary and in addition are explained by means of the routinization and generalization of syntagmatic associations into chunks in the EC-model (cf. also Beckner and Bybee 2009). In contrast to complex predicates, opinions are divided as to whether the emergence of complex prepositions should be seen as a grammaticalization or lexicalization process (Brinton and Traugott 2005: 65–66, Hoffmann 2005: 60–95). As far as the role of pragmatic associations in the chunking of complex prepositions is concerned, one should keep Hoffmann’s (2005) remark in mind that each element seems to have its own specific history. We have to rely on available case studies, then. Schwenter and Traugott (1995), for example, investigate the history of in place of, instead of and in lieu of and remark that pragmatic aspects must be taken into consideration in such endeavours. In the development of the three items they study, “the pragmatics of expectation” apparently plays a key role for chunking and concurrent semantic changes. More generally, “context-induced reinterpretation[s]” (Schwenter and Traugott 1995: 266) – i.e. shifts of pragmatic associations comparable to those observed for routine formulae above – can be shown to be involved. Bybee (2010: 173–174; see also Beckner and Bybee 2009: 36– 37) demonstrates the role of gradually conventionalized invited inferences (Traugott and Dasher 2004) for the development of in spite of. Systematic evidence on potential pragmatic motivations of partly-filled periphery constructions such as let alone, the X-er … the Y-er, or what’s X doing Y is also not available so far. 
of holistically processed multi-word predicates is accounted for by the EC-model in terms of the routinization, generalization and diffusion of syntagmatic associations resulting in second-order symbolic associations.
28. Ulrich Detges (pers. comm.) has rightly pointed out that multi-word prepositions may in fact not deserve the special status they are awarded here, since they are essentially just manifestations of nominal idioms. They are nevertheless considered a special class here because of the impressively large number of items showing apparently similar historical developments that can be traced back to pragmatic associations. This is also the reason why they are treated differently from partly filled periphery constructions, which seem to be much less systematic in their sources and developments.
What do studies of individual items tell us here? Although they do not probe the question in greater detail, Fillmore, Kay and O’Connor (1988: 532–533) emphasize that speakers’ knowledge about constructions such as let alone includes knowledge about “specific pragmatic functions in whose service they [i.e. constructions] exist” (1988: 534). This can certainly be understood as meaning that the construction owes its existence as a conventionalized form-meaning pairing to its pragmatic functions. The same line of argumentation also seems convincing for the what’s X doing Y construction (what’s that fly doing in my soup), which is closely related to the illocutionary act of a “request or demand for an explanation” and “the pragmatic force of attributing [… an] incongruity to the scene of proposition for which the explanation is required” (Kay and Fillmore 1999: 4; original emphasis omitted). The pattern Him be a doctor? (Akmajian 1984, Lambrecht 1990) is associated with the expression of incredulity (Kay 2004: 677). While Fillmore, Kay and O’Connor (1988: 506) regard the the X-er … the Y-er construction as an idiom “without a pragmatic point”, Nattinger and DeCarrico (1992: 36) attribute to it the function of “expressing comparative relationships among ideas”, admittedly not a very “pragmatic” type of function.29 To take a final example, the strong pragmatic associations and motivations of the not that construction – e.g. not that I care, not that it matters – have been amply demonstrated by Delahunty (2006) and Schmid (2013). As regards the way in which these constructions have emerged, the evidence available also speaks for considerable heterogeneity within this class. The entry for let alone in the OED3 strongly suggests that this gambit has developed by means of repeated usage of the imperative form of the verb let and alone, that is by means of the conventionalization of a gradual chunking process. 
In contrast, not that is very likely a case of constructional borrowing from the Latin fixed expression non quod, which had an equally chunked model in Ancient Greek in the form οὐχ ὅτι ‘not because, not that’ (Schmid 2011).30
An interesting third possibility of how periphery constructions can emerge with the assistance of pragmatic associations can be illustrated with the help of the what’s X doing Y construction. In this case, a likely development is that one lexically-filled and pragmatically determined item served as the source of a schematic pattern, comparable to the way in which individual non-analyzable lexemes (e.g. Watergate, hamburger) can spawn productive morphological schemas (cf. Iraqgate etc., chickenburger etc.). A good candidate for being the source of what’s X doing Y is the line waiter, what’s this fly doing in my soup which is part of a well-known joke. The process as such is also comparable to cases where proverbial quotations (e.g. to be or not to be) provide a model for modifications (e.g. to pee or not to pee, to see and not to see), which are also more or less conventionalized and can thus result in schema-formation.
29. What is interesting about the the X-er … the Y-er construction in the context of the EC-model is that the fully chunked and schematized, lexically-filled idiom the bigger they are/come the harder they fall co-exists with the schematic pattern the X-er … the Y-er.
30. Interestingly, the historical data suggest that a very special pragmatic function may have supported the borrowing of not that, which is first attested in Wycliffe’s New Testament (1382), and its later conventionalization: the translators of the Wycliffe group used not that very frequently to render the Latin non quod and also non quia when it occurred in the marginal glosses in one of their sources, the Postillae litteralis super totam Bibliam written by the influential French theologist Nicolas de Lyra (1270–1349), in order to correct in an anticipatory fashion potential misunderstandings of scripture. In the Book of Judges, for example, there is the potential misunderstanding that Gedeon was offering a sacrifice to an angel, rather than God himself, which is rectified by the gloss “not that Gedeon wolde that the sacri|fice be offrid to him that ap|peride to him, for it is to offre to God aloone” (quoted in Schmid 2011: 303).
4.3.4
Proverbs and idioms
Proverbs and proverbial sayings are typically discussed by emphasizing their epistemological, historical, cultural and ideological background and their expressive, figurative and quasi-authoritative functions in discourse (cf. e.g. Coulmas 1981: 59–65, Gläser 1986: 103–121, Moon 1998: 256–260, Mieder 2007, Norrick 2007). With regard to more specifically pragmatic functions such as illocutionary acts, Gramley and Pätzold provide a good starting-point by stating that “[p]roverbs are said to have a didactic tendency: they suggest a course of action” (2004: 63). Moon attributes a deontic function to “proverbs in the abstract” and concludes that “they can be categorized as directives” (1998: 274). For Strässler, the major illocution of proverbs and other idioms lies in the “assessment of the social structure of the participants of a conversation” (1982: 128). Such direct or indirect illocutionary forces can indeed be attributed to a considerable number of proverbs such as look before you leap, out of sight, out of mind, let sleeping
dogs lie, or the early bird catches the worm. More generally, the use of proverbs in actual discourse can also be said to be triggered by recurrent pragmatic associations related to the somewhat patronizing implications which are supported further by the fact that proverbs are often perceived as being somewhat old-fashioned. From this perspective, particularly instructive examples come from the class of so-called truisms, commonplaces, or platitudes, especially tautological ones such as boys will be boys, enough is enough, or business is business. As these expressions do not seem to have an informative propositional content, at least not on the linguistic surface, it is not unreasonable to assume that their use is entirely controlled by various kinds of pragmatic considerations, among them the persuasive or dismissive function and the types of situations that typically produce them.31
What remains in the class of idioms once all the other types of lexico-grammatical patterns such as proverbs, routine formulae and transparent conventional phrases have been deducted is rather difficult to assess with regard to the role played by pragmatic associations. Essentially, what we are left with are idioms of the nominative, rather than propositional, type (e.g. Gläser 1986: 49, Grzybek 2007: 194, 203–204, Burger 2010: 36–37), which do not lend themselves to performing fully-fledged illocutionary acts like complaining, disagreeing, or complimenting. Even idioms extending across several clause constituents such as keep tabs on, have an axe to grind, or fall on deaf ears do not seem to be associated with specific types of illocutionary acts. Nevertheless, pragmatic associations are certainly not irrelevant for the existence and use of idioms. On a rather anecdotal note, it is remarkable that people will often explain the meanings of idioms by means of phrases such as “you use this when …” rather than “this means …”, thus referring to pragmatic usage conditions rather than semantic aspects proper. In addition, the widespread agreement (Moon 1998: 68–74, 215–277, Burger 2010: 81–82) that many idioms are particularly expressive and figurative ways of conveying meanings and attitudes, carry rich connotations, serve specific functions in discourse and are highly genre-, register- and style-sensitive can be taken as evidence for the existence of strong pragmatic associations.
31. What should not be underestimated is the fact that unlike idioms and collocations, proverbs and commonplaces are fully saturated propositions containing finite verbs, or indeed autonomous texts, which lend themselves to performing fully-fledged illocutionary acts and carrying other types of pragmatic associations.
4.3.5
Collocations and lexical bundles
Collocations and lexical bundles jointly form Group 2 in Figure 1 above, since they both exhibit medium degrees of frozenness/variability. The apparent similarity between the two notions or phenomena also shows in Biber, Conrad, Leech, Johansson and Finegan’s (1999: 992) claim that “three-word bundles can be considered as a kind of extended collocational association”. While this may be true in some respects, I will try to show in the following that the two phenomena differ considerably with regard to the role played by pragmatic associations. The terminological space carved out by the notion of collocation is more or less reserved for significant syntagmatic associations between words which do not fit into other, more clearly definable categories (Schmid 2003). This is by no means a deficit to be deplored but it is a result of how the space of lexico-grammatical combinations and patterns has traditionally been partitioned: at one end of the continuum, free and unpredictable combinations fall within the remit of “open-choice” syntax; at the other end, syntagmatic associations which are, or gradually become, fixed, opaque and highly predictable belong to the fields of idioms and conventional fixed expressions (Sinclair 1991: 110). The notion of collocation covers the intermediate ground. The notion of lexical bundles was originally introduced to refer to recurrent building blocks of discourse which cannot be defined and delimited in terms of grammatical functions, are highly register-sensitive and can be identified more or less mechanically in corpora due to their frequency of occurrence (Biber, Conrad, Leech, Johansson, and Finegan 1999: 990, Biber 2006: 134). Their cognitive status was deliberately left open. More recently, however, researchers have begun to investigate whether at least some lexical bundles are stored and processed as holistic chunks and produced substantial evidence suggesting that they indeed are (cf. 
Conklin and Schmitt 2012 for a recent survey). Schmitt, Grandage and Adolphs (2004), who used a psycholinguistic dictation experiment to test holistic processing of frequent lexico-grammatical clusters with native and non-native test participants, come to the conclusion that corpus frequency alone is not a reliable predictor of storage and retrieval type. Using a self-paced reading task, Schmitt and Underwood (2004: 187) arrived at rather unsystematic results, which forced them to conclude that “it must be questioned whether the self-paced reading task is the best methodology to research formulaic sequences”. Following up on Schmitt and Underwood’s lead, Tremblay,
Derwing and Libben (2009) applied the method once more to probe the question whether lexical bundles are stored and processed as single units. They found that lexical bundles were read significantly faster than control strings if participants were allowed to go through the text in a chunk-by-chunk and sentence-by-sentence mode. In contrast, when the reading had to be carried out in word-by-word display, the facilitatory effect of holistic processing was disrupted.32 Using a phrasal decision task, Arnon and Snider (2010) showed that more frequent four-word lexical bundles are processed significantly faster than less frequent ones. Their results also indicate that frequency must be treated, and tested, as a gradient rather than categorical – i.e. high vs. low frequency – variable, which they interpret as demonstrating that there is no clear-cut boundary between stored/represented and computed four-word lexical bundles.33 Interestingly, Tremblay, Derwing and Libben comment on the claim that lexical bundles are linked with certain discourse functions and tend to occur in certain positions in sentences. They remark:
However, the LBs [i.e. lexical bundles, HJS] used here were not embedded in their usual place within a sentence and as such did not carry the discourse functions they have been said to portray, if any at all. This suggest [sic!] that even though LBs might bear more often than not a set of specific discourse functions, there is no inherent association between the two (2009: 273).
32. An alternative explanation for these findings suggested to me by Susanne Handl could be that word-by-word presentation disrupts the processing of what Hunston and Francis (2000: 215) call “pattern-flow”, that is the way in which lexico-grammatical patterns sequentially overlapping in an utterance are worked out. 33. Tremblay, Derwing and Libben also note that “it is still unclear how exactly […] the term ‘stored’ [is] defined” and that “[f]urther research is needed to determine what exactly is storage” (2009: 272), adding that their test design does not discriminate between “knowledge that [individual words] go together” and are thus “linked together through combinatorial knowledge” and fully holistic chunk-storage. As the brief outline of the EC-model in Section 2 has indicated, such imponderables are predicted by the model, which does not work with dichotomous types of storage and retrieval but relies on a dynamic conception of degrees of association strengths resulting from degrees of routinization. In addition, as I have claimed (cf. Wray 2008: 11), different speakers may well process identical stimuli in different ways.
This is not the only conclusion one can draw from their results, however. It could also be the case that the pragmatic associations triggered by lexical bundles do not only become routinized but “emancipate” from individual usage events and embark on a journey towards becoming symbolic associations as a result of increased routinization. In traditional terms, recurrent context-dependent pragmatic meaning components become lexicalized as semantic, i.e. context-independent meanings. How can this claim be substantiated? Firstly, it is supported by the work by Pawley and Syder (1983), Nattinger and DeCarrico (1992) and Aijmer (1996), who convincingly demonstrate that the pragmatic properties of the lexical sequences they study are more or less inherent to them and thus context-free rather than context-dependent meaning components. Secondly, additional evidence can be adduced by a look at the collection of the most frequent lexical bundles garnered from Biber, Conrad, Leech, Johansson and Finegan (1999), which is given in Table 1. This table includes only those four- and five-word lexical bundles which are marked as occurring more frequently than 100 occurrences per million words (symbolized by ***) and 40 occurrences per million words (**) in conversation by Biber, Conrad, Leech, Johansson and Finegan (1999: 1001–1014). In a sense, then, these are the most “successful” instantiations of the class of lexical bundles in conversation.34
Table 1.
Frequent lexical bundles in conversation and their pragmatic functions35
Lexical bundles | Pragmatic function
** I tell you what; ** I was going to say; ** was going to say | S announces a future speech act
** you don’t have to | S gives H permission
** I don’t think so; ** I thought it was; ** I think it was | S expresses disagreement with H
*** I don’t want to; ** I’m not going to; ** I would like to; *** I was going to | S informs H about S’s intentions (present and past)
*** I don’t know what; ** I don’t know how; ** I don’t know if; ** I don’t know whether; ** I don’t know why; ** well I don’t know; ** oh I don’t know | S informs H about S’s lack of knowledge
*** are you going to; ** what are you doing; ** if you want to | S inquires about H’s intentions
** do you know what | S inquires about H’s knowledge
** what do you think | S inquires about H’s state of mind
** you want me to; ** do you want me; ** you want to go; *** do you want to; ** do you want a; ** what do you want | S inquires about H’s wishes
** are we going to | S inquires about S’s and H’s common future actions
** It’s going to be; ** going to be a; ** going to have to | S makes a prediction
** have a look at; ** let’s have a look; ** you don’t want to | S makes a suggestion
** I said to him | S reports a previous speech act
** you know what I; ** you know what I mean; ** know what I mean; ** what do you mean | S intends to secure understanding
** thank you very much | S thanks H
?** the end of the; ?** at the end of | S makes a reference to time
?** or something like that | S marks vagueness
34. The items listed in the table are literally just the tip of an iceberg of a much larger set of somewhat less frequent lexical bundles. Rich material on more specific items including a larger number of content words can also be found in Pawley and Syder (1983: 206–208, 213–214), Nattinger and DeCarrico (1992) and Aijmer (1996).
35. The +-signs used by Biber, Conrad, Leech, Johansson, and Finegan (1999: 1001) to indicate that lexical bundles are incorporated in larger lexical bundles have been omitted.
36. What has to be noted is that the individual items that are parts of frequent lexical bundles are of course also high-frequency items themselves and are therefore more likely to occur in any cotext than rarer words, irrespective of their pragmatic utility.
I have arranged the items in the table in such a way that common pragmatic associations, which are of course not mentioned by Biber et al. but have been added to support the present argument, come to the fore. Rough glosses of these associations are rendered in the right-hand side column of the table. The arrangement indicates that all items listed – except the final three marked by question marks – can easily be associated with quite specific pragmatic functions. The list is dominated by conventionalized direct or indirect requests and other directive speech acts of various types (cf. Nattinger and DeCarrico 1992: 49–54, Aijmer 1996: 124–199) and expressions of beliefs, intentions, or states of (lack of) knowledge (I don’t know + wh-element). Particularly revealing examples include highly conventionalized indirect speech acts such as do you want me, you don’t have to, or I don’t think so, expressing speakers’ inquiries for hearers’ wishes, speakers’ giving permission to hearers and speakers’ expression of disagreement respectively. In order to fully appreciate the pragmatic potential of these lexical bundles, one should consider what is not included in this list: one does not find any familiar collocation-like sequences such as put it in the fridge (0.16 occurrences per million words in the BNC), take me home (0.39), or go to work (2.18).36 Instead, the list seems to indicate that lexical bundles are a mirror of what people in face-to-face social interactions most frequently negotiate: they exchange information concerning states of mind, intentions and plans for future actions, motivations for past actions; they reject each other’s opinions and try to secure understanding; they give and ask for permission; they inform each other about their intentions, and so on. It is
thus not unlikely that the lexical bundles in the list in Table 1 stand out in terms of frequency because they reflect what Nattinger and DeCarrico (1992: 63–64), admittedly having in mind a much more specific meaning, call “necessary topics”.37 All these observations stand out in stark contrast to what can be said about collocations, which seem to lack such pragmatic associations more or less entirely (cf. Nattinger and DeCarrico 1992: 36). This is partly due to the fact that collocations are not fully saturated propositions but link verbs and objects, subjects and verbs, modifying adjectives and nominal heads, and modifying adverbs and adjectival heads or verbs. They are not represented as finite clauses and thus do not lend themselves to performing communicative functions such as illocutionary acts. Frequent lexical bundles, on the other hand, superficially similar to collocations as they may seem, can be motivated and fostered by pragmatic associations. It is argued here that this is instrumental in the process that can turn at least some, particularly the most frequent, lexical bundles into complex symbolic associations. Remarkably, if a collocation or, more precisely, an expression based on a collocation somehow “manages” to assume a special pragmatic function, as is the case in the typically ironic you are a fine friend or he is a fine friend, then chunking proceeds further up to the point where we would not classify the chunk as a collocation anymore. So basically, collocations are, more or less by definition, never processed as fully schematic chunks but can only become routinized to a certain degree. 5.
Pragmatic associations, discourse frequency, chunking and salience
The overall picture emerging from the discussion in Section 4 can be summarized as follows: Pragmatic associations seem to play an important role in the routinization, schematization and chunking of some types of lexico-grammatical patterns, while they seem to be more or less irrelevant for other types. The former category includes types of lexico-grammatical patterns subsumed in Group 1 in Figure 1 above (i.e. routine formulae, transparent conventional phrases, proverbs, discourse markers and complex prepositions) as well as, notably, lexical bundles, while the latter contains the Group 3 types of valency patterns and collostructions as well as particle verbs and collocations. As for idioms, pragmatic associations are mainly related to attitudinal stance and stylistic choices rather than illocutionary forces.
37. Nattinger and DeCarrico (1992: 63) use the term necessary topics to refer to “topics about which learners are often asked, or ones that are necessary in daily conversation” such as autobiographical questions, or questions relating to quantities, time, location, weather and personal likes and preferences.
In view of these results, it can be concluded that pragmatic associations seem to contribute to the routinization and schematization of lexico-grammatical patterns in different ways: Firstly, pragmatic, especially illocutionary, utility can be a reason for repeated usage resulting in a gradual chunking process, both in individual minds and in the speech community. Secondly, recurrent pragmatic associations can foster the associative symbolic strength of already chunked elements and hence their stability in the speech community. And thirdly, pragmatic associations can strengthen the productive role of schemas related to recurrent situations and communicative intentions. If it is indeed the case that pragmatic associations support chunking, one may be inclined to ask if and in which way pragmatic aspects are related to frequency of occurrence, which is often seen as a crucial factor influencing chunking-related grammaticalization processes such as fusion, coalescence and univerbation (Bybee 2010: 46–53). The more frequently instantiated phenomena in Group 3 (valency patterns, collostructional as well as collocational attractions) have been found less likely to be associated as routinized and schematized patterns and to be supported by pragmatic associations, while the rarer phenomena in Group 1 are both chunked and motivated or at least strengthened by pragmatic associations. Does this mean that pragmatic associations have a stronger effect on routinization and chunking than discourse frequency? To address this question, a closer look at what discourse frequency actually means and how it comes about is needed. To begin with, frequency is never frequency as such, i.e. 
absolute frequency, but always relative frequency, that is the frequency of occurrence of one thing as compared to that of another (cf. Hoffmann 2005: 148–149). Next, two types of relative frequencies must be distinguished, relative frequency with regard to paradigmatic competitors and relative frequency with regard to syntagmatic companions. The former, which is similar to Hoffmann’s (2005: 107–110) idea of “conceptual frequency” and Geeraerts’ (2006: 85) notion of “onomasiological salience” compares the frequency of a given element to the frequencies of paradigmatically related items, for example when one observes that the word dog is more frequent than, say, co-hyponyms such as camel or tapir and the fairly technical hyperonym mammal or the rarer hyponym
collie. In the EC-model, this paradigmatic relative frequency38 serves as an indicator for the probability with which speakers activate one symbolic association that competes with other symbolic associations related to it by means of paradigmatic associations. Paradigmatic relative frequency can thus be regarded as an indicator for “cotext-free entrenchment” (Schmid 2010: 120). The second type of relative frequency is particularly relevant for the study of syntagmatic associations and lexico-grammatical patterns. It concerns the proportion of uses of a given form (in a corpus) in a certain syntagmatic environment as opposed to uses of the same form in others and can serve as an indicator for “cotextual entrenchment” (Schmid 2010: 120). This type of frequency is measured in different ways by the well-known range of lexical association statistics including t-score, log-likelihood, mutual information and others (cf. e.g. Evert and Krenn 2001 for a survey). The form kith, for example, is very special in this respect, because it invariably occurs in the syntagmatic environment of kith and kin. So it has a relative syntagmatic frequency of occurrence of 100% with respect to this pattern. As a very occasional glance at a modern desktop or learners’ dictionary will show, other words are of course much more versatile but show tendencies to recur in identical or similar environments, thus giving rise to collocations and all the other kinds of lexico-grammatical patterns discussed above. At present, we know deplorably little about how paradigmatic relative frequencies and syntagmatic relative frequencies are related to each other and how their interaction affects degrees of routinization and schematization (cf. Schmid 2010, Schmid and Küchenhoff 2013). 
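The syntagmatic type of relative frequency lends itself to a small worked illustration. The following Python sketch is offered only as an aid to understanding, not as a procedure used in the studies cited: the function names are hypothetical, and all corpus counts are invented for the purpose of the example. It computes the proportion of a form's occurrences that fall inside a given pattern, together with two of the association statistics mentioned above in their commonly used textbook forms (mutual information as the base-2 logarithm of observed over expected co-occurrence, and the t-score as observed minus expected co-occurrence divided by the square root of the observed count).

```python
import math


def syntagmatic_relative_frequency(pattern_count, form_count):
    """Proportion of all uses of a form that occur inside a given pattern."""
    if form_count <= 0:
        raise ValueError("form_count must be positive")
    return pattern_count / form_count


def expected_cooccurrence(count_a, count_b, corpus_size):
    """Co-occurrence count expected if the two words were independent."""
    return count_a * count_b / corpus_size


def mutual_information(pair_count, count_a, count_b, corpus_size):
    """Mutual information score: log2 of observed over expected co-occurrence."""
    expected = expected_cooccurrence(count_a, count_b, corpus_size)
    return math.log2(pair_count / expected)


def t_score(pair_count, count_a, count_b, corpus_size):
    """t-score: (observed - expected) / sqrt(observed)."""
    expected = expected_cooccurrence(count_a, count_b, corpus_size)
    return (pair_count - expected) / math.sqrt(pair_count)


# Invented counts: a form that occurs 50 times, always inside the pattern.
kith_total = 50
kith_in_kith_and_kin = 50
print(syntagmatic_relative_frequency(kith_in_kith_and_kin, kith_total))  # 1.0
```

With these invented counts the proportion is 1.0, which corresponds to the 100% reported for kith. A more versatile word would yield a much lower proportion for any single pattern, even where its association scores with particular companions are high.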
It seems to be clear that a “cranberry” (Moon 1998: 21) idiom-part such as kith is firmly associated with the pattern … and kin on the syntagmatic level, but can hardly be claimed to be entrenched as a serious paradigmatic competitor to other, more versatile nouns. Presumably, this form is unlikely to be activated by itself as a symbolic association in its own right, but will be activated effortlessly as a part of the chunked form kith and kin when pragmatic and semantic circumstances call for an activation of this fixed phrase. However, the way in which the relative paradigmatic and syntagmatic frequencies of the overwhelming majority of more versatile lexical items affect entrenchment processes is far less clear.
38. Note that paradigmatic relative frequency is essentially a theoretical notion which is very difficult to measure. It is based on an onomasiological perspective whose operationalization requires knowledge of all potential competitors for the encoding of a given idea (in a given corpus). In actual practice, absolute frequencies of occurrences in corpora or normalized relative frequencies per million words are taken as a proxy for the assessment of paradigmatic relative frequencies, but this is of course not the ideal solution (cf. Geeraerts, Grondelaers, and Bakema 1994 for an interesting approach).
It seems possible to make some progress regarding this question by relating the two types of relative frequencies to pragmatic associations. In order to do so, the concept of salience or saliency (cf. e.g. Giora 2003, Hoffmann 2005: 148–152, Schmid 2007) must be introduced. Generally speaking, this notion refers to the potential of any stimulus to attract a person’s attention, enter a person’s focus of attention and activate certain associations. In Schmid (2007: 119–120), two main types of salience are distinguished, cognitive salience and ontological salience. Cognitive salience concerns the activation of associations (“concepts”) in given speech events, either caused by the presence of external stimuli or by spreading activation in the network. Associations are said to be salient if they are in the present focus of attention. Ontological salience, on the other hand, is defined as the potential of external stimuli to attract attention and activate certain associations (cf. Hoffmann 2005: 151). For example, by their very nature as living beings, humans and animals are more likely targets of our attention than, say, lampposts or doorknobs. Ontologically salient entities have a better chance of triggering cognitively salient associations than ontologically less salient ones. As is captured in so-called salience or empathy hierarchies (cf. Silverstein 1976, Langacker 1991: 306), speakers typically find themselves more salient than they find hearers, followed by other humans, animals, physical objects and finally abstract concepts. This is not the whole story, however. 
While ontological salience may indeed constitute an important perceptual foundation for cognitive salience, its effects are undoubtedly superseded by what could be called situational or pragmatic salience, that is, the likelihood that a given stimulus will grab our attention in a given discourse situation. This aspect is highlighted by the short definition of salience given by Smith and Mackie (2000: 66) – “the ability of a cue to attract attention in a context” (my emphasis) – and, at least with regard to the temporal element, also in the characterization provided by Chiarcos, Claus and Grabski: “Salience defines the degree of relative prominence of a unit of information, at a specific point in time, in comparison to other units of information” (2011: 2; my emphasis). If you are about to enter your house, the doorknob will become very salient, and if you are trying to park your car without scratching it against a lamppost you
will make sure that you focus on where the lamppost is located. Pragmatic salience can thus override both cotext-free entrenchment and cotextual entrenchment, and it is the interplay of ontological salience and pragmatic salience which produces the cognitive salience that ultimately determines frequency of use or “frequency of being talked about” (Croft 2000: 76). The large body of experimental evidence on the processing of non-literal language use (metaphor, irony etc.) collected by Giora and her collaborators, whose results are enshrined in the well-known graded-salience hypothesis (cf. Giora 2003 and 2012 for surveys), suggests that even though pragmatic and cotextual salience can override cotext-free entrenchment, cotext-free entrenchment is never completely inoperative. This insight now puts us in a position to go back to the relation between pragmatic associations, chunking and discourse frequency. Consider first the case of proverbs and tautological clichés like let sleeping dogs lie or boys will be boys. These expressions, whose status as fully chunked units is beyond doubt, are extremely rare in terms of paradigmatic relative frequency, i.e. “absolute frequency of occurrence”. In addition, the key elements of these phrases – sleep, dog and boys – are highly versatile lexemes that evoke syntagmatic associations to all kinds of patterns, of which these fixed expressions are clearly not the most prominent ones. This means that neither cotext-free nor cotextual entrenchment is a likely motive for the existence and resilience of these phrases. Nor is the ontological salience of dogs and boys likely to play an important role, as the corresponding concepts are of little salience in the use of the idiomatic expressions. Crucially, however, given the right kind of context, the whole chunks seem to be able to muster a high degree of pragmatic salience, in such a way that specific pragmatic associations can trigger them efficiently and effortlessly. 
This means that high pragmatic salience overrides low relative frequency and results in high cognitive salience in the given situation. Counter-intuitive as it may be if one subscribes to a telic notion of entrenchment rather than the procedural, dynamic conception preferred here, it seems to be reasonable to argue that a third type of entrenchment termed contextual entrenchment interacts with cotext-free and cotextual entrenchment. The same line of reasoning seems very plausible for pragmatically determined routine formulae (how do you do, I beg your pardon), transparent conventional phrases (laughing out loud, give me a break, you’re kidding), grammatical periphery constructions (let alone, not that) as well as grammaticalized multi-word prepositions. The emergence and diachronic development of all these patterns into chunks is supported, if not actually
motivated, by their pragmatic salience, which is mediated by pragmatic associations that are highly routinized and can eventually become part of symbolic associations. As for lexical bundles and collocations, the major difference worked out in Section 4 seems to be that the former are supported by pragmatic salience and resulting contextual entrenchment, while the latter are not. Depending on the degree of pragmatic salience, therefore, lexical bundles can become schematized and conventionalized as chunks and thus acquire the status of “social gestalts” (Feilke 1996). Collocations, on the other hand, do not go beyond the stage of the routinization of syntagmatic associations. As far as codification is concerned, it is probably not unfair to say that traditional descriptions of languages have not done justice to either of the two classes. While the corpus revolution in lexicography has resulted in a much better coverage of collocational associations, lexical bundles remain the Cinderella of lexicographical practice, even though the applied linguists who have been quoted so frequently in the present paper (Pawley and Syder 1983, Nattinger and DeCarrico 1992, Wray 2002, Schmitt 2004) have long recognized the enormous relevance of these pragmatically important chunks.
6. Conclusion
The overall picture that has emerged in this paper suggests that the effects of pragmatic associations on chunking and schematization processes may have been underestimated so far, especially in quantitative approaches which have very much focused on corpus frequencies and different ways of calculating their statistical significance. What I have tried to show is that only a broad dynamic framework that integrates grammatical, semantic, pragmatic, cognitive and sociolinguistic aspects and combines them with quantitative observations on discourse frequencies can do justice to the pragmatic foundations of lexico-grammatical patterns. Such a framework has been suggested and tested with regard to its potential to explain the relation between chunking phenomena, pragmatic associations and discourse frequencies. With regard to the way in which lexico-grammatical patterns are “represented”, it has emerged that knowledge is entrenched in the form of a potentially wide range of associations that can be routinized, schematized and spread in the speech community, i.e. conventionalized, in different ways and to different degrees. Depending on the linguistic and situational context
and on the corresponding ways in which cotext-free (paradigmatic) entrenchment, cotextual (syntagmatic) entrenchment and contextual (pragmatic) entrenchment interact, different associations will come to the fore. The form mind, for example, can trigger a symbolic association connecting the form to a meaning of the lexical schema MIND, which will be connected syntagmatically with other associations activated by the co- and context (he did not seem to mind her presence); it can trigger a symbolic association to the schematized syntagmatic chunk MIND YOU and its frequent functions as a discourse marker; and it can trigger a routinized or even schematized association to the sequence DO YOU MIND IF I ..., particularly in a situation where someone politely asks for permission. While all these associations are to a large extent shared by most speakers of English and thus conventionalized, the degrees of cotext-free, cotextual and contextual entrenchment will presumably differ in the minds of different speakers.
To end on a personal note, in hindsight I have to admit that I love you should indeed not be treated as a case of collocation. While I would still argue that there exists a level of lexical chunking that supersedes the free syntactic composition of this sequence of words, I love you has absorbed too many pragmatic associations to count as a collocation and should instead be regarded as a transparent conventionalized phrase which is routinized and schematized to different degrees in the minds of different speakers of English.
References
Aijmer, Karin 1996 Conversational Routines in English: Convention and Creativity. London/New York: Longman.
Aijmer, Karin 1997 I think – an English modal particle. In Modality in Germanic Languages: Historical and Comparative Perspective, Toril Swan and Olaf Jansen Westvik (eds.), 1–47. Berlin/New York: Mouton de Gruyter.
Aijmer, Karin 2007 The interface between discourse and grammar: The fact is that. In Connectives as Discourse Landmarks, Agnès Celle and Ruth Huart (eds.), 31–46. Amsterdam: John Benjamins.
Aitchison, Jean 2003 Words in the Mind: An Introduction to the Mental Lexicon, 3d ed. Oxford: Blackwell.
Akmajian, Adrian 1984 Sentence types and the form-function fit. Natural Language and Linguistic Theory 2: 1–23.
Arnon, Inbal, and Neal Snider 2010 More than words: Frequency effects for multi-word phrases. Journal of Memory and Language 62: 67–82.
Auer, Peter 2007 Syntax als Prozess. In Gespräch als Prozess: Linguistische Aspekte der Zeitlichkeit verbaler Interaktion, Heiko Hausendorf (ed.), 95–124. Tübingen: Narr.
Auer, Peter, and Frans Hinskens 2005 The role of interpersonal accommodation in a theory of language change. In Dialect Change, Peter Auer, Frans Hinskens and Paul Kerswill (eds.), 335–357. Cambridge: Cambridge University Press.
Barlow, Michael, and Suzanne Kemmer (eds.) 2000 Usage-Based Models of Language. Stanford, CA: CSLI Publications.
Barthes, Roland 1957 Mythologies. Paris: Seuil.
Beckner, Clay, and Joan Bybee 2009 A usage-based account of constituency and reanalysis. In Language as a Complex Adaptive System [= Language Learning 59: Suppl. 1], Nick C. Ellis and Diane Larsen-Freeman (eds.), 27–46. Chichester: Wiley-Blackwell.
Behrens, Heike 2009 Usage-based and emergentist approaches to language acquisition. Linguistics 47 (2): 381–411.
Biber, Douglas 2006 University Language. Amsterdam/Philadelphia: John Benjamins.
Biber, Douglas, Susan Conrad, Geoffrey Leech, Stig Johansson, and Edward Finegan 1999 Longman Grammar of Spoken and Written English. London et al.: Longman.
Blank, Andreas 2001 Pathways of lexicalization. In Language Typology and Language Universals. Vol. II [= Handbücher zur Sprach- und Kulturwissenschaft 20.2], Martin Haspelmath, Ekkehard König, Wulf Oesterreicher and Wolfgang Raible (eds.), 1506–1608. Berlin/New York: Walter de Gruyter.
Blumenthal-Dramé, Alice 2012 Entrenchment in Usage-based Theories: What Corpus Data Do and Do not Reveal about the Mind. Berlin et al.: Walter de Gruyter.
Blythe, Richard A., and William A. Croft 2009 The speech community in evolutionary language dynamics. In Language as a Complex Adaptive System [= Language Learning 59: Suppl. 1], Nick C. Ellis and Diane Larsen-Freeman (eds.), 47–63. Chichester: Wiley-Blackwell.
Brinton, Laurel 2010 Discourse markers. In Historical Pragmatics [= Handbooks of Pragmatics, Vol. 8], Andreas H. Jucker and Irma Taavitsainen (eds.), 285–314. Berlin/New York: De Gruyter Mouton.
Brinton, Laurel J., and Minoji Akimoto (eds.) 1999 Collocational and Idiomatic Aspects of Composite Predicates in the History of English. Amsterdam/Philadelphia: John Benjamins.
Brinton, Laurel J., and Elizabeth Closs Traugott 2005 Lexicalization and Language Change. Cambridge/New York: Cambridge University Press.
Bublitz, Wolfram, and Neal Norrick (eds.) 2010 Foundations of Pragmatics [= Handbooks of Pragmatics, Vol. 1]. Berlin/New York: De Gruyter Mouton.
Burger, Harald 2010 Phraseologie: Eine Einführung am Beispiel des Deutschen, 4. Aufl. Berlin: Erich Schmidt.
Bybee, Joan 1985 Morphology: A Study of the Relation between Meaning and Form. Amsterdam/Philadelphia: John Benjamins.
Bybee, Joan 2001 Phonology and Language Use. Cambridge: Cambridge University Press.
Bybee, Joan 2006 From usage to grammar: The mind’s response to repetition. Language 82: 711–733.
Bybee, Joan 2007 Frequency of Use and the Organization of Language. Oxford: Oxford University Press.
Bybee, Joan 2010 Language, Usage and Cognition. Cambridge/New York: Cambridge University Press.
Bybee, Joan, and Paul Hopper (eds.) 2001 Frequency and the Emergence of Linguistic Structure. Amsterdam/Philadelphia: John Benjamins.
Bybee, Joan, and Joanne Scheibman 1999 The effect of usage on constituency: The reduction of don’t in English. Linguistics 37: 575–596.
Cappelle, Bert 2011 The the… the… construction: Meaning and readings. Journal of Pragmatics 43 (1): 99–114.
Cappelle, Bert, Yury Shtyrov, and Friedemann Pulvermüller 2010 Heating up or cooling up the brain? MEG evidence that phrasal verbs are lexical units. Brain and Language 115 (3): 189–201.
Chiarcos, Christian, Berry Claus, and Michael Grabski 2011 Introduction: Salience in linguistics and beyond. In Salience: Multidisciplinary Perspectives on its Function in Discourse, Christian Chiarcos, Berry Claus, and Michael Grabski (eds.), 1–28. Berlin et al.: de Gruyter.
Church, Kenneth W., and Patrick Hanks 1990 Word association norms, mutual information & lexicography. Computational Linguistics 16: 22–29.
Claridge, Claudia, and Leslie Arnovick 2010 Pragmaticalisation and Discursisation. In Historical Pragmatics [= Handbooks of Pragmatics, Vol. 8], Andreas H. Jucker and Irma Taavitsainen (eds.), 165–192. Berlin/New York: De Gruyter Mouton.
Clark, Herbert 1996 Using Language. Cambridge: Cambridge University Press.
Clear, Jeremy 1993 From Firth principles. Computational tools for the study of collocation. In Text and Technology: In Honour of John Sinclair, Mona Baker, Gill Francis and Elena Tognini-Bonelli (eds.), 271–292. Amsterdam/Philadelphia: John Benjamins.
Collins, Allan M., and Elisabeth F. Loftus 1975 A spreading-activation theory of semantic processing. Psychological Review 82: 407–428.
Conklin, Kathy, and Norbert Schmitt 2012 The processing of formulaic language. Annual Review of Applied Linguistics 32: 45–61.
Coseriu, Eugenio 1967 Teoría del lenguaje y lingüística general: Cinco estudios, 2d ed. Madrid: Gredos.
Coulmas, Florian 1981 Routine im Gespräch: Zur pragmatischen Fundierung der Idiomatik. Wiesbaden/Düsseldorf: Athenaion.
Cowie, Anthony P. 1988 Stable and creative aspects of vocabulary use. In Vocabulary and Language Teaching, Ronald Carter and Michael McCarthy (eds.), 126–139. London/New York: Longman.
Croft, William 2000 Explaining Language Change: An Evolutionary Approach. Harlow/New York: Longman.
Croft, William 2009 Toward a social cognitive linguistics. In New Directions in Cognitive Linguistics, Vyvyan Evans and Stephanie Pourcel (eds.), 395–420. Amsterdam/Philadelphia: John Benjamins.
Delahunty, Gerald 2006 A relevance theoretic analysis of not that sentences: “Not that there is anything wrong with that”. Pragmatics 16 (2/3): 213–245.
Delahunty, Gerald 2011 Contextually determined fixity and flexibility in ‘thing’ sentence matrixes. Yearbook of Phraseology 2: 109–135.
Dell, Gary S. 1986 A spreading-activation theory of retrieval in sentence production. Psychological Review 93: 283–321.
Detges, Ulrich, and Richard Waltereit 2002 Grammaticalization vs. reanalysis: a semantic-pragmatic account of functional change in grammar. Zeitschrift für Sprachwissenschaft 21 (2): 151–195.
Dobrovol’skij, Dmitrij 1995 Kognitive Aspekte der Idiom-Semantik: Studien zum Thesaurus deutscher Idiome. Tübingen: Stauffenberg.
Eckert, Penelope 2000 Linguistic Variation as Social Practice: The Linguistic Construction of Identity in Belten High. Oxford: Blackwell.
Ellis, Nick C., and Diane Larsen-Freeman (eds.) 2009 Language as a Complex Adaptive System [= Language Learning 59: Suppl. 1]. Chichester: Wiley-Blackwell.
Enfield, Nick 2005 Micro- and macro-dimensions in linguistic systems. In Reviewing Linguistic Thought: Converging Trends for the 21st Century, Sophia Marmaridou, Kiki Nikiforidou and Eleni Antonopoulou (eds.), 313–325. Berlin: Mouton de Gruyter.
Enfield, Nick 2008 Transmission biases in linguistic epidemiology. Journal of Language Contact 2: 299–310.
Erman, Britt, and Beatrice Warren 2000 The idiom principle and the open choice principle. Text 20: 29–62.
Evert, Stefan, and Brigitte Krenn 2001 Methods for the qualitative evaluation of lexical association measures. In Proceedings of the 39th Annual Meeting of the Association for Computational Linguistics, Toulouse, France: 188–195.
Evans, Vyvyan, and Melanie Green 2006 Cognitive Linguistics: An Introduction. Edinburgh: Edinburgh University Press.
Feilke, Helmuth 1996 Sprache als soziale Gestalt: Ausdruck, Prägung und die Ordnung der sprachlichen Typik. Frankfurt/Main: Suhrkamp.
Filatkina, Natalia 2007 Pragmatische Beschreibungsansätze. In Phraseologie: Ein internationales Handbuch zeitgenössischer Forschung/Phraseology: An International Handbook of Contemporary Research, Halbband 1/Vol. 1, Harald Burger, Dmitrij Dobrovol’skij, Peter Kühn and Neal Norrick (eds.), 132–158. Berlin/New York: Mouton de Gruyter.
Fillmore, Charles J., Paul Kay, and Mary C. O’Connor 1988 Regularity and idiomaticity in grammatical constructions: The case of let alone. Language 64: 501–538.
Geeraerts, Dirk 2003 Cultural models of linguistic standardization. In Cognitive Models in Language and Thought: Ideology, Metaphors and Meaning, René Dirven, R. Frank and Martin Pütz (eds.), 25–68. Berlin/New York: Mouton de Gruyter.
Geeraerts, Dirk 2006 Words and Other Wonders: Papers on Lexical and Semantic Topics. Berlin/New York: Walter de Gruyter.
Geeraerts, Dirk, Gitte Kristiansen, and Yves Peirsman (eds.) 2010 Advances in Cognitive Sociolinguistics. Berlin/New York: Mouton de Gruyter.
Geeraerts, Dirk, Stef Grondelaers, and Peter Bakema 1994 The Structure of Lexical Variation: A Descriptive Framework for Cognitive Lexicology. Berlin et al.: Mouton de Gruyter.
Giles, Howard, Nikolas Coupland, and Justine Coupland 1991 Accommodation theory: Communication, context and consequence. In Contexts of Accommodation: Developments in Applied Sociolinguistics, Howard Giles, Nikolas Coupland and Justine Coupland (eds.), 1–68. Cambridge: Cambridge University Press.
Giles, Howard, and Tania Ogay 2006 Communication accommodation theory. In Explaining Communication: Contemporary Theories and Exemplars, Brian B. Whaley and Wendy Samter (eds.), 293–310. Mahwah, NJ: Erlbaum.
Giora, Rachel 2003 On Our Mind: Salience, Context, and Figurative Language. New York: Oxford University Press.
Giora, Rachel 2012 Happy new war: The role of salient meanings and salience-based interpretations in processing utterances. In Cognitive Pragmatics [= Handbooks of Pragmatics, Vol. 4], Hans-Jörg Schmid (ed.), 233–260. Berlin/Boston: Walter de Gruyter.
Gläser, Rosemarie 1986 Phraseologie der englischen Sprache. Leipzig: Verlag Enzyklopädie.
Goldberg, Adele 1995 Constructions: A Construction Grammar Approach to Argument Structure. Chicago, IL: Chicago University Press.
Goldberg, Adele 2006 Constructions at Work: The Nature of Generalization in Language. Oxford: Oxford University Press.
Goldberg, Adele 2009 The nature of generalizations in language. Cognitive Linguistics 20 (2): 93–127.
Gramley, Stephan, and Kurt-Michael Pätzold 2004 A Survey of Modern English, 2d ed. London/New York: Routledge.
Granger, Sylviane, and Magali Paquot 2008 Disentangling the phraseological web. In Phraseology in Foreign Language Learning and Teaching, Fanny Meunier and Sylviane Granger (eds.), 27–49. Amsterdam: John Benjamins.
Greenbaum, Sidney 1970 Verb-Intensifier Collocations in English: An Experimental Approach. The Hague: Mouton.
Grice, Herbert P. 1975 Logic and conversation. In Syntax and Semantics, Vol. 3: Speech Acts, Peter Cole and Jerry L. Morgan (eds.), 41–58. New York: Academic Press.
Grzybek, Peter 2007 Semiotik und Phraseologie. In Phraseologie: Ein internationales Handbuch zeitgenössischer Forschung/Phraseology: An International Handbook of Contemporary Research, Halbband 1/Vol. 1, Harald Burger, Dmitrij Dobrovol’skij, Peter Kühn and Neal Norrick (eds.), 188–208. Berlin/New York: de Gruyter.
Günthner, Susanne 2011 Between emergence and sedimentation: Projecting constructions in German interactions. In Constructions: Emerging and Emergent, Peter Auer and Stefan Pfänder (eds.), 165–185. Berlin/Boston: De Gruyter.
Harder, Peter 2010 Meaning in Mind and Society: A Functional Contribution to the Social Turn in Cognitive Linguistics. Berlin/New York: De Gruyter Mouton.
Haspelmath, Martin 1999 Why is grammaticalization irreversible? Linguistics 37 (6): 1043–1086.
Haspelmath, Martin 2002 Grammatikalisierung: Von der Performanz zur Kompetenz ohne angeborene Grammatik. In Gibt es eine Sprache hinter dem Sprechen? Sybille Krämer and Ekkehard König (eds.), 262–286. Frankfurt/Main: Suhrkamp.
Hawkins, John A. 2004 Efficiency and Complexity in Grammars. Oxford: Oxford University Press.
Herbst, Thomas 1996 What are collocations: Sandy beaches or false teeth? English Studies 4: 379–393.
Herbst, Thomas 2010 English Linguistics: A Coursebook for Students of English. Berlin/New York: De Gruyter Mouton.
Herbst, Thomas, and Susen Schüller 2008 Introduction to Syntactic Analysis: A Valency Approach. Tübingen: Narr.
Hoey, Michael 2005 Lexical Priming: A New Theory of Words and Language. London/New York: Routledge.
Hoffmann, Sebastian 2005 Grammaticalization and English Complex Prepositions: A Corpus-Based Study. New York: Routledge.
Holmes, Janet 2008 An Introduction to Sociolinguistics, 3d ed. London et al.: Pearson Longman.
Hopper, Paul 1987 Emergent grammar. Berkeley Linguistics Society 13: 139–157.
Hunston, Susan, and Gill Francis 2000 Pattern Grammar: A Corpus-Driven Approach to the Lexical Grammar of English. Amsterdam/Philadelphia: John Benjamins.
Imo, Wolfgang 2011 Online changes in syntactic gestalts in spoken German. Or: do garden path sentences exist in everyday conversation? In Constructions: Emerging and Emergent, Peter Auer and Stefan Pfänder (eds.), 127–155. Berlin/Boston: De Gruyter.
Johanson, Lars 2008 Remodelling grammar: Copying, conventionalization, grammaticalization. In Language Contact and Contact Languages, Peter Siemund and Noemi Kintana (eds.), 61–79. Amsterdam/Philadelphia: John Benjamins.
Jones, Susan, and John M. Sinclair 1974 English lexical collocations. Cahiers de Lexicologie 24: 15–61.
Kay, Paul 2004 Pragmatic aspects of grammatical constructions. In Handbook of Pragmatics, Larry Horn and Gregory Ward (eds.), 675–700. Oxford: Blackwell.
Kay, Paul, and Charles J. Fillmore 1999 Grammatical constructions and linguistic generalizations: The What’s X doing Y? construction. Language 75: 1–33.
Kjellmer, Göran 1982 Some problems relating to the study of collocations in the Brown Corpus. In Computer Corpora in English Language Research 1975–1981, Stig Johansson (ed.), 25–33. Lund/Bergen: Norwegian Computing Centre for the Humanities.
Kristiansen, Gitte 2008 Style-shifting and shifting styles: A socio-cognitive approach to lectal variation. In Cognitive Sociolinguistics: Language Variation, Cultural Models, Social Systems, Gitte Kristiansen and René Dirven (eds.), 45–88. Berlin: Mouton de Gruyter.
Krug, Manfred 2000 Emerging English Modals: A Corpus-Based Study of Grammaticalization. Berlin/New York: Mouton de Gruyter.
Lambrecht, Knud 1990 What, me worry? Mad Magazine sentences revisited. Berkeley Linguistics Society 16: 215–228.
Langacker, Ronald W. 1987 Foundations of Cognitive Grammar, Vol. I: Theoretical Prerequisites. Stanford, CA: Stanford University Press.
Langacker, Ronald W. 1988 A usage-based model. In Topics in Cognitive Linguistics, Brygida Rudzka-Ostyn (ed.), 127–161. Amsterdam/Philadelphia: Benjamins.
Langacker, Ronald W. 1991 Foundations of Cognitive Grammar, Vol. II: Descriptive Application. Stanford, CA: Stanford University Press.
Langacker, Ronald W. 2008 Cognitive Grammar: A Basic Introduction. Oxford/New York: Oxford University Press.
Lenneberg, Eric H. 1967 Biological Foundations of Language. Chichester: Wiley.
Levinson, Stephen C. 1983 Pragmatics. Cambridge et al.: Cambridge University Press.
Lewis, Michael 1993 The Lexical Approach. Hove: Teacher Training Publications.
Manning, Chris, and Hinrich Schütze 2001 Foundations of Statistical Natural Language Processing. Cambridge, MA: MIT Press.
MacWhinney, Brian 1998 Models of the emergence of language. Annual Review of Psychology 49: 199–227.
MacWhinney, Brian (ed.) 1999 The Emergence of Grammar. Mahwah, NJ: Lawrence Erlbaum.
MacWhinney, Brian 2012 The logic of the unified model. In The Routledge Handbook of Second Language Acquisition, Susan M. Gass and Alison Mackey (eds.), 211–227. New York: Routledge.
Mel’cuk, Igor 1995 Phrasemes in language and phraseology in linguistics. In Idioms: Structural and Psychological Perspectives, Martin Everaert, Erik-Jan von der Linden, André Schenk and Rob Schreuder (eds.), 167–232. Hillsdale, NJ: Lawrence Erlbaum Associates.
Mieder, Wolfgang 2007 Proverbs as cultural units or items of folklore. In Phraseologie: Ein internationales Handbuch zeitgenössischer Forschung/Phraseology: An International Handbook of Contemporary Research, Halbband 1/Vol. 1, Harald Burger, Dmitrij Dobrovol’skij, Peter Kühn and Neal Norrick (eds.), 394–413. Berlin/New York: de Gruyter.
Moon, Rosamund 1998 Fixed Expressions and Idioms in English: A Corpus-Based Approach. Oxford/New York: Oxford University Press.
Nattinger, James R., and Jeanette S. DeCarrico 1992 Lexical Phrases and Language Teaching. Oxford/New York: Oxford University Press.
Norrick, Neal 2007 Proverbs as set phrases. In Phraseologie: Ein internationales Handbuch zeitgenössischer Forschung/Phraseology: An International Handbook of Contemporary Research, Halbband 1/Vol. 1, Harald Burger, Dmitrij Dobrovol’skij, Peter Kühn and Neal Norrick (eds.), 381–393. Berlin/New York: de Gruyter.
Pawley, Andrew, and Francis H. Syder 1983 Two puzzles for linguistic theory: Nativelike selection and nativelike fluency. In Language and Communication, Jack C. Richards and Richard W. Schmidt (eds.), 191–226. New York: Longman.
Pickering, Martin J., and Simon C. Garrod 2004 Toward a mechanistic psychology of dialogue. Behavioral and Brain Sciences 27: 169–226.
Pickering, Martin J., and Simon C. Garrod 2005 Establishing and using routines during dialogue: Implications for psychology and linguistics. In Twenty-First Century Psycholinguistics: Four Cornerstones, Ann Cutler (ed.), 85–102. Hillsdale, NJ: Lawrence Erlbaum.
Pierrehumbert, Janet B. 2001 Exemplar dynamics: Word frequency, lenition and contrast. In Frequency and the Emergence of Linguistic Structure, Joan Bybee and Paul Hopper (eds.), 516–530. Amsterdam/Philadelphia: John Benjamins.
Pierrehumbert, Janet B. 2006 The next toolkit. Journal of Phonetics 34: 516–530.
Quirk, Randolph, Sidney Greenbaum, Geoffrey Leech, and Jan Svartvik 1985 A Comprehensive Grammar of the English Language. London: Longman.
Read, John, and Paul Nation 2004 Measurement of formulaic sequences. In Formulaic Sequences: Acquisition, Processing, and Use, Norbert Schmitt (ed.), 23–36. Amsterdam/Philadelphia: John Benjamins.
Saussure, Ferdinand de 1916 Cours de linguistique générale. Paris: Payot.
Schiffrin, Deborah 1987 Discourse Markers. Cambridge: Cambridge University Press.
Schmid, Hans-Jörg 2001 ‘Presupposition can be a bluff’: How abstract nouns can be used as presupposition triggers. Journal of Pragmatics 33: 1529–1552.
Schmid, Hans-Jörg 2003 Collocation: Hard to pin down, but bloody useful. Zeitschrift für Anglistik und Amerikanistik 51 (3): 235–258.
Schmid, Hans-Jörg 2007 Entrenchment, salience and basic levels. In The Oxford Handbook of Cognitive Linguistics, Dirk Geeraerts and Hubert Cuyckens (eds.), 117–138. Oxford: Oxford University Press.
Schmid, Hans-Jörg 2010 Does frequency in text really instantiate entrenchment in the cognitive system? In Quantitative Methods in Cognitive Semantics: Corpus-Driven Approaches, Dylan Glynn and Kerstin Fischer (eds.), 101–133. Berlin et al.: Walter de Gruyter.
Schmid, Hans-Jörg 2011 Tracing paths of conventionalization from the Bible to the BNC: A concise corpus-based history of the not that construction. In More than Words: English Lexicography and Lexicology Past and Present: Essays Presented to Hans Sauer on the Occasion of his 65th Birthday, Part I, Renate Bauer and Ulrike Krischke (eds.), 299–316. Frankfurt et al.: Peter Lang.
Schmid, Hans-Jörg 2013 Is usage more than usage after all? The case of English not that. Linguistics 51 (1): 75–116.
Schmid, Hans-Jörg, and Helmut Küchenhoff 2013 Collostructional analysis and other ways of measuring lexicogrammatical attraction: Theoretical premises, practical problems and cognitive underpinnings. Cognitive Linguistics 24 (3): 531–577.
Schmitt, Norbert 2000 Vocabulary in Language Teaching. Cambridge/New York: Cambridge University Press.
Schmitt, Norbert (ed.) 2004 Formulaic Sequences: Acquisition, Processing, and Use. Amsterdam/Philadelphia: John Benjamins.
Schmitt, Norbert, and Geoffrey Underwood 2004 Exploring the processing of formulaic sequences through a self-paced reading task. In Formulaic Sequences: Acquisition, Processing, and Use, Norbert Schmitt (ed.), 173–189. Amsterdam/Philadelphia: John Benjamins.
Schmitt, Norbert, Sarah Grandage, and Svenja Adolphs 2004 Are corpus-derived recurrent clusters psycholinguistically valid? In Formulaic Sequences: Acquisition, Processing, and Use, Norbert Schmitt (ed.), 127–152. Amsterdam/Philadelphia: John Benjamins.
Schwenter, Scott A., and Elisabeth Closs Traugott 1995 The semantic and pragmatic development of substitutive complex prepositions in English. In Historical Pragmatics, Andreas H. Jucker (ed.), 243–273. Amsterdam: John Benjamins.
Searle, John R. 1975 Indirect speech acts. In Syntax and Semantics, Vol. 3, Peter Cole and Jerry L. Morgan (eds.), 59–82. New York: Academic Press.
Silverstein, Michael 1976 Hierarchy of features and ergativity. In Grammatical Categories in Australian Languages, Robert M. W. Dixon (ed.), 112–171. Canberra: Australian Institute of Aboriginal Studies.
Sinclair, John McH. 1991 Corpus, Concordance, Collocation. Oxford: Oxford University Press.
Sinclair, John McH. 1996 The search for units of meaning. Textus 9: 75–106.
Sinclair, John McH., and Anna Mauranen 2006 Linear Unit Grammar. Amsterdam: John Benjamins.
Sinha, Chris, and Cintia Rodríguez 2008 Language and the signifying object: From convention to imagination. In The Shared Mind: Perspectives on Intersubjectivity, Jordan Zlatev, Timothy P. Racine, Chris Sinha and Esa Itkonen (eds.), 357–378. Amsterdam/Philadelphia: John Benjamins.
Smith, Eliot R., and Diane M. Mackie 2000 Social Psychology. Philadelphia: Taylor & Francis.
Stefanowitsch, Anatol, and Stefan Th. Gries 2003 Collostructions: Investigating the interaction of words and constructions. International Journal of Corpus Linguistics 8 (2): 209–243.
Stein, Stephan 1995 Formelhafte Sprache: Untersuchungen zu ihren pragmatischen und kognitiven Funktionen im Gegenwartsdeutschen. Frankfurt/Main: Peter Lang.
Strässler, Jürgen 1982 Idioms in English: A Pragmatic Analysis. Tübingen: Narr.
Stubbs, Michael 1995 Collocations and semantic profiles: On the cause of the trouble with quantitative studies. Functions of Language 2: 23–55.
Svensson, Maria Helena 2008 A very complex criterion of fixedness: Non-compositionality. In Phraseology: An Interdisciplinary Perspective, Sylviane Granger and Fanny Meunier (eds.), 81–93. Amsterdam/Philadelphia: John Benjamins.
Terkourafi, Marina 2011 The pragmatic variable: Toward a procedural interpretation. Language in Society 40: 343–372.
Tomasello, Michael 2000 Do young children have adult syntactic competence? Cognition 74: 209–253.
Tomasello, Michael 2003 Constructing a Language: A Usage-Based Theory of Language Acquisition. Cambridge, MA/London: Harvard University Press.
Tomasello, Michael, and Hannes Rakoczy 2003 What makes human cognition unique? From individual to shared to collective intentionality. Mind and Language 18 (2): 121–147.
The Five Graces Group [= Clay Beckner, Richard Blythe, Joan Bybee, Morten H. Christiansen, William Croft, Nick C. Ellis, John Holland, Jinyun Ke, Diane Larsen-Freeman, Tim Schoenemann] 2009 Language is a complex adaptive system: Position paper. In Language as a Complex Adaptive System [= Language Learning 59: Suppl. 1], Nick C. Ellis and Diane Larsen-Freeman (eds.), 1–26. Chichester: Wiley-Blackwell.
Traugott, Elisabeth Closs 1999 A historical overview of complex predicate types. In Collocational and Idiomatic Aspects of Composite Predicates in the History of English, Laurel J. Brinton and Minoji Akimoto (eds.), 239–260. Amsterdam/Philadelphia: John Benjamins.
Traugott, Elisabeth Closs, and Richard B. Dasher 2004 Regularity in Semantic Change. Cambridge: Cambridge University Press.
Tremblay, Antoine, Bruce Derwing, and Gary Libben 2009 Are lexical bundles stored and processed as single units? Working Papers of the Linguistics Circle of the University of Victoria 19: 258–279.
Trudgill, Peter 1986 Dialects in Contact. Oxford: Blackwell.
Wray, Alison 2002 Formulaic Language and the Lexicon. Cambridge/New York: Cambridge University Press.
Wray, Alison 2008 Formulaic Language: Pushing the Boundaries. Oxford: Oxford University Press.
Wray, Alison 2012 What do we (think we) know about formulaic language? An evaluation of the current state of play. Annual Review of Applied Linguistics 32: 231–254.
Index Aarts 178–180 Abbot-Smith 12, 23 abstraction 9–10, 14, 17, 21, 23, 25, 39, 196, 205, 250 acquisition 4–6, 25, 39, 42, 47, 50, 52, 56, 72, 75–79, 183, 205, 242–243, 251, 254 Adolphs 269 affix 39, 41, 43 Ágel 168–169, 171, 176 Agent 15–18, 23, 102–103, 127, 173–174, 189–190, 196, 219 Aguado-Orea 18–19 Aijmer 239, 254, 258–260, 263–264, 271, 273 Aitchison 243, 248 Akimoto 264 Aksu-Koc 40 Allan 78 Allerton 169 allostruction 194, 196–198, 206 Ambridge 9, 18–19, 23–24, 200, 205 Anderson 73 argument-structure construction 4–5, 7, 15, 167, 169, 196–207, 220, 230, 264 Arnovick 263 Aslin 33, 73 association 2, 6, 72, 77–79, 84, 87, 126, 217–223, 226–227, 231, 239–241, 243–254, 259–271, 273–280
Atkins 110–111, 123, 134, 195
attraction 201, 205, 222, 233, 235–236, 258, 263
Auer 245–246, 254
Augustine 33
Baayen 75
Bakema 276
Baker 122
Baldwin 46
Bannard 12–13, 46, 58, 60, 75
Barlow 242
Barnbrook 107, 110
Barnes 11
Barthes 249
Bartlett 73
Bates 17, 50
Beckner 76, 90, 251, 265
Bednarek 107
Behrens 23, 242, 250, 254
Bellugi 46
Beltz 75
Bever 22
Biber 2, 257, 269, 271, 273
Blanchard 33
Blank 247
Bloom 33, 35–36, 52
Blumenthal-Dramé 243
Blythe 242, 246
BNC 80–81, 84, 86, 167, 171–172, 174, 176, 178, 180, 182–189, 191, 193–194, 196–199, 201–203, 219, 226, 228, 232, 234–235, 239–240, 249, 255, 273
Boas 163, 167, 182, 201, 206
Bock 23
Bod 57, 73
Bookheimer 54
Borensztajn 57
Borschev 140
Borsley 169
Bowerman 19, 47, 53
Boyd 76
Braine 35–36, 40, 52, 59
Brandt 11, 21–22
Branigan 35
Brazil 103
Breindl 171
Bresnan 36, 56, 179, 198
Brinton 247, 263–265
Briscoe 80
Broccias 112, 114
Brooks 49
Brown 41, 46
Bublitz 252
Bühler 168, 204
bundle see lexical bundle
Burger 249, 268
Burling 40
Bybee 3, 23, 72–76, 167, 200–201, 242, 244, 250–251, 253–254, 258, 265, 275
Cadierno 73
Cameron-Faulkner 11–12
Cappelle 196, 250, 256
Carpenter 11
Carroll 80
Caselli 19
Casenhiser 76, 200
categorization 9, 13, 73–76, 110
category 5, 10–15, 17–18, 25, 37, 49, 51, 53, 71, 73–79, 85, 89, 105, 125, 149, 154, 169, 172, 175, 178–179, 181, 186, 190, 192–195, 197, 259–260, 262, 269, 274
causative 6, 53, 152, 217–236
Cause 108–109, 192, 195, 198, 220, 224–226, 229–236
Cazden 46
CDS see child-directed speech
CEC see control ersatz construction
Chan 17–18, 23
Chang 23
Chater 73
Chemla 12
Chiarcos 277
child-directed speech 10–13, 16–18, 25
CHILDES 33–34, 56–58
Chomsky, C. 42
Chomsky, N. 35–36, 40–41, 59
Christiansen 33, 73
chunk 3, 10, 223, 239, 249–251, 254, 259, 262–266, 269–270, 274–280
chunking 6, 241, 251, 263–266, 274–275, 278–280
Church 217, 255
Clahsen 55
Clancy 40
Claridge 263
Clark, E. 15, 72, 77–78
Clark, H. 246
Claus 277
Clear 255
co-adaptation 245–247, 251–252
COBUILD 79–81, 99–101
collexeme 205, 218–236
Collins 243
collocation 1–4, 6, 75, 79, 99–100, 116, 134–136, 163, 207, 217–218, 239, 241, 250, 253, 256–258, 264, 269, 273–276, 279–280
collostructional analysis 6, 79, 185, 217–221, 225, 231, 235–237, 258, 263, 275
complex-adaptive system 90, 242, 247
concordance 101, 106, 217
Conklin 269
connotation 253, 268
Conrad 2, 257, 269, 271
conservatism 5, 48–51, 55
contingency 73, 78–79, 84–85, 89, 221, 227, 231, 234
control ersatz construction 86–89
conventional 240, 255, 259, 261–263, 268–269, 274, 278
conventionalization 242–243, 246–247, 251–252, 263, 266
conventionalized 1, 3, 72, 227, 246, 249, 251, 261–267, 273, 279–280
Cooreman 55
corpus 2–6, 10–11, 13–14, 17, 25, 33, 46, 49, 52, 57–58, 60, 79–80, 84, 86, 101, 106, 115–116, 123, 125, 127, 131, 133–135, 153, 162–163, 187, 203–204, 217, 221–223, 227, 234–237, 239, 269, 276, 279
Corrêa 21
Coseriu 206, 246
cotext 273, 276, 278, 280
cotextual 248, 252, 276, 278, 280
Cottrell 23
Coulmas 254, 259, 261–262, 267
Coupland 245
Cowie 1–2, 260
Crain 51
Croft 3, 114, 167, 183, 190, 192, 199, 206, 242, 246, 278
Cruse 3, 125, 183, 192
Culicover 42
Dąbrowska 13, 46
Dasher 242, 262, 265
Davies 134
de Ruiter 170, 175
DeCarrico 250, 252, 254, 258–260, 266, 271, 273–274, 279
DeKeyser 77
Delahunty 257, 264, 266
Dell 23, 243
Derwing 270
Detges 239, 245, 265
dictionary 2, 99–101, 107, 121, 123, 169, 187–188, 199, 205, 247, 264, 276
Diessel 21–22
diffusion 54, 246–247, 265
Dirven 179
Dittmar 16–18, 23
Dobrovol’skij 257
Donald 40
Ebbinghaus 73
EC-model 242–252, 254, 262–263, 265–266, 270, 276
Eckert 242
Edelman 58
Eisenberg 169, 182
Eisengart 16
Ellis 5, 71–73, 75–79, 82, 84, 89–90, 245
Ellsworth 122, 195
Elman 23
Emons 169, 193
Emotion 108–109
Enfield 246
Engel 181
entrenchment 10, 19, 22–23, 55, 72–73, 75, 233, 236, 242–243, 245–247, 251–252, 276–280
entropy 83, 86–89
Erk 122
Erlekens 12
Erman 254
error 5, 9–10, 16, 18–21, 24–25, 46–48, 53–54, 56, 205
Evans 105, 243
Evert 75, 276
exemplar 3, 22, 73–74, 76–78, 107, 242, 253
Experiencer 108–109, 140
Farrar 11
Faulhaber 6, 122, 167, 173, 199, 206
FBP see feature-based pattern
FE see frame element
feature-based pattern 5, 42, 51–55, 57, 59
Feilke 253, 260, 279
Feldman 38
Felser 55
Filatkina 259
Fillmore 3, 5, 109–111, 121, 123–124, 135, 139, 149, 163, 167, 169, 172–173, 177, 181, 183, 186, 189, 191, 195, 201, 242, 248, 256–257, 260, 266
Finegan 2, 258, 269, 271
Firth 2–3
Fischer 4
Fisher 16, 221, 227, 231
Fitch 35
fixed expression 2, 10, 22, 161, 163, 171, 187, 239, 246, 250, 254–255, 257–260, 263, 266, 269, 277–278
Fletcher 11
Flores d’Arcais 40
FN see FrameNet
Fodor 51
Ford 198
formulaic language 254
Fox 22
frame 5–6, 10–13, 19, 38–39, 47, 49, 52–53, 100, 110–112, 121–138, 141, 144, 152–153, 156, 173, 182, 190, 233, 262
FrameNet 5, 109–110, 121–128, 130, 133–134, 137–139, 141, 143–144, 149, 151–152, 155, 158, 162, 164, 173, 181, 190–191, 194–195, 205
Francis 79–80, 99–100, 103–104, 110–111, 113, 188–189, 202, 270
Frazier 40
frequency 5–6, 10–11, 17–22, 24–25, 71–76, 78–79, 82–83, 86–87, 89, 100–101, 114–116, 134, 187, 201, 221–222, 226–227, 231, 234–236, 239–241, 245, 255, 258–259, 269–270, 273–279
Freudenthal 11, 19–20
Friederici 54
Gabel-Cunningham 6
Garrod 245–246
Ge 34, 57
Geeraerts 242, 275–276
generalization 4, 9, 15, 19, 21, 25, 38, 48, 50, 52, 114, 126, 128, 148, 169, 185, 187, 190, 196, 200–202, 204–207, 241, 248, 265
Gertner 16
Giles 245
Gillis 20
Giora 277–278
Gläser 253–254, 262, 267–268
Gleitman 33
Gobet 20
Goldberg 3, 5, 50, 56, 72, 76, 80, 112–113, 115, 152, 167, 173, 179, 186, 190–194, 196–201, 205–207, 220, 223–225, 242, 248, 264
Golinkoff 23, 33
Gómez 75
Goodluck 22
Goodman 50
Götz-Votteler 191
Grabski 277
Gramley 260, 267
grammaticalization 257, 263, 265, 275
Grandage 269
Granger 254
Green 243
Greenbaum 194, 250, 257
Grice 261
Gries 78–79, 101, 185, 198, 205, 217–218, 220, 222, 224, 226, 231–234, 258
Grondelaers 276
Grünloh 17
Grzybek 249, 255, 259, 268
Guilfoyle 22
Günthner 254, 264
Halliday 167, 172, 181
Hanks 107, 162, 217, 255
Harder 242
Harnad 73
Harrington 22
Harris 58
Haspelmath 242, 244
Hauser 35, 40
Hausmann 1
Hausser 56
Hawkins 242
Hay 73
Heinz 33
Helbig 169–170, 172
Herbst 5–6, 102, 121, 123, 167, 169, 171–173, 176–177, 181–182, 186, 188, 192, 195–199, 202, 204–206, 239, 250, 258, 264
Heringer 171, 176
Hilpert 217
Hinskens 245–246
Hirsh-Pasek 23
Hoey 250–251
Hoff-Ginsberg 11
Hoffmann, L. 169, 171
Hoffmann, S. 257, 259, 265, 275, 277
Hoffmann, Th. 4
holistic 249, 251, 259, 265, 269–270
Holmes 247
Hopper 73–74, 242
Hornby 2
Huddleston 178
Hudson 56, 204–205
Hufnagle 9
Hunston 6, 79–80, 99–101, 104, 107, 110–111, 113–115, 188–189, 202, 223, 270
Hunt 73
Huttenlocher 11
Hyams 59
Ibbotson 18, 200
IBP see item-based pattern
idiom 2–3, 72, 75, 161, 207, 224, 241, 253–257, 259–260, 264–269, 275–276
Imo 254
item-based pattern 5, 22, 33, 36, 60
Jackendoff 80, 167, 194, 196, 199
Jacobs 168, 171, 199, 204
Jannedy 73
Johansson 2, 257, 269, 271
Jones 255
Jurafsky 73, 84
Kay 3, 139, 152, 163, 167, 183, 197, 201, 224, 242, 248, 256–257, 260, 266
Keele 74
Keenan 34
Kello 75–76
Kelly 72
Kemmer 242
Kempe 17
Kempen 20, 56
Kidd 11, 22
Kilborn 55
Kilgarriff 134
Kirjavainen 20
Kjellmer 255
Klima 46
Köhler 75–76
Kolhatkar 85
Krajewski 18
Krenn 276
Kristiansen 242
Krug 258
Kubczak 170, 175
Küchenhoff 84, 276
Küntay 39
Kwiatkowski 57
Labov 47
Lakoff 73
Lambrecht 266
Langacker 38, 56, 114, 167, 179, 190, 192, 206, 242–244, 250, 277
language learning 1–5, 9–10, 12–13, 20–21, 23, 33–34, 42, 50, 52, 55, 57–59, 72, 161, 189, 193, 200
Larsen-Freeman 76, 79, 245
Lasnik 59
Lavie 56
Lee 53
Leech 2, 194, 257, 269, 271
Lee-Goldman 121, 149, 152, 163, 195
lemma 80, 85, 113–114, 126, 128
Lenneberg 243
Leonard 36
Levin 197–198
Levinson 252
Lewis 254
lexical bundle 2, 241, 257–259, 269–275, 279
lexical unit 122, 125–129, 131–134, 137–138, 141, 150–151, 161–162, 170, 174–175, 180, 182, 184, 187, 191, 204–205
lexicalization 263–265
lexicalized 130–131, 146, 258–259, 263–264, 271
lexicogrammatical 2–3, 6, 167, 203–204, 218, 239, 241, 247–248, 254–256, 258–261, 268–270, 274–276
lexicography 1–2, 4–6, 101–102, 125, 127, 138, 141, 161–162, 182, 204, 223, 279
lexis 72, 100, 103, 112–113, 115, 167
Li 54, 58
Libben 270
Lieven 5–6, 9–18, 20, 22–23, 46–47, 58, 60, 75–76, 200, 205
Lightfoot 42
Lindstromberg 105
Loftus 243
LU see lexical unit
Mackie 243–244, 277
MacWhinney 3, 5, 17, 33–40, 42–44, 48–56, 58, 60, 72, 78, 183, 200, 204, 242–243
Manin 75
Manning 80, 99–100, 104, 111, 188–189, 255
Martin 73
Matthews 169, 171
Mauranen 103, 249
Mayhew 34
Maynard 84
McDonald 55
McNeill 43
memory 10, 34, 48, 57, 59, 73, 76, 246, 253
Menyuk 46
Mervis 77
Michaelis 163
Michelizzi 85, 87
Mieder 267
Miller 85
Mintz 12–13, 73
Monaghan 33
Moon 254, 260, 267–268, 276
Mooney 34, 57
Morris 23
Müller 169, 178–179
Nagell 11
Naigles 11
Nation 260
Nattinger 250, 252, 254, 258–260, 266, 271, 273–274, 279
Nelson 46
network 5, 9, 11, 13, 19, 21–25, 54, 57–58, 72, 75–76, 79, 85, 200, 243, 250, 277
Newman 76, 114
Newmeyer 179, 195
Newport 33
Ninio 75–77
Noble 16, 23, 200
Norrick 252, 267
Ogay 245
Onnis 58, 75
ontological 277–278
optional complement 171, 180–181, 184
optionality 170, 173, 176, 181, 184–185
overgeneralisation 19, 23–24, 47–51, 53, 200, 205
Pacton 73
Page 90
Palmer 1–2
Paquot 254
paradigmatic 248, 251–252, 255, 275–278, 280
Partee 140
participant 16, 42, 123, 136, 170, 173–174, 176, 179, 188–191, 197–198
participant role 170, 181–182, 189–192, 197–200
passive 18, 23–24, 53, 124, 172, 175–177, 181, 184–185, 188, 196, 219–220, 229, 264
pattern flow 103, 270
Patwardhan 85, 87
Pätzold 260, 267
Pawley 258–260, 264, 271, 279
Pedersen 85, 87, 89
Peirsman 242
perception 73, 75, 77, 229–230, 251
Perfors 49, 59
Perruchet 73
Pickering 245–246
Pienemann 55
Pierrehumbert 242, 253
Pine 15–16, 19–20, 23–24, 46–47
Pinker 33, 36, 77
pivot schema 35
Pizzuto 19
Poeppel 59
Polguère 56
Pollard 56
Posner 74
pragmatic 6, 10, 24, 41, 72, 138, 161, 182, 205, 239–241, 248–249, 251–256, 259–269, 271, 273–275, 277–280
productivity 25, 47–49, 52–53, 74–75
projection 110
prototype 74, 77
prototypical 22–23, 76–77, 82, 87, 89, 171, 198
prototypicality 73, 77
proverb 255–256, 259–260, 267–268, 274, 278
Pulvermüller 54, 250, 264
p-value 222–223, 228–229, 232, 234
quantitative 74, 217–218, 221, 225, 235–237, 279
Quirk 105, 194, 257
Radden 179
Rakoczy 254
Rappaport Hovav 197–198
Read 260
recurrent 1, 3, 79, 244, 250, 254–255, 257, 259–261, 268–269, 271, 275
register 253, 259–261, 264, 268–269
Renouf 100
repulsion 222, 233–236
Rescorla 77–78
Rhomieux 149, 163, 195
Rice 114
Rispoli 19
Römer 82
Rosch 77
routine formulae 241, 254–255, 259, 261–263, 265, 268, 274, 278
routinization 243–253, 260–263, 265, 270–271, 274–276, 279–280
Rowland 10–11, 15–16, 19, 23–24, 46
Roy 57
Rundell 134
Ruppenhofer 195
Saffran 33
Sag 56, 142, 150, 163
Sagae 56–57, 60
salience 17, 25, 73, 77–78, 105, 179, 256, 274–275, 277–279
Salomo 13
Sandbank 58
Sato 128, 135, 152
Saussure 248
scene 16, 57, 198, 266
Scheibman 23, 258
schema 3, 14–15, 23, 75, 79, 110, 147, 190, 244–245, 247–248, 250, 264, 267, 280
schema-formation 244, 247, 267
schematization 244, 248, 250–266, 274–276, 279–280
Schenkel 169–170
Schiffrin 257
Schlesinger 36, 52
Schmid 6, 84, 121, 167, 196, 206, 217, 239, 243, 257–258, 264, 266–267, 269, 276–277
Schmidt 170, 175
Schmitt 254, 269, 279
Schüller 169, 171–173, 176–177, 186, 188, 192, 195, 197, 204, 264
Schumacher 170, 175, 181–182, 186
Schütze 255
Schwenter 265
Searle 261
Sells 178, 184
semantic role 16, 123, 128, 131, 172–173, 175, 179, 181, 189–193, 205, 220
Sethuraman 76, 200
Shanks 78, 84
Shtyrov 250, 264
Silverstein 277
Simon 33
Simpson-Vlach 84
Sinclair 2, 79, 99–101, 103, 107, 112–114, 167, 207, 249–250, 254–255, 269
Sinha 242
Siskind 33, 57
Slobin 39–40
Smith 243–244, 277
sociopragmatic 242, 245, 247
Sokolov 39
Solé 75–76
Somers 171
Stahl 11, 44
statistical 5, 33, 45, 56, 71–73, 75, 78–79, 83–84, 89, 217, 220–221, 226–227, 231, 235–236, 279
Steedman 36, 57
Stefanowitsch 4, 6, 79, 101, 182, 185, 198, 201–203, 205, 217–218, 220, 222, 226–227, 231–235, 237, 258
Stein 260
Stemberger 54
Steyvers 75
Stoll 12
Strässler 259, 267
Strecker 169, 171
Stubbs 1, 4, 255
Stumper 12
style 23, 122, 252–253, 261, 264, 268, 275
Svartvik 194, 257
Svensson 257
Syder 258–260, 264, 271, 279
symbolic 72, 243, 248–253, 263, 265, 271, 274–276, 279–280
synset 85–88
syntagmatic 6, 72, 248, 250–252, 254–255, 265, 269, 275–280
target 104–105, 108–109
Taylor 73, 105
Tenenbaum 49, 57, 59, 75
Terkourafi 249
Terrace 35
Tesnière 56, 168, 170, 204
Teubert 106
Theakston 10–11, 16, 18, 20
Thiessen 33
Thompson 22, 75
Tognini-Bonelli 106
token 5, 10, 25, 73–76, 79–82, 84–89, 101, 258
Tomasello 3, 9, 11, 13, 15–17, 20–22, 44, 46, 49, 53, 58, 60, 72, 76, 112, 183, 200, 242–243, 250, 254
transitives 15–18, 23, 50
Traugott 242, 247, 262–265
Tremblay 269–270
Trousdale 4
Trudgill 245
Tversky 74
Tyler 105
Uhrig 167, 196, 198
Underwood 269
usage-based 2–4, 6, 9, 19–20, 25, 72, 76, 114, 205, 242, 244, 252–254
VAC see verb-argument construction
valence/valency 1–6, 102, 106–107, 121–131, 133, 137, 139, 141, 155–156, 167–192, 194–207, 223, 258–259, 263–264, 275
verb-argument construction 5, 15, 71, 76–89, 152
Verhagen 23
Vosse 56
Wagner 77
Waltereit 245
Warren 254
Waterfall 39, 58
Watson 80
Welke 175, 203–204
Wexler 19, 42, 59
Wijnen 20
Wittgenstein 74
Wonnacott 49, 59
Worden 38
WordNet 85, 87, 89
Wray 239, 246, 249–251, 253–254, 260–261, 270, 279
Wulff 78, 217
Zelle 57
Zhao 54, 58
Zifonun 169, 171
Zipf 75, 83
Zipfian distribution 5, 73, 75–76, 79, 82–83, 85–86, 88–89
Zuidema 57