English Pages 277 [280] Year 2012
Incipient Productivity
Cognitive Linguistics Research 49
Editors: Dirk Geeraerts, John R. Taylor
Honorary editors: René Dirven, Ronald W. Langacker
De Gruyter Mouton
Incipient Productivity A Construction-Based Approach to Linguistic Creativity by Arne Zeschel
ISBN 978-3-11-027001-3
e-ISBN 978-3-11-027484-4
ISSN 1861-4132

Library of Congress Cataloging-in-Publication Data
A CIP catalog record for this book has been applied for at the Library of Congress.

Bibliographic information published by the Deutsche Nationalbibliothek
The Deutsche Nationalbibliothek lists this publication in the Deutsche Nationalbibliografie; detailed bibliographic data are available in the Internet at http://dnb.dnb.de.

© 2012 Walter de Gruyter GmbH & Co. KG, Berlin/Boston
Printing: Hubert & Co. GmbH & Co. KG, Göttingen
∞ Printed on acid-free paper
Printed in Germany
www.degruyter.com
Preface
This book is a revised version of my doctoral dissertation “Delexicalisation patterns: a corpus-based approach to incipient productivity in fixed expressions” (Universität Bremen, 2007). There are a number of people without whose help this project would not have been possible, and I would like to express my sincere gratitude for the inspiration and the support that they have provided. Academically and intellectually, the first people to mention here are my two supervisors, Anatol Stefanowitsch and Gabi Diewald, who have not only guided the process of developing my own perspective on the topic of this book, but who have also left their mark on the way I think about language and linguistics more generally. The articulation of these views has also profited substantially from long (and mostly controversial) discussions with Felix Bildhauer, who has accompanied this project from the first ideas jotted on a piece of scrap paper all the way down to the proofreading stage. Apart from two anonymous reviewers (whose astute and constructive comments were very helpful for finalising the manuscript), I would like to thank the following people (in no particular order) for providing stimulating feedback and critical discussion of the ideas presented in this book: Kerstin Fischer, Stefan Müller, Holger Diessel and the remaining members of the German construction grammar network (DFG-Nachwuchsnetzwerk Konstruktionsgrammatik), my former colleagues Stefanie Wulff, Maike Krone and Cornelia Zelinsky-Wibbelt, and last but not least the members of my dissertation committee: Thomas Stolz, John Bateman, and Juliana Goschler. Most of all, however, this book is indebted to the ideas of Ronald Langacker and especially Adele Goldberg, without whose inspiring 1995 monograph on argument structure constructions I may not have decided to do a doctorate in linguistics in the first place. Thank you to all of you!
On the more practical side, my special thanks go to Philippa Cook for coding a sample of the English data analysed in chapter 6, as well as to my wife Bettina who coded a parallel share of the German data. Their willingness to put up with the niceties of the coding scheme and to engage in sometimes painstaking reflections on the subtlest of semantic intuitions cannot be valued highly enough – I know it was not exactly a fun thing to do, so I am all the more thankful for your effort and patience. On the side of the publisher, I would like to thank Birgit Sievert and Dirk Geeraerts for
their sustained interest in the project and their encouragement to pursue its publication with De Gruyter Mouton. Furthermore, I am grateful to the University of Bremen for kindly awarding me a scholarship to cover the last months before the completion of the original dissertation. Finally, I want to thank those people whose love, encouragement and patience have permitted me to pursue this project with the due dedication in the first place. These are Caroline Schatke, my parents Achim and Marianne, and above all Bettina and Niklas, the two people who have, each in their own way, done most to bring this project to a successful completion in the end.

Sønderborg, August 2011
Contents
1. Introduction
1.1 The issue
1.2 Aims and scope
1.3 Structure of the book

2. Towards a usage-based model of constructional generalisation
2.1 Introduction
2.2 Incipient productivity: From collocations to constructional schemas
2.3 Theoretical framework
2.3.1 Usage-based construction grammar
2.3.2 Cognitive semantics
2.3.3 Models of constructional generalisation
2.4 Previous research
2.4.1 Insights from research on construction learning
2.4.2 Insights from research on constructional change
2.4.3 Insights from research on constructional variation
2.5 Chapter summary

3. Testing ground: Intensity collocations
3.1 Introduction
3.2 Intensity and intensification
3.2.1 Intensification as a linguistic function
3.2.2 Intensifier variation and change
3.3 Conceptualising intensification
3.3.1 Intensification strategies in English and German
3.3.2 PERCEPTION intensifiers
3.4 Constructing intensification
3.4.1 Construction A: Int + N
3.4.2 Construction B: Int + Adj
3.4.3 Construction C: Int + with/vor + N
3.5 Objectives
3.6 Chapter summary

4. Lexicalisation patterns: From concepts to words
4.1 Introduction
4.2 Prerequisites
4.2.1 The corpus-linguistic study of lexicalisation patterns
4.2.2 Data
4.3 Procedure
4.3.1 Setting up the search space
4.3.2 Data extraction and coding
4.4 Results
4.4.1 Overview
4.4.2 Construction A
4.4.3 Construction B
4.4.4 Construction C
4.5 Summary and discussion

5. Fixed expressions: From words to collocations
5.1 Introduction
5.2 Prerequisites
5.2.1 Formulaicity and creativity
5.2.2 Corpus data as clues to cognitive entrenchment patterns
5.3 Procedure
5.4 Results
5.4.1 Overview
5.4.2 Construction A
5.4.3 Construction B
5.4.4 Construction C
5.5 Summary and discussion

6. Incipient productivity: From collocations to constructional schemas
6.1 Introduction
6.2 Prerequisites
6.2.1 Problems of semantic classification
6.2.2 Approaches to semantic classification
6.2.3 Approaches to productivity
6.3 Procedure
6.3.1 Identifying item-based generalisations
6.3.2 Identifying pockets of productive use
6.3.3 Identifying higher-level generalisations
6.4 Results
6.4.1 Item-based generalisations
6.4.2 Incipient productivity
6.4.3 Higher-order patterns
6.5 Summary and discussion

7. Conclusion

Appendix
Notes
References
Index
Chapter 1 Introduction
1.1. The issue
All languages contain a large stock of ‘fixed expressions’: prefabricated sequences like English take care, take a guess and take a liking to that are part and parcel of ‘idiomatic’ ways of putting things. How is it that certain of these elements are easily varied and expanded (cf. take a fancy/fondness/dislike/aversion/… to), whereas others sound decidedly odd if manipulated in this way (cf. ?take a conjecture/surmise/speculation/…)? And what is the reason that even variants of the same semantic type may sound differently acceptable at times? What are the mechanisms involved in the extension of established formulas to conceivable variants? How does the observable variability relate to processes of language change and grammaticalisation? Or to the generalisation processes of children during language acquisition? These are some of the questions that are addressed in this monograph.

Put differently, this is a study of linguistic creativity: it investigates the ways and the extent to which speakers vary established patterns of language use and adapt them to novel contexts of application. Its principal interest is the emergence of constructional generalisations through schematisation, a process that gradually brings out implicit commonalities between clusters of memorised elements of linguistic knowledge. The determinants of such generalisation processes are a much-debated issue in cognitively oriented, usage-based approaches to language learning (Tomasello 2003) and change (Traugott and Dasher 2002). The present study approaches the topic from a synchronic-variationist perspective: based on a large-scale contrastive corpus study involving roughly 70,000 semi-automatically collected samples, it traces the emergence of partial productivity in clusters of conventional English and German collocations.

1.2. Aims and scope
The investigation departs from the observation that speakers tend to produce what they have heard produced in a relevantly similar way before,
rather than what should be recoverable in principle (Pawley and Syder 1983; Wray 2002). This is to say that there is a conserving effect of convention which plays an important role in constraining the generalisations that speakers are prepared to extract from their input and to put forward in their own productions. At the same time, language is a motivated system which in many respects encourages the extraction of such generalisations, and also provides ample opportunities for plausible extensions of its patterns beyond the existing status quo. Moreover, speakers find themselves in a situation where there are not only sound reasons for conforming to convention, but also for departing from it under certain circumstances and in specific ways, thus giving a constellation of conflicting pressures on the speaker.

My study addresses aspects of this tension by investigating semantic extension processes within clusters of conventional collocations. In the interest of manageability, it is restricted to a rather narrowly circumscribed set of expressions that match particular functional, structural and semantic requirements. Functionally, it is a study of intensification and intensity expressions, i.e. linguistic expressions that involve the semantic boosting of a particular aspect of the encoded predication (cf. chapter 3). Intensification phenomena are a promising object of study for research on semantic variation and change because they are particularly prone to innovation for pragmatic reasons (Bolinger 1972: 18; Hopper and Traugott 2003: 122). Structurally, the study is restricted to the following three constructions (henceforth, constructions A, B and C, respectively):

(1) a. INTENSIFIER + N            burning ambition
    b. INTENSIFIER + Adj          burningly ambitious
    c. INTENSIFIER + with + N     to burn with ambition

(2) a. INTENSIFIER + N            klirrende Kälte      ‘clinking cold’
    b. INTENSIFIER + Adj          klirrend kalt        ‘clinkingly cold’
    c. INTENSIFIER + vor + N      klirren vor Kälte    ‘to clink with cold’
From the large pool of intensifying devices that is available in both English and German, these constructions were chosen because they are both transparently related from a language-internal point of view and maximally isomorphic across the two investigated languages (cf. chapter 3). Finally, the study is restricted to intensifying uses of these constructions involving lexical material from particular semantic source domains in the intensifier slot. Specifically, it investigates the intensifying potential of four
distinct semantic categories within the three investigated constructions. These categories were chosen because they form a coherent semantic set that is discussed in chapter 3. Diagrammatically, then, the target expressions of my study are located at the intersection of three different kinds of sets, and only expressions exhibiting the requisite specifications in all three dimensions in parallel were included in the dataset:
Figure 1.1 Delimiting the empirical testing ground
1.3. Structure of the book
The book is divided into seven chapters. Chapter 2 introduces the topic and the theoretical background of the study. The central notions of schematisation and incipient productivity are discussed and anchored in the theoretical context of usage-based construction grammar. In addition, relevant previous research from diachronic, developmental and variationist perspectives is reviewed. Chapter 3 sketches the linguistic testing ground of the study, addressing each element in figure 1.1 above in a separate section (i.e. intensification as a linguistic function, the four semantic target patterns and three constructional categories that are investigated). Chapter 4 opens the empirical part of the study by taking stock of the conventional exploitations of the targeted intensification strategies in the three constructions in both languages. Having laid out data extraction and coding procedures, the focus is
first on the paradigmatic perspective: how extensively are the targeted lexical resources recruited for the encoding of intensity in English and German, and which similarities and differences are found across the three investigated constructions? Chapter 5 switches to a syntagmatic perspective by exploring the combinatorial potential of the attested intensifiers. Collostructional analysis is introduced as a quantitative corpus-linguistic method for the identification of routinised ‘fixed expressions’ in the data that can be distinguished from creative exploitations of the attested patterns. Qualitative analyses of collocate tables provide a first, impressionistic approximation of the semantic patterning of the targeted constructions. Chapter 6 presents the analytical centrepiece of the investigation. Building on the empirical ground prepared in chapters 4 and 5, it proceeds to a detailed and systematic semantic classification of the obtained results. Using cluster analyses and multirater manual classifications of the data, the semantic structure of selected target collocation clusters is reconstructed on different levels of semantic schematicity. Furthermore, the categorised data are partitioned into groups of established vs. creative uses and determinants of incipient productivity within the investigated usage patterns are assessed. Finally, chapter 7 provides a brief recapitulation of the main findings of the study and places their implications in a wider theoretical context.
Chapter 2 Towards a usage-based model of constructional generalisation
2.1. Introduction
Chapter 2 develops the main objectives of the study, puts them into theoretical perspective and provides a review of selected previous research. The exposition is structured as follows: section 2.2 gives a brief illustration of the phenomenon under investigation. Section 2.3 establishes the theoretical background of the study and introduces some useful terminology. Section 2.4 builds connections to previous studies and reviews related research in language acquisition, historical linguistics and synchronic constructional variation that has informed my approach. Finally, section 2.5 summarises the chapter.

2.2. Incipient productivity: From collocations to constructional schemas
As pointed out in chapter 1, language is a motivated system that provides speakers/learners with many opportunities for extracting generalisations about its countless patterns. In the terminology of cognitive grammar (Langacker 1987, 1991), this process is referred to as schematisation: the apprehension of a certain formal and/or semantic similarity between two or more linguistic units that can be captured in the form of an overarching schema. On the one hand, schemas thus capture commonalities across already experienced instances of a given pattern. On the other hand, they can also be used to sanction novel/unexperienced instances of the relevant type (to varying degrees). In other words, a schema may be more or less productive. There is an extensive literature on the factors that contribute to the productivity of morphosyntactic schemas and their appropriate operationalisation in empirical research (cf. chapter 6). The present study adds to this discussion, and it focuses specifically on the lower end of the productivity spectrum: as pointed out in chapter 1, it investigates the way and the extent to which speakers vary established patterns of language use (meaning conventional collocations that are sufficiently frequent to be stored as complex wholes) and adapt them to novel contexts of application (meaning creative modifications of their lexical content that expand their usage patterns to new semantic territory). In short, then, it is a study of the schematisation of ‘fixed expressions’ and their rise to productivity (however limited).

Understood like this, the targeted phenomenon can be investigated from several different angles: it can be studied diachronically as a process in historical time (be it in ontogenetic or in phylogenetic perspective), or it can be framed purely synchronically. The first of the two ‘processual’ perspectives is concerned with acquisition, and the question is how individual speakers/learners form schemas over the input during language learning (cf. section 2.4.1). The second perspective is concerned with change, and the question is how formerly (more) restricted constructions expand to less constrained instantiations over time (cf. 2.4.2). Finally, the synchronic perspective is concerned with variation, and the question is what kinds of schemas are implicit in the range of attested instances of a construction at a specific point in time (cf. 2.4.3). This latter perspective constitutes the focus of attention here. For an illustration, consider the expressions in (1):

(1) a. to go wrong/mad/bankrupt/…
    b. to go ?successful/??likely/??obvious/…
    c. to go crazy/insane/mental…
    d. to go ?false/?inadequate/?deficient/…
Table 2.1 reports the ten most frequent adjectives that occur immediately adjacent to a form of to go in the 100-million-word British National Corpus (Burnard 1995; specifically, the table shows the most frequent collocates tagged as ‘AJ0’, excluding participial types like unnoticed):

Table 2.1 go + ADJ in the BNC

Adjective    Frequency
wrong        692
mad          413
bankrupt     124
crazy        121
dead          83
public        82
free          81
red           76
white         75
hungry        74
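
The kind of count reported in Table 2.1 can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions, not the extraction procedure actually used for the table: the function name, the list of forms of to go, the input format (a flat list of word/tag pairs) and the toy sentences are all hypothetical, and ‘immediately adjacent’ is read here as ‘immediately following’. Only the tag ‘AJ0’ and the idea of excluding certain types are taken from the description above.

```python
from collections import Counter

# Inflected forms of "to go" (an assumed list for this sketch).
GO_FORMS = {"go", "goes", "going", "went", "gone"}


def go_adjective_counts(tagged_tokens, excluded=frozenset()):
    """Count adjectives (tag 'AJ0') that immediately follow a form of 'to go'.

    tagged_tokens: list of (word, tag) pairs in running-text order.
    excluded: adjective types to skip (e.g. participial types).
    """
    counts = Counter()
    # Walk over each token together with its right-hand neighbour.
    for (word, _tag), (next_word, next_tag) in zip(tagged_tokens, tagged_tokens[1:]):
        if word.lower() in GO_FORMS and next_tag == "AJ0":
            adjective = next_word.lower()
            if adjective not in excluded:
                counts[adjective] += 1
    return counts


# Toy input invented for illustration (not BNC data).
sample = [
    ("Things", "NN2"), ("went", "VVD"), ("wrong", "AJ0"), (".", "PUN"),
    ("He", "PNP"), ("goes", "VVZ"), ("mad", "AJ0"), (".", "PUN"),
    ("It", "PNP"), ("went", "VVD"), ("wrong", "AJ0"), ("again", "AV0"), (".", "PUN"),
]

print(go_adjective_counts(sample).most_common(10))
# → [('wrong', 2), ('mad', 1)]
```

A realistic query would additionally have to handle sentence boundaries and tagging errors, which this sketch ignores.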
Combinations marked ‘?’ in the examples in (1) are not attested in the BNC, and ‘??’ indicates that an expression was not even found on .uk-sites on the web at the time of the investigation (excluding false hits like Hand in hand went successful economic policies). The question is what these contrasts suggest about the conventional usage spectrum of the aspectual construction [go Adj] in present-day English, and what kinds of representations speakers form and employ in categorising these and similar expressions. For one thing, (1a) shows that the construction is conventionally used with semantically quite heterogeneous adjectives. At the same time, (1b) suggests that the adjective slot is nevertheless not completely open. The first aim of the empirical studies presented below is therefore to provide an exact identification of (and, to the extent that this is possible, motivation for) such combinatorial restrictions with regard to the specific target structures examined in chapters 4 to 6. The second question to be investigated is how the existing usage patterns of such structures are creatively varied and expanded (or, to put it in the terminology of Hanks 2004, what constrains speakers’ ‘exploitations’ of an established collocational ‘norm’). Staying with the above example, (1c) indicates that e.g. substitutions with near-synonyms are unproblematic for some of the high-frequency instances in (1a), but at best marginally acceptable for others (1d). Is this just an arbitrary fact, or is there more to be said about which kinds of expressions are more readily extended than others? Clear as it seems that “what we store in some cases [of slots in ‘idioms’ and ‘fixed expressions’] is a meaning rather than a specific word” (Erman and Warren 2000: 41), very little is known about the way in which such lexically underspecified ‘schematic idioms’ emerge from lexically specific source structures.
The present study takes steps towards closing this gap: in anticipation of the discussion in sections 2.3 and 2.4, it is a contribution to research on constructional generalisation processes that manifest in the transition from idiosyncratic lexical to more or less principled semantic constraints on the lexical instantiation of constructional schemas.

2.3. Theoretical framework
Before delving deeper into the issue, it will be useful to put the investigation into theoretical perspective and introduce some relevant terminology. Basic tenets of the grammatical framework to be adopted – usage-based construction grammar (Bybee 1998, 2006, 2010; Croft 2001; Goldberg
1995, 2006; Langacker 2005, 2008, 2009) – are laid out in section 2.3.1. On the semantic side, the study draws chiefly on ideas from cognitive semantics (Lakoff 1987; Langacker 1987; Talmy 2000), an overview of which is provided in section 2.3.2. With these two preparations made, section 2.3.3 then narrows the focus to models of constructional generalisation processes as proposed in the usage-based/cognitive-linguistic literature.

2.3.1. Usage-based construction grammar
My study adopts a construction-based approach to language. As the term implies, such models assume that the basic units of linguistic organisation are constructions: learned associations of form and meaning that cut across the traditional levels of linguistic description (phonology, morphology, syntax, semantics and pragmatics). Constructions are signs: depending on what kind of construction is at issue, their form pole may involve anything from phonological specifications over morphological properties to complex configurations of morphosyntactic categories including any combination of properties in these dimensions. On the meaning side, they may encode anything from (more or less) concrete individually symbolised elements of conceptual structure over abstract functional relationships within composite assemblies of such structures to specific pragmatic implications and/or conventional restrictions of the given sign to particular usage contexts, plus again any combination of features in these dimensions. Some approaches (Bybee 2006; Langacker 2008) take internal complexity to be a further necessary criterion for constructionhood, thus excluding morphemes and (monomorphemic) words from the definition. Others (Goldberg 2006; Kay and Fillmore 1999) find it more important to stress that words and syntactic constructions are essentially the same type of data structure (i.e. conventional associations of form and meaning) and therefore apply the term ‘construction’ to both simplex and complex linguistic symbolisations alike. But these are terminological trifles. Common to both usages of the term is the assumption that linguistic knowledge is a vast repository of signs, that all information is stored in the same format, and that a principled distinction between ‘lexicon’ and ‘grammar’ is therefore problematic.
Since lexical items (memorised bits of language) and grammatical schemas (routinised generalisations over such units) are not seen as objects of a qualitatively different type, syntactic composition is not envisioned as a process of stringing up meaningful words according to meaningless rules, but rather of ‘integrating’ smaller signs (words) into larger signs (syntactic constructions), i.e. as a process of instantiating the slots of an underspecified construction with concrete lexical material. Construction grammar assumes that two constructions can be combined in this way if they are both formally and semantically compatible. Between the two endpoints of a fully specified lexical item and an entirely underspecified syntactic schema, construction grammar recognises a broad range of intervening structures that may be lexically filled to a greater or lesser degree. Goldberg (2006: 5) illustrates this heterogeneity with the following examples, each of which is recognised as a construction in its own right:

– Morpheme: e.g. pre-
– Word: e.g. avocado
– Complex word: e.g. daredevil
– Complex word (partially filled): e.g. [N-s] (for regular plurals)
– Idiom (filled): e.g. give the devil his dues
– Idiom (partially filled): e.g. jog memory
– Covariational Conditional: e.g. the more you think about it, the less you understand
– Ditransitive (double object): e.g. he gave her a fish taco
– Passive: e.g. the armadillo was hit by a car
It follows from this characterisation that many constructions contain (more or less) ‘open slots’, i.e. positions that can be instantiated by a range of different lexical items/forms of these items subject to a set of construction-specific constraints. Consider for instance the following contrast involving the English serial verb construction [NP V V-ing PP] (from Goldberg 2006: 50):

(2) a. Bill went whistling down the street.
    b. Bill ran whistling down the street.
    c. Bill came whistling down the street.
    d. *Bill walked whistling down the street.
    e. *Bill raced whistling down the street.
Similar to the problem of adjective variation in the earlier example of [go Adj], the variability in the first verb slot illustrated in (2a-c) may suggest that this construction is compatible with motion verbs in general rather than just the specific individual verb to go alone. However, the unacceptability of (d) and (e) shows that this is not the case. The contrast in (2) is thus a
further illustration of the central question to be pursued below: how do speakers/learners know which expressions are possible instances of a specific pattern and which are not, and what is the level of abstraction on which this knowledge is encoded? In addition, constraints on a constructional slot often involve restrictions on the specific form of the filler, as illustrated by the distinct but related serial verb construction [NP V VPbare] in (3) which rules out a tensed first verb (examples from Goldberg 2006: 53):

(3) a. Go tell it to the mountain.
    b. Won’t you come sit with me?
    c. Would you run get me a pencil?
    d. *She came sat/sit with me.
    e. *He goes bring/brings the paper.
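
The two kinds of slot constraint at issue in (3), a restriction on which verbs may fill the first slot and a restriction on the form in which they may appear, can be made concrete with a small toy model. The following Python sketch is purely illustrative and makes no claim about how such knowledge is actually represented: the function name and the data structure are hypothetical, and the set of admissible verbs simply lists the lemmas that occur in the acceptable examples (3a-c), which is an assumption rather than an exhaustive statement of the constraint.

```python
# Hypothetical toy model of the first verb slot in [NP V VPbare].
# Assumption: the admissible lemmas are just those seen in (3a-c).
ADMISSIBLE_FIRST_VERBS = {"go", "come", "run"}


def is_sanctioned(first_verb_form, first_verb_lemma):
    """Return True if the first verb satisfies both slot constraints.

    Lexical constraint: the lemma must be among the admissible verbs.
    Formal constraint: the verb must be in its bare (untensed) form,
    which for these verbs is identical to the lemma.
    """
    lexically_ok = first_verb_lemma in ADMISSIBLE_FIRST_VERBS
    formally_ok = first_verb_form.lower() == first_verb_lemma
    return lexically_ok and formally_ok


print(is_sanctioned("Go", "go"))      # (3a): bare 'go'
print(is_sanctioned("came", "come"))  # (3d): tensed 'came'
print(is_sanctioned("goes", "go"))    # (3e): tensed 'goes'
```

Keeping the lexical and the formal check apart mirrors the point made in the text: a filler can fail a slot either because of what it is or because of the form it takes.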
One of the major motivations for the development of construction-based language models was the desire to account for such idiosyncrasies on a par with the more regular properties of language, thereby rejecting the core/periphery distinction of traditional generative grammar. More fundamentally, the constructions that are posited in such analyses are not seen as epiphenomenal “taxonomic artefacts” (Chomsky 2002: 95) of linguistic description without legitimate claims to psychological reality. Quite to the contrary, they are seen as the basic unit of grammatical organisation in speakers’ minds. This point also marks a contrast with uses of the term ‘construction’ in traditional grammar, where references to specific constructions (e.g. ‘the passive construction’, ‘the dative construction’ etc.) are legion, but it is not assumed that such units constitute the elementary building blocks of linguistic knowledge. On the other hand, ‘construction grammar’ itself is not so much the name of a single unified framework but rather a cover term for a whole family of related approaches. Apart from the issue of internal complexity that was already mentioned above, there is a further important disagreement between different versions of the theory when it comes to the criteria for constructional status: some approaches recognise only strictly nonpredictable structures as independent constructions in their own right (Goldberg 1995; Kay and Fillmore 1999), whereas others accord unit status to any structure that is cognitively routinised/entrenched (Goldberg 2006; Langacker 2008; cf. Zeschel 2009 for a comparison of both accounts). The latter interpretation is associated with usage-based variants of the framework whose central tenets can be summarised as follows:
– Linguistic structure emerges from language use
– Linguistic knowledge is a structured inventory of symbolic assemblies
– The cognitive instantiation of language is dynamic and mutable rather than static and fixed
Since this is the perspective to be adopted in the present study, I will briefly address each point in turn. Beginning at the beginning, usage-based linguistic theories assume that speakers’ grasp of a language arises from their categorisations of concrete linguistic usage events, where the term ‘usage event’ is defined as follows: “the pairing of a vocalization, in all its specificity, with a conceptualization representing its full contextual understanding. A usage event is thus an utterance characterized in all the phonetic and conceptual detail a language user is capable of apprehending” (Langacker 1999: 99). Speakers are assumed to analyse and sort the structure of these experiences by mapping particular aspects of the input to relevantly similar elements of long-term memory (in several dimensions in parallel, thus recognising particular morphemes, words, multi-word units, syntactic phrases, argument structure constructions, intonation contours etc.). For this, linguistic categorisation draws on a powerful capacity for pattern matching in human cognition that is not peculiar to language. Likewise, speakers’ internalised linguistic systems are assumed to ‘emerge’ from countless individual categorising events of the abovementioned type in a way that does not presuppose any domain-specific innate constraints on possible grammatical abstractions. Second, the resulting system is nevertheless assumed to be highly structured: metaphorically speaking, new elements are stored ‘next to’ similar pre-existent units and contract all sorts of connections to other stored elements on the basis of perceived similarities (Bybee 2006).
With growing exposure, speakers thus develop an increasingly complex, network-like structured inventory of categorised symbolic assemblies that exhibit varying degrees of entrenchment (cognitive routinisation). And third, the overall inventory is seen as fluid and dynamic since it is constantly adapting to experience. Put differently, usage-based theories assume that speakers’ internalised linguistic system at no point settles into a more or less unchanging ‘final state’ as assumed in Chomskyan theories of language acquisition, and that this system will also vary from one speaker to the next in many respects (notably when it comes to less salient properties of less frequent constructions whose representations are not constantly aligned and hence accommodated accordingly).
In the remainder of this section, I will provide first approximations to three notions that are crucial for appreciating the way in which my study was set up and why, and discuss how they are understood within the usage-based constructionist perspective sketched above. These notions are the concept of ‘lexicalisation patterns’, the notion of ‘fixed expressions and idioms’, and finally the concept of ‘productivity’. More detailed discussion will be provided in the ‘Prerequisites’ sections of chapters 4 to 6, which are devoted to operationalising and exploring these very phenomena in my data. In addition, I will introduce some useful terminology for distinguishing constructions on different levels of schematicity (within the same taxonomy) that will hopefully contribute to a clearer presentation of the results (especially in chapter 6). The first term to introduce is Talmy’s (1985, 2000) notion of ‘lexicalisation patterns’. In Talmy’s own words (1985: 59), “lexicalisation is involved where a particular meaning component is found to be in regular association with a particular morpheme. The study of lexicalisation, however, must also include the case where a set of meaning components, bearing particular relations to each other, is in association with a morpheme, making up the whole of the morpheme’s meaning”. He goes on to suggest a research programme devoted to the exploration of recurrent patterns in the way that particular meaning components are lexicalised (i.e. conventionally associated with particular formal coding options) both within and across languages. Wherever different semantic elements are regularly expressed together in a single surface form, they are said to be ‘conflated’ in this form, and the underlying regularity is referred to as a ‘conflation pattern’. 
Talmy’s (1985) study focuses on crosslinguistic differences in the semantic packaging of verbs and a particular type of complements which he calls ‘satellites’, here exemplified using his example of motion events and the way in which they are conventionally expressed in different European languages:
(4) a. The craft floated into the hangar.
b. La botella entró a la cueva (flotando).
the bottle move-in to the cave floating
‘The bottle floated into the cave.’ (Talmy 1985: 65)
According to Talmy, Germanic languages like English contrast with Romance languages like Spanish in that they exhibit different verbal conflation patterns: English motion verbs typically behave like to float in that
they conflate the meaning components MOTION and MANNER (float: ‘to move in a floating manner’), leaving a specification of the PATH element to be expressed in a satellite (i.e. the PP in 4a). Spanish, by contrast, conflates MOTION and PATH in the verb (entrar: ‘to move into something’), leaving the MANNER specification to be encoded elsewhere (i.e. the separate form flotando in 4b). Goldberg (1995) recasts Talmy’s original analysis of such conflations in construction grammar terms in that certain semantic properties of the composite expression that Talmy assumes to be lexicalised ‘in the verb’ (here: the motion implication) are attributed to the construction instead (here: the English intransitive motion construction). In other words, Goldberg reinterprets the notion of a conflation pattern not as a combination of distinct meaning components that are conventionally associated with the verb, but as construction-specifically conventionalised interactions between the semantics of the construction and the semantics of the inserted lexical items. Evidence for Goldberg’s analysis comes from so-called ‘coercion effects’ in which a verb not normally signifying MOTION comes to express this implication in virtue of its being used in the intransitive motion construction:
(5) a. The train screeched into the station.
b. The fly buzzed out of the window.
c. The elevator creaked up three flights. (Goldberg 1995: 62)
The constructionist perspective thus introduces the third dimension diagrammed in figure 1.1 in chapter 1: constructional patterns of linguistic categories (e.g. SUBJ V OBL) can be used to express particular meanings/functions (here: ‘THING move-to LOCATION’) provided they are instantiated by lexical items with an appropriate/conventionally compatible meaning (in the above case, a verb of SOUND EMISSION).
As illustrated in (5), such ‘compatibility’ may in fact extend to various kinds of mismatch between lexical and constructional meaning, in the sense that it is sometimes words from rather different source domains that are recruited for signalling the meaning/function expressed by the construction. Applied to the present study, the term ‘lexicalisation pattern’ will therefore be used to denote the link between bases from particular semantic source domains (‘target lexis’ in figure 1.1) and the pragmatic function INTENSIFICATION (‘target function’ in figure 1.1) within the three investigated grammatical environments (‘target constructions’ in figure 1.1). A corpus-based approach to the contrastive study of constructional lexicalisation patterns in this sense is developed in chapter 4. An illustration of one such pattern is provided in (6), again involving the domain SOUND EMISSION which also happens to be among the conflation classes investigated here:
(6) a. a crashing bore, a resounding success, a crying shame
b. squeaky clean, piping hot, thumping good
c. to buzz with activity, to creak with age, to crackle with energy
The second notion, ‘fixed expressions and idioms’ (‘FEI’, Moon 1998), is a cover term for a range of very heterogeneous structures that have attracted special attention in the construction grammar literature: as indicated above, the interest in ‘constructional idioms’ and seemingly quirky structures that pose a challenge to purely rule-based conceptions of grammar was a major impetus for the development of construction-based theories of grammar. Idiomaticity is a complex and multi-faceted phenomenon (cf. Moon 1998 and Wulff 2008 for recent book-length treatments), and many of its most prominent aspects will also come up in the present discussion at some point: for instance, issues of ‘literalness’ vs. figurativity (when distinguishing putative semantic source structures from their supposed extensions), questions of lexical fixedness vs. flexibility (when operationalising the continuum from frozen chunks to more or less productive schemas) and also of grammatical regularity vs. idiosyncrasy (when considering the distributional patterning of particular intensifier-predicate combinations across different constructional environments). Above all, however, the concept of fixed expressions will be of interest for my study because of the close link between habitual co-occurrence in usage and cognitive routinisation that is assumed in usage-based construction grammar.
The connection between lexicogrammatical fixedness (or formulaicity) and psychological entrenchment will be considered in greater detail in chapter 5: following a discussion of the relationship between textual raw frequency and statistically significant co-occurrence in corpora on the one hand and cognitive routinisation on the other, the notion of fixed expressions is used to identify constructional exemplars that are presumably memorised as units by most speakers. Third, the factors that license creative variations of these expressions are central to discussions of the last of the aforementioned key notions, productivity. Like idiomaticity, productivity is a complex and ambiguous notion that has been addressed in several monographs (e.g. Aronoff 1976; Barðdal 2008; Bauer 2001; Bybee 1985; Plag 1999). With Barðdal (2008), it is here
assumed that productivity is a graded property that does not contrast with ‘creativity’ or ‘analogy’ and that is as fruitfully applied to combinatoriality in syntax as it is to novel coinages in morphology. A more detailed discussion of usage-based approaches to productivity is provided in chapter 6. Finally, it will be useful to introduce a terminological distinction between constructions on different levels of schematicity within the same taxonomy. Following Traugott (2007, 2008), I will distinguish between
Macro-constructions: high-level schemas, the highest level relevant for the discussion at hand, e.g. ditransitive construction, partitive construction, degree modifier construction;
Meso-constructions: sets of similarly-behaving constructions, e.g. the set a bit/lot (of), as distinct from the set (a) kind/sort of, etc.;
Micro-constructions: individual construction-types, e.g. a lot of vs. a bit of;
Constructs: empirically attested tokens of micro-constructions. (Traugott 2007: 525)
These distinctions are adopted with one modification: whereas Traugott reserves the notion ‘construct’ exclusively for tokens, I will further distinguish between construct tokens and construct types. Specifically, I will use the term ‘construct type’ for fully specified expressions, and ‘construct tokens’ for empirically attested instances of these expressions. Note that construct types in this sense are not the same as micro-constructions: even though Traugott’s above examples of micro-constructions (a lot of vs. a bit of) appear to be fully specified in lexis, a constitutive part of these constructions (i.e. the final nominal slot) is actually left out. Put differently, these micro-constructions should rather be represented as a lot of NP vs. a bit of NP, and these in turn subsume a number of different construct types. For instance, in the partitive reading, their most common construct types in the BNC are a lot of people (with 1220 empirically attested construct tokens) and a bit of time (101 construct tokens). For an approach that seeks to differentiate (potentially) competing constructions through their respective lexicalisation preferences, it is important to separate the notion of a lexically specified construct type from the actual tokens of this type in a corpus. With these terminological provisions made, it is now time to turn to the semantic side of things and introduce the analytical toolkit that is required on the meaning pole.
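The distinction between construct types and construct tokens drawn above can be made concrete with a small counting sketch. The following Python fragment is only an illustration under stated assumptions: the toy sentences are invented (they are not BNC data), the function name `construct_profile` is hypothetical, and the open nominal slot is crudely approximated by the single word that follows the lexically fixed part of the micro-construction.

```python
from collections import Counter
import re

def construct_profile(sentences, frame):
    """Count construct tokens per construct type for one micro-construction.

    `frame` is the lexically fixed part (e.g. "a lot of"); the open slot is
    approximated by the single word that follows it (a simplifying assumption).
    """
    pattern = re.compile(r"\b" + re.escape(frame) + r"\s+(\w+)", re.IGNORECASE)
    counts = Counter()
    for sentence in sentences:
        for filler in pattern.findall(sentence):
            # One construct type = frame + specific filler, e.g. "a lot of people".
            counts[f"{frame} {filler.lower()}"] += 1
    return counts

# Invented toy data for illustration only.
toy_corpus = [
    "A lot of people came.",
    "She knows a lot of people.",
    "It took a lot of time.",
    "Give me a bit of time.",
]

profile = construct_profile(toy_corpus, "a lot of")
print(len(profile), "construct types;", sum(profile.values()), "construct tokens")
for construct_type, n_tokens in profile.most_common():
    print(construct_type, n_tokens)
```

Each key of the returned counter corresponds to a construct type, and its value to the number of construct tokens attested for that type, which is the kind of separation that a comparison of lexicalisation preferences across constructions relies on.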
2.3.2. Cognitive semantics
Just like construction-based approaches to grammar, the field of cognitive semantics comprises a wide range of diverse research that cannot be done justice here in its entirety. Following some brief introductory remarks on how the individual aspects are connected, I will therefore concentrate on just a few of the most elementary points that will be relevant for my study. These include the notions of domains and profiling, embodiment, image schemas and conceptual mappings. In cognitive semantics, meaning is equated with conceptual structure. More specifically, it is assumed that meaning is constructed dynamically in processes of conceptualisation (an operation also known as ‘construal’). Meaning construction takes place in so-called ‘mental spaces’ (Fauconnier 1985), ephemeral yet often complex conceptual structures that are set up during thinking and speaking. In conceptualisation processes, conceptual content is recruited into these spaces from encyclopaedic knowledge structures variously known as ‘frames’ (Fillmore 1982), ‘idealised cognitive models’ (ICMs, Lakoff 1987) or ‘domains’ (Langacker 1987). Individual concepts are ‘profiled’ against the ‘base’ of one or several such domains. Abstract concepts/domains (e.g. TIME) are assumed to be ‘grounded’ in more concrete, perceptually palpable aspects of sensorimotor experience (e.g. MOTION through SPACE) due to the embodied nature of cognition. The link between ‘source’ and ‘target domains’ in such pairings is provided by ‘conceptual mappings’, i.e. metaphorical projections and metonymic shifts between and/or within domains (Fauconnier 1997; Lakoff and Johnson 1980, 1999). Complex conceptualisations involving potentially many such mappings are referred to as the products of ‘conceptual blending’ (Fauconnier and Turner 2002), a mechanism which allows conceptualisers to construct complex interactions between elements of different mental spaces.
Beginning with the notions of ‘domains’ and ‘profiling’, one of the most fundamental tenets of cognitive semantics is that meaning is encyclopaedic, i.e. that there is no principled distinction between linguistic semantics and general world knowledge. Words are seen as providing context-sensitive points of access to a vast network of background knowledge, rather than as encapsulating discrete packages of invariant conceptual content. Specifically, words are assumed to ‘profile’ a particular concept against a ‘base’ of presupposed background knowledge. As a result, their meaning resides in the relationship between the profile and the respective base, rather than in either of the two elements in isolation (Langacker 1987). Langacker’s illustration of the distinction between profile and base centres around the word hypotenuse, which profiles a particular line segment against the base of the concept TRIANGLE. Langacker argues that the meaning of hypotenuse cannot be understood without recourse to the notion of a triangle: a hypotenuse is neither just any line segment nor is it a triangle itself – rather, it is a particular line segment that can only be understood in the context of triangles. As indicated, there are several different terms in the cognitive semantics literature for what was called the ‘base’ of a conceptualisation in the preceding paragraph: Langacker (1987) uses ‘base’ or ‘domain’, a term that is also used by Lakoff (1987). However, Lakoff, too, uses more than one term and also speaks of ‘idealised cognitive models’, or ‘ICMs’. Fillmore (1982) uses the term ‘frame’ to denote the same thing. The present study employs the terms domain and frame (which will be used interchangeably below). What is important to note about these structures is that they are not qualitatively different from the concepts that are profiled against them: what serves as the domain in one characterisation can be the profile in another. For instance, while TRIANGLE is the domain against which HYPOTENUSE is profiled, the concept TRIANGLE itself will require such concepts as SHAPE or GEOMETRY to be meaningful, and these in turn are characterised against the domain SPACE. In virtue of such chaining, a concept is commonly profiled against a number of domains at the same time, and these are said to constitute the ‘domain matrix’ for its characterisation. At some point in such chains, however, it will become difficult to characterise a concept in terms of another one that is being presupposed, like when one has arrived at a domain as fundamental as SPACE.
Such domains, which relate directly to aspects of embodied human experience, are called ‘basic domains’ and hence contrasted with non-basic ‘abstract domains’ (Langacker 1987: 148). Cognitive semantics assumes that meaning is embodied. This is to say that the human capacities for reasoning and imagination are seen as inextricably bound up with our situated bodily experience of the world, a claim that has come to be known as the ‘embodied cognition’ hypothesis (Barsalou 2005; Gibbs 2006; Johnson 1987; Lakoff and Johnson 1999; Poirier et al. 2005). According to the embodiment thesis, humans conceptualise (and subsequently verbalise) particular states of affairs in the world in ways which directly reflect the physiological make-up (and limitations) of their bodies and certain prominent sensory-motor interactions with the environment that these capacities afford. Put differently, both what we can perceive and conceive in the first place and how we do it is taken to be determined by the way in which the world manifests to us as specifically human cognisers, i.e. to a cognitive system that is embedded within a specific biological and social environment. It is of course specifically the aspect of how we conceive of things that is of interest here. According to the embodiment thesis, our conceptual system is structured by a number of basic experiential gestalts that are bodily grounded. A widely used term for these gestalts is Johnson’s (1987) notion of ‘image schemas’, characterised as “the recurrent patterns of our sensory-motor experience by means of which we can make sense of that experience and reason about it, and that can also be recruited to structure abstract concepts and to carry out inferences about abstract domains of thought” (Johnson 2005: 18-19). Understood as “dynamic analog representations of spatial relations and movements in space” (Gibbs 2006: 90), such image schemas are assumed to structure abstract thought via conceptual projection of skeletal structural blueprints from concrete, ‘basic’, ‘imagistic’ domains (Johnson 1987; Lakoff 1987). The image schema literature (e.g. Cienki 1997; Clausner and Croft 1999; Gibbs and Colston 1995; Hampe 2005) has produced several different inventories of these patterns; a definitive list does not exist. Schemas that appear to be generally accepted include the following (cf. Hampe 2005: 2):
– PART-WHOLE
– CONTAINMENT
– PATH
– UP-DOWN
– FRONT-BACK
– LEFT-RIGHT
– CENTER-PERIPHERY
– NEAR-FAR
– CONTACT
– COMPULSION
– ATTRACTION
– BLOCKAGE
Johnson (2005: 20) comments on the ‘basic’ character of the experiences reflected in these schemas as follows:
For example, given the relative bilateral symmetry of our bodies, we have an intimate acquaintance with right-left symmetry. As Mark Turner (1991)
observes, if we were non-symmetric creatures floating in a liquid medium with no up or down, no right or left, no front or back, the meaning of our bodily experience would be quite different from the ways we actually do make sense of things. Because of our particular embodiment, we project right and left, front and back, near and far, throughout the horizon of our perceptual interactions.
Image schemas are also considered ‘basic’ from a developmental perspective, in the sense that infants are claimed to structure the continuous and otherwise chaotic flow of their perceptions with the help of such schemas from early on (Mandler 1992). Given that verbalisations of abstract notions and relations commonly recruit vocabulary from more concrete domains, it has been argued that language provides particularly valuable clues to the image-schematic structuring of cognition: “since communication is based on the same conceptual system that we use in thinking and acting, language is an important source of evidence for what that system is like” (Lakoff and Johnson 1980: 3). As it stands, this line of reasoning might seem circular (linguistic facts are explained by invoking putative image schemas, which are in turn claimed to be evidenced by these facts). However, there is also independent evidence for the image-schematic structuring of cognition that does not involve language (e.g. Casasanto and Boroditsky 2008). The ‘conceptual mappings’ involved in the transfer of image-schematic structure from concrete to abstract domains have been a major focus of attention in cognitive linguistic research for many years. Much of this work goes back to conceptual metaphor theory as laid out in Lakoff and Johnson (1980; cf. also Lakoff 1990, 1993; Lakoff and Johnson 1999). The central tenet of the approach is that metaphors are not ‘figures of speech’, i.e. essentially linguistic in nature, but rather a fundamental structuring principle of cognition itself. This is to say that many different metaphorical expressions in fact draw on a single underlying mapping between a ‘source domain’ and a ‘target domain’ that operates on the conceptual plane. 
Standardly notated in the format TARGET DOMAIN IS SOURCE DOMAIN, it is assumed that such conceptual metaphors allow humans to apprehend abstract concepts in terms of simpler concrete ones, as in the following example of MORE IS UP (Lakoff and Johnson 1980: 15):
(7) a. The number of books printed each year keeps going up.
b. His draft number is high.
c. My income rose last year.
d. The number of errors he made is incredibly low.
e. If you’re too hot, turn the heat down.
The second major mechanism at work in conceptual mapping is metonymy, which involves shifts within (rather than projections between) domains. In the classical account of Jakobson (1956), this was captured in the characterisation of metonymy as a relation of ‘contiguity’. Langacker (1999) interprets this contiguity in terms of ‘cognitive access’: in metonymy, the designated entity provides a cognitive ‘reference point’ from which the actually intended target is accessed. Whereas metonymy did not attract as much attention as metaphor in the early years of cognitive semantics, interest in metonymy grew in the late 1990s (cf. Kövecses and Radden 1998; Panther and Thornburg 2003; Radden and Panther 1999), notably in view of claims that conceptual metaphor is itself based on more elementary metonymic processes (see various papers in Barcelona 2003a). The argument here is that the cross-domain projections in metaphors such as ANGER IS HEAT or SADNESS IS DOWN can also be analysed as experientially grounded reference point phenomena (increased blood pressure and body temperature, redness of face etc. in ANGER IS HEAT; drooping posture and general lack of agitation in SADNESS IS DOWN) in which an underlying cause is accessed through its observable effects. In reconstructing the conceptual mechanisms that have given rise to a particular semantic shift, it is therefore not always easy to decide between metaphorical and metonymic explanations, and indeed what is seen as a metaphor by some authors is sometimes analysed as an instance of metonymy by others (cf. e.g. Taylor 1995 vs. Barcelona 2003b on the conceptual motivation of loud colour). Irrespective of the grounding issue and hence a possible primacy of one mechanism over the other, it is generally acknowledged that metaphorical and metonymic processes are often tightly interwoven within one and the same conceptualisation (Goossens 1990).
Thirty years of cognitive linguistic research have not only brought a number of revisions and refinements of these ideas, but have in fact also seen the development of an integrated and in some sense complementary alternative to Lakoff and Johnson’s (1980) theory of conceptual mappings – the framework known as conceptual blending theory (Fauconnier and Turner 2002). Blending theory has developed from mental spaces theory (Fauconnier 1985) and seeks to provide a more general model of meaning construction than that provided by earlier theories of conceptual metaphor and metonymy. Specifically, proponents of blending theory have emphasised the need to account for ‘emergent structure’ in conceptualisation processes, i.e. implications that have no counterpart in the respective source domain(s) and hence cannot be interpreted as ‘mapped’ from there. While
the particulars of these proposals need not concern us here, it should be noted that the identification of expressions as ‘metaphorical’ or ‘metonymic’ in later chapters is not meant to imply that the two notions are mutually exclusive and cannot be involved within one and the same construal.
2.3.3. Models of constructional generalisation
Even in usage-based construction grammar with its heavy emphasis on item-specificity and learning, there is no denying that competent speakers of a language are creative in the Chomskyan sense, i.e. capable of producing an infinite set of utterances that they have never heard before. The reason is that speakers/learners do not only memorise experienced bits of discourse, but certainly also generalise beyond the structures that they have directly experienced in the input. In cognitive linguistics, these generalisations are assumed to involve domain-general categorisation principles and skills that also underlie non-linguistic cognition. While there is no dispute over the fact that speakers/learners do extract such generalisations, opinions are divided about how these generalisations are best thought of. For instance, do speakers construct cognitively permanent abstractions that have independent psychological reality, i.e. detached from their concrete instantiations? Or are their generalisations computed ‘on the fly’ over sets of memorised instances of a particular structure? And what exactly is it that is memorised in the first place? Over the last decades, cognitive psychologists have developed a number of different categorisation theories that have strongly influenced the way in which cognitive linguists think about these questions. In the present section, I will briefly introduce two of these models that have been particularly influential in this respect, and then consider a hybrid position that seeks to integrate aspects of both accounts. For more extensive discussion, cf. Estes (1994) for an in-depth treatment of classification/categorisation models in cognitive psychology and Chandler (2002) for discussion from a psycholinguistic point of view. The first approach to mention is that of prototype theories of categorisation, which go back to a series of experimental findings from the late 1960s and early 1970s.
The results of these experiments had challenged the so-called ‘classical’ conception of categorisation (so called because it dates back to Aristotle) according to which concepts are defined by sets of necessary and sufficient conditions (Posner and Keele 1968; Rosch 1973, 1975; Rosch and Mervis 1975). Specifically, these results suggested that
– membership in a conceptual category is graded
– categories have fuzzy boundaries
– members of a category cohere on grounds of family resemblance
This is to say that particular instances may be ‘better examples’ of a category than others, categories shade off into one another at the edges, and the members of a category need not share even a single defining feature, each in itself an assumption that is at odds with the ‘classical’ position. These findings led to the development of prototype theory, a model that had a strong influence on cognitive linguistics where it was applied to a wide range of linguistic phenomena (Taylor 1995). Paradoxically, though, the theory comes in two different versions that are founded on contrasting interpretations of the central notion of prototypes: in one understanding of the term, the prototype of a category is the mental representation of the ‘best exemplar’ of this category, as elicited in the original experiments by Rosch and colleagues. On this account, categorisation consists in comparing a target/potential instance of a category to the stored central member/prototype of this category, and it is in this sense (matching to a concrete stored unit) that Langacker (1987: 371) uses the term ‘categorization by prototype’. However, in what is probably the more influential version of the theory, the category prototype is not a particular stored exemplar representation, but an abstract conjunction of features that have been extracted from experienced exemplars. It is of course possible that there exists a particular entity in the world which happens to possess each (or the relatively highest number) of these prototypical features, but this is not in fact necessary, thereby allowing for prototypes that may not correspond to any single actual exemplar of their category. Crucially, this latter version assumes that what people store and retrieve in categorisation are summarised abstractions rather than concrete exemplars.
On a prototype account, the overall usage profile of lexically restricted constructions such as [go ADJ] would therefore be assumed to be organised around one or more specific feature bundles. For instance, the fact that the construction often occurs with colour adjectives would be assumed to prompt speakers’ extraction of a semantic feature [+COLOUR] as an abstraction over such instances as to go black/white/red etc. The features involved in such categorisations (i.e. those defining a local prototype in the network) may also be much more specific – for instance in the case of expressions like to go mad/crazy/ballistic etc. On such an account, schema formation is assumed to proceed through semantic feature abstraction, and unprincipled gaps within the coverage of the schemas thus defined are not predicted. The opposite view is taken in radical exemplar models (Medin and Schaffer 1978): in such approaches, only the concrete instances are stored, i.e. people are not assumed to form and employ cognitively permanent schematic abstractions over these instances at all. The difference from the exemplar interpretation of prototype theory hence resides in the fact that stored prototype categories are discarded altogether: rather than comparing targets against the central instance of some pre-packaged category, individual stored exemplars are allocated to analogical sets that are created on the fly subject to flexible context-specific requirements. Exemplar models are compatible with the empirical findings that motivated the original formulation of prototype theory: by summing over the properties of the different exemplars in an analogical set, prototype effects such as graded membership judgments arise, but they are not assumed to reflect properties of the underlying system of knowledge representation. At the same time, exemplar-based approaches are superior to prototype models in that they can explain effects which the latter cannot account for: for instance, ad-hoc categories such as THINGS TO SELL IN A GARAGE SALE exhibit the same kind of prototype effects as less context-specifically defined ones (Barsalou 1983), even though these effects are not plausibly attributed to a stored prototype. Furthermore, exemplar models can readily account for the occurrence of so-called ‘gang effects’ in which an item is influenced by the behaviour of a cluster of ‘nearby’ (i.e. highly similar) other items even though this cluster is in fact an outlier within the overall category (Chandler 2002: 64). This suggests that similarity to individual stored exemplars rather than similarity to summarised abstractions accounts for the observed categorisation behaviours (Medin and Schaffer 1978).
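The contrast between the feature-abstraction version of prototype theory and exemplar models can be illustrated with a deliberately small sketch. Everything in it is an assumption made for illustration: the binary feature vectors are invented, the prototype is taken to be the per-feature majority value of a category, and the exemplar model uses a multiplicative similarity rule in the spirit of Medin and Schaffer's (1978) context model, simplified to a single mismatch parameter `s`. It is not meant as a faithful implementation of any published model.

```python
from math import prod

def prototype(exemplars):
    """Summarise a category as its per-feature majority value (ties -> 0)."""
    n = len(exemplars)
    return tuple(1 if 2 * sum(col) > n else 0 for col in zip(*exemplars))

def classify_by_prototype(probe, categories):
    """Pick the category whose abstracted prototype shares most features with the probe."""
    def matches(label):
        return sum(p == q for p, q in zip(probe, prototype(categories[label])))
    return max(categories, key=matches)

def similarity(probe, exemplar, s=0.2):
    """Multiplicative similarity: every mismatching feature scales the value by s (0 < s < 1)."""
    return prod(1.0 if p == q else s for p, q in zip(probe, exemplar))

def exemplar_probabilities(probe, categories, s=0.2):
    """Summed similarity to each category's stored exemplars, normalised to probabilities."""
    totals = {label: sum(similarity(probe, ex, s) for ex in exs)
              for label, exs in categories.items()}
    z = sum(totals.values())
    return {label: t / z for label, t in totals.items()}

# Invented data: category A contains a small outlying 'gang' (its last two members).
categories = {
    "A": [(1, 1, 1, 1, 1), (1, 1, 1, 1, 0), (1, 1, 1, 0, 1),
          (0, 0, 1, 1, 1), (0, 1, 0, 1, 1)],
    "B": [(0, 0, 0, 0, 0), (1, 0, 0, 0, 0), (0, 1, 0, 0, 0),
          (0, 0, 1, 0, 0), (0, 0, 0, 0, 0)],
}
probe = (0, 0, 0, 1, 1)  # a novel item close to the outlying cluster in A

probs = exemplar_probabilities(probe, categories)
print("prototype model:", classify_by_prototype(probe, categories))
print("exemplar model: ", max(probs, key=probs.get), probs)
```

In this toy setting the two models disagree: the probe shares more features with the abstracted summary of B, whereas the summed similarity to individual stored items favours A because of the two nearby outliers. This is one way of picturing how a gang effect can arise without any stored abstraction.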
On an exemplar-based approach, productively assembled expressions are modelled directly on one or more previously encountered instance(s) of the relevant construction(s). Since general semantic features are not extracted, it is not expected that attested extensions will necessarily cover a given region of semantic space exhaustively. However, even such strictly analogical accounts still have to specify which particular candidate items are (more readily) available for extension and which are not, and why this should be so. In spite of their popularity in psychology, exemplar-based models of categorisation have until recently not played an important role in linguistics (with the exception of phonology – cf. Pierrehumbert 2001, 2003; for another notable exception, cf. Skousen 1989, 1992; Skousen et al. 2002). However, many proponents of contemporary constructionist theories argue for a hybrid account that integrates aspects of both schema and exemplar models (Abbot-Smith and Tomasello 2006; Bybee and Eddington 2006; Goldberg 2006; Langacker 1999). The general idea is to have a system in which memory for specific linguistic exemplars (construction tokens) is retained and plays an important role in processing, whilst also allowing for the possibility of capturing abstractions/schemas over these exemplars. The difference is smaller than it might appear at first glance: if linguistic abstractions are conceptualised as emergent generalisations that are merely implicit in a set of concrete stored instances, they can have a status in the system even though they do not have independent existence as such, i.e. detached from their concrete ‘carrier structures’ (Langacker 1999: 97). Dąbrowska (2000: 94-95) characterises the process of schema abstraction as follows:
The schema is the part of the representation which is shared by several formulas, it is already implicit in the first formula acquired by the learner, once the latter is analyzed into its component units. As new formulas are added to the learner’s repertoire, the schema becomes more and more entrenched, and eventually becomes a symbolic unit in its own right, rather than merely a component of a larger unit. When this happens, the schema can be used to assemble novel utterances.
If the term ‘schema’ is therefore no more than a convenient shorthand for the shared properties of a set of stored exemplars, the question whether such schemas do or do not ‘exist’ reduces to whether particular analogical sets may be stored/have cognitive permanence rather than be computed anew each time a relevantly similar token is encountered. Bybee and Eddington (2006: 326) argue that linguistic categories are plausible candidates for such prestructured categorisations in principle: seeing that they are used with extremely high frequency in everyday cognitive processing, linguistic categories and patterns of such categories (constructions) are likely to have psychological reality in long-term memory. However, as already pointed out in the preceding section, specifically which of these abstractions will actually attain the status of a permanent schema of course remains an open question. Given this understanding of the term ‘schema’ on the one hand and an entrenchment-sensitive model of analogy on the other, Langacker (1999: 145) notes that usage-based and purely analogical models of linguistic categorisation are virtually indistinguishable. Nevertheless, it should be
noted that “even the learning of specific expressions (required as the basis for analogy) involves abstraction and schematization from actual usage events” (Langacker 1999: 144), a point taken up in Goldberg’s (2006: 46) observation that exemplars are always somewhat abstract due to selective encoding (not all attributes of a given stimulus are attended to) and memory limitations (memory for specific attributes may fade over time). Even though the question of how to distinguish on-line analogies from permanent schemas (which may in themselves be more or less cognitively salient/strongly represented) remains an interesting issue for psycholinguists, it can thus be concluded that for present concerns, the assumptions and predictions associated with purely item-based interpretations of constructional generalisation on the one hand and hybrid schema-plus-exemplar approaches on the other are essentially identical. However, an important implication for the research question formulated in section 2 is that schematisation as conceived here is not so much a process that affects speakers’ mental representations of individual ‘fixed expressions’ in isolation. Put differently, it would be misleading to think of schematisation as a process that applies to a particular concrete formula that initially admits of no variation at all, and which then transforms this string into a template-like structure with an increasingly schematic ‘open slot’. Rather, synchronic snapshots that are taken at a specific point in time will virtually always reveal a (more or less extensive) range of variants that cluster around the respective central collocation, which is to say that the category in which the schema is implicit is already more or less complex from the outset. 
What is ‘fixed’ about the expressions investigated in this study is therefore that they are pre-assembled holistic units that are not assembled from scratch, and not necessarily that they are also frozen in the sense of ‘not tolerating any lexical substitution’ whatsoever. The questions to be investigated regarding these expressions are: what is the range of attested synchronic variation for a given target? What kinds of consistent semantic patterns/schemas can be traced in the data for this target? And what is it that makes (only) some of these patterns/schemas productive (i.e. available for inspiring novel coinages of the relevant type)?
2.4.
Previous research
Recent years have seen a strong surge of interest in usage-based approaches to language in general as well as in studies concerned with the emergence
and subsequent development of constructional generalisations in particular. For the following review, it will be useful to group research that has informed my approach into three categories: studies of construction processing and learning (reviewed in section 2.4.1), studies of constructional change (2.4.2) and studies of synchronic constructional variation (2.4.3). Previous research that pertains most closely to the objectives and operationalisations of my own case studies will be reported directly in the ‘Prerequisites’ sections of chapters 4 to 6.
2.4.1.
Insights from research on construction learning
In section 2.1, the object of investigation was characterised as the schematisation of fixed expressions and their rise to productivity. Maybe the first perspective on such phenomena that comes to mind is to frame them as learning issues for the individual speaker: how do speakers’ mental representations of the relevant structures change over time? The topic therefore has close connections with research on construction learning, be it in child language acquisition or in foreign language and novel construction learning in adults: in either case, learners are faced with the same task of developing generalised grammatical constructions from lexically specific constructs. The usage-based theory of (first) language acquisition builds on the assumption that language is learnable from the input without a domain-specific innate ‘Universal Grammar’ (Chomsky 1965) that constrains learners’ grammatical hypothesis formation process. In classical statements of the approach (Tomasello 2003), acquisition is assumed to be slow, in the sense that early grammatical generalisations are piecemeal and item-based rather than sweeping and across-the-board (as predicted by ‘parameter-setting’ accounts). Even though more recent work in this tradition acknowledges the possibility that children may form at least ‘weak’ general syntactic abstractions much earlier on in development than previously assumed (Dittmar et al. 2008), the characteristic property of the usage-based approach remains its strongly item-based, bottom-up orientation. It is therefore no surprise that the determinants of linguistic schematisation/generalisation processes are a topic of central concern for usage-based acquisition studies: the stated aim of one of the most influential recent monographs in the field (Goldberg 2006) is “to investigate the nature of generalization in language: both in adults’ knowledge of language and in the child’s learning of language” (p. 3) – an objective that is hailed by another leading theorist in the field as “exactly what linguists should have
been doing all along – studying the nature of generalizations that real speakers make and trying to establish how children come to these generalizations from the specific material in the input” (Bybee 2006: 692). On the other hand, not much is known about the particulars of this process yet, and it is not uncommon for empirical studies of the issue to close with the conclusion that “[m]ore research into the factors which influence the formation and strengthening of syntactic schemas … is sorely needed” (Abbot-Smith et al. 2004: 54). The following review will cover selected findings from research on (first) language acquisition that has a bearing on the topic. More comprehensive coverage is provided by review articles like Ellis (2002), Diessel (2007) and Boyd and Goldberg (2009). To begin with, usage-based models of construction learning assume that learners retain memory for concrete constructional exemplars and re-use these representations for later linguistic categorisations. This idea is not new. The suggestion that children’s early language is built around an inventory of memorised ‘slot-and-filler’ patterns predates contemporary usage-based acquisition research by several decades: in a seminal monograph, Braine (1976) pointed out that children tend to use structures consisting of elements that are invariant across a great many utterances in combination with elements that are drawn from much more open sets (e.g. more + X, allgone + X). Since the idea of repetition and re-use is crucial for the overall approach, my review will first cover empirical evidence in its favour (from studies of child and adult language processing) before moving on to construction learning proper. Bod (2001) found that (adult) subjects were quicker to verify three-word sentences that are frequent in the British National Corpus (e.g. I like it) as correct English sentences (in contrast to ungrammatical controls such as I it like) than they were for isomorphic less frequent strings that were controlled for lexical frequency and phrasal plausibility (e.g. I keep it). Bod concludes that high frequency strings such as I like it can be accessed directly because they are stored in memory. Using the same experimental design, Arnon and Snider (2010) replicated the effect for four-word sequences (e.g. don’t have to worry vs. don’t have to wait) with controlled lexical and phrasal sub-string frequencies. Bannard and Matthews (2008) found the same for children, who were both more successful and faster at correctly repeating high frequency four-word sentences like Sit in your chair than structurally identical low frequency variants like Sit in your truck (with lexical frequencies and substring frequencies held constant). Zeschel (2008) found that such collocational chunks also affect syntactic
processing in adults: subjects were more strongly garden-pathed by V-NP collocations (e.g. prove one’s worth) that suggested a particular syntactic structure (e.g. Ronaldo will prove his worth for the team has been downplayed by media) than by semantically equivalent non-collocations (prove one’s value). Processing difficulty was highest if a collocation was embedded in a context that provided additional syntagmatic cues to the chunk reading with its associated parse (e.g. Ronaldo will get a chance to prove his worth for the team has been downplayed by the media). Using judgments rather than processing latencies as the dependent measure, Dąbrowska (2008) found similar item effects for more vs. less frequent variants of wh-questions with long distance dependencies. She shows that targets that match the corpus-derived high frequency template WH do you think S-GAP (e.g. What do you think they decided to do when they got home?) are judged as more acceptable by adult speakers than wh-questions which deviate from this prototype in various respects (e.g. What would Claire claim that Eve said they know about the whole affair?). The same item effects are also found in children’s repetitions of such questions (Dąbrowska, Rowland, and Theakston 2009), once more indicating that partially lexically specific templates for these constructions are stored and then re-used in later processing. Moreover, learners not only keep track of how often they have heard particular words or holophrastic structures at large, but also of the larger structural environments in which these units occur: Kidd, Lieven, and Tomasello (2006) showed that children’s ability to correctly repeat sentences with verbs occurring in a particular construction increased with the relative frequency of the verb’s occurrence in this construction. Similarly, children (Theakston 2004) and adults (Ambridge et al. 2008) were found to rate causativising argument structure violations of low frequency verbs (e.g. *Somebody tumbled it off) as more acceptable than the same type of violation for semantically matched high frequency verbs (e.g. *Somebody fell it off). Again, this result suggests that the representations that are invoked for these judgments have been abstracted from experience, with confidence decreasing where input evidence is sparse. Taken together, these findings point to three related conclusions: first, even if they are fully regular/predictable, speakers/learners commit complex high frequency strings to memory as units. Second, speakers implicitly keep track of how often they have encountered particular variants of such units. And third, both kinds of knowledge influence real-time processing and judgment tasks (in adults and children alike).
Even where aspects of subjects’ productions are the dependent measure (e.g. repetition accuracy), the results of the studies reported so far all focus on comprehension in a sense (i.e. the question of what speakers store, and how matching to stored representations can speed up their processing and lead to fewer errors in subsequent productions). For the present study, a further crucial issue is how these memorised structures may influence speakers’ own spontaneous productions at a later point (i.e. beyond immediate repetition). To begin with child language learners (adult variation data is discussed in section 2.4.3), it is maybe not surprising that many studies have found close correspondences between learner input and output: frequent verbs in caretaker speech are not only acquired early (Naigles and Hoff-Ginsberg 1998) but also found among children’s own top frequent verbs (Theakston et al. 2001). Studies of spontaneous production furthermore support the idea that the stored structures are in fact holophrastic chunks rather than individual words: Rowland and Pine (2000) and Rowland (2007) found that in spontaneous wh-question formation, children produce more inversion errors (e.g. *what you can do?) for pronoun/auxiliary-combinations that are rare in the input (e.g. what can…) than for more frequent combinations (e.g. what do…), presumably because the correctly inverted combinations are stored as chunks. Third, comparative corpus studies of child language input and output also provide evidence that child speech reflects lexical-constructional associations in caretaker speech: Goldberg, Casenhiser, and Sethuraman (2004) found that children’s top frequent verbs in three English argument structure constructions mirrored parental use of these verbs in the investigated constructions. 
In a follow-up experiment, Casenhiser and Goldberg (2005) show that the Zipfian distribution of constructional exemplars that they found in the earlier study is facilitative for construction learning: in this experiment, children were better at acquiring a novel pairing of phrasal form and meaning if one of the verbs in the training input had a markedly higher token frequency in the target construction than the others. Goldberg and colleagues assume that the existence of a highly entrenched central instance of the construction is helpful for modelling the meaning of the construction itself on the meaning of its central exemplar: “the child categorizes learned instances into more abstract patterns, associating a semantic category with a particular formal pattern; the meaning of the most frequent and early verbs occurring in a particular pattern form the prototype of the category” (Goldberg 1999: 208).
With these many different indications for the importance of item-based knowledge in language processing and learning, usage-based acquisition studies have amassed considerable evidence that beginning learners are conservative in many respects (although they can also be trained to become productive with novel verb-construction combinations if the construction is explicitly trained prior to testing – cf. Childers and Tomasello 2001; Abbot-Smith, Lieven, and Tomasello 2004). So how about learners’ creative variations of these memorised structures? Lieven et al. (2003) showed that in a dense corpus of child language, as many as 74% of the 2-year-old target child’s novel multi-word utterances could be reconstructed as simple variations of an earlier utterance in the corpus (i.e. an already existing exemplar in the system) involving only a single type of modification operation (e.g. ‘substitute’, ‘insert’ etc.). Following up on this study, Lieven and colleagues developed a method called ‘traceback’ that matches children’s multiword utterances in dense corpora to (component units of) earlier productions and schemas derived from them (Dąbrowska and Lieven 2005; Bannard and Lieven 2009; Lieven, Salomo, and Tomasello 2009). Schematic slots are postulated wherever two strings match except for one component unit in the same position that designates an entity from the same general semantic category in both cases (e.g. I sit on my Mummy’s bike and I sit there would be identified as instances of a schema I sit LOCATION). In its concern with schema formation from lexically specific strings, the aims of this research are closely related to the objectives of the present investigation. On closer inspection, though, the difference is not just that one approach is about child language acquisition and the other about adult language variation: the main concern of Lieven and colleagues is to uncover relations between constructions in child language and to track how these develop over time.
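The slot-postulation criterion just described can be made concrete with a small sketch. It is only an illustration of the matching rule stated above, not the traceback procedure actually implemented in the cited studies: it assumes that utterances have already been segmented into component units, and the toy lexicon `SEMANTIC_CLASS` and the function name `propose_schema` are hypothetical.

```python
# Hypothetical toy lexicon mapping component units to broad semantic classes.
SEMANTIC_CLASS = {
    "on my Mummy's bike": "LOCATION",
    "there": "LOCATION",
    "it": "REFERENT",
}


def propose_schema(units_a, units_b, lexicon=SEMANTIC_CLASS):
    """Return a schema with one open slot, or None if the criterion fails.

    Both arguments are lists of pre-segmented component units. A slot is
    postulated only if the two lists differ in exactly one position and the
    differing units belong to the same semantic class in the lexicon.
    """
    if len(units_a) != len(units_b):
        return None
    diffs = [i for i, (a, b) in enumerate(zip(units_a, units_b)) if a != b]
    if len(diffs) != 1:
        return None
    slot = diffs[0]
    class_a = lexicon.get(units_a[slot])
    class_b = lexicon.get(units_b[slot])
    if class_a is None or class_a != class_b:
        return None
    return units_a[:slot] + [class_a] + units_a[slot + 1:]


schema = propose_schema(["I", "sit", "on my Mummy's bike"], ["I", "sit", "there"])
print(schema)  # ['I', 'sit', 'LOCATION']
```

With a lexicon as coarse as this one, any two fillers of the same broad class license a slot, so the granularity of the resulting schemas depends entirely on how fine-grained the assumed semantic classes are.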
Because of its focus on relations between many different constructions, the semantic categories employed are maximally broad ones such as REFERENT, PROCESS or ATTRIBUTE – a categorisation that is much too schematic for present purposes. For instance, all instances of the aspectual [go ADJ] construction exemplified at the beginning of this chapter would be identified as instances of an overly general schema go + ATTRIBUTE, which is not what speakers seem to work with when categorising relevant constructs. Other acquisition studies have looked into more fine-grained semantic distinctions regarding the elements that can go into a given constructional slot. Typically, this research is concerned with the question whether incipient constructional productivity in children involves analogies based on
form and meaning or rather analogies based on form alone. From the perspective of usage-based construction grammar with its emphasis on form-meaning pairings, it seems most natural to expect that children’s early schematisations take both formal and semantic properties into account: assuming that children start out from sets of item-specific ‘verb islands’ (Tomasello 1992) or ‘mini-grammars’ (Morris et al. 2000) such as KICKER-kick-KICKEE, HITTER-hit-HITTEE and KNOWER-know-KNOWEE, one would expect their earliest abstractions beyond the item-phase to be low-level schemas over semantically similar items (e.g. kick and hit) that are only later connected to ever less similar items (know) until they have arrived at a fully schematic, adult-like construction (Goldberg 1999; Tomasello 2000b; cf. also Pine, Lieven, and Rowland 1998 for discussion). However, empirical evidence suggests that semantics may not play a crucial role in the process after all. Abbot-Smith, Lieven, and Tomasello (2004) showed that training children with transitive uses of CAUSED MOTION-verbs made subjects productive with transitive uses of novel verbs from two unrelated semantic classes (LIGHT EMISSION and SOUND EMISSION). Similarly, in a longitudinal study of naturally occurring early verb-object constructions in 20 children, Ninio (2005) found that children’s novel uses of the pattern did not typically involve direct objects that were semantically similar to their earlier uses of the pattern (in terms of the object argument’s semantic role).
But whatever the role of semantics in children’s analogical extensions of argument structure constructions may be, the issue again has only limited implications for the present study: even though a categorisation of verb participants in terms of semantic roles like PATIENT, THEME or EFFECTED OBJECT is certainly more specific than simply classifying the verbal complement as a REFERENT, this level is still too coarse-grained when it comes to the kinds of collocational restrictions that are at issue here. For instance, the fact that take a guess is an institutionalised expression of English whereas ?take a speculation is not (nor, for that matter, seize a guess) has nothing to do with the semantic role of the object noun in these expressions. As discussed in section 2.3.1 above, construction grammar assumes that there is no sharp dividing line between idiomatically restricted and fully generalised constructions. Still, also in construction-based approaches to language acquisition, the way in which children acquire the phraseology of their first language is a much less investigated topic than the acquisition of ‘core’ syntax. One last approach to be mentioned in this section is Johnson’s (1999) theory of ‘constructional grounding’ (see also Grady and Johnson 2003;
Schmidtke-Bode 2009), since it is explicitly concerned with the semantic extension processes that are involved in modelling abstract uses of a construction on more concrete (i.e. semantically more tangible) earlier uses of the same structure. Johnson argues that ‘interpretational overlap’ between distinct uses of a construction helps learners to bootstrap abstract ‘pragmatic’ uses of a construction (to do with the speech act scenario and interlocutors’ construal of the designated state of affairs) from more concrete/‘perceptually transparent’ source uses of the pattern (to do with properties of the designated external situation) in a way that directly mirrors principles of diachronic semantic change. For instance, the conventionalised ‘incongruity’ implication of the specialised ‘What’s X doing Y’ construction (Kay and Fillmore 1999; e.g. What are your feet doing on the table?) is claimed to arise through pragmatic reanalysis of high frequency ‘source’ wh-questions (e.g. What’s the rabbit doing in this picture?) that are acquired earlier. Though explicitly concerned with semantic extension processes in construction learning, Johnson’s approach nevertheless differs from the perspective of the present study in an important respect: in Johnson’s study, what is at issue are relations between pairs of distinct constructions in which one pattern is developmentally ‘based on’ the other. For instance, the ‘What’s X doing Y’ construction has a number of specialised properties that set it apart from similar-looking wh-questions, thus calling for its recognition as an independent (meso-)construction with its own formal and semantic/pragmatic constraints that must be mastered by the learner at some point.
This is not true of constructional schematisation processes as described in section 1: for instance, creative extensions of the aspectual [go Adj]-construction to adjectives which are not conventionally used in the pattern are still instances of the same general change-of-state construction.
2.4.2.
Insights from research on constructional change
The second obvious sense in which schematisation can be construed ‘processually’ is to put it into historical perspective: constructional usage patterns within a speech community may expand their domain of application over time and come to be used in novel contexts. Indeed, changes in the usage patterns of form-meaning pairings in the sense of usage-based construction grammar have recently become a much debated topic in historical linguistics and grammaticalisation research (cf. Traugott 2003, 2007; Diewald 2006; Hilpert 2008 as well as the papers in Bergs and Diewald
2008 and Trousdale and Gisborne 2008): apart from many other affinities (such as the dominantly functionalist orientation of grammaticalisation studies, their emphasis on language in use and their attention to frequency), the main reason for this is the realisation that the starting points of grammaticalisation processes are highly specific usages of particular subtypes of a given structure in certain restricted contexts. In the words of Lehmann (1991: 503): “[w]hen a newly coined periphrastic expression is received by other members of the speech community, it will not be in isolation, but in the context in which it was originally coined. It will not spread at once to all kinds of context which, given the rules of grammar, would admit it, but will initially be restricted to certain collocations which come close to being phraseologisms”. Following Lehmann, a number of authors have pointed out that these ‘contexts’ – i.e. specific conjunctions of morphosyntactic, semantic-pragmatic and also concrete lexical properties – are conveniently described as form/meaning-pairs in the sense of usage-based construction grammar (Diewald 2006; Bybee 2006). And where Hopper and Traugott in 2003 still speak of a mere “tendency to see grammaticalization (and grammar) in terms of collocations of specific items rather than generalized changes” (2003: 35), Bybee (2006: 721) already chooses the much stronger formulation that “grammaticization, which is the major vehicle for the creation of new grammatical morphemes, demonstrates the need for the cognitive representation of instances of constructions, because if specific instances … were not registered in memory, the construction could not be subject to the processes that comprise grammaticization”. 
Here, grammaticalisation is tied to the notion of constructions, and more specifically equated with processes that affect specific subtypes of constructions that have independent memory storage in speakers’ minds (as discussed in section 2.3.1). The connection with grammaticalisation research is particularly close in the specific grammatical domain that is investigated in the following chapters (intensification): studies like Kirschbaum (2002), Lorenz (1999, 2002), Peters (1993, 1994) and Partington (1993) are all concerned with the emergence of new linguistic means for the encoding of intensification, and explicitly frame these changes as grammaticalisation processes. In these studies, loosening collocational restrictions on the investigated intensifiers are interpreted as signs of increasing grammaticalisation, and Partington (1993) goes so far as to directly equate grammaticalisation with “width of collocation”. The present study takes a different view on the issue. A more detailed discussion of the relationship between schematisation/collocational expansion on the one hand and grammaticalisation on the other is deferred to
chapter 3, when the exposition can be illustrated with concrete examples from my data. For the moment, suffice it to say that schematisation and constructional generalisation in the diachronic sense (as understood in this monograph) is a linguistic change in its own right that is not the same thing as grammaticalisation, which is why the following review will focus on another area of diachronic research that has more direct relevance for the present topic: the study of constructional semantic change. Semantic change has long been the poor cousin of diachronic research: historical linguists usually seek to uncover highly general principles of language change such as the ‘laws’ of sound change, the many supposedly universal ‘clines’ and ‘paths’ of grammaticalisation theory or the typical ‘S-curve’ pattern with which linguistic innovations propagate through the surrounding system. Diachronic semantics has resisted such sweeping generalisations and confident predictions more than any other of the traditional levels of description. Particularly in the field of lexical semantic change, early attempts at systematising the observed phenomena (Bréal 1964 [1900]; Stern 1968 [1931]; Ullmann 1957) have produced little more than inventories of the observable types of changes (often coming in pairs like ‘amelioration’/‘pejoration’, ‘broadening’/‘narrowing’ etc.) without an accompanying theory of which elements should change in which direction under which conditions. As Aitchison (2001: 122) puts it graphically, enumerating such possibilities is “a bit like trying to chart the directions in which an ice skater can glide, and ending up by saying ‘Every which way’”. Recent work has sought to improve on this. For instance, a 2002 monograph entitled “Regularity in semantic change” opens with the promise of delineating “predictable paths for semantic change across different conceptual structures and domains of language function” (Traugott and Dasher 2002: 1).
The authors argue that the main mechanism of semantic change is the conventionalisation of formerly pragmatic, context-bound ‘invited inferences’ that hearers are encouraged to explore in ostensive-inferential communication. The main conceptual mechanism involved is metonymy, and the ‘predictable paths’ are stepwise metonymic extensions in the direction of subjectification (i.e. the tendency of meanings to shift from aspects of the external described situation to aspects of speaker’s conceptualisation of this situation; cf. Traugott 1989) and ultimately intersubjectification (the tendency of meanings to shift from aspects of the external described situation to aspects of the speech act situation). Traugott and Dasher’s ‘Invited Inference Theory of Semantic Change’ thus comes embedded in a general theory of constructional change as the conventionalisation of communicative implicatures, it identifies the main operation involved on the part of the interlocutors (conceptual metonymy) and it predicts two general directions in which meanings tend to change via this mechanism (subjectification and intersubjectification). At first glance, robust diachronic findings about ‘predictable paths’ of semantic change would seem highly relevant to the present study. On closer inspection, however, Traugott and Dasher’s (2002) approach is not directly concerned with the process of semantic generalisation/schematisation as understood here. Specifically, constructional semantic change as envisioned in the ‘Invited Inference Theory of Semantic Change’ consists in the emergence of a new construction based on a historically older source construction – for instance, the development of epistemic modals from deontic uses. The change that produces the target construction may be gradual and many attestations may in fact be ambiguous between the source and target use. What is ultimately at issue, however, is the emergence of an independent construction in its own right, as is most clearly evidenced in cases where the development has produced so-called ‘isolating contexts’ (Diewald 2002) that are compatible with only one of the variants and disallow the respective other reading. By contrast, constructional generalisation as understood here is a loss of collocational restrictions that does not lead to the formation of an entirely new construction (as evidenced by mutually incompatible ‘isolating contexts’). To stay with the earlier example: should speakers of English at some point use the [go Adj] construction in a completely generalised manner, the construction would still be counted as the same aspectual verb construction as before.
On the assumption that constructional meanings can be read off from lexical-collocational cooccurrence properties of the construction in question, however, constructional schematisation processes as understood in this monograph are as much about constructional semantic change as the aforementioned study by Traugott and Dasher (2002). Among others, like-minded approaches to constructional semantic change as collocational change include the intensification studies mentioned above that will be discussed in greater detail in chapter 3. A further, oft-cited example of this line of research is Israel’s (1996) study of the development of the English way-construction (e.g. They cut their way through the jungle). Focusing on the piecemeal schematisation of the verb slot, Israel distinguishes different ‘threads’ in the diachronic development of the construction by pointing to the addition of new semantic clusters of eligible predicates that have expanded the constructional network into different
directions over time. Hilpert (2006) suggests a promising methodological improvement to Israel’s approach of tracing lexical raw frequencies in a construction across diachronic stages. Using a variant of collostructional analysis (i.e. the same family of corpus-linguistic methods that is also employed in the current study, cf. chapter 5), Hilpert shows how constructional usage patterns that are distinctive for a particular historical period can be identified and contrasted with earlier and later trends in the construction’s usage in a way that corrects for chance fluctuations. Hilpert (2008) offers a large-scale application of such analyses to the study of the diachronic development of future constructions in German, English, Danish, Swedish and Dutch. The studies by Israel and Hilpert share many of the guiding assumptions and analytical aims of the present investigation, but apply them from an explicitly diachronic perspective: their focus is on how associations between a construction and its lexical fillers change between two or more points in time. This contrasts with the synchronic perspective developed in chapter 6 below, where the focus is on how currently attested variants of the investigated constructions are semantically related at one specific point in time: now.
2.4.3. Insights from research on constructional variation
This interest is the hallmark of the third perspective on constructional generalisation/schematisation to be distinguished here: a perspective that is not concerned with individual constructions’ change over ontogenetic or phylogenetic time, but with their relative degree of schematicity as compared to other constructions at one specific point in time. In this, it is similar to the synchronic perspective on grammaticalisation that allows analysts to identify e.g. causal since as a more grammaticalised construction of contemporary English than temporal since without necessarily putting the discussion into historical perspective. Such a synchronic perspective on schematisation and schematicity is the main concern of the present study. Previous research that has informed it (broadly identified as work on ‘constructional variation’ above) comes from very heterogeneous sources: corpus-based approaches to idiomaticity, collocation and other forms of formulaic language, applied linguistic discussions of such phenomena in the context of lexicography and second language pedagogy, and also psycholinguistic studies of idiom flexibility and novel construction learning. Being an empirical study of textual co-occurrence patterns, my investigation is first and foremost indebted to previous corpus-based approaches
to linguistic description, and especially those with a focus on the more prepatterned aspects of naturally occurring language. Since the corpus-linguistic literature on this topic is enormous, it cannot be given a proper review here. In particular, no attempt will be made to trace the development of relevant terms like ‘collocation’ through the scientific history of linguistics (or even just corpus linguistics). Instead, my review will focus on a number of more recent corpus-linguistic approaches to formulaic language that have directly informed my approach. The first name to mention in this connection is that of John Sinclair, whose work in turn builds on the foundations of the linguistic school known as ‘British contextualism’ (Firth 1957; Halliday 1966). Above all, Sinclair will be remembered for his pioneering work as the leader of the COBUILD project in Birmingham. The goal of the project was to compile an unprecedented giga corpus of contemporary English (the Bank of English) to be used for corpus-driven lexicographic and grammatical descriptions of the English language. Empirical results are documented in the Collins Cobuild English Dictionary (1995) and in the two volumes of the Collins Cobuild Grammar Patterns series (1996, 1998). Theoretically, what is of particular relevance for present concerns is Sinclair’s distinction between two complementary principles of linguistic organisation which he calls the ‘open choice principle’ and the ‘idiom principle’, respectively. In his own words (Sinclair 1987: 320), the open choice principle manifests in
a way of seeing language text as the result of a very large number of complex choices. At each point where a unit is completed (a word or a phrase or a clause), a large range of choice opens up, and the only restraint is grammaticalness.
This is probably the normal way of seeing and describing language … Any segmental approach to description is of this type; any which deals with progressive choices; any tree structure shows it clearly: the nodes on the tree are all choice points. Virtually all grammars are constructed on the open choice principle.
The idiom principle, by contrast, accounts for the observation that a language user has available to him or her a large number of semipreconstructed phrases that constitute single choices, even though they might appear to be analysable into segments. To some extent, this may reflect the recurrence of similar situations in human affairs; it may illustrate a natural tendency to economy of effort; or it may be motivated in part by the exigencies of real-time conversation. However it arises, it has been relegated to an inferior position in most current linguistics, because it does not fit the open-choice model.
These prefabricated structures can be of very different types: fully specified units range from word-level collocations (e.g. stone deaf, strong tea, run riot) to sentence-level routine formulas (e.g. how do you do, don’t mention it, it takes one to know one). (Partially) underspecified structures can be thought of as ‘schematic idioms’ (Croft and Cruse 2004: 234) of varying degrees of complexity and abstraction (i.e. with one or more slots that may be more or less open, e.g. hit the hay/sack, in bloom/blossom/bud/flower/leaf, throw NP to the dogs/lions/wolves, the Adj-er the Adj-er etc.). A book-length, corpus-based attempt at systematising the various kinds of phenomena that can be considered manifestations of Sinclair’s idiom principle is the monograph by Moon (1998) that was already mentioned in section 2.3.1. Hunston and Francis (2000) develop the research carried out at COBUILD into a corpus-driven, phraseologically enriched approach to grammatical description that they call pattern grammar. In their discussion of examples like the following, Hunston and Francis address the same type of semi-schematic/generalised structures that are also at issue in later chapters:
(8) a. She is adamant in her refusal to make any statement.
b. Both men are military officers and firm in their belief that the nation’s interests and their own are the same.
c. Last week the fans were loud in their support for their manager, his players and his tactics.
d. Even Greenpeace UK, so vocal in its opposition to Sellafield, said that their independent scientific advice was that low-level radiation posed no threat. (Hunston and Francis 2000: 78)
They note that a purely structural description like ‘Adj in POSS N’ is too coarse-grained to adequately capture this construction, and that a more appropriate characterisation would have to specify that the noun in the prepositional phrase must also be of a specific semantic type (i.e.
it “realises a way of thinking, such as belief, a way of talking, such as support or opposition, or an absence of the same, such as refusal”; Hunston and Francis 2000: 78). Although the theoretical background and the terminology employed are not the same, what is at issue here is clearly the same kind of phenomenon (a partially productive ‘schematic idiom’) that is at the heart of many cognitive-linguistic/constructionist discussions of the continuum between ‘lexicon’ and ‘syntax’.
Linguists working in the Chomskyan tradition have tended to emphasise that human language is essentially creative (in the sense of being infinitely generative). While there is no disagreement about the fact that natural languages do possess this property, it is the merit of corpus linguists like Sinclair that there is growing awareness now that prefabricated structures of different kinds are nevertheless extremely common in naturally occurring language (notably spontaneous discourse). Erman and Warren (2000) report that more than half of the words (>55%) of the corpus they studied were part of a prefabricated structure which they counted as a manifestation of Sinclair’s idiom principle. In the same spirit, Barlow (2000: 316) suggests that “the main component of the grammar … comprises a large set of redundantly specified schemas, both abstract and lexically-specified, and the role of rules or constraints (or highly abstract schemas) is to provide the glue or mortar to combine these prefabricated chunks”. One common explanation for speakers’ heavy reliance on such ‘chunks’ in spontaneous discourse is that they convey a processing advantage: the more linguistic material is selected through an individual choice during formulation, the more time speakers buy for subsequent utterance planning. In line with this assumption, Erman (2007) provides evidence that pausing is indeed more common in ‘creative’ formulations that are computed from scratch as compared to strings that instantiate a routinised prefab. Conversely, Conklin and Schmitt (2008) show that reading times for formulaic sequences are lower than for matched non-formulaic sequences, again indicating that the formulaic units in question can be accessed holistically (see also earlier discussion in section 2.4.1). 
The pervasiveness of prefabricated routine expressions also marks their significance for acquiring truly nativelike proficiency in a language, which is why they have attracted considerable attention in applied linguistic research on second language learning and teaching, too. Pawley and Syder (1983: 192), who refer to such units as ‘lexicalized sentence stems’, estimate that “the stock of lexicalized sentence stems known to the ordinary mature speaker of English amounts to hundreds of thousands”. The relevance for second language learning is that a novel sequence will be nativelike at least to the extent that it consists of an institutionalized sentence stem plus permissible variations … It is a characteristic error of the language learner to assume that an element in the expression may be varied according to a phrase structure or transformational rule of some generality, when in fact the variation (if any) allowed in nativelike usage is much more restricted. The result, very often, is an utterance
that is grammatical but unidiomatic e.g. You are pulling my legs (in the sense of deceiving me), John has a thigh-ache, and I intend to teach that rascal some good lessons he will never forget. (Pawley and Syder 1983: 214)
Especially in corpus-based approaches to the subject, this insight has inspired much research on the significance of phraseology to foreign language learning and teaching (cf. e.g. Howarth 1998 and papers in Meunier and Granger 2008). While the above quotation from Pawley and Syder (1983) already introduces the notion of ‘permissible variation’, my review has so far focused on research that seeks to establish the psychological reality and functional significance of the objects of such variation processes in the first place. Previous research on the variability of such structures is mostly psycholinguistic work on idiom processing. Although the literature is extensive (e.g. Cacciari and Tabossi 1993; Cronk et al. 1993; Gibbs 1990; Keysar and Bly 1995; Libben and Titone 2008; McGlone et al. 1994; Needham 1992; Schweigert 1991; Tabossi, Fanari, and Wolf 2005), most of these studies do not have a direct bearing on the topic pursued here. Work in this tradition is typically interested in the activation of figurative as opposed to literal meanings of semantically opaque idioms and how such activations may be affected by introducing different types of variation into the string. The underlying question is whether idiomatic phrases are inevitably noncompositional and accessed as ‘long words’ (Swinney and Cutler 1979), or whether there are semantic connections between elements of the literal and figurative interpretations of idioms, thus making them more or less regularly decomposable (Gibbs, Nayak, and Cutting 1989). Experimental research on idiom variability in this tradition therefore typically focuses on possible connections between lexical substitutability on the one hand and semantic componentiality on the other. For instance, a study by Gibbs et al. (1989) found that semantically decomposable idioms (e.g. break the ice) are more tolerant to lexical substitutions (e.g. burst the ice) than nondecomposable idioms like kick the bucket (#boot the bucket, #kick the pail). 
By contrast, the present study is not concerned with the relation between supposed ‘literal’ and ‘figurative’ meanings in comprehension, it focuses on ‘idiomatically combining expressions’ (in the sense of Nunberg, Sag, and Wasow 1994) rather than on opaque ‘idioms of decoding’ (in the sense of Makkai 1972), and it is not about whether these prefabricated units can be varied, but rather about how speakers do it. In fact, the abovementioned study by Gibbs et al. (1989) also touches on the last question when the
authors ask “But what constrains the kinds of lexical changes that can be made without severely disrupting these phrases’ idiomatic meanings?” (Gibbs et al. 1989: 65). However, they do not go on to explore the question systematically but content themselves with the suggestion that “one possible constraint on the kinds of lexical changes that can acceptably be made is the restriction that any new word must come from the same semantic field as the original lexical item and its figurative referent” (Gibbs et al. 1989: 66). Erman and Warren (2000: 40) mention some further possible relations between variants: The most important feature of variability in a lexical prefab is the open slot … Some slots are not completely open, in which case we have what we refer to as restricted variability: books/novels/articles etc. deal with sth.; go to lectures/class/seminars/meetings etc.; have Christmas/Friday/the morning etc. off; with little/much/a lot etc. in common; to a limited/great etc. extent. It is interesting to note that in cases of restricted variability we find that the substitutes are normally semantically related. They are synonymous (much, a lot) or belong to the same lexical field (books, articles, essays, novels, speeches, etc.) or they are antonyms (limited, great). This is not an absolute rule (consider waste of time/effort/money), but it is certainly a clear tendency and is an indication that what we store in some cases is a meaning rather than a specific word.
Still, like Gibbs and colleagues, they do not pursue the issue any further and leave it at the somewhat unspecific observation that “substitutes are normally semantically related” in some way or other. One of the aims of the present study is to see whether there is more to be said about the way in which such variants spread.
2.5. Chapter summary
The present chapter has introduced the topic of the investigation and presented the theoretical framework in which it will be set. I have provided an initial illustration of the target phenomenon and characterised the aim of the study as a contribution to research on constructional generalisation processes involved in the transition from routinised ‘fixed expressions’ to constructional schemas (section 2.2). Section 2.3 has outlined central assumptions of usage-based construction grammar and cognitive semantics and introduced some relevant terminology for the following discussion. Finally, section 2.4 has provided a review of like-minded research in several related fields (ranging from language acquisition over historical linguistics to studies of synchronic variation) that has informed the present treatment.
Chapter 3 Testing ground: Intensity collocations
3.1. Introduction
Chapter 3 introduces the empirical testing ground of the study: ‘fixed expressions’ and their creative variants of a particular formal, functional and semantic type. Recalling figure 1.1 in chapter 1, relevant properties of the target expressions in all three dimensions are introduced in the following order: section 3.2 provides some elementary background on intensification as a linguistic function (3.2.1) and motivates its choice as the grammatical target domain of the study (3.2.2). Special attention is devoted to issues of intensifier variation and change. Since much previous research on constructional change in this domain has framed the issue as a grammaticalisation process, I return to the relationship between (intensifier) schematisation and grammaticalisation that was already raised in the preceding chapter and clarify how the two notions are used in the present study. Section 3.3 turns to the conceptual motivations of the targeted expressions and surveys the main semantic mechanisms and resources that can be employed for signalling intensity (3.3.1). These include both semantic lexicalisation patterns in the sense of chapter 2 (i.e. systematic associations between intensity meaning and a set of conceptual source domains) as well as pragmatic mechanisms that produce context-specific intensity interpretations for items that are not generally associated with this function. Section 3.3.2 motivates the choice of the semantic target patterns of the following case studies (PERCEPTION intensifiers) and illustrates each type with examples from the corpus data. Section 3.4 then turns to the formal side and motivates the choice of the specific target constructions to be investigated in chapters 4 to 6. Sections 3.4.1 to 3.4.3 provide separate discussions of each of these constructions, with section 3.4.2 exploring the connection between constructional schematisation and grammaticalisation using the concrete example of the second target pattern, Int + N in German.
Finally, section 3.5 introduces the objectives of the following case studies and section 3.6 provides a short summary of the chapter.
3.2. Intensity and intensification
Intensification as a linguistic function is a topic that has been explored in book-length treatments of its own (Bolinger 1972; van Os 1989). Fortunately, many of the prime topics in the literature (such as the distinction between different levels of intensity) are not of interest here at all and can therefore be neglected. Instead, I will merely sketch the bare basics of the phenomenon in English and German and explain why my choice fell on intensification constructions in the first place.
3.2.1. Intensification as a linguistic function
In the most general sense, intensification can be characterised as a type of scaling modification that is applicable to any predicate that can be qualified as obtaining to a greater or lesser degree – in whichever sense. Following van Os (1989), such judgments can be said to involve a rank-ordering of a given state of affairs relative to some standard of comparison with respect to a particular semantic dimension (i.e. domain in the activated domain matrix), where the ranking involved can be of the type ‘purely quantitative, assessing extent’ or ‘evaluative, assessing degree’ (van Os 1989: 35). In a first approximation, then, intensification can be characterised as an operation that serves to ‘strengthen’ or ‘weaken’ a predication in terms of extent and/or degree (the present study will only be concerned with the up-scaling, boosting variant though). Formally, intensification operations are typically realised by combining a suitable (i.e. intensifiable) predicate with an intensifying modifier to form a composite intensity expression:1
(1) very good, fairly obvious, stark raving mad
Whereas the examples in (1) are indeed composite expressions consisting of a separate intensifier and intensified predicate, intensity implications are in fact not always conveyed by a separate formative. Rather, intensification can be realised using a number of very different formal means from all traditional levels of description. Consider (2):
(2) a. [NP] is DARK2
b. [NP] is superdark
c. [NP] is extremely dark
d. [NP] is dark as night
e. [NP] is so dark that you can’t see your hand in front of your face
The intensifying elements in (2) span the whole range from suprasegmental over bound morphological and free lexical to phrasal and discontinuous periphrastic means. Conversely, linguistic elements of very different formal categories can be intensified given an appropriate semantic constellation. For instance, there are not only intensified adjectives/APs, adverbs/AdvPs and verbs/VPs, but also intensified nouns/NPs and intensified PPs:
(3) a. quite disappointing, extremely difficult to follow
b. to sweat heavily, to particularly like something
c. very carefully, highly similarly to the rest
d. superconductivity, total waste of time
e. completely out of the question, bang up to date
Functionally, intensification is closely related, but not identical to, GRADATION, an operation that is both formally and semantically more restricted.3 Likewise, intensification is intimately connected with (and often very difficult to distinguish from) particular types of qualitative and quantitative modification, seeing that the latter often give rise to intensity implicatures and hence shade off into the domain of intensification proper:
(4) a. cautiously optimistic
b. boiling hot
c. much-contended
Hence, saying that somebody is cautiously optimistic not only expresses that they entertain this particular attitude in a certain manner but also that their optimism is limited, and boiling hot water need not in fact be boiling at all. A principled distinction is further complicated by the fact that especially among degree adverbs (i.e. what are often seen as prototypical intensifiers), new items typically arise from qualitative modifiers such as manner adverbs diachronically. These observations indicate that the boundary between intensification and other types of modification is both fuzzy and permeable. Difficult as the distinction between intensifying and non-intensifying readings of one and the same lexical item may be at places, there are also contexts in which it is intuitively obvious. For instance, the adjective strong is an intensifier in strong supporter but not in strong horse, and deep means ‘intense’ in deep regret but not in deep ditch. Such contrasts show that words like strong and deep are not invariably abstract intensifiers, but only
take on intensifying force when in construction with an appropriate predicate. Similarly, the readiness with which a given predicate can be intensified depends on larger syntagmatic context, too. Consider e.g. dead in the contrast between a completely dead person and a completely dead place: when the property ‘dead’ is attributed to a person, intensification only works in special contexts (e.g. contrasts with other entities/alternative states of affairs in which the property bearer is merely mortally wounded etc.). No such special context is required when the property is predicated of a place: here, the mismatch between the literal meanings of dead and place enforces a figurative interpretation of the adjective (‘dull’) that conveys an evaluative judgment, and this in turn is easily intensified. Finally, interpretive preferences like these do not necessarily depend on the semantics of just the overall intensity expression alone: as indicated, it is commonly only a very restricted aspect of the designated state of affairs that is intensified (cf. strong supporter), and hearers/readers may be guided by larger linguistic or extralinguistic context in determining which property that is. By way of illustration, consider the following contrast from Bolinger (1972: 163):
(5) a. I wish those children in the back row wouldn’t whisper so. They make it awfully difficult to hear the others when they recite.
b. I wish those children in the back row wouldn’t whisper so when they recite. I can hardly hear them.
In Bolinger’s terms, only (5b) intensifies an ‘inherent’ property of the verb, namely the quality of talking in a low voice (‘I wish they wouldn’t recite so very quietly’). By contrast, (5a) demonstrates what Bolinger calls ‘extensibility’, or intensification of some implicit extent parameter associated with the predicate (here: volume or duration).
At any rate, the different readings of the intensity expression whisper so exhibited by (5a) and (5b) are clearly triggered by the different contexts in which they occur. This extraordinary flexibility has the unfortunate consequence that (with the exception of fully grammaticalised intensifiers such as English very) intensifiers cannot be identified automatically – instead, potential target expressions must be assessed in their respective discourse context in all its specificity. What is needed, therefore, is not only a suitably flexible approach to semantics that can account for the kind of interpretive shifts demonstrated above, but also a concrete tool for distinguishing intensity expressions from closely related non-intensifying structures. The present study will use a simple diagnostic (ultimately going back to Horn 1969, but
here adapted from Pittner 1996) for determining the presence of intensifying meaning, namely compatibility of the potential target expression with the adverbs even/sogar in the following context:
(6) a. Z is X Y    The water is boiling hot
b. Is Z Y?    Is the water hot?
c. Yes, X Y even.    Yes, boiling hot even.
The presuppositions triggered by the adverb ensure that expressions of the type in (6) are only felicitous if an intensity implication is present. Even though intuitions are not always as clear as in the case of (6) and additional guidelines are required for more problematic cases (see below), this simple test provides a useful heuristic that will be adopted as the criterion for admitting potential target expressions into the study. Summing up, intensification is an extraordinarily flexible and context-sensitive operation. The possibility (or, where generally applicable, higher/lower probability) of assigning an intensifying interpretation to a given candidate form depends on emergent semantic properties of the complex expression of which it forms a part. More specifically, I assume that the possibility/likelihood of assigning an intensity interpretation to a particular ambiguous target depends on three factors:
Semantics: it must be possible to construe the potential target of the intensification operation as a predicate that could be intensified in principle within the present context, and to relate the potential intensifier to a particular conventional intensification strategy of the language (cf. section 3.3.1);
Collocation: functionally ambiguous items are more likely interpreted as intensifiers if the properties of the expression at hand resemble typical intensifying uses of the item in question (i.e. do not violate the collocational restrictions of the intensifier reading);
Coercion: the stronger the association of the respective grammatical construction with an intensification interpretation (as compared to its other functions), the more likely it is that an ambiguous combination will be interpreted as an intensity expression.
3.2.2. Intensifier variation and change
Various authors have observed that intensification is a functional domain that is particularly suited for the study of linguistic innovation. Bolinger
(1972: 18) sums up the argument in a beautiful passage that is worth quoting at length: The study of degree words is of more than intrinsic interest. The comforting view of language is that it is sedate, structured in an orderly manner, and reducible to rule. But in another view, it is at war with structure, which is to say that it is at war with itself. Structure is the resolution of a conflict that is never settled … Degree words afford a picture of fevered invention and competition that would be hard to come by elsewhere, for in their nature they are unsettled. They are the chief means of emphasis for speakers for whom all means of emphasis quickly grow stale and need to be replaced … As each newcomer appears on the scene it has elbowed the others aside. The old favorites do not vanish but retreat to islands bounded by restrictions (for example precious few but no longer precious hot), and the newcomer is never fully successful and extends its territory only so far. Nothing has quite time to adjust itself and settle down to a normal kind of neighborliness before the balance is upset again. Degree words are an antidote to the overconfident description of language as a system. It is a system, but one fighting for survival, and forced to modify itself at every instant. Bolinger (1972: 18)
Many of the theoretical assumptions and empirical motivations for focusing on intensity collocations in this study are anticipated in this passage: the view of language as a constantly adapting (‘emergent’, ‘fluid’ etc.) set of symbolic conventions that are shaped by conflicting functional pressures; the concern with speakers’ striving for expressivity as a driving force of linguistic innovation; the attention to shifting collocational restrictions as a manifestation of the resulting competition between rival forms. Moreover, Bolinger also draws attention to the fact that schematisation (i.e. loss of collocational restrictions) is not the only direction in which a construction may develop: whereas only a few candidates show signs of becoming constantly more generalised and common (i.e. type- and token frequent) over time, most items remain stuck on phraseological ‘islands’ located somewhere in between full productivity and complete fixedness (some of them growing and some of them shrinking), and conventional usage of an item may also lose ground up to the point of disappearing from the language altogether. Given Bolinger’s characterisation of the phenomenon, it is no surprise to see that discussions of variation and change are so prominent in the intensification literature (or, conversely, that so many studies of variation and change have chosen to focus on intensification constructions). The synchronic approaches among these studies are usually about variation in intensifier choice and how it can be predicted by sociolinguistic factors such as speaker age, class and gender. In other words, they are usually not concerned with questions of co-occurrence and variability in the second constitutive slot of an intensity expression, that of the intensified predicate. In diachronic approaches, by contrast, both perspectives are common. On the one hand, many studies are again concerned with variability in the intensifier slot: from the historical point of view, the issue is framed in terms of ‘renewal’ (Hopper and Traugott 2003: 122), i.e. the recruitment of new lexical items for the expression of more grammatical, operator-like meanings and functions. An oft-stated aim of such research is the identification of typical semantic source domains of newcomers to the targeted paradigm (Kirschbaum 2002; Lorenz 2000; Peters 1993, 1994; cf. section 3.3.1). On the other hand, diachronic studies are sometimes also concerned with changes in the combinatorial potential of individual intensifiers (Partington 1993), i.e. variation and change in the slot of the intensified predicate. Though similar in orientation to the present study, such research nevertheless tends to focus exclusively on the most generalised intensifiers in a language and how they came to attain this status. The reason is that these studies typically pursue another common aim of diachronic research more generally: that of retracing the historical development of present-day function words from their lexical ancestors, as well as the identification of currently expanding alternatives that look like potential heirs to their throne. Where both strands of diachronic intensification research converge is in their shared concern with grammaticalisation. For intensification phenomena, the typical grammaticalisation account can be summed up as follows: INTENSITY is an abstract functional category that is typically signalled by purely relational (‘synsemantic’) operators.
In English and German, its typical exponents are grammaticalised function words (in German also bound morphemes) like very and sehr that have developed from full lexical adverbs typically denoting MANNER (cf. 3.3.1).4 The historical development of these elements can therefore be subsumed under standard definitions of grammaticalisation such as “a process which turns lexemes into grammatical formatives and makes grammatical formatives still more grammatical” (Lehmann 1985: 303). As indicated above, the grammaticalisation account assumes that innovation and variability in the paradigmatic dimension are linked to innovation and variability in the syntagmatic dimension in a specific way: paradigmatically, new adverbs are constantly being sucked into the grammaticalisation channel because older additions are no longer perceived as expressive enough. Syntagmatically, however, only very few of these items will make it beyond the restricted contexts of their original entry points into the system and become sufficiently type- and token frequent candidates for genuine function word status. This explains why grammaticalisation studies have tended to focus either on issues of renewal alone, ignoring the syntagmatics of the investigated items altogether, or on the syntagmatics of the most grammaticalised items alone, ignoring the multitude of more marginal and phraseologically restricted members of the paradigm. By contrast, for a study of these very restrictions and the way they are stretched and modified in processes of semantic extension, it is precisely those intensifiers whose usage patterns can still be characterised as ‘islands bounded by restrictions’ that are of particular interest. It follows that the present study will not be concerned with the same kinds of intensifiers that are usually discussed in grammaticalisation studies. Furthermore, contra Partington (1993), the notions of ‘width of collocation’ and ‘(degree of) grammaticalisation’ will also be kept apart on theoretical grounds: ‘width of collocation’ is here understood as the result of a constructional schematisation process in the sense of chapter 2. For ‘grammaticalisation’, by contrast, I adopt the above-mentioned definition by Lehmann (“a process which turns lexemes into grammatical formatives and makes grammatical formatives still more grammatical”). The essence and endpoint of a grammaticalisation process thus conceived is therefore the emergence of a new grammatical morpheme. This is not the case with constructional generalisation/schematisation processes as characterised in chapter 2. Now, grammaticalisation being a process, it is of course possible to identify instances of this process that have not yet reached their prospective endpoint (and maybe never will).
However, also before that point, there are certain diagnostic criteria that can be employed in order to locate an item on the continuum between lexical and grammatical signs, and thereby to permit an assessment whether it is indeed ‘becoming more grammatical’ or not. One well-known account of these criteria is Lehmann’s (2002) model of grammaticalisation parameters. The approach assumes that a sign is grammaticalised to the extent that it has lost formal and semantic autonomy. Lehmann identifies three different parameters that are relevant for assessing the degree of grammaticalisation/loss of autonomy that has affected a sign: these are its weight, its cohesion, and its variability, each of which has a paradigmatic and a syntagmatic manifestation. An overview is provided in table 3.1:
Table 3.1 Grammaticalisation parameters (after Lehmann 2002)

Parameter     Paradigmatic axis          Syntagmatic axis
Weight        Integrity                  Structural scope
Cohesion      Paradigmaticity            Bondedness
Variability   Paradigmatic variability   Syntagmatic variability
Specifically, increasing grammaticalisation is indicated by an increase in cohesion and a decrease in weight and variability. These notions in turn can be characterised as follows: paradigmatic weight, also known as integrity, refers to the semantic and phonetic substance that differentiates a sign from other signs. A loss in integrity is referred to as phonological attrition (also called erosion) on the formal side and as semantic abstraction (also called bleaching) on the meaning side. The syntagmatic weight parameter refers to the structural scope of a sign, i.e. “the structural size of the construction which it helps to form” (Lehmann 2002: 128). Both integrity and structural scope reduce with increasing grammaticalisation. Moving on to the cohesion parameters, paradigmaticity relates to the degree to which a sign is integrated into regular formal and semantic paradigms, i.e. forming a class with other signs to which it is linked via clear-cut paradigmatic relations of opposition and complementarity. Syntagmatically, cohesion manifests as bondedness, i.e. increasing fusion or coalescence with neighbouring items in linear order. Paradigmaticity and bondedness increase with progressive grammaticalisation. The final two parameters are paradigmatic and syntagmatic variability. Paradigmatic variability refers to the relative freedom with which a particular sign may be picked by a speaker (i.e. choice of nouns for object reference is relatively free as compared to the mandatory use of agreement morphology). Syntagmatic variability relates to the “positional mutability” of a sign, i.e. the “ease with which it can be shifted around in its context” (Lehmann 2002: 140). Both types of variability decrease with increasing grammaticalisation. The six parameters are expected to correlate. On the basis of the above classification, it is therefore possible to characterise the prototypical grammaticalised sign as one that
- has little formal and relatively abstract semantic substance;
- has limited structural scope;
- stands in regular oppositions to paradigmatic variants;
- is morphologically bound;
- is the mandatory choice for signalling a particular function, and
- is fully fixed in its linear position.
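The six parameters and the direction in which each is expected to change with increasing grammaticalisation can be restated as a small lookup structure. The following Python sketch is purely illustrative: the names `Parameter`, `LEHMANN_PARAMETERS` and `matching_parameters` are hypothetical, and the observation in the usage example is invented for demonstration rather than taken from Lehmann (2002) or from the data of this study.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Parameter:
    name: str
    dimension: str  # "weight", "cohesion" or "variability"
    axis: str       # "paradigmatic" or "syntagmatic"
    direction: int  # expected change as grammaticalisation increases: +1 or -1


# Directions follow the summary in the text: cohesion increases,
# weight and variability decrease.
LEHMANN_PARAMETERS = (
    Parameter("integrity", "weight", "paradigmatic", -1),
    Parameter("structural scope", "weight", "syntagmatic", -1),
    Parameter("paradigmaticity", "cohesion", "paradigmatic", +1),
    Parameter("bondedness", "cohesion", "syntagmatic", +1),
    Parameter("paradigmatic variability", "variability", "paradigmatic", -1),
    Parameter("syntagmatic variability", "variability", "syntagmatic", -1),
)


def matching_parameters(observed):
    """Return the names of parameters whose observed change (-1, 0 or +1)
    agrees with the direction expected under grammaticalisation.
    Parameters missing from `observed` count as unchanged (0)."""
    return [p.name for p in LEHMANN_PARAMETERS
            if observed.get(p.name, 0) == p.direction]


# Hypothetical item that has only undergone semantic bleaching.
print(matching_parameters({"integrity": -1}))  # ['integrity']
```

An item for which only a loss of integrity is observed thus matches one of the six parameters, whereas the prototypical grammaticalised sign described above would match all six.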
The conjunction of these properties amounts to a characterisation of grammatical morphemes such as case affixes or TAM-markers. It is therefore not surprising that Bybee (2006) identifies the creation of such elements as the “hallmark” of grammaticalisation processes. Constructional schematisation as characterised above does not fit neatly into this scheme – if anything, it corresponds to a loss of paradigmatic weight (‘semantic bleaching’), but it is certainly not the same as grammaticalisation (on the above understanding). On the other hand, acknowledging that the two notions are not co-extensive of course does not preclude the possibility that a construction may be affected by both processes at the same time. The second of the three constructions considered in my case studies is a case in point – see section 3.4.2 for an account of its grammaticalisation. Having introduced the grammatical function of intensification as such and considered its special appeal for studies of constructional variation and change, it is now time to explore the concrete formal and semantic means that are employed for signalling intensity in English and German. I will begin with the semantic side. 3.3.
Conceptualising intensification
As illustrated in section 3.2, the availability of an intensity interpretation for a given expression depends on specific semantic conditions. The total set of mechanisms and mappings that give rise to such readings will be called intensification strategies here. Those strategies that have been (at least semi-) systematically semanticised (i.e. that can be characterised as ‘lexicalisation patterns’ for intensity meaning in the sense of chapter 2) will be called intensification patterns. Both English and German show evidence for a number of different general intensification strategies and have conventionalised a much greater number of specific intensification patterns. The difference between a pragmatic strategy and a semanticised pattern is illustrated in (7) and (8): (7) a. I pranced about and lay on the ground and turned the lens through 45 degrees and stepped in pore-scouringly close, but what I was really doing, what I was after, was a good shot of Stuart's double chin. (BNC EDJ) b. Zambia was skinny as a leather lace. (BNC AD9)
(8) a. Never drink heavily during pregnancy. (BNC A0J)
    b. If in doubt, press a few keys; if that doesn't work, ask a colleague; if in deep trouble, consult the manual. (BNC A0C)
The bold-face expressions in (7) require highly specific contexts in order to be interpreted as intensifiers, and they are not part of a larger semantic type that is commonly employed for intensification. By contrast, (8) illustrates two common intensification patterns of English (cf. section 3.3.1 below). The bold-face items in (8) are just exemplary instances of these patterns that could also be replaced with other members of these sets without losing their intensifying function. All intensifiers in a language can be identified as instances of a particular intensification strategy. This also includes the most grammaticalised items like English very and German sehr: synchronically, these are function words, i.e. purely relational operators that do not encode anything but an abstract degree specification. Diachronically, however, both of them can be traced back to content words with an appropriate semantics for the grammaticalisation of intensity meaning: English very goes back to the Old French adjective verai, ME verrai, verray, ‘true, genuine, real’ (Stoffel 1901: 29), and German sehr has developed from the Old High German root sēr, ‘sore, painful’ (Seebold 1999: 754), a meaning that has been preserved in Modern German versehrt (‘injured, disabled’). Both of these meanings belong to common source domains for intensity implications (cf. really good, truly surprised; quälend langsam ‘excruciatingly slow’, peinlich genau lit. ‘painfully exact’). The present section explores the semantic and pragmatic bases of intensification in English and German. Section 3.3.1 surveys the inventory of intensification strategies in the two languages that has been discussed in the literature. Section 3.3.2 then moves on to the specific patterns that will be investigated in the following corpus study.
3.3.1.
Intensification strategies in English and German
Particularly in the domain of ‘degree adverbs’ (cf. note 11 below), previous attempts at identifying pragmatic mechanisms and semantic source domains for intensity meaning are not wanting (Borst 1902; Biedermann 1969; Kirchner 1955; Lorenz 1999, 2002; Peters 1993; Spitzbardt 1965). Prior to an enumeration of individual connections, however, it will be useful to classify these various proposals on a more general level, i.e. in terms of the underlying conceptual mechanisms that they assume to be at work. Three sources of intensity meaning will be distinguished: conceptual mappings, semantic redundancy and inherent intensity.
Conceptual mappings
In a study of adjective intensification in German, Kirschbaum (2002) offers a systematisation of intensification patterns in terms of an opposition between metaphorical and metonymic construals. To begin with, Kirschbaum identifies the following eight metaphorical intensification patterns (the last one further divided into two distinct subtypes):
(9) a. INTENSITY AS HEIGHT e.g. hoch, höchst (‘highly’)
    b. INTENSITY AS DEPTH e.g. tief, zutiefst, abgrundtief (‘deeply’, ‘abysmally’)
    c. INTENSITY AS SIZE e.g. riesig, gigantisch, kolossal (‘giant’, ‘colossally’)
    d. INTENSITY AS CIRCUMFERENCE e.g. massiv (‘massive’)
    e. INTENSITY AS STRENGTH e.g. gewaltig, mächtig, stark (‘forcefully’, ‘mighty’, ‘strongly’)
    f. INTENSITY AS DISTANCE e.g. weit, weitgehend, äußerst (‘(by) far’, ‘far-reaching’, ‘utterly’)
    g. INTENSITY AS WEIGHT e.g. schwer, leicht (‘heavily’, ‘lightly’/‘easily’)
    h. INTENSITY AS AMOUNT
       h´ INTENSITY AS QUANTITY e.g. viel, ein bisschen, etwas (‘much’, ‘a bit’, ‘some’)
       h´´ INTENSITY AS COMPLETENESS e.g. ganz, total, voll (‘wholly’, ‘totally’, ‘fully’)
    (Kirschbaum 2002: 84)
The distinction between subtypes (a) and (b) reflects the fact that even though high and deep could also be construed as contrasting values on the same scale (high up in the air, deep down in the ground), both items actually project independent scales here (cf. highly/deeply moved, both of which are boosting in function). Though framed in a discussion that is restricted to adjective intensification constructions in German, it is not surprising that these patterns directly carry over to other types of intensification constructions in German as well as to their equivalents in English – it seems reasonable to assume that there is only a limited number of options for metaphorically apprehending an abstract scalar configuration such as INTENSITY in terms of an experientially concrete manifestation (whose associated lexis can then be recruited for its linguistic expression). While metaphorical patterns like those in (9) serve to conceptualise the notion of intensity itself in terms of something else (i.e. INTENSITY-AS-SIZE/DEPTH/WEIGHT etc.), there is also a second prominent strategy for marking abstract assessments of extent/degree in terms of an experientially palpable manifestation: here, the intensity implication is conveyed indirectly by invoking a specific consequence or effect of the intensified state of affairs. Because these construals involve accessing a cause via its effect (‘X is so Y that Z’), they can be characterised as metonymic in the sense of chapter 2. Kirschbaum classifies the following ten patterns as involving metonymy:
(10) a. NEGATIVE EVALUATION STANDS FOR DEGREE e.g. scheußlich, elend, schlimm (‘awfully’, ‘miserably’)
     b. POSITIVE EVALUATION STANDS FOR DEGREE e.g. überragend, super (‘supremely’, ‘super’)
     c. ASTONISHMENT STANDS FOR DEGREE e.g. erstaunlich, überraschend (‘amazingly’, ‘surprisingly’)
     d. DAMNATION STANDS FOR DEGREE e.g. verdammt, verflucht, verflixt (‘damned’, ‘cursed’)
     e. DEVIATION FROM THE NORM STANDS FOR DEGREE e.g. ungewöhnlich, außerordentlich (‘unusually’, ‘extraordinarily’)
     f. INEFFABILITY STANDS FOR DEGREE e.g. unglaublich, unbegreiflich (‘unbelievably’, ‘incredibly’)
     g. NON-MEASURABILITY STANDS FOR DEGREE e.g. maßlos, unendlich (‘excessively’, ‘endlessly’)
     h. TYPICAL EXEMPLAR STANDS FOR DEGREE e.g. aalglatt, bienenfleißig (‘eel-sleek’, ‘bee-diligent’)
     i. AVERMENT STANDS FOR DEGREE e.g. wirklich, echt (‘really’, ‘genuinely’)
     j. FULFILMENT OF A NORM STANDS FOR DEGREE e.g. richtig, ziemlich (‘right’, ‘befittingly’)
     (Kirschbaum 2002: 131 f.)
Kirschbaum’s criterion for including particular types in this class is that the intensified AP can be paraphrased by a consecutive clause:
(11) Hier ist es beneidenswert ruhig. Here is it enviably quiet ‘It is enviably quiet here.’ (= ‘so quiet that one can get envious’) (Kirschbaum 2002: 75) This approach offers a sensible starting point for carving up the field according to principled semantic criteria. One problem that it faces, though, is that it enforces a sharp distinction between metaphorical and metonymic construals even though the boundary between the two notions (or even the assumption that they are ultimately distinct) is contested (cf. chapter 2). Semantic redundancy A second important source of intensity implications is semantic redundancy. The close connection between redundancy and reinforcement has long been noted (Bolinger 1972), and many languages have grammaticalised intensity meanings in the form of genuine reduplication constructions (Stolz 2006). Also for languages that have not developed such constructions (such as English and German), redundancy has been claimed to play an important role in the grammaticalisation of new lexical intensifiers. For instance, Lorenz (1999, 2002) identifies a principle which he calls ‘semantic feature copying’, exemplified by the intensity collocations in (12), as an important source for the recruitment of new intensifiers: (12) closely attached, vitally important, heavily emphasised, unavoidably necessary, blatantly clear, evenly balanced, strictly forbidden In such expressions, the intensity implication is pragmatic rather than semantic in origin (i.e. not due to a systematic mapping between domains as in the metaphorical examples in 9). As Sperber and Wilson (1995: 219) note in their discussion of the interpretation of repetitions like We went for a long, long walk, intensity interpretations appear to be a particularly salient option for making sense of the repetition of a potentially intensifiable predicate. On the other hand, it must be pointed out that mere redundancy alone is not sufficient to produce an intensity implication. 
Compare the following examples from German: (13) a. … dem Licht, der Sonne, dem strahlenden Glanz des… the light the sun the beaming glow of.the (PUBLIC M96/602.08287)
b. … schon fast vergessen nach dem gleißenden Glanz… already almost forgotten after the glaring glow (PUBLIC R98/JUL.54720) c. … ihn in phosphoreszierenden Glanz getaucht hat… him in phosphorescing glow dipped has (PUBLIC A99/MÄR.18481) While there is always the same kind of semantic overlap between modifier and head noun in these examples (both head and modifier denote a certain type of light emission), only (13a) and (13b) are intensifying. The reason is that phosphoreszieren ‘phosphoresce’ in (c) does not lexicalise a greater intensity of the encoded light emission event than the head noun Glanz ‘glow’ itself, whereas strahlen ‘beam’ and especially gleißen ‘glare’ do. This shows that semantic redundancies may provide triggers for intensity interpretations through recurrent pointers to the same domain, but a boosting effect will only ensue if the potential intensifier also designates an appropriate region within this domain. Inherent intensity A third option for signalling intensity is to encode it directly in the form of dedicated degree words. Whereas highly grammaticalised items such as the aforementioned very and sehr are not of interest for my study, there is nevertheless a group of such items that should be mentioned here. Specifically, the intensification literature commonly recognises not only morphological and syntactic, but also lexical intensity expressions, i.e. bases which have lexicalised the intensity implication together with the predicate in a simplex root (in the same way in which the meaning of kill is analysed as CAUSE plus DIE in componential theories of word meaning). To give an example, van Os (1989: 102 f.) suggests that the meaning of German schrecklich ‘terrible’ can be decomposed into a qualitative component (‘bad’) and a degree component (‘very’), thereby explaining that (14a) is semantically equivalent to (14b): (14) a. Er hat sich schrecklich benommen. He has REFL terribly behaved ‘He has behaved terribly’ b. 
Er hat sich sehr schlecht benommen. He has REFL very badly behaved ‘He has behaved very badly’
Following Pusch (1972), he calls such predicates ‘superlative adjectives’ (van Os 1989: 82, 102); Kirschbaum (2002) calls them ‘inherently intensive adjectives’. The assumption that such predicates are implicit intensity expressions is corroborated by the even/sogar-test introduced in section 3.2.1 – it is just that the supposedly ‘underlying’ (i.e. appropriate non-intensified) predicate must be inferred: (15) a. X has behaved terribly. b. Has X behaved badly? c. Yes, terribly even. Cases like these underscore just how slippery the notion of intensification is: in contrast to the semantically related, but also much more strongly grammaticalised category GRADE, INTENSITY is deeply woven into the lexicon. This means that possible intensification relationships between two predicates (of the type suggested by van Os) are very much a matter of context and construal, rather than unambiguously present or absent as in the case of a particular GRADE marking on adjectives and adverbs. On first glance, it would seem that such problems can nevertheless be ignored here: the study is only concerned with composite intensity expressions, i.e. examples in which the intensity implication is unambiguously signalled by an overt formative. However, there are also cases in which it is not the intensifier, but rather the intensified predicate that seems to be implicit: (16) a. She was now seething. (BNC AN7) b Man merkte, wie er kochte. One realised how he boiled (PUBLIC L98/JUL.07596) The expressions in (16) appear to be ‘truncated’ instances of one of the target constructions of my study (i.e. construction C described in section 3.4.3 below). Canonically, this construction consists of a particular type of intransitive verb plus a prepositional complement (headed by with in English and vor in German) that encodes a property which the subject referent is said to possess to an extraordinary extent. 
In certain cases, however, a simple intransitive use of the verb seems enough to conjure up the semantics of the unexpressed element: both verbs in (16) convey the meaning ‘to be very angry’, thereby suggesting that the connection to ANGER is already lexicalised for particular uses of these items even when they occur without
the oblique complement. Hence, as in the case of example (15), they pass the even/sogar-test if the appropriate predicate is inferred: (17) a. X is seething. b. Is X angry? c. Yes, X is even seething. However, straightforward as such inferences may be for speakers (or rather hearers/readers), examples like (16) are nevertheless problematic for an empirical study of observed co-occurrences where supposedly ‘missing’ information cannot be simply ‘filled in’. As a result, such examples were not included in the study. 3.3.2. PERCEPTION intensifiers Given the aim of studying generalisation effects on different levels of constructional abstraction, it will be necessary to concentrate on more or less semanticised intensification patterns that can be (at least near-) exhaustively inventoried rather than on the manifestations of a purely pragmatic intensification strategy. At the same time, given the focus on early abstraction and incipient productivity, the target pattern(s) should be neither too generalised/type frequent already (both on the pattern level and regarding the combinatorial potential of the individual instances) nor so extraordinarily token frequent that an exhaustive extraction and manual annotation of the data is no longer feasible. Third, it would be desirable to choose a category whose semantic potential is also of intrinsic interest (i.e. not just for a discussion of intensification). For reasons laid out in chapter 2, cognitive semantics argues that domains which merit such special attention are those relating to aspects of embodied sensorimotor experience. One pattern that fulfils all three requirements – (semi-)systematically semanticised as an intensification pattern, neither too generalised to be of interest nor too token-frequent to handle, and also experientially basic – is the class of intensifiers deriving from different modalities of PERCEPTION. 
PERCEPTION intensifiers are metonymic in origin: the intensity implication arises from a consecutive inference from a predicated effect (X’s possessing property Y becoming perceptually manifest in some way) to its underlying cause (X’s possessing property Y to an extraordinary degree). As described in chapter 4, the study is restricted to deverbal intensifiers, i.e. items designating events of STIMULUS EMISSION in different sensory modalities. From the five basic senses (seeing, hearing, smelling, tasting and
touching), this restriction eliminates taste and touch since lexicalised concepts in these domains are exclusively (de-)nominal or (de-)adjectival (bitter/sweet/sour, soft/hard/sharp). In addition to these, physiology distinguishes four further senses for the perception of temperature (thermoreception), pain (nociception), balance (equilibrioception) and the position and motion of body parts (proprioception or kinaesthesia). Of these, I have also included verbs from the domain HEAT because of its prominent status in cognitive linguistic discussions of metaphor and embodied cognition. I will provide a short illustration of each pattern in turn. SOUND
The category SOUND EMISSION (SOUND for short) is by far the most strongly lexicalised/type-frequent source domain in my study. Examples of relevant intensifiers in all three investigated constructions are given below:
(18) a. The evening was a resounding success. (BNC K9P)
     b. …systems are stuck with creakingly ancient installs of IE… (www.metropol247.co.uk/forum/viewtopic.php?p=27620&sid=329a4fd8f2095acb3cc5062a945ebcd8)
     c. A big house like this should really hum with life. (BNC J54)
(19) a. ... wenn draußen klirrende Kälte herrscht.
        When outside clinking cold reigns
        (PUBLIC O96/DEZ.98544)
     b. Ihre oberösterreichische, knatsch-braune Tante...
        Her upper-Austrian KNATSCH-brown aunt
        (PUBLIC P99/OKT.41384)
     c. Deutschland brüllt und weint vor Euphorie.
        Germany screams and cries with euphoria
        (PUBLIC P00/SEP.32363)
LIGHT
LIGHT EMISSION (LIGHT for short) is the second most type frequent pattern:
(20) a. …frothy dialogue and sparkling wit tend to… (BNC G1N)
b. …conspire to create a shimmeringly beautiful pop song… (http://www.tiscali.co.uk/music/reviews/11979.html) c. The company's passenger lists glittered with film stars… (BNC ASJ) (21) a. … sei ein „leuchtendes Beispiel für unsere Jugend“ be a shining example for our youth (PUBLIC A97/JUN.08378) b. Als Morrison mit gleißend hohen Tönen der Trompete… when Morrison with glaringly high tones of.the trumpet (PUBLIC P98/APR.14289) c. ... und flimmert vor Selbstironie. and glimmers with self-irony (PUBLIC A99/OKT.831763) SMELL
The third category, SMELL, is very small. Examples include:
(22) a. Having denounced the election as a stinking farce… (BNC CR8)
     b. I'm gonna be stinking pissed when I get home, he muttered. (BNC AC3)
     c. …mournful remains and reeking with dead bodies... (BNC H0N)
(23) a. Ich habe eine Stinkwut auf ihn…
        I have a stink-anger on him
        (PUBLIC P99/OKT.38131)
     b. Nein, stinkfaul kann man ihn nicht nennen.
        no stink-lazy can one him not call
        (PUBLIC O96/FEB.14315)
     c. Jemand “stinkt” vor Geld…
        someone stinks with money
        (PUBLIC O99/APR.44851)
HEAT
The last pattern (HEAT for short) is closely connected to pattern II, LIGHT: (24) a. …a searing indictment of French military incompetence… (http://www.moviemail-online.co.uk/stars/1887)
     b. Dark, elfin, sizzlingly beautiful. (http://news.bbc.co.uk/1/low/in_depth/entertainment/2002/oscars_2002/1892305.stm)
     c. …bizarre machinery which seethed with naked power… (BNC G1M)
(25) a. Eine glühende Patriotin.
        a glowing patriot
        (PUBLIC A01/FEB.08675)
     b. Jetzt wären Sie brennend interessiert…
        now were you burningly interested
        (PUBLIC O99/APR.51522)
     c. Das Publikum kochte vor Begeisterung.
        the audience boiled with enthusiasm
        (PUBLIC O96/JAN.03516)
This concludes the overview of the semantic target patterns of my study. Section 3.4 now turns to the specific structural realisations of these patterns that will be investigated in the following chapters. 3.4.
Constructing intensification
As already observed at the beginning of this chapter, speakers can choose from a wide range of formal coding strategies (phonological, morphological, lexical or syntactic) for signalling intensity. This holds for both English and German alike, and the stock of conventional intensifying expressions in either language is enormous. Given the contrastive orientation of my study, it was desirable to restrict the focus of attention to structures that are maximally similar across the two investigated languages. Second, the set of targeted constructions should also be comparable within languages: this way, it is possible to investigate the extent to which one and the same intensifying base has developed similar co-occurrence profiles in different environments of the same system. For instance, one might be interested in the different intensity collocations in which the lexical root burn participates as an intensifier, and in investigating the extent to which these are the product of general semantic restrictions as opposed to construction-specific constraints. Comparability across constructions was ensured by selecting structures in which the intensifying element was invariably a (de-)verbal base that could in principle be plugged into all targeted environments. Finally, the relevant constructions were required to
exhibit a certain minimum amount of variability in the intensifier slot. This was required in order to permit collocational comparisons across different intensifiers with similar meanings in these constructions (e.g. the lexical co-occurrence profiles of all intensifying bases from the source domain HEAT). Given these restrictions, my choice fell on the three constructions exemplified in (26) and (27):
(26) a. Construction A: Int + N
        …you don’t look your usual picture of glowing health… (BNC JY5)
     b. Construction B: Int + Adj
        …with sculpted hair, glowingly healthy skin… (http://mysite.wanadoo-members.co.uk/bunnyhistory/secrethistory/bunny1.html)
     c. Construction C: Int + with + N
        You look wonderful, said John, glowing with health. (BNC A0R)
(27) a. Construction A: Int + N
        ...von Zorn und glühender Leidenschaft…
        of anger and glowing passion
        (PUBLIC V99/JUL.35619)
     b. Construction B: Int + Adj
        …mit glühend leidenschaftlichem Ton dargeboten…
        with glowingly passionate tone presented
        (PUBLIC P93/JUN.17898)
     c. Construction C: Int + vor + N
        …heißblütige Musik, die vor Leidenschaft glüht.
        hot-blooded music which with passion glows
        (PUBLIC H88/KM7.10432)
These three constructions – henceforth referred to as constructions ‘A’, ‘B’, and ‘C’, respectively – meet the above requirements. First, the target structures are maximally isomorphic translation equivalents in English and German, making sure that it is in fact relevantly similar elements that are compared across the two languages. Second, there is also a close language-internal connection between these three schemas in that they offer alternative ways of saying the same thing, in different structural contexts, and with different semantic profiling nuances: depending on the cohesive requirements of the unfolding utterance, the coexistence of these constructions first opens the possibility of choosing between an adjectival (construction
B) and a nominal realisation of the ascribed property (constructions A and C). In the latter case, the contrast between constructions A and C offers a further useful distinction: depending on the given discourse context, speakers may wish to profile either the ascribed property itself (using construction A) or predicate about the property bearing entity instead (using construction C). In other words, constructions A and C can be used to impose different profiles on the same underlying property ascription setting: construction A profiles the trajector (i.e. the property that is being ascribed) and construction C the landmark of the ascription relation (i.e. the property bearing entity) as the trajector of the secondary predication (i.e. of the event encoded by the intensifier). Construction B is grammatically ambiguous in this respect: comparable to the contrast between predicative adjectives and adverbs in (28), a reading in which the secondary predication applies to the subject is grammatically possible, but not mandatory here.5 However, appropriate substitutions of the ambiguous predicate may strongly bias a particular interpretation (cf. 29):

(28) a. Er ißt das Fleisch roh.
        he eats the meat raw
     b. Er ißt das Fleisch nackt.
        he eats the meat naked
        (Haider 1984: 35)

(29) a. Er ist glühend begeistert.
        he is glowing enthusiastic
     b. Er ist schwer begeistert.
        he is heavy enthusiastic
     c. Er ist sprachlos begeistert.
        he is speechless enthusiastic
Constructions A, B and C are thus not only comparable in the broad sense that they take the same kinds of (de-)verbal bases as intensifiers: rather, they constitute a set of closely connected structural alternatives that can substitute for one another depending on larger sentential context. Finally, in contrast to the many individual intensifying expressions of English and German that are not strongly integrated into an obvious paradigm (e.g. English lock, stock and barrel, German nach Strich und Faden ‘good and proper’), the expressions in (26) and (27) are clearly instances of some more general patterns. Close variants like the examples in (30) present speakers with ample opportunity to extract both formal and semantic
generalisations about the structures in question, which is precisely the desired constellation:

(30) a. …you don’t look your usual picture of glowing health… (BNC JY5)
     b. My burning ambition is to be world champion. (BNC K4T)
     c. …gives little indication of the flaming aggression that… (BNC ED2)
     d. … she struggled to keep a check on her boiling emotions. (BNC HA6)
     e. The new man will face scorching criticism as an appeaser... (BNC AL6)

(31) a. …with sculpted hair, glowingly healthy skin…
        (http://mysite.wanadoo-members.co.uk/bunnyhistory/secrethistory/bunny1.html)
     b. I was blazingly angry with her… (BNC CEE)
     c. … followed by a sizzling hot June and July. (BNC G2Y)
     d. It will not be cheap, but it will be blisteringly fast! (BNC HAC)
     e. It is cataclysmic, heroic, searingly beautiful… (BNC ED6)

(32) a. You look wonderful, said John, glowing with health. (BNC A0R)
     b. Kruger seethed with frustration. (BNC CK8)
     c. …Lucy was still simmering with resentment, and… (BNC HHB)
     d. Ronni flared with annoyance. Yes, of course I'm sure! (BNC JXT)
     e. Fuming with rage and frustration, Curtis called in… (BNC HJD)

Whether and to what extent speakers really extract these generalisations of course remains to be seen – this is the very issue to be investigated in this study.
In short, then, the three target constructions were chosen because they are maximally similar from a contrastive point of view, because they are also closely connected from a language-internal perspective, and because they are at least partially generalised in their exploitation of the four semantic target patterns. The following sections provide a short overview of previous research on relevant aspects of these constructions.

3.4.1. Construction A: Int + N

Structurally, construction A is an attributive modification pattern that embeds adjective phrases within noun phrases.6 As such, it is presumably one of the most frequent, generalised and productive grammatical constructions that exist in both English and German. In cognitive linguistic approaches, the construction has nevertheless received considerable attention, largely because of its apparent inconspicuousness: authors like Langacker (1991, 2008) and Fauconnier and Turner (2002) have argued that rather than being a parade example of compositionality in language, so-called ‘intersective’ modifications of nouns with attributive adjectives in fact involve complex frame-semantic interactions between the meaning of the adjectival modifier and the meaning of the nominal head (cf. also Platts 1979). The point is illustrated by Fauconnier and Turner (2002; originally going back to a discussion in Turner and Fauconnier 1995) on the example of the adjective safe: the word evokes an abstract frame DANGER along with certain associated roles (such as VICTIM, HARM, INSTRUMENT, LOCATION etc.) to which the head noun can be assigned in flexible, context-appropriate ways. According to Fauconnier and Turner, it is thus misleading to say that the relationship between the adjective and the noun in e.g. a safe bet is essentially the same as in e.g. a safe speed, and neither is simply a trivial intersection of ‘things that are N’ and ‘things that are Adj’.
Such interactions between the concept designated by the noun and the concept designated by the (deverbal) adjective also account for the intensifying interpretations of construction A that are at issue here.7 Though not couched in these particular terms, the point is already made by van Os (1989: 78) who emphasises that it is possible to intensify nouns on the basis of ‘any property associated with the word, i.e. its connotations’.8 Apart from these rather general observations, there are (to my knowledge) no accounts in the literature that are devoted specifically to intensifying uses of construction A.9
3.4.2. Construction B: Int + Adj

The situation is quite different for construction B,10 which is very widely discussed in the intensification literature – in fact, the vast majority of intensification studies focus specifically on adjective intensification and the class of ‘degree adverbs’ (Adamson 2000; Biedermann 1969; Borst 1902; Claudi 2006; Ito and Tagliamonte 2003; Kirchner 1955; Kirschbaum 2002; Lorenz 1999, 2002; Paradis 1997, 2000; Partington 1993; Peters 1993, 1994; Pittner 1996; Spitzbardt 1965; Stenström 1999; Stoffel 1901).11 Especially among the more recent titles, the dominant focus is on issues of grammaticalisation. On the one hand, section 3.2.2 has pointed out that the primary interests of the present study lie elsewhere. On the other hand, even though the two notions are kept apart here, it was also held that a collocationally expanding construction may still be affected by a grammaticalisation process over and above its collocational expansion. For instance, several authors have pointed to the tightening positional restrictions on new members of the paradigm, with full-blown intensifiers ending up immediately adjacent to the intensified predicate (Adamson 2000; Peters 1993; van Os 1989).12 In terms of Lehmann’s (2002) grammaticalisation parameters, this amounts to a loss of syntagmatic variability, one of the characteristics of grammaticalisation. Van Os (1989: 88) illustrates this loss on the example of the distinction between VP-intensifiers and a closely related class which he calls ‘affirmative particles’ (Beteuerungspartikeln, cf. 33, 35). In contrast to the latter, VP-intensifiers (34, 36) cannot be left- or right-dislocated:

(33) a. We really liked him.
     b. Really, we liked him.
     c. We liked him, really.

(34) a. We quite liked him.
     b. *Quite, we liked him.
     c. *We liked him, quite.

(35) a. Er hat uns wirklich gefallen.
        He has us really pleased
     b. Wirklich, er hat uns gefallen.
        Really he has us pleased
     c. Er hat uns gefallen, wirklich.
        He has us pleased really
(36) a. Er hat uns sehr gefallen.
        He has us very pleased
     b. *Sehr, er hat uns gefallen.
        Very he has us pleased
     c. *Er hat uns gefallen, sehr.
        He has us pleased very

Adamson (2000) points to parallels in the domain of adjective intensification (i.e. construction B), showing that linear position influences the interpretation of adverbial modifiers in clause- and NP-structure in similar ways. Specifically, Adamson relates the grammaticalisation of sentence (STANCE) adverbs from MANNER adverbs (He will not speak frankly > Frankly, he will not speak) to the grammaticalisation of adjective intensifiers from asyndetically coordinated Adj + Adj-sequences (e.g. a jolly(,) small woman > a jolly small woman), arguing for a common “link between subjectivity and the left periphery” of both CP and NP (Adamson 2000: 42). She observes that evaluative adjectives “can appear directly in front of the noun, in which case they are interpreted as adjectives, but they can also appear in front of an adjective, in which case they may be interpreted as intensifiers, modifying not the head noun but the adjective that follows them in the string” (p. 54). In my data, evidence for an intensifier reanalysis of what are clearly adjectives (and hence cases of adjectival rather than adverbial submodification of an AP) is provided by examples like (37):

(37) …by running the creaky old aristocrat out in a charity match… (BNC CU1)

Even though the present study focuses exclusively on synchrony as well as on items that can hardly be said to have acquired a generalised intensification function, the shared interest in semantic abstraction processes relating to these items is clear. What is more, though not quite the same, the kinds of extensions investigated below could be likened to the stages of Diewald’s (2002, 2006) model of ‘context types’ for grammaticalisation.
The model comprises three successive stages covering the preconditions, the triggering and the consolidation of grammaticalisation processes, each associated with a specific type of context: in the first stage, the item in question spreads to new, ‘untypical contexts’ in which the new meaning arises through conversational implicature. In the second stage, the item appears in so-called ‘critical contexts’ characterised by “multiple structural and semantic opacity, thus inviting several alternative interpretations,
among them the new grammatical meaning” (Diewald 2006: 4f.). In the third stage, the item is found in formerly incompatible ‘isolating contexts’ that unambiguously differentiate the older lexical and the new (more) grammatical reading, thereby consolidating the split between the two. Applied to the recruiting of new intensifiers from MANNER-adverbs in construction B and their subsequent semantic generalisation, approximate synchronic equivalents of these stages could be exemplified as follows: (38) a. ...mit knackig-krossem Boden wird diese italienische... with cracking crisp base becomes this Italian (PUBLIC X96/DEZ.31601) b. …Fisch, rohes Fleisch und knackig-frisches Gemüse… fish raw meat and cracking fresh vegetables (PUBLIC X96/AUG.14895) c. … im Süden knackig-heißes Ausflugswetter … in.the south cracking hot outing-weather (PUBLIC M01/107.53470) (38a) illustrates the original MANNER-meaning of knackig, ‘producing a cracking sound’, in a context in which the intensity meaning could arise through implicature (‘so crisp as if to produce a cracking sound’ +> ‘very crisp’). (38b) illustrates why the correspondence to Diewald’s model is merely approximate, since a characterisation of this example as a case of “multiple structural and semantic opacity” would be exaggerated. Nevertheless, it resembles the ‘critical context’ stage in representing an ambiguous setting that does not strongly privilege the original MANNER-meaning nor the derived INTENSITY-reading. Likewise, (38c) may not be a prime demonstration of the acquired autonomy of a new grammatical formative, but it shares characteristics of an ‘isolating’ context in the above sense in that the example is clearly incompatible with the original meaning of knackig that is illustrated in (a). 
In fact, looking at the remaining data, there are also more abstract uses of the item that exhibit increasing degrees of formal reduction and ‘bondedness’, a further characteristic of grammaticalisation processes according to Lehmann (2002):

(39) a. Jetzt wird es wieder knackig kalt: …
        Now gets it again crack-y cold
        (RHZ97/JAN.02616)
     b. Mit seinem knackend saftigen Fruchtfleisch…
        with its cracking juicy pulp
        (PUBLIC E96/SEP.22979)
     c. Knackenvoll ist es meistens im “Pflaumenbaum”...
        crack-LINKER-full is it mostly in.the plum-tree
        (PUBLIC L99/MAI.21365)
     d. „Es war knackevoll“, freut sich Thomas ...
        it was crack-LINKER-full rejoices REFL Thomas
        (PUBLIC R99/JUN.51576)
     e. …zur Stippvisite in die knackvolle Konkordienkirche...
        to.the flying-visit into the crack-full Konkordienkirche
        (PUBLIC M03/310.69344)

Following this line of argumentation, such examples could in fact be interpreted as early signs of the incipient grammaticalisation of an ELATIVE (or so-called ‘absolute superlative’)13 in German along the following lines:

(40) lexical adverb > degree adverb > compound modifier > elative affix

Languages that have morphologised the category include e.g. Russian with its “suffixal superlative … often with an intensive rather than genuinely superlative force, for example složnejšij (< složnyj ‘complex’) suggests ‘very complex’” (Corbett 2004: 205). In English and German, the category does not have the same, undoubtedly grammatical status as such categories as the comparative or the superlative, but there are nevertheless some fairly generalised formatives in either language that serve the relevant function (in English, these are just a handful of ‘neo-classical’ bases such as super- and mega-, but German has a much larger inventory of such elements).14 Not fitting neatly with either compounding or derivation, morphological intensification has received special attention in the German morphology literature, where relevant formatives have been variously identified as compound constituents, prefixes and so-called ‘semi-prefixes’ (Affixoide), the latter term clearly reflecting the problematic status of these elements. Diachronically, affixes often arise from compounds (cf. e.g. English –hood, German –heit), and the idea that bases which seem to fall between the two categories synchronically are in fact affixes in the making is of course not new.
For German intensifiers, this position was already argued more than a hundred years ago (Meyer 1902):

Der erste Teil wird gewissermaßen ein Präfix und kann wirklich wie ein solches verwandt werden, wenn wir etwa nach Analogie von ‘steinhart’ steinreich bilden, lediglich um ‘reich’ zu steigern: ‘stein-’ hat dann hier keine andere Funktion als ‘ur-’ in uralt.
‘The first element in a way transforms into a prefix and can be used just like one, for instance when steinreich [‘stone-rich’] is formed in analogy to steinhart [‘stone-hard’], simply in order to intensify reich: stein- does not have a different function here than ur- in uralt [‘very old’].’
On the other hand, the idea that a gradual transition from compound to affix may involve synchronically indeterminate borderline states is not uncontested,15 and most recent work has tended to avoid notions like ‘semi-prefix’ or ‘Affixoid’ again and treated the problematic cases as compounds instead (for discussion, cf. Schmidt 1987; Fleischer and Barz 1995; Donalies 2005; Booij 2005).16 All in all, a mere two items are generally assumed to have acquired affix status already (ur- and erz-), and the classification of the numerous others is mostly controversial: looking around in the literature, it is not difficult to find one and the same expression sometimes classified as a copulative compound, sometimes as a determinative compound, sometimes as a semi-prefixation and sometimes as an independent type of word formation altogether (‘intensity formation’, e.g. Pittner 1996). Returning to the hypothesised grammaticalisation relationship between syntactic (e.g. kreischend bunt, ‘shriekingly colourful’) and morphological intensity expressions (kreischbunt, ‘shriek-colourful’) in German that was formulated in (40), four arguments can be adduced in favour of such a connection:

– it is privileged by a general typological tendency towards complex word-formations that continues to influence German;
– the requisite target schema is already part of the morphological repertoire of the language;
– the putative change is facilitated by the phonological peculiarities of the target schema;
– there is a parallel development in yet another kind of intensification construction, the so-called EXCESSIVE (Dressler and Merlini Barbaresi 1994), which further adds to the plausibility of seeing a connection here.
I will briefly discuss each argument in turn. First, German has been subject to a general typological drift towards complex word-formations in the last few centuries (Glück 2000: 98). This trend also manifests in the adjectival domain, where sequences of modifier + head are often ambiguous between a phrasal and a morphological interpretation, thereby allowing for reanalyses:
(41) a. Die Produktion ist voll automatisch / vollautomatisch the production is fully automatic full-automatic b. Eine hoch abstrakte / hochabstrakte Idee A highly abstract high-abstract idea c. Der viel beschworene / vielbeschworene Friedensprozess The much invoked much-invoked peace-process While spelling is maybe not the best evidence for underlying reanalyses, there are also more obvious manifestations of such ambiguities such as variation in forming the superlative: (42) a. Europas meistbefahrene Straße ist ein Alptraum. Europe’s many.SUPERL-APPL-driven road is a nightmare (PUBLIC M98/805.41987) b. Die A5 ist die vielbefahrenste, die lauteste und... the A5 is the many-APPL-driven-SUPERL the loudest and (PUBLIC R97/AUG.60768) c. …daß dies der meistbefahrenste Teil… that this the many.SUPERL-APPL-driven-SUPERL part (PUBLIC X96/NOV.28280) Even though meistbefahren ‘most driven-on’ is written as a single word in (42a), the superlative is actually marked on the modifier viel ‘many’, indicating that it is treated like an independent adverb; in (42b), vielbefahren ‘much driven-on’ is treated as a complex adjective and the superlative attaches to the composite expression, and in (42c) the speaker plays safe and marks the superlative twice. Such insecurities testify to an underlying ambiguity problem in such sequences – a problem that arises from the tendency to incorporate adverbial modifiers into compound adjectives which indicates a general permeability between syntax and morphology here. Second, there is not only a tendency to incorporate adverbial modifiers in general, but also an existing word formation schema for the specific type of target expression that is at issue here. In other words, the fact that German already has a schema for verb-adjective compounds may have provided additional motivation for shortening the combination of an adverbially used deverbal (participial) adjective + head adjective (kreischend bunt) to the corresponding compound variant, i.e. 
verb root + head adjective (kreischbunt). In a detailed study of such formations, Kienpointner (1985) distinguishes three different semantic types of verb-adjective compounds in German:
(43) a. limitative-relational: ‘X is A with respect to V’
        denkfaul ‘think-lazy’
     b. causal: ‘X is A because of V’
        schwitznass ‘sweat-wet’
     c. consecutive: ‘X is so A that V’
        triefnass ‘drip-wet’

Type (43c) corresponds to the present target pattern. Some examples of this type from my data are given in (44):

(44) siedeheiß,   stinkreich,        jammerschade,       knallgelb,
     seethe-hot   stink-rich         wail-unfortunate    bang-yellow
     blitzblank,  quietschvergnügt,  rappelvoll,         brühwarm,
     flash-clean  squeak-joyful      rumble-full         brew-warm
     klatschnass, klapperdürr
     clap-wet     clatter-scrawny
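The three-way classification in (43) can be restated as a small lookup table. The following is only an illustrative sketch: the names `VerbAdjType`, `EXAMPLES`, `paraphrase` and `is_target_pattern` are hypothetical, the paraphrases are schematic fillings of the templates in (43) using the English glosses, and nothing here is part of Kienpointner’s (1985) own analysis.

```python
from enum import Enum


class VerbAdjType(Enum):
    """Semantic types of verb-adjective compounds, with the paraphrase templates from (43)."""
    LIMITATIVE_RELATIONAL = "X is {adj} with respect to {verb}"
    CAUSAL = "X is {adj} because of {verb}"
    CONSECUTIVE = "X is so {adj} that {verb}"


# Compound -> (verb gloss, adjective gloss, semantic type), taken from (43a-c).
EXAMPLES = {
    "denkfaul": ("think", "lazy", VerbAdjType.LIMITATIVE_RELATIONAL),
    "schwitznass": ("sweat", "wet", VerbAdjType.CAUSAL),
    "triefnass": ("drip", "wet", VerbAdjType.CONSECUTIVE),
}


def paraphrase(compound: str) -> str:
    """Fill the schematic template of the compound's type with its glosses (in capitals)."""
    verb, adj, kind = EXAMPLES[compound]
    return kind.value.format(adj=adj.upper(), verb=verb.upper())


def is_target_pattern(compound: str) -> bool:
    """Only the consecutive type corresponds to the intensifying target pattern."""
    return EXAMPLES[compound][2] is VerbAdjType.CONSECUTIVE


if __name__ == "__main__":
    for word in EXAMPLES:
        print(word, "|", paraphrase(word), "| target pattern:", is_target_pattern(word))
```

On this encoding, only triefnass is flagged as an instance of the target pattern, in line with the statement that type (43c) is the relevant one.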
Third, coalescence is further facilitated by the peculiar phonological characteristics of the target schema. Normally, German phrases and compounds are unambiguously distinguished by different stress patterns:

(45) a. Die Kirschen sind FRÜH REIF (in diesem Jahr).
        ‘The cherries are ripe early (in this year).’
     b. Das Kind ist FRÜHreif.
        ‘The child is precocious.’

However, morphological intensity expressions behave differently in this respect – in contrast to ordinary compounds, there is primary stress on both elements, just like in the phrasal variant:

(46) a. Die Schuhe sind mir zu HOCHhackig.
        the shoes are me.DAT too high-heeled
        ‘The shoes are too high-heeled for me.’ (determinative compound)
     b. Das ist HOCH ANständig von dir.
        that is high decent of you
        ‘That is highly decent of you.’ (phrasal intensification)
     c. Er wurde HOCHROT.
        He became high-red
        ‘He blushed darkly.’ (morphological intensification)

In short, with no further changes but the loss of a syllable that is already unstressed anyway, the phrasal syntagm can be compacted to a more economical expression with an identical semantics, an opportunity that speakers are very likely to exploit. Fourth and finally, a similar grammaticalisation process from syntagm to prefix can be retraced for another type of German intensification construction, the so-called ‘excessive’. The excessive is a pragmatically specialised construction that serves to further intensify superlatives; in German it is expressed by attaching the formative aller- to the adjective:

(47) a. Ein guter Spieler (Positive)
        A good player
        ‘A good player’
     b. Ein besserer Spieler (Comparative)
        A better player
        ‘A better player’
     c. Der beste Spieler (Superlative)
        The best player
        ‘The best player’
     d. Der allerbeste Spieler (Excessive)
        The EXC-best player
        ‘The very best player’
In their study of the German excessive, Dressler and Merlini Barbaresi (1994: 558) write: “The excessive historically originated from the juxtaposition of superlative and genitive plural all-er ‘of all’. Diachronically it acquired the status of a compound, but today it is rather felt as a case of a prefixed derivative, due to the isolated state of this combination with aller- (cf. Fleischer 1976: 247)”. The development from syntagm to prefix can be represented as follows:

(48) Aller bester Spieler       >   allerbester Spieler
     all.GEN-PL best player         EXC-best player
     ‘The best player of all’       ‘The very best player’
Even though the above quotation warrants some qualifications,17 the parallels to the development of prospective elative markers are obvious, which further adds to the plausibility of assuming a grammaticalisation process at work here. So, to sum up this small excursion, the literature on intensifying uses of construction B is often framed in discussions of grammaticalisation, and it is indeed possible to make a case for such developments with regard to a particular subset of my data (bound adjective intensifiers in German).

3.4.3. Construction C: Int + with/vor + N

Construction C is a clause-level construction that comes in three closely related variants in both languages, and most earlier approaches have tended to focus on one (or sometimes two) of these. The three main usage patterns of the structure can be characterised as a causal adjunct construction (49a), a construction for abstract property ascription (b) and a construction for locative predication (c):

(49) a. …fingers felt they would burst with the pain of freezing … (BNC FP0)
     b. She bursts with vitality and raw intelligence. (BNC A06)
     c. …to find his in-tray bursting with papers covering… (BNC AJM)

Previous approaches to the English construction (Boas 2004; Dowty 2000; Levin 1993; Rohdenburg 1974; Salkoff 1983) have tended to focus on the locative uses in (c), sometimes in conjunction with subtype (b), and usually framed in the context of discussions of ‘argument alternation’ as an echo to the famous juxtaposition in (50) (see also Fried 2005 for Czech):

(50) a. Bees are swarming in the garden.
     b. The garden is swarming with bees.
        (Fillmore 1968: 48)

Radden (1998) deals with semantic issues relating to the causal variant. As regards German, previous approaches to the construction are rare – Rohdenburg (1974) offers some remarks on the German variant of (49b) and (c), Zifonun et al. (1997) and especially Rosenfeld (1983) provide accounts of the causal usage exemplified in (49a).
It seems fair to assume that these different uses are related historically: specifically, it could be hypothesised that the causal variant (a) has given rise to the abstract intensifying uses (b) which in turn spawned the locative ones in (c) as a kind of concrete counterpart (Zeschel 2011). However, this is just speculation for which I do not have direct diachronic evidence. Synchronically, on the other hand, there is reason to assume that prototypical instances of variants (a) and (b) are distinct. Langacker (2009: 260) draws attention to the following contrast:

(51) a. Her child was shouting with joy. The shouting child…
     b. Her child was screaming with pain. The screaming child…
     c. Her garden was really swarming with bees. *The swarming garden…
     d. Her cat was crawling with fleas. *The crawling cat…

In spite of the contrast in (51), my study will not differentiate between the aforementioned subtypes and will include all three in the data, for two reasons: first, all three variants have essentially the same schematic meaning in that the subject referent is strongly associated with/affected by the entity denoted by the oblique NP. And second, it is simply not possible to partition naturally occurring examples of the construction into two neat sets in the way it can be done with constructed examples like those in (51): in naturalistic data, unambiguously literal-causal uses like (51a) gradually shade off via different degrees of hyperbole into ultimately clear-cut figurative, purely intensifying uses of the construction.
3.5. Objectives

With these preparations made, it is now possible to formulate the concrete empirical objectives of the study. They can be summarised as follows:

– to assess the extent to which the four targeted conceptual domains are exploited for the encoding of intensity in the three grammatical environments in both languages;
– to identify routinised formulas (‘fixed expressions’) and productive exploitations (‘creative extensions’) of these formulas in the corpus data, and to compare the usage patterns of the attested intensifiers across constructions and languages;
– to explore the internal semantic structure and schematic depth of the investigated collocation clusters and to identify determinants of their synchronic layering.

3.6. Chapter summary
Chapter 3 has introduced the empirical testing ground of the investigation. Section 3.2 has introduced some elementary background on the functional target domain of the study, intensification. Intensification was defined as a kind of scaling modification that serves to boost a flexibly identified aspect of the meaning of an intensified predicate (constructions A, B) or proposition (construction C). In order to distinguish intensity expressions from certain closely related modification structures, a test exploiting presuppositions triggered by the adverbs even/sogar was introduced (3.2.1). The empirical focus on intensification constructions was motivated with their intrinsic susceptibility to innovation: in their desire to signal emphasis in a maximally expressive, attention-getting way, speakers may be more prepared to push the limits of convention in this domain than they are in other areas of the grammar. Section 3.2.2 has continued a discussion raised in chapter 2 and considered both differences and points of contact between the present approach and studies of intensifier variation and change in grammaticalisation research. Section 3.3 has surveyed different semantic and pragmatic strategies for conveying intensity implications in English and German. Three common sources for such implications were distinguished before the specific semantic target categories of the study were introduced and exemplified with corpus samples from the dataset. These patterns were characterised as lexicalisation patterns connecting the category INTENSITY with four experientially primitive domains in the sense of chapter 2. Next, section 3.4 has given an overview of the three constructional target patterns of the study and provided pointers to selected previous analyses of these patterns that have informed my approach. Finally, section 3.5 has formulated the concrete objectives of the empirical case studies to which I will turn next.
Chapter 4 Lexicalisation patterns: From concepts to words
4.1. Introduction

Chapter 4 opens the empirical part of the investigation. The question to be investigated is: how are the four targeted PERCEPTION domains exploited for the signalling of intensification in the three English and German target constructions? As in all following empirical chapters, I will begin with a sketch of some general theoretical and methodological prerequisites for the subsequent analysis. In the present chapter, these relate to perspectives for the corpus-linguistic study of lexicalisation patterns as conceived in chapter 2 as well as the specific data sources that were employed (section 4.2). In section 4.3, I will explain the concrete procedure that was adopted and discuss issues of operationalisation, data extraction and coding. Section 4.4 presents the first part of the results. Following a brief overview of the empirical basis of the study at large, a detailed characterisation of all three constructions in both languages is provided. Conventional instances of the targeted intensification patterns in each construction and language are exemplified alongside creative exploitations, type and token frequencies per pattern are compared across constructions and languages, and intra-constructional frequency distributions are inspected. Section 4.5 offers a short chapter summary.

4.2. Prerequisites

4.2.1. The corpus-linguistic study of lexicalisation patterns
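The contrast between the two extraction strategies discussed in this section can be made concrete with a toy sketch. It uses hand-tagged versions of examples (1a-c); the simplified tag set, the function names `adj_noun_pairs` and `lexicon_hits`, and the stem list are assumptions made purely for illustration and do not reproduce the procedure actually adopted in the study.

```python
from typing import Dict, List, Tuple

Token = Tuple[str, str]  # (word form, POS tag)

# Hand-tagged toy versions of examples (1a-c); the tag set is a simplifying assumption.
CORPUS: Dict[str, List[Token]] = {
    "1a": [("Doch", "KON"), ("fossile", "ADJ"), ("Brennstoffe", "N"),
           ("haben", "V"), ("gravierende", "ADJ"), ("Nachteile", "N")],
    "1b": [("durch", "PREP"), ("den", "DET"), ("Bau", "N"),
           ("gravierende", "ADJ"), ("und", "KON"),
           ("irreparable", "ADJ"), ("Nachteile", "N")],
    "1c": [("Nachteile", "N"), ("hab", "V"), ("ich", "PRON"),
           ("im", "PREP"), ("Grunde", "N"), ("keine", "DET"),
           ("gravierenden", "ADJ"), ("gefunden", "V")],
}


def adj_noun_pairs(sentence: List[Token]) -> List[Tuple[str, str]]:
    """Grammar-based search: strictly adjacent ADJ + N sequences."""
    return [(w1, w2)
            for (w1, t1), (w2, t2) in zip(sentence, sentence[1:])
            if t1 == "ADJ" and t2 == "N"]


def lexicon_hits(sentence: List[Token], stems: List[str]) -> List[str]:
    """Lexicon-based search: word forms that begin with one of the listed stems."""
    return [w for w, _ in sentence
            if any(w.lower().startswith(stem) for stem in stems)]


if __name__ == "__main__":
    for key, sentence in CORPUS.items():
        print(key, adj_noun_pairs(sentence), lexicon_hits(sentence, ["gravierend"]))
```

In this sketch the strict adjacency pattern retrieves gravierende Nachteile in (1a), pairs only the second conjunct with the noun in (1b) unless the pattern is extended to coordination, and returns nothing for the split topicalisation in (1c). The stem search finds the intensifier in all three sentences, but only because gravierend was put on the list beforehand.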
In chapter 2, constructional lexicalisation patterns were defined as “construction-specifically conventionalised interactions between the semantics of the construction and the semantics of the inserted lexical items”. There are two complementary approaches to the systematic investigation of such patterns in a corpus: first, an exhaustive extraction of the construction in question, followed by a clean-up procedure that eliminates all instances of the construction that do not involve the targeted lexis. And second, an exhaustive extraction of the lexical items in question, followed by a clean-up procedure that eliminates all uses of these items in contexts other than the targeted construction. Both approaches come with their own advantages and problems. A strength of the grammar-based approach is that it is fully exhaustive: provided that the structural template of the target construction is indeed successfully identified in all its possible variants, even the most non-canonical lexical realisations will be retrieved, i.e. there is no danger of overlooking potential fillers during the extraction phase. On the downside, identifying constructions via their structural properties alone requires a corpus with appropriate annotation, and for a truly exhaustive retrieval, the required machinery will have to be rather complex. For instance, looking for Adj + N sequences in a POS-tagged corpus would find both the canonical realisation of the intensity expression gravierender Nachteil ‘grave disadvantage’ in (1a) and possibly also its coordinated variant in (b), yet not the split topicalisation in (c):

(1) a. Doch fossile Brennstoffe haben gravierende Nachteile.
       But fossil fuels have grave disadvantages
       (PUBLIC M06/JAN.04126)
    b. …durch den Bau gravierende und irreparable Nachteile.
       by the building grave and irreparable disadvantages
       (PUBLIC O96/MÄR.31524)
    c. Nachteile hab ich im Grunde keine gravierenden gefunden
       Disadvantages have I basically no grave found
       (http://www.ciao.de/Philips_170S6FS__Test_3130340)

The lexicon-based strategy has the advantage that it can be applied to unannotated (and hence in practice: larger) corpora. However, it comes with the inherent drawback that it is confined to a predetermined search space, and a truly exhaustive extraction requires a likewise truly exhaustive inventory of search terms. An obvious problem with this is that the targeted conceptual categories have graded membership and fuzzy boundaries (cf.
chapter 2), something that is not easily accommodated in the process of compiling a finite word list that only offers a binary distinction between ‘members’ and ‘non-members’ of the targeted class. Moreover, no such enumeration will ever be truly exhaustive. For instance, even the most meticulous preparation will miss such unpredictable types in the data as e.g. nonce words (like non-anticipated instances of sound symbolism), non-canonical word-formations or conversions, or simply words that are unknown (or did not occur) to either the analyst or the compilers of the lexicographic resources that are consulted (as there is no such thing as a genuinely ‘complete’ thesaurus).

Previous discussions of the issue in the cognitive-linguistic literature are mostly found in connection with metaphor research (Deignan 2005, Stefanowitsch 2006). For instance, Stefanowitsch (2006) proposes a corpus-linguistic approach to the study of metaphor called metaphorical pattern analysis. Metaphorical pattern analysis is target-domain oriented in that it involves searching a corpus for sets of lexical items of a particular category that is known (or expected) to function as a metaphorical target domain. For instance, for a study of ARGUMENT metaphors, the approach would involve concordancing sets of lexical items instantiating the target domain (claim, criticism, argument etc.) and exhaustively systematising all metaphorical uses in which these expressions are found. Metaphorical pattern analysis is thus directly complementary to the present approach, which involves searching for source domain lexis in order to identify instances with the relevant meaning. A further difference is that the expressions targeted in the present investigation are not necessarily metaphorical or otherwise figurative. However, like in metaphorical pattern analysis, the targeted expressions always involve a semantic shift in the sense that lexical items denoting concrete perceptible events of stimulus emission are employed to convey abstract implications of intensity. And also like metaphorical pattern analysis, the present investigation adopts a lexicon-based approach to the investigation of constructional lexicalisation patterns. Before the particulars of the procedure are presented in section 4.3, section 4.2.2 introduces the data that form the empirical basis of the study.

4.2.2. Data
The German data were extracted from PUBLIC, the complete collection of all freely accessible corpora of written German that is available from the Institut für Deutsche Sprache (IdS) in Mannheim. PUBLIC can be accessed online via the COSMAS II corpus query system using a downloadable client.19 At the time of investigation, the freely accessible part of these corpora comprised roughly 960m word forms. The English data were in part extracted from the British National Corpus (BNC), a balanced 100m word corpus of 1990s British English (Burnard 1995).20 However, with the exception of construction C, the data thus obtained was too sparse, confirming observations by Geyken et al. (2004)
that 100 million words is too small a size for the study of particular idioms. As a result, the BNC results were combined with data obtained from .uk websites, which led to a considerable increase in both type and token frequencies of the investigated constructions (see section 4.4). Exploiting the web as a resource for corpus linguistics has been a much-debated topic in recent years. In the remainder of this section, I will briefly discuss some of the issues raised in this connection, with special attention to known problems and how they were addressed in my study. Specifically, these are questions relating to
- representativeness
- replicability
- authorship
- duplicates
- noise
I will deal with each of these in turn. To begin with, it would seem that a carefully constructed corpus such as the BNC that seeks to balance material from spoken and written sources and different genres in a well thought-out ratio is infinitely superior in approximating representativeness for a particular language to something as messy and unfathomable as the web. On the other hand, it is not at all trivial to assess just how well thought-out this ‘well thought-out ratio’ is. Kilgarriff and Grefenstette (2003: 343) point to a number of serious problems and unresolved questions regarding the premises of representativeness estimates, coming to the provocative conclusion that “[t]he web is not representative of anything else. But neither are other corpora, in any well-understood sense”. Obviously, this does not support the reverse conclusion that ‘anything goes’ and concerns with representativeness can be discarded altogether. It is evident that specialised corpora are biased for the properties of the respective sublanguage at hand – for instance, if one wanted to know how common the English word crossbar is, the corpus should not consist entirely of sports reporting, or worse a text collection about penalty shootouts in football. However, the internet may not be all that bad in this respect. In a study that compares word frequencies in balanced corpora (such as the BNC) with counts from unbalanced ones (e.g. newswire corpora such as the Reuters corpus) on the one hand and web-derived corpora on the other, Sharoff (2006) finds that “the difference between word frequencies in the Internet and representative corpora are much less significant than those for corpora based on newswires” (Sharoff 2006: 93). Likewise, in a study entitled “Using the web to obtain frequencies for unseen bigrams”, Keller and Lapata (2003: 22) report that “web queries can generate frequencies that are comparable to the ones obtained from a balanced, carefully edited corpus such as the BNC” whilst at the same time overcoming the problem of data sparseness. In short, there is empirical evidence that data extracted from the web are not grossly out of proportion when compared to data from dedicated linguistic corpora.

Another topic is the issue of replicability. Here, it is important to distinguish different ways of using the web for corpus-linguistic purposes. Bernardini, Baroni, and Evert (2006) distinguish different interpretations of the notion ‘web as corpus’, among them the (presumably most popular) approach which sees the web “as a corpus surrogate” and a second one that uses the web “as a corpus shop” (p. 10). By the ‘web-as-corpus-surrogate’ type, the authors mean any approach that employs “a standard commercial search engine for opportunistic reasons” (such as the lack of a sufficiently large existing corpus for the purpose at hand), and which is optionally mediated by the use of a web concordancer that operates on top of these engines (such as KWiCFinder21 or WebCorp22). Since the web changes constantly, the results of such studies are not replicable in the sense that later researchers will be able to retrieve exactly the same data when submitting identical queries to the same engines. Some authors do not see a serious problem with that, however: for instance, Kilgarriff (2001, quoted in Sharoff 2006: 94) compares the use of such web-derived data to a chemical analysis of water from a river – repeating the analysis, you cannot expect to obtain exactly the same molecules, but the study as such is replicable.
By contrast, the ‘web-as-corpus-shop’ approach uses traditional search engines merely for the purpose of identifying materials that meet certain predefined criteria for text inclusion before the sites containing these materials are downloaded and a locally stored corpus in the traditional sense is built. For closed corpora like these that have been downloaded from the web at a specific point in time and that are then made publicly available for linguistic research, it is of course possible to replicate the results of earlier studies in a narrower understanding of the term. On the other hand, this latter approach comes with the drawback that it is considerably more data intensive and time consuming, an aspect that cannot be lost sight of. For example, even granted that commercial search engine counts may be dubious in certain respects, nobody would want to construct a full-blown corpus of internet language before comparing the frequencies of two competing spelling variants. Needless to say, balancing cost and effort is also a vital
concern for more genuinely linguistic research purposes like the present study, where the construction of an appropriately post-processed giga corpus of internet language along the lines suggested in e.g. Bernardini, Baroni, and Evert (2006) as a mere precursor to the actual investigation itself was out of the question for practical reasons. Since no such precompiled corpus was available to me at the time of the investigation either, I therefore stuck to the first approach and used the web concordancer WebCorp to acquire the requisite additional data. The main advantage of such web concordancers over using the underlying search engines directly is that they allow for more sophisticated search facilities such as wildcard queries. Moreover, such programs actually concordance the pages returned by the search engine, meaning that if a site contains several instances of the target pattern, all of them will be retrieved as separate hits to be presented in a familiar KWIC-concordance display. At the time of the investigation, a problem with WebCorp was that it had a preset restriction to a maximum of 200 concordanced websites.23 Due to this limitation, distributional differences in the results will reflect differences in the relative preponderance of intensity readings as opposed to other functions of the extracted constructions. For instance, whereas construction A is surely much, much more frequent than constructions B and C in both English and German at large, it is maybe also considerably less often used for intensifying purposes, such that the net count of relevant observations obtained from the web sample would be lower than for the latter two (incidentally, the same qualification also applies for individual bases, where the number of relevant hits in the web queries likewise depends on how strongly the base in question is associated with intensifying function). 
As a result, enhancing the English data with examples from the web helped to ensure that there were enough observations for a quantitative analysis to begin with, but it must be borne in mind that the obtained results boost the relative frequency of elements which are strongly associated with intensifying function at the expense of others where this association is more marginal. However, since there was no alternative to a recourse to web data given the resources that I had available at the time of the investigation, and seeing that even for a more marginally intensifying construction such as Adj + N token counts could still be boosted by more than 70%, this was a price that had to be paid. Moving on to the remaining three issues identified in the beginning of this section, the next one is the question of authorship. For present purposes, this issue boils down to the question of how to ensure that a given
text in the sample was in fact produced by a native speaker. To cut it short, there is of course no way to make sure that this is in fact the case. However, WebCorp offers the possibility of restricting the search to particular domains, and a restriction to ‘.uk’-websites not only increased the likelihood of obtaining native speaker data but also minimised skewing through homographic hits from other languages. Fourth, there is the difficult issue of (near-)duplicates, i.e. replicated texts or text fragments that one would not want to count as independent hits each time anew. Automatised approaches employ algorithms detecting conspicuous amounts of identical n-grams that can serve to discover (near-) identical documents and hence weed out such things as printer-friendly versions of websites that were already included in the data set. However, this does not solve the problem of duplicates within one and the same document (such as e.g. quoted passages in a discussion forum). Since the data in the present study had to be coded manually anyway, I could take a different approach here: duplicates were defined as concordance lines exhibiting an overlap of at least eight successive words, with one of these being the targeted (potential) intensifier. Shorter duplicates were excluded if they were obvious instances of a product or brand name (e.g. Baron Blazing Hot Pepper Sauce). Finally, manual coding of the results also helped to eliminate noise in the form of such deviant hits as metalinguistic ‘mentions’ rather than naturally occurring usages (as found in e.g. dictionary explications) and certain obvious artefacts of the concordancing procedure (e.g. the juxtaposition of terms in a link or page name, or from adjacent columns in a table), thereby guaranteeing a very thorough cleaning of the data. Given these sources, it is obvious that my investigation draws on data from different channels, registers and text types. 
This is certainly not to imply that these have the same linguistic characteristics. However, the focus of my study is specifically on what is possible ‘in English’ and ‘in German’ at all, i.e. on the contrast between attested vs. unattested usages of relevant expressions in the investigated data rather than on the relative preponderance of particular usages over others in specific sublanguages. Many of the intensifying devices that are investigated in this study are not only differentiated semantically (in the sense of being reserved for particular classes of intensified predicates), but also in terms of various extralinguistic associations (such as appropriateness for a particular channel, within a certain genre, at a specific level of formality, for speakers of a particular age or social group, and in some cases even for a specific discourse topic).24 While any such information is undoubtedly part of what speakers know
about the conventional uses of a given expression, it is nevertheless only the first kind of properties – semantic co-occurrence restrictions in the narrower sense – that will be investigated here.
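The duplicate criterion stated above (an overlap of at least eight successive words that includes the targeted intensifier) was applied manually in the study. Purely as an illustration, it could be approximated along the following lines. The function names, the tokenisation into alphanumeric runs and the example sentences are assumptions made for this sketch; the additional rule for shorter brand-name duplicates is not covered.

```python
import re

def tokens(line):
    """Lower-cased word tokens; punctuation is ignored (an assumption of this sketch)."""
    return re.findall(r"\w+", line.lower())

def is_duplicate(line_a, line_b, target, min_run=8):
    """True if both lines share a run of min_run successive words containing target."""
    a, b = tokens(line_a), tokens(line_b)
    target = target.lower()
    # All windows of exactly min_run words in line_b.
    windows_b = {tuple(b[i:i + min_run]) for i in range(len(b) - min_run + 1)}
    # Any longer shared run that contains the target also contains such a window.
    return any(
        target in a[i:i + min_run] and tuple(a[i:i + min_run]) in windows_b
        for i in range(len(a) - min_run + 1)
    )

def drop_duplicates(lines, target, min_run=8):
    """Keep the first occurrence of each group of duplicate concordance lines."""
    kept = []
    for line in lines:
        if not any(is_duplicate(line, earlier, target, min_run) for earlier in kept):
            kept.append(line)
    return kept

# Invented concordance lines for the potential intensifier "blazing".
lines = [
    "We tried the new sauce and it was blazing hot on the tongue all night.",
    "Quote: the new sauce and it was blazing hot on the tongue, he said.",
    "A blazing row broke out between the two neighbours over the garden fence.",
]
unique_lines = drop_duplicates(lines, "blazing")
```

In this toy input, the second line repeats an eleven-word stretch of the first (as a forum quotation might) and is therefore discarded, whereas the third line shares only the intensifier itself and is kept.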
4.3. Procedure

4.3.1. Setting up the search space
Recalling the issues discussed in section 4.1, the choice of a lexicon-based retrieval strategy raised two problems that had to be addressed: first, the challenge of devising comprehensive lexical inventories of the targeted semantic domains, and second, the problem of polysemy (for items that could be assigned to more than one of the investigated patterns). As already mentioned in chapter 3, the study was limited to deverbal bases. Seeing that there was no satisfactory pre-compiled list of relevant items (i.e. verbs of SOUND, LIGHT, SMELL and HEAT EMISSION) that I could simply take over, I started out by augmenting existing approximations for both languages (taken from previous work on semantic verb classes in English and German such as Levin 1993 and Snell-Hornby 1983) introspectively until no further items came up that were judged as words for concepts of the targeted domains. In addition, I consulted the literature on intensification and the three investigated constructions reviewed in chapter 3, seeing that some of these studies (e.g. Salkoff 1983) provide lists of lexical items that are attested in the investigated constructions. Third, the resulting preliminary lists were checked item by item in several thesauri.25 For each item, results of the thesaurus queries were submitted to the thesauri again until there were no more new types coming up that were judged as relevantly similar to the original start items in their relevant reading(s). Finally, the obtained results were cross-checked all over again contrastively by individually submitting each English and German item to the web-based dictionary LEO26 and filling in relevant misses from the translation results. When a list of potential instances had been compiled for each pattern, the second issue that needed to be addressed was how to deal with bases that could have been assigned to more than one of the investigated classes (e.g. glow as a possible instance of both LIGHT and HEAT EMISSION). 
Even though the reduction of ambiguous items to a single central sense is dubious from a cognitive point of view (cf. chapter 2), it is nevertheless a price that had to be paid here: without imposing clear boundaries onto the data
and arriving at well-delimited sets of bases for each of the investigated lexicalisation patterns, it is not possible to compare these patterns among each other or across languages (or at least not quantitatively). This means that items that are located at the very boundary between two target categories had to be disambiguated in favour of one of the two relevant domains. Since it is difficult to decide such questions on purely semantic grounds through speculations about which of several related readings is ‘more basic’ than the others, I took a distributional approach to the problem: given that speakers’ lexical choice in the intensifier slot is determined by the semantics of the particular property at hand, it is reasonable to assume that similar co-occurrence patterns can offer useful clues as to which intensifying lemmas are also grouped together on the conceptual plane. Hence, if an ambiguous base like glow was found to share more intensity collocations (i.e. types) with unambiguous HEAT items like burn, blaze and scorch than with unambiguous LIGHT items like beam, glare and shine, it was assigned to the former category because of its greater collocational overlap with the remaining class.

The following procedure was applied for the disambiguation task: first, the co-occurrence properties of prospective superordinate categories were identified by removing all ambiguous types from the original dataset and then lumping the remaining (unambiguous) members of the investigated classes together to form a reference set for each intensification pattern. Seeing that some of the investigated categories are considerably more diversified than others (cf. section 4.4), it was furthermore necessary to normalise the type counts to the overall type frequency (of composite intensity expressions) of the respective class.
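The logic of this comparison can be made concrete with a small sketch for a single construction. It is not the implementation used in the study: the function names are hypothetical, the co-occurrence sets are invented toy data, and the normalisation denominator is taken here to be the number of distinct predicate types in each reference set, which is only one possible reading of the normalisation described above.

```python
def reference_sets(collocates, category_of):
    """Pool the predicate types of all unambiguous bases per category."""
    refs = {}
    for base, predicates in collocates.items():
        category = category_of.get(base)
        if category is not None:  # ambiguous bases carry no category and are skipped
            refs.setdefault(category, set()).update(predicates)
    return refs

def normalised_overlap(base_predicates, refs):
    """Shared predicate types per category, divided by the size of the reference set."""
    return {
        category: len(base_predicates & ref) / len(ref)
        for category, ref in refs.items()
        if ref
    }

def assign(base_predicates, refs):
    """Category with the largest normalised overlap; None if there is a tie."""
    scores = normalised_overlap(base_predicates, refs)
    if not scores:
        return None
    best = max(scores.values())
    winners = [category for category, score in scores.items() if score == best]
    return winners[0] if len(winners) == 1 else None

# Invented toy data: predicate types co-occurring with unambiguous bases.
collocates = {
    "burn": {"anger", "desire", "shame"},
    "blaze": {"anger", "fury"},
    "shine": {"pride", "joy"},
    "beam": {"pride", "joy", "delight", "happiness"},
}
category_of = {"burn": "HEAT", "blaze": "HEAT", "shine": "LIGHT", "beam": "LIGHT"}

refs = reference_sets(collocates, category_of)
glow = {"pride", "anger", "desire"}  # invented predicate types for the ambiguous base
scores = normalised_overlap(glow, refs)
decision = assign(glow, refs)
```

With these toy sets, glow shares two of the four HEAT reference types but only one of the four LIGHT reference types, so the sketch assigns it to HEAT. Ties are deliberately left undecided here, since they call for further criteria.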
For each construction, the evaluated base was then assigned to the intensification pattern that exhibited the largest percentual overlap in co-occurring predicate types. In the final step, the base was assigned to the pattern that came out as distributionally most similar across all three constructions. Where there was no majority but a tie between patterns, items were assigned to the category that collocated most strongly with the top co-occurrence partner of this base (in the majority of constructions). Where there was no significant collocation either, the pattern in which the top co-occurrence partner occurred with the highest token frequency was chosen instead. If an item was not attested as an intensifier in either of the prospective target categories at all, it was classified as ‘unattested’ and retained on the list of potential types for each eligible category. Following this procedure, disambiguations were performed for 3.3% of the
types (16 bases). With the exception of sizzle/brutzeln (ambiguous between SOUND and HEAT), all ambiguities were HEAT/LIGHT polysemies. Tables A.1 and A.2 in the appendix give the final item lists for all four intensification patterns in both languages after disambiguation.

4.3.2. Data extraction and coding
Once the item lists had been set up, potential intensity expressions involving the targeted bases could be extracted from the corpora. The BNC data were extracted using MonoConc 2.2, the web data were collected with WebCorp and the German data were obtained through the COSMAS II client and search interface provided by the IDS. All in all, more than 2000 concordances were generated for 481 lemmas.27 For English, the following queries were used:

Table 4.1 Search strings (English)

Query                                 Example          Target construction
[ROOT]-ing                            creaking         A, B
[ROOT]-ingly                          creakingly       B
[ROOT]-y                              creaky           B
[ROOT] [w3] with                      creak…with       C
[ROOT]-s [w3] with                    creaks…with      C
[ROOT]-ing [w3] with                  creaking…with    C
[ROOT]-[PAST] [w3] with               creaked…with     C
[ROOT]-[PAST-PARTICIPLE] [w3] with    creaked…with     C
‘W3’ denotes a span of three words between the verb and the preposition with, a window size that had been found to work best in a series of pretests. The span served to ensure that examples like the following were also extracted:

(2) a. The liberals were buzzing ominously with activity. (BNC HRJ)
    b. Dot's heart began pounding, first with anxiety, then with an irrational hope. (BNC AC5)
    c. …Martha felt her body flush to its core with a spasm of heat… (BNC APU)
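To illustrate what such a span query retrieves, the following sketch emulates it with a regular expression over plain text. It is not the query syntax of MonoConc or WebCorp. The function name is hypothetical, the span is interpreted as at most three intervening words, words are approximated as alphanumeric runs, and the inflected forms have to be supplied by hand.

```python
import re

def span_pattern(forms, max_gap=3, prep="with"):
    """Regex for any of the given verb forms followed, within max_gap words, by prep."""
    alternatives = "|".join(re.escape(form) for form in forms)
    return re.compile(
        rf"\b(?:{alternatives})\b(?:\W+\w+){{0,{max_gap}}}?\W+{re.escape(prep)}\b",
        re.IGNORECASE,
    )

buzz = span_pattern(["buzz", "buzzes", "buzzing", "buzzed"])
pound = span_pattern(["pound", "pounds", "pounding", "pounded"])
flush = span_pattern(["flush", "flushes", "flushing", "flushed"])

# The three BNC examples in (2) fall within the window of three words.
hit_a = buzz.search("The liberals were buzzing ominously with activity.")
hit_b = pound.search("Dot's heart began pounding, first with anxiety, then with an irrational hope.")
hit_c = flush.search("Martha felt her body flush to its core with a spasm of heat")

# Invented counterexample: five words intervene, so the pattern does not match.
miss = buzz.search("The town was buzzing all through the long night with rumours.")
```

A wider window would also retrieve the last sentence, but at the price of more irrelevant hits, which is presumably the trade-off that the pretests mentioned above were meant to settle.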
In German, verb morphology is considerably richer than in English. Fortunately, an explosion of the number of necessary queries could nevertheless be avoided since the IDS-corpora are lemmatised. Hence, it was possible to retrieve all inflections of a given target item by simply searching for its base form in conjunction with the lemmatisation operator ‘&’. On the other hand, there were also a handful of rare types that did have occurrences in the corpus, but appeared not to be lemmatised in the system. For instance, searches for “&britzeln” and “&japsen” both yielded zero hits, even though both are attested in the corpus in construction C:

(3) a. …einem vor lauter Ladungsdichte britzelnden
       an with genuine load-density BRITZEL
       Trauermarsch…
       funeral-march
       (PUBLIC P98/JUL.29213)
    b. Da japste Wolfi vor lauter Glück…
       there panted Wolfi with genuine happiness
       (PUBLIC O98/JAN.02664)

In such cases, a simple wildcard search for the stem was performed. Furthermore, in view of the much freer word order of German, an increased span of five words in both directions was allowed between the verb and the preposition vor. For constructions A and B, a wildcard search for the bare root of all intensifiers was performed. Not surprisingly given the corpus size, such queries (e.g. strahl*) often matched an enormous number of types in the corpus (the vast majority being irrelevant compounds). Fortunately, the COSMAS-II client offers the convenient function of displaying a list of all types in the corpus that match the query before retrieving the associated tokens, with the option of deselecting all unwanted types before generating the actual concordance itself. That way, the number of bad hits could be reduced drastically. In addition, the system has an inbuilt restriction to a maximum of 10,000 exported tokens per query when accessing the corpora online rather than locally at the IDS in Mannheim.
Whenever a search yielded more than 10,000 hits (as in the case of strahl* ‘beam’, for instance), the base was concordanced anew, this time with separate concordances for the three structural templates [ROOT]-*, [ROOT]-end* and [ROOT]-ig* that were exported individually. Where any of the three queries in isolation still yielded more than 10,000 tokens, I had to make do with these 10,000 hits. Table 4.2 summarises the search strings for German:
Table 4.2 Search strings (German)

Query                  Example         Target construction
[ROOT]-end*            knallend*       A, B
[ROOT]-ig*             knallig*        A, B
[ROOT]-*               knall*          A, B
&[ROOT] [w5] vor       &knallen…vor    C
[ROOT]-* [w5] vor      knall*…vor      C
Following this procedure, more than 300,000 concordance lines were retrieved in more than 2000 queries. The identification of intensity expressions in these results required full manual post-editing. Coding effort was reduced by sorting the concordances so that long stretches of non-intensifying uses of the targeted bases in the concordance display could be conveniently identified and discarded en bloc. Data editing comprised three steps:

- eliminating unwanted hits
- coding
- lemmatisation
Unwanted hits were non-intensifying uses of the targeted bases that did not pass the even/sogar-test introduced in chapter 3 when inspected in context. As detailed in chapter 3, new intensity meanings commonly arise as conversational implicatures, passing through a more or less prolonged stage of ambiguity between the original meaning and the new intensity reading that consolidates through piecemeal distributional expansion. Since a study that is interested precisely in these kinds of semantic changes cannot afford to miss out on the ambiguity stage (beyond which many proto-intensifiers may never advance), it was crucial to also include all tokens in which an intensifying reading of the expression is at least possible. As a rule of thumb, therefore, items in which the relative salience of the intensity implication (as evidenced by the test) was judged as dubious were nevertheless retained in the data as long as a fitting interpretation was seen as possible in principle. Likewise, intensified predicates’ membership in the classes ‘N’ and ‘Adj’ was not interpreted too strictly. For instance, nominal coercions like a resounding yes were counted as instances of the pattern Int + N, and construction B was extended to cover not only canonical APs but more generally all intensifiable non-nominal predicates that can occur in attributive and/or predicative position (e.g. adjectivally used verbal participles like
admired and also some pronominal adverbs like German dagegen ‘against it’). Observations that passed the test were then coded for the lemma of both the intensifier and the intensified property. In order to avoid artificially boosting the data, examples with more than one intensifier and/or intensified property such as those in (4) were only counted once:

(4) a. …the air throbbed, whined, hummed with carmine-winged grasshoppers, locusts, huge hornets, bees, midges, bots, and ten thousand other anonymous insects. (BNC G13)
    b. … den Eindruck der flirrend-versengenden Hitze.
       the impression of.the whirring-scorching heat
       (PUBLIC R99/JAN.02562)

In case there were several coordinated intensifiers, the one that appeared closest to the (first) intensified property was chosen.28 In case there were several coordinated predicates that were modified by the same intensifier, I also picked the one that appeared closest to this intensifier. Furthermore, if the head of the phrase that encoded the ascribed property was not itself the word that actually designated this property, semantic considerations were ranked higher than syntactic ones and the example was coded for the latter element. Some examples are given in (5):

(5) a. …a real princess tiara glittering with loads of sparkling beads…
       (http://www.ayedo.co.uk/listman/listings/l0109.html)
    b. Everyone loves a good wrestling rumour, and every year around the time of WrestleMania the internet is buzzing with them.
       (http://www.thesun.co.uk/article/0,,2002120004,00.html)
    c. It's like a card game of sex, death and intrigue crackling with what the author calls the "baroque language" of the noir genre.
       (http://www.robertsherwood.homechoice.co.uk/bplay.html)

In the partitive construction in (5a), beads rather than loads was coded as the element that goes with glitter; the anaphora in (5b) was resolved so that buzz was grouped with wrestling rumour (rather than them), and the free relative in (5c) was coded for (baroque) language.
Finally, both the intensifier and the intensified predicate were lemmatised. For the intensifier, spelling variants and (where applicable, i.e. especially in constructions A and B in German) bound and free realisations of
the respective base were collapsed prior to the collocation calculation. Intensified predicates were stripped of inflections (such as case and number marking) and homogenised for spelling (e.g. English colour/color → colour; adrenalin/adrenaline/adrenline → adrenaline; program/programme → programme).

4.4. Results
The present section gives an overview of the main descriptive results of the study prior to any further analysis. Section 4.4.1 surveys the data with figures for all three constructions collapsed and sections 4.4.2 to 4.4.4 go through the constructions one by one.

4.4.1. Overview
Table 4.3 summarises the frequencies of concordanced bases, attested intensifiers and the overall type and token counts of all intensity expressions in the entire dataset (with all intensification patterns and constructions combined):

Table 4.3 Intensifiers and intensity expressions in the study

Language    Bases    Attested as intensifiers    Tokens    Types
English     253      222 (87.7%)                 17,358    7885
German      228      172 (75.4%)                 52,633    5331
Total       481      394 (81.9%)                 69,991    13,216
The tabulation shows that the use of PERCEPTION intensifiers in the targeted environments is more variegated in English than it is in German: not only are there more intensifier types of the relevant kind (both in absolute terms and relative to the size of the investigated lexicon), but these items are also found in a greater variety of construct types: English has many more distinct intensifier-predicate combinations in the data than German (+48% as compared to the German count), even though it has substantially fewer tokens in the dataset (corresponding to a proportion of only 32.9% of the German count). Put differently, the German data appear to contain a greater proportion of (relatively) high frequency expressions that may be candidates for the status of fixed expressions.
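The proportions discussed in this paragraph follow directly from the counts in table 4.3. As a quick check, they can be recomputed as follows; the variable and function names are chosen for this sketch only, while the figures are those of the table.

```python
# Counts from table 4.3 (all patterns and constructions combined).
table_4_3 = {
    "English": {"bases": 253, "attested": 222, "tokens": 17358, "types": 7885},
    "German": {"bases": 228, "attested": 172, "tokens": 52633, "types": 5331},
}

def share_attested(row):
    """Proportion of concordanced bases that are attested as intensifiers."""
    return row["attested"] / row["bases"]

def tokens_per_type(row):
    """Average number of tokens per construct type."""
    return row["tokens"] / row["types"]

english, german = table_4_3["English"], table_4_3["German"]

# Surplus of English construct types relative to the German count.
type_surplus = english["types"] / german["types"] - 1

# English tokens as a proportion of the German token count.
token_share = english["tokens"] / german["tokens"]

for language, row in table_4_3.items():
    print(f"{language}: {share_attested(row):.1%} attested, "
          f"{tokens_per_type(row):.1f} tokens per type")
print(f"type surplus: {type_surplus:.0%}, token share: {token_share:.1%}")
```

The last two helper values make the contrast explicit: English has roughly half as many construct types again as German, but only about a third of its tokens, so that the average German construct type is instantiated far more often.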
Focusing on construct types (not intensifier types) to begin with, figure 4.1 breaks these frequencies down for the three different constructions:
Figure 4.1 Construction type and token frequencies across languages
The graph illustrates that the lion’s share of the tokens in the study is taken up by German Int + N and Int + Adj expressions. It also shows that type and token frequency proportions within languages are reversed: in German, token frequency decreases from construction A (51% of all tokens) over construction B (45%) to construction C (4%), which is exactly the other way round in English (A: 17%, B: 37%, C: 45%). As regards English, however, token frequency differences between constructions A-C in the overall dataset must be taken with a pinch of salt (cf. section 4.2.2 above): as a consequence of the sampling procedure for web data, relatively more instances of constructions B and C found their way into the dataset than relevant tokens of construction A. Since frequency differences between the three constructions are not the major focus of attention of the study, this was considered an acceptable price to pay in order to obtain enough data for the co-occurrence analyses. Nevertheless, when it is exactly this question that is at issue, it is possible to correct for this distortion by inspecting only the English data from the BNC in isolation. Figure 4.2 contrasts construct type and token frequencies of the three target constructions in the BNC alone, thus providing an unbiased indication of their relative frequencies in English:
Figure 4.2 English construct type and token frequencies in the BNC
As before, there are still more tokens of construction C than there are of construction B, but the frequency of intensifying uses of construction A relative to B and C is markedly higher now – in fact, figure 4.2 indicates that also in English, Int + N is the most token frequent of the three investigated intensification constructions. What remains striking, however, is the stark contrast in the relative frequency of construction C between languages. Going back to figure 4.1, a look at type frequencies shows that they mirror the token frequency ranking between constructions in both languages (A > B > C in German and C > B > A in English). Also remarkable are the noticeable token/type-ratio peaks in German constructions A (~10 tokens per type) and B (~30.9, as compared to 2.1–3.5 tokens per type for the remaining four constructions). This indicates that high-frequency exemplars of the investigated expression types are predominantly found in the environments Int + N and especially Int + Adj in German. Approaching the data from the semantic point of view, figure 4.3 on the top of the next page gives an overview of construct type and token frequencies of all four patterns in all three constructions taken together. Ignoring differences between constructions for the moment, the graph shows that in German, the domains SOUND and LIGHT have by far the most tokens in the dataset; in English, the dominant patterns are SOUND and HEAT, closely followed by LIGHT:
Figure 4.3 Intensification pattern type and token frequencies across languages
Figure 4.4 plots concordanced against attested bases from all four domains, again with all three constructions taken together:
Figure 4.4 Bases concordanced and attested as intensifiers in both languages
Given the highly specialised nature of many of the investigated bases (notably in large domains such as SOUND), a result that was not anticipated here is the remarkable proportion of potentially eligible bases that are in fact attested as intensifiers. Setting aside the domain SMELL with its 50% attested types in both languages (out of a mere six in total in both cases, though), five of the remaining six proportions are .75 or higher, and two are even greater than .9. Tables A.3 to A.6 in the appendix report the most type frequent individual items per lexicalisation pattern (with counts for all three constructions collapsed). For a closer look at the data, it is now time to compare the three English and German constructions side by side.

4.4.2. Construction A
Basic frequency information for construction A is summarised in table 4.4 (where: ‘Intensifiers’=number of bases attested as intensifiers, ‘Tokens’=number of construct/intensity expression tokens and ‘Types’=number of construct/intensity expression types):

Table 4.4 Frequency overview: construction A

         English                       German
Pattern  Intensifiers  Tokens  Types   Intensifiers  Tokens  Types
SOUND    163 (88.1%)   1170    369     116 (77.3%)   9887    1345
LIGHT    21 (84%)      826     318     25 (92.6%)    9176    1301
SMELL    3 (50%)       16      10      3 (50%)       58      19
HEAT     35 (94.6%)    1025    390     28 (62.2%)    7663    1144
Total    222 (87.7%)   3037    1087    172 (75.4%)   26,784  3807
For ease of orientation, figure 4.5 compares construct type and token frequencies across patterns and languages also in graphic form:
Figure 4.5 Construct type/token frequencies across languages: Construction A
The diagram shows that high frequency expressions within construction A are especially found in the domains SOUND, LIGHT and HEAT in German. Moving down from the patterns to the level of specific lexical items, the token frequency distribution of individual intensifiers in this construction is plotted in figures 4.6 and 4.7:
Figure 4.6 Intensifier token frequency distribution: construction A (English)
Figure 4.7 Intensifier token frequency distribution: construction A (German)
In both languages, the token frequency of intensifiers approximates a Zipfian distribution: the item count is highest where token frequency is lowest (i.e. in the limiting case n=1), and token frequencies are highest where the number of items per class is lowest (again, in the limiting case n=1). Figures 4.8 and 4.9 present an alternative visualisation of frequency proportions between items in the form of collexeme clouds. The clouds are faithful replications of frequency proportions (e.g. if word A occurs twice as often as word B in the data, it will be reproduced exactly twice as large) that permit an easy identification of the top frequent items in a category. Collexeme clouds were generated using Wordle, a free online tool that is available from http://www.wordle.net/.
Figure 4.8 Intensifier token frequency cloud: construction A (English)
Figure 4.9 Intensifier token frequency cloud: construction A (German)
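The Zipfian shape described above can be made concrete by tabulating, for each token frequency n, how many intensifiers occur exactly n times (a frequency spectrum). The following Python sketch illustrates the idea. The counts in `token_counts` are hypothetical toy values chosen for illustration, not figures from the dataset, and the function name is likewise an assumption.

```python
from collections import Counter

# Hypothetical token counts per intensifier (illustration only, not study data).
token_counts = {
    "glare": 40, "burn": 12, "glow": 5, "sear": 2, "roar": 2,
    "howl": 1, "jar": 1, "gush": 1, "fizz": 1, "thud": 1,
}


def frequency_spectrum(counts):
    """Map each token frequency n to the number of items occurring n times."""
    return dict(sorted(Counter(counts.values()).items()))


spectrum = frequency_spectrum(token_counts)
for n, items in spectrum.items():
    print(f"n={n:>2}: {items} item(s)")
```

In a distribution of this kind, the largest class is the one with n=1, while the highest token frequencies are reached by classes containing a single item.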
Examples of high token frequency intensifiers from each pattern in context are provided in (6) and (7):

(6) a. The day was a resounding success and… (BNC CJS) (SOUND)
    b. The most glaring omission was the lack of reference to… (BNC CR5) (LIGHT)
    c. Amateurs, I thought. Bloody stinking amateurs. (BNC H80) (SMELL)
    d. …Pele paid glowing tribute to Moore. (BNC K5A) (HEAT)

(7) a. Mit einem rauschenden Fest feierten hundert…
       with a whooshing party celebrated hundred
       (PUBLIC A99/MÄR.18686) (SOUND)
    b. Strahlender Gewinner der Gala war die Agentur…
       beaming winner of.the gala was the agency
       (PUBLIC P00/SEP.33405) (LIGHT)
    c. …sie habe "eine Stinkwut" auf ihren Mann…
       she had a stink-anger on her husband
       (PUBLIC N99/OKT.43219) (SMELL)
    d. Der flammende Appell des Außenministers…
       the flaming appeal of.the foreign-secretary
       (PUBLIC N98/MÄR.07836) (HEAT)

Creative exploitations (i.e. low frequency members) of the four target patterns are exemplified in (8) and (9):

(8) a. To start with, there is the clunking obviousness of… (BNC A5F) (SOUND)
    b. …as a Polish colleague put it with glinting irony. (BNC HNU) (LIGHT)
    c. I tried to stand up, but the smell... the horrific reeking stink, such…
       (http://www.rab.org.uk/mc//permalink.php?tid=98&at=27&before=11&after=12) (SMELL)
    d. I am burn'd up with inflaming wrath, A rage, whose heat…
       (http://william-shakespeare.classic-literature.co.uk/the-life-anddeath-of-king-john/ebook-page-16.asp) (HEAT)

(9) a. Die Strohblumen riechen streng in der sirrenden Hitze...
       the strawflowers smell strict in the buzzing heat
       (PUBLIC E99/SEP.25406) (SOUND)
    b. …herrscht Transparenz und fluoreszierende Klarheit...
       dominates transparency and fluorescing clarity
       (PUBLIC P98/DEZ.48756) (LIGHT)
    c. Architekt Wolf Prix fürchtet "miefige Spießigkeit...
       architect Wolf Prix fears ponging narrow-mindedness..
       (PUBLIC K00/FEB.10317) (SMELL)
    d. …die entzündende Verve der Eisernen Lady oder...
       the igniting verve of.the Iron Lady or
       (PUBLIC N92/APR.13174) (HEAT)

Type frequency distributions are reported in figures 4.10 and 4.11:
Figure 4.10 Intensifier type frequency distribution: construction A (English)
Figure 4.11 Intensifier type frequency distribution: construction A (German)
Again, these plots can be contrasted with cloud displays of the same data:
Figure 4.12 Intensifier type frequency cloud: construction A (English)
Figure 4.13 Intensifier type frequency cloud: construction A (German)
Figures 4.6/4.7 and 4.10/4.11 show that intensifier type and token frequency distributions are very similar. Moreover, the cloud displays suggest that there is a connection between high token frequency and high type frequency for individual items: salient items in the English token frequency cloud (figure 4.8) like glare, burn and resound are also prominent in the type frequency cloud (figure 4.12). The same is true for German items like glühen ‘glow’, rauschen ‘whoosh’ and strahlen ‘beam’. Since the frequency distributions are skewed, mean type frequency scores (English: 8, SD=12.3; German: 23, SD=42.1) are not an optimally suited measure of central tendency. However, neither the mode (1 in either language) nor the median (English: 3, German: 4) is a particularly instructive measure for present purposes either. In order to indicate relative combinatorial variability, I will use z-scores. Rounding up the introductory overview of construction A, table 4.5 reports the 20 most type frequent intensifiers and their associated z-scores in construction A in both languages:

Table 4.5 Top type frequent intensifiers: construction A

English                              German
Pattern  Intensifier  Types  z       Pattern  Intensifier  Types  z
LIGHT    glare        69     +5.01   HEAT     glühen       475    +6.77
HEAT     burn         61     +4.36   SOUND    rauschen     296    +4.05
HEAT     glow         56     +3.95   LIGHT    strahlen     242    +3.22
HEAT     sear         50     +3.46   HEAT     flammen      208    +2.71
LIGHT    blind        44     +2.97   LIGHT    leuchten     198    +2.55
LIGHT    dazzle       43     +2.89   LIGHT    schillern    150    +1.82
SOUND    resound      42     +2.80   HEAT     brennen      148    +1.79
HEAT     blaze        34     +2.15   LIGHT    gleißen      143    +1.71
LIGHT    glitter      32     +1.99   SOUND    dröhnen      137    +1.62
HEAT     blister      28     +1.66   LIGHT    glänzen      131    +1.53
SOUND    roar         28     +1.66   HEAT     lodern       111    +1.23
HEAT     brood        22     +1.17   SOUND    schreien     107    +1.17
HEAT     seethe       19     +0.92   LIGHT    blenden      102    +1.09
SOUND    thunder      18     +0.84   SOUND    tosen        102    +1.09
LIGHT    coruscate    17     +0.76   HEAT     zünden       88     +0.88
LIGHT    shimmer      17     +0.76   SOUND    donnern      87     +0.86
SOUND    gush         16     +0.68   LIGHT    funkeln      76     +0.69
SOUND    howl         16     +0.68   LIGHT    glitzern     67     +0.56
HEAT     scorch       16     +0.68   SOUND    krachen      56     +0.39
SOUND    jar          15     +0.60   LIGHT    flirren      55     +0.37
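For readers unfamiliar with the measure, a z-score expresses how many standard deviations a value lies above or below the mean. The Python sketch below applies the standard formula to the English item glare, using the rounded mean and SD quoted in the text. The function and constant names are illustrative assumptions, and the sketch makes no claim to reproduce every value in the table.

```python
def z_score(value, mean, sd):
    """Standardise a type frequency against a given mean and standard deviation."""
    if sd <= 0:
        raise ValueError("standard deviation must be positive")
    return (value - mean) / sd


# Mean and SD for English construction A as reported in the text (rounded values).
ENGLISH_A_MEAN, ENGLISH_A_SD = 8, 12.3

# 'glare' is listed with 69 types in table 4.5.
z_glare = z_score(69, ENGLISH_A_MEAN, ENGLISH_A_SD)
print(f"glare: z = {z_glare:+.2f}")
```

With these inputs the result is close to, though not identical with, the +5.01 listed for glare. The gap presumably reflects the fact that the mean and SD are reported in rounded form, which is an assumption on my part rather than something the text states.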
4.4.3. Construction B
Table 4.6 summarises intensifier and pattern frequencies for construction B:

Table 4.6 Frequency overview: construction B

         English                       German
Pattern  Intensifiers  Tokens  Types   Intensifiers  Tokens  Types
SOUND    127 (68.6%)   2919    1602    60 (40%)      9236    352
LIGHT    20 (80%)      1765    625     17 (63%)      10772   372
SMELL    3 (50%)       120     40      3 (50%)       1622    29
HEAT     30 (81.1%)    1656    632     15 (33.3%)    1891    108
Total    180 (71.1%)   6460    2899    95 (41.7%)    23,521  861
A graphic comparison of type and token frequencies across patterns and languages is supplied in figure 4.14:
Figure 4.14 Construct type/token frequencies across languages: Construction B
As noted above, type counts are on the whole higher in English and lower in German (as compared to construction A). Top token frequent patterns are again SOUND and LIGHT in German with many lexicalised expressions. In German, HEAT is substantially less and SMELL substantially more token frequent than in construction A (the latter has almost 30 times more tokens than in construction A). The higher count of SMELL-expressions is due to a small number of compounds involving the base stink- ‘stink’ that can be considered lexicalised (e.g. stinkfaul ‘very lazy’, stinklangweilig ‘very boring’, stinknormal ‘very normal’, stinkreich ‘very rich’ etc.). In English, there are no striking differences in token frequency proportions between patterns as compared to construction A. Token frequency distributions for construction B are plotted in figures 4.15 and 4.16:
Figure 4.15 Intensifier token frequency distribution: construction B (English)
Figure 4.16 Intensifier token frequency distribution: construction B (German)
The graph for German is distorted by two extreme outliers: the bases knall- ‘bang’ with 6781 tokens and blitz- ‘flash’ with 6682 tokens (28.8% and 28.4% of all tokens of construction B, respectively). Apart from that, the distributions are very similar to those found in construction A. The corresponding cloud displays are supplied in figures 4.17 and 4.18:
Figure 4.17 Intensifier token frequency cloud: construction B (English)
Figure 4.18 Intensifier token frequency cloud: construction B (German)
High frequency intensifiers in construction B are exemplified in (10) and (11):

(10) a. Screamingly conspicuous by their absence are Godflesh… (BNC CK4) (SOUND)
     b. …very tall and very fair and dazzlingly beautiful creature… (BNC CCW) (LIGHT)
     c. This gorgeous redhead, obviously stinking rich. (BNC H9H) (SMELL)
     d. …was blazing mad that his training ground discussion… (BNC CH3) (HEAT)

(11) a. In einem knallblauen Anzug feierte ...
        in a bang-blue suit celebrated
        (PUBLIC O95) (SOUND)
     b. …garantiert ein Computer den blitzschnellen Zugriff auf…
        guarantees a computer the flash-quick access on
        (PUBLIC K98/JUL.51764) (LIGHT)
     c. Blocher findet den Comic "stinklangweilig".
        Blocher finds the cartoon stink-boring
        (PUBLIC E99/JUN.15885) (SMELL)
     d. …den an der Materie brennend interessierten Leser.
        the on the matter burningly interested reader
        (PUBLIC P99/FEB.08048) (HEAT)

Less common variants are illustrated in (12) and (13):

(12) a. Smith is just as whoopingly post-Coltrane-ish as...
        (http://www.guardian.co.uk/arts/fridayreview/story/0,12102,1210863,00.html) (SOUND)
     b. …orchestra under Daniel Barenboim is radiatingly lush…
        (http://cd.ciao.co.uk/Panorama_Colours_of_the_Orchestra__Review_5579931) (LIGHT)
     c. …applauded by the reekingly hypocritical bystanders…
        (http://www.ureader.co.uk/msg/11839.aspx) (SMELL)
     d. …but yourself when the incineratingly hot curry arrives.
        (http://www.itchynottingham.co.uk/venue/189575/MemSaab.html) (HEAT)

(13) a. …die klimperkleine Hexe Irma…
        the jingle-small witch Irma
        (PUBLIC H86) (SOUND)
     b. ...die flimmernd heisse Luft über dem Wüstensand…
        the flickering hot air above the desert-sand
        (PUBLIC A99/MÄR.14831) (LIGHT)
     c. ...als das miefig-spießige Bonn.
        than the ponging-narrowminded Bonn
        (PUBLIC L98/NOV.20976) (SMELL)
     d. Einen Ausweg aus der brutzelnd heißen Zukunft biete...
        a way-out out of.the sizzling hot future offers
        (PUBLIC N00/FEB.08041) (HEAT)

At times, unconventional usages are explicitly acknowledged as such:

(14) Ignorant people! Stinkingly (ok don't think that's a word but hey!) rude people who seem to think that it's totally fine to…
     (http://www.faceparty.com/kittykat199)
Type frequency distributions are reported in figures 4.19 and 4.20:
Figure 4.19 Intensifier type frequency distribution: construction B (English)
Figure 4.20 Intensifier type frequency distribution: construction B (German)
Figures 4.21 and 4.22 give the corresponding cloud displays:
Figure 4.21 Intensifier type frequency cloud: construction B (English)
Figure 4.22 Intensifier type frequency cloud: construction B (German)
Top type frequent intensifiers in construction B are reported in table 4.7 (English: M=16; SD=17.2; German: M=9; SD=14.5):

Table 4.7 Top type frequent intensifiers: construction B

English                              German
Pattern  Intensifier  Types  z       Pattern  Intensifier  Types  z
LIGHT    dazzle       70     +3.13   SOUND    knallen      78     +4.75
LIGHT    sparkle      68     +3.01   LIGHT    strahlen     62     +3.65
LIGHT    gleam        61     +2.61   LIGHT    leuchten     62     +3.65
LIGHT    glitter      61     +2.61   LIGHT    glänzen      42     +2.27
LIGHT    shimmer      61     +2.61   LIGHT    blenden      39     +2.06
HEAT     sear         59     +2.49   SOUND    knacken      37     +1.93
SOUND    crackle      55     +2.26   LIGHT    blitzen      33     +1.65
SOUND    screech      55     +2.26   HEAT     glühen       33     +1.65
HEAT     blaze        53     +2.14   LIGHT    schillern    32     +1.58
HEAT     brood        50     +1.97   HEAT     brennen      28     +1.31
HEAT     scorch       50     +1.97   LIGHT    glitzern     27     +1.24
LIGHT    shine        50     +1.97   LIGHT    gleißen      26     +1.17
SOUND    groan        47     +1.79   SOUND    schreien     24     +1.03
SOUND    jar          46     +1.74   SOUND    schrillen    23     +0.96
SOUND    thud         44     +1.62   SMELL    stinken      21     +0.82
LIGHT    glare        43     +1.56   SOUND    quietschen   20     +0.75
SOUND    resound      42     +1.50   SOUND    krachen      17     +0.55
SOUND    shriek       42     +1.50   LIGHT    funkeln      15     +0.41
SOUND    crash        41     +1.45   HEAT     flammen      11     +0.13
SOUND    grate        41     +1.45   LIGHT    flirren      11     +0.13

4.4.4. Construction C
Pattern frequencies for construction C are contrasted in table 4.8:

Table 4.8 Frequency overview: construction C

         English                       German
Pattern  Intensifiers  Tokens  Types   Intensifiers  Tokens  Types
SOUND    135 (73%)     4627    2021    75 (50%)      1475    401
LIGHT    19 (76%)      1384    874     15 (55.6%)    448     164
SMELL    2 (33.3%)     175     120     2 (33.3%)     13      12
HEAT     30 (81.1%)    1675    884     11 (24.4%)    392     86
Total    186 (73.5%)   7861    3899    103 (45.2%)   2128    663
As before, it will be useful to contrast type and token frequencies also in graphic form:
Figure 4.23 Construct type/token frequencies across languages: Construction C
As anticipated in section 4.4.1, frequency counts for construction C are markedly different from those for constructions A and B. First, token/type-ratios are comparatively low here, so fewer high frequency intensifier-predicate combinations are to be expected in this grammatical environment. Second, C is the only construction in which English intensifiers are both more type and token frequent than their German counterparts in the dataset. Contrastive differences are especially obvious in the domain HEAT, where 81.1% of the English target bases are attested as intensifiers, yet only 24.4% of their German equivalents (the lowest degree of exhaustion of all investigated patterns). Figures 4.24 to 4.27 supply token frequency plots and cloud displays for construction C. Some of the high token frequency items in these graphs are exemplified in (15) and (16); rare types are illustrated in (17) and (18):

(15) a. A big house like this should really hum with life. (BNC J54) (SOUND)
     b. His blue eyes glittered with anger. (BNC JYD) (LIGHT)
     c. …black and rotten and stinking with corruption… (BNC CB5) (SMELL)
     d. Still seething with anger, she pulled the bedclothes… (BNC JXT) (HEAT)

(16) a. Schreiend vor Schmerz ließ der Verbrecher…
        screaming with pain let the criminal
        (PUBLIC O98/AUG.78465) (SOUND)
     b. Innen strahlt alles vor Sauberkeit.
        inside beams everything with cleanliness
        (PUBLIC E99/SEP.25392) (LIGHT)
     c. …es stank schon schier vor Uneigennützigkeit und…
        it stank already sheer with unselfishness and
        (PUBLIC WKD/HFS.11092) (SMELL)
     d. Das Publikum kochte vor Begeisterung.
        the audience boiled with enthusiasm
        (PUBLIC O96/JAN.03516) (HEAT)

(17) a. …race against time in this vrooming with fun car racing game.
        (http://www.searchready.co.uk/game/fun-car-game.html) (SOUND)
     b. …scintillate with an internal and infernal dark blue light… (BNC CCW) (LIGHT)
     c. …with countless mournful remains and reeking with dead bodies. (BNC H0N) (SMELL)
     d. …and the guitars ignite with excitement… (BNC CK6) (HEAT)

(18) a. …raschelt das Papier vor kulturbeflissenen Klischees…
        rustles the paper with culture.eager clichés
        (PUBLIC R99/AUG.67298) (SOUND)
     b. ...alles glimmert, glitzert und gleißt vor Marmor.
        everything glimmers glitters and glares with marble
        (PUBLIC H86/KZ3.50560) (LIGHT)
     c. …in diesem vor Reichtum stark duftenden Land...
        in this with riches strongly scenting country
        (PUBLIC R99/JAN.04277) (SMELL)
     d. …sondern weil die Stadt vor Hitze brütet...
        but because the city with heat broods
        (PUBLIC P00/AUG.29992) (HEAT)
Figure 4.24 Intensifier token frequency distribution: construction C (English)
Figure 4.25 Intensifier token frequency distribution: construction C (German)
Figure 4.26 Intensifier token frequency cloud: construction C (English)
Figure 4.27 Intensifier token frequency cloud: construction C (German)
Figures 4.28 and 4.29 give the type frequency distributions:
Figure 4.28 Intensifier type frequency distribution: construction C (English)
Figure 4.29 Intensifier type frequency distribution: construction C (German)
Figures 4.30 and 4.31 present the data in the form of type frequency clouds:
Figure 4.30 Intensifier type frequency cloud: construction C (English)
Figure 4.31 Intensifier type frequency cloud: construction C (German)
Closing the descriptive overview of the lexicalisation pattern analysis, table 4.9 lists the top type frequent intensifiers in construction C (English: M=21; SD=26.0; German: M=6; SD=8.0):

Table 4.9 Top type frequent intensifiers: construction C

English                              German
Pattern  Intensifier  Types  z       Pattern  Intensifier  Types  z
HEAT     glow         120    +3.81   LIGHT    strahlen     43     +4.55
LIGHT    gleam        107    +3.31   SOUND    schreien     42     +4.42
LIGHT    glitter      102    +3.12   HEAT     glühen       29     +2.81
HEAT     seethe       99     +3.00   LIGHT    funkeln      28     +2.68
SOUND    groan        96     +2.89   LIGHT    glänzen      23     +2.06
LIGHT    shimmer      96     +2.89   HEAT     brennen      22     +1.93
LIGHT    glint        92     +2.74   SOUND    heulen       21     +1.81
LIGHT    glisten      86     +2.50   LIGHT    leuchten     21     +1.81
SMELL    reek         85     +2.47   SOUND    stöhnen      20     +1.69
SOUND    hum          83     +2.39   SOUND    knistern     19     +1.56
SOUND    reverberate  80     +2.27   LIGHT    blitzen      16     +1.19
SOUND    resound      78     +2.20   SOUND    kreischen    15     +1.06
SOUND    echo         74     +2.04   SOUND    brüllen      14     +0.94
SOUND    buzz         72     +1.97   SOUND    ächzen       12     +0.69
LIGHT    sparkle      69     +1.85   SOUND    aufschreien  12     +0.69
LIGHT    shine        68     +1.81   SOUND    brummen      12     +0.69
HEAT     blaze        67     +1.77   SOUND    quietschen   12     +0.69
SOUND    fizz         63     +1.62   SMELL    stinken      11     +0.57
HEAT     burn         62     +1.58   SOUND    keuchen      10     +0.44
SOUND    resonate     59     +1.46   HEAT     kochen       10     +0.44

4.5. Summary and discussion
Chapter 4 has explored how the four targeted intensification patterns are exploited in English and German. I have introduced the data that were used in the study and described how they were extracted and coded. I have then reported the first part of the empirical results. Both between languages and between constructions, these results point to a number of interesting similarities and differences alike.

Beginning with the contrastive comparison, construction A is the most commonly found environment for PERCEPTION intensifiers in both languages (based on English frequency counts for the BNC data, cf. figure 4.2). Also semantically, there is an obvious parallel in lexicalised convention: invariably, the vocabulary for SOUND EMISSION events is by far the largest of the four classes, and SMELL is the least extensively lexicalised domain. This may be a linguistic reflection of a perceptual bias: compared to other species, humans and their immediate ancestors are much better at hearing than they are at smelling/scenting. It can be assumed that the pressure to make fine-grained conceptual and concomitant communicative discriminations in at least one non-visual domain (for situations when visual orientation is impeded or impossible) has therefore affected the auditory modality more strongly, thus resulting in larger inventories of concepts and words for sounds than for smells. Bases from the domain SOUND also dominate among the extensions of perception verbs to intensifying function: summing for all three constructions, they are the category with the highest count of intensifier types, construct types and construct tokens in both languages. The extent to which not only the overall proportions and distributions but also the concrete individual expressions involving these intensifiers are similar in both languages will be investigated in chapter 5.

More striking than the similarities, however, are the contrastive differences. First, intensifiers are more widely distributed across constructions in English than they are in German: in English, about half of all bases that are attested as intensifiers are found in all three constructions (49%); in German, it is less than a third (29%). Conversely, the proportion of intensifiers which are only found in a single construction is more than twice as high in German (41%) as it is in English (20%). Figures 4.32 and 4.33 compare these figures for individual lexicalisation patterns:
Figure 4.32 Distribution of intensifier types across constructions: English
Figure 4.33 Distribution of intensifier types across constructions: German
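The proportions behind figures 4.32 and 4.33 rest on a simple count: for each attested intensifier, in how many of the three constructions does it occur? The Python sketch below shows one way to compute such a breakdown. The attestation sets are hypothetical toy data chosen for illustration and do not reproduce the coding of the study, and the function name is likewise an assumption.

```python
from collections import Counter

# Hypothetical attestation sets (illustration only, not the study's coding).
attested = {
    "A": {"glare", "burn", "resound", "blaze", "roar"},
    "B": {"glare", "burn", "blaze", "screech"},
    "C": {"burn", "blaze", "hum", "resound"},
}


def construction_spread(attested_by_construction):
    """Share of intensifiers attested in exactly k constructions, for each k."""
    per_item = Counter()
    for items in attested_by_construction.values():
        per_item.update(items)  # each set adds 1 per item it contains
    total = len(per_item)
    spread = Counter(per_item.values())
    n_constructions = len(attested_by_construction)
    return {k: spread.get(k, 0) / total for k in range(1, n_constructions + 1)}


spread = construction_spread(attested)
for k, share in spread.items():
    print(f"attested in exactly {k} construction(s): {share:.0%}")
```

On real data, the share for k=3 would correspond to the proportion of intensifiers found across the board, and the share for k=1 to those restricted to a single environment.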
Second, intensifiers are also more widely distributed across concrete expression types in English than they are in German. Averaging over all constructions, the overall token/type-ratio in German is more than four times higher than the same measure in English (which has altogether more types in the dataset at less than a third of the German tokens). And third, there is a particularly notable difference in the relative conventionality of intensifying uses of construction C, which is disproportionately more type and token frequent in English than it is in German. This points to the importance of language-specific conventionalisation effects for what are essentially equivalent constructions from a contrastive point of view.

Moving from the constructional to the semantic perspective, a look at proportions rather than absolute frequencies reveals that the relative degree of exhaustion of the investigated patterns is different across languages: in English, HEAT is the most systematically exploited resource in all three environments (A: 95%; B: 81%; C: 81%); in German, it is LIGHT in all three cases (A: 93%; B: 63%; C: 56%; as reported above, SOUND has most intensifiers in absolute figures, but the much larger size of the category here also accounts for a higher number of both non-typical and non-attested members in both languages). All in all, connections between PERCEPTION words and INTENSITY function are more common in English than they are in German (with 88% vs. 75% of potentially eligible bases attested as intensifiers). The most striking contrast between individual patterns is found for HEAT in construction C, where more than 80% of the source domain lexis is found as an intensifier in English as compared to less than a quarter of all bases (24%) in German. Once more, this points to the substantial amount of language-specific idiosyncrasy in the data for construction C.

Behavioural differences outweigh the similarities also from a language-internal, cross-constructional perspective. As mentioned above already, more than half (English) to more than two thirds (German) of the attested intensifiers are not found across the board but are restricted to one or two of the investigated environments. Taking all patterns together, construction A shows the most systematic exploitation of the four lexicalisation patterns (with 88% attested bases in English and 75% in German), followed by construction C (74% English, 45% German) and finally construction B (71% English, 42% German). On the level of individual patterns, the contrast is most pronounced for HEAT (English: -15% of intensifiers from construction A to constructions B and C; German: -38% from construction A to construction C). Finally, marked differences are also found for the number of attested construct types per pattern (e.g. twelve times as many SMELL intensity expressions in English construction C as compared to construction A; more than ten times as many HEAT intensity expressions in German construction A as compared to construction C).

In sum, these results provide support for the view that linguistic generalisations are essentially construction-specific (Croft 2001): not only do speakers of English and German exploit the targeted lexicalisation patterns differently from one another, they also do it in different ways within their respective language, i.e. depending on the specific constructional environment. This is to say that speakers do not seem to work with form-independent, purely semantic generalisations about the productivity of particular lexicalisation patterns (e.g. the greater or lesser availability of HEAT concepts for the expression of INTENSITY in general), but implicitly keep track of how often they have heard other speakers employ which kinds of intensifiers in which specific environments. At the same time, there are clearly still substantial similarities in the data, both between languages and between constructions: not only can speakers of English and German draw on the same conceptual and on immediately equivalent constructional means for the given functional end, but the way they exploit these resources also produces essentially identical distributions of concrete intensifier type and token frequencies in all three environments in both languages (i.e. they are all Zipfian in shape). Chapter 5 now turns to the extent to which it is also the same kind of intensity expressions (i.e. semantically equivalent or at least related combinations) that are found for a given intensifier (both across languages and constructions).
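The aggregate comparison of token/type-ratios made in this summary can be checked against the totals in tables 4.4, 4.6 and 4.8. The Python sketch below assumes that the overall ratio is obtained by summing the reported totals over the three constructions; this is my reading of the text, not a procedure it spells out. All names are illustrative.

```python
# Reported totals per construction from tables 4.4, 4.6 and 4.8: (tokens, types).
TOTALS = {
    "English": {"A": (3037, 1087), "B": (6460, 2899), "C": (7861, 3899)},
    "German": {"A": (26784, 3807), "B": (23521, 861), "C": (2128, 663)},
}


def overall_counts(per_construction):
    """Sum tokens and types over all constructions of one language."""
    tokens = sum(tok for tok, _ in per_construction.values())
    types = sum(typ for _, typ in per_construction.values())
    return tokens, types


def token_type_ratio(per_construction):
    """Overall number of tokens per type for one language."""
    tokens, types = overall_counts(per_construction)
    return tokens / types


RATIOS = {language: token_type_ratio(data) for language, data in TOTALS.items()}

for language, ratio in RATIOS.items():
    tokens, types = overall_counts(TOTALS[language])
    print(f"{language}: {tokens} tokens / {types} types = {ratio:.2f}")
```

Under this assumption, the German ratio comes out at more than four times the English one, with English contributing more types but fewer than a third of the German tokens, in line with the statement above.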
Chapter 5 Fixed expressions: From words to collocations
5.1. Introduction
Chapter 5 complements the paradigmatic perspective of chapter 4 with an exploration of the syntagmatic/combinatorial potential of the targeted intensifiers. The main questions to be addressed are:
- what kinds of predicates are found with which kinds of intensifiers?
- to what extent are the semantic patterns thus identified similar across the two investigated languages?
- do the intensifiers combine with the same (or at least similar) kinds of meanings in each construction?
In addition, I will also have a first look at the way in which the top attracted exemplars in each construction are related semantically. Prerequisites developed in section 5.2 include a sketch of the connection between formulaicity and creativity (as understood in cognitive-functional language models) as well as a discussion of the connection between textual co-occurrence patterns in a corpus and cognitive entrenchment patterns. Section 5.3 introduces collostructional analysis, the corpus-linguistic method employed for uncovering significantly attracted and, by assumption, also cognitively routinised intensifier + predicate pairings in the data for each construction (‘fixed expressions’). Section 5.4 presents the results in a detailed walkthrough for all four patterns in all six constructions (three English, three German). Both contrastive and intralinguistic similarities and differences are discussed, and a first glimpse at the semantic patterning of the identified collocation clusters is provided. Section 5.5 presents a short summary and discussion.
5.2. Prerequisites

5.2.1. Formulaicity and creativity
Usage-based language models stress the crucial importance of repetition for virtually all aspects of language structure and use, including its learning, processing and change. Language knowledge and language use are seen as closely intertwined in a kind of ‘feedback loop’ (Barlow and Kemmer 2000): ‘competence’ is seen as a system of cognitive routines in constant flux, ‘sedimented’ (and in fact always sedimenting anew) from recurrent patterns in speakers’ verbal interactions with their environment. Since the linguistic resources that a speaker has mastered are not only instantiated in (i.e. constitutive of) but also shaped by (i.e. adapting to) subsequent usage, questions of frequency take centre stage in this conception: frequent exposure to a structure is what makes this element become entrenched (cognitively routinised), and increasing entrenchment in turn favours its selection over rival units in subsequent productions, thereby consolidating its frequent use in discourse and hence producing a kind of feedback effect again. Most of the routinised structures thus acquired are internally complex. Consider again the expressions from example (1) in chapter 2 (to go wrong/mad/bankrupt etc.): in terms of Sinclair’s ‘open choice principle’, they would be analysed as routinised combinations of the atomic units go, wrong, mad, bankrupt etc. However, they can also be viewed as analysable wholes (go wrong, go mad, go bankrupt) that have an independent (and presumably even privileged) status as elements of speakers’ linguistic knowledge (over and above their knowledge of the individual parts). Usage-based research on child first language and adult second language acquisition has amassed substantial evidence that learners start out from lexically specific constructional exemplars (‘holophrases’) and only later come to analyse these units into segments that can be more or less productively (re-) combined (Ellis and Larsen-Freeman 2009; Tomasello 2003). 
However, there is no reason to assume that a strongly entrenched composite representation is ‘unlearned’ once its internal structure (or componentiality) is recognised – quite to the contrary, a growing body of evidence suggests that such units continue to play an important role also in adult language processing (cf. chapter 2). The structures in question go by many different names in the literature. Some of the more familiar include ‘collocation’, ‘fixed expression’, ‘formula’, ‘holophrase’, ‘prefab’, ‘listeme’, ‘multiword expression’ and ‘idiom’. Not all of these convey exactly the same implications, and some of them are used rather differently by different authors. For instance, the term ‘multiword expression’ implies that the structure consists of more than one word (rather than morpheme) and the labels ‘idiom’ and ‘idiomaticity’ are cover terms for a whole syndrome of properties that go beyond lexical fixedness alone (see references in chapter 2). Up to this point in the exposition, I have used the terms ‘collocation’ and ‘fixed expression’, and I will continue to use them to designate the following phenomenon:

    A sequence, continuous or discontinuous, of words or other elements, which is, or appears to be, prefabricated; that is, stored and retrieved whole from memory at the time of use, rather than being subject to generation or analysis by the language grammar. (Wray 2002: 9)
It is the introduction of variation (i.e. unconventional lexical instantiations) into such chunks (i.e. lexically specific constructions) that will be understood as ‘linguistic creativity’ in this study – in this following Tomasello’s (1998: 433) position that “much of the creativity of language comes from fitting specific words into linguistic constructions that are non-prototypical for them”. Following Israel (1996, 2002), I assume that linguistic creativity thus conceived is driven by the interplay of two complementary mechanisms: on the one hand, the capacity for flexible analogical extension, where an existing expression is extended to a contextually related variant in order to meet the demands of the present discourse situation. And on the other hand, a cognitive pressure to maximise similarity across linguistic representations through generalisation. In the words of Israel (2002), children learning a language, and speakers in general, represent linguistic units in ways that maximize their motivation and emphasize their commonalities. Two units are consistent with each other to the degree that they match in their formal and semantic specifications. LOCAL CONSISTENCY applies to linguistic units activated online in usage events, and requires these to be as consistent as possible with entrenched utterance types. GLOBAL CONSISTENCY applies to the repertoire of constructions as a whole, and requires that units be represented in ways which maximize their consistency with each other. Local consistency favors a massive inventory of low-scope constructions to represent the rich details of experienced usage events: it thus fosters arbitrariness in the grammar, but also makes on-line processing easier by offering conventional units for every occasion. 
Global consistency favors the development of abstract representations and recurrent inheritance links across constructions: it thus increases motivation in the grammar, but also makes processing harder as the schematic units it favors are farther removed from the details of actual usage. Global consistency motivates the emergence of schematic linguistic units which can license novel utterances; local consistency constrains the use of such units by encouraging conformance to familiar patterns of usage. (Israel 2002: 123–124)
From this perspective, a study of linguistic creativity requires a detailed account not only of speakers’ particular extension strategies, but also of the existing linguistic conventions against which these extension processes are set. Put differently, in order to recognise a particular formation as unconventional and presumably modelled on some underlying more familiar type, it is essential to first identify what is conventional and can be assumed as linguistic common ground within a given speech community before extensions of a given norm can be identified as such. The purpose of the present chapter is to establish what these conventions are within the restricted testing ground delimited in chapter 3.
5.2.2. Corpus data as clues to cognitive entrenchment patterns
Over the past years, the assumed link between frequency and cognitive routinisation has led to a steadily growing interest in corpus-based research within cognitive linguistics (cf. Arppe et al. 2010; Schönefeld 1999; Glynn and Fischer 2010; Gries and Stefanowitsch 2006; Stefanowitsch 2011; Tummers, Heylen, and Geeraerts 2006). While there is no dispute over the existence of the relationship as such, approaches nevertheless differ in their assumptions about the most adequate interpretation of corpus frequencies as cues to cognitive entrenchment patterns. Leaving other kinds of frequency effects aside, I will here focus on the issue that is of primary concern for my study: the connection between the token frequency of an expression in a corpus and its cognitive status. In a series of papers, Stefanowitsch and Gries have proposed that high token frequency of an expression alone is not a reliable indicator of its privileged status vis-à-vis other instances of the relevant construction (Stefanowitsch and Gries 2003, 2005; Gries and Stefanowitsch 2004a). In a nutshell, the idea is that since words vary drastically in frequency, these differences must also be taken into account when assessing their status as more or less typical (or, conversely, unremarkable) instances of the grammatical patterns in which they occur. To give an example: suppose there are two words A and B with a token frequency of 100 and 10,000 occurrences in the corpus, respectively. If it is found that A and B both occur e.g. 90
times in a particular construction C, it is assumed that their connection with this construction is not on equal footing: in the case of word A, virtually all of its occurrences (90%) are found in C, whereas the corresponding proportion for word B amounts to less than 1% of the overall tokens. In order to control for such asymmetries when assessing the relative importance of particular fillers for a given constructional slot, Stefanowitsch and Gries (2003) proposed a method known as collostructional analysis. Collostructional analysis is an extension of traditional collocational methods to the study of interactions between words and grammatical constructions as conceived in chapter 2: in the terminology of Stefanowitsch and Gries, any grammatical construction is associated with a range of collexemes (i.e. lexemes collocating with a particular slot of the construction), and the affinity between the construction and the range of attested collexemes can be ranked in terms of collostruction strength (i.e. the statistical association strength between the construction and the attested fillers of the relevant slot). Calculating collostruction strength, the authors suggest, amounts to determining what in psychological research has become known as one of the strongest determinants of prototype formation, namely the cue validity of, in this case, a particular collexeme for a particular construction. That is, collostructional analysis provides the analyst with those expressions which are highly characteristic of the construction’s semantics and which, therefore, are also relevant to the learner. (Stefanowitsch and Gries 2003: 237)
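The asymmetry between words A and B in the example above can be made concrete with a small calculation. The following is only an illustrative sketch: the function name `share_in_construction` is hypothetical, and the proportion it returns is not collostruction strength itself but merely the raw share of a word's tokens that fall into construction C.

```python
def share_in_construction(freq_in_cxn: int, freq_in_corpus: int) -> float:
    """Proportion of a word's corpus tokens that occur in construction C."""
    return freq_in_cxn / freq_in_corpus


# Figures from the example in the text: both words occur 90 times in C.
share_a = share_in_construction(90, 100)      # word A: 100 tokens overall
share_b = share_in_construction(90, 10_000)   # word B: 10,000 tokens overall

print(f"A: {share_a:.1%}  B: {share_b:.1%}")
```

Identical token frequencies in the construction (90) thus correspond to very different shares of each word's overall usage, which is the kind of asymmetry that collostructional analysis is designed to control for.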
In later papers, two further variants of the approach were proposed, such that collostructional analysis has since become a cover term for a group of three related methods: first, collexeme analysis, which investigates the degree of attraction between a given constructional slot and its lexical fillers; second, distinctive collexeme analysis, which serves to differentiate supposedly synonymous/functionally equivalent constructions on the basis of those collexemes that are distinctive for either variant; and third, covarying collexeme analysis, which investigates interdependencies between different slots of one and the same construction. As illustrated by the above quotation, collostruction strength was originally conceived as a corpus-derivable cue to constructional meanings. However, in two later papers, Gries, Hampe, and Schönefeld (2005, 2010) extended the case for collostructional analysis to the psycholinguistic domain and suggested that collostructional relationships directly account for online language processing latencies (cf. also Wiechmann 2008; Zeschel 2008). In the 2005 study, the authors compared frequency- and collostruction-based predictions about processing latencies in a sentence completion task: having investigated the verbs that preferentially occurred in a particular argument structure construction (the English ‘as-predicative’ as exemplified by He regarded him as stupid) within the ICE-GB corpus, they asked subjects to complete sentence fragments that ended with verbs that were either highly frequent in (but not necessarily strongly attracted to) or strongly attracted to (but not necessarily highly frequent in) the targeted construction. As predicted by the collostructional approach, they found that sentence beginnings with verbs that are strongly attracted to the as-predicative cued a significantly higher proportion of continuations involving this construction than sentences with verbs that were not strongly attracted to the pattern. By contrast, a verb’s high token frequency in the construction alone had no such effect. The second experiment (Gries, Hampe, and Schönefeld 2010) was a self-paced reading study that was devoted to the same construction. Here, the dependent measure was subjects’ reading time at the word immediately following as, and it was assumed that collostruction strength would be a better predictor of processing effort than token frequency. In this experiment, neither of the two factors had a significant influence on reading times, although collostruction strength (but not frequency) reached marginal significance at p = .065 (frequency: p = .293).
While the authors acknowledge that frequency and collostruction strength are highly correlated, they also maintain that their results “clearly show that the collostructional approach is superior to the frequency approach” and suggest a direct connection between collostruction strength on the one hand and cognitive routinisation/entrenchment on the other: [W]hile cognitive linguists regularly regard frequency data as directly reflecting the degree of routinization or entrenchment, we have shown that (i) frequency alone runs the risk of severely misrepresenting speakers’ behavioural patterns and that (ii) collostruction strength outperforms frequency as a predictor of speakers’ behaviour in both production and comprehension tasks.
Specifically, they characterise the advantages of their method as follows: [T]he theoretical advantages of collexeme analysis over frequency-based approaches are that, unlike the latter, (i) collexeme analysis does not neglect the word’s and the construction’s overall frequencies, (ii) collexeme analysis allows for identifying cases where a construction and a word repel each other, and (iii) collexeme analysis allows for separating the wheat from the chaff by distinguishing significant from random cooccurrence. (Gries, Hampe, and Schönefeld 2005: 648)
On the other hand, concerns have been raised regarding each of these assumed advantages. I will not dwell on the last two issues here (cf. Kilgarriff 2005 and Gries 2005 for discussion) but stay with (i): the question whether or not overall frequencies can (or, on the alternative account, must not) be neglected for the identification of entrenched linguistic patterns from corpus data. This is quite clearly an empirical issue, and one that has not yet been settled: on the one hand, it is reasonable to assume that an unexpected accumulation of rare events (i.e. occurrences of a rare word) in a certain identifiable context (i.e. a particular grammatical construction) will not go unnoticed with speakers, and it is conceivable that this will cause speakers to see the relation between these two items as somehow privileged (in the sense that it has something to say about the context in which the unexpected observation was made, or in other words the meaning of the construction in question). Following this line of reasoning, attention to overall frequencies would indeed be important for identifying “those expressions which are highly characteristic of the construction’s semantics”. On the other hand, the factors that shape speakers’ assumptions about the meaning of grammatical constructions and the relative typicality of particular instances of these constructions that they have encountered are not necessarily identical to the factors that lead to the entrenchment of particular instances of these constructions as memorised units. This is quite clearly illustrated by constructional types involving low frequency words that occur either near- or fully exclusively in the pattern under investigation: in the limiting case, only a single attestation in the corpus may be enough for an item to be identified as significantly attracted by a collostructional analysis.
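The limiting case just mentioned can be illustrated with a short calculation. The sketch below rests on two assumptions: that attraction is assessed with a one-sided exact test with fixed margins, and that the corpus and construction sizes are the hypothetical figures given in the code (they are not taken from any of the studies cited here). Under these assumptions, the p-value for a word that occurs exactly once in the corpus, and that once inside the construction, reduces to the share of the corpus taken up by the construction.

```python
from fractions import Fraction


def hapax_attraction_p(cxn_tokens: int, corpus_tokens: int) -> Fraction:
    """One-sided exact p-value for a word with a single corpus token
    that falls inside the construction.

    With all margins fixed, the probability that this one token lands
    in one of the construction's slots is cxn_tokens / corpus_tokens.
    """
    return Fraction(cxn_tokens, corpus_tokens)


# Hypothetical sizes, chosen for illustration only.
p = hapax_attraction_p(cxn_tokens=1_000, corpus_tokens=1_000_000)

print(float(p), float(p) < 0.05)
```

On these assumed figures, a single attestation already falls below the conventional 5% level, simply because the construction itself is rare relative to the corpus.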
For instance, in the study by Gries, Hampe, and Schönefeld (2005), 56 out of the 107 verbs that were attested in the as-predicative construction in the ICE-GB corpus were identified as significantly attracted collexemes of the construction (52.3%). Out of these 56 combinations, 16 (28.6%) only occurred once in the construction, and more than 60% no more than three times. The observation that types with a frequency of merely one, two or three occurrences in the entire corpus constitute the majority of the attracted types casts doubt on the assumption that it is this very set of instances (i.e. the class of significantly attracted collostruction pairs) that best approximates what is entrenched in speakers’ minds: for the low frequency items in the set (in the present study, verbs like e.g. catapult and credit), it seems rather unlikely that speakers will invoke these types as salient standards of comparison when producing (or comprehending) novel instances of the construction in question. In practice, such types are also commonly ignored in
collostructional analyses: the standard procedure in collostructional studies undertaken so far is to concentrate on the top attracted collexemes of a construction, and it is these items on which the characterisation of the constructional meaning is based. However, since collostruction strength is highly correlated with frequency, the top attracted collexemes also tend to be frequent, thus making collostruction-based and frequency-based predictions of processing preferences difficult to distinguish for these items. For the more difficult to identify cases where the predictions can be teased apart, the empirical evidence is inconclusive: on the one hand, the results of Gries, Hampe, and Schönefeld (2005) point in favour of collostructions as the more powerful predictor. However, it must be borne in mind that the authors actually discarded the singleton types (i.e. those significantly attracted items that one would least expect to induce processing biases in the predicted direction on the above assumption) when dichotomising the two variables FREQUENCY and COLLSTRENGTH into the two levels ‘high’ and ‘low’ for their experiment (Gries, Hampe, and Schönefeld 2005: 657). Furthermore, evidence against collostruction strength as a superior predictor of speakers’ categorisation preferences is presented in Bybee (2010): in an extension of an earlier experiment on Spanish ‘verbs of becoming’ (Bybee and Eddington 2006; cf. section 6.2.2), the combined factors of frequency and semantic similarity to a frequent exemplar were found to outperform collostruction strength in predicting acceptability judgments for different types of these constructions. Given this situation, further experimental research is required in order to decide between entrenchment predictions based on collostructional analyses and those based on token frequency analyses in cases where the two differ. 
However, it is also important not to overstate the differences and bear in mind that the measures are highly correlated and often yield highly similar predictions.
5.3. Procedure
Bearing the above issues in mind (particularly regarding the unclear status of significantly attracted singleton types), my study adopts a collostructional approach to the identification of typical exemplars of the investigated intensification constructions. Since it is assumed that speakers’ choice of the intensifier depends on the identity of the intensified predicate, covarying collexeme analyses were calculated (Stefanowitsch and Gries 2005).
The frequencies required for conducting a covarying collexeme analysis are summarised in table 5.1:

Table 5.1  Covarying collexeme analysis

                 Intensifier X    ¬ Intensifier X
Predicate Y      X + Y            ¬X + Y
¬ Predicate Y    X + ¬Y           ¬X + ¬Y
In prose, four frequencies are being compared: the number of attested combinations of intensifier X with predicate Y (e.g. sparkling + wit), the number of all combinations of intensifier X with predicates other than Y (sparkling + N), the number of all combinations of intensifiers other than X with predicate Y (Int + wit) and finally the number of all combinations of intensifiers other than X with predicates other than Y (Int + N) occurring within the construction. The characterisation ‘Int N’ signals that the analysis is applied to the specific meso-construction for nominal intensification with PERCEPTION intensifiers that is investigated in my study (rather than to the overarching macro-construction Adj + N at large). Such analyses were conducted for all six constructions in the study (A, B, C in both English and German). Covarying collexeme analysis comes in two variants: the so-called ‘item-based’ and the ‘system-based’ version. The item-based analysis investigates the covariance of collexeme A and collexeme B within construction C alone, i.e. it is geared at bigrams within the corpus defined by all tokens of the investigated construction. The system-based variant takes the construction itself into account as well and assesses dependencies between collexeme A, collexeme B and construction C as a trigram within the larger corpus. Given the nature of the data employed for the present study (cf. chapter 4), performing system-based corrections was not possible, and item-based analyses were calculated instead. The computation of the collostruction strength figures was performed using a script for R for Windows, kindly provided by Stefan Gries, that automates the procedure (Coll.analysis 3, Gries 2004).
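The item-based procedure described above can be sketched in a few lines of Python. This is not the Coll.analysis script used for the study but a minimal illustration under stated assumptions: the pair counts are invented toy data, the helper names `covarying_table` and `fisher_attraction_p` are hypothetical, and the p-value is computed as a one-sided exact test for attraction, which may differ in detail from what the R script reports.

```python
from math import comb


def covarying_table(pair_counts, x, y):
    """Return the four cells of table 5.1 for intensifier x and predicate y."""
    a = pair_counts.get((x, y), 0)                                        # X + Y
    b = sum(n for (i, p), n in pair_counts.items() if i == x and p != y)  # X + ¬Y
    c = sum(n for (i, p), n in pair_counts.items() if i != x and p == y)  # ¬X + Y
    d = sum(n for (i, p), n in pair_counts.items() if i != x and p != y)  # ¬X + ¬Y
    return a, b, c, d


def fisher_attraction_p(a, b, c, d):
    """One-sided exact p-value: P(cell X+Y >= a) with all margins fixed."""
    n = a + b + c + d
    row = a + b   # all tokens with intensifier X
    col = a + c   # all tokens with predicate Y
    upper = min(row, col)
    favourable = sum(comb(row, k) * comb(n - row, col - k)
                     for k in range(a, upper + 1))
    return favourable / comb(n, col)


# Toy counts (hypothetical): tokens of one construction only, so the
# analysis is item-based in the sense described above.
pair_counts = {
    ("sparkling", "wit"): 8,
    ("sparkling", "form"): 2,
    ("glaring", "error"): 9,
    ("glaring", "wit"): 1,
}

cells = covarying_table(pair_counts, "sparkling", "wit")
p_value = fisher_attraction_p(*cells)
print(cells, p_value)
```

Because the table is built exclusively from tokens of the investigated construction, no information about the rest of the corpus enters the calculation, which is what distinguishes the item-based from the system-based variant.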
5.4. Results
5.4.1. Overview
In the following tables, collostruction strength values are reported in the column ‘FYE’. Figures in this column express the probability of error that a given combination of intensifier and intensified predicate is non-chance as computed by the Fisher-Yates exact test (‘FYE’). Hence, the lower the score in column ‘FYE’, the higher the collostruction strength. In extreme cases, the figure may be too low to calculate with a standard personal computer, in which case an output of ‘0’ appears in the results (meaning ‘maximally attracted’). In order to compensate for the fact that hundreds of tests were computed for the same dataset, a Bonferroni correction was applied to the results. The correction amounts to dividing the numerical threshold for a significant result (i.e. .05) by the number of tests computed (i.e. the number of distinct intensifier-predicate combinations fed into the test). To illustrate: given 1087 different intensifier-predicate combinations (i.e. construct types) in construction A in English, significantly attracted collexeme pairs had to come in at FYE < (.05/1087) = 4.60E-05. All in all, the collostructional analyses identified more than 1000 significantly attracted combination types in both languages and all three constructions taken together. Table 5.2 gives an overview of the number of distinct combination types and the number of significant pairs among them in each construction in both languages and reports the corrected alpha levels used for assessing statistical significance:

Table 5.2  Bonferroni-corrected alpha levels

         English                              German
Cxn      Types    Significant    α = 5%       Types    Significant    α = 5%
A        1087     146            4.60E-05     3807     478            1.31E-05
B        2899     170            1.73E-05     861      202            5.81E-05
C        3899     193            1.28E-05     663      66             7.52E-05
Total    7885     509                         5331     746
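The Bonferroni correction described above amounts to a single division. The following sketch reproduces the worked example for construction A in English; the function names are hypothetical and not part of the script used for the study.

```python
def bonferroni_alpha(alpha: float, n_tests: int) -> float:
    """Corrected significance threshold: nominal alpha divided by test count."""
    return alpha / n_tests


def is_significant(fye: float, alpha: float, n_tests: int) -> bool:
    """True if an FYE value falls below the Bonferroni-corrected threshold."""
    return fye < bonferroni_alpha(alpha, n_tests)


# Construction A in English: 1087 distinct intensifier-predicate types.
threshold = bonferroni_alpha(0.05, 1087)
print(f"{threshold:.2E}")

# An FYE value far below the threshold counts as significantly attracted.
print(is_significant(9.24e-125, 0.05, 1087))
```

An FYE value reported as ‘0’ would trivially pass this check, in line with its reading as ‘maximally attracted’.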
Sections 5.4.2 to 5.4.4 discuss the results for each construction in turn.
5.4.2. Construction A
Table 5.3 reports the results of the covarying collexeme analysis for construction A in English:

Table 5.3  Top collexeme pairs: construction A (English)

Pattern    Collexeme pair          Frequency    FYE
SOUND      resounding success      173          9.24E-125
SOUND      thundering shower       58           2.18E-98
HEAT       burning issue           66           1.55E-70
SOUND      crying shame            30           3.66E-61
SOUND      resounding victory      63           9.47E-51
SOUND      roaring fire            33           2.64E-49
LIGHT      beaming smile           33           1.55E-47
LIGHT      glaring error           32           1.46E-38
SOUND      roaring trade           24           1.19E-36
LIGHT      blinding light          49           3.48E-35
HEAT       blazing row             26           1.84E-34
HEAT       burning question        31           1.74E-33
HEAT       searing pain            43           2.41E-33
LIGHT      blinding flash          25           2.37E-31
LIGHT      glittering career       23           1.14E-30
HEAT       blistering attack       20           9.52E-28
SOUND      ringing endorsement     13           1.30E-25
HEAT       searing heat            44           2.19E-25
SOUND      gushing praise          14           1.39E-24
HEAT       burning ambition        22           7.97E-24
The first thing to note about table 5.3 is the difference in raw frequency and collostruction strength ranking of the tabulated expressions that was anticipated in section 5.2. For instance, as is to be expected, predicates from the source domains of the investigated intensifiers like light and heat occur more frequently in the dataset (117 and 133 tokens, respectively) than nouns from unrelated domains like e.g. trade (24 hits) or shame (31 occurrences). As laid out in section 5.2, collostructional analyses correct for such asymmetries in the frequency of predicates that enter into a particular combination. As a result, combinations like roaring trade (24 tokens, i.e. the only combination in which trade occurs in the data) and crying shame (30 tokens, i.e. all but one of 31 instances of intensified shame) rank higher in the association strength tabulation than blinding light (49 tokens) and searing heat (44 tokens). Table 5.4 reports the results for German. Seeing that these 2x20 pairs account for a mere 6.4% of the 624 significantly attracted combinations in construction A, tables 5.3 and 5.4 of course only scratch the surface of the overall results. Nevertheless, they already provide evidence for interesting
Table 5.4  Top collexeme pairs: construction A (German)

Pattern    Collexeme pair              Frequency    FYE
HEAT       brennende Frage             549          0
HEAT       brennendes Problem          435          0
HEAT       brütende Hitze              746          0
HEAT       flammender Appell           316          0
SOUND      gellendes Pfeifkonzert      238          0
SOUND      gellender Pfiff             218          0
LIGHT      gleißendes Licht            280          0
SOUND      klirrende Kälte             988          0
SOUND      knisternde Spannung         188          0
LIGHT      leuchtendes Beispiel        355          0
LIGHT      leuchtende Farbe            790          0
SOUND      rauschende Ballnacht        338          0
SOUND      rauschendes Fest            847          0
SOUND      schallendes Gelächter       306          0
SOUND      schallende Ohrfeige         633          0
LIGHT      strahlender Sieger          458          0
LIGHT      strahlender Sonnenschein    1318         0
SOUND      tosender Applaus            1404         0
SOUND      tosender Beifall            510          0
HEAT       lodernde Flammen            177          0
semantic patterning in different respects: language-internally, several of the top attracted collocates of the investigated intensifiers are near-synonymous (e.g. English burning issue/question, resounding success/victory, blinding light/flash). In the case of source domain collocations (i.e. expressions in which the intensifier applies to a word from the same semantic category as the intensifying base), this is unremarkable – if the expression is semantically predictable, the intensifier should work just as well for word X as for its regular variant Y. By contrast, figurative expressions of the type resounding success are not entirely predictable: the fact that there is a conventional collocation resounding success does not in itself guarantee that close semantic variants of success like achievement, accomplishment or triumph will likewise idiomatically combine with resounding. Where such variation is found, it is therefore an indication of conventionalised semantic generalisations of the type to be investigated in detail in chapter 6, and not surprisingly near-synonymy as a particularly close connection between meanings is prominent among the relations evidenced in tables 5.3 and 5.4. However, it is not the only kind of semantic connection between entrenched exemplars that is found in the tables. For instance, there are also
metonymic extensions of the kind flammende Rede ‘flaming speech’ > flammender Appell ‘flaming appeal’: here, the meaning to which the intensifier attaches shifts from a communicative act to the intentional content of the designated act. And third, the broad overview above also points to the existence of a number of parallel/independent metaphorisations of one and the same source domain concept (e.g. burning issue/question vs. burning ambition; cf. chapter 6 for details on their presumably distinct motivations). Finally, the fact that only a minority of the top attracted combinations are source domain collocations in which the intensifier carries its literal meaning (25% in English and 40% in German) underscores that intensity collocations of the PERCEPTION type are an interesting object of study for present purposes: whatever it may be in each specific case, the majority of the investigated expressions involves some special conceptual motivation whose scope and potential for inspiring novel coinages of the relevant type can be assessed by intralinguistic and contrastive comparison. For a closer look at these similarities and differences across exemplars and languages, it will now be useful to filter the results for individual lexicalisation patterns.
SOUND
Table 5.5 overleaf reports the top attracted combinations for English and German SOUND intensifiers. Beginning with English, the extended results for SOUND in construction A still contain a number of seemingly isolated collocations that are the only significant combination for the given intensifier among the top attracted pairs (e.g. crashing bore, cracking form, thumping loss). Others are more variegated in their collocational behaviour, and further semantic relations between their instances begin to show up alongside synonymy: these are antonymy (resounding yes/no, resounding victory/defeat) and frame-based metonymic shift (thundering shower/storm as a profiling shift within the domain THUNDERSTORM). Since the German dataset for construction A is much larger, the top 20 combinations in table 5.5 represent a smaller segment of the overall results as compared to English. It is maybe due to this difference that in the results for German, synonymy prevails much more clearly than in English (cf. gellendes/r Pfeifkonzert/Pfiff ‘(barrage of) catcalls’, klirrende/r Kälte/Frost ‘clinking cold/frost’, rauschende/s Ballnacht/Fest/Party ‘whooshing ballnight/celebration/party’, tosender Applaus/Beifall ‘roaring applause’, schreiende/s Ungerechtigkeit/Unrecht ‘screaming injustice’). However,
Table 5.5  Top collexeme pairs: SOUND intensifiers, construction A

English                               German
Collexeme pair         FYE            Collexeme pair                FYE
resounding success     9.24E-125      gellendes Pfeifkonzert        0
thundering shower      2.18E-98       gellender Pfiff               0
crying shame           3.66E-61       klirrende Kälte               0
resounding victory     9.47E-51       knisternde Spannung           0
roaring fire           2.64E-49       rauschende Ballnacht          0
roaring trade          1.19E-36       rauschendes Fest              0
ringing endorsement    1.30E-25       schallendes Gelächter         0
gushing praise         1.39E-24       schallende Ohrfeige           0
resounding yes         1.07E-19       tosender Applaus              0
thundering vengeance   3.37E-19       tosender Beifall              0
crying need            3.63E-18       donnernder Applaus            2.72E-298
crashing bore          6.00E-15       knisternde Erotik             1.65E-158
resounding no          5.88E-14       schreiende Ungerechtigkeit    2.12E-142
resounding defeat      5.67E-12       gellender Schrei              1.47E-138
cracking form          3.12E-10       heulendes Elend               2.78E-138
thundering storm       6.08E-09       rauschende Party              1.20E-123
whimpering coward      7.53E-09       schreiendes Unrecht           9.28E-107
thumping loss          7.79E-09       klirrender Frost              7.53E-93
splashing fun          3.05E-08       gellender Hilferuf            4.27E-46
crunching tackle       3.09E-08       rauschender Erfolg            5.01E-46
more indirect connections between expressions that invoke the same frame-semantic constellation are also found here (e.g. knisternde Spannung/Erotik ‘crackling suspense/eroticism’; cf. chapter 6 for discussion). Comparing the dominant pairs across languages, a few combinations exploit the same construals for deriving their intensifying force (e.g. crying/schreiend, resounding/schallend), but the intensifiers nevertheless combine with different predicates in the two languages. Also language-internally, table 5.5 does not suggest that there is a unified type of meanings that is systematically associated with SOUND intensifiers. Most combinations are idiosyncratic, deriving their intensifying force from item-specific interactions between the lexical semantics of the intensifier and the intensified predicate, either with (knisternde Spannung, klirrende Kälte) or without (thundering shower, whimpering coward, crunching tackle) an element of figurativity involved. Some combinations are difficult to motivate altogether (crashing bore, cracking form, rauschendes Fest), and yet other predicate pairs look similar when considered in isolation (e.g. endorsement/praise), although they actually occur in collocations that betray
different underlying construals in each case (i.e. approval as a substance that is copiously and possibly gratuitously brought forth in the case of gushing praise, and as a forcefully reverberating statement in the case of ringing endorsement). All in all, then, there is little evidence for a remarkable amount of semantic systematicity behind the top exemplars in this domain.
LIGHT
Table 5.6 contrasts the results for LIGHT intensifiers in construction A:

Table 5.6  Top collexeme pairs: LIGHT intensifiers, construction A

English                                 German
Collexeme pair            FYE           Collexeme pair                  FYE
beaming smile             1.55E-47      gleißendes Licht                0
glaring error             1.46E-38      leuchtendes Beispiel            0
blinding light            3.48E-35      leuchtende Farbe                0
blinding flash            2.37E-31      strahlender Sieger              0
glittering career         1.14E-30      strahlender Sonnenschein        0
glittering ceremony       7.76E-21      blendende Form                  8.55E-202
sparkling form            8.75E-21      leuchtendes Vorbild             7.04E-175
dazzling smile            3.87E-16      strahlendes Lächeln             1.88E-169
glaring omission          7.60E-16      gleißendes Scheinwerferlicht    7.49E-157
glaring mistake           1.41E-11      glänzende Karriere              6.74E-123
coruscating brilliance    5.56E-11      gleißendes Sonnenlicht          3.24E-120
glaring problem           1.79E-10      leuchtendes Rot                 8.95E-115
glaring contradiction     4.59E-09      glänzendes Geschäft             1.98E-113
glaring inconsistency     7.94E-09      strahlender Held                9.09E-99
glaring weakness          7.94E-09      glänzende Form                  1.22E-96
sparkling wit             3.28E-08      glänzendes Comeback             9.14E-95
dazzling light            6.01E-08      leuchtendes Gelb                4.19E-82
glaring contrast          2.53E-07      schillernde Farbe               6.94E-67
blinding glare            4.17E-07      strahlender Gewinner            3.75E-60
radiating energy          4.69E-07      glänzender Erfolg               1.18E-54
Here, there are clear semantic patterns in the results. There are relations of synonymy (glaring error/mistake, blinding light/flash/glare), pairings that indicate systematic metaphors (dazzling smile/light, beaming smile) and once more frame-based metonymic shifts (glaring error/mistake vs. glaring omission/problem/contradiction/inconsistency/weakness – all words for concepts that are not strictly interchangeable with error or mistake, but that can be interpreted as such in context). Yet other combinations share a particular evaluative component. For instance, glitter combines with positively valued and/or pleasant phenomena (glittering career/ceremony), whereas glare marks things as deviating from some implicit norm or standard in a disfavoured way. The German results are also structured by relations of synonymy (leuchtendes Beispiel/Vorbild ‘shining example/model’, strahlender Sieger/Gewinner ‘beaming winner’) and, showing up for the first time, (co-)hyponymy (gleißendes Licht/Scheinwerferlicht/Sonnenlicht ‘glaring light/headlights/sunlight’, leuchtende/s Farbe/Rot/Gelb ‘shining colour/red/yellow’). Above all, however, the German results are dominated by a cluster of figurative expressions (blendende Form ‘dazzling form’, strahlender Sieger/Gewinner/Held ‘beaming winner/hero’, glänzende/r Karriere/Geschäft/Form/Comeback/Erfolg ‘gleaming career/deal/form/comeback/success’) that provide an instructive illustration of the interaction of metaphor and metonymy in conceptualisation: paradigmatically, all these expressions are connected to the source domain of their respective intensifiers through a metaphorical connection between SUCCESS and LIGHT. Within the target domain, however, links between individual members of the paradigm are metonymic: all expressions invoke the shared frame-semantic constellation of a protagonist (Sieger/Held/Gewinner) working towards a certain goal (Karriere/Geschäft/Comeback) with the right capabilities (Form) to master it (Erfolg). This shows that a given item may be motivated by more than one type of link to more than one other entrenched exemplar at the same time: for example, strahlender Sieger is connected to strahlendes Licht via metaphor, to strahlender Erfolg via metonymy and to strahlender Gewinner via synonymy. Capturing this kind of multidimensionality will be an important issue to be addressed in chapter 6.
As regards contrastive similarities and differences, the top 20 LIGHT combinations already contain a few direct translation equivalents (beaming smile/strahlendes Lächeln, glittering career/glänzende Karriere), and more are found further down on the list of significant combinations (e.g. sparkling wit/funkelnder Witz [FYE 5.81E-11]). It is not difficult to see that these are ultimately a reflection of shared more general similarities on the conceptual plane (i.e. different metaphorisations of the source domain concept LIGHT that have similar linguistic manifestations in both languages). The fact that these general similarities do not extend to the most fine-grained details (for instance, German gleißen does not show the association with negatively evaluated predicates of its English counterpart glare) underscores that identical conceptual resources may nevertheless find contrasting conventional exploitations in different languages. All in all, however, the situation in this pattern is different from the one in the SOUND domain: there is good evidence that certain meanings are systematically associated with LIGHT intensifiers for principled reasons, and that these associations are furthermore parallel or at least highly similar in both languages.
SMELL
The domain SMELL is different from the two previously considered lexicalisation patterns because it has substantially fewer (potential) intensifiers and also far fewer actually attested intensity expression types involving these intensifiers. As it turns out, only five of these expression types are statistically significant combinations within construction A (two in English and three in German):

Table 5.7 Top collexeme pairs: SMELL intensifiers, construction A

English                              German
Collexeme pair           FYE         Collexeme pair            FYE
stinking hypocrisy       9.39E-13    Stinkwut                  1.14E-85
stinking cold            3.13E-07    duftende Frische          1.62E-13
                                     miefiges Kleinbürgertum   1.26E-07
As a result, there is not much to be said about similarities within and across languages here: the two significant collocations in English are semantically unrelated (apart from the fact that they both designate negatively evaluated phenomena), and they do not have direct counterparts in German. In German, two of the three significant combinations also involve intensifiers denoting unpleasant smells (stinken ‘stink’ and miefen ‘pong’), but they combine with different types of predicates, and there is also a significant combination with a positively valued SMELL word, duften ‘scent, be fragrant with’.

HEAT
Finally, table 5.8 reports the results for the last of the investigated lexicalisation patterns in construction A, HEAT:
Table 5.8 Top collexeme pairs: HEAT intensifiers, construction A

English                              German
Collexeme pair           FYE         Collexeme pair            FYE
burning issue            1.55E-70    brennende Frage           0
blazing row              1.84E-34    brennendes Problem        0
burning question         1.74E-33    brütende Hitze            0
searing pain             2.41E-33    flammender Appell         0
blistering attack        9.52E-28    lodernde Flamme           0
searing heat             2.19E-25    flammende Rede            1.12E-257
burning ambition         7.97E-24    brennendes Thema          7.56E-232
flaming desire           5.45E-21    sengende Hitze            4.39E-231
glowing colour           2.40E-18    glühender Verfechter      3.36E-230
scorching heat           9.03E-17    glühender Verehrer        8.40E-181
broiling sun             1.42E-14    glühender Anhänger        2.54E-163
burning desire           1.86E-14    sengende Sonne            6.77E-157
glowing tribute          9.29E-14    flammendes Plädoyer       5.61E-145
burning love             4.33E-10    glühender Fan             3.27E-75
searing indictment       2.30E-09    brennende Sorge           1.70E-65
blazing inferno          3.29E-09    brennende Aktualität      1.25E-62
blazing sunshine         8.82E-09    brennender Schmerz        1.20E-59
scorching sun            4.42E-08    brennendes Interesse      1.21E-57
steaming heat            5.51E-08    flammender Protest        6.20E-54
blistering heat          5.85E-08    glühender Befürworter     5.73E-45
The English results are remarkable insofar as in this pattern, source domain collocations involving either the word heat itself or otherwise things that are literally hot (sun, sunshine, inferno) make up nearly half of the top combinations. Moreover, three intensifier types occur exclusively with source domain concepts among these top 20 pairs (scorch, broil, steam). Among the remaining items, there are transparent semantic connections of different types: quasi-synonymy in both the predicate (burning issue/question) and the intensifier slot (flaming/burning desire), metaphorical relations (searing heat/indictment, blistering heat/attack) and metonymic profiling shifts (burning love/desire/ambition). In German, there are two larger clusters of closely related expressions (brennende/s Frage/Problem/Thema ‘burning question/problem/issue’, glühender Verfechter/Verehrer/Anhänger/Fan/Befürworter ‘glowing advocate/admirer/follower/fan/supporter’) and a number of metonymic connections (flammende/r/s Appell/Rede/Plädoyer/Protest ‘flaming appeal/speech/plea/protest’, brennende Frage/Aktualität ‘burning issue/timeliness’). As in English, there are three items that occur exclusively with source domain collocates among the top 20 pairs. In both languages, recurrent semantic types in the predicate slot are primarily emotions and, by metonymic extension, experiencers (Verfechter ‘advocate’/Verehrer ‘admirer’/Anhänger ‘supporter’/Fan ‘fan’/Befürworter ‘proponent’), attitudes (ambition, Interesse ‘interest’) and communicative acts and interactions manifesting these attitudes and emotions (row/attack/tribute/indictment, Appell ‘appeal’/Rede ‘speech’/Plädoyer ‘plea’/Protest ‘protest’). There are some other figurative uses that are also found in both languages (issue/question, Frage ‘question’/Problem ‘problem’/Thema ‘topic’; pain, Schmerz ‘pain’), but the use of HEAT intensifiers in combination with emotion terms is clearly dominant in both languages. In order to see whether such similarities are also found between constructions, it is now time to turn to construction B, Int + Adj.

5.4.3. Construction B
At first glance, the top attracted exemplars in construction B do not look like regular variants of their counterparts in construction A:

Table 5.9 Top collexeme pairs: construction B (English)

Pattern  Collexeme pair        Frequency  FYE
SOUND    squeaky clean         102        4.76E-151
LIGHT    glaringly obvious     124        2.01E-108
SMELL    stinking rich         49         1.10E-79
HEAT     blazingly fast        82         1.48E-78
SOUND    creakingly old        38         2.63E-67
HEAT     scorching hot         92         3.39E-61
SOUND    piping hot            54         2.77E-57
SOUND    barking mad           23         2.33E-48
SOUND    cracking good         51         2.62E-44
SOUND    blaringly obvious     55         2.47E-40
HEAT     searingly honest      30         6.41E-40
HEAT     glowing red           32         2.14E-39
SOUND    screamingly funny     49         3.52E-37
LIGHT    dazzling white        52         1.06E-36
HEAT     scaldingly hot        32         1.55E-36
LIGHT    beamingly proud       16         4.77E-36
LIGHT    shiny new             44         1.95E-33
LIGHT    gleamingly white      54         1.98E-33
HEAT     blisteringly fast     46         4.60E-32
LIGHT    sparklingly clean     49         4.36E-31
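To illustrate how an association value of the kind listed in the FYE column of table 5.9 can arise, the following minimal sketch computes a one-sided Fisher-Yates exact p-value from a 2×2 co-occurrence table. It rests on the assumption that the FYE column reports such a p-value. Apart from the observed frequency of squeaky clean (102), all cell counts are hypothetical, so the result is not expected to reproduce the value given in the table.

```python
from math import comb


def fisher_exact_greater(a, b, c, d):
    """One-sided Fisher-Yates exact p-value for the 2x2 table

                      adjective   other adjectives
    intensifier           a              b
    other intensifiers    c              d

    i.e. the probability of observing a co-occurrence frequency of a or more
    if intensifier and adjective were independent (hypergeometric distribution).
    """
    row1 = a + b          # all tokens of the intensifier
    col1 = a + c          # all tokens of the adjective
    n = a + b + c + d     # all tokens of the construction
    upper = min(row1, col1)
    numerator = sum(
        comb(row1, x) * comb(n - row1, col1 - x) for x in range(a, upper + 1)
    )
    return numerator / comb(n, col1)


# Hypothetical counts: only a = 102 (squeaky clean) is taken from table 5.9;
# b, c and d are invented for illustration.
p_value = fisher_exact_greater(102, 20, 5000, 200000)
print(f"{p_value:.2e}")
```

The smaller the resulting value, the more strongly the two slot fillers attract each other, which is why the tables in this section are sorted by increasing FYE.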
In construction B, each intensifier in table 5.9 occurs only once among the top 20 combinations. This is quite a difference to construction A, where this applies for less than half of the intensifier types. Moreover, there is comparably little lexical overlap between them: 75% of the intensifiers in table 5.9 are not found among the top 20 exemplars in construction A. Restricting the focus of attention to just the 20 most strongly attracted items is of course an arbitrary decision that is made purely for ease of exposition, and the overlap is in fact greater if combinations further down on the list (and even more so, non-significant pairings) are taken into account, too. Still, it is interesting to note that at the very top of the association strength rankings, the overlap is small, and the range of intensifiers occurring in construction B is broader. One reason for this is that a greater number of expressions in table 5.9 involve intensifiers that occur in just a single conventional collocation within the larger dataset (squeaky clean, piping hot, barking mad). Since none of the other intensifiers occurs more than once in table 5.9 either, it does not hold cues to a possible semantic patterning of the expressions in which a given intensifier is found. However, table 5.9 indicates that there may also be non-semantic connections between items, as documented by the phonological interference between glare and blare that presumably underlies the coinage blaringly obvious. When it comes to the central question of whether the collocational profile of the investigated intensifiers is primarily determined semantically (such that it should be similar in different grammatical environments) or alternatively a matter of idiosyncratic convention (i.e. 
different from one construction to the next), the comparison does not afford a clear answer: some of the top combinations in construction A that could plausibly be expected to have more or less direct counterparts in construction B do have significant equivalents there (e.g. resounding success/resoundingly successful, FYE: 1.17E-18), others have correspondences that are non-significant but at least attested (glaring error/glaringly wrong, FYE: 2.39E-03), and yet others are plain unattested in the respective other construction (crying shame/?cryingly shameful, piping hot/?piping heat). While crossconstructional analogies like resounding success/resoundingly successful presumably do not occur as particularly striking instances of linguistic creativity to most speakers, other such uses do. Presumably because it is such a restricted collocation, the following extension of the top attracted SOUND exemplar squeaky clean has the intuitive quality of creative wordplay and is deliberately employed for this very reason in order to attract attention to the services of a commercial cleaning company:
(1) The Walsall domestic cleaner to speak to is Squeaky Cleaning who offer a reliable domestic cleaning service across Walsall and surrounding areas. This helps take the grind out of the house cleaning because, let's face it, life is too short to spend your spare time doing domestic cleaning!
    (http://www.thebestof.co.uk/local/walsall/business-guide/feature/squeaky-cleaning-specialist/13056)

It is conceivable that other such cases can be identified in the data when further non-attracted combinations are taken into consideration (squeaky cleaning of course being rare and non-attracted in construction A). However, the focus of the study is not specifically on manifestations of linguistic creativity and collocational variation that are experienced as intuitively pun-like (for whichever reason). If such a distinction is made, the study is no less interested in the determinants of seemingly inconspicuous, run-of-the-mill extensions that may in fact be more revealing with regard to the properties of ordinary language processing than expressions that appear to be deliberately crafted. Of course the difference between these idealised alternatives for the manifestation of creativity-in-reuse is gradual, and intensification was explicitly chosen as the functional target domain of the study because speakers were assumed to make special investments into expressivity in this field. However, the investigation is not specifically devoted to fishing for oddities, which is why I will continue to restrict assessments of cross-constructional collocation overlap to the top attracted exemplars in each pattern.

Moving on, the element of construction-specific idiosyncrasy is even more marked in German. Many of the combinations in table 5.10 are again more or less completely fixed lexicalisations that permit little to no variation in the predicate slot (jammerschade lit. ‘wail-unfortunate’, klatschnass lit. ‘clap-wet’, rappelvoll lit. ‘clatter-full’).
And like in English, many combinations that could be expected to have counterparts in construction A on purely morphological and semantic grounds are nevertheless restricted to the adjectival construction (klatschnass/?Klatschnässe/?klatschende Nässe, knallhart/?Knallhärte/?knallende Härte, rappelvoll/?Rappelfülle/?rappelnde Fülle). On the other hand, there are also some items that do occur in both environments (brennend interessiert/brennendes Interesse, klirrendkalt/klirrende Kälte, glühend heiß/glühende Hitze), so that the picture is mixed here, too.
Table 5.10 Top collexeme pairs: construction B (German)

Pattern  Collexeme pair          Frequency  FYE
LIGHT    blitzblank              663        0
LIGHT    blitzsauber             853        0
LIGHT    blitzschnell            4433       0
HEAT     brennend interessiert   251        0
SOUND    jammerschade            306        0
SOUND    klatschnass             184        0
SOUND    klirrend kalt           162        0
SOUND    knallhart               3612       0
SOUND    rappelvoll              244        0
SMELL    stinknormal             383        0
SMELL    stinksauer              883        0
LIGHT    strahlend blau          470        0
HEAT     brühwarm                148        0
SOUND    knallrot                1459       0
SOUND    knackfrisch             170        0
SOUND    quietschvergnügt        169        1.60E-299
HEAT     glühend heiß            262        5.82E-278
HEAT     siedend heiß            190        9.20E-252
LIGHT    strahlend schön         201        2.29E-240
LIGHT    blendend weiß           165        1.90E-212
SOUND
Coming to the more fine-grained pattern-level comparisons, table 5.11 contrasts the top combinations for English and German SOUND intensifiers in construction B. Still, relations between exemplars are more difficult to trace than in the corresponding data for construction A: in English, only two SOUND intensifiers occur more than once in table 5.11, one of them combining with two near-synonyms (crashingly boring/dull), the other with two less obviously related adjectives (roaringly funny/drunk). The German data are only slightly more telling: again, there are only two recurring intensifiers plus an additional cluster of ideophones combining with the predicate nass ‘wet’. The items that occur more than once provide some evidence for analogical extensions (quietschvergnügt/-fidel/-lebendig lit. ‘squeak-happy/blithe/vivacious’, knallrot/-bunt/gelb, but also -hart). The ideophone cluster is a group of collocations involving a single intensified predicate (nass ‘wet’) that combines with a range of onomatopoeic formatives (klatsch-, pitsch-, patsch-) of which only one is also lexicalised as a verb (klatschen, ‘to smack, clap’; further down on the list, there are yet further variants of the collocation such as plitsch-, platsch- and quitschnass).

Table 5.11 Top collexeme pairs: SOUND intensifiers, construction B

English                               German
Collexeme pair            FYE         Collexeme pair      FYE
squeaky clean             4.76E-151   jammerschade        0
creakingly old            2.63E-67    klatschnass         0
piping hot                2.77E-57    klirrend kalt       0
barking mad               2.33E-48    knallhart           0
cracking good             2.62E-44    rappelvoll          0
blaringly obvious         2.47E-40    knallrot            0
screamingly funny         3.52E-37    knackfrisch         0
resoundingly successful   1.17E-18    quietschvergnügt    1.60E-299
raspingly dry             4.11E-18    quietschfidel       1.23E-148
roaringly funny           4.63E-17    pitschnass          8.76E-141
roaringly drunk           1.75E-16    klapperdürr         3.55E-137
resoundingly positive     9.78E-16    klitschnass         2.92E-124
blaringly loud            4.89E-15    patschnass          1.68E-111
crashingly dull           7.74E-15    knallbunt           4.11E-104
crashingly boring         5.98E-14    raspelkurz          3.31E-84
splashingly good          2.67E-12    piepegal            1.95E-80
droningly repetitive      4.86E-12    quietschlebendig    1.27E-56
thuddingly dull           2.87E-11    knallgelb           6.03E-54
screechingly high         1.19E-10    zischfrisch         2.09E-51
clunkingly obvious        1.62E-10    schreiend bunt      8.94E-44

Is there evidence for any semantic systematicity regarding the adjectives that take SOUND intensifiers? In order to answer this question, it is necessary to first consider the kinds of categories that might count as equivalents to the semantic domains that nouns can be assigned to. One well-known (though not unproblematic)29 approach to the semantic classification of adjectives is Dixon’s (1977) crosslinguistic typology of adjective meanings. Dixon distinguishes the following ten classes:
DIMENSION (e.g. big)
PHYSICAL PROPERTY (e.g. heavy)
COLOUR (e.g. red)
AGE (e.g. new)
VALUE (e.g. good)
SPEED (e.g. fast)
HUMAN PROPENSITY (e.g. proud)
SIMILARITY (e.g. equal)
DIFFICULTY (e.g. easy)
QUALIFICATION (e.g. possible)
Applying these distinctions, the order of categories among the top 20 pairs in English is (in decreasing order): VALUE (dull, boring, good, funny, positive, successful), PHYSICAL PROPERTY (clean, dry, hot, loud) and, with only one item each, DIMENSION (high), AGE (old) and QUALIFICATION (obvious). Moreover, there are two adjectives that do not fit easily into Dixon’s category scheme (mad, drunk). In German, the ranking is PHYSICAL PROPERTY (nass ‘wet’, kalt ‘cold’, hart ‘hard’, voll ‘full’, dürr ‘scrawny’) before COLOUR (rot ‘red’, gelb ‘yellow’, bunt ‘multi-coloured’), HUMAN PROPENSITY (vergnügt ‘amused’, fidel ‘merry’, lebendig ‘vivid’) and, with one type each, DIMENSION (kurz ‘short’) and AGE (frisch ‘fresh’). Two types are again difficult to assign to Dixon’s categories (schade ‘unfortunate’, egal ‘not making a difference’). What do these results suggest? Language-internally, SOUND intensifiers cover a wide range of semantic adjective types in both cases. In English, those of the 18 adjectives in table 5.11 that could be classified are from five out of the ten overall categories, and in German, 18 types are distributed over 5 different classes. Furthermore, at least for some of Dixon’s classes, it is difficult to say whether their position in the ranking in fact says something about the peculiar characteristics of SOUND intensifiers. For instance, it is conceivable that VALUE is at the top of the list in English simply because speakers are more likely to boost the force of their personal value judgments as compared to other types of predications (in which case the German ranking would be unexpected, though). At any rate, the distribution of the top attracted adjectives across Dixon’s classes is quite heterogeneous in both English and German, and also from a contrastive perspective, there is no compelling evidence that the patterning is shaped by general conceptual (and hence contrastively similar) motivations.
For instance, the German data point to a conventionalised synaesthetic connection between SOUND intensifiers and COLOUR predicates that is also evidenced by many expressions further down on the list:

(2) a. Das quietschrote Auto war sogar bereits zu…
       The squeak-red car was even already to
       (PUBLIC R98/MAI.36330)
    b. ...die Besucher an klatschgelben Forsythien…
       the visitors on clap-yellow forsythias
       (PUBLIC L99/APR.16091)
    c. ...ehe er einen knackgrünen Apfel vom Teller nahm…
       before he a crack-green apple from.the plate took
       (PUBLIC H85/UZ1.16696)
    d. … frisch plitschblau gestrichen wirbt die Iglesia…
       freshly PLITSCH-blue painted promotes the Iglesia
       (PUBLIC M00/SEP.58027)
    e. ...zum dunklen Anzug knatschbraune Schuhe.
       to.the dark suit KNATSCH-brown shoes
       (PUBLIC M02/JUN.42900)
    f. Jawohl, lila! So richtig schön knallviolett.
       Yes violet so right beautifully bang-violet
       (PUBLIC K96/SEP.08880)
    g. … eine schreiend rosa Strickjacke:…
       a screaming pink cardigan
       (PUBLIC M01/APR.26041)
    h. Der knallweiße Kirchturm ragt aus dem Dorf...
       the bang-white spire towers out of.the village
       (PUBLIC K98/JUL.55988)
    i. Als krachend bunte Travestie des Showgeschäfts…
       as crashingly colourful travesty of.the show business
       (PUBLIC R98/JUN.45049)
    j. … das leuchtende knatschfarbige Gummibärchen…
       the shiningly KNATSCH-coloured jelly-baby
       (PUBLIC R98/NOV.88497)

Hence, (2) suggests that there is a systematic pattern behind German expressions like knallrot/-gelb/-bunt ‘bang-red/yellow/multicoloured’ in table 5.11 that also extends to other kinds of SOUND intensifiers. However, not all of them are equally productive within the pattern. And although the connection is conceptually motivated through synaesthesia, no comparable pattern is found in English. All in all, then, the evidence for truly consistent semantic patterning in the domain SOUND is only slightly stronger than in construction A, and also cross-constructional equivalences of the type exemplified in (3) and (4) are rather the exception than the norm:

(3) a. The evening was a resounding success.
       (BNC K9P)
    b. Reading fund ‘resoundingly successful’ says arts minister…
       (http://www.culture.gov.uk/Reference_library/Press_notices/archive_2002/dcms101_2002.htm)
(4) a. Knisternde Spannung bis zum Schluß: Erst nach...
       crackling suspense until the end only after
       (PUBLIC O94/APR)
    b. … hatten sich knisternd spannend ausgenommen.
       had REFL cracklingly suspenseful seemed
       (PUBLIC RHZ05/AUG.16558)

LIGHT
Maybe in part because it simply has substantially fewer intensifiers (i.e. different items that could potentially occur high up in the list), the situation is different among the top exemplars in pattern II, LIGHT EMISSION:

Table 5.12 Top collexeme pairs: LIGHT intensifiers, construction B

English                               German
Collexeme pair            FYE         Collexeme pair          FYE
glaringly obvious         2.01E-108   blitzblank              0
dazzling white            1.06E-36    blitzsauber             0
beamingly proud           4.77E-36    blitzschnell            0
shiny new                 1.95E-33    strahlend blau          0
gleamingly white          1.98E-33    strahlend schön         2.29E-240
sparklingly clean         4.36E-31    blendend weiß           1.90E-212
sparklingly clear         7.30E-23    leuchtend rot           7.88E-189
glisteningly white        2.40E-21    blitzgescheit           3.51E-184
glaringly apparent        3.34E-20    strahlend weiß          8.88E-150
blindingly fast           3.46E-20    leuchtend gelb          3.90E-147
gleamingly new            9.08E-19    gleißend hell           2.80E-105
beamingly happy           2.38E-17    leuchtend orange        3.90E-62
glisteningly wet          2.49E-13    leuchtend grün          6.78E-57
shimmeringly beautiful    1.81E-11    leuchtend blau          5.44E-49
shiny white               1.82E-11    glänzend poliert        4.06E-32
glitteringly successful   4.09E-11    glänzend gemeistert     5.46E-32
sparklingly fresh         8.29E-10    strahlend hell          9.16E-27
scintillatingly witty     8.62E-09    glänzend vorbereitet    3.53E-21
glaringly absent          9.71E-08    leuchtend farbig        5.88E-18
coruscatingly brilliant   9.29E-07    schillernd bunt         7.92E-18
Here, many items in both English and German appear more than once in the table, and a number of combinations are also straightforwardly related.
Semantic relations evidenced in table 5.12 include synonymy (glaringly obvious/apparent, sparklingly clean/clear, blitzblank/-sauber), (co-)hyponymy (strahlend blau/weiß/gelb, leuchtend rot/gelb/orange/grün/blau/farbig), metaphor (strahlend hell/schön), metonymy (sparklingly clean/fresh, gleamingly white/new, shiny white/new, glänzend vorbereitet/gemeistert), and, for want of a better term, a relation that could be identified as higher-level co-hyponymy (meaning membership in the same broader semantic field, e.g. strahlend gelb/hell). In terms of Dixon’s categories, the predicate types in table 5.12 are again different in English and German. In English, the category ranking among the top 20 exemplars is VALUE30 (beautiful, successful, brilliant, witty), PHYSICAL PROPERTY (clean, clear, wet), HUMAN PROPENSITY (happy, proud), AGE (new, fresh), QUALIFICATION (obvious, apparent), and, with one item each, SPEED (fast) and COLOUR (white). This amounts to seven out of Dixon’s ten classes already among the top 19 items (absent does not fit easily into the scheme), and the source domain COLOUR surprisingly coming at the very end. In German, it is by far the most frequent category among the top combinations (blau ‘blue’, weiß ‘white’, rot ‘red’, gelb ‘yellow’, orange ‘orange’, grün ‘green’, hell ‘bright’, farbig ‘coloured’, bunt ‘multi-coloured’), followed by VALUE (schön ‘beautiful’, gescheit ‘smart’), PHYSICAL PROPERTY (blank ‘shiny’, sauber ‘clean’) and SPEED (schnell ‘fast’). Three items stand out because they are in fact adjectivally used verbal participles (poliert ‘polished’, gemeistert ‘mastered’, vorbereitet ‘prepared’).
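The English category ranking just given can be made explicit as a simple tally. The following sketch uses the class assignments exactly as listed in the preceding paragraph for the English LIGHT data; the function name and the data layout are illustrative assumptions and not part of the study’s own procedure.

```python
from collections import Counter

# Class assignments for the English LIGHT data, as listed in the text.
DIXON_CLASS = {
    "beautiful": "VALUE", "successful": "VALUE",
    "brilliant": "VALUE", "witty": "VALUE",
    "clean": "PHYSICAL PROPERTY", "clear": "PHYSICAL PROPERTY",
    "wet": "PHYSICAL PROPERTY",
    "happy": "HUMAN PROPENSITY", "proud": "HUMAN PROPENSITY",
    "new": "AGE", "fresh": "AGE",
    "obvious": "QUALIFICATION", "apparent": "QUALIFICATION",
    "fast": "SPEED",
    "white": "COLOUR",
}

# Adjective types among the top 20 English pairs in table 5.12.
ADJECTIVES = [
    "obvious", "white", "proud", "new", "clean", "clear", "apparent", "fast",
    "happy", "wet", "beautiful", "successful", "fresh", "witty", "absent",
    "brilliant",
]


def rank_classes(adjectives, class_of):
    """Count adjective types per class (largest first) and collect
    the adjectives that have no class assignment."""
    counts = Counter(class_of[adj] for adj in adjectives if adj in class_of)
    unclassified = [adj for adj in adjectives if adj not in class_of]
    return counts.most_common(), unclassified


ranking, unclassified = rank_classes(ADJECTIVES, DIXON_CLASS)
for name, n_types in ranking:
    print(f"{name}: {n_types}")
print("unclassified:", unclassified)
```

The tally counts adjective types rather than collexeme pairs, so an adjective such as white enters only once even though it combines with several intensifiers in the table.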
What is similar in both languages is the association between LIGHT intensifiers and VALUE predicates, and this also marks a connection to the semantic patterns observed in construction A: several nouns that occur in the metaphorical patterns found there (most of them not among the top 20 pairs though) are derivationally related to VALUE predicates that are also found in construction B, thus supporting the idea that these intensifiers are indeed associated with recurrent meanings rather than individual words that may be different from one construction to the next:

(5) a. …and bowl him over with my sparkling wit!
       (BNC C8A)
    b. A dark, sparklingly witty novel about life, love, death...
       (http://www.orionbooks.co.uk/HB-32995/Luck.htm)
(6) a. In strahlender Schönheit strebt das weiße Flugobjekt…
       in beaming beauty drifts the white flight-object
       (PUBLIC R99/MÄR.20949)
    b. Das strahlend schöne Wetter zum WM-Auftakt in...
       the beamingly beautiful weather to.the world-cup start in
       (PUBLIC N96/FEB.06868)

SMELL
Table 5.13 has the results for the next pattern, SMELL:

Table 5.13 Top collexeme pairs: SMELL intensifiers, construction B

English                           German
Collexeme pair        FYE         Collexeme pair    FYE
stinking rich         1.10E-79    stinknormal       0
stinking drunk        2.68E-06    stinksauer        0
stinkingly sweet      1.35E-05    stinkreich        1.63E-131
stinkingly corrupt    1.71E-05    stinklangweilig   2.6534E-122
                                  stinkfaul         6.41845E-35
                                  stinkfad          1.25733E-26
                                  stinkfrech        4.74515E-14
                                  stinkfein         5.93651E-07
                                  stinkbesoffen     1.99589E-05
In both languages, only one single intensifier has significant collocations in the data. However, again in both languages, the adjectives that it combines with are with two exceptions (German stinklangweilig lit. ‘stink-boring’ vs. stinkfad lit. ‘stink-dull’ vs. stinknormal lit. ‘stink-normal’) semantically unrelated. In cross-constructional perspective, the top combination in German, stinksauer ‘stink-angry’, is closely related to the top exemplar in construction A, Stinkwut ‘stink-anger’. All others are restricted to the adjectival construction even if direct morphological equivalents exist (?Stinklangweile/-reichtum/-faulheit).

HEAT
Finally, table 5.14 has the results for HEAT intensifiers in construction B.

Table 5.14 Top collexeme pairs: HEAT intensifiers, construction B

English                                German
Collexeme pair             FYE         Collexeme pair             FYE
blazingly fast             1.48E-78    brennend interessiert      0
scorching hot              3.39E-61    brühwarm                   0
searingly honest           6.41E-40    glühend heiß               5.82E-278
glowing red                2.14E-39    siedend heiß               9.20E-252
scaldingly hot             1.55E-36    brennend aktuell           1.33E-181
blisteringly fast          4.60E-32    kochend heiß               4.99E-173
boiling hot                3.98E-27    brütend heiß               1.40E-160
fuming mad                 8.26E-22    brennend heiß              9.01E-154
steaming hot               2.52E-21    flammend rot               5.35E-62
sizzling hot               8.76E-20    sengend heiß               8.92E-29
blistering hot             3.48E-19    brennend interessant       8.90E-28
smo(u)lderingly sexy       9.73E-19    brennend gern              9.21E-25
burningly bright           1.28E-15    brennend scharf            1.08E-24
sizzlingly sexy            2.15E-15    brühend heiß               2.93E-23
seethingly jealous         1.94E-14    glühend verehrt            5.43E-19
steaming drunk             1.55E-12    brennend intensiv          6.90E-16
seethingly angry           2.00E-12    brennend wichtig           1.01E-06
flaming gay                6.56E-12    glühend dafür              3.41E-06
smo(u)lderingly handsome   3.47E-11    glühend dagegen            3.41E-06
blisteringly quick         2.28E-10    glühend leidenschaftlich   3.41E-06

Even though the pattern HEAT has only 20 different intensifiers altogether in construction B in English, nine of the English HEAT intensifiers among the top 20 combinations occur only once in the table. Relations among the recurrent types include quasi-synonymy (blisteringly fast/quick, smoulderingly sexy/handsome), co-hyponymy (seethingly jealous/angry) and metaphor (sizzling[ly] hot/sexy). In German, a large share of the top combinations is taken up by just two intensifiers, brennen ‘burn’ and glühen ‘glow’. Semantic connections between German combinations in table 5.14 include (near-)synonymy (brühwarm/brühend heiß), antonymy (glühend dafür/dagegen), metaphor (glühend heiß/verehrt) and metonymy (brennend interessiert/interessant). In terms of Dixon’s taxonomy, the predominant class among the top 20 combinations in English is HUMAN PROPENSITY (honest, angry, jealous, gay), followed by VALUE (sexy, handsome), SPEED (fast, quick), COLOUR (red, bright) and finally PHYSICAL PROPERTY (hot). As was already the case for SOUND, the top combinations for HEAT intensifiers also include the two (more difficult to categorise) adjectives mad and drunk. In German, the ranking is PHYSICAL PROPERTY (warm ‘warm’, heiß ‘hot’, scharf ‘spicy’) and HUMAN PROPENSITY (interessiert ‘interested’, leidenschaftlich ‘passionate’, gern ‘gladly’) before VALUE (wichtig ‘important’, interessant ‘interesting’), AGE (aktuell ‘up to date’) and COLOUR (rot ‘red’). Four items do not fit into Dixon’s scheme (intensiv ‘intensive’, dafür ‘in favour of’,
dagegen ‘opposed to’ and verehrt ‘admired’). Taken together, the category that sticks out is HUMAN PROPENSITY, which fits well with the association of HEAT intensifiers with aspects of mental life that was observed in construction A. Significant collocational equivalents across constructions (other than source domain expressions) are also commonly (seethingly angry/seething anger, burningly ambitious/burning ambition, burningly curious/burning curiosity), though not exclusively (blisteringly fast/blistering speed) from this class of examples:

(7) a. …a mixture of Cockney wisecracking and seething anger.
       (BNC CHA)
    b. This makes me really, really, seethingly angry.
       (http://news.bbc.co.uk/1/hi/talking_point/2896303.stm)
(8) a. Das erklärt das brennende Interesse des Autors an den…
       this explains the burning interest of.the author on the
       (PUBLIC P98/MAI.20087)
    b. … auch Europa an einer Lösung brennend interessiert sein.
       also Europe on a solution burningly interested be
       (PUBLIC M04/406.43099)

5.4.4. Construction C
Table 5.15 reports the top attracted collexeme pairs within construction C in English.

Table 5.15 Top collexeme pairs: construction C (English)

Pattern  Collexeme pair         f     FYE
SOUND    grunt with effort      48    1.40E-77
SOUND    hoot with laughter     87    1.27E-74
SOUND    roar with laughter     92    1.79E-68
SOUND    squeal with delight    80    3.88E-67
SOUND    cackle with glee       48    5.02E-61
SOUND    whoop with delight     65    5.18E-44
SOUND    hum with activity      28    5.42E-33
SOUND    creak with age         15    3.27E-28
LIGHT    glisten with sweat     20    6.33E-26
LIGHT    beam with pride        23    3.78E-25
SOUND    gush with praise       16    1.76E-24
HEAT     glow with health       24    5.38E-23
SOUND    fizz with energy       25    1.77E-22
SOUND    groan with food        13    1.86E-21
SOUND    buzz with activity     22    2.34E-21
SOUND    grunt with exertion    12    2.72E-21
HEAT     glow with pride        29    3.35E-21
SOUND    crackle with energy    21    7.97E-20
HEAT     glow with light        40    1.05E-19
SOUND    fizz with ideas        16    1.16E-18

75% of the combinations in this table involve intensifiers from the domain SOUND, and many of these are about human property bearers that literally produce these sounds in response to their being strongly affected by the element in the second slot: recalling the continuum from the literal/two-proposition/causal source structure to the hyperbolic/one-proposition/property ascription variant of construction C, expressions like e.g. grunt with effort/exertion, hoot/roar with laughter and squeal/whoop with delight are more towards the literal end of the spectrum, whereas types like creak with age, hum with activity and fizz with ideas are more towards the hyperbolic and ultimately clearly figurative end. The more literal variant tends to occur with predicates denoting bodily experiences and processes (effort, laughter, exertion), but also mental states (delight, glee). The more obviously hyperbolic/figurative variant occurs with emotions and other psychological phenomena (pride, ideas), but also with other kinds of abstract qualities (health, age, energy). The top attracted combinations in English also include at least one genuinely locative type in which a large amount of some (concrete) locatum is said to be located in a likewise concrete location (groan with food as in the tables were groaning with food). Table 5.16 has the results for German. Like in English, the table is dominated by more literal variants of the construction with sound intensifiers, typically human subjects, and either physical sensations/processes or emotions in the oblique slot (brüllen vor Lachen ‘scream with laughter’, schreien/wimmern vor Schmerz/Angst ‘cry/whimper with pain/fear’, schnauben vor Wut ‘snort with anger’, quietschen/johlen/kreischen vor Vergnügen ‘squeak/shriek with pleasure’). The restriction to human subjects also applies for most of the figurative variants in table 5.16 (brennen vor Ehrgeiz ‘burn with ambition’, kochen vor Wut ‘boil with anger’, strahlen vor Freude/Glück ‘beam with joy/happiness’, rauchen vor Zorn ‘fume with rage’). Types in which non-human participants are associated with either directly perceptible/concrete (flimmern/flirren vor Hitze ‘flicker with heat’, klirren vor Kälte ‘clink with cold’, blitzen vor Sauberkeit ‘flash with cleanliness’) or abstract properties (knistern vor Spannung/Erotik ‘crackle with
150
Fixed expressions: From words to collocations
Table 5.16 Top collexeme pairs: construction C (German) Pattern HEAT SOUND HEAT SOUND SOUND LIGHT SOUND SOUND LIGHT SOUND SOUND SOUND LIGHT SOUND SOUND SOUND SOUND LIGHT HEAT LIGHT
Collexeme pair brennen vor Ehrgeiz brüllen vor Lachen kochen vor Wut schreien vor Schmerz knistern vor Spannung strahlen vor Glück quietschen vor Vergnügen knistern vor Erotik strahlen vor Freude schnauben vor Wut wimmern vor Schmerz schreien vor Angst flirren vor Hitze johlen vor Vergnügen kreischen vor Vergnügen klatschen vor Begeisterung klirren vor Kälte flimmern vor Hitze rauchen vor Zorn blitzen vor Sauberkeit
f 109 108 99 174 31 68 41 12 81 22 24 35 9 20 23 18 7 9 8 8
FYE 3.58E-143 1.59E-101 2.04E-93 6.55E-69 7.52E-48 1.40E-40 4.28E-34 4.33E-20 8.29E-19 9.07E-19 1.57E-18 8.56E-15 1.39E-14 1.53E-13 2.55E-13 4.53E-13 7.69E-13 9.63E-12 1.58E-11 1.95E-11
suspense/eroticism’) are in the minority, and there is no prototypically locative instance among the top 20 in which the head of the oblique NP denotes a concrete locatum. SOUND
Looking at individual patterns, table 5.17 has the extended results for SOUND.

Table 5.17 Top collexeme pairs: SOUND intensifiers, construction C

English                         German
Collexeme pair       FYE        Collexeme pair              FYE
grunt with effort    1.40E-77   brüllen vor Lachen          1.59E-101
hoot with laughter   1.27E-74   schreien vor Schmerz        6.55E-69
roar with laughter   1.79E-68   knistern vor Spannung       7.52E-48
squeal with delight  3.88E-67   quietschen vor Vergnügen    4.28E-34
cackle with glee     5.02E-61   knistern vor Erotik         4.33E-20
whoop with delight   5.18E-44   schnauben vor Wut           9.07E-19
hum with activity    5.42E-33   wimmern vor Schmerz         1.57E-18
creak with age       3.27E-28   schreien vor Angst          8.56E-15
gush with praise     1.76E-24   johlen vor Vergnügen        1.53E-13
fizz with energy     1.77E-22   kreischen vor Vergnügen     2.55E-13
groan with food      1.86E-21   klatschen vor Begeisterung  4.53E-13
buzz with activity   2.34E-21   klirren vor Kälte           7.69E-13
grunt with exertion  2.72E-21   schnurren vor Behagen       2.73E-10
crackle with energy  7.97E-20   jubeln vor Begeisterung     7.90E-09
fizz with ideas      1.16E-18   jubeln vor Freude           2.68E-08
crow with delight    3.31E-17   johlen vor Begeisterung     5.43E-08
whimper with fear    7.32E-17   stöhnen vor Schmerz         1.54E-07
rumble with thunder  9.31E-17   knurren vor Hunger          1.70E-07
buzz with news       4.26E-16   wiehern vor Lachen          1.89E-07
purr with pleasure   5.90E-16   schnattern vor Kälte        7.39E-07

Since most of the top combinations in the combined ranking were from the domain SOUND, much of this has already been commented on above. In English, four out of the five newcomers to the ranking in table 5.17 are of the dominant type with literal meaning and a human subject. In German, each of the eight additions is of the same type. Semantically, then, the associations of SOUND intensifiers in construction C are largely inconspicuous. The few types that are maybe not immediately predictable are hum with activity, gush with praise, fizz with energy/ideas and buzz with activity/news in English, as well as knistern vor Spannung/Erotik and klirren vor Kälte in German. Several of them have been commented on above already since they have significant equivalents in the other constructions. Nevertheless, in this semantic class, significant full equivalences (i.e. expressions involving morphological variants of exactly the same predicate in all three constructions) are only found in German. English has some fully parallel examples, too, yet none where all three combinations reach statistical significance:

(9) a. …his previous visit he'd been full of gushing enthusiasm… (BNC F9C)
    b. The DVD has a gushingly enthusiastic commentary… (http://www.thefinalword.co.uk/content/view/72/41/)
    c. Very friendly, but not gushing with fake enthusiasm like… (http://www.viewlondon.co.uk/review_2289.html)

(10) a. Die klirrende Kälte bis zu minus 14 Grad machte…
        the clinking cold up to minus 14 degrees made
        (PUBLIC R97/JAN. 00113)
     b. Ganz egal, ob es draußen klirrend kalt ist oder…
        Wholly indifferent whether it outside clinking cold is or
        (PUBLIC RHZ04/MAR.01703)
     c. … und die Hochebene vor Kälte klirrte, erfroren auf…
        and the plateau with cold clinked frozen on
        (PUBLIC M00/009.52048)

However, even counting non-significant combinations, such exact matches are rare, and several combinations that would have regular morphological variants in constructions A and B are nevertheless unattested there (cf. hum with activity/?humming activity/?hummingly active).

LIGHT
The results for pattern II, LIGHT EMISSION, are reported in table 5.18.

Table 5.18 Top collexeme pairs: LIGHT intensifiers, construction C

English                             German
Collexeme pair           FYE        Collexeme pair              FYE
glisten with sweat       6.33E-26   strahlen vor Glück          1.40E-40
beam with pride          3.78E-25   strahlen vor Freude         8.29E-19
beam with pleasure       2.65E-11   flirren vor Hitze           1.39E-14
glisten with dew         3.71E-11   flimmern vor Hitze          9.63E-12
twinkle with stars       4.58E-11   blitzen vor Sauberkeit      1.95E-11
glisten with tears       1.49E-10   flimmern vor Gluthitze      4.01E-09
glitter with gold        2.51E-09   strahlen vor Optimismus     1.15E-08
glitter with stars       5.91E-09   blitzen vor Chrom           3.84E-08
gleam with polish        1.55E-08   strahlen vor Stolz          1.31E-07
sparkle with diamonds    2.13E-08   strahlen vor Zufriedenheit  3.69E-06
glimmer with light       2.42E-08   schimmern vor Wißbegier     7.79E-06
flash with anger         3.65E-08   funkeln vor Bosheit         1.60E-05
shimmer with elegance    4.11E-07   glänzen vor Gold            1.73E-05
blink with surprise      4.69E-07   glänzen vor Schweiß         1.73E-05
sparkle with wit         4.78E-07   strahlen vor Vergnügen      3.90E-06
beam with joy            1.84E-06
twinkle with light       3.71E-06
glitter with diamonds    4.73E-06
twinkle with humour      6.48E-06
blink with astonishment  6.99E-06

In English, combinations that share the same intensifier are again transparently related (glisten with sweat/dew/tears, beam with pride/pleasure/joy, glitter with gold/stars/diamonds, blink with surprise/astonishment), and some of them are indicative of systematic metaphorical connections (twinkle with light/humour, sparkle with diamonds/wit). Most common are source domain expressions in which the predicate denotes either LIGHT or COLOUR itself (light, gold) or alternatively an object that either emits (stars) or reflects light in a salient way (polish, diamonds, sweat, dew, tears). Other common predicates denote mental states and emotions (pride, pleasure, joy, anger, surprise, astonishment). Where the subject argument is a person (e.g. He beamed with pride), the intensifier is unambiguously figurative. A link between these figurative types and the literal source uses is provided by metonymic variants that are predicated of the experiencer’s eyes (or some metonymically related predicate like gaze or stare): though already clearly on the hyperbolic side (angry people’s eyes do not literally flash), they are arguably still more towards the literal pole (i.e. the designated event does involve some actual light reflection) as compared to purely metaphorical expressions like sparkle with wit. The latter extend the emotion and mental state uses to other types of abstract mental phenomena for which a number of additional instances are found further down on the list (e.g. charm, humour, genius, irony, intelligence etc.).

The German results are more or less exactly parallel: at the top of the list are words for objects or substances (Chrom ‘chrome’, Gold ‘gold’, Schweiß ‘sweat’) or properties of such entities (Sauberkeit ‘cleanliness’, Hitze ‘heat’, Gluthitze lit. ‘glow-heat’) that literally reflect/emit light in a specific way. Next up are words for emotions (Freude ‘happiness’, Glück ‘joy’, Vergnügen ‘pleasure’, Zufriedenheit ‘contentedness’, Stolz ‘pride’) and related mental phenomena like attitudes (Optimismus ‘optimism’, Wißbegier ‘thirst for knowledge’). Both in English and in German, emotions and mental phenomena intensified with LIGHT intensifiers show a strong tendency to be positively valued (as would be expected from the larger polysemy structure of LIGHT and DARKNESS and the fact that valuable concrete objects like e.g. jewellery are also physically shiny). Cross-constructionally, there are again some collocational equivalence sets in both languages, but none of them (in either language) reaches significance in all three environments at the same time:

(11) a. ...frothy dialogue and sparkling wit tend to overshadow… (BNC G1N)
     b. A dark, sparklingly witty novel about life, love, death... (http://www.orionbooks.co.uk/HB-32995/Luck.htm)
     c. The terraces used to sparkle with wit; songs were… (BNC K4T)

(12) a. …versprüht nach außen strahlenden Optimismus…
        sprays to outside beaming optimism
        (PUBLIC P94/FEB.06750)
     b. ...klang nicht ganz so strahlend optimistisch wie früher.
        sounded not wholly so beamingly optimistic as before
        (PUBLIC R99/APR.30231)
     c. Damals strahlte er noch vor Optimismus, daß...
        Back then beamed he still with optimism that
        (PUBLIC P91/SEP.02580)

SMELL
As before, the number of significant pairs in pattern III is small:

Table 5.19 Top collexeme pairs: SMELL intensifiers, construction C

English                            German
Collexeme pair          FYE        Collexeme pair    FYE
reek with filth         5.44E-15   stinken vor Geld  2.04E-05
reek with blood         1.36E-12
reek with stench        2.10E-11
reek with perspiration  5.48E-07
Cross-constructionally, the table provides evidence for analogical extension also in this domain: German stinken vor Geld ‘stink with money’ is clearly modelled on its much more frequent adjectival counterpart stinkreich ‘stink-rich’. However, none of the remaining collocates in construction B that were discussed above (e.g. sauer/langweilig/normal etc.) has a counterpart in construction C, just as stinkreich does not have an equivalent ?Stinkreichtum in construction A. The few significant combinations in English are all semantically inconspicuous.

HEAT
Finally, table 5.20 reports the top combinations for the last pattern in the last environment, HEAT intensifiers in construction C.

Table 5.20 Top collexeme pairs: HEAT intensifiers, construction C

English                            German
Collexeme pair          FYE        Collexeme pair            FYE
glow with health        5.38E-23   brennen vor Ehrgeiz       3.58E-143
glow with pride         3.35E-21   kochen vor Wut            2.04E-93
glow with light         1.05E-19   rauchen vor Zorn          1.58E-11
seethe with anger       2.14E-17   glühen vor Hitze          2.79E-09
burn with desire        8.61E-16   glühen vor Stolz          2.93E-07
boil with rage          1.32E-15   brennen vor Tatendrang    1.42E-06
blaze with light        1.72E-14   brennen vor Leidenschaft  3.67E-06
sear with pain          2.44E-14   kochen vor Zorn           1.44E-05
fume with anger         2.60E-11   kochen vor Eifersucht     2.29E-05
simmer with tension     4.49E-11   kochen vor Ärger          6.21E-05
blaze with fire         6.44E-11   brennen vor Schmerz       1.80E-05
seethe with discontent  7.52E-11
seethe with rage        1.20E-10
melt with heat          5.96E-10
glow with happiness     1.30E-09
simmer with resentment  2.26E-09
glow with warmth        2.28E-09
boil with anger         2.44E-09
burn with fever         9.88E-09
seethe with resentment  2.63E-08

As was already the case in constructions A and B, the results for HEAT resemble those for LIGHT in a number of ways: in both languages, there are source domain collocations relating to heat or hot things (heat/warmth/fire/fever, Hitze ‘heat’) and then one dominant cluster of metaphorical extensions, as in the case of LIGHT intensifiers relating to emotions and mental states. In contrast to the latter, however, the emotion terms collocating with HEAT intensifiers tend to be unpleasant and/or have an antagonistic element to them (anger, rage, discontent, resentment; Wut ‘fury’, Zorn ‘rage’, Ärger ‘anger’, Eifersucht ‘jealousy’), though there are also some exceptions to this tendency (pride, desire, happiness; Stolz ‘pride’, Ehrgeiz ‘ambition’, Leidenschaft ‘passion’, Tatendrang ‘zest for action’). Two examples of intensifier-predicate combinations that are found in all three constructions are (13) and (14):

(13) a. My burning ambition is to be world champion. (BNC K4T)
     b. ...and Minnie, his burningly ambitious mother... (http://www.armchairfans.co.uk/books/0879100362/)
     c. But, even at 40, he still burns with ambition to dump... (BNC CH3)
(14) a. Glühender Neid auf das Glück der Familie hatten sich...
        glowing envy on the happiness of.the family had REFL
        (PUBLIC R98/FEB.16716)
     b. Glühend beneidet um sein prachtvolles Haus…
        Glowingly envied for his splendid house
        (PUBLIC A98/JUL.46275)
     c. ... Kaca in Sydney glüht vor Neid, als sie erfährt, daß..
        Kaca in Sydney glows with envy as she learns that
        (PUBLIC M98/807.54701)

This closes the descriptive overview of the collocational behaviour of SOUND, LIGHT, SMELL and HEAT intensifiers in the three investigated constructions.
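The FYE columns in tables 5.15 to 5.20 can be read as association scores of the kind used in collostructional analysis, assuming that FYE abbreviates the p-value of the Fisher-Yates exact test on a 2×2 table of co-occurrence counts. The following Python sketch shows how such a value could be computed as a one-sided test for attraction. Only the pair frequency of 48 for grunt with effort is taken from table 5.15; the marginal totals are not given in this section, so all other counts are hypothetical and the printed value is not expected to match the table.

```python
from math import comb


def fisher_exact_greater(a, b, c, d):
    """One-sided Fisher-Yates exact p-value for the 2x2 table [[a, b], [c, d]].

    a: verb and noun together in the construction
    b: verb with other nouns
    c: noun with other verbs
    d: all remaining tokens of the construction
    Returns the probability of observing a or more co-occurrences
    if verb and noun were independent (hypergeometric upper tail).
    """
    row1 = a + b          # all tokens of the verb
    col1 = a + c          # all tokens of the noun
    n = a + b + c + d     # all tokens of the construction
    upper = min(row1, col1)
    tail = sum(comb(row1, k) * comb(n - row1, col1 - k)
               for k in range(a, upper + 1))
    return tail / comb(n, col1)


# 48 = f of "grunt with effort" (table 5.15); every other count is HYPOTHETICAL.
pair_freq = 48
verb_total = 60       # hypothetical: grunt in construction C
noun_total = 500      # hypothetical: effort in construction C
corpus_total = 20000  # hypothetical: all tokens of construction C

p_value = fisher_exact_greater(
    pair_freq,
    verb_total - pair_freq,
    noun_total - pair_freq,
    corpus_total - verb_total - noun_total + pair_freq,
)
print(f"{p_value:.2e}")
```

Exact integer arithmetic is used for the binomial coefficients so that very small p-values, such as those at the top of the rankings, are not lost to floating-point rounding before the final division.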
5.5. Summary and discussion
Chapter 5 has examined the actual attestations of the investigated intensifiers in the targeted constructional environments in both languages. Section 5.2 has prepared the ground with a brief discussion of the notion of formulaicity and sketched its close relationship to linguistic creativity as understood from the usage-based perspective adopted in this study. Furthermore, the relationship between frequency patterns in corpora and cognitive entrenchment was discussed. Section 5.3 has introduced the method employed for the empirical identification of formulaic and, by assumption, cognitively entrenched realisations of the target structures: collostructional analysis. Finally, the empirical analyses reported in section 5.4 have examined the syntagmatic associations between the attested intensifiers and the range of predicates that they were found to occur with in all six targeted environments.

What do these results show? The concluding discussion in chapter 4 has suggested that from the bird’s eye view, differences in the way the four semantic target domains are exploited for the encoding of intensity outweigh the similarities, both between languages and constructions. The more detailed perspective on the data developed in the present chapter indicates that this observation needs to be qualified in two respects. On the one hand, the patterning of concrete individual expressions within these categories casts doubt on the assumption that speakers in fact draw such pattern-wide generalisations about the intensifying potential of the four target categories in the first place. It was already observed in chapter 4 that certain items have a much higher type frequency in their intensifying use than others of the same class, some of which are unattested as intensifiers altogether. And even if the focus is restricted to just those items that are attested as intensifiers, their concrete collocations within one and the same constructional environment are often mutually incompatible, which shows that they must ultimately be learned on an item-by-item basis (cf. steaming/?flaming drunk vs. flaming/?steaming gay). In fact, even different uses of one and the same intensifier within the same constructional environment may involve unrelated construals that give rise to the intensity implication (cf. burning issue/ambition). This underscores the fact that there are many individual pathways to intensity meaning, some of them pragmatic and highly context-dependent, some of them semanticised as a conventional figurative construal, but in either case not predictable from a given item’s source domain alone. All of this certainly does not invalidate the descriptive generalisations made in chapter 4. However, it stresses their status as descriptions of the object languages that do not necessarily capture a generalisation that is also relevant to speakers.

On the other hand, as regards similarities and differences between constructions and languages, an account of those exploitations of the four categories that are actually found in naturally occurring usage indicates that they are often shaped by the same underlying motivations. True, concrete entrenched exemplars do not generally have direct equivalents in the respective other constructions. While this again emphasises the strong formulaic component of the targeted expressions, the data nevertheless also evidence some such extensions of entrenched collocations from one construction to the next, and in some cases even parallel routine expressions in two or all three environments.
This indicates that at least within certain regions of their usage patterns, the three constructions are recognised as principled variants of one another that should permit an intralinguistic collocation transfer, if only to achieve rhetorical effect through an attention-getting but still recoverable formulation (e.g. squeaky cleaning). Likewise, many common expressions also have direct translation equivalents between languages (e.g. burning ambition/brennender Ehrgeiz, stinking rich/stinkreich, beam with pride/strahlen vor Stolz). This indicates that the conceptual motivations for particular intensifier-predicate combinations are not only the same from one construction to the next, but may also hold (at least in similar ways) for speakers of both languages. Particularly prominent among the semantic regularities that were found to recur across constructions and languages are figurative links between the domains LIGHT and HEAT on the one hand and the domains VALUE, INTELLECTION and EMOTION on the other.

In short, the status of semantic pattern-level generalisations over PERCEPTION intensifiers (as envisaged in chapter 4) is dubious from a cognitive point of view. Speakers work with much more specific representations. Moreover, the concrete exploitations of the four target patterns in the form of specific intensifier-predicate combinations are often highly similar across constructions and languages. This points to interesting tensions between formulaicity and motivation in the data that will be explored in detail in chapter 6, which is concerned with the status of those generalisations that can be attributed to speakers on the basis of a more bottom-up approach to the data than that of chapter 4.
Chapter 6 Incipient productivity: From collocations to constructional schemas
6.1. Introduction
Chapter 6 moves from the consideration of individual expressions of the targeted types to the identification of coherent patterns of such expressions: constructional schemas. As noted in chapter 5, the data provide evidence for different kinds of semantic generalisations over conventional PERCEPTION intensity expressions in constructions A, B and C. On the one hand, these are item-based generalisations, such that a given intensifier often occurs with a number of semantically related predicates:

(1) a. …he had had a burning admiration for grand prix racing… (BNC K2D)
    b. …Pip's intensely painful and burning love for Estella… (BNC KA1)
    c. …erupted into a torrent of burning hatred at the discovery that… (BNC JXS)

On the other hand, there is also evidence for higher-order generalisations, such that certain semantic types of predicates also take certain semantic types of intensifiers:

(2) a. She can twist me round her little finger, is said with glowing pride. (BNC G0T)
    b. However, his blazing indignation appears to have given him a… (BNC CAP)
    c. Polly's flaring anger at his sarcasm was quickly extinguished by… (BNC KA1)

In addition, the more entrenched uses of these intensifiers seem to have spawned variants through very similar kinds of extension processes in each case:
(3) a. burning love – burning passion – burning appraisal – …
    b. glowing appraisal – glowing congratulation – glowing success – …
    c. blazing hatred – blazing passion – blazing row – …

On the other hand, the generalisations that such patterns seem to embody are also restricted in many ways, i.e. compromised by both item-specific specialisations (cf. 4) and gaps within the semantic spectrum that is covered by a particular intensifier (cf. 5). As before, combinations marked ‘?’ are not found in the BNC, and combinations marked ‘??’ were not found on the web at the time of the investigation, either:

(4) a. burning confusion – ?glowing confusion – ??blazing confusion
    b. glowing affirmation – ?blazing affirmation – ??burning affirmation
    c. blazing row – ?burning row – ??glowing row

(5) a. burning conviction – ?burning belief – ??burning creed
    b. glowing success – ?glowing achievement – ??glowing win
    c. blazing row – ?blazing quarrel – ??blazing wrangle

In short, the data clearly evidence creative variation of the formulas identified in chapter 5, but this variation is also subject to constraints. Furthermore, the contrasts in (4) and (5) indicate that the explanatory force of sweeping semantic generalisations of the type EMOTION IS HEAT (which could be invoked to explain the observations in [1] and [2]) is limited when it comes to the many grey areas and gaps of such metaphorical patterns. The purpose of this chapter is to move from impressionistic acknowledgments of semantic relatedness that are based on anecdotal similarities between usage patterns to a more comprehensive, bottom-up identification of the internal semantic structure of the investigated collocation clusters. The results of the semantic classifications that are obtained for this purpose are then used to evaluate a general theory of semantic extension and the emergence of incipient productivity in entrenched collocations. The guiding questions for the empirical analyses are:
- What kinds of low-scope generalisations over individual formulas are evidenced by the data, and to what extent are similar generalisations observable across constructions and languages?
- What are the roles of local analogy and global schema extraction for the emergence of incipient productivity in the investigated collocation clusters?
- What is the evidence for consistent higher-order schematisations over the identified extensions in the data?
These questions are addressed in three separate analyses that offer different perspectives on the data. In the first step, the data are approached from a semantic point of view: for each of the four intensification patterns, one intensifier lemma is selected (along with its direct translation equivalent in the respective other language) and its usage patterns are compared across constructions and languages. More precisely, therefore, this step involves 4 × 2 = 8 separate analyses in which the meaning of the intensifying base is held constant, and its combinatorial potential is explored in different linguistic contexts.

The second analysis investigates how the usage patterns of the most type frequent intensifiers in each construction and language are structured internally. What is held constant in these 3 × 2 = 6 case studies is (high) type frequency, and the focus is on the distribution of established uses of the targeted intensifiers across semantic space as compared to the distribution of creative variants which they have spawned. Here, individual bases are not compared across either constructions or languages – instead, the focus is on the internal semantic structure of (and quantitative proportions within) the targeted collocation clusters themselves, and what is compared is the relation between an intensifier’s established and extended uses of different semantic types.

Finally, having investigated how the usage patterns of individual intensifiers compare across constructions and languages and the way they grow in their respective niches through an interplay of analogy and schematisation, the third analysis turns to global similarities and dissimilarities in the combinatorial potential of the whole spectrum of intensifiers in the study. In these 3 × 2 = 6 analyses, it is the construction that is held constant, and what is compared is the degree of collocational overlap between the intensifiers that are attested in a given constructional environment.
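The notion of collocational overlap in the third analysis can be made concrete with a simple set-based measure. The Python sketch below uses the Jaccard coefficient purely as an illustrative assumption (the measure actually employed is introduced in section 6.3.3 and may differ) and takes as toy data only the three collocate series listed in (3), which are far from complete collocate inventories.

```python
from itertools import combinations

# Toy data: the collocates listed in example (3) only, not full inventories.
collocates = {
    "burning": {"love", "passion", "appraisal"},
    "glowing": {"appraisal", "congratulation", "success"},
    "blazing": {"hatred", "passion", "row"},
}


def jaccard(x, y):
    """Shared collocates divided by all collocates of either intensifier."""
    union = x | y
    return len(x & y) / len(union) if union else 0.0


# Pairwise overlap for every pair of intensifiers in one environment.
overlap = {
    (first, second): jaccard(collocates[first], collocates[second])
    for first, second in combinations(sorted(collocates), 2)
}

for pair, score in overlap.items():
    print(pair, round(score, 2))
```

On this toy data, burning shares one collocate each with glowing and blazing, while glowing and blazing share none; with real collocate lists, such pairwise scores could serve as input to a clustering of intensifiers.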
The question to be pursued here is: to what extent do the observed patterns reflect more or less principled semantic regularities (such that semantically similar predicates also take semantically similar intensifiers) rather than purely idiosyncratic constraints on co-occurrence?

The chapter is structured as follows: section 6.2 discusses general problems of semantic classification (6.2.1) and reviews selected approaches to the task that have been employed in cognitive-functional/constructionist research (6.2.2). Section 6.2.3 provides a similar discussion for the concept of productivity. Section 6.3 then lays out how the problems and concepts investigated in section 6.2 were operationalised. Section 6.3.1 introduces the procedure for identifying item-based generalisations in the data. Section 6.3.2 discusses the procedure for assessing productivity predictions and for investigating the relationship between analogy and schematisation. Finally, section 6.3.3 introduces the method for identifying higher-order semantic schematisations in the data. Section 6.4 presents the results. Item-based generalisations are reported in section 6.4.1. The interplay of analogy and schematisation in the emergence of these generalisations is discussed in section 6.4.2, and section 6.4.3 reports the results on higher-order generalisations across the targeted expressions. As in previous chapters, the final section provides a short summary and discussion.

6.2. Prerequisites

6.2.1. Problems of semantic classification
Providing fine-grained semantic classifications of corpus data that seek to do justice to all potentially relevant aspects of speakers’/writers’ underlying construal operations is a challenging task, and some authors are fundamentally sceptical about the viability and scientific status of such analyses per se (for instance, in an essay entitled “Meaning and the limits of science”, Sampson 2001: 181 maintains that an “analysis of word meaning cannot be part of empirical science”). The present section sketches some of the main difficulties that are involved in the task. These issues can be summarised as problems of
- observability
- representativeness
- fluidity
- granularity
- discreteness
From an empiricist perspective, the most fundamental problem of semantic analyses is the fact that meanings are not directly observable. Whereas formal linguistic properties (such as the presence or absence of a particular morphological marking) and distributional characteristics (i.e. an item’s occurrence or non-occurrence in a particular context) are straightforwardly assessed and counted, semantic properties are not: even if the social dimension is ignored for the moment and meanings are located in the heads of individual speakers, there is no direct access to the semantic representations that they entertain. This is just as true for speakers and hearers as it is for professional semanticists who have no privileged access to the nature of meaning and human sensemaking. Most of all, however, observability (or rather non-observability) poses a major problem for automatic classification systems that have no direct access to the conceptual content of language – an issue known as the ‘symbol grounding problem’ (Harnad 1990). In children, initial semantic acquisition is aided through explicit tutoring and situated ostension within joint-attentional frames (Tomasello 1995) which serve to delimit learners’ hypothesis space considerably. From some point on, however, the most important evidence for semantic generalisations about a particular word or other construction is provided by the distributional patterning of language itself (Vigliocco et al. 2009). Since these distributions are of course observable, the idea of exploiting such indirect evidence for the induction of (non-grounded) equivalence classes also lies at the heart of many contemporary computational approaches to semantic classification.

The second problem, representativeness, is a corollary of non-observability combined with the fact that semantic representations must only be aligned to the extent that they permit successful communication. In other words, the full spectrum of semantic implications that is associated with a particular word is never identical for all speakers, and the set that is shared is not fully determinate: based on their accumulated experience with how a given word or other construction is used, all members of a speech community have formed certain assumptions about what it denotes and where it can be used felicitously.
Core aspects of these attributions (i.e. those that are most frequently attended to) are automatically updated and aligned in discourse, thereby establishing the necessary common ground for verbal communication to be possible in the first place. However, it is not possible to give an exhaustive and fully explicit account of exactly which attributions are shared by all speakers of a speech community (which is itself a fuzzy notion), and precisely where these ‘core’ features shade off into more contingent and idiosyncratic associations (heroic attempts at the task like e.g. Wierzbicka 1985 notwithstanding). Even between professional lexicographers with massive corpora at their disposal, agreement on word senses and their boundaries goes only so far, as a quick comparison of word definitions (and their level of detail) across dictionaries easily demonstrates. It follows that the more fine-grained a semantic analysis becomes, the less confidence there is in the reliability and empirical validity of the proposed distinctions (i.e. it becomes increasingly uncertain whether the analysis in fact describes something other than just the personal intuitions of the analyst). Representativeness thus poses a major problem for introspective semantic classifications performed by human analysts.

Third, ‘fluidity’ captures the fact that meanings are open-ended and protean: word meanings are not only not fully shared, but also inherently fluid and malleable. Words are not “convenient capsules of thought” (Sapir 1921: 12) that invariably convey an identical set of fixed implications. Instead, word meaning is better conceived of as a variable potential that receives flexible spell-outs and inferential enrichments in different structural and pragmatic contexts. Interpretive fluidity is a much discussed issue that goes under a variety of different names in different theoretical frameworks, discussed for instance in connection with the notions of ‘regular polysemy’ (Apresjan 1974), ‘active zone’ (Langacker 2002 [1991], ch.7) and ‘co-composition’ or ‘coercion’ (Pustejovsky 1995) to name but a few. For present purposes, the problem caused by fluidity is that words can be used in very flexible ways, thus making it more difficult to recognise commonalities between items that are used in such shifted readings. This is of course all the more true if the items in a classification are considered out of context.

The fourth problem relates to questions of granularity. The goal of semantic classification is to establish similarity relationships between the categorised items, a task which implies sorting them with respect to a certain set of semantic dimensions.
A problem here is that it is not a priori clear which categories or dimensions this should be: for a given word in its contextually appropriate reading, many different categorising relationships of the type ‘X is a type of Y’ can be set up. The point is not that the item can be placed in entirely different taxonomies altogether (depending on which aspect of its meaning potential is foregrounded: e.g. the word dog can denote a certain type of animal/companion/nuisance/threat/commodity etc. depending on context) – this is an aspect of what was called semantic ‘fluidity’ above. Rather, even within the same taxonomy, a given target can still be classified on several different levels of abstraction (e.g. PET/ANIMAL/MAMMAL/ORGANISM/CONCRETE OBJECT for dog). Furthermore, not all taxonomies that can be invoked for a given classification have the same number of levels, thus making it problematic to compare the similarity of related elements across different taxonomies in the overall classification.

Finally, ‘discreteness’ relates to the problem of categorising a target as an instance of a certain well-defined category when the underlying categories are in fact not well-defined at all and blur into their neighbours at the edges. The problem was already encountered in chapter 4 in connection with the problem of unambiguously assigning verbs to exactly one lexicalisation pattern when in fact they could be seen as members of more than one category. In chapter 4, the problem was not severe: it only affected a small handful of items that were ambiguous between only two classes, and the distribution-based disambiguation procedure that was proposed offered a feasible data-driven solution. Compared to this, the classification to be carried out in the present chapter will be more demanding. From the cognitive semantic point of view described in chapter 2, the targeted clusters of intensity expressions can be expected to be related by complex networks of family resemblances: items will show many different kinds of similarities with various other items that occur in their construction all at the same time (without these other items necessarily being related themselves), and it will be difficult to impose hard and fast category boundaries between particular subtypes of usages.

6.2.2. Approaches to semantic classification
The present section compares different approaches to semantic classification that could be used for present purposes. Each approach is briefly illustrated with an example from previous cognitive-functional research that demonstrates advantages and problems of the given strategy. The three basic options to be compared are:
– Manual classification based on introspective judgments by human analysts
– Statistical classification based on distributional similarities in corpus data
– Statistical classification based on experimentally obtained similarity judgments
An example of the first approach is Zeschel (2010). The study presents a corpus-based investigation of semantic extension processes using the example of Adj + N collocations involving the heavily polysemous German adjective tief ‘deep’. The target expressions were classified on three successively more fine-grained levels of analysis: in the first step, figurative tief + N expressions were assigned to one out of seven sense classes for the adjective that were taken from a large monolingual dictionary of German. In the second step, instances of the most frequent of these sense classes were coded for one out of nine distinct conceptual mappings that were judged to underlie the targeted expressions. Finally, the expressions instantiating the most frequent of these mappings were classified according to the semantic field of the nominal head. An advantage of the approach is that it allows very fine-grained semantic classifications of the data, up to the discrimination of different metaphorisation strategies that the target expressions were found to embody. A potential problem of the approach is that it relies on semantic preconceptions of the analyst. Even though intercoder reliability measures were calculated and a good level of agreement was attained, the problem remains that the category scheme employed for the classification was imposed onto the data (by the dictionary definitions in classification step 1 and by the first coder in steps 2 and 3) rather than induced bottom-up from the data themselves.

A study that sought to remedy this problem by employing distribution-based inductions of semantic equivalence classes instead is Gries and Stefanowitsch (2010). Faced with the same problem as the present investigation – the semantic classification of collexeme pairs obtained from a covarying collexeme analysis – the authors propose an approach involving hierarchical cluster analysis.
They present three case studies (of the English ditransitive, the English into-causative and the English way-construction) that compare the results of clustering verbs on the basis of all linear collocates within a certain span with the results obtained by clustering them on the basis of their covarying collexemes alone. The authors find that clustering collexemes on the basis of their covarying collexemes yields tidier results with fewer groupings that are difficult to interpret. The results of the study are interesting in that they corroborate the findings of an earlier introspective systematisation of the collexeme pairs found in this construction (Gries and Stefanowitsch 2004b). This suggests that cluster analyses of covarying collexemes can be useful for evaluating introspective classifications based on the top attracted collexeme pairs of a construction in a way that is less influenced by interpretive biases of the analyst (‘less’ because the introspective element is not entirely absent here either: on the one hand, the quantitative results of course still require qualitative interpretations, and on the other hand, only manually pre-classified instances of the target construction were submitted to the cluster analysis to begin with). For present purposes, a limitation of the approach is that it can only be applied to that level of semantic analysis for which the initial annotation of the construction provides distributional information (i.e. the distribution of different collexemes of slot A across usages involving different collexemes of slot B). When it comes to the semantic classification of the co-occurrences of just one item (i.e. item-based generalisations of the type exemplified in [1] above), the method cannot be applied because the targeted element is held constant.

An example of the third approach, statistical classification based on the results of a similarity judgment experiment, is provided by Bybee and Eddington (2006). The authors were interested in the collocations of adjectives with different ‘verbs of becoming’ in Spanish. Their prediction was that in the targeted change of state expressions, each verb combines with one or several classes of semantically related adjectives that cluster around a highly token frequent central exemplar of the relevant use. The first step of their analysis involved an introspective systematisation of the adjectival collocates of the investigated verbs (obtained from a corpus study) through a sorting experiment: for the classification task, a native speaker of Spanish was presented with each adjective occurring in the targeted V + Adj construction written on a Post-it note and asked to position these notes on a blank sheet of paper. The results of the classification provided support for the assumption that the adjectives that co-occur with a specific change of state verb are semantically related.
However, in view of the representativeness issue discussed in section 6.2.1 above, the authors concede that “the fact that only one native speaker was consulted raises the question of whether other speakers would agree with the similarity groupings proposed” (Bybee and Eddington 2006: 347). In order to address this question, step two of the classification involved a similarity judgment experiment that was analysed using multidimensional scaling. For this experiment, a subset of 20 adjectives (out of the total 161 that were found to co-occur with the targeted verbs in the corpus study) was presented in pairwise comparisons to 77 Spanish native speaker subjects whose task was to rate the semantic similarity of each pair on a scale from 1–‘not similar at all’ to 5–‘highly similar’. The experiment was restricted to 20 adjectives because each item in the classification had to be compared to each other item in the classification in a separate judgment, thus giving the unfeasible number of 12,880 stimulus items for an experiment based on all 161 adjectives obtained from the corpus study (161 × 160 / 2). Restricting the study to 20 adjectives reduced the number of stimuli to 190 items, which were divided into two separate questionnaires and presented to the participants with no further context provided (in particular, no mention of the context of change of state collocations was made). Responses were then inverted (turning e.g. 5–‘highly similar’ into 1–‘closely related’) such that they could serve as a measure of semantic distance between items, and the data were subjected to a multidimensional scaling analysis. For the most part, the results of the analysis were closely in line with the results obtained from the preceding sorting task.

The approach employed by Bybee and Eddington has the advantage of avoiding several of the analytical pitfalls discussed in section 6.2.1. Concerns about representativeness are addressed by basing the classification on the judgments of a relevantly large sample of experimental subjects (an impressive 77 participants in the case of their study). Fluidity is not recognised as a potential problem. Instead, the authors assume that its effects can be eliminated from the classification:

The pairs were presented in writing with no context provided; that is, participants were asked to rate each pair in terms of how similar the two words were in meaning without any knowledge that we were interested in what verbs of becoming these adjectives were used with. (Bybee and Eddington 2006: 347)
Nevertheless, their results provide evidence that fluidity in fact is relevant since decontextualised comparisons do not in themselves determine how the items in the comparison are to be construed and which dimension is to be invoked in order to assess their similarity:

The major way that the experimental results differ from our own analysis is evident in the groupings on the right side of the figure – those used with hacerse. From the point of view of a linguist, bueno ‘good’, fuerte ‘strong’, rico ‘rich’, and famoso ‘famous’ are not all semantically similar. But from the point of view of speakers asked to rate how similar the adjectives are, these adjectives may have seemed similar because they all express strong positive values, and some such as rico and famoso are often paired in discourse. It may be that their use with hacerse is in fact related to these contextual pairings, a factor we did not consider in our initial analysis but something that should be considered in future research. (Bybee and Eddington 2006: 348)
This suggestion is taken up in the procedure explained in section 3, which explicitly also aims to capture more schematic commonalities like shared evaluative implications (as in the example of bueno and fuerte) as well as frame-based links (which underlie the discourse co-occurrence of items like rico and famoso).

Finally, granularity and discreteness may not seem to be issues for the approach proposed by Bybee and Eddington at first glance (since subjects rate pairwise similarities instead of giving categorising judgments). However, these pairwise comparisons of course also involve categorising judgments (since the two items must be related against the backdrop of a particular semantic dimension), so that granularity problems can be assumed to manifest implicitly in this approach, too: since subjects’ task was to rate 95 adjective pairs for similarity on a numerical scale from 1 to 5, it is likely that judgments for pairs later in the task were influenced by assessments of their comparability (i.e. relative similarity in semantic distance) to judgments for earlier pairs, and these in turn may have involved assessments relative to semantic dimensions with a different taxonomy depth. At any rate, granularity problems are substantially reduced, and discreteness problems avoided altogether, as compared to an approach that elicits category membership judgments of the type ‘X is a kind of Y’ directly.

On the downside, the approach also has a number of problems. First of all, semantic similarities differ not only in intuitive quantity (‘distance’), but also in quality. For instance, it is difficult to say whether blue is more similar in meaning to coloured than it is to green, or whether dead is more similar to lethal than it is to alive. In addition, the approach also fails to accommodate ambiguity: for instance, certain uses of the word green are similar in meaning to certain uses of the word blue, but there are also variants of green that mean ‘inexperienced’, ‘non-processed’ or even ‘ecologically aware’, and none of these is similar in meaning to blue.
Since the perceived semantic distance between two items depends on the specific readings that are selected for them, failure to control these choices is an obvious problem of the approach. Finally, in contrast to the clustering method proposed by Gries and Stefanowitsch (2010), a limitation of the multidimensional scaling approach taken by Bybee and Eddington is that it can only be applied to a rather small set of items because the number of required judgments otherwise gets out of hand very quickly: to illustrate, 100 items require 4950 judgments, 200 types involve 19,900 classifications and 300 elements would already call for a full 44,850 separate comparisons. Hence, unless the number of items in the classification is very small, it is not possible to employ the approach in place of an introspective (or other) systematisation. Instead, as in the study by Gries and Stefanowitsch (2010), Bybee and Eddington interpret their experimental results by relating them to the results of a previous introspective classification. This underscores the great importance that is accorded to intuitive similarity judgments also in these two approaches, which use quantitative techniques to corroborate previous qualitative analyses.

6.2.3. Approaches to productivity
In chapter 2, the schematisation of established collocations was characterised as a “rise to productivity” (p. 6). Since many authors have used the term ‘productivity’ to mean different things, the following section clarifies how the term is used in the present study. To begin with, most approaches to productivity see it as a property of morphological rather than morphosyntactic units more generally: for instance, the ‘Productivity’ entry in the Encyclopedia of Language and Linguistics (Brown 2006) makes no reference to syntactic productivity at all and instead defines the term as a rule’s “general potential to be used to create new words” (Plag 2006: 127). Second, productivity is often contrasted with the notion of ‘creativity’ and/or ‘analogy’ in order to establish a distinction between formations that are regular and systematic (‘productive’) and others that are unsystematic and exceptional (‘creative’/‘analogical’).

In usage-based accounts, both these restrictions of the term are rejected. Since construction grammar sees no principled difference between lexical and phrasal constructions (in the sense that the representations of morphological and syntactic constructions would be viewed as entities of a fundamentally different type), the concept is also applied to combinatorial restrictions in syntax: for instance, Goldberg (1995: 120) refers to the productivity of argument structure constructions as their ability to be “extended to new and hypothetical verb forms”. And second, productivity is seen as a gradient phenomenon that relates to a quantitative rather than to a qualitative (all or nothing) distinction, thus eliminating the principled difference between ‘productivity’ and ‘creativity’ (or ‘rules’ and ‘analogies’): in the words of Langacker (2008: 244), the term refers to “a schema’s degree of accessibility for the sanction of new expressions” (my emphasis).
However, within usage-based approaches, too, there are different views of exactly what the term implies and how the property in question should best be measured (cf. Barðdal 2008, chapter 2 for a detailed discussion). Following the same procedure as in section 6.2.2 above, I will review three common approaches to the phenomenon that have been proposed in previous research before laying out my own perspective on the issue in section 6.3.

On the most influential of these accounts, productivity is seen as a function of type frequency (Bybee 1985, 1995; MacWhinney 1978). More specifically, (high) type frequency is seen as both a cause and an effect of productivity: as pointed out in chapter 2, speakers/learners are assumed to extract salient commonalities between expressions in the input in the form of constructional schemas, and part of the learning task consists in discovering the correct level of abstraction for these generalisations. Now, the assumption behind the type frequency approach is that the more evidence a speaker has that a particular generalisation is correct – meaning that its application produces structures and interpretations that yield communicative success – the more confident he or she will become that the schema can also be applied in further contexts of the relevant type. As a consequence, the number of productions that fall within the scope of the extracted generalisation will increase. If this is correct, (high) type frequency is both a cause and an effect of productivity.

Refining this account, Bybee (1995) argues that in fact not all instances of a pattern contribute to the consolidation of a superordinate schema to the same extent. Rather, instances of a schema that have a very high token frequency do not strengthen this schema because they are accessed as units instead. Bybee therefore associates high type frequency with productive schema application (which promotes regularisation) and high token frequency with unanalysed item storage (which prevents regularisation; cf. also Bybee and Thompson 1997), and only low-token frequency types are assumed to contribute to productivity.

However, low-scope generalisations (which by definition have a lower type frequency than more inclusive abstractions) may also be highly productive within their limited domain.
This observation lies at the heart of Barðdal’s (2008) suggestion to view productivity as the inverse correlation between a construction’s type frequency and its degree of (semantic or phonological) coherence (illustrated graphically in figure 6.1). On this approach, productivity is envisioned as a cline: at the bottom end of the spectrum are analogical extensions based on exactly one fully specific exemplar (minimum type frequency, zero schematicity). Beyond that are progressively more type frequent patterns that are productive to the extent that they are coherent (intermediate type frequency, restricted schematicity). Finally, high productivity is found in large and abstract patterns (high type frequency, high schematicity). As the author puts it,
The higher the type frequency of a construction, the lower the degree of semantic coherence [that] is needed for a construction to be productive. Conversely, the lower the type frequency of a construction, the higher the degree of semantic coherence [that] is needed for a construction to be extendable. (Barðdal 2008: 34)
[Figure: diagram with the axes ‘Type Frequency’ and ‘Coherence’ (each ranging from Low to High); labelled regions: ‘Regularity – Generality – Open schema’, ‘Different degrees of productivity’, ‘Analogy’]

Figure 6.1 Productivity, type frequency, coherence (following Barðdal 2008: 38)
Furthermore, Barðdal’s approach also envisages a different role for (high) token frequency than e.g. Bybee (1985) and Bybee and Thompson (1997): only instances of “higher level” (i.e. more schematic) constructions are assumed to be affected by the conserving effect of high token frequency (Barðdal 2008: 52). Conversely, on the view that highly productive constructions gradually shade off into ever less productive patterns and ultimately fully item-specific analogical extensions, elements at the lower end of the spectrum will actually benefit from high token frequency because their entrenchment makes them more likely objects of analogical extension than rival types with low token frequency (cf. Barðdal 2008: 54). Bybee and Eddington (2006) make the same observation in their study of Spanish ‘verbs of becoming’ (cf. section 6.2.2), and they also note that it conflicts with the idea that high token frequency is in fact detrimental to productivity. The authors suggest that the reason for this apparent contradiction may be that what counts as ‘high token frequency’ will vary from one construction to the next. Indeed, insofar as the term is used to express relative frequency (i.e. the token frequency of an instance relative to other instances of the same construction), very different absolute figures may count as ‘high token frequency types’ of a pattern when constructions on different levels of schematicity are being compared:

One effect of extremely high token frequency is to render an exemplar of a construction autonomous from other exemplars or other related items (Bybee 1985, Bailey & Hahn 2001, Hay 2001). It appears that the degree of token frequency found in the data to be analyzed here is not sufficient to cause autonomy, as we find evidence of relatedness in all the cases of high-frequency exemplars. (Bybee and Eddington 2006: 326)
Barðdal (2008: 95) takes this to mark an important difference between her own account and the approach of Bybee (1985, 1995):

[i]t is not the high type frequency of a construction which contributes to its productivity, as argued for instance by Bybee (1985, 1995, 2001), but rather all the individual lower-level item-specific constructions in themselves. And since more lower-level constructions make up high type frequency patterns than low type frequency patterns, high type frequency patterns will of course attract more new items than patterns which only exist at lower levels. If it is the case that it is the low-level verb-specific constructions which are responsible for a construction’s extensibility, token frequency will not be less important for productivity than type frequency. Type frequency will be an indicator of the highest level of schematicity each construction exists at, and hence an indicator of the semantic scope of the construction and its productivity domain, but token frequency will be an important psycholinguistic factor singling out model items for speakers when they extend low-level constructions. Token frequency will thus be quite important for productivity, contra what has hitherto been assumed in the traditional cognitive-functional literature. (Barðdal 2008: 95–96)
A second important difference is that productivity is relativised to level of constructional abstraction, or what is called the ‘productivity domain’ (Barðdal 2008: 40) of a construction. Hence, on this account, not only high type frequency patterns count as maximally productive, but also mid- and low-scope schemas that are exhaustively exploited. In view of the many difficulties of semantic analysis discussed in section 6.2, however, an obvious problem with this approach is that it is not clear how the notion of ‘productivity domain’ is to be defined and operationalised. Even though it is often easy to see that a construction appears to occur with words of certain similar meaning(s), pinning these meanings down at exactly ‘the right’ level of specificity and deciding which other lexical units could reasonably be expected to occur as possible exponents of these meanings is certainly much more challenging (a problem that was already encountered in chapter 4).

The third approach to be mentioned here is not based on constructional type frequencies (be they absolute counts or relativised to some potentiality) but rather relies on token frequency instead. Specifically, it develops the assumption that if high token frequency instances are stored, a promising indication of a pattern’s readiness to be extended productively is the number of its instances that are too rare to be routinised and therefore not plausibly stored. To the extent that such unfamiliar forms are nevertheless found in naturally occurring language use and easily interpreted by speakers, the argument goes, they provide evidence for an underlying schema that can be applied productively. In the limiting case, these rare items occur only once in a large representative corpus of the object language, in which case they are referred to as hapax legomena (or simply hapaxes for short). Hapax-based measures of productivity have been developed by Harald Baayen and colleagues in a number of publications, each of which highlights a different aspect of productivity that is based on the frequency of these singleton types relative to some other property of the pattern in question.
For instance, Baayen and Lieber (1991) distinguish between ‘productivity in the narrow sense’, which relates the count of constructional hapaxes to the number of overall constructional tokens, and ‘global productivity’, which additionally relates the resulting measure to type frequency (cf. also Baayen 1992). On top of that, Baayen (1993) suggests a third measure that compares constructional hapax counts to the number of hapaxes in the corpus at large (‘hapax-conditioned productivity’).
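As a rough illustration of the hapax-based measures just described, the following Python sketch computes them from a table of token frequencies per type. The function name, the toy frequencies and the corpus-wide hapax count are hypothetical. ‘Productivity in the narrow sense’ is taken here to be the number of hapaxes divided by the number of tokens, ‘global productivity’ is represented simply as that value paired with the type count, and ‘hapax-conditioned productivity’ as the share of all corpus hapaxes that belong to the pattern; the exact formulations in the cited publications may differ in detail.

```python
from collections import Counter


def hapax_measures(token_counts, corpus_hapaxes):
    """Return hapax-based productivity figures for one pattern.

    token_counts:   mapping from type (e.g. a collocate) to its token frequency
    corpus_hapaxes: number of hapax legomena in the corpus at large
    """
    n_tokens = sum(token_counts.values())       # all tokens of the pattern
    n_types = len(token_counts)                 # distinct types of the pattern
    n_hapaxes = sum(1 for f in token_counts.values() if f == 1)

    narrow = n_hapaxes / n_tokens               # hapaxes relative to tokens
    global_ = (narrow, n_types)                 # narrow value paired with type count
    hapax_conditioned = n_hapaxes / corpus_hapaxes  # share of all corpus hapaxes
    return {
        "narrow": narrow,
        "global": global_,
        "hapax_conditioned": hapax_conditioned,
    }


# Hypothetical toy data: token frequencies of five types in one pattern.
toy = Counter({"issue": 6, "question": 3, "zeal": 1, "thirst": 1, "urge": 1})
result = hapax_measures(toy, corpus_hapaxes=1000)
```

With these toy figures, three of the five types are hapaxes among twelve tokens, so the narrow measure comes out at 0.25.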
6.3. Procedure
The current section describes the concrete procedure adopted for the semantic classification tasks and for assessing productivity predictions.
6.3.1. Identifying item-based generalisations
Item-based generalisations were identified through introspective classifications of selected micro-constructions (i.e. intensifier-construction pairs such as burn + N). Classification targets were chosen such that the study contained one item from each of the four lexicalisation patterns that had a direct/uncontroversial translation equivalent in the respective other language and was relevantly type frequent across the three constructions in both languages (i.e. appeared among the top items when type frequency counts for all three constructions were summed). Tables A.3 to A.6 in the appendix report the top ten items in the summed type frequency counts for all four patterns in English and German. From these tables, the eight target items crackle/knistern (SOUND), dazzle/blenden (LIGHT), stink/stinken (SMELL) and glow/glühen (HEAT) were selected. Target items were classified in all three constructional environments. Following Bybee and Eddington, target items were not classified by categorising them as instances of some specific supertype, but compared individually in pairwise semantic similarity judgments. To keep the study to manageable proportions, the classification was restricted to the significantly attracted collocates of the targeted intensifiers. In other words, not all predicates that occurred in a particular micro-construction were related to all other predicates occurring in this construction one by one, but just the significantly attracted items. Where there were more than ten significantly attracted collexemes, the classification was restricted to the ten most strongly attracted items. Recalling the discussion in chapter 5, it was furthermore ensured that none of the items among these top ten was a hapax type. Where there was no significantly attracted collexeme at all, the most frequently co-occurring predicate was chosen instead. 
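The selection rule for the collocates that enter the classification can be sketched as follows. This is only an illustration under stated assumptions: the names `Collocate` and `select_core`, the significance threshold of 0.05, the use of p-values as the measure of attraction strength and the toy data are all hypothetical, and hapaxes are simply skipped, which is one possible way of ensuring that no hapax type ends up among the selected items.

```python
from typing import NamedTuple


class Collocate(NamedTuple):
    word: str
    freq: int        # co-occurrence frequency with the intensifier
    p_value: float   # significance of attraction (smaller = stronger)


def select_core(collocates, alpha=0.05, max_items=10):
    """Pick the collocates that enter the pairwise classification."""
    # Keep significantly attracted items; hapaxes are excluded (assumption).
    significant = [c for c in collocates if c.p_value < alpha and c.freq > 1]
    if significant:
        # Most strongly attracted first, capped at max_items.
        return sorted(significant, key=lambda c: c.p_value)[:max_items]
    # Fallback: no significant collexeme, so take the most frequent collocate.
    return [max(collocates, key=lambda c: c.freq)]


# Hypothetical toy data (the p-values are invented for illustration).
data = [
    Collocate("issue", 66, 1e-9),
    Collocate("question", 31, 1e-6),
    Collocate("pain", 21, 0.03),
    Collocate("sun", 14, 0.2),
    Collocate("zeal", 1, 0.9),
]
core = [c.word for c in select_core(data)]
fallback = [c.word for c in select_core(data[3:])]
```

In the toy data, three items pass the threshold and form the core; when only the last two items are supplied, none is significant and the most frequent one is returned instead.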
This way, the semantic relations of the assumed core of the exemplar cloud could be systematised exhaustively in order to assess for each use whether it was potentially based on/motivated by the item in question, whilst at the same time avoiding a strong proliferation of similarity judgments. To illustrate, the most type frequent micro-construction in the classification study, German glühen + N, occurred with a total of 467 different collocates in the data. For an exhaustive classification of all possible relations between all predicates, the unfeasible number of 108,811 separate similarity judgments would have been required for just this one item in just this one construction in just one language. By following the above procedure instead, the number of classification judgments required for glühen + N could be reduced to 4615, which is arguably still substantial.

To increase (or at least assess) representativeness, 10% samples of the data were annotated by two independent coders, one a native speaker of British English, the other a native speaker of German. For each language, the complete sets of all pairings were tabulated in a spreadsheet that contrasted combinations one by one and provided a naturally occurring example for each member of the pair from the corpus data. Rather than attempting to eliminate context from the classification, second coders were explicitly encouraged to inspect the compared items in the sample context provided. They were furthermore informed that the objective of the classification was to identify reasons why all the different items in the table occurred with the targeted intensifier, and they were encouraged to explore potential underlying semantic motivations for the patterns that occurred to them. For this, they were provided with full collocate tables of all target items (since they only coded a random sample consisting of 10% of all judgments) that they were instructed to inspect before the classification of the relevant item. Furthermore, they were supplied with a sample annotation of the micro-constructions burn + N and brennen + N for training purposes. All in all, coders were provided with the following materials:
– a concise set of coding instructions (included in the appendix);
– a spreadsheet containing a sample classification of the micro-constructions burn + N in English (1830 judgments) and brennen + N in German (820 judgments);
– the text of the present section of this chapter (6.3.1) for a discussion of the assignments in the sample classification;
– the complete collocate frequency tables for all target items;
– a spreadsheet containing the individual comparisons for each target item tabulated in a separate worksheet.
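The judgment counts cited in this section follow from simple combinatorics, which the short Python sketch below makes explicit. The text does not spell out the counting rule behind the reduced figure for glühen + N; `core_pairs` is one reading that reproduces it, assuming that ten core items were each compared with all other collocates. The function names are hypothetical, and the figure of 41 collocates for brennen + N is an inference from the reported 820 judgments, not a number given in the text.

```python
def all_pairs(n):
    """Number of unordered pairs among n items: n(n-1)/2."""
    return n * (n - 1) // 2


def core_pairs(n, k):
    """Pairs that involve at least one of k core items among n items.

    Each core item is compared with every non-core item (k * (n - k)),
    plus all pairs among the core items themselves.
    """
    return k * (n - k) + all_pairs(k)


exhaustive = all_pairs(467)        # all collocates of gluehen + N against each other
restricted = core_pairs(467, 10)   # ten core items against all collocates (assumed)
burn_sample = all_pairs(61)        # the 61 expressions listed for burn + N
brennen_sample = all_pairs(41)     # 41 collocates is an inferred figure
```

Under these assumptions the four values come out at 108,811, 4615, 1830 and 820, matching the figures reported above; the same `all_pairs` function also yields the 12,880 and 190 comparisons mentioned for 161 and 20 adjectives in section 6.2.2.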
Classification itself proceeded in two steps: first, coders went through the table and gave spontaneous assessments whether they found the pair in a given line semantically related or not. Second, they returned to those pairs that they had judged as similar in the first step and specified the exact nature of the connection that they had identified. This was done in order to distinguish cases where the two coders bluntly disagreed about the existence of a semantic connection as such from others where they merely dis-
Procedure
177
agreed about the specific nature of the link (which, in view of the manifold complexities of the task discussed in section 6.2.1, seemed advisable). Also, splitting the classification task into two separate assessments permitted the calculation of separate intercoder reliability scores for similarity intuitions as such as well as for coders’ understanding of the specific nature of a given link. In contrast to the Bybee and Eddington study, similarity judgments did not involve ratings on a numerical scale, but assessments of the qualitative nature of the link (i.e. semantic relation) that held between two items that were judged as similar in a relevant respect (i.e. relevant for the choice of the particular intensifier). The reading(s) to be considered for a particular item were not supplied. Instead, raters were simply instructed to indicate a semantic relation wherever they perceived one (for whichever reading of the target items that occurred to them). The following categories were provided for capturing such relations: – Synonymy (‘SYN’) – Antonymy (‘ANT’) – Hyponymy (‘HYP’) – Low-level co-hyponymy (‘CHY’) – Frame-based metonymic shift (‘FRA’) – Metaphor (‘MET’) – Higher-level co-hyponymy within a larger semantic field (‘FIE’) –No relation I will illustrate each relation with some examples from the sample classification for English burn + N. The complete co-occurrence table is reproduced on the top of the next page. Beginning at the top of the list, the tag ‘SYN’ was assigned to intensity expressions with (near-) identical meaning. The test was to see whether a given item A could be substituted by item B (and vice versa) without yielding a notable change of meaning in the present context: (6) a. burning issue vs. burning question b. burning desire vs. burning wish c. burning anger vs. burning rage Crucially, the task was to compare the concrete composite expressions in their specific context rather than to give decontextualised judgments about
Table 6.1 Collocate table: burn + N

Expression               f/FYE    Expression                  f/FYE
burning issue            66***    burning faith               1ns
burning question         31***    burning goal                1ns
burning desire           24***    burning honesty             1ns
burning ambition         22***    burning impatience          1ns
burning love             9***     burning imperative          1ns
burning pain             21*      burning protest             1ns
burning sun              14ns     burning revenge             1ns
burning heat             6ns      burning sense of injustice  1ns
burning light            6ns      burning soreness            1ns
burning passion          5ns      burning tenderness          1ns
burning need             5ns      burning thirst              1ns
burning tear             4ns      burning urge                1ns
burning interest         4ns      burning wish                1ns
burning problem          4ns      burning zeal                1ns
burning curiosity        3ns      burning admiration          1ns
burning conviction       3ns      burning awareness           1ns
burning blush            2ns      burning sexuality           1ns
burning hunger           2ns      burning topic               1ns
burning enthusiasm       2ns      burning urgency             1ns
burning resentment       2ns      burning appraisal           1ns
burning power            2ns      burning confusion           1ns
burning intensity        2ns      burning excitement          1ns
burning anger            2ns      burning hatred              1ns
burning headache         2ns      burning injustice           1ns
burning ague             1ns      burning temper              1ns
burning crusade          1ns      burning sunlight            1ns
burning debate           1ns      burning success             1ns
burning demand           1ns      burning silence             1ns
burning determination    1ns      burning rage                1ns
burning dryness          1ns      burning sunshine            1ns
burning election issue   1ns
the predicates in isolation. This was important because many of the predicates in the classification were polysemous, and the relevant reading was not always the dominant, i.e. intuitively basic one. For instance, the first entry that is listed for the word question in dictionaries is typically the meaning ‘interrogative utterance’, but it is the metonymic extension to communicative content (‘subject matter’) that is instantiated by the intensity expression a burning question. Hence, in this particular reading, relevant (near-) synonyms of question would be words like issue, topic and problem, not query or request. Sometimes, a given predicate also occurred
with more than one relevant (i.e. intensifiable) reading in the data (e.g. hot ‘high in temperature’, ‘spicy’, ‘attractive’ etc. in construction B). In this case, coders were instructed to consider the predicate in all relevant readings rather than just the one exemplified in the (arbitrarily selected) corpus example that was provided in the classification table. As a result, an item could have a range of different synonyms in the dataset that were nevertheless unrelated to one another if compared directly (e.g. spicy vs. attractive). Finally, even after the lemmatisation procedure described in chapter 4, there were still some singular/plural distinctions in the data that were marked through compounding rather than inflection (e.g. Verehrer ‘admirer’ vs. Verehrerschar ‘clutch of admirers’). Such contrasts were ignored in the classification (i.e. pairs like Verehrer and Verehrerschar were treated as synonyms). The second relation was antonymy (‘ANT’). Antonyms were defined as pairs of expressions that could be construed as polar opposites with regard to a salient dimension. The test for antonymy was: ‘A is the opposite of B’ (and vice versa). As in the case of ‘SYN’, therefore, antonymous pairs had to be symmetrical in the sense that expression A could be construed as an opposite of expression B just as readily as expression B could be construed as an opposite of expression A. This is not necessarily always the case. For instance, the word embarrassment denotes a certain (sub)type of unpleasant experience, whereas the word pleasure denotes a state of gratification quite generally. For want of an exact reverse of embarrassment, the more unspecific supertype pleasure may therefore suggest itself more readily as a potential antonym of embarrassment than vice versa (at least in my intuition). Where a relationship was judged to be less than perfectly symmetrical in this sense, either ‘CHY’ or ‘FIE’ was chosen instead of ‘ANT’ (see below). 
Examples of the ‘ANT’-relation from table 6.1 include:
(7) a. burning love vs. burning hatred
    b. burning resentment vs. burning admiration
    c. burning rage vs. burning tenderness
Where one of the items in a pair was a superordinate term of the other, the category ‘HYP’ was assigned. This also applied to compounds, since these had not been reduced to the head component during coding (cf. chapter 4). The test for hyponymy was: ‘A is a type of B’ (or vice versa). Examples from table 6.1 include:
(8) a. burning pain vs. burning headache
    b. burning sunlight vs. burning light
    c. burning hunger vs. burning need
The fourth category, co-hyponymy (‘CHY’), may appear more difficult to delimit at first glance. Recalling the granularity problem discussed in section 6.2.1, it is virtually always possible to find a superordinate category of some sort for two lexical concepts, even if it is just a next-to-vacuous notion such as THING. The problem, then, was to decide for a given pair whether a potential superordinate concept constituted a relevantly informative category for the problem at hand (i.e. systematising the co-occurrence restrictions of the given intensifier). But what counts as ‘relevantly informative’? And what about the fact that not all domains are lexicalised to the same extent, to the effect that some taxonomies offer many more levels of potential superordinate terms than others? These difficult questions notwithstanding, it is often surprisingly unproblematic to intuit a broad two-way distinction between ‘co-hyponymy proper’ (understood as the subsumption of two items, typically complementaries in some respect, under the same immediate supertype) on the one hand and any other form of more schematic commonality between two lexical concepts on the other. For instance, the pair salt and pepper invokes the immediate supercategory SPICE much more readily than the pair salt and wood invokes the more remote supercategory PHYSICAL SUBSTANCE. The tag ‘CHY’ was reserved for relations of the former type: instances of intuitively ‘immediate’ co-hyponymy in which expressions A and B were complementaries in some respect (though not antonyms) that conjured up a shared direct superordinate concept (not necessarily lexicalised) without extensive deliberation. (9) gives some examples from table 6.1:
(9) a. burning love vs. burning anger
    b. burning hunger vs. burning thirst
    c. burning headache vs. burning soreness
The fifth category, ‘FRA’, was for metonymic profiling shifts within the same semantic frame. Among other options, expression A can designate a participant, a cause, a result or a typical corollary of the state of affairs denoted/invoked by a ‘FRA’-linked item B (or vice versa). Some examples are provided in (10):
(10) a. ...the burning urge to whistle a happier tune.
     b. …it was from his burning wish for people to admire...
     c. A Burning Zeal for Righteousness: Women in...
     d. ...mind centred on the one burning goal of reaching the...
Force-dynamically, all of these invoke a schematic transitive frame in which the action, attention or aspirations of a human participant are strongly focused on a particular state of affairs (which may be encoded in a dependent construction – cf. the urge/wish to do sth., the zeal for sth., the goal of doing sth.). At the same time, the expressions in (10) profile different aspects of this underlying frame: (10a) is about the actor’s MOTIVATION, (10b) about the actor’s VOLITION, (10c) about the actor’s ATTITUDE in pursuing the set objective and (10d) about the GOAL itself. In all cases, the psychic energy expended is metaphorised as something ‘burning’, i.e. as a kind of FUEL. If this is a plausible account of why all these different predicates can take the intensifier burning, there is a semantic connection here that cannot be captured by any of the aforementioned relations: the word goal for instance is neither a synonym or antonym nor a superordinate or complementary term of the word wish. Rather, the relationship is metonymic in nature: the expressions in (10) pick out distinct, but conceptually contiguous aspects of the same underlying image schema (i.e. possessing a goal implies a wish to attain it, which in turn produces a zeal to act according to the set objective and so on). In addition to connections between independent frame elements, the label ‘FRA’ was also assigned to pairs that could be construed as metonymic shifts between correlated experiences ‘within’ one and the same participant:
(11) a. burning love vs. burning admiration
     b. burning hatred vs. burning anger
     c. burning enthusiasm vs. burning impatience
While neither love and admiration, hatred and anger nor enthusiasm and impatience are strictly substitutable in terms of denotation (to the effect that neither ‘SYN’ nor ‘HYP’ apply), their denotata are nevertheless often experienced concurrently.
As a result, the metonymic extension of an established collocation along these lines involves only little semantic strain: apart from the fact that the predicates in (11) are all of the same general semantic type (PSYCHOLOGICAL PHENOMENON), it is conceivable that a speaker may choose either variant of these pairs as a characterisation of one
and the same designated situation (depending on the facet that is to be foregrounded in the construal). Importantly, conceivable metonymic/frame-based relationships between predicates that did not involve profile shifts within the same intensification construal were not recognised as ‘FRA’-relations. To illustrate, recall from chapter 5 that different collocations of the same intensifier may draw on different metaphorisations of the intensifier’s source domain: for instance, if the above explication is correct, some uses of the intensifier burning involve the conceptualisation of the intensified predicate as a kind of FUEL that sustains a certain state within an experiencer (e.g. burning enthusiasm/interest/desire). Other collocations likewise involve the presence of an experiencer somewhere in the construal, but they relate to entities that are external to this participant and derive their intensifying force through a different set of implications: here, the intensified element is conceptualised as something that needs urgent attention, like a literally burning fire that might otherwise get out of hand (e.g. burning issue/question/problem). Now, if the predicates occurring in these two subtypes of intensity collocations were compared out of the context of their specific intensity construals, a coder might be led to posit a certain frame-based link between them (for instance, the predicate issue could be seen as an instantiation of the GOAL element in the frame invoked by the predicate desire) even though there is presumably no direct connection between the two expressions. As with the other relations (cf. the above discussion of synonymy), it was therefore crucial that ‘FRA’-assignments were made relative to the specific construals of the given combinations in context, and not just with regard to the meanings of the intensified predicates compared in isolation. 
Of course, the conceptual motivation of a given expression was often less than fully determinate and/or ambiguous. Hence, coders were advised to avoid ‘FRA’ classifications only in such cases where they did feel reasonably confident that the two expressions in question were not variants/extensions of one another but manifestations of two distinct construals. Next, there are of course not only figurative connections within a given domain (metonymies), but also connections across domains (metaphors). Examples from table 6.1 are given in (12):
(12) a. burning heat vs. burning love
     b. burning hunger vs. burning interest
     c. burning thirst vs. burning curiosity
Metaphorical connections (‘MET’) were posited in two cases: first, wherever there was a direct and systematic conceptual link between a source domain collocation of the targeted intensifier (e.g. burning heat) and a combination with a given more abstract concept (cf. 12a, an instance of the conceptual metaphor EMOTION IS HEAT). And second, wherever such a connection (direct and systematic) held between two non-source domain collocates of the given intensifier due to an independent metaphor (i.e. independent of the intensifier’s own source domain: cf. 12b and c as instances of the metaphor EXPERIENCING IS INGESTING, which has no intrinsic connections to the domain HEAT). Finally, all items that were found to be related through a relevant commonality that was not captured by any of the above relations were classified as ‘FIE’. Typically, such relations only suggested themselves relative to other (less similar) pairings, and often only in the course of (or even after) the first coding round. For instance, items in the sample classification like love and sense of injustice are intuitively not very similar. However, looking at the usage pattern of the intensifier burning as a whole, they are still more similar to one another than pairs like e.g. love/sunshine or love/debate: in contrast to these, both love and sense of injustice denote a certain kind of psychological phenomenon. And since the intensifier burning indeed seems to occur with quite a number of such terms (i.e. words for emotions, attitudes, intentions and so on – cf. table 6.1), this is certainly an interesting fact that should be captured by the classification, even though the pair is not easily subsumed under any of the more specific relations discussed above. Examples of ‘FIE’-relations from the sample classification (all subtypes of PSYCHOLOGICAL PHENOMENON) include:
(13) a. burning excitement vs. burning faith
     b. burning interest vs. burning honesty
     c. burning anger vs. burning wish
When they were finished, coders were instructed to verify their judgments in both classification steps (first by going through the list of supposedly unrelated pairs again, and second by checking the different categories of related pairs for internal coherence). Where more than one category could have been assigned (e.g. in the case of synonyms and antonyms, which are by definition also co-hyponyms), coders were instructed to assign categories in the following order:
SYN, ANT, HYP > FRA > CHY > MET > FIE
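The precedence order and the two agreement scores motivated at the start of this section can be illustrated with a short sketch. It assumes Cohen's kappa as the agreement statistic (the measure named at the end of this section); the function names, the label "NONE" for unrelated pairs and the five sample pairs are hypothetical and serve illustration only.

```python
from collections import Counter

# Precedence taken from the order above; SYN, ANT and HYP share the top rank.
RANK = {"SYN": 0, "ANT": 0, "HYP": 0, "FRA": 1, "CHY": 2, "MET": 3, "FIE": 4}

def resolve(candidates):
    """Return the highest-precedence label among a coder's candidate labels.

    An empty list means the pair was judged unrelated ("NONE").
    Ties within the top group are not regulated by the scheme; here the
    first candidate listed wins (an assumption of this sketch).
    """
    return min(candidates, key=RANK.__getitem__) if candidates else "NONE"

def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two equally long label sequences.

    Undefined (division by zero) if chance agreement equals 1.
    """
    n = len(labels_a)
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    expected = sum(freq_a[lab] * freq_b[lab] for lab in freq_a) / n ** 2
    return (observed - expected) / (1 - expected)

def two_step_agreement(coder_a, coder_b):
    """Kappa for relatedness as such, then for link type on pairs both coders related."""
    related_a = [lab != "NONE" for lab in coder_a]
    related_b = [lab != "NONE" for lab in coder_b]
    step1 = cohen_kappa(related_a, related_b)
    both = [(a, b) for a, b in zip(coder_a, coder_b) if a != "NONE" and b != "NONE"]
    step2 = cohen_kappa([a for a, _ in both], [b for _, b in both])
    return step1, step2

# Hypothetical candidate labels for five pairs, one list per pair and coder.
coder_a = [resolve(c) for c in (["CHY", "SYN"], ["FIE"], [], ["FRA", "MET"], ["HYP"])]
coder_b = [resolve(c) for c in (["SYN"], [], [], ["FRA"], ["FIE"])]
kappa_related, kappa_type = two_step_agreement(coder_a, coder_b)
```

Computing the second score only over pairs that both coders marked as related keeps disagreement about the existence of a link from being counted twice.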
To illustrate, a potential ‘FRA’-classification was only assigned to pairs that would not have counted as instances of either ‘SYN’, ‘ANT’ or ‘HYP’ already; ‘CHY’ was only chosen if the pair was neither ‘SYN’, ‘ANT’, ‘HYP’ nor ‘FRA’ and so on. Coders were encouraged to use placeholders for controversial assignments such that the problematic cases could be reviewed together at the end. To fully complete the classification, however, all pairs in the comparison ultimately had to be assigned to one of the seven semantic relations in the scheme or else be marked as unrelated. When coders were finished, interrater agreement was calculated (both for the overall relatedness assessments and for the specific category assignments). Interrater agreement was quantified using Cohen’s kappa (κ, Cohen 1960).
6.3.2. Identifying pockets of productive use
The second analysis is devoted to the relationship between item-based analogical extension and higher-level schema extraction. Consider again the earlier quotation from Israel (2002) discussed in chapter 5:
Children learning a language, and speakers in general, represent linguistic units in ways that maximize their motivation and emphasize their commonalities. Two units are consistent with each other to the degree that they match in their formal and semantic specifications. LOCAL CONSISTENCY applies to linguistic units activated online in usage events, and requires these to be as consistent as possible with entrenched utterance types. GLOBAL CONSISTENCY applies to the repertoire of constructions as a whole, and requires that units be represented in ways which maximize their consistency with each other. Local consistency favors a massive inventory of low-scope constructions to represent the rich details of experienced usage events: it thus fosters arbitrariness in the grammar, but also makes on-line processing easier by offering conventional units for every occasion. Global consistency favors the development of abstract representations and recurrent inheritance links across constructions: it thus increases motivation in the grammar, but also makes processing harder as the schematic units it favors are farther removed from the details of actual usage. Global consistency motivates the emergence of schematic linguistic units which can license novel utterances; local consistency constrains the use of such units by encouraging conformance to familiar patterns of usage. (Israel 2002:123f.)
The connection between analogy and schematisation that is suggested in this paragraph was operationalised such that speakers’ creative extension of
a construction (i.e. the occurrence of locally consistent analogies) was predicted to reflect the clustering of established instances of this construction across the covered region of semantic space (i.e. the existence of certain more globally consistent schemas). With Barðdal (2008), it was assumed that the overall usage spectrum of a construction can be broken down into more or less coherent subschemas, and that in low-level schemas such as the present target constructions, incipient productivity is (if at all) found within pockets of semantically consistent subtypes of the overall category. Furthermore, following Baayen and colleagues, it was assumed that productive coinages/creative extensions of a schema are adequately operationalised as constructional hapaxes. The hypothesis, then, was that creative extensions of a construction (i.e. hapaxes) are not scattered randomly across the semantic space covered by this construction, but that their distribution can be predicted from the distribution of established variants of the same construction (i.e. non-hapaxes) across semantic space. In short: the more type frequent a particular semantic variant of a construction, the more novel coinages of the same semantic type are to be expected. This hypothesis was evaluated in three steps: first, the data were categorised semantically. Second, the categorised data were partitioned into ‘established’ vs. ‘creative’ types. And third, correlation analyses were conducted in order to assess whether the distribution of novel coinages across the identified semantic spectrum reflected the distribution of already established types across these classes. For the two constructions in which the intensified predicate was nominal (i.e. A and C), semantic classification built on the results of the frame-based analyses reported in section 6.4.1. For construction B, the category scheme suggested by Dixon (1977) was employed (cf. chapter 5).
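The three evaluation steps can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the procedure actually used: the semantic class labels and token frequencies are invented, the function names are hypothetical, and Pearson's coefficient is computed by hand so that the example needs no external package.

```python
from collections import defaultdict
from math import sqrt

def count_by_class(types):
    """Steps 1 and 2: count established types and hapaxes per semantic class.

    `types` holds one (semantic_class, token_frequency) pair per type;
    a type attested exactly once counts as a constructional hapax.
    """
    counts = defaultdict(lambda: {"established": 0, "hapax": 0})
    for sem_class, freq in types:
        counts[sem_class]["hapax" if freq == 1 else "established"] += 1
    return dict(counts)

def pearson_r(xs, ys):
    """Step 3: Pearson's product moment correlation of two equally long lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)

# Hypothetical types of one intensifier: (semantic class, token frequency).
types = [
    ("PSYCH_STATE", 24), ("PSYCH_STATE", 22), ("PSYCH_STATE", 9), ("PSYCH_STATE", 5),
    ("PSYCH_STATE", 1), ("PSYCH_STATE", 1), ("PSYCH_STATE", 1),
    ("TOPIC", 66), ("TOPIC", 31), ("TOPIC", 1),
    ("SENSATION", 21), ("SENSATION", 1),
]
counts = count_by_class(types)
established = [c["established"] for c in counts.values()]
hapaxes = [c["hapax"] for c in counts.values()]
r = pearson_r(established, hapaxes)  # positive if hapaxes follow established types
```

With only three classes the coefficient is of course not interpretable; the sketch merely shows how the two count vectors that enter the correlation are derived.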
The frame-based annotation for constructions A and C built on the observation that although a given expression invokes an entire frame as the base of its conceptualisation, it also profiles one specific participant or aspect of this frame at the expense of others. For instance, a given intensity collocation may profile the AGENT of the invoked event, the ACTION carried out by the agent, maybe a certain additional participant like PATIENT, INSTRUMENT or GOAL, a particular property of any of these participants, the RESULT or the SETTING of the encoded event and so on and so forth. Given this multiplicity of possibilities, the hypothesis developed in the preceding paragraph predicts that the distribution of constructional hapaxes across the designated frame should reflect the distribution of established uses across the same categories of frame elements. This is to say that if a particular intensifier is used chiefly in combination with e.g. different words for agents of a specific kind of event but hardly ever in connection with designations of the event itself, speakers should be more confident to extend this intensifier to novel agentive uses than to novel eventive types. Where there was a potential conflict between different frame element categories due to regular polysemy (e.g. between eventive and result readings of expressions like glowing endorsement), types were assigned to the category that instantiated the intuitively basic/non-derived reading (here: ACTION rather than RESULT). Also in constructions A and C, nouns denoting a property of one of the frame participants were classified further according to Dixon’s typology of adjective meanings (e.g. AGE, COLOUR, HUMAN PROPENSITY etc.). In addition, they were indexed for the participant that possessed the property in question, thus giving classifications like ‘AGENT PROPERTY: HUMAN PROPENSITY’ or ‘ACTION PROPERTY: DIFFICULTY’. Semantic roles like AGENT and EXPERIENCER were distinguished to the extent that the assignment was uncontroversial and did not lead to an unnecessary proliferation of categories. This was sometimes the case in construction C, where the intensified property is invariably ascribed to the subject participant, but this participant does not necessarily have the same semantic role each time. For instance, in (14a) the subject would standardly be identified as an EXPERIENCER, but in the frame-semantically parallel case of (14b), it is an inanimate entity that is not compatible with such a characterisation:
(14) a. When everything was back in place, she beamed with pride and boasted: Aw weel, ah feel much better noo! (BNC CAV)
     b. Though it revolves around death, Cameron Crowe's hotly anticipated follow-up to 'Vanilla Sky' is optimistic overall, beaming with the same life-affirming mood as the crowd-pleasers 'Jerry Maguire and 'Almost Famous'. (http://www.dooyoo.co.uk/dvd-title-e/elizabethtown)
For present purposes, what the classification should capture is that in both cases, it is a certain kind of psychological state that is attributed, and the entity that it is being attributed to is the subject of the designated ‘beaming’ event (rather than any other implicit or explicitly realised frame participant). By contrast, the fact that the two property bearers in (a) and (b) are nevertheless of different semantic types is secondary here. To capture such commonalities, relevant examples were classified more schematically in
the format ‘SUBJECT PROPERTY: ’. Finally, items that were not straightforwardly assigned to any of Dixon’s categories were classified as ‘OTHER’. Using these categories, an exhaustive frame-based systematisation of the entire co-occurrence spectrum was provided for the most type frequent intensifier in each construction and language. Target items for the second classification step are reported together with their constructional type frequencies in table 6.2:

Table 6.2 Target items of the productivity study

       English                German
Cxn    Intensifier   Types    Intensifier   Types
A      glare         69       glühen        467
B      dazzle        70       knallen       77
C      glow          120      strahlen      45
Following up on the classification, instances were partitioned into hapax vs. non-hapax occurrences. Finally, the hypothesised connection between the counts for established and novel types of a given meaning was assessed using correlation analysis. Correlation analysis is a statistical method for assessing dependencies between different measurements of two variables X and Y (here: number of hapaxes vs. number of non-hapaxes within a given semantic class). Correlations may be negative or positive in polarity, and small (±0.1–0.3), moderate (±0.3–0.5) or large (±0.5–1.0) in size. Standard measures such as Pearson’s product moment correlation coefficient (the one used here) can be calculated by freely available statistics packages like R or spreadsheet software such as OpenOffice Calc. In interpreting the results of such an analysis, it is necessary to bear in mind that the existence of a correlation between two variables X and Y does not in itself provide evidence that the values for Y in fact depend on those for X: it could just as well be the other way round (Y is the independent variable, and X’s value depends on Y), or both variables may be affected in similar ways by an unknown third variable Z without there being a direct connection between X and Y. However, given widespread assumptions about the connection between type frequency and productivity (as discussed in section 6.2.3), it was assumed that the possibility of mixing up cause and effect (or even failing to recognise an underlying ‘true’ predictor) could be neglected here. In other words, where a significant positive correlation between the distribution of established and novel types was found, this was interpreted as
evidence that creative extensions of different constructional variants are in fact driven by variants’ semantic type frequency.
6.3.3. Identifying higher-level generalisations
Finally, for the identification of higher-level generalisations (i.e. similarities across different micro-constructions), the data were subjected to a hierarchical cluster analysis along the lines suggested by Gries and Stefanowitsch (2010). For this analysis, the intensified predicates of all six constructions (A, B and C in English and German) were clustered according to the intensifiers that they occurred with (rather than the other way round). By clustering the data for intensified predicates rather than for intensifiers, it was possible to detect regularities in the connection between semantic domains whose instances tended to take the same intensifiers in the data. Hierarchical cluster analysis is an explorative statistical technique that groups a set of cases (here: intensified predicates) according to their comparative (dis)similarity with regard to a matrix of variables (here: co-occurrence counts with particular intensifiers). The results of the analysis are represented in a hierarchical tree diagram (dendrogram) that is interpreted as follows: on the left hand side, each individual object that is part of the comparison starts out as a separate cluster. By comparing the relative (dis)similarities of the classification targets with regard to their distribution across the investigated spectrum of variables, the method then starts to link (or amalgamate) clusters into ever more inclusive clusters, with the respective (dis)similarities represented in the diagram in terms of a particular distance value: the further to the right in the diagram a particular linkage, the more dissimilar the amalgamated objects are. The goal of the analysis is therefore to uncover similar clusters and branches of these clusters within the overall tree which can then be interpreted.
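The amalgamation procedure just described can be made concrete with a small sketch. It uses city-block (Manhattan) distances and a Ward-type linking rule, the two settings adopted for the present analysis, but it is not the implementation by Gries and Stefanowitsch: the Ward update is applied here directly to the unsquared distances (other tools square them first, so merge heights differ across implementations), and the predicates, intensifier columns and counts are invented.

```python
def manhattan(u, v):
    """City-block distance between two co-occurrence vectors."""
    return sum(abs(a - b) for a, b in zip(u, v))

def ward_cluster(vectors):
    """Agglomerate items bottom-up; return merges as (members_a, members_b, height).

    Distances to a newly formed cluster follow the Lance-Williams form of
    Ward's update, applied to the unsquared distances (an assumption).
    """
    clusters = {i: (i,) for i in range(len(vectors))}  # cluster id -> member indices
    dist = {frozenset((i, j)): manhattan(vectors[i], vectors[j])
            for i in clusters for j in clusters if i < j}
    merges, next_id = [], len(vectors)
    while len(clusters) > 1:
        pair = min(dist, key=dist.get)          # closest pair of clusters
        i, j = sorted(pair)
        d_ij = dist.pop(pair)
        n_i, n_j = len(clusters[i]), len(clusters[j])
        updated = {}
        for k in clusters:
            if k in (i, j):
                continue
            n_k = len(clusters[k])
            d_ik = dist.pop(frozenset((i, k)))
            d_jk = dist.pop(frozenset((j, k)))
            updated[k] = ((n_i + n_k) * d_ik + (n_j + n_k) * d_jk
                          - n_k * d_ij) / (n_i + n_j + n_k)
        merges.append((clusters[i], clusters[j], d_ij))
        clusters[next_id] = clusters.pop(i) + clusters.pop(j)
        for k, d in updated.items():
            dist[frozenset((next_id, k))] = d
        next_id += 1
    return merges

# Hypothetical counts: rows are intensified predicates, columns two intensifiers.
predicates = ["tension", "suspense", "pride", "joy"]
counts = [(10, 0), (9, 1), (0, 10), (1, 9)]
merges = ward_cluster(counts)
groups = [tuple(predicates[m] for m in a + b) for a, b, _ in merges]
```

Each entry in `merges` corresponds to one linkage in the dendrogram, with the third element giving the distance value at which the two clusters are joined.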
Hierarchical cluster analysis is actually a cover term for a whole family of methods distinguished by different criteria for amalgamating clusters (linking rules) and quantifying the (dis)similarities involved (distance measures). I will not offer a discussion of these different options here but simply adopt the existing implementation proposed by Gries and Stefanowitsch (2010) for the specific purpose at hand (i.e. clustering the covarying collexemes of a grammatical construction). This implementation involves computing similarities using the City-block (Manhattan) distance measure and linking the clusters using Ward’s method. Cluster analyses yield more telling results if they are restricted to relevantly frequent items. For this reason, the analysis was restricted to the 50 most token frequent intensified predicates occurring in each construction.
6.4. Results
6.4.1. Item-based generalisations
This section presents the results of the manual classification of item-based co-occurrence patterns. The classification is based on 7789 semantic similarity judgments (1653 English and 6136 German comparisons). Interrater reliability for semantic relatedness (i.e. determining whether or not two items were related at all) was fair, though not strikingly high (English=.73, German=.77). The great majority of disagreements involved pairs which one of the two raters had marked as unrelated and the other as remotely or indirectly related via ‘FIE’. Given its negative characterisation as a kind of ‘last resort’ for indicating commonalities that were not captured by any of the more clearly defined relations, this is not surprising. Interrater reliability for the specific nature of a given link was calculated for all pairs where coders had agreed on the existence of such a link in the first place. Reliability scores were again acceptable (English=.79, German=.82), with most disagreements involving classifications of the type ‘FIE’ and ‘FRA’. All results reported in the following sections are based on the classifications of the first coder. As will be seen below, these results point to many similarities both across languages and across constructions. Since there were considerably more German than English datapoints in the classification study (i.e. combination types and hence also judgments), the discussion of the results will open with a discussion of the findings for German to which the English results will then be related.
SOUND
The two target items in the domain SOUND were English crackle and German knistern. Table 6.3 contrasts their co-occurrence patterns across constructions and languages and reports the number of classification judgments per variant on which the following discussion is based. The table suggests that intensifying uses of the concept CRACKLE have different constructional
Table 6.3 Co-occurrence patterns: crackle/knistern

       English                             German
Cxn    Types   Significant   Judgments     Types   Significant   Judgments
A      4       –             3             29      3             81
B      55      2             107           8       2             13
C      51      3             147           19      2             35
associations in the two languages: in English, its type frequency drops from construction B over construction C to construction A, whereas the German ranking is exactly the other way round. Beginning with German, the top attracted combination in construction A is knisternde Spannung ‘crackling suspense/tension’. The fact that the designated quality is often said to ‘be in the air’ (knisternde Spannung lag in der Luft) suggests that the expression derives from a simile in which the designated scene is likened to the imminent break of a thunderstorm (i.e. a situation in which crackling electric currents indicate the build-up of an atmospheric charge). From here, collocations with knistern are extended to all kinds of metaphorical ‘atmospheres’ in which the setting of the designated scene is experienced as (or, at least, hyperbolically described as) particularly heavy with the quality denoted by the oblique nominal (typically something unsettling or thrilling). Types related to Spannung include one synonym (Anspannung) as well as a number of hyponyms (Derby/Hoch-/Krimispannung) and one hyperonym of the prototypical filler (Unterhaltsamkeit, literally: ‘entertaining-ness’). All remaining connections are frame-based. One group designates the cause of the experienced state of tension/suspense/thrill (Rivalität ‘rivalry’, Hassliebe ‘love-hate’, Erotik ‘eroticism’, Geheimnis ‘mystery, secret’), a second refers to concurrently experienced or resulting psychological states of the experiencer participant (Wachheit ‘alertness’, Erwartung ‘anticipation’, Vergnügen ‘joy’, Nervosität ‘nervousness’, Unsicherheit ‘uncertainty’, Unruhe ‘disquiet’), and a third group consists of evaluations of the overall quality of the designated scene (Dramatik ‘drama’, Brisanz ‘explosiveness’). The two remaining significant combinations, Hochspannung and Erotik, invoke the same semantic frame as Spannung. 
In the other two constructions, the only two attracted predicates are again Spannung and Erotik (in construction C) and their adjectival equivalents spannend and erotisch (construction B). Also among the related types
further down in the association strength ranking, the exact same words (or morphological variants of these words) as in construction A come up again (e.g. Hochspannung, Dramatik, Geheimnis; also geheimnisvoll ‘mysterious’ in construction B). In other words, German knistern occurs with largely identical predicates in all three environments, suggesting that at least in the case of this micro-construction, speakers have indeed memorised a particular meaning (i.e. a certain complex metaphorical conceptualisation) that goes with it rather than a set of specific words that are independent from one construction to the next. Moving on to English, crackle is only found with four different collocates in construction A (tension, intensity, energy and silence). None of them is significantly attracted to the intensifier, but all are straightforwardly related to the same frame that is also evidenced by the German data (which raises the question whether either of the two expressions is possibly a calque). By contrast, the two significant associations in construction B are semantically unrelated (cracklingly smart, cracklingly fine). Both expressions denote positively valued qualities. Each of the two also has a number of more immediate semblances involving (near-)synonymy (smart: intelligent, brilliant, inventive, arch; fine: good, brilliant) and in the case of smart also co-hyponymy with other predicates from the domain INTELLECTION (funny, humorous, original, witty). Fine was categorised as expressing an abstract value judgment, such that expressions denoting a positively valued quality in some more specific domain (e.g. delicious, fresh, suspenseful) were classified as contextual hyponyms; for smart, these were tagged as ‘FIE’. 
Incidentally, the connection between fine and suspenseful marks a point of contact between the evaluative cluster and expressions instantiating the abovementioned ATMOSPHERE frame, which are also attested in construction B (tense, suspenseful, sinister, frightening, creepy), though only much further down in the association strength ranking. In fact, all of the latter are constructional hapaxes, suggesting that they are creative analogical extensions. Since these uses are obviously not modelled on the central exemplars of construction B itself, one possibility would be to see them as cross-constructional analogies based on entrenched expressions in construction C, where relevant predicates are found among the top attracted combinations. Interestingly, the three central exemplars in this environment in fact connect the ATMOSPHERE frame also found in constructions A and B (crackle with tension) with the group of positively valued qualities and traits found in construction B (crackle with humour) via a third cluster centred on crackle with energy. The connection leads from experienced atmospheric qualities (tension, unease, fear) over the forces that generate them (energy, electricity, power, life, dramatic fire, something malevolent) to a special case of the latter that could be identified as ‘psychic energies’ (enthusiasm, excitement, passion, creativity, ingenuity, humour, charm, honesty, ambition, interest, appetite, aggression). Furthermore, like in German, the sense of presence or imminence of the quality or event denoted by the oblique nominal is repeatedly characterised as being ‘in the air’:

(15) a. Leeds United's rural training ground outside Wetherby looks as pretty as a picture in a light winter blizzard though, on the day the club sell Jonathan Woodgate to Newcastle United, the Yorkshire air is crackling with something far more malevolent than falling snowflakes. (http://observer.guardian.co.uk/print/0,,4596761-102283,00.html)
     b. I can still hear the smack of knuckle on bone, the words spat at me like bullets, the air crackling with aggression, the bully spitting and snarling, performing for the crowd, the desire for acceptance and pain of approval withheld. (http://www.roberthiggs.co.uk)
     c. Political life is in turmoil. Hundreds of thousands, millions of people are being drawn to street protest and other forms of direct action for the first time in their lives. The air is crackling with real change. (http://www.cpgb.org.uk/worker/470/call.html)

In sum, the attested variation for intensity collocations with crackle/knistern provides strong evidence for a coherent conceptual basis of relevant expressions in all three constructions in both languages.
Whereas significantly attracted combinations in German are restricted to what was identified as the central ATMOSPHERE frame above, the English counterparts of these expressions have spawned additional extensions in the direction of evaluative meanings (particularly in construction B) that attach to a particular subset of the ‘energetic’ qualities that appear in a variant of the ATMOSPHERE frame (i.e. positively valued psychological properties and personality traits).
LIGHT
Instances of the second pattern were investigated on the example of English dazzle and German blenden. Table 6.4 gives an overview of the classification:

Table 6.4 Co-occurrence patterns: dazzle/blenden

        English                              German
Cxn     Types   Significant   Judgments      Types   Significant   Judgments
A       43      2             83             94      12            885
B       69      3             201            38      4             142
C       24      -             23             -       -             -
The top attracted combination in German is blendende Form ‘dazzling form’. The expression invokes a schematic ACCOMPLISHMENT frame in which an agent working on a particular task is well equipped to deliver a successful performance. The expression is typically found in sports contexts, but the schematic frame structure can also be fitted onto activities in many other domains. Apart from two hyponyms (Tagesform ‘day’s form’, Frühform ‘early form’), frame-based connections again dominate the picture. These include relations to further aptitudes and specific abilities of the agent participant (Talent ‘talent’, Übersicht ‘vision’, Selbstbewusstsein ‘self-confidence’), the (potent) agent itself (Virtuose ‘virtuoso’, Stratege ‘strategist’), the aspired achievement/result state (Spitzenresultat ‘top result’, Erfolg ‘success’) alongside opportunities to attain it (Gelegenheit ‘opportunity’, Torchance ‘scoring chance’, Finalchance ‘chance of reaching the final’) and finally descriptions of the successful display itself which imply the ‘dazzling form’ of the agent (e.g. Lauf ‘roll’ as in to be on a roll, Spielverständnis and Spielkultur ‘sophistication of play’). Apart from SPORTS, three other frames with the same schematic accomplishment structure recur in the data: BUSINESS (Gewinn ‘win’, Umsatz ‘turnover’, Geschäft ‘deal/bargain’, Karriere ‘career’, Bankkarriere ‘banking career’), PERSUASION (Überredungskünstler ‘smooth talker’, Motivator ‘motivator’, Redekunst ‘elocution’, Verführungskraft ‘seductive power’, Verheißung ‘promise’, Anklang ‘appeal’) and ENTERTAINMENT (Unterhaltung ‘entertainment’, Unterhalter ‘entertainer’, Komiker ‘comedian’, Tanzkunst ‘dance’, Gag ‘gag’). Internally, these clusters embody the same kind of metonymic shifts between frame participants that were already described for the SPORTS variant. Between domains, relations between members of
the four variants were tagged as ‘FIE’ insofar as they represented equivalent roles in their respective frames (e.g. blendendes Geschäft/Unterhaltung as different manifestations of the intended GOAL of the action), but marked as unrelated if the connection required yet further shifts within the parallel frames (e.g. blendendes/r Geschäft/Unterhalter, which instantiate GOAL and AGENT roles of the underlying accomplishment frame, respectively). A ‘FIE’-relation was also posited for connections with Licht ‘light’ and Glanz ‘shine’: the fact that the above predicates all combine with blenden indicates a schematic metaphorical connection to the domain LIGHT at some point (possibly because of a stereotypical connection between shiny objects and value), but this is mediated by additional construal operations in each case since none of the above notions is conventionally conceptualised as a form of LIGHT itself (thereby demoting the connections from ‘MET’ to ‘FIE’). Finally, for positively connotated predicates from the above frames (e.g. Talent, Erfolg, Verheißung), a schematic ‘FIE’-type commonality also shows up in conjunction with other positively valued qualities that do not invoke an otherwise similar frame structure (i.e. the implication of a successful pursuit of some sort; cf. Schönheit ‘beauty’, Jugendlichkeit ‘youthfulness’, Reinheit ‘purity’). The only other combination among the top ten that is not already covered by the above characterisation is blendendes Weiß ‘dazzling white’. Being a source domain collocation, the classification results are mostly uninteresting here: Apart from a few co-hyponyms (Bläue ‘blue’, Grün ‘green’), some metonymic links to property bearers (Sonne ‘sun’, Blitzlichtgewitter ‘flurry of flashbulbs’, Scheinwerferlicht ‘floodlight’) and a group of relations to other items from the larger field LIGHT (Helle ‘brightness’, Licht ‘light’, Kontrast ‘contrast’), there is only a metaphorical link to Reinheit ‘purity’. 
The top attracted combination in construction B is the direct equivalent blendend weiß which has virtually identical collocates in the adjectival domain, among them also one of the three remaining significant items, hell ‘bright’. The two others are the participial form funktionierend ‘working’ and schön ‘beautiful’. The former is easily related to the ACCOMPLISHMENT frame characterised above, as are its close contextual variants nutzbar ‘usable’, parierend ‘parrying/saving (sports)’ and harmonierend ‘harmonising’. Like in construction A, metonymic variants include extensions to characterisations of the agent’s aptitude and performance (souverän ‘superior’, einfallsreich ‘resourceful’, brillant ‘brilliant’) as well as to preconditions (vorbereitet ‘prepared’) and results (geglückt ‘succeeded’) of the designated process. Also among the more schematic commonalities tagged
as ‘FIE’-relations, there are direct parallels to collocates already found in construction A, i.e. positively valued terms that do not (necessarily) invoke an ACCOMPLISHMENT frame (schön ‘beautiful’, gut ‘good’, fürstlich ‘princely’). Summing up the results for construction B, then, two significant combinations are immediate adjectival equivalents of members of the source domain collocation cluster around blendendes Weiß in construction A, one combination directly relates to the posited core meaning of successful accomplishments, and the last significant item is motivated by an independent metaphor (BEAUTY IS LIGHT), but still remotely related to the accomplishment frame through its likewise positive attitudinal colouring. Interestingly, these consistent semantic associations have no manifestations in construction C, where blenden is unattested altogether. This marks a difference to English, where the direct translation equivalent dazzle is found in all three environments. Since the two verbs are translation equivalents, the contrast cannot be attributed to the lexical semantics of the intensifier but is to be sought in general syntactic differences between the two languages (to do with permissible mappings between verbal semantic roles and grammatical functions; cf. Hawkins 1986; Taylor 1995): the verbs dazzle and blenden are semantically transitive, and only English allows the experiencer-patient participant to be expressed as the subject in construction C. Compare:

(16) a. Gently at first, and then with one sharp movement, she opened the door. Her eyes dazzled with light. (BNC CJF)
     b. *Ihre Augen blendeten vor Licht.

Moving on to English, construction A constitutes a case where the significantly attracted exemplars identified by the collexeme analysis do not provide promising entry points for systematising the larger semantic category structure of dazzle + N expressions.
There are only two significant combinations: first, the source domain expression dazzling light, and second the collocation dazzling smile. The latter embodies a construal that extends the metaphor JOY IS LIGHT metonymically to facial expressions and gazes that manifest the experiencer’s state of mind. The metaphor is evidenced by a number of other expressions in the data for construction A:

(17) a. …ruled the room, while the dancers whirled in glittering joy. (BNC CMP)
     b. …giving her pretty blonde aunt a wide, beaming grin. (BNC JXX)
     c. …blonde-haired and bright-eyed with a sparkling smile… (BNC H8T)
     d. …held Una in his arms mirrored by the shining happiness in Una. (BNC B1X)
     e. …cheerfullly with huge gleaming grin spreading across… (http://www.library.veryhelpful.co.uk/Page%2046%20Jenny%20Sullivan.htm)
     f. Lorna gave him the intense, twinkling smile that had once… (BNC APU)
     g. O running stream of sparkling joy… (BNC B0Y)
     h. But the flashing grin and the warm Leeds dialect of… (BNC HP0)

However, no other variants of such expressions occur with the intensifier dazzle. The only remaining expression, dazzling light, is obviously related to all other collocates (at least indirectly) in virtue of their taking an intensifier from the domain LIGHT, but the expression does not directly invoke a particular coherent frame structure in the way e.g. blendende Form and crackling suspense do. Hence, the strategy of concentrating on just the significantly attracted items fails to reveal significant portions of the underlying category structure in the case of this micro-construction, although the data in fact do contain a fair number of expressions that again resemble the semantic patterning of the corresponding German cluster:

(18) a. … we had an array of dazzling talent to do it. (BNC CHA)
     b. In fact it has been a dazzling success, by almost every measure. (BNC ABE)
     c. … His dazzling victory makes him one of just six men with… (BNC K2D)
     d. … and discovered a woman who forged a dazzling career with… (BNC CEK)
     e. Pure, dazzling entertainment. (http://www.bbc.co.uk/films/2006/07/27/warrior_king_2006_review.shtml)

In construction B, the significantly attracted collexemes of dazzle are white, beautiful and bright. Setting the source domain collocations aside, beautiful can be related to quite a number of other collexemes, among them a set of quasi-synonyms (handsome, elegant, chic, pretty, attractive), one hyponym
(picturesque) and one metaphorical connection (bright) plus a large set of ‘FIE’-relations that once more underscore the metaphorical association of LIGHT with positively connotated qualities. Some of the predicates that are related to beautiful through this schematic link are themselves instances of vision and light metaphors, for instance in the mental domain (brilliant, clever, creative, innovative, innovatory, inventive, original, funny, animated). Others refer to aspects of the ACCOMPLISHMENT frame discussed above (talented, gifted, adroit, powerful, successful), and a third group is ambiguous between LIGHT and EVALUATION readings (brilliant, splendid, glorious). Finally, the fact that there are also positively connotated terms from domains that have no systematic metaphorical connection with the domain LIGHT (fresh, delicious, lovely) indicates that the shared abstract association with positive evaluation has begun to spread and take on a life of its own. Unlike German blenden, dazzle is also attested in construction C, albeit rarely. The intensifier does not have any significant associations in the construction, and the most frequently co-occurring noun is a source domain item, light. However, even among the small number of attested types in this environment, there are again some instances of the ubiquitous metaphors discussed above:

(19) a. The dialogue is razor-sharp and dazzling with wit. (http://www.oneword.co.uk/programmes/cinemascope/review/barton_fink)
     b. …those writers who although not dazzling with originality are… (http://www.writewords.org.uk/forum/46_9027.asp)
     c. …or dazzling with superior wisdom here. (http://www.seangabb.co.uk/freelife/flhtm/fl24dyke.htm)

SMELL
Target items in the domain SMELL were English stink and German stinken. Type frequencies, number of significant combinations and classification judgments per environment are contrasted in table 6.5.

Table 6.5 Co-occurrence patterns: stink/stinken

        English                              German
Cxn     Types   Significant   Judgments      Types   Significant   Judgments
A       8       2             13             4       1             3
B       25      1             24             20      9             135
C       34      -             33             10      1             9

The results show that almost half of the collocates of German stinken in construction B are significant associations. As already mentioned in chapter 4, these are entrenched high token frequency expressions like stinksauer ‘stink-angry’, stinknormal ‘stink-normal’ and stinkreich ‘stink-rich’ that are included in many dictionaries. Collocations with emotion terms among these predicates are connotationally neutral (stinksauer, -wütend ‘angry’, -beleidigt ‘offended’), all others are pragmatically marked in that they signal the speaker’s distancing from the bearer of the intensified property (not surprising given the source domain meaning!). Insofar as it is possible to say so with just twenty attested collocates altogether, what seems to recur in construction B are expressions referring to social distinctions (or properties assumed to reflect them by their bearers) such as stinkreich ‘stink-rich’, -vornehm ‘noble’, -fein ‘dignified’, -bourgeois and -konservativ ‘conservative’, but also less covertly evaluative ones (stinklangweilig ‘stink-dull’, -fad ‘boring’, -korrekt ‘correct’, which takes on a distinctly negative meaning when combined with stink-) and otherwise deprecating terms (stinkfaul ‘stink-lazy’, -blöd ‘stupid’, -billig ‘cheap, worthless’). Constructions A and C have even fewer types for stinken and just one attracted collexeme each (Wut ‘anger’ in construction A and Geld ‘money’ in C). Both fit into one of the groups identified for construction B. In addition, a predicate that is found in all three environments (in different morphological realisations) is faul ‘lazy’. In sum, stinken is (unsurprisingly) used mainly in contexts that imply a negative evaluation of the entity bearing the intensified property in all three constructions, and there is some evidence for specialisations regarding the kind of context/domain of the value judgment in which stinken tends to be employed. Significant combinations in English are partly parallel (stinking rich) and partly indicative of different specialisations (stinking hypocrisy, stinking cold). Both significant combinations in construction A have analogical variants in construction B (stinking cold, stinkingly hypocritical).
The remaining items in construction A are mostly swearwords and insults (asshole, bastard, bitch, sod, amateur). Construction B is likewise dominated by adjectives encoding negative value judgments (bad, awful, horrible, average, rude, corrupt) and unpleasant sensations and states (cold, jealous, poor, underpaid, hung over, pissed), some of them most transparently modelled on the significantly attracted items rich and drunk. Construction
C has no significant associations and mostly occurs in the literal-causative variant, although there are also some purely intensifying uses (e.g. stinking with pride/stupidity/regret).

HEAT
The last pattern, HEAT, was examined on the example of English glow and German glühen. Table 6.6 provides an overview of the classification:

Table 6.6 Co-occurrence patterns: glow/glühen

        English                              German
Cxn     Types   Significant   Judgments      Types   Significant   Judgments
A       57      4             218            467     45            4615
B       36      3             102            31      6             165
C       120     6             699            28      2             53
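
As a cross-check on tables 6.4 to 6.6, the reported Judgments figures can be reproduced from the Types and Significant columns under one assumed counting rule: each reference item is compared with every other type in its cluster, and pairs of reference items are counted only once. This rule is an inference from the reported numbers and is not stated in this section. The cap of ten reference items, the fallback to the single top-ranked item where nothing is significant, and the function name are likewise assumptions made for the sketch.

```python
from math import comb


def expected_judgments(types: int, significant: int, cap: int = 10) -> int:
    """Pairwise comparisons under the ASSUMED counting rule (not from the source)."""
    # k reference items: the significant collexemes, capped at `cap`;
    # if nothing is significant, the top-ranked item alone (assumption).
    k = max(1, min(significant, cap))
    # Each reference item meets all other types; pairs among the
    # reference items themselves are counted once, not twice.
    return k * (types - 1) - comb(k, 2)


# (types, significant, reported judgments), copied from tables 6.4 to 6.6
REPORTED = {
    "dazzle A": (43, 2, 83), "dazzle B": (69, 3, 201), "dazzle C": (24, 0, 23),
    "blenden A": (94, 12, 885), "blenden B": (38, 4, 142),
    "stink A": (8, 2, 13), "stink B": (25, 1, 24), "stink C": (34, 0, 33),
    "stinken A": (4, 1, 3), "stinken B": (20, 9, 135), "stinken C": (10, 1, 9),
    "glow A": (57, 4, 218), "glow B": (36, 3, 102), "glow C": (120, 6, 699),
    "glühen A": (467, 45, 4615), "glühen B": (31, 6, 165), "glühen C": (28, 2, 53),
}

# Collect every cell where the assumed rule disagrees with the reported count.
mismatches = {
    name: row
    for name, row in REPORTED.items()
    if expected_judgments(row[0], row[1]) != row[2]
}
print(mismatches)  # empty if the assumed rule fits every reported cell
```

If the rule holds, it also explains why the two largest clusters (blenden + N and glühen + N in construction A) have fewer judgments than their number of significant items would suggest: only the ten most strongly attracted items would have served as reference points.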
In German, the combination glühen + N is the most type frequent microconstruction in the entire study by a wide margin, and the vast majority of these expressions are also closely related semantically. Beginning at the top of the list, the strongest collocation is glühender Verfechter ‘glowing advocate’. The expression refers to a person who strongly identifies with a particular cause and seeks to promote it actively. Within this frame, the top attracted expression profiles the acting participant itself, i.e. an (AGENT-) EXPERIENCER. Other elements that are involved in the frame include the profiled participant’s belief/supported cause as well as the specific action carried out in its support. Contextual (near-)synonyms of the canonical realisation with a profiled EXPERIENCER include Befürworter ‘supporter’, Vertreter ‘exponent’, Förderer ‘patron’, Wortführer ‘doyen’, Kämpfer ‘fighter’, Anwalt ‘advocate, champion’, Verteidiger ‘defender’, Parteigänger ‘partisan’, Gefolgsmann ‘acolyte’, Bekenner ‘confessor’, Apostel ‘apostle’ and Propagandist ‘propagandist’. Apart from these, there are two major types of frame-based variants in the data. The first group includes metonymic shifts to the experiencer’s attitudes, beliefs and motivations (e.g. Überzeugung ‘conviction’, Identifikation ‘identification’, Bekenntnis ‘confession’, Interesse ‘interest’, Begeisterung ‘enthusiasm’, Optimismus ‘optimism’, Wut ‘anger’, Beschützerinstinkt ‘protective instinct’, Widerstandsgeist ‘spirit of defiance’), including its degree of emotional involvement and active commitment to the cause
(e.g. Engagement ‘commitment’, Impetus ‘drive’, Eifer ‘zeal’, Ehrgeiz ‘ambition’, Leidenschaft ‘passion’, Inbrunst ‘fervour’, Hingabe ‘devotion’, Verve ‘verve’, Beseeltheit ‘animation’, Idealismus ‘idealism’). The second major group of metonymic variants denotes the action that is performed to promote the supported cause. Typically, it is some kind of communicative intervention: collocates in this class include e.g. Plädoyer ‘plea’, Appell ‘appeal’, Kampagne ‘campaign’, Fürbitte ‘intercession’, Beschwörung ‘imploration’, Verteidigung ‘defense’, Bejahung ‘affirmation’, Laudatio ‘laudation’, Hommage ‘appraisal’, Loblied ‘hymn’ and Eloge ‘eulogy’. Numerous as these frame-based connections are, they are nevertheless not the largest class in the data: even more frequent are hyponymy and connections of the schematic ‘FIE’ type. Hyponyms of Verfechter in this collocation are typically words for adherents of political ideologies and orientations (Zionist ‘zionist’, Bolschewik ‘bolshevik’, Nationalsozialist ‘national socialist’, Linker ‘left-winger’, Liberaler ‘liberal’ etc.) or religious beliefs (Katholik ‘catholic’, Calvinist ‘calvinist’, Atheist ‘atheist’), but there are also a few cases where the supported cause is more specific in nature (Natogegner ‘Nato opponent’, Bahnbefürworter ‘rail supporter’, Naturschützer ‘conservationist’). As before, ‘FIE’-relations mark all types of commonalities that are not captured by one of the more specific relations above, and in particular correspondences with items that play equivalent roles in a related frame. The one other major frame that is found in the data is centred around the three collocates that come next in the association strength ranking: Verehrer ‘admirer’, Anhänger ‘adherent’ and Fan ‘fan’. Especially Anhänger and Fan are certainly very similar in meaning to Verfechter, but here the aspect of conflict and opposition to a rival orientation is backgrounded in favour of an element of admiration. 
In addition, the affection is typically not for a set of beliefs but, more mundanely, people and, even more mundanely, football clubs (betraying the fact that the corpus consists chiefly of journalese). Nevertheless, it is obvious that the two frames share many of the predicates discussed above, or in other words are only subtly differentiated variants of one and the same more schematic supertype. Moreover, no less than nine out of the ten most strongly attracted collexemes of glühen in construction A are covered by one of the above categories – the only remaining item is the source domain concept HEAT. Since 3673 of the 4615 pairwise semantic comparisons revealed a semantic link that fits somewhere into the structure of these two frames (79.6%), the amount of semantic coherence encountered in this collocation cluster is truly outstanding. In construction B, glühen is substantially less type frequent. Highest up in the ranking is the source domain collocation glühend heiß ‘glowing(ly) hot’ that has one synonymous (erhitzt ‘heated’) and one metonymic variant (rot ‘red’) in the data. In addition, there is a rather high proportion of metaphorical relations to participial forms of emotion verbs (e.g. geliebt ‘loved’, verehrt ‘admired’, gehasst ‘hated’, beneidet ‘envied’, bewundert ‘admired’). The second item, verehrt ‘admired’, invokes the ADMIRATION frame already encountered in connection with construction A and has a number of relations within this frame (e.g. relating to manner in temperamentvoll ‘spirited’ and leidenschaftlich ‘passionate’, to the relation of attraction itself in erotisch ‘erotic’ and interessiert ‘interested’, and to the agent’s (positive) affection in liebend ‘loving’, dafür ‘in favour’ and überzeugt ‘convinced’). It has three synonyms (bewundert, ersehnt, geliebt), one antonym (gehasst ‘hated’) and a number of more schematic connections to other attitude words such as dagegen ‘opposing’, antikommunistisch ‘anti-communist’, optimistisch ‘optimistic’ and pessimistisch ‘pessimistic’. None of the remaining significant attractions opens up a new sense group beyond these categories, such that the semantic associations between constructions A and B are again highly similar in German. Furthermore, the abovementioned categories also cover the associations of the two significant collexemes of glühen in construction C, Hitze ‘heat’ and the emotion term Stolz ‘pride’: as in construction B, there are many metaphorical connections between the domains HEAT and EMOTION, and ‘FIE’-connections between emotion words and terms for other kinds of mental phenomena such as attitudes (Optimismus ‘optimism’) and knowledge states (Weisheit ‘wisdom’).
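
The 79.6% coherence figure reported above for glühen + N in construction A is a plain proportion of linked comparisons among all pairwise comparisons. A minimal sketch of the arithmetic, using only the two counts given in the text (the function name is a hypothetical label, not a term from the study):

```python
def linked_share(linked: int, total: int) -> float:
    """Share of pairwise semantic comparisons that revealed a semantic link."""
    return linked / total


# Counts reported for gluehen + N in construction A
share = linked_share(3673, 4615)
print(f"{share:.1%}")  # 79.6%
```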
In English, glow is no less notable for its association with emotional intensity than German glühen, but the frames invoked by the four significant combinations are nevertheless subtly different. Two of them are source domain combinations (glowing colour and glowing warmth) that can be neglected here. The other two are close variants of one another (glowing testimonial and glowing tribute). The semantic proximity of these items to the ADMIRATION frame encountered in German is obvious; what is lacking in English, however, is a salient association with beliefs and convictions and, by metonymic extension, the element of fervency and also combativeness that is a prominent aspect of its German counterpart. By contrast, collocations with glow in construction A in English extend from the central
members testimonial and tribute to such items as appraisal, endorsement, advert, affirmation and congratulation. Although testament, pride and sense of virtue are among its collocates, too, reference to specific beliefs and orientations is completely absent, and the other emotion/mental property terms in the data for glow are words like kindness, hope, pleasure, confidence, feeling and charm. This impression is confirmed when the perspective is extended to construction B. Here, the attracted combinations involve two source domain collocates (red and pink) and only one other adjective, positive. Closely matching the associations for the top items in construction A, spin-offs from this central exemplar include favourable, enthusiastic, appealing, friendly and techno-friendly. Emotions, attitudes and psychological states going with glow are happy, secure, tender and romantic, and the evaluative collocates consist of beautiful, lovely, successful and vivid. The latter establishes a connection to the likewise positively evaluated items alive and healthy, with only true providing an echo to the weakly evidenced CONVICTION frame represented by testament, pride and sense of virtue in construction A. Construction C is the most type frequent variant (120 items) and has six attracted pairings. The three non-source domain items are health, happiness and pride. Interestingly, the associations of glow in this environment are only partially similar to the two previously considered constructions. As is to be expected, collocates related to happiness fit well with the above account: in construction C, these include a large family resemblance cluster subsuming joy, enthusiasm, rapture, exhilaration, laughter, smile, wit, charm, enjoyment, satisfaction, pleasure, relief, ease, harmony, inner peace, confidence, self assurance, hope and optimism.
More remote connections exist with the groups affection, soul, compassion and love as well as magnificence, splendour, gorgeousness, beauty, perfection, praise, success, achievement and triumph. Not far off is also the cluster that is centred on the second attracted non-source domain item, health. Collocates that were grouped into this part of the network include vitality, sense of wellbeing, energy, activity, vibrancy, presence, power, potential and life as well as, more remotely related, ripeness and pregnancy. All these uses are found in contexts where the association of the property bearer with the respective property is presented in positive terms. However, there is also an entirely different branch involving such predicates as anger, indignation, fury, rage, hostility and malevolence, and in fact also a small group of predicates relating to the CONVICTION frame (inward burning, fervour, fierceness). While
the latter has at least some correspondences in constructions A and B, the antagonistic concepts of the former group come as a surprise. However, a closer look at the data reveals that they are yet more specialised within construction C:

(20) a. The dark eyes were glowing with hostility and… (BNC CN3)
     b. …his silver hair awry, his eyes glowing with insane fury (BNC HJD)
     c. Kruger-Daine's eyes were glowing with malevolence. (BNC GVL)
     d. …her amber eyes glowed with the pent-up anger which was … (BNC C98)
     e. His eyes narrowed as he looked at her, and seemed to glow with fury… (BNC H8S)
     f. Mandeville looked down, his eyes glowing with a murderous rage. (BNC H90)

The examples in (20) illustrate that the above collocate group is restricted to expressions in which the element in the subject slot is the word eyes (or some metonymic variant like gaze), thus pointing to a further fine-grained specialisation in the usage pattern of this intensifier that can be evidenced in the data.

6.4.2. Incipient productivity
As laid out in the introduction, the second question to be addressed in this chapter is how speakers extend the existing semantic spectrum of a collocation cluster. The idea is of course not to predict exactly which extensions of a given collocation will become conventional in which particular contexts at which time or in which order – this is impossible. Rather, the question is whether the synchronic data that form the basis of my study can provide insights into the structure of the currently experienced variation which in turn hold clues to general determinants of such variation processes (i.e. irrespective of the question whether any specific novel use in these data may or may not get off the ground and become conventionalised). Recalling the general vision of constructional sense extension laid out in section 6.3.2, the prediction was that speakers should be most confident about novel coinages that resemble many already existing types, such that there
should be a connection between constructional hapaxes and non-hapaxes of a given semantic category. Frame-based analyses of the type delivered in section 6.4.1 provide the categories that are required for assessments of the issue on the most specific level of semantic variation and extension: that of the individual collocation cluster. Using these categories, the analyses reported in the present section evaluate the prediction formulated in section 6.3.2 for the most type frequent collocation cluster in each constructional environment. Taken together, this second analysis is based on 848 classifications. 68 (8%) of the datapoints could not be assigned to a coherent larger category within their frame and were thus classified as ‘OTHER’.

glare + N

The most type frequent intensifier in construction A in English is glare. Figure 6.2 provides a cloud display for this collocation cluster. In contrast to the cloud displays supplied in chapter 4, figure 6.2 illustrates collostruction strength rather than frequency (i.e. the bigger a word in the cloud, the stronger its association with the construction, attracted collexemes only):
Figure 6.2 Attracted collexeme cloud: glare + N
Figure 6.2 shows that the construction is centred on the collocation glaring error, which has many quasi-synonymous (mistake, inaccuracy, flaw, fault, bug) and hyp(er)onymous variants in the data (typo, contradiction, miscarriage of justice, inconsistency, disparity, discrepancy, shortcoming, deficiency and contrast). The data show that collocations with glare are typically found in contexts implying a negatively valued breach of expectations or violation of some norm. The underlying frame is that of an agent performing a particular action that produces the negatively evaluated result. The frame elements that one would therefore expect to find in the data are
at least AGENT, ACTION and RESULT. However, it turns out that the agent participant is deprofiled in glaring-collocations, as there is not a single instance of a relevant expression in the data. This may not seem surprising since the quality of being ‘glaring’ is attributed to the inappropriateness of a particular action or resulting state of affairs rather than to its instigator or cause. On the other hand, alternations between actions and their outcomes (which are already an instance of metonymy) are not the only kind of metonymic frame shifts in the data: for instance, there are also extensions to properties of the agent that account for its failure to deliver a result that meets the evaluator’s expectations (e.g. glaring inefficiency/disability/weakness). Still, the fact remains that conventional variations of glaring-collocations do not extend to semantically compatible agents themselves (i.e. conceivable expressions for unsuccessful performers at some task such as ?glaring failure [on the agentive reading]/loser/flop etc.). A second finding to emerge from the classification is that the intensifier not only combines with properties of the agent, but may also combine with properties of the agent’s action (glaring difficulty), thus giving the revised scheme AGENT PROPERTY – ACTION – ACTION PROPERTY – RESULT for instances of the central NORM VIOLATION frame. In addition, the data also contain source domain collocations of glare with STIMULUS participants (glaring light/heat/brightness/…) and a few uses that fit into neither of the two main types (glaring emptiness/nightmare/example). Figure 6.3 visualises the frequency distribution of nouns occurring in this micro-construction, and table 6.7 reports the results of the semantic classification of these types (where ‘EST’=number of nonhapaxes in the respective category, ‘HAP’=number of hapaxes in the same category; categories ranked according to hapax frequency):

Table 6.7 Classification results: glare + N

Category                       Example              Total  EST  HAP
RESULT                         glaring error           40   21   19
OTHER                          glaring evidence         9    1    8
STIMULUS                       glaring light            9    4    5
ACTION                         glaring critique         5    1    4
ACTION PROPERTY: DIFFICULTY    glaring difficulty       1    0    1
AGENT PROPERTY: VALUE          glaring weakness         5    4    1
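As a quick arithmetic check on Table 6.7, the relationship between established types (EST) and hapaxes (HAP) per category can be recomputed directly from the counts in the table. The following Python sketch is not part of the original analysis: the function name pearson_r and the dictionary layout are illustrative assumptions, and only the six pairs of counts are taken from the table. It computes the Pearson product-moment coefficient over the six categories, and the rounded value agrees with the coefficient of .89 reported in the running text.

```python
from math import sqrt

# Counts copied from Table 6.7 (glare + N): category -> (EST, HAP).
# The dictionary layout itself is an illustrative assumption.
TABLE_6_7 = {
    "RESULT": (21, 19),
    "OTHER": (1, 8),
    "STIMULUS": (4, 5),
    "ACTION": (1, 4),
    "ACTION PROPERTY: DIFFICULTY": (0, 1),
    "AGENT PROPERTY: VALUE": (4, 1),
}


def pearson_r(xs, ys):
    """Pearson product-moment correlation of two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products and sums of squares around the means.
    cross = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return cross / sqrt(ss_x * ss_y)


est = [e for e, _ in TABLE_6_7.values()]
hap = [h for _, h in TABLE_6_7.values()]
r = pearson_r(est, hap)
print(f"r = {r:.2f}")  # prints "r = 0.89"
```

With only six categories, the coefficient leans heavily on the large RESULT class, so the sketch is best read as a check of the arithmetic rather than as independent evidence.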
The table shows that the largest and most productive sense class is RESULT. In other words, most hapaxes tend to instantiate the same frame element as the central member of the collocation cluster, glaring error. The table also
shows a significant high positive correlation between the number of recurrent and hapax types in each category (rPearson = .89, p […]

[…] FIE
In other words, only use e.g. ‘FRA’ if the pair in question is definitely neither ‘SYN’, ‘ANT’ or ‘HYP’, only use ‘MET’ if the pair is neither ‘SYN’, ‘ANT’, ‘HYP’, ‘FRA’ or ‘CHY’ etc.
5) When you have reached the end of the list, set the filter in column ‘#Item’ to the next item on the list and go back to step 4
6) When you have judged all pairs in a given worksheet, set the filter in column ‘Related?’ to ‘Blanks’ and recheck whether you still find all these pairs unrelated. Adjust your classification where appropriate
7) Remove any provisional classifications (‘XXX’) that you might have assigned during coding. Choose one of the seven categories provided or else leave cell ‘Relation’ blank
8) Check your completed category assignments for coherence by successively filtering the table for all seven categories and reviewing your judgments. Adjust where appropriate
9) Proceed to next worksheet and go back to step 1 until all sheets are finished.
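The precedence rule in the coding instructions (use a code only if no higher-ranked code applies) can be sketched as a simple ordered lookup. The Python fragment below is a minimal illustration under stated assumptions: the complete ranking of the seven codes is reconstructed from the two examples given in the instructions and from the fact that FIE comes last, so the exact position of CHY is an assumption, and the names PRECEDENCE and choose_relation are hypothetical.

```python
# Assumed precedence, highest first. Only fragments of the ordering survive in
# the instructions, so this full sequence is a reconstruction, not a quotation.
PRECEDENCE = ["SYN", "ANT", "HYP", "FRA", "CHY", "MET", "FIE"]


def choose_relation(candidates):
    """Return the highest-ranked applicable code, or None for an unrelated pair.

    `candidates` holds every relation code a coder considers applicable to a
    word pair; None corresponds to leaving the 'Relation' cell blank.
    """
    applicable = set(candidates)
    unknown = applicable - set(PRECEDENCE)
    if unknown:
        # Provisional codes such as 'XXX' must be resolved before finalising.
        raise ValueError(f"unresolved codes: {sorted(unknown)}")
    for code in PRECEDENCE:
        if code in applicable:
            return code
    return None


print(choose_relation({"MET", "FRA"}))  # prints "FRA"
print(choose_relation(set()))           # prints "None"
```

Raising an error on unknown codes mirrors step 7, where provisional ‘XXX’ classifications have to be removed before a worksheet counts as finished.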
2. Tables

Table A.1 Candidate bases: English

Pattern   Verbs
SOUND     baa, bang, bark, bawl, bay, beat, beep, belch, bellow, blare, blast, blat, bleat, bleep, bluster, boom, brawl, bray, bump, burble, buzz, cackle, cannonade, caw, cheep, cheer, chime, chink, chirp, chirrup, chitter, chug, clack, clamo(u)r, clang, clango(u)r, clank, clap, clash, clatter, click, clink, cluck, clunk, coo, crack, crackle, crash, creak, crepitate, croak, crow, crunch, cry, cry out, cuckoo, din, ding, drone, echo, fissle, fizz, fizzle, fulminate, gnarl, gnash, grate, groan, growl, grumble, grunt, gurgle, hammer, hiss, holler, honk, hoot, howl, hum, jangle, jar, jingle, knell, knock, meow, mew, mewl, moan, moo, murmur, mutter, neigh, oink, patter, peal, peep, ping, pink, pipe, pitter-patter, plonk, plop, plunk, pop, pule, purr, putter, quack, rap, rasp, rattle, resonate, resound, reverberate, ring, roar, rumble, rustle, scrape, scratch, screak, scream, screech, scrunch, shriek, shrill, slam, slosh, smack, snap, snarl, snort, sough, sound, splash, splosh, squall, squawk, squeak, squeal, squelch, stridulate, swash, swish, swoosh, tap, thrum, thud, thump, thunder, tick, ting, tinkle, toll, toot, tootle, trill, trumpet, twang, tweet, twitter, ululate, vroom, wail, warble, wham, wheeze, whimper, whine, whinny, whir(r), whish, whisper, whistle, whiz(z), whoop, whoosh, yammer, yap, yell, yelp, yip, yowl, zing
LIGHT     beam, blind, blink, coruscate, dazzle, effulge, flash, fluoresce, glare, gleam, glimmer, glint, glisten, glitter, iridesce, jitter, luminesce, opalesce, phosphoresce, radiate, scintillate, shimmer, shine, sparkle, twinkle
SMELL     pong, reek, scent, smell, stink, whiff
HEAT      blaze, blister, boil, broil, brood, burn, combust, cook, flame, flare, flicker, fume, glow, heat, ignite, incandesce, incinerate, inflame, kindle, light up, melt, parch, roast, scald, scorch, sear, seethe, simmer, singe, sizzle, smo(u)lder, smoke, smother, spark, steam, stew
Table A.2 Candidate bases: German

Pattern   Verbs
SOUND     ächzen, aufkreischen, aufschreien, ballern, batsch-, bellen, bimmeln, blöken, brausen, britzeln, brüllen, brummen, dingeln, dongeln, donnern, dröhnen, dudeln, erklingen, erschallen, ertönen, fauchen, fiepen, fiepsen, flöten, flüstern, gackern, gellen, gickeln, gluckern, glucksen, grölen, grollen, grummeln, grunzen, gurgeln, gurren, hallen, hämmern, heulen, hupen, jammern, japsen, jaulen, johlen, jubeln, keuchen, klacken, klackern, kläffen, klappern, klatschen, klicken, klimpern, klingeln, klingen, klirren, klitsch-, klopfen, knacken, knacksen, knallen, knarren, knarzen, knatschen, knattern, knipsen, knirschen, knistern, knurren, krachen, krächzen, krähen, kratzen, kreischen, lärmen, läuten, meckern, miauen, muhen, murmeln, orgeln, patsch-, pfeifen, piepen, piepsen, pitsch-, plärren, platsch-, plätschern, plitsch-, plumpsen, pochen, poltern, prasseln, puckern, quaken, quäken, quatsch-, quieken, quieksen, quietschen, quitsch-, rappeln, rascheln, raspeln, rasseln, ratschen, rattern, raunen, rauschen, ringen, röcheln, rummsen, rumoren, rumpeln, säuseln, sausen, schallen, schellen, scheppern, schluchzen, schnalzen, schnarren, schnattern, schnauben, schnaufen, schnurren, schreien, schrillen, schwirren, sirren, stöhnen, summen, surren, ticken, tirilieren, tosen, trillern, trommeln, tuckern, wiehern, wimmern, winseln, wispern, wummern, zetern, zirpen, zischeln, zischen, zwitschern
LIGHT     aufblinken, aufblitzen, aufleuchten, blenden, blinken, blitzen, brillieren, erglänzen, erglühen, erstrahlen, flackern, flimmern, flirren, fluoreszieren, funkeln, glänzen, gleißen, glimmen, glimmern, glitzern, irisieren, leuchten, phosphoreszieren, scheinen, schillern, schimmern, strahlen
SMELL     duften, miefen, müffeln, riechen, stinken, wohlriechen
HEAT      abbrennen, anbrennen, anfachen, anheizen, aufflackern, aufflammen, aufglühen, aufheizen, auflodern, aufwärmen, braten, brennen, brüten, brutzeln, emporlodern, entbrennen, entfachen, entflammen, entzünden, erhitzen, erwärmen, flammen, funken, glosen, glühen, grillen, kochen, kokeln, lodern, lohen, rösten, schüren, schwelen, sengen, sieden, verbrennen, verbrühen, verdampfen, verdorren, verglimmen, verglühen, verlodern, verschmoren, versengen, zünden
Table A.3 Intensifier type frequency across constructions: SOUND

English Base   Types   German Base                  Types
resound        162     rauschen ‘swoosh’/‘gush’     306
groan          145     schreien ‘scream’            173
crackle        110     dröhnen ‘drone’              150
hum            109     tosen ‘roar’                 104
fizz           102     knallen ‘bang’               93
buzz           101     donnern ‘thunder’            91
gush           98      krachen ‘crash’              75
reverberate    94      brüllen ‘scream’/‘roar’      60
echo           89      gellen ‘yell’                60
howl           79      knistern ‘crackle’           59

Table A.4 Intensifier type frequency across constructions: LIGHT

English Base   Types   German Base                  Types
glitter        195     strahlen ‘beam’/‘shine’      347
gleam          182     leuchten ‘shine’/‘glow’      281
shimmer        174     glänzen ‘gleam’              196
sparkle        152     schillern ‘scintillate’      187
glare          142     gleißen ‘glisten’            170
dazzle         137     blenden ‘dazzle’             141
shine          132     funkeln ‘sparkle’            119
glisten        123     glitzern ‘glitter’           94
glint          105     blitzen ‘flash’              82
blind          70      flirren ‘flicker’            70

Table A.5 Intensifier type frequency across constructions: SMELL

English Base   Types   German Base                  Types
reek           93      stinken ‘stink’              37
stink          71      duften ‘be fragrant’         12
smell          1       miefen ‘pong’                11
Table A.6 Intensifier type frequency across constructions: HEAT

English Base   Types   German Base                  Types
glow           213     glühen ‘glow’                537
burn           163     flammen ‘flame’              219
seethe         155     brennen ‘burn’               198
blaze          154     lodern ‘blaze’               121
sear           119     zünden ‘ignite’              90
smo(u)lder     101     sengen ‘singe’               35
blister        95      schmelzen ‘melt’             32
brood          93      brüten ‘brood’               29
sizzle         87      kochen ‘boil’                25
flame          85      sieden ‘seethe’              12
Notes
1. In addition, there are also so-called ‘inherently intensive’ expressions such as revolting (‘very unpleasant’) which cannot be decomposed into a distinct intensifying and intensified component (see section 3.3.1).
2. Capitals indicate emphatic stress.
3. GRADE is a grammatical category of adjectives and adverbs that encodes the relative position of an entity as compared to that of some other entity (or all others of the relevant kind) on a particular semantic scale (an easier task, the easiest task). By contrast, intensification also applies to many categories which cannot be graded (such as VPs, NPs and PPs) and opens up a richer spectrum of distinctions than merely relative position on a scale.
4. As Bolinger (1972: 18) observes, “if there are function words, very is surely one of them”.
5. The same ambiguity is also found between copulative and determinative compound interpretations in morphology, cf. eine höflich-bestimmte Aufforderung (‘a request that is polite and authoritative’ vs. ‘a request that is authoritative in a polite way’).
6. In German, I also included functionally equivalent combinations of intensifier and nominal head in morphology as in e.g. Blitzgescheitheit (‘flash’+‘cleverness’, i.e. ‘ingenuity’).
7. Take the above example of glowing health: health being an abstract property of animate entities that cannot itself be conceptualised as ‘glowing’, comprehenders must first establish a metonymic connection to the property bearer. Second, the property bearer thus accessed (i.e., the referent of you in 26a) is not a prototypical emitter of heat or light, either. It is therefore only in a further extension step that possible motivations suggest themselves: feeling hot and/or becoming red in the face due to heightened blood pressure are salient attributes of people only in quite restricted contexts, among them situations of physical effort and strain, and it is this specific constellation that can finally be related to the ascribed property health in terms of a causal inference (i.e., somebody is fit/healthy enough to engage in such activities). Regardless of whether this particular account is correct (in fact, one might just as well tell a different story about HEALTH being metaphorised as a valuable shiny object and the like – recall the disagreements about loud colour reported in chapter 2), it is clear that some such elaboration is required in order to obtain the observed interpretation, which cannot be reconstructed as ‘the intersection of things that are glowing and things that are (different states of) health’.
8. „Wie bei den Verben ist auch bei den Substantiven die Möglichkeit gegeben, klassifizierende Lexeme aufgrund irgendeiner mit dem Wort verbundenen Eigenschaft, d.h. der Konnotation zu intensivieren“ (‘As with verbs, nouns too allow classifying lexemes to be intensified on the basis of some property associated with the word, i.e. its connotation’, van Os 1989: 78).
9. Bolinger (1972) has a chapter on the intensification of nouns but talks about different constructions like e.g. exclamatives with such and what (e.g. Such foolishness! What a house that is! etc.).
10. As in construction A, the corpus study also includes the morphological variant of the pattern in German (e.g., blitzgescheit, ‘ingenious’).
11. Like many earlier studies, Huddleston and Pullum (2002) use the term degree adverb instead of intensifier, rejecting the latter for its semantic inappropriateness in connection with downscaling modification. By contrast, Claudi (2006) points out that the two notions are not coextensive since there exist intensifiers like German höchst ‘highly, to the highest extent’ which cannot be analysed as adverbs (cf. Sie ist höchst ADJ ‘she is highly ADJ’ vs. *Sie V-t höchst ‘she V-s highly’).
12. In the words of van Os (1989: 213): “Syntaktisch gesehen läßt sich die allgemeine Regel aufstellen, daß Intensivierer unmittelbar vor dem Element stehen, auf das sie sich semantisch beziehen. Dieses Prinzip der minimalen Distanz (Adjazenz) kann nur unter sehr spezifischen Bedingungen durchbrochen werden” (‘Syntactically, there is a general rule that intensifiers must immediately precede the element which they relate to semantically. This principle of minimal distance [adjacency] can only be compromised under very specific conditions.’)
13. Bußmann (2002: 186) defines the category as follows: “Im Unterschied zum Positiv (‘Grundstufe’), Komparativ und Superlativ höchste Steigerungsstufe des Adjektivs zur Bezeichnung eines hohen Grades einer Eigenschaft, vgl. neueste/schlimmste Nachrichten, aber (im Unterschied zum relativen Superlativ) ohne vergleichende Komponente: Man nennt den Elativ daher auch ‘absoluten Superlativ’” (‘In contrast to the positive (‘base form’), the comparative and the superlative, [the elative is] the highest level in adjective grading used for marking a high degree of a property, cf. latest/worst news, but (in contrast to the relative superlative) without the element of comparison: therefore, the elative is also called an ‘absolute superlative’’).
14. van Os (1988) lists 161 intensifying elements, Pittner (1996) has 194 types, and neither list is exhaustive.
15. „Es kann nicht geleugnet werden, daß lexikalische Einheiten im Laufe ihrer Entwicklung z.B. vom Grundmorphem zum Affix geworden sind, d.h. die Kategorie gewechselt haben. Doch synchron gesehen lassen sie sich zu einem bestimmten Zeitpunkt nur entweder als das eine oder das andere identifizieren“ (‘there is no denying that there exist lexical units that have transformed from a base morpheme into an affix, i.e. changed categories in the course of their development. Synchronically, however, it is always only possible to identify them as being one or the other at a given point in time’, Schmidt 1987: 99).
16. From a constructional point of view, Booij (2005) observes: “The theoretical problem that there is no sharp boundary between compounding and affixal derivation is not solved, however, by postulating a category of semi-affixes or affixoids; it is just a convenient description of the fact that the boundary between compounding and derivation is blurred, but does not in itself provide an explanation of why this is the case. What we need is a model of morphological knowledge that will enable us to explain these facts”. Arguing for a constructional approach to morphology, he then goes on to suggest the following solution: “The notion ‘affixoid’ thus receives a formal interpretation in terms of linking patterns in the lexicon, and is therefore not to be seen as a theoretical term that introduces a third class of morphemes besides lexical morphemes and bound morphemes. An affixoid is a lexeme that occurs in a subschema for compounds in which the other position is still a variable, that is, without a lexical specification. Such schemas are intermediate between concrete individual compounds and fully abstract schemes for compound structures. The specific and recurrent meaning of a lexeme in the compound structure is specified at this intermediate level”.
17. Later editions of Fleischer’s book (e.g., Fleischer and Barz 1995: 282) do not analyse words like allerbester as prefixed adjectives but as compounds. Furthermore, claiming an “isolated state” for aller- as the only formative performing the relevant function is not entirely correct – there are also occasional uses of elative intensifiers such as super-, sau-, erz- and knall- in this function, though aller- is undoubtedly the most grammaticalised choice.
18. Incidentally, this marks an important difference between construction C on the one hand and constructions A and B on the other: construction C combines the two functions of property ascription and intensification. In other words, constructions A and B are (among other things) grammatical means for signalling intensification, whereas construction C encodes intensified property ascription. Strictly speaking, it is therefore misleading to refer to the oblique nominal in construction C as an intensified property: for instance, in an expression like the place was buzzing with journalists, it is not the meaning of journalists that is intensified, but rather the association of the subject referent the place with the entities denoted by the oblique nominal.
19. http://www.ids-mannheim.de/cosmas2/
20. http://www.natcorp.ox.ac.uk
21. http://miniappolis.com/KWiCFinder/KWiCFinderHome.html
22. http://www.webcorp.org.uk
23. In addition, WebCorp sometimes yielded fewer than 200 hits even though a direct query in the underlying search engine returned more than these 200 hits. In a discussion of this unexpected mismatch on the Norwegian ‘Corpora’ mailing list, one of the developers of WebCorp, Antoinette Renouf, offered the following explanation: “at the moment WebCorp takes the first 200 hits for your search term from your chosen search engine (Google by
default) and extracts concordances from those pages. Unless you choose the ‘one concordance line per site’ option, there is no limit on the number of concordance lines extracted from each of these 200 pages. However, you will sometimes get fewer than 200 concordance lines in the WebCorp output for your search term. This happens if you have chosen additional filtering options (which will filter out some of the 200 hits from Google), or if certain pages are not accessible when WebCorp tries to access them or have changed since they were indexed by Google and no longer contain your search term” (http://torvald.aksis.uib.no/corpora/2005-1/0392.html).
24. For instance, the collocation blazingly fast appears to have an association with computer jargon, and as such seems to be fairly recent: whereas it is quite common in the web data (where it is mostly used to modify such nouns as image response rate), there is not a single occurrence of blazingly fast in the BNC (compiled in the 1990s).
25. These included Roget’s thesaurus (Chapman 1994), the Merriam-Webster online thesaurus (http://www.m-w.com/), WordNet (Miller 1995), the MS Office thesaurus and the thesaurus function of the electronic version of the Cambridge Advanced Learner’s Dictionary (CALD) for English, plus Dornseiff’s Der Deutsche Wortschatz nach Sachgruppen (Dornseiff and Quasthoff 2004) and the MS Office thesaurus for German.
26. http://dict.leo.org/
27. The number of individual concordances is that high for two reasons: first, constructions A, B and C involve (at least partially) different forms of the investigated bases, thereby calling for separate concordances for the different case studies. Furthermore, the English web searches were conducted with separate queries for each different form of the investigated lemmas (rather than in the form of a joint wildcard or batch search) in order to increase the number of retrieved tokens.
28. Exceptions were stacked morphological intensifiers such as German funkelnagel-(neu), piepschnurz-(egal) and rappelzappel-(voll), all of which were included under the rubric of the first of the two intensifying elements (i.e., funkel-, piep-, and rappel- in the present examples).
29. Anticipating the discussion in chapter 6, classifications that seek to assign word meanings to exactly one well-delimited class out of a fixed small set of introspectively devised semantic categories (such as Dixon’s, or any comparable scheme) run into a number of principled theoretical and methodological problems. See section 6.2.1 and note 30 for discussion.
30. A reviewer expresses concerns about the assignment of beautiful to the category VALUE, arguing that it may just as well (or even better) be assigned to the category PHYSICAL PROPERTY instead. More seriously, it is argued that such disagreements cast doubt on the appropriateness of using Dixon’s semantic typology in general, because there is overlap between the supposedly distinct (and mutually exclusive) classes in his scheme. This is a sensible objection. My
classification of the data in terms of Dixon’s broad category scheme is undoubtedly a simplification, and some of the specific assignments that were made may well be controversial. On the other hand, such simplifications are not peculiar to the particular system proposed by Dixon, but rather a necessary corollary of any classification system that uses a fixed set of mutually exclusive categories. And the problem in fact goes even deeper than this: for a given usage, it will often be possible to adduce principled arguments for or against a particular assignment (for instance, the classification PHYSICAL PROPERTY would not be appropriate for the meaning of beautiful in expressions like a beautiful day/mind/compromise etc.). However, as illustrated in chapter 3 using the example of the adjective dead, different tokens of the same word form often mean different things depending on the context in which they are used. In other words, the fact that it is not unified words that collocate with a given intensifier but rather specific readings of these words implies that the calculation of purely form-based collocations already constitutes a first simplification in itself, irrespective of any later semantic classifications to which these collocations are then subject. The deeper question is therefore whether such simplifications are accepted as heuristically useful and analytically legitimate in spite of their apparent imperfection, or whether any such analyses should rather be rejected from the outset. Ultimately, this is again an empirical question: should it turn out that a maximally bottom-up, token-based approach yields grossly different results for a given analysis – here meaning that e.g. the PHYSICAL PROPERTY and VALUE readings of an adjective like beautiful collocate with entirely different intensifiers – the more faithful token-based classification strategy would have to be preferred. It must be added, however, that adopting such an approach is also far more labour-intensive, and that large-scale investigations of the kind conducted in this study would not be feasible with it. This in turn underscores that any expected analytical benefits of such a move still need to be weighed against the resulting costs, and that it is not sensible to always opt for the most faithful/detailed/bottom-up approach in principle if this imposes new (and possibly unjustified) restrictions on the kinds of phenomena that can be investigated in the first place. To come back to the original question, then, I would hold that it is better to aim at a classification that is simplified in the above sense (and to acknowledge and critically discuss its limitations) than to provide no such classification at all, which would mean throwing out the baby with the bathwater.
31. One area where this was especially notable was the use of LIGHT metaphors in classical music and concert reviews, where authors appeared to be particularly prone to boost the expressive effect of their assessments through unconventional instances of synaesthesia.
References
Abbot-Smith, Kirsten, Elena Lieven, and Michael Tomasello
2004 Training 2;6-year-olds to produce the transitive construction: the role of frequency, semantic similarity and shared syntactic distribution. Developmental Science 7: 48–55.
Abbot-Smith, Kirsten, and Michael Tomasello
2006 Exemplar-learning and schematization in a usage-based account of syntactic acquisition. The Linguistic Review 23: 275–90.
Adamson, Sylvia
2000 A lovely little example. Word order options and category shift in the premodifying string. In Pathways of change: Grammaticalization in English, Olga Fischer, Anette Rosenbach, and Dieter Stein (eds.), 39–66. Amsterdam/Philadelphia: John Benjamins.
Aitchison, Jean
2001 Language Change: Progress or Decay? Cambridge: Cambridge University Press.
Ambridge, Ben, Julian M. Pine, Caroline F. Rowland, and Chris R. Young
2008 The effect of verb semantic class and verb frequency (entrenchment) on children’s and adults’ graded judgements of argument structure overgeneralization errors. Cognition 106: 87–129.
Apresjan, Juri D.
1974 Regular polysemy. Linguistics 142: 5–32.
Arnon, Inbal, and Neal Snider
2010 More than words: frequency effects for multi-word phrases. Journal of Memory and Language 62: 67–82.
Aronoff, Mark
1976 Word Formation in Generative Grammar. Cambridge, Mass.: MIT Press.
Arppe, Antti, Gaëtanelle Gilquin, Dylan Glynn, Martin Hilpert, and Arne Zeschel
2010 Cognitive corpus linguistics: five points of debate on current theory and methodology. Corpora 5 (1): 1–27.
Baayen, Harald
1992 Quantitative aspects of morphological productivity. In Yearbook of Morphology 1991, Geert E. Booij, and Jaap van Marle (eds.), 109–49. Dordrecht: Kluwer.
1993 On frequency, transparency, and productivity. In Yearbook of Morphology 1992, Geert E. Booij, and Jaap van Marle (eds.), 181–208. Dordrecht: Kluwer.
Baayen, Harald, and Rochelle Lieber
1991 Productivity and English derivation: a corpus-based study. Linguistics 29: 801–844.
Bannard, Colin, and Elena Lieven
2009 Repetition and reuse in child language learning. In Formulaic Language. Vol. II: Acquisition, Loss, Psychological Reality, Functional Explanations, Roberta Corrigan, Edith A. Moravcsik, Hamid Ouali, and Kathleen M. Wheatley (eds.), 297–321. Amsterdam/Philadelphia: John Benjamins.
Bannard, Colin, and Danielle E. Matthews
2008 Stored word sequences in language learning: The effect of familiarity on children’s repetition of four-word combinations. Psychological Science 19: 241–248.
Barcelona, Antonio
2003a Metaphor and Metonymy at the Crossroads: A Cognitive Perspective. Berlin/New York: Mouton de Gruyter.
2003b On the plausibility of claiming a metonymic motivation for conceptual metaphor. In Metaphor and Metonymy at the Crossroads: A Cognitive Perspective, Antonio Barcelona (ed.), 31–58. Berlin/New York: Mouton de Gruyter.
Barlow, Michael
2000 Usage, blends, and grammar. In Usage-based Models of Language, Michael Barlow, and Suzanne Kemmer (eds.), 315–345. Stanford: CSLI.
Barlow, Michael, and Suzanne Kemmer (eds.)
2000 Usage-based Models of Language. Stanford: CSLI.
Baroni, Marco, and Silvia Bernardini
2006 WaCky! Working Papers on the Web as Corpus. Bologna: GEDIT. http://wackybook.sslmit.unibo.it/
Barsalou, Lawrence W.
1983 Ad hoc categories. Memory and Cognition 11: 211–227.
2005 Situated conceptualization. In Handbook of Categorization in Cognitive Science, Henri Cohen, and Claire Lefebvre (eds.), 619–650. Oxford: Elsevier.
Barðdal, Jóhanna
2008 Productivity: Evidence from Case and Argument Structure in Icelandic. Amsterdam/Philadelphia: John Benjamins.
Bauer, Laurie
2001 Morphological Productivity. Cambridge: Cambridge University Press.
Bergs, Alexander, and Gabriele Diewald
2008 Constructions and Language Change. Berlin/New York: Mouton de Gruyter.
Bernardini, Silvia, Marco Baroni, and Stefan Evert
2006 A WaCky introduction. In WaCky! Working Papers on the Web as Corpus, Marco Baroni, and Silvia Bernardini (eds.), 9–40. Bologna: GEDIT. http://wackybook.sslmit.unibo.it/.
Biedermann, Reinhard
1969 Die deutschen Gradadverbien in aus synchronischer und diachronischer Sicht. Ph.D. diss., Universität Heidelberg.
Boas, Hans C.
2004 Argument alternations in construction grammar: How do you define verb classes? Paper presented at 3rd International Conference on Construction Grammar, Marseille/France.
Bod, Rens
2001 Sentence Memory: storage vs. computation of frequent sentences. Proceedings CUNY 2001, Philadelphia.
Bolinger, Dwight
1972 Degree words. The Hague: Mouton.
Booij, Geert E.
2005 Compounding and derivation: evidence for construction morphology. In Morphology and its Demarcations, Wolfgang U. Dressler, Dieter Kastovsky, Oskar E. Pfeiffer, and Rainer Franz (eds.), 109–132. Amsterdam/Philadelphia: John Benjamins.
Borst, Eugen
1902 Die Gradadverbien im Englischen. Heidelberg: Carl Winter.
Boyd, Jeremy, and Adele E. Goldberg
2009 Input effects within a constructionist framework. Modern Language Journal 93 (3): 418–429.
Braine, Martin D. S.
1976 Children’s First Word Combinations. Monographs of the Society for Research in Child Development 41.
Bréal, Michel
1964 Reprint. Semantics: Studies in the Science of Meaning. New York: Dover. Original edition, London: Heinemann, 1900.
Brown, Keith (ed.)
2006 Encyclopedia of Language and Linguistics. 2nd edition. Oxford: Elsevier.
Burnard, Lou
1995 Users Reference Guide for the British National Corpus. Oxford: British National Corpus Consortium, Oxford University Computing Services.
Bußmann, Hadumod (ed.)
2002 Lexikon der Sprachwissenschaft. Stuttgart: Kröner.
Bybee, Joan L.
1985 Morphology: A study of the relation between meaning and form. Amsterdam/Philadelphia: John Benjamins.
1995 Regular morphology and the lexicon. Language and Cognitive Processes 10: 425–455.
1998 The emergent lexicon. Proceedings of the 34th Annual Meeting of the Chicago Linguistics Society (The Panels), 421–435. Chicago: Chicago Linguistics Society.
2006 From usage to grammar: the mind’s response to repetition. Language 82 (4): 711–733.
2010 Language, Usage and Cognition. Cambridge: Cambridge University Press.
Bybee, Joan L., and David Eddington
2006 A usage-based approach to Spanish verbs of ‘becoming’. Language 82 (2): 323–355.
Bybee, Joan L., and Sandra A. Thompson
1997 Three frequency effects in syntax. In Proceedings of the 23rd Annual Meeting of the Berkeley Linguistics Society, Matthew L. Juge, and Jeri L. Moxley (eds.), 378–388. Berkeley: Berkeley Linguistics Society.
Cacciari, Cristina, and Patrizia Tabossi (eds.)
1993 Idioms: Processing, Structure and Interpretation. Hillsdale: Lawrence Erlbaum.
Casasanto, Daniel, and Lera Boroditsky
2008 Time in the mind: using space to think about time. Cognition 106: 579–593.
Casenhiser, Devin M., and Adele E. Goldberg
2005 Fast mapping between a phrasal form and meaning. Developmental Science 8: 500–508.
Chandler, Steve
2002 Skousen’s analogical approach as an exemplar-based model of categorization. In Analogical Modeling, Royal Skousen, Deryle Lonsdale, and Dilworth B. Parkinson (eds.), 51–105. Amsterdam/Philadelphia: John Benjamins.
Chapman, Robert L. (ed.)
1994 The International Concise Roget’s Thesaurus. New York: Harper.
Childers, Jane B., and Michael Tomasello
2001 The role of pronouns in young children’s acquisition of the English transitive construction. Developmental Psychology 37 (6): 730–748.
Chomsky, Noam
1965 Aspects of the Theory of Syntax. Cambridge, Mass.: MIT Press.
2002 On Nature and Language. Cambridge: Cambridge University Press.
Cienki, Alan
1997 Some properties and groupings of image schemas. In Lexical and Syntactical Constructions and the Construction of Meaning, Marjolijn Verspoor, Kee Dong Lee, and Eve Sweetser (eds.), 3–15. Philadelphia: John Benjamins.
Claudi, Ulrike
2006 Intensifiers of adjectives in German. Sprachtypologie und Universalienforschung 59 (4): 350–369.
Clausner, Timothy C., and William Croft
1999 Domains and image-schemas. Cognitive Linguistics 10 (1): 1–31.
Cohen, Jacob
1960 A coefficient of agreement for nominal scales. Educational and Psychological Measurement, 20: 37–46.
Conklin, Kathy, and Norbert Schmitt
2008 Formulaic sequences: are they processed more quickly than nonformulaic language by native and nonnative speakers? Applied Linguistics 29 (1): 72–89.
Corbett, Greville G.
2004 The Russian adjective: a pervasive yet elusive category. In Adjective Classes: A Crosslinguistic Typology, Robert M. W. Dixon, and Alexandra Y. Aikhenvald (eds.), 199–222. Oxford: Oxford University Press.
Croft, William
2001 Radical Construction Grammar. Oxford: Oxford University Press.
Croft, William, and David A. Cruse
2004 Cognitive Linguistics. Cambridge: Cambridge University Press.
Cronk, Brian C., Susan D. Lima, and Wendy A. Schweigert
1993 Idioms in sentences: effects of frequency, literalness, and familiarity. Journal of Psycholinguistic Research 22, 59–82.
Dąbrowska, Eva
2000 From formula to schema: the acquisition of English questions. Cognitive Linguistics 11 (1/2): 83–102.
2008 Questions with long-distance dependencies: a usage-based perspective. Cognitive Linguistics 19 (3): 391–425.
Dąbrowska, Eva, and Elena Lieven
2005 Towards a lexically specific grammar of children’s question constructions. Cognitive Linguistics 16 (3), 437–474.
Dąbrowska, Eva, Caroline F. Rowland, and Anna Theakston
2009 The acquisition of questions with long distance dependencies. Cognitive Linguistics 20 (3): 571–596.
Deignan, Alice
2005 Metaphor and Corpus Linguistics. Amsterdam/Philadelphia: John Benjamins.
Diessel, Holger
2007 Frequency effects in language acquisition, language use, and diachronic change. New Ideas in Psychology 25, 104–123.
Diewald, Gabriele
2002 A model for relevant types of contexts in grammaticalization. In New Reflections on Grammaticalization, Ilse Wischer and Gabriele Diewald (eds.), 103–120. Amsterdam/Philadelphia: John Benjamins.
2006 Context types in grammaticalization as constructions. Constructions SV9/2006.
Dittmar, Miriam, Kirsten Abbot-Smith, Elena Lieven, and Michael Tomasello
2008 Young German children’s early syntactic competence: a preferential looking study. Developmental Science 11 (4): 575–582.
Dixon, Robert M. W.
1977 Where have all the adjectives gone? Studies in Language 1: 19–80.
Donalies, Elke
2005 Die Wortbildung des Deutschen. Ein Überblick. Tübingen: Narr.
Dornseiff, Franz, and Uwe Quasthoff (eds.)
2004 Der deutsche Wortschatz nach Sachgruppen. 8th edition. Berlin: de Gruyter. Original edition, Berlin: de Gruyter, 1934.
Dowty, David
2000 ‘The garden swarms with bees’ and the fallacy of ‘argument alternation’. In Polysemy. Theoretical and Computational Approaches, Yael Ravin, and Claudia Leacock (eds.), 111–128. Oxford: Oxford University Press.
Dressler, Wolfgang U., and Lavinia Merlini Barbaresi
1994 Morphopragmatics: Diminutives and Intensifiers in Italian, German and Other Languages. Berlin: de Gruyter.
Ellis, Nick C.
2002 Frequency effects in language processing. Studies in Second Language Acquisition 24: 143–188.
Ellis, Nick C., and Diane Larsen-Freeman
2009 Constructing a second language: Analyses and computational simulations of the emergence of linguistic constructions from usage. Language Learning 59, Supplement 1, 93–128.
Erman, Britt
2007 Cognitive processes as evidence of the idiom principle. International Journal of Corpus Linguistics 12: 25–53.
Erman, Britt, and Beatrice Warren
2000 The idiom principle and the open choice principle. Text 1: 29–62.
Estes, William K.
1994 Classification and Cognition. Oxford: Oxford University Press.
Fauconnier, Gilles
1985 Mental Spaces. Cambridge: Cambridge University Press.
1997 Mappings in Thought and Language. Cambridge: Cambridge University Press.
Fauconnier, Gilles, and Mark Turner
2002 The Way we think: Conceptual Blending and the Mind’s Hidden Complexities. New York: Basic Books.
Fillmore, Charles
1968 The case for case. In Universals in Linguistic Theory, Emmon Bach, and Robert Thomas Harms (eds.), 1–88. New York: Holt, Rinehart, and Winston.
1982 Frame semantics. In Linguistics in the Morning Calm, Linguistic Society of Korea (ed.), 111–137. Seoul: Hanshin.
Firth, John Rupert
1957 Papers in Linguistics, 1934–1951. London: Oxford University Press.
Fleischer, Wolfgang, and Irmhild Barz
1995 Wortbildung der Deutschen Gegenwartssprache. Tübingen: Niemeyer.
Fried, Mirjam
2005 A frame-based approach to case alternations: the swarm-class verbs in Czech. Cognitive Linguistics 16 (3): 475–512.
Geyken, Alexander, Alexey Sokirko, Ines Rehbein, and Christiane Fellbaum
2004 What is the optimal corpus size for the study of idioms? Paper presented at 26. Jahrestagung der Deutschen Gesellschaft für Sprachwissenschaft, Mainz.
Gibbs, Raymond W.
1990 Psycholinguistic studies on the conceptual basis of idiomaticity. Cognitive Linguistics 1: 417–451.
2006 Embodiment and Cognitive Science. Cambridge: Cambridge University Press.
Gibbs, Raymond W., and Herbert L. Colston
1995 The cognitive psychological reality of image schemas and their transformations. Cognitive Linguistics 6 (4): 347–378.
Gibbs, Raymond W., Nandini P. Nayak, John Bolton, and Melissa Keppel
1989 Speakers’ assumptions about the lexical flexibility of idioms. Memory and Cognition 16: 58–68.
Gibbs, Raymond W., Nandini P. Nayak, and Cooper Cutting
1989 How to kick the bucket and not decompose: analyzability and idiom processing. Journal of Memory and Language 28, 576–593.
Glück, Helmut
2000 Die Ab- und Neuschaffung von Adverbien durch die Neuregelung der deutschen Rechtschreibung 1996/98. In Das Adverb. Zentrum und Peripherie einer Wortart, Friederike Schmöe, Helmut Glück, Elisabeth Leiss, and Miorita Ulrich (eds.), 95–106. Wien: Edition Praesens.
Glynn, Dylan, and Kerstin Fischer (eds.)
2010 Quantitative Methods in Cognitive Semantics: Corpus-driven Approaches. Berlin/New York: Mouton de Gruyter.
Goldberg, Adele E.
1995 Constructions: A Construction Grammar Approach to Argument Structure. Chicago: University of Chicago Press.
1999 The emergence of the semantics of argument structure constructions. In The Emergence of Language, Brian MacWhinney (ed.), 197–213. New Jersey: Lawrence Erlbaum Associates.
2006 Constructions at Work. The Nature of Generalization in Language. Oxford: Oxford University Press.
Goldberg, Adele E., Devin M. Casenhiser, and Nitya Sethuraman
2004 Learning argument structure generalizations. Cognitive Linguistics 15: 289–316.
Goossens, Louis
1990 Metaphtonymy: the interaction of metaphor and metonymy in expressions for linguistic action. Cognitive Linguistics 1 (3): 323–340.
Grady, Joseph, and Christopher R. Johnson
2003 Converging evidence for the notions of subscene and primary scene. In Metaphor and Metonymy in Comparison and Contrast, René Dirven, and Ralf Pörings (eds.), 533–554. Berlin/New York: Mouton de Gruyter.
Gries, Stefan Thomas
2004 Coll.analysis 3. A program for R for Windows.
2005 Null-hypothesis significance testing of word frequencies: a follow-up on Kilgarriff. Corpus Linguistics and Linguistic Theory 1 (2): 277–294.
Gries, Stefan Thomas, Beate Hampe, and Doris Schönefeld
2005 Converging evidence: bringing together experimental and corpus data on the association of verbs and constructions. Cognitive Linguistics 16 (4): 635–676.
2010 Converging evidence II: more on the association of verbs and constructions. In Empirical and Experimental Methods in Cognitive/Functional Research, John Newman and Sally Rice (eds.), 59–72. Stanford: CSLI.
Gries, Stefan Thomas, and Anatol Stefanowitsch
2004a Extending collostructional analysis: a corpus-based perspective on ‘alternations’. International Journal of Corpus Linguistics 9 (1): 97–129.
2004b Covarying collexemes in the into-causative. In Language, Culture and Mind, Michelle Achard, and Suzanne Kemmer (eds.), 225–236. Stanford: CSLI.
2010 Cluster analysis and the identification of collexeme classes. In Empirical and Experimental Methods in Cognitive/Functional Research, John Newman, and Sally Rice (eds.), 73–90. Stanford: CSLI.
Gries, Stefan Thomas, and Anatol Stefanowitsch (eds.)
2006 Corpora in Cognitive Linguistics: Corpus-based Approaches to Syntax and Lexis. Berlin/New York: Mouton de Gruyter.
Haider, Hubert
1984 Mona Lisa lächelt stumm: Über das sogenannte deutsche ‘Rezipientenpassiv’. Linguistische Berichte 89: 32–42.
Halliday, Michael A. K.
1966 Lexis as a linguistic level. In In Memory of J. R. Firth, Charles E. Bazell, John C. Catford, Michael A. K. Halliday, and Robert H. Robins (eds.), 148–162. London: Longman.
Hampe, Beate (ed.)
2005 From Perception to Meaning: Image Schemas in Cognitive Linguistics. Berlin/New York: Mouton de Gruyter.
Hanks, Patrick
2004 The syntagmatics of metaphor and idiom. International Journal of Lexicography 17 (3): 245–274.
Harnad, Stevan
1990 The symbol grounding problem. Physica D 42: 335–346.
Hawkins, John A.
1986 A Comparative Typology of English and German. Unifying the Contrasts. London: Croom Helm.
Hilpert, Martin
2006 Distinctive collexeme analysis and diachrony. Corpus Linguistics and Linguistic Theory 2 (2): 243–57.
2008 Germanic Future Constructions: A Usage-based Approach to Language Change. Amsterdam/Philadelphia: John Benjamins.
Hopper, Paul, and Elizabeth C. Traugott
2003 Grammaticalization. 2nd edition. Cambridge: Cambridge University Press.
Horn, Laurence
1969 A presuppositional analysis of only and even. Papers of the Regional Meetings of the Chicago Linguistic Society, Robert I. Binnick (ed.), 98–107. Chicago: Chicago Linguistics Society.
Howarth, Peter
1998 Phraseology and second language proficiency. Applied Linguistics 19 (1): 24–44.
Huddleston, Rodney, and Geoffrey K. Pullum
2002 The Cambridge Grammar of the English Language. Cambridge: Cambridge University Press.
Hunston, Susan, and Gill Francis
2000 Pattern Grammar: a Corpus-driven Approach to the Lexical Grammar of English. Amsterdam/Philadelphia: John Benjamins.
Israel, Michael
1996 The way constructions grow. In Conceptual Structure, Discourse and Language, Adele Goldberg (ed.), 217–230. Stanford: CSLI.
2002 Consistency and creativity in first language acquisition. In Proceedings of the 28th Annual Meeting of the Berkeley Linguistics Society, Julie Larson, and Mary Paster (eds.), 123–136. Berkeley: Berkeley Linguistics Society.
Ito, Rika, and Sali Tagliamonte
2003 Well weird, right dodgy, very strange, really cool: layering and recycling in English intensifiers. Language in Society 32: 257–279.
Jakobson, Roman
1956 Two aspects of language and two types of aphasic disturbances. In Fundamentals of Language, Roman Jakobson, and Morris Halle (eds.). The Hague: Mouton.
Johnson, Christopher R.
1999 Constructional Grounding: The Role of Interpretational Overlap in Lexical and Constructional Acquisition. Doctoral dissertation, University of California, Berkeley.
Johnson, Mark
1987 The Body in the Mind: The Bodily Basis of Meaning, Imagination and Reason. Chicago: University of Chicago Press.
2005 The philosophical significance of image schemas. In From Perception to Meaning: Image Schemas in Cognitive Linguistics, Beate Hampe (ed.), 15–33. Berlin: Mouton de Gruyter.
Kay, Paul, and Charles J. Fillmore
1999 Grammatical constructions and linguistic generalizations: the What’s X doing Y? construction. Language 75: 1–33.
Keller, Frank, and Mirella Lapata
2003 Using the web to obtain frequencies for unseen bigrams. Computational Linguistics 29 (3): 459–484.
Keysar, Boaz, and Bridget Bly
1995 Intuitions of the transparency of idioms: can one keep a secret by spilling the beans? Journal of Memory and Language 34, 89–109.
Kidd, Evan J., Elena Lieven, and Michael Tomasello
2006 Examining the role of lexical frequency in children’s acquisition and processing of sentential complements. Cognitive Development, 21: 93–107.
Kienpointner, Anna Maria
1985 Wortstrukturen mit Verbalstamm als Bestimmungsglied in der deutschen Sprache. Innsbruck: Institut für Germanistik, Universität Innsbruck.
Kilgarriff, Adam
2005 Language is never ever ever random. Corpus Linguistics and Linguistic Theory 1 (2): 263–276.
Kilgarriff, Adam, and Gregory Grefenstette
2003 Introduction to the special issue on the web as corpus. Computational Linguistics 29 (3): 333–347.
Kirchner, Gustav
1955 Gradadverbien: Restriktiva und Verwandtes im heutigen Englisch. Halle: Niemeyer.
Kirschbaum, Ilja
2002 Schrecklich nett und voll verrückt. Doctoral dissertation, Universität Düsseldorf.
Kövecses, Zoltan, and Günter Radden
1998 Metonymy: Developing a cognitive linguistic view. Cognitive Linguistics 9: 37–77.
Lakoff, George
1987 Women, Fire, and Dangerous Things. What Categories Reveal about the Mind. Chicago: University of Chicago Press.
1990 The invariance hypothesis: is abstract reason based on image-schemas? Cognitive Linguistics 1 (1): 39–74.
1993 The contemporary theory of metaphor. In Metaphor and Thought, Andrew Ortony (ed.), 2nd ed., 202–251. Cambridge: Cambridge University Press.
Lakoff, George, and Mark Johnson
1980 Metaphors we live by. Chicago: University of Chicago Press.
1999 Philosophy in the Flesh: The Embodied Mind and its Challenge to Western Thought. New York: Basic Books.
Langacker, Ronald W.
1987 Foundations of Cognitive Grammar. Volume 1: Theoretical Prerequisites. Stanford: Stanford University Press.
1991 Foundations of Cognitive Grammar. Volume 2: Descriptive Application. Stanford: Stanford University Press.
1999 Grammar and Conceptualization. Berlin/New York: Mouton de Gruyter.
2002 Reprint. Concept, Image and Symbol. The Cognitive Basis of Grammar. Berlin/New York: Mouton de Gruyter. Original edition, Berlin/New York: Mouton de Gruyter, 1991.
2005 Construction grammars: cognitive, radical, and less so. In Cognitive Linguistics: Internal Dynamics and Interdisciplinary Interaction, Francisco J. Ruiz de Mendoza Ibáñez, and M. Sandra Peña Cervel (eds.), 101–159. Berlin/New York: Mouton de Gruyter.
2008 Cognitive Grammar: A Basic Introduction. Oxford: Oxford University Press.
2009 Constructions and constructional meaning. In New Directions in Cognitive Linguistics, Vyvyan Evans, and Stéphanie Pourcel (eds.), 225–268. Amsterdam/Philadelphia: John Benjamins.
Lehmann, Christian
1985 Grammaticalization: synchronic variation and diachronic change. Lingua e Stile 20 (3): 303–18.
1991 Grammaticalization and related changes in contemporary German. In Approaches to Grammaticalization. Vol. II: Focus on Types of Grammatical Markers, Elizabeth C. Traugott, and Bernd Heine (eds.), 493–535. Amsterdam/Philadelphia: John Benjamins.
2002 Thoughts on Grammaticalization. 2nd, revised version. ASSIDUE: Arbeitspapiere des Seminars für Sprachwissenschaft der Universität Erfurt 9.
Levin, Beth
1993 English Verb Classes and Alternations. A Preliminary Investigation. Chicago: University of Chicago Press.
Libben, Maya R., and Debra A. Titone
2008 The multidetermined nature of idiomatic expressions. Memory and Cognition 36: 1103–1131.
Lieven, Elena, Dorothé Salomo, and Michael Tomasello
2009 Two-year-old children’s production of multiword utterances: a usage-based analysis. Cognitive Linguistics 20 (3): 481–508.
Lieven, Elena, Michael Tomasello, Heike Behrens, and Jennifer Speares
2003 Early syntactic creativity: a usage-based approach. Journal of Child Language 30, 333–370.
Lorenz, Gunter
1999 Adjective Intensification – Learners Versus Native Speakers: A Corpus Study of Argumentative Writing. Amsterdam: Rodopi.
2002 Really worthwhile or not really significant? A corpus-based approach to the delexicalization and grammaticalization of intensifiers in Modern English. In New Reflections on Grammaticalization, Ilse Wischer, and Gabriele Diewald (eds.), 143–162. Amsterdam/Philadelphia: John Benjamins.
MacWhinney, Brian
1978 The Acquisition of Morphophonology. Monographs of the Society for Research in Child Development, 43.
Makkai, Adam
1972 Idiom Structure in English. The Hague: Mouton.
Mandler, Jean M.
1992 How to build a baby II: conceptual primitives. Psychological Review 99: 587–604.
McGlone, Matthew S., Sam Glucksberg, and Cristina Cacciari
1994 Semantic productivity and idiom comprehension. Discourse Processes 17, 167–190.
Medin, Douglas L., and Marguerite M. Schaffer
1978 Context theory of classification learning. Psychological Review 85: 207–38.
Meunier, Fanny and Sylviane Granger (eds.)
2008 Phraseology in Foreign Language Learning and Teaching. Amsterdam/Philadelphia: John Benjamins.
Meyer, R. M.
1902 Die Umbildung fertiger Worte. Zeitschrift für Deutsche Wortforschung 2: 36–42.
Miller, George A.
1995 WordNet: a lexical database for English. Communications of the ACM 38 (11): 39–41.
Moon, Rosamund
1998 Fixed Expressions and Idioms in English: A Corpus-based Approach. Oxford: Oxford University Press.
Morris, William C., Garrison W. Cottrell, and Jeffrey Elman
2000 A connectionist simulation of the empirical acquisition of grammatical relations. In Hybrid Neural Systems, Stefan Wermter, and Ron Sun (eds.), 175–193. Berlin: Springer.
Naigles, Letitia R., and Erika Hoff-Ginsberg
1998 Why are some verbs learned before other verbs? Effects of input frequency and structure on children’s early verb use. Journal of Child Language 25: 95–120.
Needham, William P.
1992 Limits on literal processing during idiom interpretation. Journal of Psycholinguistic Research 21 (1): 1–16.
Newman, John, and Sally Rice
2005 Inflectional islands. Paper presented at the 9th International Cognitive Linguistics Conference, Yonsei University, Korea, 17–22 July 2005.
Ninio, Anat
2005 Testing the role of semantic similarity in syntactic development. Journal of Child Language 32: 35–61.
Nunberg, Geoffrey, Ivan A. Sag, and Thomas Wasow
1994 Idioms. Language 70 (3): 491–538.
Panther, Klaus-Uwe, and Linda L. Thornburg (eds.)
2003 Metonymy and Pragmatic Inferencing. Amsterdam/Philadelphia: John Benjamins.
Paradis, Carita
1997 Degree Modifiers of Adjectives in Spoken British English. Lund: Lund University Press.
2000 It’s well weird. Degree modifiers of adjectives revisited: the nineties. In Corpora Galore. Analyses and Techniques in Describing English, John M. Kirk (ed.), 147–160. Amsterdam: Rodopi.
Partington, Alan
1993 Corpus evidence of language change: the case of the intensifier. In Text and Technology, Mona Baker, Gill Francis, and Elena Tognini-Bonelli (eds.), 177–92. Amsterdam/Philadelphia: John Benjamins.
Pawley, Andrew, and Frances Hodgetts Syder
1983 Two puzzles for linguistic theory. In Language and Communication, Jack C. Richards, and Richard W. Schmidt (eds.), 191–227. London: Longman.
Peters, Hans
1993 Die Englischen Gradadverbien der Kategorie Booster. Tübingen: Narr.
1994 Degree adverbs in Early Modern English. In Studies in Early Modern English, Dieter Kastovsky (ed.), 269–288. Berlin/New York: Mouton de Gruyter.
Pierrehumbert, Janet B.
2001 Exemplar dynamics: word frequency, lenition and contrast. In Frequency and the emergence of linguistic structure, Joan L. Bybee, and Paul Hopper (eds.), 137–157. Amsterdam/Philadelphia: John Benjamins.
2003 Probabilistic phonology: discrimination and robustness. In Probabilistic Linguistics, Rens Bod, Jennifer Hay, and Stefanie Jannedy (eds.), 177–228. Cambridge, Mass.: MIT Press.
Pine, Julian M., Elena Lieven, and Caroline F. Rowland
1998 Comparing different models of the development of the English verb category. Linguistics 36: 807–830.
Pittner, Robert J.
1996 Der Wortbildungstyp Steigerungsbildung beim Adjektiv im Neuhochdeutschen. Sprache und Sprachen 19/20: 29–66.
Plag, Ingo
1999 Morphological Productivity. Structural Constraints in English Derivation. Berlin/New York: Mouton de Gruyter.
2006 Productivity. In Encyclopedia of Language and Linguistics, 2nd edition, volume 10, Keith Brown (ed.), 121–128. Oxford: Elsevier.
Platts, Mark de Bretton
1979 Ways of Meaning: An Introduction to a Philosophy of Language. London: Routledge.
Poirier, Pierre, Benoit Hardy-Vallée, and Jean-Frédéric DePasquale
2005 Embodied categorization. In Handbook of Categorization in Cognitive Science, Henri Cohen, and Claire Lefebvre (eds.), 739–765. Oxford: Elsevier.
Posner, Michael I., and Steven W. Keele
1968 On the genesis of abstract ideas. Journal of Experimental Psychology 77 (3): 353–363.
Pusch, Luise F.
1972 Smear = schmieren/beschmieren. Bemerkungen über partitive und holistische Konstruktionen im Deutschen und Englischen. In Reader zur Kontrastiven Linguistik, Gerhard Nickel (ed.), 122–135. Frankfurt: Athenäum.
Pustejovsky, James
1995 The Generative Lexicon. Cambridge, Mass.: MIT Press.
Radden, Günter
1998 The conceptualisation of emotional causality by means of prepositional phrases. In Speaking of Emotions. Conceptualisation and Expression, Angeliki Athanasiadou, and Elżbieta Tabakowska (eds.), 273–294. Berlin/New York: Mouton de Gruyter.
Radden, Günter, and Klaus-Uwe Panther
1999 Metonymy in Language and Thought. Amsterdam/Philadelphia: John Benjamins.
Rohdenburg, Günter
1974 Sekundäre Subjektivierungen im Englischen und Deutschen. Vergleichende Untersuchungen zur Verb- und Adjektivsyntax. Stuttgart: PAKS.
Rosch, Eleanor H.
1973 Natural categories. Cognitive Psychology 4: 328–350.
1975 Cognitive representation of semantic categories. Journal of Experimental Psychology: General 104: 192–233.
Rosch, Eleanor, and Carolyn B. Mervis
1975 Family resemblances: studies in the internal structure of categories. Cognitive Psychology 7: 573–605.
Rosenfeld, Helga
1983 Erklärungen und Begründungen. Sätze mit kausalem aus und vor. Eine Korpusanalyse. Frankfurt: Lang.
Rowland, Caroline F.
2007 Explaining errors in children’s questions. Cognition 104 (1): 106–134.
Rowland, Caroline F., and Julian M. Pine
2000 Subject-auxiliary inversion errors and wh-question acquisition: what children do know? Journal of Child Language 27, 157–181.
Salkoff, Morris
1983 Bees are swarming in the garden: a synchronic study of productivity. Language 59 (2): 288–346.
Sampson, Geoffrey
2001 Empirical Linguistics. London: Continuum.
Sapir, Edward
1921 Language: An Introduction to the Study of Speech. New York: Harcourt, Brace and Company.
Schmidt, Günter Dietrich
1987 Das Affixoid. Zur Notwendigkeit und Brauchbarkeit eines beliebten Zwischenbegriffs in der Wortbildung. In Deutsche Lehnwortbildung, Gabriele Hoppe, Alan Kirkness, Elisabeth Link, Isolde Nortmeyer, Wolfgang Rettig, and Günter Dietrich Schmidt (eds.), 53–101. Tübingen: Narr.
Schmidtke-Bode, Karsten
2009 Going-to-V and gonna-V in child language: a quantitative approach to constructional development. Cognitive Linguistics 20 (3): 509–538.
Schönefeld, Doris
1999 Corpus linguistics and cognitivism. International Journal of Corpus Linguistics 4: 137–71.
Schweigert, Wendy A.
1991 The muddy waters of idiom comprehension. Journal of Psycholinguistic Research 20, 305–314.
Seebold, Elmar (ed.)
1999 KLUGE: Etymologisches Wörterbuch der Deutschen Sprache. 23. Auflage. Berlin: Walter de Gruyter.
Sharoff, Serge
2006 Creating general-purpose corpora using automated search engine queries. In WaCky! Working Papers on the Web as Corpus, Marco Baroni, and Silvia Bernardini (eds.), 63–98. Bologna: GEDIT. http://wackybook.sslmit.unibo.it/
Sinclair, John McHardy
1987 Collocation: a progress report. In Language Topics: Essays in Honour of Michael Halliday, vol. 2, Ross Steele, and Terry Threadgold (eds.), 319–331. Amsterdam/Philadelphia: John Benjamins.
Skousen, Royal
1989 Analogical Modeling of Language. Dordrecht: Kluwer.
1992 Analogy and Structure. Dordrecht: Kluwer.
Skousen, Royal, Deryle Lonsdale, and Dilworth B. Parkinson (eds.)
2002 Analogical Modeling. Amsterdam/Philadelphia: John Benjamins.
Snell-Hornby, Mary
1983 Verb Descriptivity in German and English. Heidelberg: Carl Winter.
Sperber, Dan, and Deirdre Wilson
1995 Relevance: Communication and Cognition, 2nd ed. Oxford: Blackwell.
Spitzbardt, Harry
1965 English adverbs of degree and their semantic fields. Philologica Pragensia 8: 349–359.
Stefanowitsch, Anatol
2006 Words and their metaphors: a corpus-based approach. In Corpus-based Approaches to Metaphor and Metonymy, Anatol Stefanowitsch, and Stefan Thomas Gries (eds.), 61–105. Berlin and New York: Mouton de Gruyter.
2011 Cognitive linguistics meets the corpus. In Cognitive Linguistics: Convergence and Expansion, Mario Brdar, Stefan Thomas Gries, and Milena Žic Fuchs (eds.), 257–290. Amsterdam/Philadelphia: John Benjamins.
Stefanowitsch, Anatol, and Stefan Thomas Gries
2003 Collostructions: investigating the interaction of words and constructions. International Journal of Corpus Linguistics 8 (2): 209–43.
2005 Covarying collexemes. Corpus Linguistics and Linguistic Theory 1 (1): 1–46.
Stenström, Anna-Brita
1999 He was really gormless – she’s bloody crap. Girls, boys and intensifiers. In Out of Corpora: Studies in Honour of Stig Johansson, Hilde Hasselgård, and Signe Oksefjell (eds.), 69–78. Amsterdam: Rodopi.
Stern, Gustav
1968 Reprint. Meaning and Change of Meaning with Special Reference to the English Language. Bloomington: Indiana University Press. Original edition, Göteborg: Elander, 1931.
Stoffel, Cornelis
1901 Intensives and down-toners. Heidelberg: Carl Winter.
Stolz, Thomas
2006 (Wort-)Iteration: (k)eine universelle Konstruktion. In Konstruktionsgrammatik: Von der Anwendung zur Theorie, Kerstin Fischer, and Anatol Stefanowitsch (eds.), 105–132. Tübingen: Stauffenburg.
Swinney, David A., and Anne Cutler
1979 The access and processing of idiomatic expressions. Journal of Verbal Learning and Verbal Behavior 18: 523–534.
Tabossi, Patrizia, Rachele Fanari, and Kinou Wolf
2005 Spoken idiom recognition: meaning retrieval and word expectancy. Journal of Psycholinguistic Research 34: 465–495.
Talmy, Leonard
1985 Lexicalization patterns: semantic structure in lexical forms. In Language Typology and Syntactic Description, Volume 3: Grammatical Categories and the Lexicon, Timothy Shopen (ed.), 57–149. Cambridge: Cambridge University Press.
2000 Toward a Cognitive Semantics. Cambridge, Mass.: MIT Press.
Taylor, John R.
1995 Linguistic Categorization. Prototypes in Linguistic Theory. Oxford: Oxford University Press.
Theakston, Anna L.
2004 The role of entrenchment in children’s and adults’ performance on grammaticality-judgement tasks. Cognitive Development 19 (1): 15–34.
Theakston, Anna L., Elena Lieven, Julian Pine, and Caroline F. Rowland
2001 The role of performance limitations in the acquisition of verb-argument structure: an alternative account. Journal of Child Language 28: 127–152.
Tomasello, Michael
1992 First Verbs. A Case Study of Early Grammatical Development. Cambridge: Cambridge University Press.
1995 Joint attention as social cognition. In Joint Attention: Its Origins and Role in Development, Chris Moore, and Philip J. Dunham (eds.), 103–130. Hillsdale: Lawrence Erlbaum.
1998 The return of constructions. Journal of Child Language 25: 431–442.
2000a Do young children have adult syntactic competence? Cognition 74: 209–253.
2000b The item-based nature of children’s early syntactic development. Trends in Cognitive Sciences 4: 156–163.
2003 Constructing a Language. A Usage-based Theory of Language Acquisition. Cambridge, Mass.: Harvard University Press.
Traugott, Elizabeth C.
1989 On the rise of epistemic meanings in English: An example of subjectification in semantic change. Language 65: 31–55.
2003 Constructions in grammaticalization. In Handbook of Historical Linguistics, Brian D. Joseph, and Richard D. Janda (eds.), 624–647. Oxford: Blackwell.
2007 The concepts of constructional mismatch and type-shifting from the perspective of grammaticalization. Cognitive Linguistics 18 (4): 523–557.
2008 Grammatikalisierung, emergente Konstruktionen und der Begriff der ‘Neuheit’. In Konstruktionsgrammatik II: Von der Konstruktion zur Grammatik, Anatol Stefanowitsch, and Kerstin Fischer (eds.), 5–32. Translated by Arne Zeschel. Tübingen: Stauffenburg.
Traugott, Elizabeth C., and Richard B. Dasher
2002 Regularity in Semantic Change. Cambridge: Cambridge University Press.
Trousdale, Graeme, and Nikolas Gisborne
2008 Constructional Approaches to English Grammar. Berlin: Mouton de Gruyter.
Tummers, Jose, Kris Heylen, and Dirk Geeraerts
2005 Usage-based approaches in cognitive linguistics: a technical state of the art. Corpus Linguistics and Linguistic Theory 1 (2): 225–261.
Turner, Mark and Gilles Fauconnier
1995 Conceptual integration and formal expression. Metaphor and Symbolic Activity 10, 183–204.
Ullmann, Stephen
1957 Principles of Semantics. 2nd edition. Oxford: Blackwell.
Van Os, Charles
1989 Aspekte der Intensivierung im Deutschen. Tübingen: Gunter Narr.
Vigliocco, Gabriella, Lotte Meteyard, Mark Andrews, and Stavroula Kousta
2009 Toward a theory of semantic representation. Language and Cognition 1 (2): 219–247.
Wiechmann, Daniel
2008 Initial parsing decisions and lexical bias: Corpus evidence from local NP/S-ambiguities. Cognitive Linguistics 19 (3): 447–463.
Wierzbicka, Anna
1985 Lexicography and Conceptual Analysis. Ann Arbor: Karoma Publishers.
Wray, Alison
2002 Formulaic Language and the Lexicon. Cambridge: Cambridge University Press.
Wulff, Stefanie
2008 Rethinking Idiomaticity: A Usage-based Approach. London: Continuum.
Zeschel, Arne
2008 Lexical chunking effects in syntactic processing. Cognitive Linguistics 19 (3): 419–438.
2009 What’s (in) a construction? Complete inheritance vs. full-entry models. In New Directions in Cognitive Linguistics, Vyvyan Evans, and Stéphanie Pourcel (eds.), 185–200. Amsterdam/Philadelphia: John Benjamins.
2010 Exemplars and analogy: Semantic extension in constructional networks. In Quantitative Methods in Cognitive Semantics: Corpus-driven Approaches, Dylan Glynn, and Kerstin Fischer (eds.), 201–222. Berlin/New York: Mouton de Gruyter.
2011 Den Wald vor lauter Bäumen sehen – und andersherum: zum Verhältnis von “Mustern” und “Regeln”. In Konstruktionsgrammatik III: Aktuelle Fragen und Lösungsansätze, Alexander Lasch, and Alexander Ziem (eds.), 43–58. Tübingen: Stauffenburg.
Zifonun, Gisela, Ludger Hoffmann, and Bruno Strecker
1997 Grammatik der Deutschen Sprache, vol. 3. Berlin: de Gruyter.
Index
acceptability, 1, 7, 28, 41, 126
acquisition, 6, 11, 24, 26–32, 120, 163
active zone, 164
analogy, 15, 23–25, 31, 121, 138, 154, 160, 161, 170–172, 184–185, 191, 198, 218, 231
antonymy, 41, 131, 147, 179, 183, 201, 219
authorship, 83–84
autonomy, 50–51, 69, 173
bondedness, 51, 69, 211
boosting, 44, 54
channel, 84
cloud diagram, 97
cluster analysis, 166–167, 188–189, 217–227
coalescence, 51, 73
COBUILD project, 37–38
coercion, 13, 47, 89, 164
co-hyponymy, 145–146, 180, 183, 191, 194, 219
collexeme, 97, 123–127, 166, 175, 188
collocation, 1, 6, 7, 27–28, 28, 31, 33, 35–38, 47, 48, 50, 77, 86, 120–123, 131, 138, 139, 153, 157–158, 160, 161, 167, 176, 182, 203, 218, 219, 228, 231
collostruction strength, 123–126, 127, 128, 129, 232
collostructional analysis, 36, 123–126, 129, 222
componentiality, 8, 40, 120
compound, 70–74, 103, 179, 241, 243
conceptual blending, 16, 20–21
conceptual mapping, 16, 19–20, 54–56
conceptual structure, 8, 16, 34
conflation, 12–14
conservatism, 1–2, 30, 228, 231
construal, 16–17, 21, 58, 132–133, 157, 162, 182
construction, definition of, 8–10
  macro vs. meso vs. micro, 15
construction grammar, 8–12, 170
constructional grounding, 31–32
correlation analysis, 185, 187–188
creativity, 1, 6, 15, 21, 30, 39, 76, 98–99, 105–106, 110–111, 120–122, 138–139, 160, 170, 184–185, 188, 228, 232
degree adverb, 45, 53, 67, 70, 242
dendrogram, 188
derivation, 70, 145, 243
discreteness, 165, 169
domain, 13, 16–17
domain matrix, 17
duplicate, 84
elative, 70, 75, 241
embodiment, 16–18, 59–60
entrenchment, 10–11, 14, 120, 121, 122–126, 157, 231–232
erosion, 51
excessive, 71, 74–75
exemplar models of categorisation, 23–25
expressivity, 48, 139, 245
fixed expression, 1, 6, 14, 25, 91, 120–122
fluidity, 11, 48, 164, 168
formality, 84, 232
formulaicity, 14, 36–39, 120–122, 157
function word, 49, 50, 53, 241
genre, 81, 84, 232
grammaticalisation, 1, 32–33, 219
  definition of, 49
  of intensifiers, 33, 49–50, 53, 56, 67–75
  parameters, 50–52
  synchronic perspective on, 36
generalisation, 1–2, 7, 8, 21–25, 35, 50, 59, 117, 121, 156, 157, 159, 160, 163, 171, 231
gradation, 45, 58, 70, 241
grammar-based retrieval strategy, 78–79
granularity, 164, 169, 180
hapax legomenon, 125–126, 174, 175, 185, 187, 191, 204–216, 228, 232
hyponymy, 134, 145, 179–180, 190, 191, 193, 196, 200
idealised cognitive model, 16
ideophone, 140
idiom, 9, 14, 121
  of decoding, 40
  schematic, 7, 38
  variation, 40–41
idiom principle, 37–38
idiomatically combining expression, 40, 130
idiomaticity, 14
image schema, 18–19, 181
implicature, 35, 45, 68, 69, 89
inherent intensity, 57–59, 241
innovation, 2, 47–49, 231
integrity, 51
intensification, 44–47, 139
  metaphorical, 54–55
  metonymic, 55–56
  through redundancy, 56–57
  through reduplication, 56
intensification pattern, 52
intensity formation, 71
intercoder reliability, 177, 184, 189
intersective modification, 66
intersubjectification, 34
introspection, 85, 164–166, 170, 175–177, 244–245
invited inference, 34
isolating context, 35, 69
lemmatisation, 88, 89–90, 179, 212, 232
lexicalisation pattern, 12–14, 52, 78–80, 114–116
lexicalised sentence stem, 39
lexicon-based retrieval strategy, 79, 85–87
literal meaning, 14, 40–41, 76, 131, 148, 150, 152, 153, 182, 199, 225
mental space, 16
metaphor, 16, 19–21, 54–55, 56, 60, 80, 131, 133, 134, 136, 145, 147, 152, 160, 166, 181, 182–183, 190, 191, 194, 195, 197, 201, 214, 219, 225, 229, 232
metaphorical pattern analysis, 80
metonymy, 16, 20, 34–35, 54–56, 59, 131, 133, 134, 136, 137, 145, 147, 152, 178, 180–182, 193, 194, 195, 199, 200, 201, 203, 205, 214, 219
motivation, 2, 5, 7, 20, 121–122, 131, 134, 142, 143, 157, 158, 175, 176, 182, 195, 227
noise, 84
observability, 162–163
open choice principle, 37, 120
paradigmaticity, 51
pattern grammar, 38
polysemy, 85–87, 153, 178
  regular, 164, 186
presupposition, 47
productivity, 5, 6, 14–15, 30, 170–174
  and token frequency, 172–174
  and type frequency, 171–172
  domain, 173–174
  incipient, 5–7, 30, 185, 203–216
profile, 16–17, 63–64, 131, 136, 180–181, 182, 185, 199, 205
prototype, 21–23, 29, 45, 51, 121, 123
prototype theory, 21–23
reanalysis, 32, 68, 71–72
renewal, 49–50
regularisation, 171
repetition, 27, 29, 56–57, 120
replicability, 82
representativeness, 81, 163–164, 167–168, 174, 176
schematisation, 1, 5–7, 23, 24–26, 27, 30, 31, 32, 33–35, 48, 50, 52, 159–162, 171, 184, 228, 231
semantic bleaching, 51, 52
semantic change, 32, 34–36, 89
semantic coherence, 172, 201
semantic field, 41, 145, 166, 185
semantic frame, 16–17, 66, 131, 132, 133, 134, 169, 180–182, 185–187, 190, 191, 192, 193, 194, 195, 196, 199, 200, 201, 202, 204, 205, 206, 217–227, 228, 231
semi-prefix, 70–71, 246
slot-and-filler pattern, 7, 9, 25, 27, 30–31, 35, 38, 41, 49, 123
span, 88
structural scope, 51
subjectification, 34
symbol grounding problem, 163
synaesthesia, 142–143, 245
synonymy, 7, 41, 130, 131, 133, 134, 136, 140, 145, 147, 177–179, 183, 190, 196, 199, 201, 204, 218, 219
thesaurus, 80, 85, 244
token frequency, 6, 25, 27–30, 86, 96–97, 103–104, 111–112, 120, 122–126, 171, 172–174, 189, 232
token-type ratio, 93, 109, 116, 222
type frequency, 99–101, 106–108, 112–114, 157, 171–173, 187–188
usage-based model, 10–11, 120, 170, 232–233
variability
  lexical, 1, 9, 40–41, 49, 63, 101
  syntagmatic vs. paradigmatic, 51
verb class, 85
web as corpus, 81–84
WebCorp, 82, 83, 87, 243, 243–244
Zipfian distribution, 29, 97, 117, 206, 207, 216
z-score, 101